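Abstract:
With the growth of the mobile internet, mobile devices now touch every aspect of daily life, and personal information has become concentrated on them. Because the Android system is open and its own security mechanisms are imperfect, the illegal theft of private information by software has become a widespread problem. Information flow analysis aims to keep information secure: it detects whether private data has been leaked by analyzing the legitimacy of data propagation within an application, and its use for malware detection has become a research hotspot. However, as the functional complexity of Android applications keeps growing, and code complexity with it, benign and malicious applications are becoming increasingly similar in their sensitive-information-flow behavior patterns. Coarse-grained descriptions of information flow features can hardly tell the two apart, which greatly reduces detection accuracy. To address this, this paper proposes a new malware detection method based on the relationship features of information flows. Building on the extraction of an application's sensitive information flows, the method further mines the relationship features between those flows. We give a detailed formal description of the relationships between sensitive API call sequences, and use dynamic programming to obtain the relationship features between sensitive API call sequences together with their continuous common subsequences. We express the relationship features as five-tuples, and classify the APIs in the continuous common subsequences and express them as six-tuples. Finally, the two kinds of features are fused and fed into a convolutional neural network (CNN) to detect malware. Experimental results show accuracies of 98.5% and 97.6% on the MalGenome and AndroZoo datasets, respectively, indicating that a finer-grained description of the relationship features between sensitive information flows plays an important role in distinguishing benign from malicious applications.
Keywords: Android malware detection; relationship features; information flow; feature fusion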
DOI: 10.19363/J.cnki.cn10-1380/tn.2024.11.11
Received: 2022-11-02; Revised: 2023-02-10
Funding: This work was supported by the National Natural Science Foundation of China (No. 62176265, No. 61972040).
Malware Detection Method Based on Information Flows Relationship Features
YANG Baoshan, YANG Zhi, ZHANG Hongqi, HAN Bing, CHEN Xingyuan, SUN Lei
Information Engineering University, Zhengzhou 450000, China
Abstract:
The development of the mobile internet has brought mobile devices into every aspect of our lives, and personal information has become concentrated on them. Because of the openness of the Android system and its imperfect security mechanisms, the illegal theft of private information by software has become a common problem. Information flow analysis aims to ensure the security of information: it detects whether private data has been leaked by analyzing the legitimacy of data propagation in applications, and its use for malware detection has become a research hotspot. However, as the functional complexity of Android applications increases, along with their code complexity, benign applications and malware are becoming more and more similar in the behavior patterns of their sensitive information flows. Coarse-grained descriptions of information flow features can hardly distinguish benign applications from malware, which greatly reduces detection accuracy. In this paper, we propose a new malware detection method based on the relationship features of information flows. Building on the extraction of an application's sensitive information flows, the method further mines the relationship features between those flows. We give a detailed formal description of the relationships between sensitive API call sequences, and use dynamic programming to obtain both the relationship features between sensitive API call sequences and their continuous common subsequences. We express the relationship features as five-tuples; the APIs in the continuous common subsequences are classified and then expressed as six-tuples. Finally, the two kinds of features are fused and fed into a convolutional neural network (CNN) to detect malware. The experimental results show that the method achieves accuracies of 98.5% and 97.6% on the MalGenome and AndroZoo datasets, respectively. This indicates that a finer-grained description of the relationships between sensitive information flows plays an important role in distinguishing benign applications from malware.
Key words: Android malware detection; relationship features; information flows; feature integration
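
The abstract mentions obtaining continuous common subsequences between sensitive API call sequences by dynamic programming but does not give the procedure. As a rough illustration only, the sketch below uses the standard longest-common-contiguous-run recurrence on two call sequences; it is not the paper's formalization, and the function name `longest_common_run` and the sample call sequences are hypothetical.

```python
from typing import List, Sequence


def longest_common_run(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return the longest run of consecutive calls shared by a and b.

    Illustrative assumption: each sequence is a list of API names in call order.
    """
    # prev[j] holds the length of the common run ending at a[i-2] and b[j-1];
    # curr[j] holds the same for a[i-1] and b[j-1].
    prev = [0] * (len(b) + 1)
    best_len, best_end = 0, 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended one step earlier in both sequences.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return list(a[best_end - best_len:best_end])


# Hypothetical call sequences for two sensitive information flows.
flow_a = ["getDeviceId", "append", "toString", "sendTextMessage"]
flow_b = ["getLine1Number", "append", "toString", "openConnection"]
print(longest_common_run(flow_a, flow_b))
```

Keeping only two rows of the table bounds memory by the length of the second sequence; on ties the first run found in `a` is returned. How the resulting runs map onto the five-tuple and six-tuple features is specific to the paper and is not modeled here.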