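Abstract (translated from the Chinese original): |
Android malware detection currently suffers from slow detection speed and low detection rates. To address these problems, this paper proposes an Android malware detection method based on multi-feature fusion. Starting from the characteristics of malicious behavior in Android malware, the method combines static and dynamic analysis to extract multi-dimensional features, including permissions and components, function API call sequences, system commands, and network requests. For feature categories with high dimensionality, information gain is used to select the most useful features. The paper also exploits the dimensionality-reducing and similarity-preserving properties of locality-sensitive hashing to propose a feature fusion method based on the Simhash algorithm, which reduces the original high-dimensional features to a relatively small dimension and resolves the imbalance among feature categories. The fused features are classified with the GBDT and random forest algorithms to detect malicious samples. Comparative experiments show that the multi-feature fusion method used in this paper can greatly reduce classification training time and improve detection efficiency. |
Keywords: Android malware detection; feature fusion; Simhash algorithm; GBDT algorithm; random forest algorithm |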
DOI:10.19363/J.cnki.cn10-1380/tn.2018.07.05 |
Received: 2018-03-30; Revised: 2018-05-30 |
Funding: This work was supported by the National Key Research and Development Program of China (No. 2016YFB0801304). |
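
The information-gain selection step mentioned in the abstract can be illustrated with a minimal sketch. It assumes discrete (here binary) features stored as one dictionary per app; the feature names, the sample data, and the helper `select_top_k` are hypothetical and are not taken from the paper, whose exact procedure and thresholds are not given in the abstract.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG(Y; X) = H(Y) - sum_v P(X=v) * H(Y | X=v) for one discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def select_top_k(samples, labels, k):
    """Return the k feature names with the highest information gain.

    samples: list of dicts mapping feature name -> discrete value
             (a missing feature is treated as 0). Hypothetical layout.
    """
    names = sorted({name for sample in samples for name in sample})
    scores = {
        name: information_gain([s.get(name, 0) for s in samples], labels)
        for name in names
    }
    return sorted(names, key=lambda n: scores[n], reverse=True)[:k]


# Hypothetical toy data: 1 = malicious, 0 = benign.
samples = [
    {"SEND_SMS": 1, "INTERNET": 1},
    {"SEND_SMS": 1, "INTERNET": 1},
    {"SEND_SMS": 0, "INTERNET": 1},
    {"SEND_SMS": 0, "INTERNET": 1},
]
labels = [1, 1, 0, 0]
top_features = select_top_k(samples, labels, 1)
```

In this toy data `SEND_SMS` separates the two classes perfectly while `INTERNET` is present in every sample, so only the former carries information about the label.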
|
Android Malware Detection Based on Multi-feature Fusion |
WANG Yong,CAI Jianyu,MENG Chun,LIU Zhenyan,XUE Jingfeng |
School of Computer, Beijing Institute of Technology, Beijing 100081, China |
Abstract: |
Against the background and current state of Android malicious code detection, this paper examines the causes of the low efficiency and low accuracy of Android malware detection. Taking the malicious behavior of Android malware as a starting point, we combine static and dynamic analysis to extract features, including permissions and components, function call sequences, API call sequences, system commands, and network requests. We then apply information gain to filter out uninformative features and retain the most useful ones. A feature fusion method based on the Simhash algorithm is proposed to reduce the original high-dimensional feature set to a relatively small dimension, preserving classification accuracy while improving classification efficiency. The fused features are then classified with the GBDT and random forest algorithms to detect malicious samples. Finally, a series of comparative experiments is reported. The results show that the proposed method can greatly improve detection efficiency. |
Key words: Android malware detection; feature fusion; Simhash; GBDT; random forest |
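
The Simhash-based fusion idea described in the abstract can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes each feature category (permissions, API calls, system commands, network requests) is a dictionary of weighted tokens, hashes each category to a fixed-length fingerprint, and concatenates the fingerprints. The function names, the use of MD5 as the token hash, the 64-bit width, and the sample app are all assumptions made for the example.

```python
import hashlib


def simhash(tokens, bits=64):
    """Simhash fingerprint (as an int) of a dict mapping token -> weight.

    Assumes bits is a multiple of 8 and at most 128 (the MD5 digest size).
    """
    counts = [0] * bits
    for token, weight in tokens.items():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each token votes +weight or -weight on every bit position.
            if (h >> i) & 1:
                counts[i] += weight
            else:
                counts[i] -= weight
    # A bit is set in the fingerprint when its weighted vote is positive.
    return sum(1 << i for i in range(bits) if counts[i] > 0)


def to_bit_vector(fingerprint, bits=64):
    """Expand an integer fingerprint into a list of 0/1 values."""
    return [(fingerprint >> i) & 1 for i in range(bits)]


def fuse_features(feature_groups, bits=64):
    """Hash every feature group to `bits` values and concatenate them.

    Every group contributes the same number of dimensions, regardless of
    how many raw features it contains.
    """
    fused = []
    for name in sorted(feature_groups):
        fused.extend(to_bit_vector(simhash(feature_groups[name], bits), bits))
    return fused


def hamming_distance(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


# Hypothetical app with four feature categories.
app = {
    "permissions": {"SEND_SMS": 1, "INTERNET": 1, "READ_CONTACTS": 1},
    "api_calls": {"getDeviceId": 2, "sendTextMessage": 3},
    "commands": {"su": 1, "chmod": 1},
    "network": {"example.com": 1},
}
fused_vector = fuse_features(app)
```

Because each category collapses to the same fixed width, a category with thousands of raw features no longer outweighs one with a handful, which is one way to read the abstract's claim about feature imbalance. The resulting vector could then be passed to a GBDT or random forest classifier, for instance scikit-learn's `GradientBoostingClassifier` or `RandomForestClassifier`.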