Cite this article
  • LIU Jingwen, JIN Wuxia, QU Yu, JIN Yangxu, FAN Ming. Research on Network Embedding Features for Software Defect Prediction[J]. Journal of Cyber Security, 2021, 6(3): 29-53
Research on Network Embedding Features for Software Defect Prediction
LIU Jingwen1, JIN Wuxia1,2, QU Yu3, JIN Yangxu4, FAN Ming1
(1. Key Laboratory of Intelligent Network and Network Security, Ministry of Education, Xi'an Jiaotong University, Xi'an 710049, China; 2. School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China; 3. Department of Computer Science and Engineering, University of California, Riverside, CA 92521, USA; 4. Bank of China, Software Center, Xi'an 710038, China)
Abstract:
Prior studies predict software defects with network models built from code dependencies, revision history, and collaborative development relationships. Recently, network embedding techniques have been widely applied to software network analysis and have significantly improved defect prediction performance. Our study finds that the combination of software associated network and network embedding algorithm affects defect prediction performance. Specifically, this work constructs three kinds of software associated networks (Class Dependency Network, Change Coupling Network, and Developer Contribution Network), applies six kinds of network embedding methods, and analyzes the software structural features each method preserves and their effect on defect prediction performance. Experiments on 12 open-source Java systems show that: on the Class Dependency Network and the Change Coupling Network, combining traditional metric features with network embedding features significantly improves defect prediction; DeepWalk, GraRep, and Node2vec are better at learning network homophily and yield better prediction performance; and both the learned embedding features and the resulting prediction performance are sensitive to the parameter settings of the embedding algorithm. These results can guide the selection of software associated networks and network embedding methods for defect prediction.
Key words:  defect prediction  network embedding  software associated network
DOI: 10.19363/J.cnki.cn10-1380/tn.2021.05.03
Received: 2020-06-29; Revised: 2020-09-28
Funding: This work was supported by the National Key R&D Program of China (No. 2018YFB1004500), the National Natural Science Foundation of China (Nos. 61632015, 61772408, U1766215, 61721002, 61833015, 62002280, 61902306, 61602369), the Science and Technology Project of State Grid Shaanxi Electric Power Company (No. 5226SX1800FC), the Innovative Research Team program of the Ministry of Education (No. IRT_17R86), the China Knowledge Centre for Engineering Sciences and Technology project, and the China Postdoctoral Science Foundation (Nos. 2019TQ0251, 2020M673439).