Evaluation of Software Composition Analysis Tools for Java Language

SUN Dandan; PIAO Aihua; XIAO Yang; HUO Wei; SHI Wenchang; SUN Qing; ZHOU Chenfeng; NIE Zupei

Cite this article:

SUN Dandan, PIAO Aihua, XIAO Yang, HUO Wei, SHI Wenchang, SUN Qing, ZHOU Chenfeng, NIE Zupei. Evaluation of Software Composition Analysis Tools for Java Language[J]. Journal of Cyber Security, Accepted.

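SUN Dandan¹, PIAO Aihua¹, XIAO Yang¹, HUO Wei¹, SHI Wenchang², SUN Qing¹, ZHOU Chenfeng¹, NIE Zupei³
(1. Institute of Information Engineering, Chinese Academy of Sciences; 2. Renmin University of China; 3. Hangzhou Dianzi University)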

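Abstract (translated from the Chinese):

With the continued expansion of the open-source software ecosystem, the steady growth in the number of vulnerabilities, and increasingly complex dependency relationships, security problems caused by open-source software are attracting growing attention, and software composition analysis (SCA) tools have emerged in response. SCA tools fall into several technical categories; those targeting Java projects generally rely on package managers to identify third-party components and match them against the vulnerability database they use in order to report potential vulnerabilities. Owing to the limitations of existing techniques, the many tools that have emerged still fall short on key capabilities such as open-source component identification and vulnerability identification. This paper uses a benchmark dataset of 140 vulnerabilities manually annotated with their affected version ranges, and constructs a benchmark dataset covering the component inventories and vulnerability reachability of 38 projects. Along three dimensions (vulnerability database accuracy, open-source component detection rate, and vulnerability existence), it conducts an empirical evaluation of six open-source SCA tools and one commercial SCA tool, focusing on the quantified effectiveness and usability of SCA tools for Java and on the key challenges they face. The results show that: i) the accuracy of the vulnerability databases used by most tools is limited by their data sources; among the evaluated tools, OWASP and the commercial tool use the more accurate vulnerability databases, with a highest F1 score of 69.3%; ii) most tools use Maven to resolve component relationships; the F1 score of the evaluated tools for identifying direct dependencies ranges from 72.65% to 81.02%, and most missed direct dependencies are incorrectly reported as indirect dependencies of the project; the F1 score for identifying indirect dependencies ranges from 23.27% to 30.84%, and most missed indirect dependencies are optional dependencies of the project or dependencies introduced by them; iii) the actual reachability of the vulnerable functions reported by the evaluated tools ranges from 21.46% to 44.72%; tools should make greater use of vulnerable-code characteristics when identifying potential vulnerabilities in order to improve their usability.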

Keywords: software composition analysis; open-source components; dependency vulnerabilities; tool evaluation

DOI: 10.19363/J.cnki.cn10-1380/tn.2025.04.16

Received: 2023-10-12; Revised: 2024-05-21

Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program)

Evaluation of Software Composition Analysis Tools for Java Language

SUN Dandan¹, PIAO Aihua¹, XIAO Yang¹, HUO Wei¹, SHI Wenchang², SUN Qing¹, ZHOU Chenfeng¹, NIE Zupei³

(1. Institute of Information Engineering, Chinese Academy of Sciences; 2. Renmin University of China; 3. Hangzhou Dianzi University)

Abstract:

As the Open Source Software (OSS) ecosystem continues to expand, the number of vulnerabilities grows, and dependencies become increasingly complex, security issues arising from OSS have become a growing concern. In response, Software Composition Analysis (SCA) tools have emerged. These tools fall into several technical categories; those targeting Java projects commonly rely on package managers to identify Third Party Libraries (TPLs) and match them against vulnerability databases to report potential vulnerabilities. Owing to the limitations of existing techniques, many of the emerging tools still fall short on key capabilities such as TPL identification and vulnerability identification. We use a benchmark dataset of 140 CVEs manually annotated with their vulnerable version ranges, and construct a second benchmark dataset containing the TPL inventories and vulnerability reachability of 38 projects. Along three dimensions (vulnerability database accuracy, TPL detection rate, and vulnerability existence), we conduct an empirical evaluation of six open-source SCA tools and one commercial SCA tool. The study quantifies the effectiveness and usability of SCA tools for Java and highlights the key challenges they face. The key findings are as follows: i) The accuracy of the vulnerability databases used by most tools is limited by their data sources. Among the evaluated tools, OWASP and the commercial tool use the more accurate vulnerability databases, with a highest F1 score of 69.3%. ii) Most tools use Maven to resolve TPL dependencies. The F1 score for direct dependency identification ranges from 72.65% to 81.02%, and most missed direct dependencies are incorrectly reported as indirect dependencies of the project. The F1 score for indirect dependency identification ranges from 23.27% to 30.84%, and most missed indirect dependencies are optional dependencies of the project or dependencies introduced by them. iii) The actual reachability of the vulnerable functions reported by the evaluated tools ranges from 21.46% to 44.72%. Tools should make greater use of vulnerable-code characteristics when identifying potential vulnerabilities in order to improve their usability.
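
The abstract reports F1 scores but does not spell out how they are computed. As an illustration only, the following Python sketch assumes the standard definition (harmonic mean of precision and recall) applied to a tool's reported dependency set against a ground-truth set. The function name `f1_score` and the Maven-style coordinates are hypothetical and are not taken from the paper or from any evaluated tool.

```python
def f1_score(reported, ground_truth):
    """Return (precision, recall, f1) for reported items vs. a ground-truth set.

    Assumes the standard definitions; the paper's exact matching rules
    (e.g. how versions are compared) are not given in the abstract.
    """
    reported, ground_truth = set(reported), set(ground_truth)
    true_positives = len(reported & ground_truth)
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical direct dependencies of one project (group:artifact:version).
truth = {
    "org.example:a:1.0",
    "org.example:b:2.1",
    "org.example:c:0.9",
    "org.example:d:3.0",
}
# Hypothetical tool output: two correct hits, one false positive, two misses.
reported = {"org.example:a:1.0", "org.example:b:2.1", "org.example:x:1.1"}

precision, recall, f1 = f1_score(reported, truth)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Under this definition, a direct dependency that a tool reports only as an indirect one would count as a miss in the direct-dependency score, which is one way the misclassification described in finding ii) could lower the direct-dependency F1.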

Key words: Software Composition Analysis; Open Source Software; Dependency Vulnerability; Tool Evaluation