引用本文
  • 李春放,文雨,孟丹.面向操作数的路径敏感二进制依赖分析方法[J].信息安全学报,已采用    [点击复制]
  • LI Chunfang,WEN Yu,MENG Dan.Operand-Oriented Path-Sensitive Binary Dependence Analysis[J].Journal of Cyber Security,Accept   [点击复制]
【打印本页】 【下载PDF全文】 查看/发表评论下载PDF阅读器关闭

过刊浏览    高级检索

本文已被:浏览 22次   下载 0  
面向操作数的路径敏感二进制依赖分析方法
李春放1, 文雨1, 孟丹2
0
(1.中国科学院信息工程研究所信息安全国家重点实验室;2.中国科学院信息工程研究所)
摘要:
在二进制分析中,识别指令间的数据依赖是一项基础任务,其中最主要的挑战来自于由循环和递归结构引起的路径爆炸问题,这阻碍了跨所有执行路径追踪内存状态。现有方法通过路径不敏感的设计规避这一挑战,以避免穷尽式路径探索为代价牺牲分析的可靠性和精确性。为了在保持路径敏感性的同时解决路径爆炸问题,本文将二进制依赖分析重新定义为一种面向操作数的图遍历问题。本文提出的方法聚焦于追踪可行路径片段,为每个源操作数识别所有作为数据源的目的操作数,从而消除沿每条执行路径追踪内存状态的需求。 本文提出OpTrack方法,这是一种面向操作数的二进制依赖分析方法,具有三个关键创新:(1) 分层控制流图(HPG)表示,整合函数调用图与过程内控制流图,并通过基于静态单赋值(SSA)的中间语言增强基本块内的抽象解释能力;(2) 跨块依赖解析机制,采用有向无环图(DAG)表示地址表达式和分支约束,实现带路径可行性验证的逆向数据流追踪;(3) 迭代式间接分支处理框架,通过记忆化搜索和依赖驱动的目标推断动态优化控制流。 在GNU Coreutils和SPEC CINT2000基准测试上的实验评估证明了OpTrack相对于现有方法的优越性。与基线方法angr-VSA相比,OpTrack平均减少64.78%的误报依赖并提升11.41%的召回率。在分析SPEC CINT2000中176.gcc等大型样本时,相较BDA方法减少99.39%的冗余依赖。本方法以最高96.84%的召回率优势超越最新方法DueForce。通过间接分支目标解析和程序崩溃根因诊断的实际应用,进一步验证了OpTrack的有效性。 技术贡献包括:(1) 将路径敏感的二进制依赖分析形式化为具有新型操作数/分支约束表示的图遍历问题;(2) 首个基于HPG双向值追踪的剥离二进制文件多项式时间路径敏感分析框架;(3) 全面验证表明相较三种现有方法,本方法在精确率和召回率方面均取得持续改进。
关键词:  依赖分析  二进制分析  静态分析  符号执行
DOI:
投稿时间:2024-12-17修订日期:2025-03-05
基金项目:
Operand-Oriented Path-Sensitive Binary Dependence Analysis
LI Chunfang1, WEN Yu1, MENG Dan2
(1.State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences;2.Institute of Information Engineering, Chinese Academy of Sciences)
Abstract:
In binary analysis, identifying data dependencies between instructions constitutes a fundamental task, with the primary challenge stemming from path explosion caused by loops and recursive structures, which hinders tracking memory states across all execution paths. Existing approaches circumvent this challenge through path-insensitive designs that sacrifice analysis soundness and precision by avoiding exhaustive path exploration. To address path explosion while preserving path sensitivity, this paper reformulates binary dependence analysis as an operand-oriented graph traversal problem. The methodology proposed in this paper focuses on tracing feasible path fragments to identify all destination operands serv-ing as data sources for each source operand, thereby eliminating the need to track memory states along every execution path. This paper presents OpTrack, an operand-oriented binary dependence analysis method featuring three key innovations: (1) A hierarchical control flow graph (HPG) representation integrating function call graphs and intra-procedural control flow graphs, enhanced with a static single assignment (SSA) based intermediate language for abstract interpretation within basic blocks; (2) An inter-block dependency resolution mechanism employing directed acyclic graphs (DAGs) for address expressions and branch constraints, enabling backward data flow tracing with path feasibility verification; (3) An iterative indirect branch handling framework that dynamically refines control flow through memoized search and dependen-cy-driven target inference. Experimental evaluations on GNU Coreutils and SPEC CINT2000 benchmarks demonstrate OpTrack's superiority over existing methods. Compared to the baseline approach, angr-VSA, OpTrack achieves an average 64.78% reduction in false positive dependencies while improving recall by 11.41%. When analyzing large-scale samples like 176.gcc from SPEC CINT2000, OpTrack reduces redundant dependencies by 99.39% compared to BDA. The method outperforms the state-of-the-art DueForce approach with up to 96.84% higher recall rates. Practical applications in indirect branch target resolution and crash root cause diagnosis further validate the effectiveness of OpTrack. The technical contributions include: (1) Formalization of path-sensitive binary dependence analysis as a graph traversal problem with novel operand/branch constraint representations; (2) The first polynomial-time path-sensitive analysis framework for stripped binaries through bidirectional value tracking on HPG; (3) Comprehensive validation showing consistent precision and recall improvements compared to three existing methods.
Key words:  dependence analysis, binary analysis, static analysis, symbolic execution