节点文献

面向软件脆弱性分析的并行符号执行技术研究

Research on Software Vulnerability Analysis Oriented Parallel Symbolic Execution

【作者】 曹琰

【导师】 王清贤;

【作者基本信息】 解放军信息工程大学 , 计算机应用技术, 2013, 博士

【摘要】 符号执行技术在软件脆弱性分析领域的应用已经取得了较大的发展,相比于传统的模糊测试,在测试数据生成和程序执行路径分析方面具有较大优势,提高了脆弱性分析的能力。但是由于软件执行路径数巨大,在有限的资源条件下逐条路径分析效率较低,测试覆盖率难以提高,成为符号执行发展的瓶颈。随着高性能硬件平台和新型计算模型的发展,并行符号执行技术逐渐成为国内外研究的热点。在提高软件安全并行测试效率方面,算法的并行度及漏洞挖掘方法成为研究的关键问题。本文从符号执行、约束求解及其并行化技术等方面,对软件脆弱性分析进行了研究,主要工作可概括如下:1、在静态符号执行方面,为了提高符号执行的并行度,提出了以路径簇为负载单位的并行执行方法。通过程序依赖分析,将与程序目标点具有共享控制依赖条件多路径按规则归约成路径簇,以路径簇链为单位进行并行测试的负载调度,在有效减少实际执行路径数目的同时提高执行分析的并行度。2、为了处理环境交互问题,提出了基于混合输入的环境交互方法。通过定义程序执行的一致性模型,系统化地分析了符号执行与实际执行的逼近问题;提出了符号执行与具体执行的转换方法及维护一致性的启发式策略。通过区分可求解约束和复杂约束条件,提出了具体值和符号值混合代入法化简复杂约束的路径求解算法。3、在动态符号执行方面,为了提高针对特定目标的测试覆盖率,提出了基于敏感点逼近的并行测试方法。该方法结合了动静态分析两方面的优势。利用静态分析技术,提出了基于调用链回溯的敏感点到程序执行路径的距离计算方法,能够为导向型测试提供依据;设计了基于动态符号执行的分布式测试模型,实现了“距离分析——路径求解——用例生成——实际测试”反复迭代的测试过程。以敏感点为导向、以符号执行技术为基础的测试方法,加强了测试的针对性。4、在约束求解方面,提出了基于集中式存储的全局约束并行相容模型,并将该技术与回溯搜索结合,实现了并行求解方法。该模型利用动态分配约束条件的方法解决负载均衡问题,通过对变量域的集中式管理,保证了冲突检测的及时性,利用变量域剪枝单调性的特点,实现了异步相容检查,提高多节点间相容检查的并行程度。在并行搜索方面,依赖对搜索空间划分与调度,与相容技术进行融合,进一步提高了约束求解的效率。5、将静态符号执行技术应用于软件补丁比对,实现基本块级的语义差异分析。在函数级利用基于图形同构的匹配方法,找到最大同构子图;在此基础上,利用静态符号执行技术对基本块的输入输出行为进行分析,判断其功能的相似性,借以进行语义的差异分析。通过对比实验验证了匹配结果准确率的提升。本文提出的相关模型和算法已在课题组研制的大规模分布式并行漏洞挖掘系统中得到应用。实验结果表明了相关技术的可行性和有效性,对于提高软件脆弱性分析的效率有着很好的帮助。

【Abstract】 Symbolic execution in the field of software vulnerability analysis has made greatimprovement. Compared with traditional fuzzy testing, it has the advantage of test inputgeneration and execution path analysis, and improves vulnerability detection greatly. Becausethere is a large quantity of execution paths in software, it is low efficiency to analyze path one byone, and test coverage is difficult to improve. It becomes the bottleneck of symbolic execution.With the development of high performance hardware platform and new computing model,parallel symbolic execution technology becomes domestic and foreign research focus gradually.In order to improve parallel efficiency of software security test, the degree of parallelism of thealgorithms and the method of vulnerability detection has become the critical issues. The study onsoftware vulnerability analysis includes symbolic execution, constraint solution and parallelmethod of them in the thesis. The main task can be summarized as follows.1. In respect of static symbolic execution, the parallel method in which load unit consists ofpath families is proposed to improve the degree of parallelism. Through Program DependenceAnalysis, multi paths with control dependency which have the same symbolic value of targetstatement and are reducted into path family. Path family is unit of load scheduling in parallel test.Both the number of actual execution paths is reduced and the degree of parallelism of analysis ispromoted.2. In order to address the problem of interaction with environment, symbolic execution withmixed concrete symbolic input is proposed. The consistency model of program execution isdefined to analyze systematically approximation between symbolic execution and concreteexecution. The method for switching back and forth between symbolic execution and concreteexecution and the heuristic strategy of maintaining consistency are put forward. bydistinguishing solvable constraint from complex constraint and using concrete and symbolicvalues to simplify complex constraint, mixed path constraint solution algorithm is designed andrealized.3. In respect of dynamic symbolic execution, the sensitive point oriented test method forparallel approach is put forward to improve coverage of target statements. The method combinesthe advantages of both static and dynamic. Static analysis is used for realize the method forcomputing the distance from sensitive point to program execution trace based on call chainbacktrack. It provides the basis of oriented test. Distributed test model is designed based ondynamic symbolic execution, which achieves the iterative testing process, including distanceanalysis, path condition solution, cases generation and actual test. The test method combiningsensitive point oriented and symbolic execution strengthens the pertinence. 4. In respect of constant solution, Parallel Consistency Model for Global Constraints basedon Centralized Storage. As well as the model is integrated with backtrack search to parallelsolution method. Dynamic constraints distribution is used for addressing the issue ofload balancing. Collision is detected quickly due to centralized management of variable domain.Since pruning variable domain is monotonic, asynchronous consistency was realized to improvethe degree of parallelism. Parallel search is merged with parallel consistency by dividing andscheduling search space, to improve constraint solving efficiency.5. Static symbolic execution used in software security patches Comparison for the purposeof semantic differences analysis in basic block level. Through traditional structural comparison,syntax differences in function level are analyzed to find the maximum common subgraph. On thebasis, input and output behavior of basic block is analyzed by static symbolic execution. Thenthe I/O behavior is used for determining functional similarity so as to find semantic differences.Ultimately, the experimental results show the accuracy of match result is improved.Models and algorithms included in the thesis have been applied to Large scale DistributedParallel Vulnerability Detection System developed by our team. The experimental results showthat related technologies are feasible and effective and provide support for improve vulnerabilityanalysis efficiency.

节点文献中: 

本文链接的文献网络图示:

本文的引文网络