节点文献

缓冲区溢出漏洞的静态检测方法研究

Research on Static Analysis Methods to Detect Buffer Overflows Vulnerability

【作者】 夏一民

【导师】 张民选; 罗军;

【作者基本信息】 国防科学技术大学 , 计算机科学与技术, 2007, 博士

【摘要】 程序运行时的缓冲区溢出将导致程序行为异常。恶意用户可以利用服务程序中的缓冲区溢出漏洞,攻击目标计算机系统,窃取或破坏其中的敏感信息,甚至控制该系统向其它主机发动攻击。在每年数十万的网络攻击事件中,大约50%与缓冲区溢出漏洞有关,每年给计算机用户造成的经济损失超过100亿美元。缓冲区溢出漏洞已经成为一种最危险和最常见的软件安全漏洞。静态检测方法通过分析程序源码,发现其中的缓冲区溢出漏洞,可以从根本上提高软件的安全性。和动态检测方法相比,静态检测方法具有不降低程序执行效率、无兼容性问题、可在软件开发早期应用等优点。缓冲区溢出漏洞的静态检测正在成为信息安全领域的一个研究热点。本文的目标是研究不依赖用户注释,同时具有高检测精度和高检测效率的缓冲区溢出漏洞静态检测技术。在已有的静态检测方法中,数据流分析和约束分析方法具有不依赖用户注释和高效的特点。数据流分析沿控制流传播约束变量的范围,当传播达到不动点时得到分析的结果。由于数据流分析不能跟踪控制条件,特别是传统的加宽/收窄技术不考虑循环控制条件中约束变量之间的数据依赖关系,导致采用数据流分析的缓冲区溢出漏洞检测方法的精度不高。约束分析通过约束状态的建立和求解发现程序性质。如果循环定义的约束变量之间存在相互依赖的关系,那么包含这些依赖关系的约束状态将无解。此外,基于约束分析的已有静态检测系统忽略指令执行条件的传播和求解,这也降低了它们的检测精度。本文从提高缓冲区溢出漏洞静态检测的精度和效率入手,针对上述问题展开深入研究,主要研究工作和创新包括:(1)针对已有数据流分析方法不能精确分析循环的问题,提出一种基于反向路径的循环分析方法。缓冲区溢出和循环有着密切的联系。使用加宽/收窄操作能够快速建立循环的近似解,但是已有方法在分析循环控制条件时不考虑约束变量之间的数据依赖关系,分析精度不高。本文提出一个基于反向路径的循环分析框架和一组双向数据流方程,以及一个基于此框架和数据流方程的循环分析算法。对于循环内可能改进分析结果的控制转移节点,该算法首先进行后向数据流分析,并沿反向路径传播节点的控制条件,再进行前向数据流分析,改进控制转移节点的数据流状态,提高循环分析的精度。(2)针对穷尽约束产生方法建立的约束状态中存在大量冗余约束条件的问题,提出一种需求驱动的约束产生方法。传统的穷尽约束产生方法计算每条语句的约束信息。由于程序中的许多指令是与内存访问无关的,所以这种方法产生了许多冗余的约束条件。本文提出的需求驱动约束产生方法仅为内存访问指令维护约束状态,并且仅考虑那些与缓冲区溢出漏洞检测有关的约束信息,从而减少了需要建立的约束状态和约束条件的数量,提高了约束产生的效率。(3)针对已有约束状态安全检查方法求解精度不高的问题,提出一种范围约束和控制约束相互改进的约束状态检查方法。该方法通过一个“求解一检查一改进”的迭代过程,不断缩小约束变量在范围约束和控制约束下的取值范围,逐步提高溢出表达式的计算精度。由于迭代过程充分利用了控制约束,所以本方法能够获得比以往方法更精确的检测结果。同时,迭代可以在求解结果足以确定指令安全性时停止,避免不必要的改进。如果程序中存在循环或递归,那么本文的约束状态安全检查算法还通过打破范围依赖环路的方法,消除使得约束状态无解的约束条件,其效率明显高于使用IIS的传统方法。(4)针对单一静态分析方法不能同时获得高检测精度和高检测效率的问题,本文提出一种基于反例的溢出漏洞静态检测方法。该方法是一种层次式检测方法,先通过流敏感和上下文敏感的快速检测,发现可能的缓冲区溢出漏洞,再在快速检测结果的指导下,进行路径敏感和上下文敏感的精确检测,消除快速检测引入的误报,建立可导致缓冲区溢出的具体反例。由于反例的建立是在快速检测结果的指导下完成的,因而这种方法比传统的精确检测方法更高效,又不会降低检测的精度。为了消除数据流分析或约束分析各自在检测缓冲区溢出漏洞时的不足,该方法还将数据流分析建立的循环分析结果和约束分析建立的约束状态合并,组成统一的约束状态。(5)根据本文提出的分析算法,我们实现了一个缓冲区溢出漏洞静态检测的原型系统,并对它的总体和各个部分进行了性能讦测。实验结果表明,本原型系统具有非常高的检测精度和检测效率,总体性能优于已有的典型缓冲区溢出漏洞静态检测系统。综上所述,本文针对缓冲区溢出漏洞静态检测的几个关键问题进行了探索,取得了一定的研究成果。对于提高溢出漏洞静态检测的精度和效率具有重要的理论意义和应用价值。

【Abstract】 Buffer overflows at run-time can lead to non-deterministic program behavior, which may be used by malicious attackers to compromise a computer, such as to access or corrupt sensitive information, and even worse, to control the computer to attack others. Every year, hundreds of thousands of network attacks happen, about fifty percent of which are related to buffer overflow vulnerabilities. And the economic waste caused by this kind of attacks is more than 10 billion dollars. Buffer overflow has become the most dangerous and most common software security vulnerability.Static analysis can be used to detect vulnerability by analying source codes. It helps developers in eliminating buffer overflows before software is deployed and improve software security essentially. Compared with dynamic analysis, static analysis methods have such advantages as not slowing down program execution, not leading to compatibility problems and being processed at early stage of software development. Static detection for buffer overflow (SDBO) is becoming a hot topic in the area of information security.The goal of the thesis is to present very high precise and efficient static analysis methods for buffer overflow detection without depending on user annotations. Among the existing static analysis methods, data flow analysis and constraint analysis don’t depend on user annotations, and are very efficient at the same tiime. Data flow analysis propagates ranges of constraint variables along control flow, and gains solutions when fixed point is arrived. But data flow analysis method does not trace control conditions. Especially, the traditional widening/narrowing method does not consider data dependence between constraint variables of loop control conditions. Therefore, the precision of existing data flow analysis is poor. Constraint analysis finds program characteristics by building and solving constraint states of the program. If there are co-dependent relations among constraint variables defined in a loop, the constraint states including these relations are infeasible. In addition, the existing constraint-based static detection systems ignore the propagation and solution of execution conditions of instructions, which makes the detection precision very low.To improve the precision and efficience of static analysis methods for buffer overflow vulnerabilities, detailed researches have been done. The main contributions are as follows:(1) To improve the precision of data flow loop analysis, a loop analysis method based on reverse paths is proposed. Buffer overflows are usually related to loops. With the widening/narrowing operator, traditional loop analysis methods can obtain approximate solutions of loops fast, but these methods ignore data dependence between constraint variables, and leads to poor imprecise. In this dissertation, a framework based on bidirectional data flow analysis and the corresponding bidirection dataflow equations are established. On the basis of the framework and the equations, a loop analysis algorithm is proposed. For control nodes that may improve the result of loop analysis, the algorithm first propagates control conditions along reverse paths, and then makes a forward data flow analysis to improve their data flow state. By this way, the precision of the loop analysis is improved.(2) To avoid the redundant constraint conditions that are generated by exhaustive constraint analysis, a demand-driven constraint generating method is proposed. Traditional constraint analysis methods compute constraint information for every instruction. Because many instructions are not related to memory access, the exhaustive method should generate many redundant constraints for buffer overflow detection. Our demand-driven constraint generation approach only maintains constraint states for memory access instructions, and only considers constraint information that is related to buffer overflow detection. Therefor, the number of constraint states and constraint conditions is decreased and the efficiency of constraint generation is improved.(3) To improve the solution precision of constraint state computing, a constraint state computing algorithm which mutually refines the solution of range constraints and control constraints is proposed. Through an iteration of "resolution - check - refinement", the algorithm reduces ranges of constraint variables with range constraints and control constraints, and improves resolution precision of overflow expression step by step. It is more precise than existing ones benefit from control constraints. At the same time, it stops the iteration as soon as the memory access is decided to be safe, avoiding unnecessary computation. To eliminate constraint conditions that make range constraints infeasible, this algorithm builds range dependency graph of the range constraints and breaks loops in the graph, which is more efficient than the tradition methods of using IIS.(4) To overcome the shortcoming of single static analysis mode that cannot gain both high precision and high efficiency, a hierarchy analysis algorithm based on counter-example is proposed. As a hierarchy analysis approach, it first makes a fast detection with flow sensitive and context sensitive analysis to find possible buffer overflow vulnerabilities, then makes a precision detection with path and context sensitive analysis guided by the former result to reduce false positives and identifying the concrete path of inducing buffer overflow. Since the counter examples are constructed under the guide of the fast analysis, the algorithm is more efficient than traditional path-sensitive algorithms, and don’t sacrifice precision at the same time. To eliminate the imprecision of buffer overflow static detection with data flow analysis and constraint analysis, the algorithm forms the analysis state of memory access instructions by combining the result of data flow loop analysis and the constraint state of constraint state of constraint analysis.(5) A prototype system is designed and implemented with technologies described above. The experiment results show that the prototype system can detect buffer overflow vulnerabilities automatically with high precision and high efficiency. In the mass, the capability of the prototype system is better than existing classical systems.To sum up, in this dissertation, several key problems of SDBO have been studied and some contributions have been achieved. Our contributions listed above are helpful for enhancing the performance of SDBO in both theory and practice.

节点文献中: 

本文链接的文献网络图示:

本文的引文网络