基于正则表达式的深度包检测研究

Research on Regular Expression Based Deep Packet Inspection

【Author】 Zhang Na (张娜)

【Advisor】 Zheng Jun (郑骏)

【Author information】 East China Normal University, Computer Applications, 2007, Master's thesis

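【Abstract (translated from the Chinese)】 As attacks aimed specifically at the application layer become more common, traditional stateful-inspection firewalls are growing less effective. The functional focus of firewalls has shifted from the network layer to the application layer, which gave rise to deep packet inspection (DPI). DPI examines not only the network-layer and transport-layer headers of a packet but also the content carried in its application-layer payload, searching for permitted or prohibited content to decide whether the packet may pass. As DPI has developed, the pattern sets traditionally used to filter packet content (sets of literal match strings) have gradually been replaced by sets of regular expressions. For example, L7-filter (Linux Application Protocol Classifier) identifies application-layer packets with a pattern set built from regular expressions, and intrusion detection systems such as Snort and Bro have also adopted regular expressions in their rule sets. However, although regular expressions outperform plain strings in pattern matching, a typical pattern set in current network applications consists of hundreds of regular expressions and tens of thousands of states. Compiling such a pattern set into a single finite automaton can require hundreds of megabytes of memory, or even several gigabytes, so regular-expression-based DPI responds too slowly and detection efficiency suffers badly. How to make regular-expression-based DPI more efficient is still an open research question both in China and abroad, and the work in this thesis was carried out against that background. The thesis first analyzes how traditional firewalls work and then introduces the new generation of intelligent firewalls that use DPI. It next describes packet filtering and intrusion prevention and detection in detail, and on that basis explains the operating principle of DPI. After analyzing the strengths and weaknesses of common pattern matching algorithms, it proposes a new matching algorithm based on regular expressions. Building on an in-depth analysis of how the number of DFA (Deterministic Finite Automaton) states affects performance, it further proposes an algorithm for constructing the optimal number of DFA states, which guarantees the minimum time complexity under any finite amount of system resources. The author implemented the algorithm on Linux and ran extensive detection experiments on network packets using the L7-filter pattern set. The experimental data show that the algorithm has the lowest time complexity among the existing algorithms it was compared with.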

【Abstract】 Traditional stateful firewalls cannot provide enough protection against application-level attacks. The focus of firewall functionality has moved from the network layer to the application layer, and DPI (Deep Packet Inspection) technology was developed as a result. DPI examines not only packet headers but also the application-level contents of packets. Traditional DPI based on string sets is being replaced by DPI based on regular expression sets. For example, in the Linux Application Protocol Classifier (L7-filter), all protocol identifiers are expressed as regular expressions. Similarly, the Snort and Bro intrusion detection systems use regular expressions as their pattern language. Although regular expressions are effective and flexible, a typical set in current network applications contains hundreds of regular expressions and tens of thousands of DFA (Deterministic Finite Automaton) states, which results in a storage requirement of hundreds of megabytes or even gigabytes. As a result, the response time of regular expression based DPI increases and its performance degrades dramatically. How to improve the efficiency of regular expression based DPI is still an open problem worldwide. Based on an analysis of traditional firewalls, the author introduces DPI-based firewalls. The operating principle of DPI is described from the viewpoints of packet filtering and intrusion detection. After analyzing the merits and demerits of classical pattern matching algorithms, the thesis proposes a new pattern matching algorithm based on regular expressions. Based on an analysis of how the number of DFA states affects performance, the algorithm is further improved by introducing a DFA state number optimization algorithm. The proposed algorithm has been implemented in a Linux environment and tested in a large number of experiments. Experimental results show that the proposed algorithm performs much better than the existing algorithms it was compared with.
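The growth in DFA state count that the abstract describes can be illustrated with a small, self-contained sketch. It is not the thesis's algorithm. The pattern `(a|b)*a(a|b){n}` is a textbook example chosen here as an assumption, and the names `nfa_step` and `count_dfa_states` are hypothetical. The sketch runs the standard subset construction and counts the reachable DFA states for a pattern whose NFA has only n + 2 states.

```python
from collections import deque


def nfa_step(states, ch, n):
    """Advance the NFA for (a|b)*a(a|b){n} by one input character.

    NFA states are 0..n+1: state 0 is the start state, n+1 is accepting.
    """
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)        # self-loop for the leading (a|b)*
            if ch == "a":
                nxt.add(1)    # guess that this 'a' is the marked one
        elif s <= n:
            nxt.add(s + 1)    # consume one of the n trailing characters
        # state n + 1 has no outgoing transitions
    return frozenset(nxt)


def count_dfa_states(n):
    """Count the DFA states reachable through subset construction."""
    start = frozenset({0})
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for ch in "ab":
            nxt = nfa_step(cur, ch, n)
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen)


if __name__ == "__main__":
    for n in range(1, 6):
        # columns: n, NFA state count, reachable DFA state count
        print(n, n + 2, count_dfa_states(n))
```

For this pattern the DFA has to remember which of the last n + 1 characters were `a`, so the reachable state count is 2^(n+1) while the NFA grows only linearly. Combining many such patterns into one automaton multiplies the effect, which is the memory pressure the abstract refers to.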

  • 【Classification codes】TP393.08; TP391.4
  • 【Times cited】15
  • 【Downloads】679