节点文献

高速网数据过滤若干关键技术研究

Key Techniques of Data Filtering on High Speed Network

【作者】 李云照

【导师】 王志英; 朱中梁;

【作者基本信息】 国防科学技术大学 , 计算机科学与技术, 2009, 博士

【摘要】 网络的高速发展给人们带来了很多便利,但同时也给人们的生活带来冲击和影响。其中,有害信息污染、网络攻击、网络犯罪等非法应用极大地妨碍了互联网的正常使用,也给人们的正常生活带来了威胁。目前,研制如何有效监控网络内容信息,发现网上有害信息,控制、处置以及打击各种网上违法犯罪活动的专用工具已成为各国政府、机构的迫切需要和重要任务。网络数据过滤是管理信息传播的一种重要手段,也是实现网络内容监控,滤除有害信息,维护网络信息安全的核心理论和关键技术,对支持国民经济发展和国防建设具有重大意义。国家骨干网的网络安全应用需要在高速环境下进行网络数据的深度检测并过滤有害信息,由于软件过滤算法的处理速度难以跟上网络发展的步伐,使用硬件对网络过滤进行加速成为一种必然的选择。本文针对字符串匹配、跨包匹配、规则匹配这三种常用的数据过滤技术展开研究,提出了多种新算法。以字符串匹配算法为基础设计并实现了一种硬件网络数据过滤器,然后扩展其功能,综合使用本文提出的多种算法设计了一款基于IP流的高速网数据过滤卡。仿真测试表明该卡是可用的、可实现的,验证了本文的相关研究工作。本文所取得的研究成果主要有:1.提出了三种适用于大规模字符串集的字符串匹配算法。(1)提出了一种存储优化的字符串匹配算法——MEPLPM算法。该算法以基于BF的PLPM算法为基础进行改进,扩大了字符串容量。(2)提出了一种查询与分析分离的BF型字符串匹配算法——DQAPLPM算法,改进了传统BF型匹配算法在数据中字符串出现率增加时匹配速度迅速降低的缺点。(3)提出了一种将hash计算与查表分离的动态可配置BF算法——DCBF算法。该算法能适应字符串集的变化,始终保持较高的匹配速度,增强了BF型匹配算法的灵活性和实用性。2.提出了一种基于分段BF和前缀保存的跨包匹配算法。该算法借鉴了使用PBF进行包尾前缀识别的思想,以适量增加保存数据长度为代价减小了存储开销,因而更适用于大规模字符串集。该算法同样保证了连接切换时保存的数据量不超过一个较小的限度,同时平均每包保存的数据量与原算法在同一量级。在该算法的研究中,还设计了一种将BF与CAM(Content Addressable Memory)结合的快速流识别方法,使跨包匹配算法能应用于实际网络中的多连接环境。3.提出了一种基于PVMatcher的快速规则匹配算法。该算法将二元规则运算转化为一对位向量操作,并使用硬件加速运算,能快速识别非匹配的字符串而且能够加快匹配字符串的处理速度,在输入字符串总数目较少时规则匹配速度明显高于传统规则匹配算法。分析表明,该算法的资源需求小,生成的中间结果少,连接切换时存取数据少,而且在网络数据中出现的字符串数目较少时匹配速度快,非常适合网络安全应用。4.在上述研究工作的基础上,设计了一种基于基于IP流的高速网数据过滤卡。该卡主要针对网络安全分析应用,无需进行协议还原,可以直接针对IP数据包进行分析并逐级过滤掉不关心的数据,从而可以大大减轻后继处理单元的负载,提高整体处理效率。详细的设计和仿真测试表明该卡是可靠的、可实现的。

【Abstract】 With the fast development of Internet, we get not only convenience, but also some bad influence. Some illegal activities on Internet including information pollution, attack and crime bring damages to Internet and threat the interest of legal users. Governments and organizations put much attention on techniques and tools for Internet content supervision, so that the abnormal data could be distinguished from normal data on Internet and criminal activities could be controlled or punished.Data filtering on Internet is an important method to control information transmitting and to maintain information security, by which harmful information could be monitored and filtered. With the development of broadband technique, the increasing network bandwidth has exceeded the increasing speed of CPU, which brings greatly challenges to the intrusion detection systems of backbone network. Since the speed of software process could not keep up with the demand of Internet speed, accelerating algorithms based on hardware are necessary for data filtering on Internet.In this thesis, we focused on several important data filtering techniques such as string matching, multi-packet matching and rule matching, and proposed some new algorithms in each field. Based on the proposed string matching algorithms, a hardware based Internet data filter was designed and implemented. Then we improved this data filter in functions and processing speed and designed an Internet data filter card. Simulation proved the usability and feasibility of our works.Primary innovative contributions of this thesis can be summarized as follows.1. We proposed three new string matching algorithms which could be fit for strings of large numbers. (1)Memory Efficient Parallel Longest Prefix Matcher——MEPLPM was proposed, which was based on PLPM and could contain more strings. (2) Decoupled Query and Analysis Parallel Longest Prefix Matcher——DQAPLPM was proposed, with which the scope of string occurrence rate would be extended compared with traditional BF based algorithms. (3)Dynamic Configurable Bloom Filters——DCBF was proposed, which could be applied to different string sets and increase the flexibility of Bloom Filter.2. We proposed a multi-packet matching algorithm based on DCBF and PBF. PBF is a multi-packet string matching algorithm which can work without defragment. Our new algorithm decreased the on-chip memory requirement of PBF with a little more off-chip memory cost. To fit for multi-flows on network, a fast flow state manage algorithm using BF and little CAM(Content Addressable Memory) was proposed to support this new algorithm.3. We proposed a fast rule matching algorithm based on PVMatcher. A 2-length rule was transformed as a pair of bit-vector operation, which can be accelerated by hardware. With PVMatcher, strings without matching any rules could be recognized quickly so the processing time could be reduced. Analysis shows that the memory requirement of PVMatcher is far less than that of traditional algorithms. The process speed of our new algorithm is obviously faster than traditional algorithms when the numbers of string is not large(eg: <20), which is consistent with the fact of Internet security applications.4. Based on the above researches, we designed an Internet data filter card. This card could filter data without defragment and work at very high speed. Tens thousand of strings and rules could be programmed into this card and most data in high speed network could be filtered. Elaborate design and simulation have been performed, proved the usability and feasibility of this card.

节点文献中: 

本文链接的文献网络图示:

本文的引文网络