

Research and Implementation of Non-Default Port Based Network Protocol Identification System

【Author】 Wu Pengchong (吴鹏冲)

【Advisor】 Ma Yan (马严)

【Author information】 Beijing University of Posts and Telecommunications, Computer Science and Technology, 2009, Master's

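【Abstract, translated from the Chinese】 With the rapid growth of the Internet, it has become the most important component of international commercial cooperation, information exchange and the development of new technologies. The continual emergence of increasingly diverse applications has greatly changed the structure and patterns of Internet traffic, so the analysis of network applications faces serious challenges, and the accuracy of that analysis strongly affects the results of traffic analysis and prediction. Yet industry research on network service classification still lags far behind the pace at which services develop. First-generation protocol identification usually relied on port numbers: because the services of that time strictly followed the port numbers assigned by IANA, port-based identification was both accurate and fast enough for real-time service classification. As new services keep emerging, however, they have begun to show disguised and dynamic behaviour, and they may also use user-defined or dynamic ports, which leaves the original port-based identification ineffective. After studying the TCP/IP protocol stack, network protocol identification, network traffic management and Linux network programming, the author proposes an effective scheme for identifying network protocols on non-default ports, building on existing identification techniques. The main research contents are as follows: (1) The development background of network protocols, the current state of protocol identification tools and the significance of non-default port protocol identification are introduced and discussed. (2) For the FTP, HTTP, TELNET and SSH protocols, effective non-default port identification schemes are proposed, which introduce the new concepts of an initial condition table and an extended condition table. (3) Drawing on the characteristics of several application-layer protocol identification schemes, a non-default port identification framework applicable to multiple network protocols is proposed. (4) To achieve load balancing, the system adopts a highly flexible scheduling strategy as a prototype of its traffic scheduling mechanism, aiming to improve the efficiency and stability of the system. (5) On the basis of the above results, a non-default port network protocol identification system is designed and implemented; it offers high identification accuracy, load-balancing support, highly extensible identification schemes and broad application prospects.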

【Abstract】 With the rapid development of the Internet, it has become the most important component of international commercial cooperation, information exchange and the development of new technologies. However, the emergence of increasingly diverse applications has dramatically changed the structure and patterns of network traffic, so that the analysis of network applications faces severe challenges. The accuracy of network application analysis therefore significantly affects the results of traffic analysis and prediction. At present, research on network application identification cannot keep pace with the development of the applications themselves. First-generation network protocol identification was usually based on default port numbers. Because most applications at that time strictly complied with the IANA port number allocation, port-based identification was not only accurate but also fast enough for real-time application classification. However, as new applications continue to emerge, they have begun to show camouflaged and dynamic characteristics. In addition, these applications may use self-defined or dynamic ports. Port-based protocol identification therefore becomes ineffective. In this thesis, after studying the TCP/IP protocol stack, network protocol identification, network traffic management and Linux network programming, an effective mechanism for identifying network protocols on non-default ports is proposed on the basis of existing protocol identification techniques.
The main research contents are as follows: (1) Introduce and discuss the background of network protocols, the development status of network protocol identification tools and the significance of non-default port based network protocol identification. (2) Propose effective identification mechanisms for FTP, HTTP, TELNET and SSH, which use the new concepts of an initial condition table and an extended condition table. (3) Propose a non-default port based identification framework for multiple network protocols by integrating the characteristics of several application-layer protocol identification mechanisms. (4) Adopt a highly flexible scheduling strategy as a prototype of the traffic scheduling mechanism in order to achieve load balancing, which improves the efficiency and stability of the system. (5) On the basis of the above results, design and implement a non-default port based network protocol identification system, which features high identification accuracy, support for traffic load balancing, extensible protocol identification schemes and broad application prospects.
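
As a rough illustration of why port-based identification fails on non-default ports, the following minimal Python sketch contrasts a port lookup with a check on the first bytes of a connection. It is not the thesis's mechanism: the initial and extended condition tables are not described in this abstract, and the function names and the particular byte patterns chosen here are assumptions made for illustration only.

```python
# Illustrative sketch only; not the identification scheme proposed in the thesis.
# All names below (DEFAULT_PORTS, identify_by_port, identify_by_payload) are hypothetical.

# IANA default ports for the four protocols named in the abstract.
DEFAULT_PORTS = {21: "FTP", 22: "SSH", 23: "TELNET", 80: "HTTP"}

# A few common HTTP request methods; the list is deliberately incomplete.
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")

# Telnet negotiation: IAC (0xFF) followed by WILL/WONT/DO/DONT (251..254).
TELNET_IAC = 0xFF
TELNET_NEGOTIATION = (251, 252, 253, 254)


def identify_by_port(port):
    """First-generation approach: trust the port number alone."""
    return DEFAULT_PORTS.get(port, "UNKNOWN")


def identify_by_payload(payload):
    """Guess the protocol from the first bytes seen on a TCP connection."""
    if payload.startswith(b"SSH-"):
        # SSH peers open with an identification string such as "SSH-2.0-...".
        return "SSH"
    if payload.startswith(HTTP_METHODS) or payload.startswith(b"HTTP/"):
        # An HTTP request line or the start of an HTTP status line.
        return "HTTP"
    if (len(payload) >= 3 and payload[0] == TELNET_IAC
            and payload[1] in TELNET_NEGOTIATION):
        return "TELNET"
    if payload.startswith((b"220 ", b"220-")):
        # FTP servers greet with reply code 220, but so do SMTP servers,
        # so a real system would need further evidence before deciding.
        return "FTP"
    return "UNKNOWN"


if __name__ == "__main__":
    # An SSH server listening on port 2222 defeats the port lookup
    # but is still recognisable from its banner.
    print(identify_by_port(2222))
    print(identify_by_payload(b"SSH-2.0-OpenSSH_8.9\r\n"))
```

A check of this kind is only a starting point: a single pattern can be ambiguous (as with the 220 greeting), which is one reason a practical system combines several conditions per protocol.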
