流量分析与流记录分析系统的研究与实现

Research and Implementation on Traffic Classification and Mass Flow Logs Analysis System

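【Author】 Yuan Lun (袁仑)

【Advisor】 Yang Jie (杨洁)

【Author information】 Beijing University of Posts and Telecommunications, Signal and Information Processing, 2013, Master's thesis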

【Abstract】 In recent years, the Internet in China, and the mobile Internet in particular, has developed rapidly. By the end of June 2012, the number of Internet users in China had reached 538 million, and the Internet penetration rate was 39.9%. Network traffic monitoring has become an important means for operators to manage and run their networks. As network applications diversify, however, identifying and classifying network traffic faces major challenges. Which identification method, or combination of methods, can identify traffic accurately while keeping the misclassification rate low has become an active research topic. As line rates rise, the volume of traffic data grows sharply, and conventional analysis methods can no longer meet the demands of massive traffic data analysis. The MapReduce programming model proposed by Google has become an important method for massive data analysis; the open-source Hadoop distributed platform cloned this model, has been recognized by both academia and industry, and has become an important tool for processing massive data. This thesis first introduces network traffic identification techniques, including deep packet inspection and deep flow inspection. It then introduces massive data analysis platforms, in particular the Hadoop system and its application to traffic analysis. Building on this study of traffic identification techniques, we developed the Traffic Analysis and Classification System (TACS). The thesis describes the main functions of TACS, its overall design, and the design of its key sub-modules. To analyze massive traffic data, we developed LogAnalyser, a Hadoop-based system for analyzing massive traffic data that makes such processing quick and convenient. The thesis likewise describes the main functions of LogAnalyser, its overall design, and the design of its key sub-modules. Finally, TACS and LogAnalyser are used to analyze packet data and flow-record data, respectively, in order to study the traffic characteristics of P2P streaming services in ADSL and CDMA networks, and the service distribution and network quality characteristics of a GPRS network.
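
As a rough illustration of the MapReduce model the abstract refers to, the sketch below totals bytes per application over a handful of flow records using separate map, shuffle, and reduce steps. It is not LogAnalyser's implementation and does not use Hadoop: the record layout (`src_ip,dst_ip,application,bytes`), the sample data, and all function names are assumptions made for this example only.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical flow-record layout (not taken from the thesis):
# "src_ip,dst_ip,application,bytes"
SAMPLE_RECORDS = [
    "10.0.0.1,10.0.0.9,p2p_streaming,1200",
    "10.0.0.2,10.0.0.9,web,300",
    "10.0.0.1,10.0.0.7,p2p_streaming,800",
]


def map_record(line):
    """Map phase: emit one (application, bytes) pair per flow record."""
    _src, _dst, app, nbytes = line.strip().split(",")
    yield app, int(nbytes)


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_app(app, values):
    """Reduce phase: total the byte counts for one application."""
    return app, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in-process over an iterable of records."""
    pairs = chain.from_iterable(map_record(line) for line in lines)
    return dict(reduce_app(key, values) for key, values in shuffle(pairs).items())


totals = run_job(SAMPLE_RECORDS)
print(totals)
```

On a real cluster the map and reduce functions would run on many nodes, with the framework handling the grouping step; here everything runs in one process to keep the data flow visible.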
