基于Hadoop的网络海量数据采集及处理平台开发

Platform Development on Massive Data Collection and Processing Based on Hadoop

【Author】 Zhou Tianjun (周天君)

【Advisor】 Cui Hongyan (崔鸿雁)

【Author Information】 Beijing University of Posts and Telecommunications, Communication and Information Systems, 2013, Master's thesis

【Abstract】 As mobile networks and the Internet converge, the data services that users rely on have grown increasingly diverse and have become the main means of information transfer. This service data travels over the Internet as IP datagrams. Current network quality indicators based on network management systems (NMS) cannot effectively manage and control services according to the characteristics of user behavior, nor do they accurately reflect that behavior. It is therefore necessary to capture IP packets continuously and to study both a framework for analyzing user behavior characteristics and a system for evaluating and analyzing the patterns of data services, improving the network's ability to predict and perceive service and user characteristics and promoting the development of controllable and manageable future networks. Network packet capture is the foundation for meeting this need and is of great significance to subsequent data processing and to the analysis of user behavior characteristics. As network data collection proceeds, the volume of data accumulates rapidly; this massive data shapes the research and design of the processing system and places a severe strain on database server resources. A single database system can no longer handle all data analysis and processing, so storage and processing capabilities must be enhanced to meet the requirements of a big data environment. Only accurate analysis reveals the value of the data and serves the user behavior analysis framework and future network research. Such analysis helps to characterize network behavior accurately, to guide practical network deployment and effective traffic control, and to advance research on service-oriented future Internet architectures and mechanisms. This thesis investigates these areas, covering: (1) high-speed link packet capture technology; (2) massive data storage technology; (3) massive data analysis technology; and (4) data feature analysis and presentation.
