
骨干通道上的网络论坛通信信息监测和分析的关键技术研究

Research of Key Technologies to Monitor and Analyze the BBS Information Transferred Through Backbone Network

【Author】 Wu Chengrong (吴承荣)

【Advisor】 Zhang Shiyong (张世永)

【Author information】 Fudan University, Computer Application Technology, 2011, PhD

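【Abstract, translated from the Chinese】 After nearly twenty years of rapid development, the Internet has gradually evolved from a convenient communication tool into a virtual society. Online forums (BBS) are an important part of this virtual society: opinions posted on them spread widely and quickly and, to some extent, steer public opinion. Monitoring and analyzing forum communication is therefore an effective way to understand the current state of public opinion, and it matters for the orderly governance of the online virtual society. This dissertation studies the key technologies for monitoring and analyzing online forum communication carried over backbone links. The main contents are as follows:

1. It proposes a "layered and distributed architecture model for a high-speed backbone communication information monitoring and analysis system". Horizontally, the model is layered by function and technical characteristics; vertically, it builds distributed monitoring nodes following the principle of regional management of backbone links. It summarizes and abstracts the processing in the key stages: network data capture, information extraction, information storage, deep analysis, and collaborative monitoring services. Data capture, information extraction, and deep analysis are the focus of the subsequent chapters.

2. It proposes a design for a "filtering and distribution device based on logical output port groups". The device supports flexible packet duplication and filtering, flexible allocation of the distribution load among the logical ports within a group, and second-level distribution through externally attached switches. On this basis, it proposes an optimization, the "dynamic-feedback forward-caching filtering and distribution mechanism", which enables session-level filtering based on data content and flexible, complete capture of related data. This can significantly reduce the scale of the equipment needed at the information restoration and extraction layer.

3. It proposes a "method based on SVM and hierarchical CRF for extracting online forum information from bypass-intercepted data". SVM is used to identify forum sites automatically by analyzing macro-level features of the intercepted data. Hierarchical CRF is used to determine the behavior type of forum sessions and to label the types of the relevant information elements, from which the wrappers used for extraction are built. The wrappers are then used to extract forum information from backbone links automatically.

4. It proposes a "system of characteristic parameters of online speech based on bypass-intercepted data" and a "deep analysis method based on the characteristic parameters of online speech". These take full account of the relationships among forum sites, boards, users, and posts, and combine factors such as users' interest in a topic, the degree of user participation in the discussion, the speed at which opinions spread, and the attention a site receives. They make full use of the properties of bypass-intercepted data: by analyzing the key elements extracted from it, they derive the regular characteristics of online speech and support situation analysis and trend prediction.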

【Abstract】 After rapid development over the last twenty years, the Internet, originally a convenient communication tool, has evolved into a virtual society. BBS is one of the important communities of this virtual society. Opinions posted on BBS spread widely and quickly and guide public opinion to some extent. Monitoring and analyzing BBS information is therefore an effective way to understand the current state of public opinion, and it plays an important role in governing the virtual society. This dissertation studies key technologies for monitoring and analyzing BBS information transferred through backbone networks. The main contributions are as follows:

1. A "Layered and Distributed System Architecture Model to Monitor and Analyze the Communication Information Transferred Through High Speed Backbone Networks" is proposed. The model divides its layers by technical function and features, and composes distributed monitoring nodes according to regional backbone administration. The key processes of data capture, information extraction, information storage, deep analysis, and coordinated monitoring applications are summarized and abstracted. Of these, data capture, information extraction, and deep analysis are each described in the follow-up chapters.

2. A "Filtering and Distributing Device based on Logical Output Port Group" is designed. It can duplicate and filter data packets flexibly, allocate packet flows to different logical ports in a group, and connect to external switches for second-level distribution. Based on that, a "Forward-Caching with Dynamic-Feedback Filtering and Distributing Mechanism" is proposed as an optimization. It provides session-level filtering on data content and flexible capture of related packets. The new mechanism can notably reduce the number of devices deployed at the information extraction layer.

3. A "Method to Extract BBS Information from Captured Packets by SVM and Layered CRF Technology" is proposed. It automatically recognizes BBS sites by analyzing macro-features of captured packets with SVM. It uses layered CRF to determine the behavior type of BBS sessions, labels the types of the information elements, and composes wrappers for information extraction. It then uses the wrappers to extract BBS information transferred through backbone networks automatically.

4. A "BBS Consensus Characteristic Parameter Structure based on Captured Information" is defined, and a "Deep Analyzing Method based on the BBS Consensus Characteristic Parameters" is proposed. They take into account the relationships among BBS sites, boards, netizens, and posts. They integrate key elements such as the interest and involvement netizens show toward a specific topic of discussion, the speed at which opinions spread, and the attention paid to BBS sites. By exploiting the characteristics of the captured data and analyzing the key elements extracted from it, one can obtain the regular characteristics of BBS opinion, evaluate the current situation, and predict its trends.
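
The abstract says that the filtering and distributing device allocates packet flows to the logical ports of a group, and that filtering works at session level. The record does not describe the allocation rule, so the following is only a minimal sketch under an assumption: a symmetric hash of the connection 5-tuple, a common way to keep both directions of a session on the same output port. The names `FlowKey`, `session_digest`, and `pick_port`, and the port labels, are hypothetical and do not come from the dissertation.

```python
import hashlib
from typing import NamedTuple, Sequence


class FlowKey(NamedTuple):
    """5-tuple identifying one direction of a session (hypothetical type)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: int


def session_digest(key: FlowKey) -> bytes:
    """Hash the 5-tuple so that both directions give the same digest."""
    # Order the two endpoints so (A -> B) and (B -> A) produce the same text.
    low, high = sorted([(key.src_ip, key.src_port), (key.dst_ip, key.dst_port)])
    text = f"{low[0]}:{low[1]}|{high[0]}:{high[1]}|{key.protocol}"
    return hashlib.sha256(text.encode("ascii")).digest()


def pick_port(key: FlowKey, group: Sequence[str]) -> str:
    """Choose one logical output port of the group for this session.

    Assumption: load is spread by hash modulo group size; the dissertation
    may use a different, configurable allocation.
    """
    if not group:
        raise ValueError("port group must not be empty")
    index = int.from_bytes(session_digest(key)[:8], "big") % len(group)
    return group[index]


if __name__ == "__main__":
    ports = ["lp0", "lp1", "lp2", "lp3"]
    request = FlowKey("10.0.0.1", 51000, "192.0.2.7", 80, 6)
    reply = FlowKey("192.0.2.7", 80, "10.0.0.1", 51000, 6)
    # Both directions of the session map to the same logical port.
    print(pick_port(request, ports) == pick_port(reply, ports))
```

Keeping a whole session on one port is what lets a downstream extraction node reassemble a complete forum request and response without coordinating with other nodes.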

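  • 【Online publication contributor】 Fudan University
  • 【Online publication year and issue】 2011, Issue 12
  • 【Classification number】 TP393.09
  • 【Citation count】 2
  • 【Download count】 295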