基于倾向性文本过滤的IM监控系统的研究与实现

Research and Implementation of Instant Messaging Monitoring System Based on Tendency Text Filtering

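【Author】 Yu Haiyan (于海燕)

【Advisor】 Fang Dingyi (房鼎益)

【Author information】 Northwest University (西北大学), Computer Application Technology, 2007, Master's thesis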

【Abstract】 Instant messaging (IM) is a form of real-time communication over the Internet. As networks have grown more open and larger in scale, IM has become a convenient way for people to exchange information freely and has greatly changed how they keep in touch. Its wide adoption also has negative effects, such as the spread of objectionable information, leaks of confidential information, reduced work efficiency, and higher network costs. There is therefore considerable market demand for systems that can monitor IM software effectively. Most IM filtering products currently available in China, however, rely on topic-based filtering, which limits their filtering precision. To address this shortcoming, this thesis designs and implements a prototype of an efficient and accurate IM monitoring system. The main work is as follows.

First, the Netfilter framework, which serves as the implementation platform of the IM monitoring system, is studied in terms of its design philosophy and working principles, with emphasis on its extension mechanism and applications. Based on the filtering requirements of the IM monitoring system, an appropriate Netfilter hook point is selected and the framework is extended to support application-layer IM protocols.

Second, an implementation scheme for the IM monitoring system is proposed, and the key techniques involved are analyzed and discussed in depth: IM protocol parsing, Chinese word segmentation, tendency text filtering, TCP connection blocking, loadable kernel modules (LKM), and communication between kernel space and user space. To meet the system's accuracy and real-time requirements, and based on an analysis of the characteristics of IM text messages and of practical usage, the thesis focuses on tendency text filtering based on semantic analysis and presents a tendency text filtering method (IMTTF) suited to real-time filtering of IM messages.

Third, a prototype IM monitoring system based on tendency text filtering, TFIMM (Instant Messaging Monitoring System based on Tendency Text Filtering), is designed and implemented. The system applies the IMTTF method together with bypass monitoring, which improves the accuracy of IM text filtering while avoiding a negative impact on network speed.

Finally, an experimental environment is set up. The IMTTF method is evaluated in terms of recall and precision, and system performance is analyzed in terms of throughput (responses per second) and response delay. The experimental results indicate that the prototype system achieves the expected results.
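
The abstract evaluates the filtering method by recall and precision. As a reminder of how these two measures are conventionally defined for a binary filter, a minimal Python sketch follows. The function name `precision_recall`, the boolean-list input format, and the sample labels are illustrative assumptions; none of them come from the thesis, and the sample values are not experimental data.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) for a binary message filter.

    predicted[i] is True if the filter flagged message i.
    actual[i] is True if message i really should have been flagged.
    Both inputs are hypothetical; the thesis does not specify a data format.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)       # flagged, and should be
    false_pos = sum(1 for p, a in pairs if p and not a)  # flagged, but harmless
    false_neg = sum(1 for p, a in pairs if not p and a)  # missed by the filter

    # Precision: share of flagged messages that deserved flagging.
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: share of messages deserving flagging that were actually flagged.
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Made-up labels for five messages, purely for illustration.
sample_predicted = [True, True, False, True, False]
sample_actual = [True, False, False, True, True]
sample_precision, sample_recall = precision_recall(sample_predicted, sample_actual)
print(f"precision={sample_precision:.3f} recall={sample_recall:.3f}")
```

A filter that flags everything reaches perfect recall at poor precision, which is why the two measures are reported together.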

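  • 【Online publication contributor】 Northwest University (西北大学)
  • 【Online publication year and issue】 2007, Issue 05
  • 【Classification number】 TP393.09
  • 【Times cited】 7
  • 【Downloads】 150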