
基于Web公共舆情自动分析及预警关键技术研究

Research on Auto Analysis and Key Technologies of Early Warning Based on Web Public Opinion

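【Author】 Zhang Jun (张珺)

【Advisor】 Fan Chunxiao (范春晓)

【Author information】 Beijing University of Posts and Telecommunications, Circuits and Systems, 2012, Master's thesis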

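【Abstract (translated from the Chinese)】 As Internet access spreads and the number of Internet users grows, online public opinion has become an important part of public opinion. Compared with traditional public opinion, online public opinion is characterized by large data volume, sudden outbreaks, scattered sources, and wide influence. Monitoring and guiding online public opinion is therefore very important, yet most current monitoring is done manually. To improve monitoring, automatic analysis and forecasting methods are urgently needed to follow online public opinion in time and grasp its trend, so that the relevant departments can intervene promptly. This thesis first studies existing public opinion analysis and forecasting techniques and related analysis systems, summarizes the general model currently used, and divides it into two parts: a hot-topic detection model and a hot-topic forecasting model. The study finds the following shortcomings in this general model: the text feature representation stage of the detection model handles only the content of reports; data collection and processing treat data from multiple sources equally; and the training data for the forecasting model are not classified reasonably. An improvement is proposed for each point. First, Web opinion mining is applied in the text structuring stage: a Web opinion dictionary is built with the SO-PMI and K-Means algorithms, comment opinions are quantified, and a comment content vector and a comment opinion vector are proposed, which improves the representation and structuring of public opinion text. Second, a public opinion source analysis model is proposed to address the shortcomings in data collection and processing. Finally, the C5.0 decision tree algorithm is used to classify the training data by the polarity and strength of the opinion orientation of hot topics, and a separate BP neural network forecasting model is built for each class, which improves forecasting accuracy. Experimental analysis shows that the improved model lowers the false detection rate and the miss rate of hot-topic detection, and also lowers the mean absolute percentage error (MAPE) of hot-topic trend forecasts.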
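To illustrate the SO-PMI step named in the abstract, the following is a minimal sketch of the standard semantic-orientation score: the sum of a word's pointwise mutual information with positive seed words minus the sum with negative seed words. The abstract does not give the thesis's corpus, seed words, smoothing, or how the scores are then combined with K-Means, so the function name `so_pmi`, the document-level co-occurrence counting, the smoothing constant `eps`, and the toy data are all assumptions made for illustration.

```python
import math


def so_pmi(word, docs, pos_seeds, neg_seeds, eps=0.01):
    """Semantic orientation of `word` from document-level co-occurrence.

    docs: list of tokenized documents (lists of strings).
    eps: additive smoothing so zero counts do not produce log(0)
         (an assumption; the thesis's smoothing is not stated).
    """
    doc_sets = [set(d) for d in docs]
    n = len(doc_sets)

    def df(*terms):
        # Number of documents containing all of the given terms.
        return sum(1 for d in doc_sets if all(t in d for t in terms))

    def pmi(a, b):
        p_ab = (df(a, b) + eps) / n
        p_a = (df(a) + eps) / n
        p_b = (df(b) + eps) / n
        return math.log2(p_ab / (p_a * p_b))

    positive = sum(pmi(word, s) for s in pos_seeds)
    negative = sum(pmi(word, s) for s in neg_seeds)
    return positive - negative


# Hypothetical toy corpus and seed words, for illustration only.
docs = [
    ["good", "reliable"],
    ["good", "reliable", "fast"],
    ["bad", "faulty"],
    ["bad", "faulty", "slow"],
]
score_reliable = so_pmi("reliable", docs, ["good"], ["bad"])
score_faulty = so_pmi("faulty", docs, ["good"], ["bad"])
```

A positive score suggests the word leans toward the positive seeds and a negative score toward the negative seeds; the magnitude could then serve as one input when grouping words by opinion strength.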

【Abstract】 With the spread of the Internet and the growing number of Internet users, online public opinion has become an important part of public opinion. Compared with traditional public opinion, online public opinion is large in volume, sudden in occurrence, scattered in its sources, and wide in its influence. Monitoring and guiding online public opinion is therefore very important, yet most monitoring at present is done manually. To improve monitoring, automatic analysis and forecasting methods are urgently needed to keep track of trends in online public opinion, so that the relevant departments can intervene in time. This thesis first studies the technologies currently used in public opinion analysis and forecasting, together with related analysis systems, and summarizes the general model. The model is divided into two parts: one for hot-issue detection and one for hot-issue forecasting. Improvements are proposed for both parts. First, given the importance of online comments in Web public opinion, Web opinion mining is applied in the analysis and forecasting model. Comments are quantified using a Web opinion dictionary, and a comment content vector and a comment opinion vector are proposed to improve the original feature representation, which covered only the report itself. The original model also treated data from multiple sources equally; a public opinion source analysis model is proposed to resolve this problem. Second, the C5.0 decision tree algorithm and the BP neural network algorithm are combined to build a classification and prediction model. The model forecasts the development trend of public opinion separately for each opinion polarity and strength, which overcomes the shortcoming of forecasting on unclassified data. Finally, experiments show that the improved model lowers the false detection rate and the miss rate of hot-issue detection, and reduces the MAPE of the forecast development trend.
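For reference, the MAPE metric cited above is commonly defined as the mean of |actual − predicted| / |actual|, expressed as a percentage. The sketch below shows that standard definition; the function name `mape` and the sample values are illustrative assumptions and are not results from the thesis.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical values: each forecast is off by 10% of the actual value.
example = mape([100, 200], [110, 180])
```

A lower MAPE means the forecasts are, on average, closer to the observed values in relative terms.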
