
Automatic Sentiment Analysis for Web User Reviews (Web用户评价的自动情感分析)

【Author】 Wang Linmei (王林梅)

【Advisors】 He Pilian (何丕廉); Sun Yueheng (孙越恒)

【Author information】 Tianjin University, Computer Application Technology, 2009, Master's degree

【Abstract】 As the volume of information on the Internet grows and network applications expand, more and more users rely on the Internet to obtain the information they need. Before buying a product or undertaking a task, users often look for relevant reviews and recommendations as a reference, and the Internet has become an important channel for finding them. The Internet holds a great many user reviews of products and services, but sorting through them manually is a daunting task, so this thesis proposes an approach to automatic sentiment analysis.

First, the thesis studies the automatic acquisition of sentiment vocabulary. Building on the algorithm from "Using Tongyici Cilin to Compute Word Semantic Polarity" (基于同义词词林的词汇褒贬计算), proposed at Peking University, we improve the method by extracting some of the incorrectly tagged words, which raises the accuracy of word sentiment tagging from 89.58% to 91.52%. We also present a rule-based dynamic expansion method that determines the sentiment orientation of ambiguous words from their context.

Next, the thesis studies one application of sentiment text classification: sentiment analysis of user reviews. We use a text vector model and judge the sentiment of a text by accounting for the influence of the different parts of speech in Chinese, as well as negation words, adversatives, and degree adverbs. We also propose an iterative algorithm that expands the initial sentiment lexicon in order to improve classification accuracy. The method is simple and easy to understand, and it reaches an accuracy of 86.43%, but its drawback is a relatively high time complexity, which makes it slow.

Finally, the thesis uses Web mining techniques to collect these user reviews and classify them by user sentiment. The input is the topic to query. The output is the percentage of reviews in each of three categories (positive, negative, and neutral) and, for the category with the largest share, the ten reviews whose weights have the highest absolute values.
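The review-scoring and reporting steps described in the abstract can be illustrated with a small Python sketch. It is not the thesis's implementation. The toy lexicons, the multiplicative degree weights, the adversative boost, the zero threshold, and the function names `score_review`, `classify`, and `summarize` are all assumptions made for illustration. English tokens stand in for pre-segmented Chinese words, and the part-of-speech weighting and iterative lexicon expansion mentioned in the abstract are omitted.

```python
from collections import Counter

# Hypothetical toy lexicons (assumptions); the thesis derives its sentiment
# lexicon from Tongyici Cilin rather than from a hand-written table.
SENTIMENT = {"good": 1.0, "excellent": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATIONS = {"not", "never"}
DEGREES = {"very": 1.5, "slightly": 0.5}
ADVERSATIVES = {"but", "however"}
ADVERSATIVE_BOOST = 2.0  # assumed: text after an adversative counts double


def score_review(tokens):
    """Return a signed sentiment weight for one pre-segmented review."""
    total = 0.0
    clause_weight = 1.0
    negate = False
    degree = 1.0
    for tok in tokens:
        if tok in ADVERSATIVES:
            # Emphasize everything after the adversative; reset modifiers.
            clause_weight = ADVERSATIVE_BOOST
            negate, degree = False, 1.0
        elif tok in NEGATIONS:
            negate = not negate  # a double negation cancels out
        elif tok in DEGREES:
            degree *= DEGREES[tok]
        elif tok in SENTIMENT:
            value = SENTIMENT[tok] * degree * clause_weight
            total += -value if negate else value
            # Modifiers apply only to the next sentiment word.
            negate, degree = False, 1.0
    return total


def classify(score, threshold=0.0):
    """Map a signed weight to one of the three categories."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


def summarize(reviews, top_k=10):
    """Return category percentages, the dominant category, and its
    top_k reviews ranked by absolute weight."""
    if not reviews:
        raise ValueError("at least one review is required")
    scored = [(score_review(r), r) for r in reviews]
    labels = [classify(s) for s, _ in scored]
    counts = Counter(labels)
    percentages = {
        c: 100.0 * counts[c] / len(reviews)
        for c in ("positive", "negative", "neutral")
    }
    dominant = max(percentages, key=percentages.get)
    top = sorted(
        (item for item, label in zip(scored, labels) if label == dominant),
        key=lambda item: abs(item[0]),
        reverse=True,
    )[:top_k]
    return percentages, dominant, top
```

Calling `summarize` on a list of token lists returns the three percentages, the name of the dominant category, and up to ten `(weight, tokens)` pairs from that category. A real system would replace the toy tables with a learned or curated lexicon and segment the Chinese text first.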

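  • 【Online publication submitter】 Tianjin University
  • 【Online publication year and issue】 2011, Issue S2
  • 【Classification number】 TP391.1
  • 【Citation count】 1
  • 【Download count】 334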