

Research on Sentiment Text Classification of Chinese Customers Reviews Based on SVM

【Author】 Tao Min (陶敏)

【Advisor】 Xia Huosong (夏火松)

【Author Information】 Wuhan Textile University, Management Science and Engineering, 2011, Master's degree

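【Abstract, translated from the Chinese】 As the forms and content of online media have grown richer, more people have begun to express their views in forums and reviews. Articles and remarks carrying personal sentiment now appear in large numbers in this online text. Customer reviews on the Internet in particular have an important influence on the purchase decisions of online consumers, and how to automatically extract valuable information from massive volumes of customer review text has become a problem in urgent need of a solution. This thesis studies the application of traditional topic-based text classification methods to sentiment text classification. It considers statistical methods for sentiment text classification and, drawing on traditional topic-based Chinese text classification techniques, analyzes the key technical problems of Chinese sentiment text classification, focusing on processes and methods for improving classification accuracy. It analyzes how different feature selection methods, feature representation methods, and classifier models affect the accuracy of Chinese sentiment text classification. From this study of the key techniques, the thesis identifies an effective classification model and proposes relatively effective feature selection methods, feature representation methods, and an effective sentiment text classifier for Chinese sentiment text. It also proposes four different stop word lists based on the features of Chinese sentiment text classification, analyzes experimentally the contribution of each list to classification, and gives an effective stop word list. Finally, the classification model derived from the experiments is applied in practice, which verifies the validity of the research results. The sentiment text classification model based on the support vector machine (SVM) is applied to the domain of product recommendation: classification experiments are run on product review text from a well-known domestic shopping website, effective features of consumers' product reviews are extracted, the sentiment orientation of the customer reviews is obtained through sentiment classification, and the results are analyzed, with constructive suggestions offered for applying sentiment text classification.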

【Abstract】 In recent years, with the rapid development of online media, more and more people have begun to comment in forums and express their opinions. Articles and remarks with a personal emotional polarity have appeared online in large numbers. Customer comments on the Internet have an important impact on the purchase decisions of online consumers, and automatically extracting valuable information from massive amounts of customer comment text has become an urgent problem. This thesis applies traditional text classification methods to sentiment text classification and considers statistical methods as a solution to the sentiment text classification problem. Building on traditional topic-based Chinese text classification techniques, it studies the key techniques of Chinese sentiment text classification, focusing on improving classification accuracy. It analyzes the influence of different feature selection methods, feature representation methods, and classification models on the accuracy of sentiment classification. From this study, the thesis identifies an effective classification model, comprising effective feature selection methods, a feature representation model, and a sentiment text classifier. It also constructs four different stop word lists based on the features of sentiment text classification, analyzes experimentally the contribution of each list to the classification results, and identifies an effective stop word list. Finally, the thesis applies the classification model to a practical problem and verifies the validity of the research results. The SVM-based text classification model is used for product recommendation in a classification experiment on product reviews collected from a well-known shopping site. The experiment extracts the effective features of consumer product reviews and the polarity of the sentiment text. The thesis gives a reasonable analysis of the final results and puts forward some constructive suggestions on the application of sentiment text classification.
