基于Hadoop的微博用户及微博影响力排名研究

The Research of Weibo User And Weibo Influence Ranking Based on Hadoop

【Author】 关文斌

【Advisor】 刘伟平

【Author information】 Jinan University (暨南大学), Electronics and Communication Engineering (professional degree), 2015, Master's

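【Abstract (translated from Chinese)】 With the arrival of the Web 2.0 era, user-driven content has taken on an increasingly important role on the Internet, and social networking platforms, led by Weibo, have developed at an unprecedented pace. Weibo has become an important platform for delivering news and spreading public opinion, and a new bellwether for trending social topics. Analyzing the influence of Weibo users and of individual Weibo posts therefore has both research significance and commercial value. This thesis studies the two in turn: user influence and post influence. For user influence, the most widely used approach is the People Rank algorithm, which is based on the PageRank model; however, People Rank simply applies the web page ranking model to user influence ranking, and its accuracy leaves room for improvement. Taking Sina Weibo's personal verification as its starting point, this thesis proposes NPRank (New People Rank), an optimized variant of People Rank. For post influence, the common approach uses a post's repost count and comment count as the criteria; this covers too few indicators, and the reliability of the resulting assessment needs improvement. This thesis considers repost, comment, and like counts, adds the user influence of the post's publisher as a further criterion, and proposes a post influence assessment method, the TRank (Topic Rank) algorithm. Because ranking user and post influence requires processing massive amounts of data, the thesis also demonstrates experimentally that NPRank and TRank are feasible on the Hadoop platform and analyzes the results comparatively, verifying that NPRank improves on People Rank in the accuracy of user influence assessment and that TRank is reliable for assessing post influence. Finally, based on the rankings produced by NPRank and TRank, the thesis proposes a design for a Sina Weibo influence ranking display system.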

【Abstract】 With the advent of the Web 2.0 era, users play an increasingly dominant role on the Internet, and social network platforms led by Weibo have seen unprecedented growth. Weibo has become an important platform for spreading information and public opinion. Studying the influence of Weibo users and of Weibo posts therefore has both research significance and commercial value. This thesis is divided into two parts: Weibo user influence and Weibo post influence. For assessing user influence, the People Rank algorithm, which is based on the PageRank model, is currently in wide use. However, People Rank simply applies the page-ranking model to ranking user influence, and its accuracy needs improvement. By incorporating Sina Weibo's personal verification feature, the thesis proposes the NPRank (New People Rank) algorithm, an improvement on People Rank. For assessing post influence, the more common current method uses the numbers of reposts and comments as the evaluation criteria. This method does not consider enough indicators, and the reliability of its assessment needs improvement. In addition to the numbers of reposts, comments, and likes, this thesis adds the publisher's user influence as a factor in assessing post influence, and proposes a post influence assessment method, the TRank (Topic Rank) algorithm. Because ranking user influence and post influence requires massive data processing, the thesis shows by experiment that running NPRank and TRank on the Hadoop platform is feasible, and presents a comparative analysis of the experimental results. It also verifies that NPRank is more accurate than People Rank and that TRank is reliable for assessing post influence. Finally, based on the ranking results of NPRank and TRank, the thesis proposes a design for a Weibo influence ranking display system.
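
The People Rank baseline mentioned in the abstract applies the PageRank model to users instead of web pages. A minimal sketch of that idea follows, assuming a follow graph in which each follow counts as a vote for the followed user. The function name `people_rank`, the input layout, and the damping value of 0.85 are illustrative assumptions, not taken from the thesis. The abstract does not give the verification adjustment used by NPRank or the TRank formula, so neither is shown.

```python
def people_rank(follows, damping=0.85, iterations=100):
    """PageRank-style influence scores over a follow graph.

    `follows` maps each user to the set of users they follow
    (hypothetical input layout; a follow is treated as a vote).
    """
    # Collect every user that appears as a follower or as a followed account.
    users = set(follows)
    for followed in follows.values():
        users.update(followed)
    users = sorted(users)
    n = len(users)

    rank = {u: 1.0 / n for u in users}
    for _ in range(iterations):
        # Users who follow nobody spread their score evenly over everyone.
        dangling = sum(rank[u] for u in users if not follows.get(u))
        new_rank = {u: (1.0 - damping) / n + damping * dangling / n
                    for u in users}
        # Each follower splits its damped score among the users it follows.
        for follower, followed in follows.items():
            if not followed:
                continue
            share = damping * rank[follower] / len(followed)
            for target in followed:
                new_rank[target] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {"a": {"c"}, "b": {"c"}, "c": {"a"}}
    for user, score in sorted(people_rank(graph).items(),
                              key=lambda item: item[1], reverse=True):
        print(user, round(score, 4))
```

On a data set the size of Weibo, each iteration of this loop would typically be expressed as a distributed job (for example one MapReduce pass on Hadoop) rather than run in memory as here.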

【Key words】 PageRank; Sina Weibo; NPRank algorithm; TRank algorithm; Hadoop
  • 【Online publication submitter】 Jinan University (暨南大学)
  • 【Online publication year/issue】 2015, Issue 12
  • 【Classification No.】 TP393.092; TP311.13
  • 【Download count】 35