社会网络的社团结构发现与动态特性研究

A Study of Community Detection and Dynamic Properties in Social Networks

【Author】 沈珂轶

【Advisors】 杨小康; 宋利

【Author Information】 Shanghai Jiao Tong University, Signal and Information Processing, 2011, Master's thesis

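【Chinese Abstract】 With the rapid development of the Internet and information technology, social networks have gradually become an indispensable part of people's lives. By analyzing and mining the digital traces that users leave on social networks, we can improve our understanding of how messages propagate on these networks, which helps beneficial information spread more effectively and allows timely warning of the explosive spread of false information, spam, and rumors. The current difficulty in social network research is that the most valuable information is often hidden in massive amounts of data. This thesis approaches the important problem of how to find the most valuable information in massive data from two different angles. On community detection in social networks, the thesis proposes a new community structure detection algorithm that aims to simplify the massive set of user connections into connections between communities plus the relationships within each community. This greatly reduces the amount of data that must be processed at a time and makes the analysis of massive data more tractable. Because the algorithm exploits local features of the social network, it has low algorithmic complexity and can be used to process massive social network connection data. Compared with the best-performing community detection algorithms currently available, the algorithm is comparable in quality and faster in speed. On the dynamic properties of social networks, the thesis mines each user's personal interests from the user's behavior of reposting messages on microblogs. On this basis, it proposes an algorithm that re-ranks, in a personalized way, the microblog posts each user receives. After re-ranking, the posts a user is most likely to be interested in are placed at the top, while posts that carry no information, contain spam advertising, and the like are placed at the bottom. The re-ranking algorithm acts as a filter on the user's microblog data, so that users can effectively pick out the information they really need from a massive volume of posts. Tests on a real user dataset show that the re-ranking algorithm compensates for the shortcomings of current microblog systems in content presentation. Compared with the default microblog ordering, the algorithm improves ranking performance by at least 30%.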

【Abstract】 Social networks of all kinds play an increasingly important role in our daily lives. By analyzing users' behavior in social networks, we can gain insight into how a message propagates through the network and learn how to promote new ideas and prevent spam and rumors from spreading epidemically.

The study of social networks is currently hampered by the fact that the volume of social network data is enormous and the most valuable information is concealed within the whole dataset. In this work we tackle this problem from two aspects.

For community detection, we propose a hierarchical diffusion method to detect the community structure of very large social networks. By using the network of communities instead of the network of people, we can greatly reduce the dimension of the social network. Our community detection algorithm is based on local structure, so it is very efficient. Tests on both classical and synthetic benchmarks show that our algorithm is comparable to state-of-the-art community detection algorithms in both computational complexity and accuracy.

For dynamic properties, we infer users' interests from their behavior in social networks. We then present a supervised learning method for personalized tweet reordering based on users' interests. Twitter displays the tweets a user receives in reverse chronological order, which is not always the best choice, because many informative or relevant tweets may be drowned out or pushed to the bottom by meaningless buzz. By exploring a rich set of social and personalized features, we model the relevance of tweets by minimizing the pairwise loss between relevant and non-relevant tweets. The tweets are then reordered according to the relevance scores predicted by the learned model. Experimental results with real Twitter user activity demonstrate the effectiveness of our method. The new method achieves an accuracy gain of more than 30% compared with Twitter's default time-based ordering.
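
The pairwise-loss idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's actual model: the linear scorer, the logistic form of the pairwise loss, the three feature names, and the toy data below are all assumptions made purely for illustration. The abstract states only that relevance is learned by minimizing a pairwise loss between relevant and non-relevant tweets and that tweets are then sorted by predicted score.

```python
import math
import random

def train_pairwise_ranker(pairs, n_features, epochs=200, lr=0.1, seed=0):
    """Fit weights w so that relevant tweets score above non-relevant ones.

    pairs: list of (relevant_features, non_relevant_features) tuples.
    Minimizes the pairwise logistic loss log(1 + exp(-(w.x_pos - w.x_neg)))
    with plain stochastic gradient descent (an assumed choice of loss).
    """
    rng = random.Random(seed)
    w = [0.0] * n_features
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for pos, neg in pairs:
            diff = [p - n for p, n in zip(pos, neg)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # d/dw log(1 + exp(-margin)) = -sigmoid(-margin) * diff
            # (margin is clamped only to keep exp() from overflowing).
            g = 1.0 / (1.0 + math.exp(min(margin, 50.0)))
            w = [wi + lr * g * di for wi, di in zip(w, diff)]
    return w

def reorder(tweets, w):
    """Sort (tweet_id, features) pairs by predicted relevance, highest first."""
    def score(feats):
        return sum(wi * fi for wi, fi in zip(w, feats))
    return sorted(tweets, key=lambda t: score(t[1]), reverse=True)

# Hypothetical features: [author_often_retweeted, topic_match, has_ad_link]
training_pairs = [
    ([1, 1, 0], [0, 0, 1]),
    ([0, 1, 0], [1, 0, 1]),
    ([1, 1, 0], [1, 0, 0]),
]
w = train_pairwise_ranker(training_pairs, n_features=3)

# A toy timeline in its default (time-based) order.
timeline = [("t1", [0, 0, 1]), ("t2", [1, 1, 0]), ("t3", [0, 1, 0])]
ranked = reorder(timeline, w)
print([tweet_id for tweet_id, _ in ranked])
```

In this toy setup the learned weight for the assumed `topic_match` feature grows positive and the weight for `has_ad_link` grows negative, so the ad-only tweet sinks to the bottom of the reordered timeline.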
