Webç¤¾åŒºå‘çŽ°ç®—æ³•çš„ç ”ç©¶

A Syudy of Web Community Detection Algorithm

【Author】 Huang Weiping (黄伟平)

【Author information】 Beijing University of Posts and Telecommunications, Electronics and Communication Engineering (professional degree), 2013, Master's thesis

【Abstract】 Community detection is an important technique in network research. Because community structure reflects, to some extent, the topology of a real system, identifying the latent communities in a network graph is of considerable research value. The Internet has developed rapidly and become an indispensable part of social life, so discovering and analyzing the communities that exist on it is of even greater significance.

Social circles in real life reflect, at some level, the relationships among people, and people in the same social circle usually have some connection online as well. Analyzing the community structure of online social networks such as Facebook, Twitter, Sina Weibo, Tianya, Mop, Douban, and Renren can therefore reveal latent relationships among people. This makes it possible to understand and predict people's activities quickly, to monitor the development of public opinion more effectively, and to provide a basis for comparing people's activities in real-world communities with those in virtual online communities. In addition, almost every online store now offers product recommendation, a function that is in fact built on community detection: the key task of a product recommendation system is to identify customers with similar purchasing interests within a huge network of customer-product purchase relationships, which is itself a community detection process. Furthermore, the Internet is flooded with information of every kind, and it is hard to extract quickly what a user needs from such massive data. Performing community detection on this information addresses the problem and also enables personalized information recommendation and intelligent search, guiding users to the information that interests them more accurately and quickly.

This thesis first introduces the theoretical background of community detection and then reviews several early, classical community detection algorithms: the Kernighan-Lin algorithm (a traditional graph partitioning algorithm), the GN algorithm (a hierarchical clustering method), the Newman fast algorithm (based on modularity optimization, and also an agglomerative algorithm), spectral bisection (a spectral analysis method), and the CPM algorithm proposed by Palla et al. for overlapping community detection. For each algorithm it analyzes the strengths and weaknesses, the complexity, and the range of application. It also introduces objective criteria for evaluating community quality.

Building on an in-depth analysis of existing algorithms and on the characteristics of microblogs as bidirectional social networks, the thesis proposes two new community detection algorithms for microblogs: (1) a community detection algorithm based on community attraction, and (2) an overlapping community detection algorithm based on community attraction. Algorithm 1 is proposed because existing algorithms are too complex to apply to large-scale networks such as microblogs; it outperforms existing algorithms in time complexity and also performs well in detection accuracy. Algorithm 2 is motivated by two observations. First, a user in the real world may belong to several communities at once, whereas most existing algorithms, and Algorithm 1 as well, simply assign each user to a single community, which does not quite match reality. Second, existing overlapping community detection algorithms perform poorly on large-scale networks. Algorithm 2 addresses both problems. Finally, experimental results are presented that demonstrate the effectiveness and efficiency of the two algorithms.
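
As a hedged illustration of the kind of objective quality criterion the abstract mentions, the sketch below computes Newman-Girvan modularity Q, the quantity that modularity-optimization methods such as the Newman fast algorithm try to increase. For an undirected, unweighted graph with m edges, Q is the sum over communities c of l_c/m - (d_c/2m)^2, where l_c is the number of edges inside c and d_c is the total degree of its nodes. The abstract does not state which criteria the thesis actually uses, so treating modularity as the criterion is an assumption, and the function name `modularity` and the toy graph are hypothetical, not taken from the thesis.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition (illustrative sketch).

    edges: iterable of (u, v) pairs; each undirected edge is listed once.
    community_of: dict mapping every node to its community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)    # l_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: total degree of the nodes in c

    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum over communities of (l_c / m) - (d_c / 2m)^2
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph (hypothetical): two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

print(modularity(edges, two_groups))
print(modularity(edges, one_group))
```

On this toy graph, splitting the two triangles scores Q = 5/14 (about 0.357), while putting every node in one community scores 0, which is why a higher Q is read as a better partition. This criterion assigns each node to exactly one community, so it does not directly measure overlapping partitions of the kind Algorithm 2 targets.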

【Key words】 Complex Networks; Community Structure; Community Attraction; Community Detection; Overlapping Community

【Online publication submitter】 Beijing University of Posts and Telecommunications

【Classification number】 TP301.6
【Downloads】 334
