åŸºäºŽå‚è€ƒæ–‡æ¡£æ¨¡åž‹çš„ä¸ªæ€§åŒ–Webæ£€ç´¢ç ”ç©¶

Research on Personalized WEB Search Based on Reference Document Model

【Author】 Li Daren (李大任)

【Author information】 Harbin Institute of Technology, Computer Science and Technology, 2011, Master's degree

【Abstract】 With the rapid spread of computers and the Internet, people have entered the information age, and information resources have grown explosively. Helping users find exactly the information they want within this volume of data has become an important task of information retrieval. Most traditional information retrieval techniques, however, are based on string matching and can hardly satisfy users' increasingly individualized information needs. To address this problem, this work confirms the motivation for personalization through query log analysis and tries several techniques for building a personalized search engine. It makes the following three contributions:

(1) Potential for personalization in web search. We first confirm quantitatively that, in the query logs of a web search engine, clicks that differ from other users' clicks far outnumber repeated clicks. We then use the Kappa statistic to measure how consistent different users' clicks are on the same query. The distribution of Kappa values, together with query submissions, shows that this level of consistency is hard to satisfy with a one-size-fits-all web search engine. Finally, we compute a "potential for personalization" measure to give an overview of which kinds of queries benefit more from individual user information.

(2) Personalized web search based on the reference document model (RDM). We introduce the RDM to model user preferences from the web pages a user has clicked in the past, and then use feedback from the model to personalize the results that different users receive for the same query. We evaluate the RDM in both the vector space and the probabilistic space. The experimental results show that, in either space, the RDM models user preferences well from the historically clicked documents and incorporates them into the retrieval process.

(3) Query recommendation based on multiple information fusion. We study how to use the history of user groups recorded in query logs to implement personalized query recommendation. Specifically, we first verify, through an analysis of the AOL query log, that it is feasible to recommend to a user the queries issued by other users with similar query histories. We then propose a query recommendation method that ranks users by the sequence similarity of their query histories: we investigate several similarity measures for user histories, use RankingSVM to fuse them into a single prediction of user similarity, and recommend the queries of the users whose histories are closest. Experiments on the Sogou query log confirm that the method effectively ranks the queries of similar users near the top. The empirical results indicate that recommending queries issued by users with similar search histories can effectively predict the subsequent query. We also make a preliminary exploration of click recommendation based on user groups.
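The abstract does not say which Kappa variant the thesis uses or how clicks are encoded. As an illustration only, the sketch below assumes Fleiss' kappa over binary judgments: each result returned for a query is an item, each user is a rater, and the two categories are "clicked" and "not clicked". The function names `fleiss_kappa` and `click_table` and the toy data are hypothetical and do not come from the thesis.

```python
from typing import Dict, List, Sequence, Set


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa; counts[i][j] = number of raters who put item i in category j.

    Assumption: every item is rated by the same number of raters (>= 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")

    n_categories = len(counts[0])
    # Share of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        # All ratings in one category: kappa is undefined; treated here as full agreement.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


def click_table(clicks_by_user: Dict[str, Set[str]], results: List[str]) -> List[List[int]]:
    """Build the [clicked, not clicked] count table for one query (hypothetical encoding)."""
    n_users = len(clicks_by_user)
    table = []
    for url in results:
        clicked = sum(1 for urls in clicks_by_user.values() if url in urls)
        table.append([clicked, n_users - clicked])
    return table


# Toy data: three users issue the same query and see the same two results.
same_clicks = {"u1": {"a"}, "u2": {"a"}, "u3": {"a"}}
mixed_clicks = {"u1": {"a"}, "u2": {"b"}, "u3": {"a"}}
kappa_same = fleiss_kappa(click_table(same_clicks, ["a", "b"]))
kappa_mixed = fleiss_kappa(click_table(mixed_clicks, ["a", "b"]))
```

Under this encoding, a kappa near 1 means users click the same results for a query, while a value near or below 0 means their clicks agree no more than chance would predict, which is the situation where a one-size-fits-all ranking serves users poorly.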

【Key words】 query log analysis; personalization; query recommendation; reference document model

【Online publication submitter】 Harbin Institute of Technology

【Classification number】 TP391.3
【Download count】 39