Research and Implementation on User-Oriented Query Expansion (面向用户的查询扩展研究与实现)

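【Author】 Hu Zhenxin (胡珍新)

【Advisors】 Ding Hui (丁晖); Wang Mingwen (王明文)

【Author information】 Jiangxi Normal University, Computer Software and Theory, 2004, Master's thesis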

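【Abstract, translated from the Chinese】 With the rapid growth of the Internet in recent years, Web resources have been increasing exponentially; by early 2004 the number of web pages had reached roughly 8 billion. Web resources can be searched in many ways, of which search engines are the most widely used. Current search engines, however, are designed mainly for generality and do not reflect the information needs of individual users, whereas personalized information services can meet those needs effectively. In addition, studies have shown that 58-81% of web page visits are revisits to pages the user has already seen. Managing the pages a user has already visited is therefore of practical value alongside personalized information services.

In most current retrieval systems, the user's need is expressed through query keywords, and there is a considerable semantic gap between the actual need and those keywords. Narrowing this gap is the key problem in delivering user-oriented personalized information services. This thesis applies query expansion and presents an adaptive model for adding, deleting, and reweighting query keywords, so that queries better match the user's actual needs and retrieval precision improves. Within the model, the number of expansion keywords is determined and the weight-adjustment factors α, β, γ, λ used in query feedback are optimized.

We jointly designed a prototype personal electronic information assistant system. Its main idea is as follows. First, at registration each user is asked to supply basic information, interest categories, query keywords, and similar details, and an initial user interest model is built for each new user from those interest categories. Next, the system searches through existing search engines (such as Google and Baidu), uses the user interest model to filter out returned documents that are irrelevant to the user's interests, and reranks the remaining documents before displaying them. The user can download and browse the documents of interest, and the system automatically updates the interest model and expands the query based on this behavioral feedback, so that the model genuinely represents the user's current interests. The system also implements a network information management function that automatically files the retrieved information.

Further work: 1. try other methods to further improve the adaptive query expansion model; 2. further optimize the weight-adjustment factors α, β, γ, λ; 3. complete the system's functionality.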
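The abstract does not give the formula of the adaptive model, so the following is only a minimal sketch of how adding, deleting, and reweighting query keywords from feedback could look, assuming a standard Rocchio-style update. The function names, the default values, and the `max_terms` cap are hypothetical, and only three of the four factors are modeled because the role of λ is not stated here.

```python
from collections import defaultdict

def centroid(docs):
    """Average term-weight vector of a list of {term: weight} dicts."""
    total = defaultdict(float)
    for doc in docs:
        for term, weight in doc.items():
            total[term] += weight
    n = len(docs)
    return {t: w / n for t, w in total.items()} if n else {}

def expand_query(query, relevant, nonrelevant,
                 alpha=1.0, beta=0.75, gamma=0.15, max_terms=10):
    """Rocchio-style update (an assumption, not the thesis's model).

    New terms from relevant documents are added, terms whose weight
    falls to zero or below are deleted, and surviving terms are
    reweighted. max_terms caps the size of the expanded query.
    """
    rel = centroid(relevant)
    non = centroid(nonrelevant)
    scored = {}
    for term in set(query) | set(rel) | set(non):
        weight = (alpha * query.get(term, 0.0)
                  + beta * rel.get(term, 0.0)
                  - gamma * non.get(term, 0.0))
        if weight > 0:  # deletion: drop terms pushed to zero or below
            scored[term] = weight
    # Keep the highest-weighted terms; ties are broken alphabetically.
    top = sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))[:max_terms]
    return dict(top)
```

In this sketch a query vector and each document are plain `{term: weight}` dictionaries; documents the user downloaded or browsed would be passed as `relevant`, and filtered-out or ignored documents as `nonrelevant`.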

【Abstract】 With the rapid development of the Internet in recent years, Web resources have grown at an explosive rate. By the beginning of 2004, the number of web pages had reached about 8 billion. There are many ways to search Web resources, the most popular being search engines, but existing engines are designed for all users and cannot satisfy an individual user's demands. A personalized information service system can satisfy those individual demands effectively. Moreover, some studies show that 58-81% of the pages people access on the Web are pages they have visited before [21]. Managing visited web pages effectively is therefore meaningful for a personalized information service system.

In most retrieval systems, the user's demand is represented by query keywords, yet there is a gap between the user's real demand and the query words. Reducing this gap is the key problem in implementing the user-oriented EIS. We put forward a user-oriented method of query expansion and an adaptive modification model oriented to users' interests. The model increases retrieval precision and makes the returned web pages satisfy users better. Furthermore, using this model we can determine the number of expanded query words and optimize the regulating factors α, β, γ, λ.

We cooperated to design a prototype Personalized Electronic Information Assistant System. The design idea is as follows. First, when a user registers, we ask the user to provide basic information, interest domains, query keywords, and so on, and we create an initial interest model for every new user from this information. We then search for information using existing search engines (such as Google and Baidu). From the returned results we filter out the documents irrelevant to the user's interests and rerank the remaining documents according to the user's profile. The remaining documents are displayed in the user interface. The user can download and browse the documents of interest, and the system automatically modifies the user's interest model and expands the original query based on this feedback, so that the interest model really represents the user's interests. We also implemented a network information management function, which automatically classifies the documents retrieved by the EIS.

Further work: 1. improving the adaptive query expansion model; 2. optimizing the regulating factors α, β, γ, λ; 3. completing the system's functionality.
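The filter-and-rerank step described in the abstract can be illustrated with a short sketch. It assumes that documents and the user profile are term-weight dictionaries and that relevance is measured by cosine similarity against a fixed threshold; the abstract does not say which similarity measure or threshold the prototype uses, so `cosine`, `filter_and_rerank`, and the `threshold` parameter are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two {term: weight} dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def filter_and_rerank(results, profile, threshold=0.1):
    """Drop results dissimilar to the profile, then sort the rest.

    results: list of (url, term_vector) pairs in search-engine order.
    profile: the user's interest model as a term vector.
    The similarity measure and threshold are assumptions.
    """
    scored = [(cosine(vec, profile), url) for url, vec in results]
    kept = [(score, url) for score, url in scored if score >= threshold]
    # The sort is stable, so equal scores keep the engine's original order.
    kept.sort(key=lambda pair: -pair[0])
    return [url for _, url in kept]
```

Under these assumptions, the same profile vector would be the one updated from download and browsing feedback, so each round of feedback changes both what is filtered out and how the remainder is ordered.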

  • 【Classification code】TP311.52
  • 【Citation count】4
  • 【Download count】278