

Study on the Evaluation of Performance of Search Engines’ Features

【Author】 Fei Wei (费巍)

【Advisors】 Peng Feizhang (彭斐章); Zhang Jin (张进)

【Author information】 Wuhan University, Library Science, 2010, Ph.D.

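【Abstract (translated from Chinese)】 Search engine evaluation is one of the active research topics in information retrieval, and the growth of web information and retrieval technology has driven the practical development of search engines. To meet users' growing information needs, search engines have refined their simple search function and have also kept developing advanced search features. These features are intended to help users obtain high-quality web information, yet their retrieval performance is not well understood. This study takes the relevance and the ranking quality of search results as its two core evaluation indicators and evaluates the main search features of current mainstream search engines. The findings can help users choose an appropriate search strategy when using a search engine, and they show how different search features affect a search engine's retrieval performance.

Chapter 1 reviews the recent state of research on search engines and their evaluation. Drawing on a large body of literature, it summarizes the research content, methods, characteristics, shortcomings, and development trends. Current evaluation research centers on relevance; its main methods are experiments, surveys, data analysis, observation, and reviews and commentary; and it is characterized by dependence, dynamism, diversity, and an emphasis on user participation. Shortcomings remain, chiefly the lack of comparisons of retrieval effectiveness across search features and of evaluations of result ranking quality. With the growth of multimedia information, the evaluation of multimedia search features is bound to become a research focus.

Chapter 2 argues that relevance is the basic indicator in search engine evaluation and that the ranking quality of search results derives from it. Relevance is judged by the form and content of web pages; ranking quality is determined by the order of the results and the stability of the ranking. Around these two core indicators the author builds an evaluation framework and, following set criteria, selects five Chinese- and English-language search engines and five search features as research objects. The English search engines are Google, Yahoo, and MSN/Live/Bing; the Chinese search engines are Baidu and Google China (谷歌). The five search features are title search, phrase search, PDF search, URL search, and regular search, with regular search serving as the baseline for comparison.

Chapter 3 states the research hypotheses and designs the experimental procedure. The Analytic Hierarchy Process is applied to the relevance indicators, and from a larger set the author selects full text, abstract, title, page validity, user burden, and page length as the core indicators of a page's relevance. The method for calculating result relevance is revised, and the revised formula is used to measure the overall relevance of the results returned by each search feature. Analysis of variance is used to test whether retrieval effectiveness differs significantly among the search features of a search engine. If it does, the Tukey multiple comparison test is used to investigate the source of the difference. Regression analysis is used to evaluate the order and stability of the result ranking.

Chapter 4 applies analysis of variance to 50,000 data records to evaluate the five search features of the five search engines. The results show significant differences in retrieval effectiveness among the search features, and the Tukey multiple comparison test identifies their source. Among the search features, PDF search is the most effective, followed by title search, regular search, phrase search, and URL search. In the stability assessment, regular search is more stable than the other search features. Among the English search engines, Yahoo! is more effective than Google and MSN/Live/Bing in all five search features, Google is second, and MSN/Live/Bing is the least effective. Among the Chinese search engines, Google China is clearly more effective than Baidu in title search, regular search, PDF search, and URL search, while the two show no significant difference in phrase search.

Chapter 5 uses the curve estimation procedure of regression analysis to compare the ranking quality of the five search features of the five search engines. Among the English search engines, regular search has the best ranking quality and URL search the worst. Among the Chinese search engines, URL search has the worst ranking quality; PDF search has the best ranking quality in Baidu, and title search has the best in Google China. The data show that the ranking quality of the Chinese search engines lags considerably behind that of the English search engines.

Chapter 6 notes that data collection and analysis revealed a large gap between the Chinese and English search engines in both retrieval effectiveness and result ranking. To address the current problems of Chinese search engines, the author proposes optimization strategies: strengthening the quality of Chinese web pages and promoting open access, which together would raise the quality of Chinese web resources at the source. Search engines should have strong information filtering capabilities and should be cautious about commercial practices that artificially interfere with the ranking of search results.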

【Abstract】 The evaluation of search engines is one of the popular issues in the field of information retrieval. The development of Internet information and of information retrieval technologies has accelerated the development of search engines. Besides simple search, search engines have developed many advanced search features. These features aim to help users find the information they need, but their actual performance is still largely unknown. This study takes the relevance and the ranking quality of retrieval results as two key indexes to evaluate the main search features of popular search engines. The findings can assist users in formulating an appropriate search strategy to improve search effectiveness, and they shed light on how search engines react to different types of search features in terms of retrieval effectiveness.

In the first chapter, the author discusses the state of research on search engines and their evaluation and summarizes its content, methods, characteristics, deficiencies, and development trends. At present, relevance is the core content of search engine evaluation. Experimentation, observation, investigation, data analysis, and review are the main research methods. Search engine evaluation research is characterized by dependence, dynamism, diversity, and an emphasis on user participation. Few findings compare search effectiveness between different search features or assess the ranking quality of search results. With the development of multimedia information, the evaluation of multimedia retrieval features will become one of the major research issues.

In the second chapter, the author points out that relevance is the basic index and that the index of ranking quality of search results is derived from it. Relevance can be judged from the form and content of the retrieved web pages, and ranking quality is determined by the sequence and stability of the search results. Based on these two key indexes, the author sets up an evaluation system. Following defined standards, five search engines and five search features are selected. There are three English search engines (Google, Yahoo!, and Bing) and two Chinese search engines (Baidu and Google China). The five search features are title search, phrase search (exact search), PDF file format restriction search, URL search, and regular search; the results of a regular search serve as the baseline for comparison and analysis within a search engine.

In the third chapter, the overarching research question is whether the use of advanced search features enhances retrieval effectiveness in a search engine. Based on this question, the null hypotheses of the study are developed. Using the Analytic Hierarchy Process (AHP), the author selects full text, abstract, title, validity of the web page, user's burden, and length of the web page as the indicators for evaluating the relevance of search results. A revised relevance measure is used to evaluate the effectiveness of a search feature. One-way ANOVA is applied to test whether there are significant differences in effectiveness among the search features. If there are, the Tukey method is used to detect what causes them. Regression analysis is applied to assess the sequence and stability of the ranking of search results.

In the fourth chapter, one-way ANOVA is used to evaluate the effectiveness of the five search features of the five search engines based on 50,000 data records. The findings show significant differences between search features, so the Tukey method is used to detect their cause. Among the search features, PDF file format restriction search achieves the best retrieval effectiveness. Yahoo! achieves the best retrieval effectiveness among the three English search engines in all search features. Google China achieves better retrieval effectiveness than Baidu in title search, regular search, PDF file format restriction search, and URL search, but there is no significant difference in phrase search.

In the fifth chapter, regression analysis is applied to analyze the ranking quality of the five search features of the five search engines. Regular search achieves the best ranking quality among the search features of the English search engines, and URL search has the worst ranking quality in all search engines. PDF file format restriction search achieves the best ranking quality among Baidu's five search features, while title search achieves the best ranking quality in Google China. The ranking quality of the search features of the Chinese search engines is lower than that of the English search engines.

In the sixth chapter, the author notes that, over the course of data collection and analysis, the Chinese search engines achieved worse results in both retrieval effectiveness and ranking quality. The author puts forward optimization strategies to improve Chinese search engines in both respects. Attention should be paid to the quality of web pages from the very beginning of their creation, and open access should be promoted in China to improve the quality of Chinese web resources. Search engines can develop powerful features to filter out search results in which users have no interest, and they should be cautious when applying a bid ranking policy.
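The one-way ANOVA step mentioned in the third and fourth chapters can be illustrated with a minimal sketch. It is not the author's procedure or data: the function name `one_way_anova_f` and the relevance scores are hypothetical, chosen only to show how the F statistic compares between-feature variance with within-feature variance. The p-value lookup and the Tukey follow-up comparison are omitted.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of scores per search feature.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical relevance scores for three search features (not thesis data).
scores = {
    "regular": [1, 2, 3],
    "title": [2, 3, 4],
    "pdf": [5, 6, 7],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value of the F distribution with the reported degrees of freedom would lead to rejecting the null hypothesis that all search features are equally effective; only then would a pairwise test such as Tukey's be used to locate which features differ.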

  • 【Online publication contributor】 Wuhan University
  • 【Online publication year and issue】 2010, Issue 10