
Web信息检索及应用设计优化技术研究

Research on the Optimization Techniques of Web Information Searching and Application Design

【Author】 Zhang Hongsen (张宏森)

【Advisor】 Zhu Zhengyu (朱征宇)

【Author information】 Chongqing University, Computer Software and Theory, 2002, Master's thesis

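【Summary】 With the continuing development of information technology, information resources on the Web are growing at an unprecedented rate. Faced with this vast ocean of knowledge, users often feel helpless when looking for the information they need. Because they are convenient and fast, search engines have gradually become the main tool for information retrieval on the Web.

First, to address the shortcomings of traditional search engines in retrieval precision, recall, and ease of use, the author carefully analyzed the retrieval methods and basic structure of Web information retrieval systems and completed the following work.

To improve search engine performance, the author divides Web resources into three classes: Web page resources, multimedia resources, and Web site resources. Following the RDF resource metadata specification provided by the W3C, metadata description files for the three classes are given in XML form, together with a method for generating them automatically. Storing a resource's metadata in place of the resource itself greatly reduces the volume of data held by the search engine, makes retrieval easier, and supports retrieval of multiple kinds of resources.

Ordinary search engines, limited by their structure and by the data they store, cannot adequately solve problems in data collection, data storage, information querying, and ranking of query results. To improve on ordinary search engines structurally, the author designed the basic structure of a search engine based on RDF metadata.

Ordinary search engines generally collect information in a centralized way, which is inferior to distributed collection in both speed and performance. The author describes the distributed information collection method used in the RDF-metadata-based search engine. Combining distributed collection with resource metadata can greatly reduce information traffic on the network.

After observing and analyzing how a large number of users retrieve information with search engines, the author proposes a retrieval mode based on keyword expansion, gives a method for expanding keywords based on the resource metadata database, and designs the interface of a search engine that uses this mode. This retrieval mode better matches users' habits and can guide them to state their information needs accurately and completely.

In addition, current Web application design and development mainly organizes information with the Web page as the basic unit. Development done this way is inefficient, and later modification and maintenance require a great deal of work. To address these problems, the author proposes a modular Web page design and browsing technique. This design technique allows information to be organized and maintained efficiently and improves the efficiency of Web application design and development. During browsing, the more important parts of a page are presented to the user first, which improves browsing performance.

The problems that arise in browsing complex Web pages are also analyzed, and a new idea is proposed: organizing a complex page into multiple patterns according to its content. A pattern-based Web page browsing technique is introduced that noticeably improves browsing speed and effectively reduces network transfer time.

The research in this thesis has some academic significance and good practical reference value for further improving Web performance and optimizing retrieval techniques.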
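As an illustration of the resource metadata idea in the summary above, the following minimal sketch generates an RDF/XML description for a single Web page resource. It is not the thesis's actual metadata schema, which this record does not give. The function name `build_page_metadata` and the choice of Dublin Core properties (`dc:title`, `dc:type`, `dc:subject`) are assumptions made only for the example.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core elements.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def build_page_metadata(url, title, keywords, resource_type="webpage"):
    """Return an RDF/XML string describing one Web resource.

    Hypothetical helper: the property set used here is an assumption,
    not the schema defined in the thesis.
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": url}
    )
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    ET.SubElement(desc, f"{{{DC_NS}}}type").text = resource_type
    # One dc:subject element per keyword, in the order given.
    for kw in keywords:
        ET.SubElement(desc, f"{{{DC_NS}}}subject").text = kw
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_page_metadata(
        "http://example.org/index.html",
        "Example page",
        ["search engine", "metadata"],
    ))
```

A search engine built this way would index the small description record rather than the full page, which is the storage saving the summary refers to.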

【Abstract】 With the development of information technology, information resources on the Web are growing at an unprecedented speed. Faced with this huge ocean of information, users are often overwhelmed when searching for information on the Web. Because they are convenient and fast, search engines have become a main tool for information searching.

First, to address the shortcomings of traditional search engines in precision, recall, and convenience, the author carefully analyzed the search methods and basic structure of Web information searching systems and then completed the following tasks.

To enhance search engine performance, the author categorized Web resources into three types: Web page resources, multimedia resources, and Web site resources. The author then presented XML documents for the metadata of the three resource types, based on the RDF metadata standard supplied by the W3C, and introduced a method for generating them automatically. Storing a resource's metadata instead of the resource itself decreases the quantity of data in the database, makes information searching more convenient, and supports searching for multiple forms of resources.

Because of limits in their structure and data storage, common search engines cannot solve the problems in data collection, data storage, information searching, and sorting of search results. To improve the structure of common search engines, the author designed the structure of a search engine based on RDF metadata.

Common search engines gather information using a centralized method. Centralized information gathering is not as good as distributed information gathering in speed and performance. In the thesis, the author introduces the distributed information gathering method used in the search engine based on RDF metadata. Combining distributed information gathering with the resource metadata technique can lighten the burden on the network.

The author observed and analyzed the usage patterns of a large number of users searching for information with search engines, then presented a new search pattern based on keyword extension, gave a method for extending keywords based on the metadata database, and designed the interface of a search engine using this pattern. This search pattern suits users' habits better and can lead users to state their information requirements precisely and completely.

Moreover, in current Web application development, information is organized mainly in units of Web pages. This method makes development inefficient and makes maintenance of the application a large task. To address these problems, the author proposed a block-based design for Web pages. This technique can organize and maintain information more efficiently and thus improve the efficiency of Web application development. When a Web page is browsed, browsing performance is improved by making the important parts of the page appear as early as possible.

The thesis also analyzed the problems in browsing complicated Web pages, presented a new idea of organizing a Web page into multiple patterns, and introduced a technique for browsing a Web page based on these patterns. This technique can noticeably increase browsing speed and reduce the transfer time of Web pages.

The work in this thesis has some academic significance and practical value for enhancing Web performance and optimizing information searching.
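The keyword-extension search pattern described in the abstract can be pictured with a small sketch. This record does not describe the thesis's extension algorithm, so the code below swaps in a simple co-occurrence count as an assumption. For a given query keyword, it suggests the other keywords that most often appear alongside it in the metadata records. The record layout (a dict with a `keywords` list) and the function name `expand_keyword` are hypothetical.

```python
from collections import Counter


def expand_keyword(keyword, records, top_n=3):
    """Suggest up to top_n keywords that co-occur with `keyword`.

    Assumed approach (co-occurrence counting over metadata records);
    not taken from the thesis. Matching is case-insensitive.
    """
    target = keyword.lower()
    counts = Counter()
    for rec in records:
        kws = {k.lower() for k in rec["keywords"]}
        if target in kws:
            # Count every other keyword attached to the same resource.
            counts.update(kws - {target})
    # Most frequent first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [k for k, _ in ranked[:top_n]]


if __name__ == "__main__":
    sample = [
        {"url": "http://example.org/a", "keywords": ["Search Engine", "RDF", "metadata"]},
        {"url": "http://example.org/b", "keywords": ["search engine", "metadata", "XML"]},
        {"url": "http://example.org/c", "keywords": ["multimedia", "XML"]},
    ]
    print(expand_keyword("search engine", sample))
    # prints ['metadata', 'rdf', 'xml']
```

An interface built on such a function could show the suggestions next to the query box so that the user refines the request before the search is run.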

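  • 【Online publication contributor】 Chongqing University
  • 【Online publication year and issue】 2003, Issue 01
  • 【Classification number】 TP393.03
  • 【Download count】 205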