

Applied Research on Information Filtering Methods Based on Network Information Retrieval Technology

【Author】 张文左

【Advisors】 朱青; 庞晓东

【Author information】 Beijing University of Technology, Software Engineering, 2012, Master's degree

【Abstract】 Using web search engines to find needed information has become routine for most Internet users. The most heavily used are well-known general-purpose search sites such as GOOGLE, Baidu, and Yahoo, whose powerful search capabilities keep them near the top of global site traffic rankings. Many users also turn to these sites for reference information when they need to settle questions they are unsure about. Obtaining enough information from the Internet for one's own purposes has become an indispensable part of online activity, and network information retrieval and filtering have arguably become basic technologies that users cannot do without. Across a global population of hundreds of millions of Internet users, individual needs for retrieving information and obtaining content are inevitably varied and diverse. GOOGLE, Baidu, Yahoo, and other general-purpose search sites have continually improved their search technology and capabilities in response to changing user needs, raising their visibility and usefulness in order to secure steadily growing traffic. However they improve their technology, their guiding principle is necessarily to look for what the needs of many users have in common, that is, to start from a macro-level view, and so they inevitably cannot attend to users' individual needs. As a result, specialized search software with a narrow search scope but sufficient search depth has emerged, for example Suixin Information Search Software (随心信息搜索软件), Network Information Collection Expert (网络信息采集专家), Enterprise Information Search King (企业信息搜索王), Little Bee Collector (小蜜蜂采集器), and Train Collector (火车采集器). This thesis addresses users' individual needs: starting from the network difficulties faced in certain highly specialized professional positions, it applies standalone rather than general-purpose information collection technology (software) to make the retrieval, filtering, and crawling of specialized network information fast, accurate, comprehensive, and automated.

【Key words】 network; information; retrieval; filtering
  • 【Classification number】 TP391.3
  • 【Downloads】 105

