

Research and Implementation of P2P Search Engine Based on JXTA

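【Author】 Sun Saisai

【Advisor】 Meng Xiaojing

【Author information】 Shandong University of Science and Technology, Computer Application Technology, 2007, Master's thesis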

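【Abstract (translated from the Chinese)】 Search engines solve the problem of finding information. However, because traditional search engines use a centralized architecture, many problems remain, such as server failures, limited storage capacity, and stored links that are not updated in time, all of which seriously degrade search engine performance. P2P technology is distributed, dynamic, and scalable, and applying it to search engines brings new vitality to their development. This thesis explores a way of introducing the ideas and technical advantages of P2P into a search engine system. The main research content and the problems addressed are: (1) Existing P2P applications are all developed from the bottom layer up, share no common standard, and are therefore incompatible with one another. For this reason the system design adopts JXTA, the general-purpose development platform from Sun, as the development standard for the P2P network, and builds a basic P2P communication network on top of the JXTA protocols. (2) Resource discovery is a difficult problem in P2P networks. The implementation uses IP multicast for multicast search inside the firewall and HTTP for search across the firewall. It also defines a "search" peer group that provides a group membership service and confines communication traffic to the peer group, so that traffic does not spread through the network unnecessarily. (3) A second-stage sorting module is added, which gathers and sorts the results from multiple peers before displaying them to the querying user. Taking into account the dynamic nature of P2P systems and the characteristics of user needs, and building on the Lucene scoring mechanism, a second-stage scoring mechanism is proposed to suit search in P2P networks. (4) A peer group management service, a pipe communication service, a message management service, a content download service, and a local resource management service are defined on top of the P2P network, and an easy-to-use application interface is designed, yielding a complete JXTA-based P2P search engine system. Finally, the thesis presents the implementation of the system, which was tested and analyzed in a LAN environment. The tests show that the system effectively mines information held on computers at the edge of the network and makes full use of their computing and storage capacity, and that it has considerable practical value and prospects for wider adoption.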

【Abstract】 Search engines solve the difficult problem of finding information. However, because traditional search engines adopt a centralized architecture, some problems remain, such as server failures, limited storage capacity, and outdated links that are not updated in time, all of which seriously affect search engine performance. P2P is distributed, dynamic, and scalable, and applying P2P technology to search engines brings them new energy. This thesis discusses how to bring the ideas and technical advantages of P2P to search engines. The main research content and the problems addressed are as follows: (1) Because existing P2P applications are developed from the bottom up with no common standard, they are not compatible with each other. The design therefore chooses the P2P platform JXTA as the development standard for the P2P network, and a basic P2P communication network is built on the JXTA protocols. (2) Resource discovery is a difficult point in P2P networks. IP multicast is used for multicast search inside the firewall, and HTTP is used to search across the firewall. In addition, a search peer group with a membership service is defined, and communication traffic is confined to the peer group to avoid needless communication. (3) Because search results come from several peers, a second sorting module gathers and sorts the results before displaying them to users. Considering the dynamic character of P2P and user demands, and building on the Lucene scoring mechanism, a second sorting mechanism is proposed to suit search in P2P networks. (4) A peer group management service, a pipe communication service, a message management service, a content download service, and a local resource management service are defined on top of the P2P network, and a user-friendly interface is designed, so that a complete P2P search engine based on JXTA is built. Finally, the implementation of the JXTA-based P2P search engine is presented, and the system is tested and analyzed in a LAN. The experimental results show that the system can effectively mine information stored on computers at the edge of the network and make full use of their computing and storage capacity, so it has considerable practical value and good prospects for wider adoption.
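
The abstract does not give the formula behind the second sorting step in item (3), so the following is only a minimal sketch of the general idea, not the thesis's method. It assumes each peer returns hits scored by its own local index, that such scores may not be directly comparable across peers, and that a per-peer weight (for example, a measure of peer reliability) is available. The names `Hit`, `rerank`, `peer_weight`, and `alpha`, and the linear combination itself, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    peer_id: str        # peer that returned the hit
    doc_name: str       # file name or title reported by the peer
    local_score: float  # score from that peer's local index (e.g. Lucene)


def rerank(hits, peer_weight, alpha=0.7):
    """Merge hits from several peers and sort them by a combined score.

    combined = alpha * normalized_local_score + (1 - alpha) * peer_weight
    Both the formula and alpha are assumptions made for illustration.
    """
    # Highest local score per peer, used to rescale that peer's scores to [0, 1].
    best = {}
    for h in hits:
        best[h.peer_id] = max(best.get(h.peer_id, 0.0), h.local_score)

    scored = []
    for h in hits:
        top = best[h.peer_id]
        normalized = h.local_score / top if top > 0 else 0.0
        weight = peer_weight.get(h.peer_id, 0.0)  # unknown peers get weight 0
        scored.append((alpha * normalized + (1 - alpha) * weight, h))

    # Highest combined score first; ties broken by name for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1].doc_name))
    return scored


hits = [
    Hit("peerA", "jxta-intro.pdf", 2.0),
    Hit("peerA", "notes.txt", 1.0),
    Hit("peerB", "lucene-guide.doc", 0.5),
]
ranked = rerank(hits, peer_weight={"peerA": 0.2, "peerB": 1.0})
for score, hit in ranked:
    print(f"{score:.2f}  {hit.peer_id}  {hit.doc_name}")
```

Normalizing per peer before merging keeps a peer whose index happens to produce large raw scores from dominating the merged list. The weight term is where P2P-specific factors could enter, whatever the thesis actually uses there.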

【Key words】 search engine; P2P; JXTA; Lucene
  • 【Classification number】 TP391.3
  • 【Downloads】 473

