节点文献

基于对等网络的语义发布/订阅系统的关键技术研究

Research on Key Technologies of P2P-based Semantic Publish/Subscribe Systems

【作者】 施冬材

【导师】 董金祥; 尹建伟;

【作者基本信息】 浙江大学 , 计算机应用技术, 2007, 博士

【摘要】 随着Internet的飞速发展、移动设备和宽带的普及,网络计算的复杂度越来越高。新一代网络计算是以大规模、分散控制、动态性、自治性和松耦合为主要特征的分布式计算,而发布/订阅系统具有松耦合、匿名、多对多通信和可扩展的特点,已成为支持新一代网络计算的重要基础中间件平台。发布/订阅系统在基于内容的数据模型、匹配算法以及路由算法方面已取得相当成熟的研究成果,但还不能迎合新一代网络计算提出的挑战,主要表现在缺乏对语义事件和语义路由的支持上。论文结合国家863课题,主要研究发布/订阅系统的语义数据模型、语义匹配算法和基于对等网络的语义路由算法等关键技术,研制面向新一代网络计算的发布/订阅原型系统——钱塘语义发布/订阅系统(JTang Semantic Publish/Subscribe System,简称JTangPS),为新一代网络计算提供有力的中间件支持。论文第一部分论述发布/订阅系统的研究背景和研究意义。在阐述发布/订阅系统基本模型和比较相关通讯模型的基础上,介绍数据模型、匹配算法和路由算法等关键技术的国内外研究现状,总结各种方法的优缺点。分析新一代网络计算对发布/订阅系统提出的新需求,介绍发布/订阅系统的研究热点。第二部分提出一种基于组件的分层的语义发布/订阅系统体系结构。在分析几种典型发布/订阅系统体系结构优缺点的基础上,介绍发布/订阅系统的设计框架。在设计框架下讨论JTangPS的基本实现技术,并从分层体系结构和具体实现架构两个层次上介绍JTangPS的体系结构。JTangPS的体系结构具有松耦合的特点,系统中的组件可以被相同功能不同实现的其他组件替换,能快速满足不同应用场合的需求。第三部分提出一种基于Web本体描述语言OWL和资源描述框架RDF的语义数据模型。语义数据模型是系统理解语义信息的基础,由概念模型、事件模型和订阅模型组成。概念模型采用OWL语言,事件模型采用RDF图,订阅模型采用RDF图模式。概念模型可以直接使用现有网络上的OWL本体,而普通的RDF图都可以表述为事件。JTangPS的语义数据模型解决了数据信息语法异构而语义同构的问题,使事件和订阅能被机器无歧义地理解和处理。在此基础上,提出一种新的订阅语言RESL,该语言类似于现有的RDF查询语言。第四部分提出一种基于RDF图的快速匹配算法。高效的匹配算法是调和系统丰富表达能力和可扩展性矛盾的关键。JTangPS语义匹配算法的基本思想是把事件图和订阅图分解为一系列弧的集合,以弧作为匹配的基本单位;充分利用订阅之间的重叠性,为订阅和事件建立索引,缩小匹配范围;把事件转化为索引结构时,考虑到属性之间的语义关系,添加等价属性和祖先属性到索引;并结合订阅变量的类型约束检查,实现订阅和事件的语义匹配;通过订阅变量绑定表的自然连接操作消除不必要的约束检查,提高事件匹配效率。实验结果表明,该匹配算法在性能上优于G-ToPSS,远优于把事件同每个订阅进行匹配运算的简单匹配算法SMA,是一种高效的语义事件匹配算法。第五部分提出一种结构化P2P网络上基于集结点的语义路由算法。基于DHT的结构化P2P网络具有自组织性、容错性和扩展性的特点,不仅能够适应于网络的动态变化,还能够保证资源发现的准确率,很适合作为发布/订阅系统的底层结构。JTangPS语义路由算法的基本思路是根据订阅和事件的域标识、属性个数以及属性名映射订阅和事件到集结点,采用P2P的内在路由机制和聚合优化措施来分发事件。通过映射属性名,解决结构化P2P网络上DHT映射精确性与数据模型复杂性之间的矛盾,支持语义路由的同时,避免映射对订阅语言的约束;通过属性个数控制集结点数目和限制事件发布的目的地,减少事件发布流量,避免不必要的事件匹配计算开销;采用P2P的内在路由机制和聚合优化措施来分发事件,充分利用P2P网络容错性的同时,降低事件的路由流量。实验结果表明,在大规模的发布/订阅下,JTangPS的语义路由算法在性能上优于基于逆向路径转发的路由算法,并在路由效率、网络资源消耗、订阅维护效率和扩展性等方面取得了良好的平衡效果。第六部分探讨JTangPS原型系统的具体实现,通过RSS文档分发的例子介绍系统的应用,验证上述几章所讨论的系统体系结构、语义数据模型、语义匹配算法和语义路由算法。

【Abstract】 With the rapid development of Internet, mobile devices and broadband networks,network computing is getting more and more complex. The main characteristics of thenew generation of network computing are large-scale, decentralized control, dynamic,and autonomous. Since publish/subscribe systems have the advantages of loosecoupling, anonymity, many-to-many communication and scalability, they have becomeimportant infrastructure middleware platforms support for the new generation ofnetwork computing.Publish/subscribe systems have gained fairly mature research results oncontent-based data models, matching algorithms and routing algorithms, but they cannot yet meet challenges posed by the new generation of network computing, mainlydue to the lack of support for semantic events and semantic routing. In order to providepowerful middleware support for the new generation of network computing, this thesismakes the research on key technologies including semantic data model, semanticmatching algorithm, and semantic routing algorithm based on structural P2P networks,and develops a prototype of semantic publish/subscribe system called JTangPS (JTangSemantic Publish / Subscribe System). The work has been supported by the NationalHigh-Tech. R&D Program, China.In the first part of thesis, we discuss the research background and significance ofpublish/subscribe system. After explaining the basic publish/subscribe system modeland comparing it with relevant communication models, we introduce key technologyresearch situation on data model, matching algorithm and routing algorithm, summingup the advantages and disadvantages of various methods. Then we analyze therequirements of publish/subscribe systems proposed by the new generation of networkcomputing and introduce the research hotspots of publish/subscribe systems.In the second part, we propose a component-based layered architecture forsemantic publish/subscribe system. After analyzing some typical publish/subscribesystem architectures, we introduce various design frameworks for publish/subscribesystems and discuss basic implementing technologies for JTangPS under the givenframework. At last, we present the architecture of JTangPS on two different levels ofstructures, namly layered architecture and concrete implementing structure. JTangPShas a loosely coupled architecture in which system components can be replaced byones with the same functions but different implementations, thereby quickly meetingthe demands of various application scenarios. In the third part, we propose a new semantic data model based on OWL (WebOntology Language) and RDF (Resource Description Framework). The semantic datamodel includes conceptual model, event model and subscription model, which is thebase of understanding semantic information for publish/subscribe systems. Theconceptual model uses OWL.language; the event model uses RDF graph; and thesubscription model uses RDF graph pattern. The concept model can directly employexisting OWL ontologies on networks, and ordinary RDF graphs can be treated asevents. The semantic data model of JTangPS solves the problem on data withheterogeneous syntactic structures but homogeneous semantics, so that events andsubscriptions can be understood and dealt by machines without ambiguously. Inaddition, we propose a new subscription language RESL (RDF Event SubscribeLanguage) which is similar to existing RDF query languages.In the fourth part, we propose a fast RDF graph based matching algorithm.Efficient algorithm is a key technology to reconcile rich expressiveness with systemscalability. The basic idea behind the semantic matching algorithm of JTangPS is:devising an event or a subscription into a series of arcs of which each is the basicmatching unit; creating indexes for subscriptions and events to make full use of theoverlap between subscriptions and narrow the matching scope; adding the equivalentsand ancestors of attribute to index structures and being combined with the type checkof variable bindings of subscriptions to achieve semantic matching; doing natural joinoperations on variable binding tables to eliminate unnecessary constraint checks,thereby improving event matching efficiency. Experimental results demonstrate thatthe algorithm is superior to G-ToPSS in performance and much better than SMA(Simple Matching Algorithm) which simply matches events with each subscription.In the fifth part, we propose a rendezvous-based semantic routing algorithm onstructured peer-to-peer networks. DHT-based P2P networks are so self-organizing,fault-tolerant and scalable that not only it adapts to the dynamic changes of networks,but also guarantees the resources discovery accuracy, hence being very suitable as thesubstrate of publish/subscribe system. The basic idea behind the semantic routingalgorithm of JTangPS is to map events and subscriptions to rendezvous according tothe combination of domain identifier, the number of attributes and attribute name.Mapping attribute name not only solves the conflict between precision in data mappingand complexity of data model on DHT-based P2P networks, but also avoids theconstraints on subscription language while mapping subscriptions to rendezvous;attribute number controls the number of rendezvous and limits the destinations towhich events are published, thereby reducing the traffic of event publication and avoiding unnecessary event matching computation; it makes full advantage offault-tolerance of P2P and reduces the routing flow for event delivery to employ theembedded routing mechanism of P2P and the aggregation optimization approach.Experimental results show that for large-scale publish/subscribe, the semantic routingalgorithm of JTangPS is superior to the reverse-path based forwarding routingalgorithm and achieves a good balance among routing efficiency, network resourceconsumption, subscription maintenance efficiency and system scalability.In the sixth part, we explore the implementation of JTangPS prototype system andintroduce the practical application of JTangPS by the example of selectivelydisseminating RSS feeds, which demonstrates the system architecture, the semanticdata model, the semantic matching algorithm and the semantic routing algorithmdiscussed in previous parts.

  • 【网络出版投稿人】 浙江大学
  • 【网络出版年期】2007年 06期
节点文献中: 

本文链接的文献网络图示:

本文的引文网络