Query Processing and Research for Structural Join Against XML Data (基于结构连接的XML查询处理与研究)

【Author】 Jia Bei (贾蓓)

【Advisor】 Bao Xiaoyuan (包小源)

【Author information】 Tianjin Normal University, Computer Application Technology, 2008, Master's thesis

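【Abstract, translated from the Chinese】 Since the W3C proposed XML in 1998, it has rapidly become the standard for data representation and data exchange on the Internet. XML documents have proliferated, and the effective management of XML has drawn wide attention. Because XML data has a tree structure unlike traditional data forms, traditional database techniques do not work effectively, so new processing methods tailored to its characteristics are needed. To solve the key technical problems in XML path query processing and to offer a practical solution for larger-scale XML query applications, this thesis presents a system framework for XPath queries, defines the XPath syntax the system can process, and implements a query processing system for XML documents. Structural join is the core operation of XML query processing, and implementing it efficiently is the key to improving query performance. To make structural joins efficient, this thesis builds on region encoding of XML data and applies a filter-based twig structural join technique in the query system. The concepts of source path and path containment are introduced into the filtering algorithm, reducing the number of paths in the PSet set. Holistic twig joins with and without the filtering algorithm were compared experimentally, and the results show that the holistic twig join with filtering performs better. Existing XML structural join algorithms are all built on node encodings, and there are currently many node encoding schemes and corresponding structural join algorithms. This thesis systematically summarizes and compares a variety of structural join algorithms and analyzes their differing performance.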

【Abstract】 Since the W3C introduced XML in 1998, it has become the standard for data representation and exchange on the Internet and has been adopted in many fields. This has created a new set of data management requirements for XML. Because XML data is tree-structured and is used in new application environments, traditional database technologies cannot process it efficiently, and techniques designed specifically for XML data are needed. This thesis focuses on path expression processing, so that the key issues in large-scale XML query applications can be addressed with feasible approaches. We propose a system framework for XPath queries, define the XPath grammar that the system can handle, and present a query processing system. Structural join is the core operation of XML query processing, and its efficient implementation is the key to improving query performance. Building on the region numbering scheme for XML data, we introduce a filter-based twig structural join technique. Unlike previous algorithms, the filtering algorithm uses path encoding information to filter the query pattern and the data set, leaving only the elements that take part in the structural join; a twig join algorithm is then applied to these elements. We introduce the concepts of source path and path containment, which reduce the number of paths in the PSet set. We carried out experiments comparing twig joins with and without the filtering algorithm. The results show that the twig join algorithm with filtering performs well on both synthetic and real-world datasets and scales well. Existing XML containment join algorithms are built on XML node encodings, and researchers have proposed many encoding schemes with corresponding containment join algorithms. We summarize the various structural join algorithms and, finally, analyze their performance.
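
For readers unfamiliar with the region numbering scheme named in the abstract, the idea can be illustrated with a minimal sketch. This is not the thesis's filter-based twig join: it is a generic, textbook-style stack-based ancestor-descendant join, and the function names `region_encode` and `structural_join` and the sample document are hypothetical. Each element receives a (start, end) interval from one depth-first pass, and element a is an ancestor of element d exactly when a.start < d.start and d.end < a.end.

```python
import xml.etree.ElementTree as ET


def region_encode(root):
    """Label every element with (tag, start, end, level) in document order.

    start/end come from a single counter that ticks once when an element
    opens and once when it closes, so an ancestor's interval strictly
    contains the interval of each of its descendants.
    """
    labels = []
    counter = 0

    def visit(elem, level):
        nonlocal counter
        counter += 1
        start = counter
        slot = len(labels)
        labels.append(None)          # reserve the slot to keep document order
        for child in elem:
            visit(child, level + 1)
        counter += 1
        labels[slot] = (elem.tag, start, counter, level)

    visit(root, 1)
    return labels


def structural_join(ancestors, descendants):
    """Return (a, d) pairs where a's region contains d's region.

    Both inputs must be sorted by start. The stack always holds a chain of
    nested ancestor candidates that are still open at the current position.
    """
    result = []
    stack = []
    i = 0
    for d in descendants:
        # Push every ancestor candidate that opens before d.
        while i < len(ancestors) and ancestors[i][1] < d[1]:
            a = ancestors[i]
            while stack and stack[-1][2] < a[1]:
                stack.pop()          # closed before a opened
            stack.append(a)
            i += 1
        # Drop candidates that closed before d opened.
        while stack and stack[-1][2] < d[1]:
            stack.pop()
        # Everything left on the stack contains d.
        result.extend((a, d) for a in stack)
    return result


# Hypothetical sample document; evaluates the pattern b//c.
doc = ET.fromstring("<a><b><c/></b><c/><d><b><c/></b></d></a>")
labels = region_encode(doc)
bs = [l for l in labels if l[0] == "b"]
cs = [l for l in labels if l[0] == "c"]
pairs = structural_join(bs, cs)
print([(a[1], d[1]) for a, d in pairs])  # [(2, 3), (9, 10)]
```

A parent-child join would add the condition that the two levels differ by exactly one. Both input lists are read once in start order and the stack holds only currently open ancestors, which is the usual reason this family of joins is preferred over a nested-loop comparison.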

【Key words】 XML; XPath; Numbering Scheme; Filter; Structural Join
  • 【Classification number】 TP312.2
  • 【Times cited】 1
  • 【Downloads】 104