
Research on Key Technologies of XML Data Query (XML数据查询的关键技术研究)

Key Technologies in XML Data Query Process

【Author】 Zhao Jiuzhen (赵九震)

【Advisor】 Zhang Shidong (张世栋)

【Author information】 Shandong University, Computer Software and Theory, 2010, Master's thesis

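【Abstract (translated from Chinese)】 XML, short for Extensible Markup Language, is simple, extensible, interoperable, and open. It is rapidly becoming a technology-independent standard and transmission format for data exchange, and is gradually becoming the de facto standard for representing and exchanging data in current network applications. Because XML has broad application prospects in many fields, many XML research topics are at the frontier and attract wide attention. In the database field, for example, XML used as a database can in a sense represent nested data naturally and is more expressive than a relational database; however, querying XML data still has many shortcomings, and both query accuracy and query speed need further improvement. An XML data management system mainly addresses storage management, query processing, access control, and updating of XML data. XML query processing and optimization covers XML query algebra, query processing, and query optimization. XML data query is a very important part of XML data management and a current focus of academic research. By query mode, XML queries fall into two classes: the XML Query mode and the XML IR mode. The XML IR mode can be further divided into three kinds: XML IR/keyword, XML IR/query, and XML IR/fragment. This thesis studies problems encountered in integrated querying of XML data and the corresponding solutions. It has three main parts. First, because XPath is the basis of the currently popular XML query languages XQuery and XSLT, we design an optimization method for complex path expressions in XPath to improve execution efficiency when querying XML. Second, building on OrientXA, a currently popular query algebra, and on the principle of algebraic-expression equivalence, we design a series of equivalence transformations that simplify the algebraic representation of XML query path expressions and improve XML query efficiency. Third, integrated queries over multiple XML data sources often have to handle similar or duplicated information across several XML fragments, and sometimes the information shared by those fragments must be extracted; we therefore propose a directed labeled tree model for XML and design a similarity matching algorithm on this model to mine the common information. Experiments show that the algorithm is highly feasible and of practical value.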
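As a rough illustration of what rewriting a complex path expression into an equivalent, simpler one can look like, the sketch below compares two XPath expressions that select the same nodes. It is not the optimization method of the thesis: the sample document, the two expressions, and the variable names are assumptions made for illustration, and it uses only the limited XPath subset supported by Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not taken from the thesis).
DOC = """
<library>
  <book id="1"><title>XML Basics</title><author>A</author></book>
  <book id="2"><author>B</author></book>
  <book id="3"><title>XPath</title><title>Second title</title></book>
</library>
"""

root = ET.fromstring(DOC)

# Original expression: descend to every <title>, then step back up to its parent.
verbose = root.findall("./book/title/..")

# Rewritten expression: one child step with an existence predicate.
rewritten = root.findall("./book[title]")

verbose_ids = [book.get("id") for book in verbose]
rewritten_ids = [book.get("id") for book in rewritten]

# Both expressions select the books that have at least one <title> child.
print(verbose_ids, rewritten_ids)
```

The rewritten form avoids materializing the intermediate `title` node set, which is the general kind of saving a path-expression optimizer looks for.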

【Abstract】 XML, which stands for Extensible Markup Language, offers simplicity, extensibility, interoperability, and openness, and is fast becoming a technology-independent standard and transmission format for data exchange. Because XML has broad application prospects in many fields, many studies of XML are frontier and hot topics. In the database field, for example, XML used as a database can in a sense represent nested data naturally and has stronger expressive power than a relational database, but XML data query still has many deficiencies, and its efficiency and speed need considerable improvement. An XML data management system mainly handles XML data storage management, query processing, access control, and data updates. XML data storage management technology includes data storage, data encoding, indexing, and other methods. XML query processing and optimization includes XML query algebra, query processing, and query optimization. XML data query is a very important component of XML data management and a hot topic in current academic research. According to the query pattern, XML queries can be divided into two main kinds: the XML Query mode and the XML IR mode. XML IR can be further divided into three kinds: XML IR/keyword, XML IR/query, and XML IR/fragment. This thesis studies some of the key technologies of XML queries and consists mainly of three parts. First, we briefly introduce the basic concepts of XML data query and design an optimization method for XPath expressions. Second, we analyze the concepts and representation of OrientXA, a currently popular XML query algebra, and propose a new query optimization approach based on this algebra. Finally, because the XML query process often involves several similar XML fragments, and we sometimes want to find the same information in different XML data, this thesis presents a labeled tree model of XML data and, based on this model, designs a matching algorithm. Experiments show that the algorithm has high feasibility.
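To make the idea of finding shared information in similar XML fragments more concrete, here is a minimal sketch under stated assumptions. It does not reproduce the thesis's labeled tree model or its matching algorithm: it flattens each fragment into a set of (label path, text) pairs and uses a plain Jaccard ratio as the similarity score. The function names `labeled_paths` and `compare` and the two sample fragments are hypothetical.

```python
import xml.etree.ElementTree as ET


def labeled_paths(elem, prefix=()):
    """Collect a (label path, text) pair for every node in the tree."""
    path = prefix + (elem.tag,)
    text = (elem.text or "").strip()
    paths = {(path, text)}
    for child in elem:
        paths |= labeled_paths(child, path)
    return paths


def compare(xml_a, xml_b):
    """Return a Jaccard similarity and the node descriptions both fragments share."""
    a = labeled_paths(ET.fromstring(xml_a))
    b = labeled_paths(ET.fromstring(xml_b))
    common = a & b
    # The union is never empty because each fragment has at least a root node.
    similarity = len(common) / len(a | b)
    return similarity, common


# Hypothetical fragments that describe the same item with different extra fields.
FRAG_A = "<book><title>XML</title><author>A. Author</author><year>2010</year></book>"
FRAG_B = "<book><title>XML</title><author>A. Author</author><price>30</price></book>"

similarity, common = compare(FRAG_A, FRAG_B)
print(round(similarity, 2))
for path, text in sorted(common):
    print("/".join(path), repr(text))
```

A set-of-paths comparison ignores sibling order and repeated children, so it is only a baseline; a tree-based matcher can take structure into account where this sketch cannot.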

【Key words】 XML Query; Query Optimization; Schema Matching
  • 【Online publication contributor】 Shandong University
  • 【Online publication year and issue】 2010, Issue 09