

Study and Implementation of the Heterogeneous Database Access and Integration in CGSP

【Author】 Shi Wanbing (石万兵)

【Advisor】 Li Lian (李廉)

【Author information】 Lanzhou University, Computer Software and Theory, 2008, Master's thesis

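【Abstract (translated from the Chinese)】 In recent years, grid computing has attracted wide attention from the scientific community as an emerging technology and has been called the next-generation Internet. A grid can bring together resources that are geographically distributed, heterogeneous, and of varying performance, including both hardware and software resources, into virtual organizations, connecting them over high-speed networks to form a wide-area environment for resource sharing and collaborative computing. As grid technology has developed, the data it handles has become increasingly complex and large, and many grid application services increasingly need to access large heterogeneous databases. This creates an urgent need for middleware that can access and integrate heterogeneous databases, and the problem discussed in this thesis arises from that background. Set in the context of the ChinaGrid Support Platform (CGSP) project, the thesis proposes an infrastructure for heterogeneous database access and integration in a grid environment. Building on a study of Open Grid Services Architecture Data Access and Integration (OGSA-DAI), it extends the original OGSA-DAI core: it introduces the concepts of virtual table, physical table, and temporary intermediate database; designs and implements an interpreter and a dispatcher for SQL query statements; handles queries over massive data by means of file streams; and extends the data transfer module so that it integrates tightly with the CGSP system as a whole. For users, a unified access interface is provided through Web services, making it convenient to share, query, and use resources; most importantly, distributed joint queries across multiple heterogeneous databases are supported. The thesis first introduces grid concepts and related technologies, current grid standards and the state of their development, and the relationship between grids and databases. By comparison, it points out that traditional cluster computing and distributed computing such as P2P are not grids. Next, it introduces the ChinaGrid Support Platform project and describes in detail the key CGSP modules related to heterogeneous databases. Finally, it proposes a new and feasible method for heterogeneous database access and integration, defines the relevant concepts, describes in detail the overall architecture, internal structure, and implementation of the module, and presents and discusses experimental performance results.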

【Abstract】 In recent years, grid computing has attracted the attention of the scientific community as an emerging technology and has been called the Next Generation Internet. A grid can integrate large numbers of geographically distributed, heterogeneous resources, including hardware and software resources, into virtual organizations that are connected by high-performance networks over a wide area. Grid computing emphasizes resource sharing and a collaborative working environment. As grid technology develops, more and more grid-based applications need grid middleware that can access large, heterogeneous data resources. This is the background of the thesis. The thesis proposes a new fundamental architecture for heterogeneous database access and integration in the grid, based on the ChinaGrid Support Platform (CGSP) project. It presents the concepts of virtual table, physical table, and temporary intermediate database; designs and implements an interpreter and a dispatcher for SQL query statements; handles queries over large data sets by means of file streams; and extends the data transport component on top of the functionality of the OGSA-DAI core framework. It also provides uniform Web-service-based interfaces that help users share, query, and use resources conveniently. Most importantly, it provides distributed joint queries spanning multiple heterogeneous databases. The thesis is organized as follows. First, it introduces grid concepts and technologies, as well as current grid specifications and their status, and distinguishes grid computing from traditional cluster computing and from distributed computing such as P2P systems. Second, it introduces the ChinaGrid Support Platform project and describes the related CGSP modules in detail. Finally, it proposes a novel and feasible infrastructure for heterogeneous database access and integration, defines the relevant concepts, and describes the fundamental architecture and its implementation. The performance evaluation and experimental results are discussed at the end of the thesis.

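  • 【Online publication contributor】 Lanzhou University
  • 【Online publication year and issue】 2009, Issue 01
  • 【Classification number】 TP311.13
  • 【Citation count】 1
  • 【Download count】 109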

