Research on Biomedical Ontology-Based Methods for Bioinformatics Database Integration

【Author】 隽立然

【Advisor】 王亚东

【Author Information】 Harbin Institute of Technology, Computer Science and Technology, 2009, Master's thesis

【Abstract】 A central task of bioinformatics is to analyze and process the massive data produced by molecular biology experiments, especially by the high-throughput methods that have emerged in recent years. Methods from computer science are widely applied in this field, and molecular biology databases are where the two disciplines meet. As of 2009, more than 1000 bioinformatics databases had been built worldwide. They cover every area of molecular biology and bioinformatics and store complex and varied types of data. Analyzing the content and structure of these databases can therefore depict the current state of bioinformatics and help explore new research directions. A database network that can mine the relationships among databases, and among the research related to them, would be very valuable in this process. At present, databases are only listed under simple classifications by research field, and we have not found any work in the literature that integrates bioinformatics databases and analyzes the relationships among their contents. Biological knowledge is inherently complex and so cannot readily be integrated into existing databases of molecular data. An ontology is a formal way of representing knowledge in which concepts are described both by their meaning and by their relationships to one another. The unique identifiers associated with each concept in biomedical ontologies can be used for linking to and querying molecular databases. In this thesis, we integrate a portion of the existing bioinformatics database resources, analyze the common features of bioinformatics databases and the general process of bioinformatics research, and design a content-based model for integrating bioinformatics databases. We use concepts (terms) to describe the content of each database, and then extract knowledge from biomedical ontologies to link the concepts into a biological concept network. The bioinformatics database network is built on top of this concept network. We further distinguish the types of relations between concepts, such as inheritance, part-whole, and biological relations, which gives the database network biological meaning. Each relation type is assigned a different weight in order to quantify the relationships between databases and measure how closely the databases in the network are related, and database retrieval is based on the resulting relationship score. We implemented a bioinformatics database integration platform, Bio-DB^2, which integrates a portion of the existing bioinformatics database resources into a content-based database network. Bio-DB^2 also provides intuitive relation views that show the relations between concepts and databases and between databases.
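
The weighted concept network described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not give the relation weights, the scoring formula, or any database annotations, so everything below is an assumption made for illustration: the weight values in `RELATION_WEIGHTS`, the choice to score a path by the product of its edge weights, the choice to take the best concept pair as the database score, and the concept and database names (`EnzymeDB`, `DomainDB`, `ExpressionDB`), which are hypothetical.

```python
import heapq
from collections import defaultdict

# Hypothetical weights per relation type; the abstract gives no actual values.
RELATION_WEIGHTS = {"is_a": 0.9, "part_of": 0.7, "regulates": 0.5}

# Hypothetical ontology fragment as (concept, relation, concept) triples.
TRIPLES = [
    ("enzyme", "is_a", "protein"),
    ("protein domain", "part_of", "protein"),
    ("transcription factor", "is_a", "protein"),
    ("transcription factor", "regulates", "gene expression"),
]

# Hypothetical databases, each described by a set of concepts.
ANNOTATIONS = {
    "EnzymeDB": {"enzyme"},
    "DomainDB": {"protein domain"},
    "ExpressionDB": {"gene expression"},
}


def build_concept_graph(triples):
    """Build an undirected concept graph whose edges carry relation weights."""
    graph = defaultdict(dict)
    for source, relation, target in triples:
        weight = RELATION_WEIGHTS[relation]
        # If two concepts are linked by several relations, keep the strongest.
        graph[source][target] = max(weight, graph[source].get(target, 0.0))
        graph[target][source] = max(weight, graph[target].get(source, 0.0))
    return graph


def concept_closeness(graph, start):
    """Return the best path score from `start` to every reachable concept.

    A path's score is the product of its edge weights (an assumption).
    Because all weights are at most 1, a Dijkstra-style search that always
    expands the highest-scoring concept first finds the best score.
    """
    best = {start: 1.0}
    heap = [(-1.0, start)]
    while heap:
        negative_score, node = heapq.heappop(heap)
        score = -negative_score
        if score < best.get(node, 0.0):
            continue  # stale heap entry
        for neighbour, weight in graph.get(node, {}).items():
            candidate = score * weight
            if candidate > best.get(neighbour, 0.0):
                best[neighbour] = candidate
                heapq.heappush(heap, (-candidate, neighbour))
    return best


def database_score(graph, concepts_a, concepts_b):
    """Score two databases by their closest pair of describing concepts."""
    score = 0.0
    for concept in concepts_a:
        closeness = concept_closeness(graph, concept)
        for other in concepts_b:
            score = max(score, closeness.get(other, 0.0))
    return score


def rank_databases(graph, annotations, query_db):
    """Rank all other databases by their relationship score to `query_db`."""
    scores = [
        (name, database_score(graph, annotations[query_db], concepts))
        for name, concepts in annotations.items()
        if name != query_db
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    concept_graph = build_concept_graph(TRIPLES)
    for name, score in rank_databases(concept_graph, ANNOTATIONS, "EnzymeDB"):
        print(f"{name}: {score:.3f}")
```

In this toy setup, a database reached only through a weaker relation type ranks lower than one reached through inheritance and part-whole links, which is the kind of distinction the weighting scheme is meant to capture. Other aggregation choices, such as averaging over concept pairs instead of taking the best pair, would fit the abstract's description equally well.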
