Research on the Algorithms of RNA Sequence Alignment (RNA序列比对算法研究)

【Author】 方曾营 (Fang Zengying)

【Supervisor】 张祖平 (Zhang Zuping)

【Author information】 Central South University, Communication and Information Systems, 2008, Master's thesis

【Abstract】 Bioinformatics is an emerging interdisciplinary field that draws on biology, computer science, mathematics, and related disciplines. RNA sequence alignment is an important topic in bioinformatics, especially alignment that takes secondary and tertiary structure into account. Because RNA sequence data sets are large and the folded structures are very complex, alignment is a computationally expensive process whose results are hard to verify in practice; tertiary structure alignment of RNA in particular is an NP-hard problem. This thesis focuses on speeding up sequence alignment and on solving the tertiary structure alignment problem. Building on an in-depth analysis of existing alignment algorithms and the software that implements them, the thesis uses the tree model of RNA secondary structure to analyze secondary structure alignment algorithms and describes them in detail. To address the difficulties of tertiary structure alignment, it proposes two algorithms, one based on mapping from the secondary structure and one based on transformation to the secondary structure, and analyzes their alignment results. The thesis also analyzes the implementation of the algorithms and the test results of the alignment software. The experiments indicate that the secondary structure alignment algorithm has good running-time behavior, and that the tertiary structure alignment algorithm correctly reflects sequence similarity and is closely related to the secondary structure. Because RNA data sets are massive, the implementation also introduces memory optimization and an optimized dynamic programming traceback, which improve processing efficiency.

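  • 【Online publication contributor】 Central South University
  • 【Online publication issue】 2009, Issue 01
  • 【Classification number】 TP399-C8
  • 【Citation count】 1
  • 【Download count】 192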