Analysis of Gene Expression Data Based on Dynamic Time Warping

【Author】 Pan Tan (潘谈)

【Advisor】 Li Ying (李瑛)

【Author information】 Jilin University, Software Engineering, 2010, Master's thesis

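【Summary】 Inferring the similarity between gene expression data sets is an important way to infer gene function and to answer questions about complex biological processes. In bioinformatics, dynamic time warping was first applied to sequence alignment. Because time-series gene expression profiles exhibit time delays and local similarity, this thesis applies dynamic time warping to similarity inference for time-series gene expression profiles and implements an optimized variant, the multi-segment dynamic time warping algorithm. Experiments show that the algorithm has low time complexity and high alignment accuracy.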

【Abstract】 Science and technology developed rapidly across many fields in the 20th century. Information technology in particular has spread so widely that progress in almost every field now depends on it. The rapid development of biological science and technology has produced a large amount of biological data, and traditional biological experiments alone cannot process so much data quickly and comprehensively, which would restrict the development of the life sciences and related areas. Bioinformatics emerged in response. It draws on computer technology, information technology, statistics, medicine, mathematics and other disciplines to discover regularities in the available data, and thereby to guide and interpret biological experiments and to accelerate the understanding of the essential characteristics of life.
Inferring the similarity between gene expression data sets is an important way to infer gene function and to answer questions about complex biological processes. There are several ways to answer similarity queries over time-series gene expression data, of which the most commonly used is the basic dynamic time warping algorithm. Dynamic time warping is built on dynamic programming, a key technique in many important applications: it has been very successful in speech recognition, it is used in biology and genomics, and it also applies to matrix multiplication and to the shortest path problem in graphs.
In bioinformatics, dynamic time warping was first used for sequence alignment. It is also used to handle many of the problems that arise when comparing time-series gene expression data, including data sparsity, high dimensionality, measurement noise, and local distortions between similar time series.
Because time-series gene expression profiles exhibit time delays and local similarity, this thesis applies dynamic time warping to similarity inference for time-series gene expression data and implements an optimized variant, the multi-segment dynamic time warping algorithm.
The multi-segment method addresses several key challenges. First, a typical time-series matrix in toxicology research contains fewer than 10 measured time points. Second, because samples are measured at non-uniform intervals after treatment, the time points of a query may not coincide with the measurement points of a sequence in the database. Third, queries can differ in their number of measurements and in their length: some consist of a single observation while others contain many time points, and some span only a few hours while others include measurements over several days. Fourth, a query and its best-matching database sequence can differ in amplitude, frequency, or duration. For example, the expression profile of a query treatment may resemble the response of a database sequence except that it is attenuated, delayed, or unfolds more slowly. Such a query can be viewed as a shortened version of the database sequence, and vice versa.
Experiments show that the multi-segment dynamic time warping algorithm produces more accurate alignments and classifications than several alternatives, and that it is robust to relative distortions between similar treatments.
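
The basic dynamic time warping recurrence referred to in the abstract can be sketched as follows. This is a minimal illustration of the classic algorithm only, not the thesis's multi-segment variant, whose details the abstract does not give. The function name `dtw_distance`, the absolute-difference local cost, and the example profiles are assumptions made for illustration.

```python
from math import inf


def dtw_distance(query, reference):
    """Classic O(n*m) dynamic time warping distance between two numeric sequences.

    Assumption: the local cost is the absolute difference between two values.
    """
    n, m = len(query), len(reference)
    # cost[i][j] = minimal cumulative cost of aligning query[:i] with reference[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - reference[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # query point repeated
                                 cost[i][j - 1],      # reference point repeated
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


# Hypothetical expression profiles: the second is a delayed copy of the first.
profile = [0.0, 1.0, 2.0, 1.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
print(dtw_distance(profile, delayed))
```

Because the warping path may repeat points, the two profiles can have different lengths, and a pure time delay such as the one in this example adds no cost to the alignment.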

  • 【Online publication contributor】 Jilin University
  • 【Online publication year and issue】 2011, Issue 05
  • 【Classification code】 Q78
  • 【Downloads】 128