
Mathematical Modeling and Analysis of the F/10 and G/11 Xylanase Families

【Author】 Zhao Jingjing (赵静静)

【Advisor】 Tang Xuqing (唐旭清)

【Author information】 Jiangnan University, Applied Mathematics, 2010, Master's thesis

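【Abstract (translated from the Chinese original)】 Similarity analysis of gene sequences is one of the most active topics in bioinformatics. As the Human Genome Project and related efforts have been completed, large numbers of gene sequences have been sequenced, and similarity analysis of protein sequences has become increasingly complex and laborious. New sequence comparison methods are therefore urgently needed, and graphical representation of gene sequences is an effective way to study sequence similarity. The main work of this thesis is as follows:

1. Building on the chaos game representation (CGR) and the 4-line graphical representation (4-LGR) of DNA sequences, a new representation of DNA sequences, the matrix graphical representation (MGR), is proposed. Graphical representations of protein sequences based on the classical HP model are then constructed from each of these three DNA representations, and the similarity of protein sequences is compared and verified.

2. Under the classical HP model, a new protein sequence comparison method is proposed that combines the matrix graphical representation (MGR) of protein sequences with numerical characterization. The protein sequences of the two xylanase families are analyzed for similarity by inspecting their numerical characterization plots and computing the Euclidean distance d between pairs of protein sequences.

3. Based on the quasi-amino-acid coding method proposed by Shi Xiufan, Zhu Ping and others, two relative usage measures for synonymous codons, RSCU and QRSCU, are computed for the F/10 and G/11 xylanase families. Analysis and comparison show that the quasi-amino-acid coding reveals more clearly the consistent preference for synonymous codons within codon families. That is, under the quasi-amino-acid coding, the F/10 and G/11 xylanase families prefer codons with strong codon-anticodon binding, which are precisely the codons ending in G/C. These results agree with earlier studies of codon preference and further confirm that the quasi-amino-acid coding method is closely related to the findings on codon preference.

4. Using the chaos game representation (CGR) of DNA sequences proposed by Jeffrey in 1990, CGR plots are given for the nucleic acid sequences of the F/10 and G/11 xylanase families. The corresponding two-step Markov transition probabilities are computed, and from them the preferred usage of synonymous codons in the two families. The conclusion is that base usage preference is positively correlated with the G/C content of the sequence and with molecular evolution.

The results indicate that this research is meaningful, has practical value, and should help future work in this area.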
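The RSCU measure named in item 3 can be illustrated with a minimal Python sketch. It assumes the standard definition of relative synonymous codon usage (observed count of a codon divided by the mean count over its synonymous family) and the standard genetic code. The function name `rscu` and the table construction are illustrative choices, not taken from the thesis, and the quasi-amino-acid grouping behind QRSCU is not reproduced here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the usual TCAG ordering (assumption:
# the thesis may use a different table or the quasi-amino-acid grouping).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every amino acid that occurs in `cds`."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, leaving out stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    result = {}
    for members in families.values():
        total = sum(counts[c] for c in members)
        if total == 0:
            continue  # amino acid absent from the sequence
        mean = total / len(members)
        for c in members:
            result[c] = counts[c] / mean
    return result


if __name__ == "__main__":
    # Toy alanine-only sequence, not data from the thesis.
    for codon, value in sorted(rscu("GCCGCCGCGGCT").items()):
        print(codon, value)
```

An RSCU value above 1 means the codon is used more often than expected under uniform usage within its family, which is the sense in which a preference for G/C-ending codons would show up.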

【Abstract】 Similarity analysis of gene sequences is one of the most active topics in bioinformatics. With the completion of the Human Genome Project, a large number of gene sequences have been sequenced, and similarity analysis of protein sequences has become more complex and more laborious. The study of new sequence alignment methods has therefore become an urgent issue, and graphical representation of gene sequences is an effective way to study sequence similarity. The main contents are as follows:

1. Based on the chaos game representation (CGR) and the 4-line graphical representation (4-LGR) of DNA sequences, we propose a novel graphical representation of DNA sequences, the matrix graphical representation (MGR). On the basis of these three DNA sequence models, we then extend each to a graphical representation of protein sequences based on the detailed HP model, and compare the similarity of protein sequences.

2. Based on the detailed HP model, and using the matrix graphical representation (MGR) of proteins together with numerical description, we propose a new method for aligning two protein sequences. By inspecting the numerical description graphs of protein sequences and computing the Euclidean distance d between two sequences, we analyse the similarity of protein sequences from the two xylanase families.

3. Following the work of Shi Xiufan, Zhu Ping et al., the thesis computes two measures of relative synonymous codon usage, RSCU and QRSCU, for the F/10 and G/11 xylanase families. Analysis and comparison show that the classification based on quasi-amino acids reveals the consistent preference for synonymous codons more clearly. That is, under the quasi-amino-acid classification, F/10 and G/11 xylanases prefer codons with strong codon-anticodon binding, which are exactly the codons ending in G/C. This conclusion accords with the codon preference study of 78 human genes, and further verifies that codon preference is closely related to the quasi-amino-acid coding method.

4. Using the CGR of DNA sequences proposed by Jeffrey in 1990, the thesis studies the gene sequences of the F/10 and G/11 families and gives their CGR plots. We further give the corresponding probability matrix of the second-order Markov chain model and compute the relative usage of synonymous codons. The analysis shows that synonymous codon usage preference is closely related to G/C content and molecular evolution.

The results indicate that the research is meaningful and has practical value for future work in this area.
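The chaos game representation used in item 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the thesis's implementation: the corner assignment (A, C, G, T on the corners of the unit square) follows a commonly used convention and is an assumption here, as is the function name `cgr_points`.

```python
# Corner assignment is an assumption (a common convention for CGR plots);
# the thesis may place the four bases differently.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def cgr_points(seq):
    """Return the list of CGR points for a DNA sequence.

    Starting from the centre of the unit square, each base moves the
    current point halfway toward the corner assigned to that base.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


if __name__ == "__main__":
    # Toy sequence, not data from the thesis.
    for point in cgr_points("ATGC"):
        print(point)
```

Because each point lands in the quadrant of the most recent base, and in the sub-quadrant determined by the base before it, counting points per sub-square of a CGR plot recovers short-word frequencies. Such counts are one way to estimate Markov-type transition probabilities between bases.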

  • 【Online publication contributor】 Jiangnan University
  • 【Online publication issue】 2012, Issue 02