Identification and Analyses of Functional Elements in Plant Genomics

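【Author】 Zhou Chan (周婵)

【Advisors】 Hao Bailin (郝柏林); Xu Ying (徐鹰)

【Author information】 Zhejiang University, Bioinformatics, 2009, PhD

【Subtitle】 Identification and analysis of Z-DNA, housekeeping genes, and cell-wall synthesis related proteins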

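【Abstract, translated from the Chinese】 Identifying and characterizing the various functional elements in plant genomes is the first step toward deciphering their functions in plants. It helps us understand plant genomes in greater depth at the molecular and cellular levels, and may thereby help us make better use of plant resources in research on agricultural production, bioenergy, and related areas. In this dissertation, we present studies of three different kinds of functional elements (Z-DNA-forming sequences, housekeeping genes, and cell-wall synthesis related proteins/genes) in the model dicot Arabidopsis and the model monocot rice. Following the introduction in Chapter 1, the work is presented in three chapters.

Chapter 2 presents a comparative analysis of the distribution and function of Z-DNA in the Arabidopsis and rice genomes. Left-handed Z-DNA is an energetically unfavorable DNA structure that forms only under particular physiological conditions in the cell. Z-DNA is currently known to take part in a range of cellular activities, such as transcriptional regulation. We compared the distribution and function of Z-DNA in the two genomes and observed that Z-DNA occurs 9 times as frequently in the rice genome as in the Arabidopsis genome. A similar pattern is found in other monocots and dicots. In addition, Z-DNA is significantly enriched in the coding regions of Arabidopsis genes, but in the high-GC regions of rice. Based on these analyses, we speculate that in Arabidopsis Z-DNA may be involved in regulating the expression of transcription factors, inhibitors, translation repressors, succinate dehydrogenases, and glutathione-disulfide reductases, whereas in rice Z-DNA may affect the expression levels of vesicle genes, nucleosome genes, and genes involved in alcohol transport activity, stem cell maintenance, meristem development, and reproductive structure development.

Chapter 3 describes the genome-wide identification and characterization of housekeeping genes in rice and Arabidopsis. Housekeeping genes are genes that are continuously expressed in different tissues and cells to maintain basic cellular functions. Using microarray data, we identified 1,928 and 1,411 housekeeping genes in rice and Arabidopsis, respectively. The rice housekeeping genes fall mainly into five functional classes: binding factors, enzymes, transporters, transcription factors, and structural molecules. Those in Arabidopsis fall mainly into the following five: enzymes, binding factors, structural molecules, transporters, and transcription regulators. We also observed several interesting features of housekeeping genes: (a) in both genomes, their coding sequences are significantly shorter than the average coding sequence of other genes; (b) in both rice and Arabidopsis, their average gene length is significantly shorter than that of other genes; (c) the average exon length of rice housekeeping genes is significantly shorter than that of other genes, whereas Arabidopsis housekeeping genes have fewer exons than other genes; (d) in both plants, the introns of housekeeping genes are significantly longer than those of other genes. The shorter coding sequences of housekeeping genes may be the result of natural selection for higher transcriptional efficiency. We also found that the identified housekeeping genes have more expressed sequence tags and broader tissue expression profiles than other genes.

Chapter 4 describes the identification of new proteins related to plant cell-wall synthesis using protein-protein interaction data. The plant cell wall is composed mainly of lignin and cellulose and is the richest source of biomass for future biofuel production. Here we report how protein interaction data and known cell-wall synthesis related (CWSR) proteins can be used computationally to predict new CWSR proteins. We predicted 100 new candidate CWSR proteins in Arabidopsis. Seven of them are confirmed by existing CWSR databases, and 46 have other independent evidence and are therefore considered relatively reliable predictions. For 33 of these 46 predicted new CWSR proteins, we predicted specific functional roles in cell-wall synthesis based on protein domain architecture, phylogenetic analysis, and current annotation. Interestingly, although the 571 known CWSR seed proteins used for prediction cover only three of the six main parts of the cell-wall synthesis process, the predicted CWSR proteins also cover two of the other parts. This demonstrates the strength of using protein interactions to predict new CWSR proteins.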

【Abstract】 Identifying and characterizing functional elements is the most fundamental step toward understanding the roles they play in plant genomes. It helps us learn more about plant genomes at the molecular and cellular levels, and may hence help us make better use of plant biomass in applications such as agriculture and biofuel production. In this dissertation, we present studies of three different functional elements (Z-DNA, housekeeping genes, and cell-wall synthesis related (CWSR) proteins) in plants, in particular the model dicot A. thaliana and the model monocot O. sativa (rice), in three chapters following the introductory first chapter.

Chapter two presents a comparative analysis of the distributions and functions of Z-DNA in Arabidopsis and rice. Left-handed Z-DNA is an energetically unfavorable DNA structure that forms mostly under certain physiological conditions and is known to be involved in a number of cellular activities, such as transcription regulation. We have compared the distributions and functions of Z-DNA in the genomes of Arabidopsis and rice, and observed that Z-DNA occurs in rice at least 9 times more often than in Arabidopsis; similar observations hold for other monocots and dicots. In addition, Z-DNA is significantly enriched in the coding regions of Arabidopsis, and in the high-GC-content regions of rice. Based on our analyses, we speculate that Z-DNA may play a role in regulating the expression of transcription factors, inhibitors, translation repressors, succinate dehydrogenases and glutathione-disulfide reductases in Arabidopsis, and that it may affect the expression of vesicle and nucleosome genes and of genes involved in alcohol transporter activity, stem cell maintenance, meristem development and reproductive structure development in rice.

Chapter three covers the genome-wide identification and characterization of housekeeping genes in rice and Arabidopsis. Housekeeping genes are genes that are constitutively expressed across different tissue types to maintain essential cellular functions. Based on microarray data, we have identified 1,928 and 1,411 housekeeping genes in rice and Arabidopsis, respectively. The five dominant functional classes of our predicted housekeeping genes are binding factors, enzymes, transporters, transcription regulators and structural molecules in rice, and enzymes, binding factors, structural molecules, transporters and transcription regulators in Arabidopsis. We have made several interesting observations about the identified housekeeping genes: (a) their average coding sequence lengths and average gene lengths are significantly shorter than those of all other genes; (b) housekeeping genes have shorter average exon lengths in rice, whereas in Arabidopsis they have significantly fewer exons than other genes; (c) their introns are nevertheless significantly longer than the introns of all other genes in both plants. The shorter coding sequences may be a result of natural selection for improved translational efficiency. We have also found that housekeeping genes have more Expressed Sequence Tag evidence and broader expression profiles across different tissues.

Chapter four focuses on the identification of novel proteins involved in plant cell-wall synthesis based on protein-protein interaction data. The plant cell wall is mainly composed of lignins and polysaccharides, and represents the richest source of biomass for future biofuel production. We report a computational framework for the prediction of CWSR proteins, based on known protein-protein interaction data and known CWSR proteins. We predicted 100 new candidate CWSR proteins in Arabidopsis, seven of which are confirmed by public databases to be involved in cell-wall synthesis, and 46 of which have independent supporting evidence and are hence considered reliable predictions. For 33 of the predicted CWSR proteins, we have predicted their specific functional roles in cell-wall synthesis, based on analyses of their domain architectures, phylogenetic analyses and known functional annotation in conjunction with a literature search. Interestingly, although the 571 seed CWSR proteins cover only three of the six main components of the cell-wall biosynthesis process, our predicted CWSR proteins cover two additional components, highlighting the power of our protein-protein interaction based CWSR protein prediction method.
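
The abstract does not spell out how candidates are derived from the interaction data and the seed proteins. As a rough illustration only, the sketch below shows one common way such a prediction can be set up: a simple "guilt by association" count of how many distinct seed proteins each non-seed protein interacts with. The function name `rank_candidates`, the `min_seed_partners` threshold, and the protein identifiers are all hypothetical and are not taken from the dissertation, whose actual framework may differ substantially.

```python
from collections import defaultdict


def rank_candidates(interactions, seeds, min_seed_partners=2):
    """Rank non-seed proteins by the number of distinct seed proteins they bind.

    interactions: iterable of (protein_a, protein_b) pairs (undirected).
    seeds: iterable of known CWSR protein identifiers.
    min_seed_partners: hypothetical cutoff, an assumption of this sketch.
    """
    seeds = set(seeds)
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)

    scores = {}
    for protein, neighbours in partners.items():
        if protein in seeds:
            continue  # seeds are already known, so they are not candidates
        hits = len(neighbours & seeds)
        if hits >= min_seed_partners:
            scores[protein] = hits

    # Highest seed-partner count first, then by name for a stable order
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data with made-up identifiers (S* = seed proteins, P* = other proteins)
interactions = [
    ("S1", "P1"), ("S2", "P1"), ("S3", "P1"),
    ("S1", "P2"), ("S2", "P2"),
    ("S3", "P3"), ("P3", "P4"),
]
seeds = ["S1", "S2", "S3"]
ranked = rank_candidates(interactions, seeds)
print(ranked)
```

In this toy network only P1 and P2 reach the cutoff. A real analysis would also need to account for interaction confidence and for highly connected hub proteins, which a raw count tends to favor.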

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year/issue】 2011, Issue 10