Research on the Mechanism of Functional Divergence between Yeast Duplicate Genes (酵母倍增基因功能分化机制研究)

【Author】 Zou Yangyun (邹央云)

【Supervisor】 Gu Xun (谷迅)

【Author information】 Fudan University, Biochemistry and Molecular Biology, 2011, PhD

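【Abstract (translated from the Chinese)】 Gene duplication followed by functional divergence is one of the main driving forces of genome evolution and one of the causes of organismal complexity, of genes with new functions, and of the emergence of new species. Studying the functional divergence of duplicate genes is a major goal of functional genomics and is essential for understanding the origin of new genes and the evolution of organisms. Starting from extant duplicate genes, this study uses genome-wide functional genomics data (mainly from yeast), phylogenetic analysis, and related statistical methods to describe how these duplicate genes diverged and diversified in function step by step and were eventually fixed and retained. It does so from four perspectives: cis-regulatory elements (the TATA box), trans-regulatory factors (trans-acting eQTLs), epigenetic modification (histone modification), and functional compensation. The main results are as follows: 1) We observed that in the human, worm, Arabidopsis, and yeast genomes, the TATA box (a cis-regulatory element associated with environmental response) is significantly enriched in duplicate genes. To further study how the TATA box evolves after gene duplication, we reconstructed the ancestral TATA box state for more than 700 yeast gene families. We found that the promoters of the ancestral genes of most families lacked a TATA box and that, after duplication, TATA box gain events in duplicate gene promoters far outnumbered loss events, with an overall gain-to-loss ratio of about 3-4 fold. In addition, duplicate genes that later gained a TATA box are clearly enriched in functional categories related to environmental response (other TATA-containing duplicate genes are generally related to metabolic activity) and undergo faster expression divergence under environmental stress (asymmetric evolution). We therefore propose that in yeast, gain of a TATA box in duplicate gene promoters after duplication accelerated expression divergence of the duplicates under changing environments and promoted adaptive evolution of the organism, thereby helping the duplicates to be retained in the genome. 2) Analysis of yeast genome-wide eQTL data showed that duplicate genes have higher expression heritability but do not differ significantly from single-copy genes in epistasis or directional effect; that after duplication, the trans-acting eQTLs of a duplicate pair diverge progressively with evolutionary time; that divergence of trans-acting eQTLs explains about 21% of the expression variation of duplicate genes, rising to 27% when divergence of transcription factor binding interactions is also considered; and that trans-acting eQTL divergence between duplicate pairs is closely related to the evolution of the Gene Ontology categories "biological process" and "cellular component" but shows no association with "molecular function". We therefore consider that eQTL analysis offers a new approach to studying the effect of gene duplication on the evolution of genetic regulatory networks. 3) Analysis of genome-wide histone modification data in yeast showed that duplicate gene pairs have more similar histone modification profiles than random pairs of single-copy genes, in both promoter and coding regions, and that divergence of histone modification profiles between duplicates is strongly correlated with the evolution of their coding sequences, trans-regulatory factors, and cis-regulatory factors. Further analysis found that duplicate genes likely to be targeted by trans-regulatory factors have more rapidly diverging histone modification profiles. We speculate that the histone modification profile of a gene is also copied upon duplication and subsequently diverges as genetic factors of the duplicates, such as sequence and regulatory factors, evolve. 4) Regarding functional compensation between duplicate genes, we propose two hypotheses. The first is the environment-dependent loss of functional compensation hypothesis: because of natural selection by the environment, retained duplicate genes were, at some period of evolutionary history or in certain environments, uniquely essential and irreplaceable for the survival and reproduction of the organism; in those environments, functional compensation between duplicates is lost. The second is the gene network-protein function hypothesis: to avoid pseudogenization, duplicate genes must undergo functional divergence (except for genes under positive selection). When the regulatory network of the duplicates diverges first, protein function can to some extent escape divergence, so the duplicates retain relatively high functional compensation; conversely, if protein function diverges first, the regulatory network evolves relatively slowly, but functional compensation is greatly weakened. Our data support the plausibility of both hypotheses to some extent and offer a new perspective on genetic buffering.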

【Abstract】 Gene duplication with subsequent functional divergence is generally thought to be a major driving force of genome evolution and one of the reasons for the origin of organismal complexity, the emergence of genes with new functions, and the generation of new species. The study of functional divergence between duplicate genes is a primary aim of functional genomics and is important for understanding the origin of new genes and the evolution of organisms. In this study, using genome-wide functional genomic data (mainly from yeast), we conducted extensive phylogenetic analysis and related statistical processing to investigate how the functions of duplicate genes diverged and how the duplicates were finally fixed and retained in the genome. Our research focuses on the evolution, after gene duplication, of three kinds of transcriptional regulatory factors: cis-regulatory elements (the TATA box), trans-regulatory elements (trans-acting eQTLs), and epigenetic modification (histone modification). It also addresses the mechanism of functional compensation between duplicate genes. The main findings are as follows:
1) We observed that the TATA box (a stress-response-related cis-regulatory element) is significantly overrepresented in duplicate genes compared with singletons in the human, worm, Arabidopsis, and yeast genomes. To further study the evolution of the TATA box after gene duplication, we reconstructed the ancestral TATA box status of over 700 yeast gene family phylogenies. We found that the ancestors of most yeast gene families lacked a TATA box and that significantly more TATA box gain events than loss events have occurred since duplication; the overall gain-loss ratio is about 3-4 to 1. Interestingly, these TATA-gain duplicate genes are clearly enriched in stress-associated functional categories (other TATA-containing duplicate genes are usually involved in metabolism-related processes) and on average have experienced greater expression divergence under environmental stress conditions (asymmetric evolution). We thus conclude that, after gene duplication, gain of the TATA box in duplicate promoters may have played an important role in the preservation of yeast duplicates by accelerating expression divergence, which may facilitate adaptive evolution of the organism in response to environmental changes.
2) From analysis of yeast genome-wide eQTL data, we found that duplicate genes have higher heritability of gene expression than single-copy genes but differ little in epistasis and directional effect. The divergence of trans-acting eQTLs between duplicate pairs increases with the evolutionary time since duplication. Trans-acting eQTL divergence can explain about 21% of the variation in expression divergence between duplicate genes, which increases to 27% when TF-target interaction divergence is also included. Trans-acting eQTL divergence between duplicate pairs is correlated with the Gene Ontology (GO) categories "biological process" and "cellular component" but not with "molecular function". We consider that eQTL analysis provides a novel perspective and approach for exploring the effect of gene duplication on the genetic regulatory network.
3) Analyzing yeast genome-wide histone modification profile data, we found that duplicate gene pairs share more similar histone modification patterns than singleton pairs, in both promoter and coding regions (ORFs). Moreover, both promoter and ORF histone modification divergence between duplicate genes are coupled with the evolution of the coding sequences, trans-regulators, and cis-regulators of the duplicates. Further analysis revealed that trans-regulator-targeted duplicate genes have experienced more rapid histone modification divergence. We speculate that during gene duplication the histone modification profile of a gene is also duplicated; afterwards, the histone modification profiles of the duplicates diverge along with the evolution of other genetic characters, such as the sequences or regulatory factors of the duplicate genes.
4) We propose two hypotheses on functional compensation between duplicate genes. The first is the environment-specific loss of functional compensation model: because of selection by the natural environment, extant duplicate genes were required and could not be compensated for in the survival and reproduction of the organism during certain periods or in specific environments of evolutionary history. Functional compensation between duplicates was lost in these environments. The second is the gene network-protein function hypothesis: to avoid pseudogenization, duplicate genes must undergo functional divergence (except for genes under positive selection). If the regulatory network of the duplicates evolves first, protein function has a chance to remain undiverged, leading to a high functional compensation effect for these duplicates; otherwise, if the protein function of the duplicates diverges first, the regulatory network can evolve slowly, but functional compensation between the duplicates is largely weakened. We present evidence that supports these two hypotheses. Our research provides new insight into genetic robustness against null mutations.
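The ancestral-state reconstruction and gain/loss counting described in finding 1 can be illustrated with a small sketch. The abstract does not say which reconstruction method the thesis used, so Fitch parsimony on a binary tree is an assumption made here for illustration only. The tree, the gene names, the leaf states, and the function names (`fitch_sets`, `count_events`, `gain_loss`) are all hypothetical and are not thesis data.

```python
# Toy Fitch-parsimony sketch for a binary character (1 = TATA box present,
# 0 = absent). Assumption: strictly binary trees written as nested 2-tuples.
# All names and data below are hypothetical, not taken from the thesis.

def fitch_sets(node, leaf_state):
    """Bottom-up pass: return (candidate_state_set, children) for each node."""
    if isinstance(node, str):  # a leaf is a gene name
        return ({leaf_state[node]}, [])
    left, right = (fitch_sets(child, leaf_state) for child in node)
    common = left[0] & right[0]
    # Fitch rule: intersection if non-empty, otherwise union.
    states = common if common else left[0] | right[0]
    return (states, [left, right])


def count_events(annotated, parent_state=None, root_default=0):
    """Top-down pass: pick one state per node, count 0->1 gains and 1->0 losses."""
    states, children = annotated
    if parent_state is not None and parent_state in states:
        state = parent_state            # keep the parent's state when allowed
    elif len(states) == 1:
        state = next(iter(states))      # unambiguous node
    else:
        state = root_default            # ambiguous root: assumed default state
    gains = int(parent_state == 0 and state == 1)
    losses = int(parent_state == 1 and state == 0)
    for child in children:
        g, l = count_events(child, state, root_default)
        gains += g
        losses += l
    return gains, losses


def gain_loss(tree, leaf_state, root_default=0):
    """Return (gains, losses) for one gene-family tree."""
    return count_events(fitch_sets(tree, leaf_state), None, root_default)


# Hypothetical four-gene family in which only geneA carries a TATA box.
toy_tree = ((("geneA", "geneB"), "geneC"), "geneD")
toy_tata = {"geneA": 1, "geneB": 0, "geneC": 0, "geneD": 0}
gains, losses = gain_loss(toy_tree, toy_tata)
print(f"gains={gains}, losses={losses}")  # prints "gains=1, losses=0"
```

Summing gains and losses over many family trees would give an overall gain-loss ratio of the kind reported above. Note that `root_default` decides how an ambiguous root is resolved, so the count for such families depends on that assumption.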

  • 【Online publication contributor】 Fudan University
  • 【Online publication year and issue】 2011, Issue 12