酿酒酵母组蛋白修饰的组合模式分析
Analysis of the Combinatorial Patterns of Histone Modifications in Saccharomyces cerevisiae
【Author】 Cui Xiangjun (崔向军);
【Advisor】 Li Hong (李宏);
【Author information】 Inner Mongolia University, Theoretical Physics, 2012, PhD
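【Abstract (translated from the Chinese)】 The completion of the Human Genome Project yielded a vast amount of DNA sequence data, on which a great deal of research into gene expression has been based. Over the long course of this research it has gradually become clear that the regulation of genomic DNA expression is highly complex. Studies show that, besides the DNA sequence itself, other heritable biological factors affect gene expression; such effects are called epigenetic regulation, and the field of epigenetics arose in response. Epigenetic regulatory information falls mainly into several areas: histone modification, nucleosome positioning, chromatin remodeling, DNA methylation and regulation by non-coding RNA. These factors can interact with one another and jointly determine complex life processes. Histone modification is an important topic within epigenetic regulation. Histone modifications are covalent modifications that occur at the N-termini of nucleosomal histones under the action of the relevant modifying enzymes; they include methylation, acetylation and phosphorylation. Different histone modifications play different roles in gene expression, and several models, including the histone code, the signalling network and charge neutralization, have been proposed to explain their function. In 2000, Strahl and Allis proposed the "histone code" hypothesis. It holds that a single histone modification usually cannot act independently: different covalent modifications on the tails of one or more histones act sequentially or in combination to form a cascade of modifications, and they jointly affect gene expression through cooperation or antagonism. Since the histone code was proposed, combinatorial patterns of histone modifications have become a focus of research on the regulation of gene expression. Based on the histone code idea, this thesis studies the combinatorial patterns of histone modifications in Saccharomyces cerevisiae, proposes an action relationship model of histone modifications, and discusses the mechanism of the causal and combinatorial relationships among modifications at the molecular biology level from the perspective of the relevant modifying enzymes. The main research content is as follows:
1. After obtaining histone modification data for Saccharomyces cerevisiae and determining the transcription start sites (TSS), the distribution patterns of histone modifications in promoter regions and coding regions were studied, both to understand how the levels of the 12 histone modifications change around the TSS and to check the accuracy of the filtered modification data. The distribution patterns of some classical modifications (for example H3K4me3, H3K4me2 and H3K4me1) show that the filtered data are accurate. The levels of all histone modifications change markedly around the TSS: for almost all of them the level is lowest at the TSS itself and rises gradually in the regions on either side. Near the end of the coding region, the modification level gradually falls again.
2. To analyze the constructed histone modification network, we first studied the correlations among the 12 histone modifications in the TSS+1kb region. We also carried out a Pearson correlation analysis of the modifications in the more widely studied TSS-1kb region. The results show that, in both regions, the 12 histone modifications divide into two strongly correlated groups according to their correlation coefficients. One is the group of transcription-enhancing modifications (group A); the other is the group of modifications that repress transcription or are not associated with it (group B). Group A contains 7 modifications (H2AK7Ac, H3K14Ac, H3K9Ac, H4K5Ac, H3K18Ac, H4K12Ac and H3K4Me3); group B contains 5 modifications (H3K4Me1, H2BK16Ac, H3K4Me2, H4K16Ac and H4K8Ac). Clustering the modifications on the basis of transcript levels gives results consistent with the Pearson correlation analysis.
3. Some histone-modifying enzymes can bind to other modifications through domains of their own protein structure. A causal association is therefore established between the modification catalyzed by such an enzyme and the modification at its binding site, and on the basis of this causal relationship the two modifications form a modification combination. Following this reasoning, the thesis uses Bayesian networks to study the combinatorial patterns of histone modifications. A Bayesian network can show the causal relationships among the multiple modifications within a combinatorial pattern. Twenty-three combinatorial patterns were determined for the 12 histone modifications studied. In general, the modifications within one combinatorial pattern are strongly associated and belong to the same correlation group. In addition, some of the modification combinations were analyzed at the level of the molecular structure of the histone-modifying enzymes.
4. To understand how the causal relationships among histone modifications differ at different transcript levels, a high-transcription Bayesian network and a low-transcription Bayesian network were constructed by gradually removing lowly transcribed genes and highly transcribed genes. Comparison and analysis of the two networks show that the modification relationships in the Bayesian network for highly transcribed genes are more stable and change less, indicating that more modification combinations play biological roles when gene transcript levels are high. Four modification combinations (H2BK16Ac→H3K4Me3, H3K14Ac→H3K4Me3, H4K12Ac→H3K18Ac and H2AK7Ac→H3K14Ac) are active throughout gene transcription; their functions do not depend on the transcript level of the gene, and these combinations are necessary for gene transcription. Furthermore, an action relationship model of histone modifications is proposed on the basis of the stable pathways (modification combination relationships) of the two networks. In this model, the "on" and "off" states of the terminal modifications of the two networks are mutually exclusive. Because the modifications are causally related, the causal relationships among modifications in the upper layers of a network influence the relationships among modifications in the lower layers, and this influence in turn acts on the relationship between the terminal modifications and Pol II. The thesis further analyzes how the causal relationships among upper-layer modifications affect the lower-layer modifications.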
【Abstract】 The completion of the Human Genome Project produced a large amount of DNA sequence data, and researchers have since carried out a great deal of work on gene expression based on these data. Over the course of this long research effort, it has gradually become clear that the regulation of genomic DNA expression is very complex. Research shows that, besides the DNA sequence itself, other heritable biological factors affect gene expression. Effects of this kind are called epigenetic regulation, and the field of epigenetics emerged in response. Epigenetic regulatory information falls mainly into several areas: histone modification, nucleosome positioning, chromatin remodeling, DNA methylation and non-coding RNA regulation. These factors can interact with one another and jointly determine complex life processes. Histone modifications are an important research topic in epigenetic regulation. They are covalent modifications that occur at the N-termini of nucleosomal histones under the action of the relevant modifying enzymes; common ones include methylation, acetylation and phosphorylation. Different histone modifications play different roles in gene expression. Various models, including the histone code, the signalling network and the charge neutralization model, have been proposed to account for the function of histone modifications. Strahl and Allis put forward the "histone code" hypothesis in 2000. The hypothesis holds that a single histone modification usually cannot act independently: different covalent modifications on the tails of one or more histones act sequentially or in combination to form a cascade of modifications, and together they affect gene expression, cooperatively or antagonistically.
After the histone code was put forward, combinatorial patterns of histone modifications became a focal point of research on the regulation of gene expression. Based on the histone code idea, this thesis studies the combinatorial patterns of histone modifications in Saccharomyces cerevisiae. In addition, an action relationship model of histone modifications is proposed, and the mechanisms underlying the causal and combinatorial relationships are discussed at the molecular biology level from the perspective of the modifying enzymes. The main contributions are summarized as follows:
1. Based on the histone modification data obtained for Saccharomyces cerevisiae and the confirmed transcription start sites (TSS), the distribution patterns of the modifications in promoter regions and coding regions were studied, both to see how the levels of the 12 histone modifications change around the TSS and to verify the filtered modification data. The distribution patterns of some classical modifications (for example H3K4me3, H3K4me2 and H3K4me1) show that the filtered data are accurate. The levels of all histone modifications change markedly around the TSS. For almost all of them, the level is lowest at the TSS and rises gradually in the regions flanking it. The modification levels then decrease gradually near the end of the coding regions.
2. In order to analyze the constructed network of histone modifications, we first studied the correlations among the 12 histone modifications in the TSS+1kb region. A Pearson correlation analysis of the modifications was also performed for the more widely studied TSS-1kb region. The results show that, in both regions, the 12 histone modifications divide into two strongly correlated groups on the basis of their correlation coefficients. One group consists of transcription-enhancing modifications (group A), and the other of modifications that repress transcription or are not associated with it (group B).
Group A includes seven modifications (H2AK7Ac, H3K14Ac, H3K9Ac, H4K5Ac, H3K18Ac, H4K12Ac and H3K4Me3) and group B includes five modifications (H3K4Me1, H2BK16Ac, H3K4Me2, H4K16Ac and H4K8Ac). The clustering of the modifications based on transcript levels is consistent with the Pearson correlation analysis.
3. Some histone-modifying enzymes can bind to other modifications through their own protein domains. A causal relationship is therefore established between the modification catalyzed by such an enzyme and the modification that serves as its binding site, and on this basis the two modifications form a modification combination. Following this reasoning, the combinatorial patterns of histone modifications were studied using Bayesian networks, which can indicate the causal relationships among the multiple modifications within a combinatorial pattern. Twenty-three combinatorial patterns were determined for the 12 histone modifications in this study. In general, the modifications within a combinatorial pattern are strongly correlated and belong to the same correlation group. In addition, some of the modification combinations were analyzed at the level of the molecular structure of the modifying enzymes.
4. To understand how the causal relationships among histone modifications differ at different transcript levels, a Bayesian network for genes with high transcript levels (H-network) and a Bayesian network for genes with low transcript levels (L-network) were constructed by gradually removing lowly transcribed and highly transcribed genes. The modification relationships within the H-network were found to be more stable and less variable, which suggests that more modification combinations play biological roles when gene transcript levels are high.
Four modification combinations (H2BK16Ac→H3K4Me3, H3K14Ac→H3K4Me3, H4K12Ac→H3K18Ac and H2AK7Ac→H3K14Ac) are active throughout gene transcription. Their functions do not depend on transcript levels, and these combinations are necessary for gene transcription. Moreover, an action relationship model of histone modifications was proposed on the basis of the stable pathways (modification combination relationships) within the two networks. In this model, the "on" and "off" states of the terminal modifications in the two networks are mutually exclusive. Because the modifications are causally related, the relationships among modifications in the upper layers of a network can affect the relationships among downstream modifications, and this influence in turn acts on the relationship between the terminal modifications and Pol II. How the causal relationships among upper-layer modifications affect downstream modifications is analyzed further in the thesis.
【Key words】 Histone modification; Combinatorial pattern; Gene transcript level; Bayesian network; Causal relationship;
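The Pearson correlation step described in point 2 of the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic toy values rather than the thesis data, only five of the twelve modification names are used, and the grouping rule (`split_by_anchor` with a fixed threshold of 0.5) is a hypothetical stand-in, since the abstract does not say how the two groups were delimited.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def synthetic_levels(n_genes=500, rng_seed=0):
    """Toy data only (not real measurements): two hidden signals plus noise."""
    rng = random.Random(rng_seed)
    signal_a = [rng.gauss(0, 1) for _ in range(n_genes)]
    signal_b = [rng.gauss(0, 1) for _ in range(n_genes)]
    # Assumed layout: three modifications follow signal_a, two follow signal_b.
    layout = {
        "H3K9Ac": signal_a, "H3K14Ac": signal_a, "H3K4Me3": signal_a,
        "H3K4Me1": signal_b, "H4K16Ac": signal_b,
    }
    return {
        name: [s + 0.3 * rng.gauss(0, 1) for s in signal]
        for name, signal in layout.items()
    }


def split_by_anchor(levels, anchor, threshold=0.5):
    """Hypothetical rule: a modification joins the anchor's group if r >= threshold."""
    with_anchor, others = [], []
    for name, values in levels.items():
        r = pearson(levels[anchor], values)
        (with_anchor if r >= threshold else others).append(name)
    return with_anchor, others


if __name__ == "__main__":
    levels = synthetic_levels()
    group_a, group_b = split_by_anchor(levels, "H3K4Me3")
    print("correlated with H3K4Me3:", group_a)
    print("remaining modifications:", group_b)
```

With real data, each list would hold the measured level of one modification across genes in a chosen region, and a clustering method would normally replace the fixed threshold.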