豆科和禾本科植物热激转录因子基因家族的分子进化研究

Molecular Evolution of Heat Shock Transcription Factor Gene Families in Legumes and Grasses

【Author】 Lin Yongxiang (林勇翔)

【Advisor】 Cheng Beijiu (程备久)

【Author information】 Anhui Agricultural University, Crop Genetics and Breeding, 2013, PhD

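【Abstract (translated from Chinese)】 As the global greenhouse effect intensifies, high temperature has become an important cause of reduced agricultural yield and quality. Heat shock transcription factors (Hsfs), the terminal components of the signal transduction pathway, are the central regulators of the expression of heat shock proteins and other heat shock-induced genes, and play an important role in improving plant thermotolerance. Although the functions of some Hsf genes have been studied in a few model plants such as Arabidopsis and tomato, with important results, the genomic structure and evolutionary pattern of the Hsf gene family as a whole remain unclear. The growing number of sequenced species now provides the basic data needed to analyze this question. On this basis, this study identified, on a genome-wide scale, the Hsf genes of the sequenced legumes Lotus japonicus, Medicago truncatula, chickpea (Cicer arietinum), soybean (Glycine max), pigeonpea (Cajanus cajan) and common bean (Phaseolus vulgaris), and of the grasses Brachypodium distachyon, rice, sorghum and maize. The origin and evolution of the legume and grass Hsf gene families, and the duplication and loss of their genes, were examined in terms of intron/exon distribution, domains and motifs, phylogeny, intraspecies and interspecies microsynteny, changes in gene copy number, environmental selection pressure and expression patterns. The main results are as follows:

1. Using published genome and transcriptome databases, 11, 19 and 13 Hsfs were identified in the cool-season legumes Lotus japonicus, Medicago truncatula and chickpea, respectively, and 46, 22 and 29 Hsfs in the warm-season legumes soybean, pigeonpea and common bean, respectively. Domain and motif analysis showed that legume Hsf proteins have five conserved domains or motifs: the DBD, HR-A/B, NLS and NES domains and the AHA motif. The N-terminal DBD is the most conserved and is highly structured, consisting of a three-helix bundle and a four-stranded antiparallel β-sheet. The AHA motif is highly conserved at the C-terminus of class A Hsfs.

2. Analysis of exon and intron distribution in legume Hsf genes showed that 140 of the 159 introns are phase 0, giving rise to a large number of symmetrical exons. A highly conserved intron insertion site, always in phase 0, was found within the DBD. This suggests that exon shuffling and intron deletion may have played a part in the evolution of legume Hsf gene structure.

3. Phylogenetic analysis showed that the 140 Hsf genes from the six legume genomes fall into 18 shared clades, each representing an ancestral gene lineage. The most recent common ancestor of these legumes is therefore inferred to have had at least 18 Hsf genes.

4. Intraspecies microsynteny analysis of genomic segments, together with the age distribution of gene duplications, showed that the legume Hsf gene family expanded mainly through whole genome duplication rather than tandem duplication. The soybean Hsf genes evolved through the whole genome duplication in the early legume ancestor and the more recent soybean lineage-specific polyploidy event. Of the 46 soybean chromosome segments containing Hsf genes, 42 form paralogous groups of two, three or four segments. Only a few paralogous Hsf segments were found in the Lotus japonicus, Medicago truncatula and pigeonpea genomes.

5. Interspecies microsynteny analysis among the legumes showed that Hsf-containing segments in the Lotus japonicus, Medicago truncatula and pigeonpea genomes are extensively colinear with the duplicated Hsf-containing segments of soybean, forming 17 groups of orthologous segments. These results indicate that the Hsf-containing segments of Lotus japonicus, Medicago truncatula and pigeonpea and the corresponding soybean segments all derive from the ancient whole genome duplication in their common ancestor, but that more than half of the Hsf gene copies have been lost in Lotus japonicus, Medicago truncatula and pigeonpea. After two rounds of whole genome duplication, by contrast, the soybean Hsf family retained 75% of the anciently duplicated genes and 85% of the recently duplicated genes. Selection pressure analysis showed that sustained purifying selection has played a key role in maintaining the number of soybean Hsf genes, and that the duplicated daughter genes are under strong evolutionary constraint that keeps their function stable.

6. Further analysis of grass genomes showed that Brachypodium distachyon, rice, sorghum and maize contain 24, 25, 23 and 25 Hsf genes, respectively. Intraspecies microsynteny analysis and duplication dating showed that more than 60% of the Hsf genomic segments in these four species were produced by whole genome duplication, and that the maize Hsf family experienced both the whole genome duplication in the ancient grass ancestor and the recent maize lineage-specific polyploidy event.

7. Interspecies microsynteny among the grasses is likewise extensive: all 94 Hsf genomic segments of Brachypodium distachyon, rice, sorghum and maize form 17 groups of orthologous segments. The loss rates of Hsf copies produced by the ancient grass whole genome duplication are estimated at about 32%, 29%, 35% and 44% in Brachypodium distachyon, rice, sorghum and maize, respectively, and about 34% of the copies produced by the recent whole genome duplication in maize have been lost. Maize has thus lost relatively more of the anciently duplicated Hsf genes, and its recently duplicated genes are being lost at a faster rate. Selection pressure analysis showed that purifying selection has remained dominant during the evolution of grass Hsf genes, but that parts of the coding regions of individual genes are under strong positive selection, which may have promoted functional divergence.

8. Expression analysis of Lotus japonicus and maize Hsf genes showed differential expression among tissues and after different stress treatments. The Lotus japonicus genes LjHsf-01, LjHsf-02, LjHsf-04, LjHsf-09 and LjHsf-10, and the maize genes ZmHsf-01, ZmHsf-03, ZmHsf-04, ZmHsf-23, ZmHsf-24 and ZmHsf-25, are strongly induced by heat shock. For the maize subclass A2 genes ZmHsf-01 and ZmHsf-04, both microsynteny and selection pressure analyses indicate high conservation and functional stability during evolution, so these genes may be important for heat stress resistance in grasses. We therefore cloned the full-length ZmHsf-01 and ZmHsf-04 genes from the maize inbred line B73, laying a foundation for further functional studies.

In summary, this comparative genomics study shows that the evolution of the legume and grass Hsf gene families is coupled to whole genome duplication events, that Hsf genomic segments are extensively microsyntenic among species, and that differential gene loss among plant lineages is an important cause of the divergence of the Hsf gene family among species. These results provide an important basis for understanding the molecular evolution of the Hsf gene family at the genome-wide level.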
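Result 2 turns on intron phase and exon symmetry. As a minimal sketch, and not the pipeline used in the thesis, the phase of an intron can be derived from the number of coding bases that precede it: phase 0 falls between two codons, while phases 1 and 2 split a codon after its first or second base. An internal exon is symmetrical when the introns on either side of it share the same phase. The function names and the exon lengths below are invented for illustration.

```python
from itertools import accumulate


def intron_phases(cds_exon_lengths):
    """Return the phase of each intron, given coding-exon lengths in transcript order.

    Phase = number of coding bases upstream of the intron, modulo 3.
    """
    # An intron follows every coding exon except the last one.
    upstream_totals = accumulate(cds_exon_lengths[:-1])
    return [total % 3 for total in upstream_totals]


def symmetric_internal_exons(cds_exon_lengths):
    """Return 0-based indices of internal exons flanked by introns of equal phase."""
    phases = intron_phases(cds_exon_lengths)
    # Internal exon i lies between intron i-1 (upstream) and intron i (downstream).
    return [i for i in range(1, len(phases)) if phases[i - 1] == phases[i]]


# Hypothetical gene with four coding exons (lengths in bp, made up for this sketch).
example_lengths = [300, 150, 91, 200]
phases = intron_phases(example_lengths)
symmetric = symmetric_internal_exons(example_lengths)
print("intron phases:", phases)
print("symmetrical internal exons (0-based):", symmetric)
```

An exon flanked by two phase 0 introns can be inserted, duplicated or removed without shifting the reading frame, which is why an excess of phase 0 introns is usually read as compatible with exon shuffling.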

【Abstract】 As the global greenhouse effect intensifies, high temperature has become a major factor reducing both agricultural yield and quality. Heat shock transcription factors (Hsfs) serve as the terminal components of signal transduction and are the central regulators of the expression of heat shock proteins and other heat shock-induced genes, and they play important roles in improving the thermotolerance of plants. Progress to date has concentrated mostly on the function of Hsfs in Arabidopsis and tomato; the genome structures and evolutionary patterns of entire Hsf gene families in plants are not clearly understood. As more and more genomes are sequenced, there is an opportunity to shed light on this question. In this study we therefore analyzed the Hsf gene families of six legume species for which substantial genome or transcriptome information was available, namely Lotus japonicus, Medicago truncatula, Cicer arietinum, Glycine max, Cajanus cajan and Phaseolus vulgaris. The Hsf gene families in the four grass genomes of Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays were also analyzed comprehensively. The origin, evolution, duplication and loss of Hsf gene families in legumes and grasses were studied through intron/exon distribution patterns, protein domains and motifs, phylogenetic relationships, intraspecies and interspecies gene colinearity (microsynteny), gene copy number changes, environmental selection pressure and expression patterns of Hsf genes. The results were as follows:

1. By searching published genome and transcriptome databases, a total of 11, 19 and 13 Hsfs were identified in the cool-season legumes Lotus japonicus, Medicago truncatula and Cicer arietinum, respectively, while 46, 22 and 29 Hsfs were identified in the tropical-season legumes Glycine max, Cajanus cajan and Phaseolus vulgaris, respectively. Five conserved domains or motifs were observed in most of the legume Hsf proteins, namely the DBD, HR-A/B, NLS and NES domains and the AHA motifs. The highly structured N-terminal DBD domain of each Hsf was the most conserved; it consisted of a three-helical bundle and a four-stranded antiparallel β-sheet. The AHA motifs in the C-terminus of the Class A Hsfs were highly conserved.

2. Analysis of legume Hsf gene structure in terms of intron/exon distribution patterns revealed that 140 of the 159 introns were phase 0, and accordingly there was an excess of symmetrical exons. In addition, a highly conserved intron insertion site, always in phase 0, was found in the DBD domain. These results suggest that exon shuffling and intron elimination may have contributed to the evolution of legume Hsf genes.

3. Phylogenetic analysis showed that the 140 Hsf genes from the six legume species could be delineated into 18 well-supported clades, each representing an ancient gene lineage. There were therefore at least 18 Hsf genes in the most recent common ancestor of these legumes.

4. By searching for intraspecies microsynteny between the genome segments of legumes and dating the age distributions of duplicated genes, we found that the legume Hsf gene families expanded mainly through whole genome duplication rather than tandem duplication. The Hsf genes of Glycine max derived from the early-legume genome duplication and the recent Glycine-lineage-specific polyploidy event. Moreover, 42 of the 46 chromosome regions hosting Hsf genes in Glycine max fell into pairs, triples or quadruples and formed paralogous groups of segments, while only a few paralogous segments were identified in the genomes of Lotus japonicus, Medicago truncatula and Cajanus cajan.

5. By comparing interspecies microsynteny between the genome segments of legumes, we determined that the great majority of Hsf-containing segments in Lotus japonicus, Medicago truncatula and Cajanus cajan show extensive conservation with the duplicated regions of Glycine max. These segments formed 17 groups of orthologous segments. These results suggest that these regions share the ancient genome duplication with the Hsf-containing regions of Glycine max, but that more than half of the copies of these genes were lost. The Glycine max Hsf gene family, on the other hand, retained approximately 75% and 85% of the duplicated genes produced by the ancient genome duplication and the recent Glycine-specific genome duplication, respectively. Selection pressure analysis indicated that continuous purifying selection has played a key role in the maintenance of Hsf genes in Glycine max, and that the duplicated genes were subject to strong evolutionary constraints that preserve the stability of their functions.

6. Further analysis of grass genomes identified 24, 25, 23 and 25 Hsf genes in Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays, respectively. By searching for intraspecies gene colinearity and dating the age distributions of duplicated genes, we found that more than 60% of the Hsf-containing segments in the Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays genomes show microsynteny and resulted from whole genome duplication. The Hsf gene family of Zea mays originated through the ancient whole genome duplication event that occurred in the ancestor of grasses and the recent polyploidy event in the ancestor of Zea mays.

7. By comparing interspecies gene colinearity between grasses, extensive microsynteny was also detected between Hsf-containing segments across the Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays genomes, and all 94 segments formed 17 groups of orthologous segments. Approximately 32%, 29%, 35% and 44% of the duplicated Hsf genes produced by the ancient genome duplication in the grass ancestor were thus lost in Brachypodium distachyon, Oryza sativa, Sorghum bicolor and Zea mays, respectively. In addition, approximately 34% of the Zea mays Hsf genes obtained from the recent genome duplication in the ancestor of Zea mays were lost over just the past 13 million years. These results suggest that a relatively large number of ancient copies of Hsf genes have been removed from the Zea mays genome, and that recent copies are being lost at a faster rate. Selection pressure analysis indicated that purifying selection still played a leading role throughout the evolution of Hsf gene families in grasses, while strong signatures of positive selection were detected in parts of the coding regions of individual genes, suggesting functional differentiation.

8. Expression analyses of Hsf genes from Lotus japonicus and Zea mays demonstrated that they were differentially expressed among tissue types and under different abiotic stresses. The LjHsf-01, LjHsf-02, LjHsf-04, LjHsf-09 and LjHsf-10 genes of Lotus japonicus, as well as the ZmHsf-01, ZmHsf-03, ZmHsf-04, ZmHsf-23, ZmHsf-24 and ZmHsf-25 genes of Zea mays, were significantly up-regulated by heat stress. Among these genes, microsynteny analysis and selection pressure analysis indicated that ZmHsf-01 and ZmHsf-04 of subclass A2 were highly conserved and functionally stable during the evolution of grasses. They may play an important role in heat shock stress resistance in grasses. We therefore cloned the full-length genes of ZmHsf-01 and ZmHsf-04 from the inbred line B73, laying a foundation for further functional studies.

In summary, using the methods of comparative genomics, this study demonstrated that the evolution of the legume and grass Hsf gene families was coupled with proposed whole genome duplication events, that Hsf genome segments show extensive microsynteny between species, and that differences in gene loss have contributed to the divergence of these gene families among plant lineages. These results can serve as an important basis for resolving the molecular evolution of the Hsf gene family at the genome-wide level.
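The duplication dating and selection pressure analyses summarized in results 4 to 7 are conventionally based on synonymous (Ks) and nonsynonymous (Ka) substitution rates. The sketch below assumes the standard molecular clock relation T = Ks / (2λ) and the usual reading of the Ka/Ks ratio. The abstract does not state which substitution rate λ was used, so the rate here is an assumed placeholder, and the function names and the Ka and Ks values are invented for illustration.

```python
def duplication_age_mya(ks, rate_per_site_per_year):
    """Estimate the time since duplication, in million years, as T = Ks / (2 * lambda)."""
    if rate_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    if ks < 0:
        raise ValueError("Ks cannot be negative")
    # Two lineages diverge from the duplication point, hence the factor of 2.
    return ks / (2.0 * rate_per_site_per_year) / 1e6


def selection_mode(ka, ks, tolerance=1e-9):
    """Classify a gene pair by its Ka/Ks ratio (omega)."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    omega = ka / ks
    if abs(omega - 1.0) <= tolerance:
        return "neutral"
    return "purifying" if omega < 1.0 else "positive"


# Hypothetical paralogous pair; values are made up, not taken from the thesis.
ASSUMED_RATE = 6.5e-9  # substitutions per synonymous site per year (assumption)
pair_ka, pair_ks = 0.02, 0.13

age = duplication_age_mya(pair_ks, ASSUMED_RATE)
mode = selection_mode(pair_ka, pair_ks)
print(f"estimated duplication age: {age:.1f} million years, selection: {mode}")
```

A ratio well below 1 for most duplicated pairs is the pattern described here as purifying selection. Detecting positive selection confined to parts of a coding region, as reported in result 7, requires site-level or sliding-window estimates instead of a single gene-wide ratio.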
