Isolation, Identification and Sequence Characterization of Full-Length cDNAs of MEP Pathway Genes in Eucommia ulmoides
【Author】 Liu Panfeng (刘攀峰);
【Supervisor】 Du Hongyan (杜红岩);
【Author information】 Chinese Academy of Forestry, Silviculture, 2012, PhD
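【Summary】 Eucommia ulmoides Oliv. is a Tertiary relict plant endemic to China and a valuable temperate rubber-producing and medicinal tree species. Its terpenoids are widely used in production and daily life; the most important are Eucommia rubber (gutta-percha) and the iridoids. The 2-C-methyl-D-erythritol-4-phosphate (MEP) pathway is one of the important upstream regulatory routes of terpenoid biosynthesis in plants. Starting from transcriptome sequencing data for E. ulmoides leaves, this study isolated and identified full-length cDNAs of the highly expressed genes for every enzyme in the MEP pathway, and used bioinformatics methods to analyze and predict their sequence features, structural characteristics and gene functions. The aim is to provide basic information for screening and mining potential functional genes, for investigating the rate-limiting steps of the pathway, for identifying targets for terpenoid metabolic engineering, and for genetic improvement breeding.
1-Deoxy-D-xylulose-5-phosphate synthase (DXS) is the first key enzyme of the MEP pathway. Two members of the DXS gene family were found in E. ulmoides and named EuDXS1 and EuDXS2. The EuDXS1 cDNA is 2805 bp long, with a 218 bp 5' untranslated region and a 448 bp 3' untranslated region; it encodes 712 amino acids and is most similar to the DXS gene of Mitragyna speciosa (81%). The EuDXS2 cDNA is 2552 bp long, with a 73 bp 5' untranslated region and a 337 bp 3' untranslated region; it encodes 713 amino acids and is most similar to the DXS gene of the rubber tree (82%). The deduced EuDXS1 and EuDXS2 amino acid sequences both contain the conserved motifs and functional sites typical of plant DXS proteins: a transit peptide (A1-A18; A1-A34), a TPP-binding motif (A106-A360; A108-A362), a pyrimidine-binding motif (A397-A553; A399-A555) and a transketolase C-terminal motif (A571-A676; A573-A678). The predicted secondary structures of both proteins are dominated by loop/coil, which accounts for 48.74% and 49.37% respectively. The predicted tertiary structures of EuDXS1 and EuDXS2 each consist of two subunits. EuDXS1 is most closely related to castor bean DXS1, and EuDXS2 to rubber tree DXS2.
1-Deoxy-D-xylulose-5-phosphate reductoisomerase (DXR) is the second rate-limiting enzyme of the MEP pathway. Two members of the DXR gene family were isolated from E. ulmoides and named EuDXR1 and EuDXR2. The EuDXR1 cDNA is 1814 bp long, with a 126 bp 5' untranslated region and a 251 bp 3' untranslated region; it encodes 478 amino acids and is most similar to the DXR gene of Populus trichocarpa (81%). The EuDXR2 cDNA is 1779 bp long, with a 163 bp 5' untranslated region and a 251 bp 3' untranslated region; it encodes 472 amino acids and is most similar to the grape DXR gene (81%). The deduced EuDXR1 and EuDXR2 amino acid sequences both contain the conserved motifs and functional sites typical of plant DXR proteins: a transit peptide (A1-A51; A1-A43), two DXR binding motifs (A227-A236, A297-A307; A221-A230, A291-A301), two NADPH-binding motifs (A88-A94, A113-A119; A82-A88, A107-A113) and an N-terminal proline-rich motif (A61-A69; A55-A63). The predicted secondary structures of both proteins are dominated by loop/coil, which accounts for 47.91% and 45.34% respectively. The predicted tertiary structures of EuDXR1 and EuDXR2 each consist of two subunits arranged in a "V" shape. EuDXR1 is most closely related to rice DXR1, and EuDXR2 to maize DXR2.
2-C-Methyl-D-erythritol-4-phosphate cytidylyltransferase (MCT) catalyzes the third enzymatic step of the MEP pathway. One MCT gene was isolated from E. ulmoides and named EuMCT. The EuMCT cDNA is 1435 bp long, with a 223 bp 5' untranslated region and a 252 bp 3' untranslated region; it encodes 319 amino acids and is most similar to the grape MCT gene (82%). The deduced EuMCT amino acid sequence contains a transit peptide (A1-A75) and multiple conserved functional sites of plant MCT proteins (A100, A102, A103, A104, A105, A106, A114, A170, A171, A173, A176, A195, A196, A198, A228, A244, A250, A252, A300). In the predicted secondary structure, α-helix accounts for 23.82%, β-sheet for 18.18% and loop/coil for 57.99%. The tertiary structure consists of two subunits and contains two distinctive P-loop structures. EuMCT is most closely related to grape MCT.
4-(Cytidine 5'-diphospho)-2-C-methyl-D-erythritol kinase (CMK) catalyzes the hydroxyl phosphorylation step of the MEP pathway. One CMK gene was isolated from E. ulmoides and named EuCMK. The EuCMK cDNA is 1644 bp long, with a 256 bp 5' untranslated region and a 203 bp 3' untranslated region; it encodes 394 amino acids and is most similar to the tomato CMK gene (82%). The EuCMK amino acid sequence contains a transit peptide (A1-A57) and the ATP-binding site required for catalysis by plant CMK proteins (A186-A202). In the predicted secondary structure, α-helix accounts for 32.74%, β-sheet for 19.29% and loop/coil for 47.97%. The predicted tertiary structure consists of two asymmetric subunits. EuCMK is most closely related to grape CMK.
2-C-Methyl-D-erythritol-2,4-cyclodiphosphate synthase (MDS) catalyzes the fifth enzymatic step of the MEP pathway. One MDS gene was isolated from E. ulmoides and named EuMDS. The EuMDS cDNA is 976 bp long, with a 119 bp 5' untranslated region and a 146 bp 3' untranslated region; it encodes 236 amino acids and is most similar to the MDS gene of Ageratina adenophora (79%). The EuMDS amino acid sequence contains a transit peptide (A1-A56) and multiple conserved functional sites of plant MDS proteins (A84, A87, A89, A121, A213, A217, A221, A223, A228). In the predicted secondary structure, α-helix accounts for 40.25%, β-sheet for 13.56% and loop/coil for 46.19%. The predicted tertiary structure consists of three subunits that wrap around one another to form an internal molecular cavity. EuMDS is most closely related to hop MDS.
1-Hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate synthase (HDS) is the sixth enzyme of the MEP pathway. One HDS gene was isolated from E. ulmoides and named EuHDS. The EuHDS cDNA is 2786 bp long, with a 171 bp 5' untranslated region and a 383 bp 3' untranslated region; it encodes 743 amino acids and is most similar to the grape HDS gene (84%). The deduced EuHDS amino acid sequence contains a transit peptide (A1-A30), a PSN motif (A58-A78), a PSI motif (A354-A620) and the three absolutely conserved cysteine sites of plant HDS proteins (A644, A647, A678). In the predicted secondary structure, α-helix accounts for 37.55%, β-sheet for 19.25% and loop/coil for 43.20%. In the predicted tertiary structure, the N-terminus forms an eight-stranded β-barrel of the TIM-barrel superfamily and the C-terminus forms a β-sheet flanked on both sides by helices. EuHDS is most closely related to grape HDS.
1-Hydroxy-2-methyl-2-(E)-butenyl-4-diphosphate reductase (HDR) is the third key enzyme of the MEP pathway. One HDR gene was isolated from E. ulmoides and named EuHDR. The EuHDR cDNA is 1653 bp long, with an 82 bp 5' untranslated region and a 188 bp 3' untranslated region; it encodes 460 amino acids and is most similar to the HDR gene of Camptotheca acuminata (82%). The deduced EuHDR amino acid sequence contains a transit peptide (A1-A33) and multiple conserved functional sites of plant HDR proteins (A117, A208, A262, A345). In the predicted secondary structure, α-helix accounts for 35.65%, β-sheet for 19.78% and loop/coil for 44.57%. The predicted tertiary structure is monomeric, with an irregular trefoil shape. EuHDR is most closely related to grape HDR.
Isopentenyl diphosphate isomerase (IPI) catalyzes the reversible interconversion of IPP and DMAPP and is a hub of the terpenoid metabolic network. One IPI gene was isolated from E. ulmoides and named EuIPI. The EuIPI cDNA is 1231 bp long, with a 79 bp 5' untranslated region and a 231 bp 3' untranslated region; it encodes 306 amino acids and is most similar to the IPI gene of Camptotheca acuminata (84%). The deduced EuIPI amino acid sequence contains a transit peptide (A1-A70) and the functional sites typical of plant IPI proteins (A154, A175, A107, A119, A156, A196, A206, A208). In the predicted secondary structure, α-helix accounts for 22.55%, β-sheet for 13.40% and loop/coil for 64.05%. The tertiary structure is monomeric. EuIPI is most closely related to the IPI protein of European hazel.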
【Abstract】 Eucommia ulmoides Oliv. is a Tertiary relict species endemic to China, traditionally used as a rare medicinal tree and a high-quality temperate rubber-producing tree. The terpenoids of E. ulmoides, represented by the iridoids and gutta-percha, have considerable economic value and are widely used in industry and in daily life. The MEP pathway is one of the two elucidated upstream biosynthetic routes that modulate terpenoid biosynthesis in plants. In this study, full-length cDNAs of the highly expressed genes for the complete set of MEP pathway enzymes were systematically isolated on the basis of transcriptome sequencing data from E. ulmoides leaves, and their sequence structure and gene function were characterized and predicted with bioinformatics techniques. The results should help to uncover potential key genes, to identify the rate-limiting steps of the pathway, and to provide fundamental information for metabolic engineering and genetic improvement breeding of E. ulmoides.
Two genes encoding DXS, identified as the first key regulatory point of the MEP pathway, were isolated from E. ulmoides leaves and designated EuDXS1 and EuDXS2. Their highest gene sequence similarities were to Mitragyna speciosa (81%) and Hevea brasiliensis (82%) respectively. The full-length EuDXS1 cDNA was 2805 bp, including a 5' non-coding region of 218 bp and a 3' non-coding region of 448 bp, and encoded 712 amino acids; the EuDXS2 cDNA was 2552 bp, including a 5' non-coding region of 73 bp and a 3' non-coding region of 337 bp, and encoded 713 amino acids. Representative conserved motifs and functional sites of plant DXS proteins were found in the deduced coding sequences of EuDXS1 and EuDXS2: a transit peptide sequence (A1-A18; A1-A34), a TPP binding motif (A106-A360; A108-A362), a pyrimidine binding motif (A397-A553; A399-A555) and a TK C-terminal motif (A571-A676; A573-A678). Loop/coil was the main component of the predicted secondary structure, at 48.74% for EuDXS1 and 49.37% for EuDXS2. The calculated tertiary structures of EuDXS1 and EuDXS2 were both composed of two subunits. EuDXS1 was evolutionarily closest to the Ricinus communis DXS1 protein, and EuDXS2 to the Hevea brasiliensis DXS2 protein.
Two genes encoding DXR, identified as the second rate-limiting step of the MEP pathway, were isolated from E. ulmoides leaves and designated EuDXR1 and EuDXR2. Their highest gene sequence similarities were to Populus trichocarpa (81%) and Vitis vinifera (81%) respectively. The full-length EuDXR1 cDNA was 1814 bp, including a 5' non-coding region of 126 bp and a 3' non-coding region of 251 bp, and encoded 478 amino acids; the EuDXR2 cDNA was 1779 bp, including a 5' non-coding region of 163 bp and a 3' non-coding region of 251 bp, and encoded 472 amino acids. Representative conserved motifs and functional sites of plant DXR proteins were found in the deduced coding sequences of EuDXR1 and EuDXR2: a transit peptide sequence (A1-A51; A1-A43), two DXR binding motifs (A227-A236, A297-A307; A221-A230, A291-A301), two NADPH binding motifs (A88-A94, A113-A119; A82-A88, A107-A113) and an N-terminal proline-rich motif (A61-A69; A55-A63). Loop/coil was the main component of the predicted secondary structure, at 47.91% for EuDXR1 and 45.34% for EuDXR2. The calculated tertiary structures of EuDXR1 and EuDXR2 were both composed of two subunits arranged in a "V" shape. EuDXR1 was evolutionarily closest to the Oryza latifolia DXR1 protein, and EuDXR2 to the Zea mays DXR2 protein.
A gene encoding MCT, which catalyzes the third enzymatic reaction of the MEP pathway, was isolated from E. ulmoides leaves and designated EuMCT. With its highest gene sequence similarity to Vitis vinifera (82%), the full-length EuMCT cDNA was 1435 bp, including a 5' non-coding region of 223 bp and a 3' non-coding region of 252 bp, and encoded 319 amino acids. The transit peptide sequence (A1-A75) and multiple conserved functional sites of plant MCT proteins (A100, A102, A103, A104, A105, A106, A114, A170, A171, A173, A176, A195, A196, A198, A228, A244, A250, A252, A300) were found in the deduced coding sequence of EuMCT. The predicted secondary structure of EuMCT comprised 23.82% α-helix, 18.18% β-sheet and 57.99% loop/coil. The calculated tertiary structure was composed of two subunits and contained two distinctive P-loop structures. EuMCT was evolutionarily closest to the Vitis vinifera MCT protein.
A gene encoding CMK, which catalyzes the hydroxyl phosphorylation reaction of the MEP pathway, was isolated from E. ulmoides leaves and designated EuCMK. With its highest gene sequence similarity to Lycopersicon esculentum (82%), the full-length EuCMK cDNA was 1644 bp, including a 5' non-coding region of 256 bp and a 3' non-coding region of 203 bp, and encoded 394 amino acids. The transit peptide sequence (A1-A57) and the ATP binding site (A186-A202) essential for catalysis by plant CMK enzymes were found in the deduced coding sequence of EuCMK. The predicted secondary structure of EuCMK comprised 32.74% α-helix, 19.29% β-sheet and 47.97% loop/coil. The calculated tertiary structure was composed of two asymmetric subunits. EuCMK was evolutionarily closest to the Vitis vinifera CMK protein.
A gene encoding MDS, which catalyzes the fifth enzymatic reaction of the MEP pathway, was isolated from E. ulmoides leaves and designated EuMDS. With its highest gene sequence similarity to Ageratina adenophora (79%), the full-length EuMDS cDNA was 976 bp, including a 5' non-coding region of 119 bp and a 3' non-coding region of 146 bp, and encoded 236 amino acids. The transit peptide sequence (A1-A56) and multiple conserved functional sites of plant MDS proteins (A84, A87, A89, A121, A213, A217, A221, A223, A228) were found in the deduced coding sequence of EuMDS. The predicted secondary structure of EuMDS comprised 40.25% α-helix, 13.56% β-sheet and 46.19% loop/coil. The calculated tertiary structure was composed of three subunits that together form a molecular cavity. EuMDS was evolutionarily closest to the Humulus lupulus MDS protein.
A gene encoding HDS, which catalyzes the sixth enzymatic reaction of the MEP pathway, was isolated from E. ulmoides leaves and designated EuHDS. With its highest gene sequence similarity to Vitis vinifera (84%), the full-length EuHDS cDNA was 2786 bp, including a 5' non-coding region of 171 bp and a 3' non-coding region of 383 bp, and encoded 743 amino acids. The transit peptide sequence (A1-A30), a PSN motif (A58-A78), a PSI motif (A354-A620) and three absolutely conserved cysteine sites of plant HDS proteins (A644, A647, A678) were found in the deduced coding sequence of EuHDS. The predicted secondary structure of EuHDS comprised 37.55% α-helix, 19.25% β-sheet and 43.20% loop/coil. The calculated tertiary structure showed an N-terminal domain with an eight-stranded β barrel belonging to the large TIM-barrel superfamily and a C-terminal domain consisting of a β sheet flanked on both sides by helices. EuHDS was evolutionarily closest to the Vitis vinifera HDS protein.
A gene encoding HDR, identified as the third rate-limiting step of the MEP pathway, was isolated from E. ulmoides leaves and designated EuHDR. With its highest gene sequence similarity to Camptotheca acuminata (82%), the full-length EuHDR cDNA was 1653 bp, including a 5' non-coding region of 82 bp and a 3' non-coding region of 188 bp, and encoded 460 amino acids. The transit peptide sequence (A1-A33) and multiple conserved functional sites of plant HDR proteins (A117, A208, A262, A345) were found in the deduced coding sequence of EuHDR. The predicted secondary structure of EuHDR comprised 35.65% α-helix, 19.78% β-sheet and 44.57% loop/coil. The calculated tertiary structure was a monomer with an asymmetrical shamrock-like shape. EuHDR was evolutionarily closest to the Vitis vinifera HDR protein.
A gene encoding IPI, which catalyzes the reversible conversion between IPP and DMAPP and is regarded as a hinge point in terpenoid metabolic networks, was isolated from E. ulmoides leaves and designated EuIPI. With its highest gene sequence similarity to Camptotheca acuminata (84%), the full-length EuIPI cDNA was 1231 bp, including a 5' non-coding region of 79 bp and a 3' non-coding region of 231 bp, and encoded 306 amino acids. The transit peptide sequence (A1-A70) and other representative conserved motifs and functional sites of plant IPI proteins (A154, A175, A107, A119, A156, A196, A206, A208) were found in the deduced coding sequence of EuIPI. The predicted secondary structure of EuIPI comprised 22.55% α-helix, 13.40% β-sheet and 64.05% loop/coil. The calculated tertiary structure of EuIPI was a monomer. EuIPI was evolutionarily closest to the Corylus avellana IPI protein.
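The length figures reported for each gene can be cross-checked with simple arithmetic. The sketch below is not part of the thesis; it assumes that each full-length cDNA equals the 5' non-coding region plus an open reading frame of one codon per amino acid and a single stop codon, plus the 3' non-coding region, with no poly(A) tail counted. The function names are hypothetical. The values are taken from the abstract, reading EuDXR1 as a 251 bp 3' non-coding region and 478 amino acids, the reading under which its reported total balances.

```python
# Reported figures: (cDNA length, 5' non-coding, 3' non-coding, amino acids); lengths in bp.
REPORTED = {
    "EuDXS1": (2805, 218, 448, 712),
    "EuDXS2": (2552, 73, 337, 713),
    "EuDXR1": (1814, 126, 251, 478),
    "EuDXR2": (1779, 163, 251, 472),
    "EuMCT": (1435, 223, 252, 319),
    "EuCMK": (1644, 256, 203, 394),
    "EuMDS": (976, 119, 146, 236),
    "EuHDS": (2786, 171, 383, 743),
    "EuHDR": (1653, 82, 188, 460),
    "EuIPI": (1231, 79, 231, 306),
}


def expected_cdna_length(utr5, utr3, residues):
    """Length implied by both non-coding regions plus an ORF of
    `residues` codons and one stop codon (an assumption, see lead-in)."""
    return utr5 + 3 * (residues + 1) + utr3


def length_discrepancies(records):
    """Return {gene: reported - expected} for genes whose figures do not balance."""
    mismatches = {}
    for gene, (total, utr5, utr3, residues) in records.items():
        diff = total - expected_cdna_length(utr5, utr3, residues)
        if diff != 0:
            mismatches[gene] = diff
    return mismatches


if __name__ == "__main__":
    for gene, diff in length_discrepancies(REPORTED).items():
        print(f"{gene}: reported length differs from the implied length by {diff} bp")
```

Under these assumptions nine of the ten entries balance exactly. EuDXR2 comes out 54 bp shorter than the implied length, so one of its reported figures may be worth re-checking against the thesis itself.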
【Key words】 Eucommia ulmoides; terpenoids; MEP pathway; gene isolation; sequence characterization;