
Structural Characteristics and Comparative Genomic Analysis of the Complete Mitochondrial Genomes of Lepidopteran Insects

Molecular Characteristics and Comparative Analysis of the Complete Mitochondrial Genome of Lepidoptera

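【Author】 Hong Guiyun

【Advisors】 Jiang Shaotong; Wei Zhaojun

【Author information】 Hefei University of Technology, Agricultural Products Processing and Storage Engineering, 2009, PhD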

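【Abstract, translated from the Chinese】 Lepidoptera belongs to the phylum Arthropoda, superclass Hexapoda, class Insecta. It is a common group of insects with more than 255,000 known species distributed worldwide. No consensus has yet been reached on the phylogenetic relationships within Lepidoptera. Mitochondrial genomic DNA is present in mitochondria in high copy number, has a small molecular mass, and is relatively conserved in nucleotide sequence and composition. The genome contains no spacer regions or introns and has no repetitive sequences or unequal crossing over; it is easy to extract, amplify and analyze; it does not undergo recombination, inversion, translocation or similar mutations during inheritance; and it is strictly maternally inherited, so material from only a few individuals can reflect the genetic structure of a population. These properties make it well suited to evolutionary biology, and it has been widely used to study genome structure and function, population genetic structure, phylogeography, and phylogenetic relationships at various taxonomic levels.

To enrich the mitochondrial genome data for lepidopteran insects and to study the structural and evolutionary features of lepidopteran mitochondrial genomes in more depth, this thesis selected two lepidopteran species, A. melete and E. pyretorum. Their complete mitochondrial genome sequences were amplified and determined with a secondary-PCR strategy based on LA-PCR, and the encoded genes were annotated. Together with the mitochondrial genome data of 13 other sequenced lepidopteran species, the mitochondrial genomes of 15 lepidopteran species were compared and analyzed in detail with respect to four components: protein-coding genes, tRNA genes, rRNA genes and the control region. The base composition of the protein-coding genes, codon and amino acid usage, tRNA secondary structure, rRNA sequence homology and control region structure were systematically compared and summarized, and the phylogenetic relationships were reconstructed from the complete mitochondrial genome data of the 15 species. The following conclusions were obtained:

1. Using a secondary-PCR sequencing strategy based on LA-PCR, a rapid and accurate experimental system for sequencing complete insect mitochondrial genomes was established. A set of universal primers suitable for amplifying and sequencing complete lepidopteran mitochondrial genomes was designed and successfully applied to two lepidopteran species.

2. The complete mitochondrial genomes of A. melete and E. pyretorum are 15,140 bp and 15,327 bp long, respectively. Each contains 37 genes and a non-coding AT-rich region, and the gene arrangement is essentially consistent with the gene order and transcription direction reported for other lepidopteran insects. The number and length of intergenic spacers and overlapping regions differ among the 15 lepidopteran species, but the protein-coding gene pair atp8/atp6 overlaps by 7 bp in all of them.

3. The protein-coding genes, tRNA genes, rRNA genes and AT-rich region of the A. melete and E. pyretorum mitochondrial genomes all show a strong A+T bias. Comparing the base composition of the first, second and third codon positions of the protein-coding genes in the 15 species shows that A and T contents are similar at the first position; T content is more than twice the A content at the second position; and A and T contents are highest at the third position, where A+T is around 90% in every species. These results indicate strong GC→AT evolutionary selective pressure on the mitochondrial genome.

4. In the A. melete mitochondrial genome, cox1 starts with CGA and the other 12 protein-coding genes use ATN as the start codon. In the E. pyretorum mitochondrial genome, cox1 starts with CGA and cox2 with GTG, and the other 11 genes use ATN. In both species most genes use a complete triplet, TAA or TAG, as the stop codon, and a few use an incomplete T.

5. Codon usage and amino acid usage in the protein-coding genes of lepidopteran mitochondrial genomes are both strongly biased; NNU and NNA codons are used at very high frequency. Leu, Ile, Phe and Ser are the most abundant amino acids.

6. The order of the 22 tRNA genes in the A. melete and E. pyretorum mitochondrial genomes is the same as in other lepidopteran insects, and the genes are 60 to 72 bp long. Except for trnS1(AGN) and trnS2(UCN), the other 20 tRNA genes have the typical cloverleaf secondary structure; base mismatches are mainly U-G and U-U.

7. The AT-rich region of lepidopteran insects has the following structural features: downstream of the rrnS gene there is a conserved 18 to 22 bp poly-T stretch led by "ATAGA"; tandem repeats are present in the AT-rich region of all sequenced lepidopteran mitochondrial genomes except those of A. yamamai, C. boisduvalii and M. sexta; and the AT-rich region upstream of the trnM gene contains a poly-T stretch (on the β strand).

8. Phylogenetic trees of the 15 lepidopteran species were built from a combined dataset of the nucleotide and amino acid sequences of the 13 mitochondrial protein-coding genes, using BI and ML methods. The resulting topologies are similar, and both indicate a close relationship among Bombycoidea, Geometroidea and Noctuoidea, which is not entirely consistent with previous studies. The phylogenetic relationships within Lepidoptera therefore need to be examined further with mitochondrial genome data from more samples.

This study is the first to sequence, assemble, annotate and analyze the mitochondrial genomes of the lepidopterans A. melete and E. pyretorum. It enriches the mitochondrial genome data for lepidopteran insects and provides an important reference for further research on insect mitochondrial phylogenomics. By comparing the two newly sequenced mitochondrial genomes with those of the 13 previously sequenced lepidopteran species, it summarizes the general features of the structural organization and sequence evolution of lepidopteran mitochondrial genomes from the perspective of structural, comparative and evolutionary genomics.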
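The codon-position comparison in conclusion 3 can be illustrated with a minimal Python sketch. It is not the thesis's own pipeline: the function name `at_content_by_codon_position` and the toy sequence are hypothetical, and the sketch assumes an in-frame coding sequence given as a plain string.

```python
from collections import Counter


def at_content_by_codon_position(cds: str) -> dict[int, float]:
    """Return the A+T fraction at codon positions 1, 2 and 3 of a coding sequence.

    Assumes `cds` is in frame (starts at the first base of the start codon).
    """
    cds = cds.upper().replace("U", "T")
    # Ignore a trailing partial codon, such as an incomplete stop codon "T".
    usable = len(cds) - len(cds) % 3
    result = {}
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base, starting at this codon position
        counts = Counter(bases)
        result[pos + 1] = (counts["A"] + counts["T"]) / len(bases) if bases else 0.0
    return result


# Hypothetical toy sequence, not taken from either genome.
toy_cds = "ATGATTTTAAATCCGTAA"
for position, fraction in at_content_by_codon_position(toy_cds).items():
    print(f"codon position {position}: A+T = {fraction:.2f}")
```

Applied to the concatenated protein-coding genes of each species, a function of this kind would yield the per-position A+T values that the comparison above is based on.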

【Abstract】 Lepidoptera is one of the largest orders of Insecta, with more than 255,000 described species distributed throughout the world. The classification of species within Lepidoptera has been controversial and complicated. Mitochondrial DNA (mtDNA) exists in almost all eukaryotic cells in high copy number. Because of its small size, conserved composition and nucleotide sequence, maternal inheritance, relatively rapid evolutionary rate, lack of intermolecular recombination, absence of introns and intergenic spacer sequences, and ease of amplification and analysis, mtDNA has been used extensively to study population structure and phylogenetic relationships at various taxonomic levels. To enrich the mitochondrial genome data for Lepidoptera and to further the structural and evolutionary study of lepidopteran mitochondrial genomes, the mitochondrial genomes of two lepidopteran species belonging to two families, Artogeia melete and Eriogyna pyretorum, were sequenced, assembled, annotated and analyzed using a sub-PCR strategy based on long PCR. The two new mitochondrial genomes were then combined with those of thirteen other lepidopteran species deposited in GenBank for a comprehensive comparative analysis covering the base composition and codon usage of the protein-coding genes (PCGs), the secondary structure of the tRNA genes, the sequence homology of the rRNA genes and the structure of the control regions. Finally, the phylogenetic relationships of lepidopteran taxa with complete mitochondrial genomes were reconstructed from the concatenated amino acid sequences of 13 proteins downloaded from GenBank. The main conclusions are as follows:

1. Based on the long-PCR strategy, a rapid and accurate approach for sequencing complete insect mitochondrial DNA was established. A new set of universal primers designed in this study can be used to amplify and sequence lepidopteran mitochondrial genomes and was applied successfully to two species, A. melete and E. pyretorum.

2. The mitochondrial genomes of A. melete and E. pyretorum are 15,140 bp and 15,327 bp long, respectively. Both have a remarkably conserved set of 37 genes and a control region known as the (A+T)-rich region. The gene order and transcription direction are the same as in the other sequenced lepidopteran species. The number and length of the intergenic and overlapping regions differ among the sequenced Lepidoptera, but atp8 and atp6 overlap by 7 bp.

3. The A+T content of the protein-coding sequences, rRNA genes and tRNA genes of the A. melete and E. pyretorum mitochondrial genomes corresponds well to the A+T bias generally observed in insect mitochondrial genomes. Comparing the base composition at the three codon positions of the PCGs in the fifteen lepidopteran species reveals a common pattern: at the first codon position T% and A% are about equal; at the second position T% is higher than A%, by as much as a factor of two; and the third position has the highest A+T content, nearly 90%. Together these results show that GC→AT evolutionary pressure is strong.

4. The protein-coding genes of A. melete and E. pyretorum start with a typical ATN codon, except for cox1, which starts with CGA in both species, and cox2, which starts with GTG in E. pyretorum. Most of the 13 PCGs in the two species have a complete termination codon (TAA or TAG), while several use an incomplete termination codon, T.

5. There are obvious biases in both codon usage and amino acid usage in the mitochondrial PCGs of lepidopteran species. NNU and NNA are the most frequently used codons. Leu, Ile, Phe and Ser are the most abundant amino acids.

6. All 22 tRNA genes of the A. melete and E. pyretorum mitochondrial genomes have the typical cloverleaf structure except trnS1(AGN) and trnS2(UCN), whose DHU arm cannot form a stable stem-loop structure. The tRNA genes vary in length from 60 bp to 72 bp. Most mismatched base pairs are G-U and U-U pairs.

7. The (A+T)-rich region of lepidopteran mitogenomes contains several typical structures: a structure comprising the motif 'ATAGA' and an 18 to 22 bp poly-T stretch downstream of the rrnS gene, which is widely conserved in lepidopteran mitogenomes; variable tandem repeat units, present in the (A+T)-rich region of all sequenced lepidopteran mitogenomes except those of A. yamamai, C. boisduvalii and M. sexta; and a 9-bp poly-T stretch immediately upstream of trnM.

8. The phylogenetic relationships of the fifteen lepidopteran species were reconstructed from the combined dataset of nucleotide and amino acid sequences of the 13 protein-coding genes using maximum likelihood (ML) and Bayesian inference (BI). Both methods gave the same topology and supported a close relationship among Geometroidea, Noctuoidea and Bombycoidea. This result deviates from the traditional view. Mitochondrial genome data from more taxa are required to study the phylogenetic relationships of lepidopteran insects further.

This work is the first report of the complete mitochondrial genomes of A. melete and E. pyretorum. It adds two complete lepidopteran mitochondrial genome sequences and provides useful information for research on insect mitochondrial phylogenomics. The general properties of the organization and structure of the lepidopteran mitochondrial genome are drawn, from the viewpoint of structural, comparative and evolutionary genomics, by comparing the two new genomes with the data for the other lepidopteran species in GenBank.
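The start- and stop-codon categories reported in conclusion 4 (ATN starts, non-canonical starts such as CGA or GTG, complete TAA/TAG stops, and an incomplete T stop) can be sketched as a small classifier. This is an illustrative sketch only: the function names and the toy sequences are hypothetical, and it assumes each input is an in-frame coding sequence on its coding strand, with the annotated gene ending at the last base of the (possibly incomplete) stop codon.

```python
def classify_start_codon(cds: str) -> str:
    """Label the first codon as 'ATN' or as a non-canonical start."""
    codon = cds[:3].upper()
    if len(codon) == 3 and codon.startswith("AT"):
        return "ATN"
    return f"non-canonical ({codon})"


def classify_stop_codon(cds: str) -> str:
    """Label the end of the gene as a complete or incomplete stop codon."""
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        last = cds[-3:]
        return f"complete ({last})" if last in ("TAA", "TAG") else "no stop codon found"
    # Length not divisible by 3: the leftover bases form a partial stop codon.
    tail = cds[-remainder:]
    return f"incomplete ({tail})" if tail in ("T", "TA") else "no stop codon found"


# Hypothetical toy genes, not taken from either genome.
toy_genes = {
    "geneA": "ATTAAACCCTAA",  # ATN start, complete TAA stop
    "geneB": "CGAAAACCCT",    # CGA start, incomplete T stop
    "geneC": "GTGAAACCCTAG",  # GTG start, complete TAG stop
}
for name, seq in toy_genes.items():
    print(name, classify_start_codon(seq), classify_stop_codon(seq))
```

The incomplete-stop rule relies on gene length not being a multiple of three, which is how a bare T (or TA) terminus is usually recognized in an annotation; other conventions are possible.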
