
藏汉民族及3个不同地域的藏族群体线粒体基因组的比较研究:探查自然选择在基因组的印记

Comparative Study on Mitochondrial Genome between Tibetan and Han Populations, and Tibetans in Three Different Zones: Interrogating Mitochondrial Genes for Signature of Natural Selection

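【Author】 顾明亮 (Gu Mingliang)

【Advisors】 褚嘉祐 (Chu Jiayou); 王敦梅 (Wang Dunmei)

【Author information】 Peking Union Medical College (中国协和医科大学), Genetics, 2008, Master's thesis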

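【Abstract (translated from the Chinese)】 What traces natural selection has left in the human genome over the long course of human evolution is a question we are still trying to answer. Since Darwin's theory of evolution appeared, humans have been intensely curious about their own obscure past, but limits of knowledge and technology long kept that curiosity from being satisfied. The completion of the Human Genome Project (HGP) and the International HapMap Project, together with the maturing of genome sequencing technology, has supplied the knowledge base and technical platforms for such work, put the study of human evolutionary history back on the research agenda from a new starting point, and led to a steady stream of studies probing natural selection at the genome level. Because the nuclear genome recombines readily, which erodes the effects of natural selection, the analysis and interpretation of results become complicated. A further difficulty is distinguishing natural selection from demographic events such as population expansion, population bottlenecks, and population structure. Views on how natural selection has shaped human genomic variation therefore remain divided.

mtDNA is genetic material independent of the nuclear genome, with high copy number, no recombination, a high mutation rate, and maternal (uniparental) inheritance. Lack of recombination means that a beneficial variant can rise in frequency through genetic hitchhiking and quickly become enriched in a population; maternal inheritance promotes the rapid segregation and expression of beneficial mutations. These properties make important biological effects, such as selection, easier to detect, so mtDNA is a tool with great potential for studying human origins and evolution. Earlier studies showed that mtDNA variation is strongly region-specific. The traditional view attributes this to genetic drift, and a role for natural selection in shaping characteristic mtDNA variation has never been demonstrated. This study compares complete mtDNA sequences between Tibetan and Han populations, and among Tibetan groups from different regions, to examine whether natural selection acts on the mitochondrial genome, what kind of selection it is, and which factors drive it.

We studied 40 Tibetans whose families have lived in Tibet for generations, with 50 Han individuals as controls, and determined their complete mtDNA sequences; in Tibetan groups from 3 different regions we sequenced a region of about 2 kb covering the mitochondrial ATP6, ATP8, and Cyt b genes. mtDNA was sequenced in both directions on an Applied Biosystems 3730 automated DNA sequencer, and the genomic data were analyzed with phredPhrap 16.0, Network, DnaSP 4.20.2, and SPSS 15.0. Protein secondary structure was predicted with SSPro and SSPro8; alpha-carbon coordinates in PDB format were computed with 3Dpro, and the coordinates of the remaining atoms were filled in with MaxSprout; structures were displayed and compared in Rasmol and Swiss-PdbViewer. Changes in protein stability caused by single point mutations were predicted with the SVM-based method developed by Cheng et al. Changes in RNA structure caused by the variants were simulated with the RNAfold software package.

The 90 Tibetan and Han individuals belonged to macrohaplogroups M and N and were assigned to 13 haplogroups; apart from M9, no haplogroup differed significantly between the two groups. In the principal component analysis, the first (PC1) and second (PC2) principal components accounted for 41.3% and 10.7% of the total variance, 52.0% together. In a further analysis the Tibetan and Han populations fell in different quadrants of the principal component plot, with PC1 and PC2 accounting for 77.1% and 10.1% of the total variance, 87.2% together, showing a distinct regional pattern. Comparison of the complete mtDNA sequences of the two populations identified 18 variant sites with significant differences, 5 of which are reported here for the first time (http://www.mitomap.org), and 8 of which were placed on internal branches. We predicted and analyzed the structural changes caused by the 7 variants predominant in Tibetans and found that ND2 G4491A, CO2 G7697A, tRNA alanine T5628C, and 12S rRNA A1041G changed the structure markedly and increased its stability, which in energetic terms suggests adaptive selection. Haplotypes built from 5 D-loop variants showed that site 16145 tends to occur together with 16255 and 16284, and 16234 with 16316.

We analyzed nonsynonymous (N) and synonymous (S) substitutions in the 13 protein-coding genes of 40 Tibetans, 50 Han, 144 South Asians, and 48 East Asians and compared the groups. In the 40 Tibetans the N/S ratios of ATP6, ATP8, and Cyt b were relatively high (all greater than 1), and the N/S ratio of Cyt b differed significantly between Tibetans and South Asians. For the roughly 2 kb ATP6, ATP8, and Cyt b region sequenced in the 3 regional Tibetan groups, we applied three neutrality tests: Tajima's D test and Fu and Li's D and F tests. With increasing altitude, the combined sequence of the three genes, ATP6, and ATP8 departed progressively from the neutral model, and the departure was significant for ATP6. The N/S analysis suggested that ATP6 may be under adaptive selection in Tibetans, with a tendency to strengthen as altitude increases, whereas Cyt b is under purifying selection, which becomes stronger as altitude decreases.

Taken together, the complete mtDNA sequences of the Tibetan and Han populations and the roughly 2 kb ATP6, ATP8, and Cyt b region from the regional Tibetan groups suggest that ATP6, ATP8, ND2 (G4491A), CO2 (G7697A), tRNA alanine (T5628C), and 12S rRNA (A1041G) may be under adaptive selection in Tibetans. The main selective factor is most likely the particular geographic environment in which this population lives (high altitude, severe cold, low oxygen); that is, different geographic environments exert direct selective effects.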

【Abstract】 What signature natural selection has left in the human genome over the long course of human evolution remains an open question. Ever since Darwin's theory of evolution appeared, humans have been intensely curious about their own obscure past, but limits of knowledge and technique have hampered the exploration. The completion of the Human Genome Project and the HapMap project, together with the maturing of sequencing technology, has put the investigation of human evolution back on the agenda from a new starting point. Interpreting the results of most studies of natural selection nevertheless remains difficult, mainly because the high recombination rate of the nuclear genome can erase the effects of selection. A further difficulty in studying human evolutionary history is distinguishing natural selection from demographic effects such as population expansion, population bottlenecks, and population structure. As a result, most findings on natural selection remain controversial, with no consensus on their implications.

Human mitochondrial DNA (mtDNA), the extra-nuclear genetic material, is characterized by high copy number, lack of recombination, a high mutation rate, and strictly maternal inheritance. Maternal inheritance promotes the rapid segregation, expression, and adaptive selection of beneficial mtDNA mutations; lack of recombination means that a beneficial mutation can accumulate through genetic hitchhiking of the mtDNA haplotype that carries it. These characteristics make mtDNA a promising tool for investigating human origins and evolution. Previous studies indicated that mtDNA sequences show strong geographic variation, which has traditionally been attributed to genetic drift; an effect of selection on characteristic mtDNA variation has not yet been demonstrated. By comparing Tibetan and Han populations, and Tibetans from different zones, this study aims to detect signatures of natural selection in the mitochondrial genome and to identify the type of selection and the factors driving it.

We studied 40 Tibetans whose families have resided in Tibet for generations and 50 Han Chinese from Beijing, comparing complete mtDNA sequences between the two populations and the sequences (~2 kb) of the ATP6, ATP8, and Cyt b genes among Tibetans from three zones. mtDNA was sequenced on an Applied Biosystems 3730 DNA sequencer, and the data were analyzed with phredPhrap 16.0, Network, DnaSP 4.20.2, and SPSS 15.0. SSPro and SSPro8 were used to predict protein secondary structure; 3Dpro was used to calculate alpha-carbon coordinates in PDB format, MaxSprout to fill in the coordinates of the remaining atoms, and Rasmol and Swiss-PdbViewer to display and compare the structures. Changes in protein stability caused by single point mutations were predicted with the SVM-based method developed by Cheng et al. Changes in RNA structure caused by the mutations were simulated with RNAfold.

All 90 subjects belonged to macrohaplogroups M and N and were classified into 13 haplogroups. Apart from M9, no haplogroup differed significantly between the two populations. In the principal component analysis, the first and second principal components (PC1 and PC2) accounted for 41.3% and 10.7% of the total variance, 52.0% together. In a further analysis the Tibetan and Han populations fell in different quadrants, with PC1 and PC2 accounting for 77.1% and 10.1% of the total variance, 87.2% together, suggesting clear geographic differentiation. Comparison of the complete mtDNA sequences identified 18 variants that differed significantly between Tibetans and Han; 5 of them are reported here for the first time (http://www.mitomap.org), and 8 were placed on internal branches. Structure prediction for the 7 variants predominant in Tibetans showed that ND2 G4491A, CO2 G7697A, tRNA alanine T5628C, and 12S rRNA A1041G changed the structure significantly and increased its stability, which in energetic terms suggests adaptive selection. Haplotypes constructed from 5 variants in the D-loop region showed that site 16145 tends to combine with sites 16255 and 16284, and site 16234 with site 16316.

We compared nonsynonymous (N) and synonymous (S) substitutions in the 13 protein-coding genes among 40 Tibetans, 50 Han subjects, 144 South Asians, and 48 East Asians. The N/S values of the ATP6, ATP8, and Cyt b genes were relatively large (greater than 1) in Tibetans, and the value for Cyt b differed significantly between Tibetans and South Asians. We also compared the sequences (~2 kb) of the ATP6, ATP8, and Cyt b genes among Tibetans from three zones. Tajima's D test and Fu and Li's D and F tests gave negative D or F values, and the departure was significant for ATP6 under Fu and Li's F test. N/S analysis across the three zones provided evidence of adaptive selection on ATP6, with a tendency to strengthen as altitude increases, and of purifying selection on Cyt b, with a tendency to strengthen as altitude decreases.

Together, the complete mtDNA sequences and the ~2 kb sequences of the ATP6, ATP8, and Cyt b genes suggest adaptive selection on ATP6, ATP8, ND2 (G4491A), CO2 (G7697A), tRNA alanine (T5628C), and 12S rRNA (A1041G) in Tibetans. The primary selective factor is most likely the particular geographic environment in which this population lives (e.g., high altitude, hypoxia, extreme cold); that is, different geographic environments act directly as selective agents on mtDNA variation.
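
To illustrate one of the neutrality tests named in the abstract, the sketch below computes Tajima's D from a small alignment. It is not the thesis's implementation (the thesis ran its analyses in DnaSP); it only applies the standard published formula for the statistic to aligned, gap-free sequences. The function name `tajimas_d` and the toy sequences are hypothetical and chosen for illustration only.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length, gap-free sequences (illustrative only)."""
    n = len(seqs)
    if n < 4:
        # For n < 4 the variance term of the statistic is zero.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) alignment columns.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from the standard definition of the statistic.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


# Hypothetical toy alignment in which every variant is a singleton.
toy = ["AAAAA", "TAAAA", "ATAAA", "AATAA", "AAATA"]
print(f"Tajima's D = {tajimas_d(toy):.3f}")
```

In the toy alignment every variant occurs in a single sequence, so rare variants are in excess and D is negative, the same direction of departure the abstract reports. The sketch does not assess significance, which requires comparing D against its expected distribution under neutrality.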
