Genetic Diversity Study on Oncomelania hupensis Based on Modern Bioinformatics Techniques
(Original title: 应用现代生物信息技术对湖北钉螺遗传多样性的研究)
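【Author】 Li Shizhu (李石柱);
【Supervisor】 Zhou Xiaonong (周晓农);
【Author information】 Chinese Center for Disease Control and Prevention, Epidemiology and Health Statistics, 2009, PhD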
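【Abstract (translated from the Chinese)】 Schistosomiasis is an infectious disease that seriously endangers human health and remains one of the major public health problems facing China. Oncomelania hupensis is the sole intermediate host of Schistosoma japonicum and plays a key role in the transmission of schistosomiasis japonica. The snail is distributed in the marshland, hilly and mountainous regions of mainland China along the middle and lower reaches of the Yangtze River and to its south. Owing to strong geographical isolation and to differences in habitat and natural conditions, marked genetic differentiation has arisen among its geographical populations. Because of its importance for schistosomiasis epidemiology and for prevention and control work, the genetic diversity of O. hupensis has attracted wide attention. Relative to the rapid development of modern bioinformatics, however, research on the subject remains insufficient and the methods applied are limited.

In this study, O. hupensis samples were first collected from regions with different landscapes and a spatial distribution database was established. On this basis, restriction fragments of genomic DNA were hybridized with biotin-labelled oligonucleotide probes, namely (AAT)17, (GA)25, (CCT)17, (AC)25, (CAG)17, (CA)18, (CAC)5, (TC)10, (GT)8 and (TG)18, and then separated, enriched, cloned and sequenced to construct a microsatellite DNA library of O. hupensis. Polymorphic microsatellite loci selected from this library were used to analyse the genetic structure of O. hupensis populations in the middle and lower reaches of the Yangtze River. The complete mitochondrial genome (mtDNA) sequence of the Yueyang (Hunan) strain of O. hupensis was determined by long PCR and primer-walking sequencing, combined with SubPCR and clone sequencing. In addition, fragments of the mitochondrial 16S gene and the ribosomal internal transcribed spacers (ITS1-ITS2) were sequenced from individuals of different landscape populations, and the relationship between genetic differentiation and geographical isolation among these populations was analysed. The study obtained the following results.

1. Construction of a spatial genetic information management system for O. hupensis
1.1 Guided by the concepts of landscape genetics and aimed at studies of the spatial distribution and population genetics of O. hupensis, a spatial genetic information management system was designed and programmed on the basis of an initial collection of populations from different geographical landscapes. The system has two parts, a basic database and a management system. The basic database is divided, following the classification hierarchy of the study samples, into three databases: collection sites, samples and genetic information.
1.2 The database has been preliminarily completed and contains 73 collection sites and 676 records with their associated genetic information. The management system is secure and stable. Through secondary coding and effective indexing it supports querying, filtering, modifying, importing and exporting data, which makes management and operation convenient. The system provides query services on the characteristics of snails from different types of region and facilitates study design and statistics, so it has practical value for research on the distribution and population genetics of O. hupensis.

2. Genetic diversity of O. hupensis based on microsatellite DNA
2.1 A microsatellite DNA library of O. hupensis was constructed for the first time. A total of 209 microsatellite DNA sequences were obtained, and remote BLAST searches showed no obvious homology with known microsatellite DNA sequences. Of these sequences, 79 were perfect repeats (37.8%), 101 were imperfect repeats (48.33%) and 29 were compound repeats (13.88%). Dinucleotide repeats were the most common, trinucleotide repeats came second, and repeats of longer motifs were rare. (CA)n and (GT)n repeats were the most abundant, and the longest (CA)n tract reached 98 repeats.
2.2 Polymorphic loci in the library were screened and characterized. Following the classification principles for microsatellite DNA, 67 microsatellite loci were selected and 20 of them were characterized. Sixteen primer pairs gave clear specific amplification, and 14 loci were polymorphic, a polymorphism rate of 70%. Seven loci were chosen at random for gene scanning of O. hupensis populations, and six gave good signals: P84, T5-13, T5-11, T4-22, T6-27 and P82. Among these six loci, P84 had low observed heterozygosity and PIC values (0.1667 and 0.1813, respectively), whereas the other loci had observed heterozygosity of 0.36-0.8929 and PIC values of 0.8437-0.9289, indicating good polymorphism.
2.3 The six microsatellite loci were used to examine the population genetic structure of five O. hupensis populations from the middle and lower reaches of the Yangtze River. A total of 188 alleles were detected at the six loci, with an average of 15.83 per locus across populations, and the distribution of alleles among populations showed no obvious concentration. Within-population analysis gave mean observed heterozygosity, expected heterozygosity and PIC values over all loci of 0.637, 0.811 and 0.777, respectively, indicating marked polymorphism. Taking all indices together, genetic variation was highest in the Hubei population and lowest in the Jiangsu population. Analysis of genetic structure among populations showed relatively high genetic differentiation between the Jiangsu and Jiangxi populations and relatively low differentiation between the Anhui and Hunan populations. Gene flow in the total population and within populations was low, and heterozygosity was correspondingly high. The among-population differentiation coefficient nevertheless indicated low differentiation among populations, with genetic variation arising mainly among individuals within populations.

3. Genetic diversity of O. hupensis based on the mitochondrial genome
3.1 The complete mitochondrial genome sequence of O. hupensis was obtained for the first time. It is 15 182 bp long (GenBank accession no. FJ997214), forms a closed circular molecule and has an A+T content of 67.32%. It encodes 37 genes, comprising 13 protein-coding genes, 22 tRNA genes and 2 rRNA genes, and contains one A+T-rich region. Eight tRNA genes are encoded on the light strand and the remaining genes on the heavy strand.
3.2 Bioinformatic analysis was performed on the complete mtDNA sequence. All 13 protein-coding genes use ATG as the start codon and TAA or TAG as the stop codon, except ND1, which ends with a putative incomplete stop codon T. All protein-coding genes are transcribed in the same direction, and codon usage shows a strong A+T bias. There are 21 intergenic spacers totalling 145 bp, ranging from 1 to 30 bp in length. Gene overlaps are short and occur at only 2 sites, of 4 bp and 7 bp. The mitochondrial genome contains 22 transfer RNAs, all of which can form the typical cloverleaf structure except the 2 tRNASer (AGN), tRNAGln and tRNAIle, and one distinctive tRNA (tRNASeC) is present.
3.3 The genetic diversity of O. hupensis populations from different landscapes was analysed using ITS1-ITS2 sequences of ribosomal DNA and mtDNA 16S gene sequences. The genetic characteristics of the different DNA sequences divide the O. hupensis populations of mainland China into four main groups: the populations of the middle and lower reaches of the Yangtze River, the high-mountain populations of Yunnan and Sichuan, the inland hilly populations of Guangxi and the coastal hilly populations of Fujian. For both markers (ITS1-ITS2 and 16S), genetic differences among collection sites showed clear geographical clustering and were significantly correlated with geographical distance (P<0.001), with correlation coefficients R(ITS1-ITS2) = 0.784 and R(16S) = 0.717. The population genetic distribution pattern is consistent with the isolation-by-distance model.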
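The observed heterozygosity, expected heterozygosity and PIC values reported for the microsatellite loci can be illustrated with a minimal Python sketch. The abstract does not say which software or estimators were used, so the formulas here are assumptions: expected heterozygosity is computed without a small-sample correction, and PIC follows the definition of Botstein et al. The function name `locus_stats` and the example genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Return (Ho, He, PIC) for one microsatellite locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Hypothetical helper for illustration; not the thesis's own code.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies over all 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity (assumed: no small-sample correction).
    he = 1.0 - sum(p * p for p in freqs)
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    pic = he - sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return ho, he, pic


# Hypothetical genotypes (allele sizes in bp) for four snails at one locus.
example = [(150, 150), (150, 154), (154, 158), (150, 158)]
print(locus_stats(example))
```

A locus with one dominant allele gives low values on all three measures, which is the pattern the abstract describes for P84.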
【Abstract】 Schistosomiasis, a zoonotic parasitic disease, is one of the major public health problems threatening human health in China. Oncomelania hupensis, distributed in marshland, hilly and mountainous regions along and to the south of the Yangtze River, is the sole intermediate host of Schistosoma japonicum and therefore plays a key role in the transmission of schistosomiasis japonica. Owing to differences in geographical distribution, ecological environment and natural factors, distinct genetic differentiation has occurred among O. hupensis populations. Given its significance for the epidemiology, control and prevention of schistosomiasis, the genetic diversity of O. hupensis has attracted extensive attention. However, the studies and techniques applied in this field have not kept pace with the rapid development of bioinformatics. It is therefore necessary to study the population genetics and subspecies differentiation of O. hupensis in China.

In this study, O. hupensis was first sampled from different landscape types and a geospatial database of its bioinformatic data was established. Secondly, a microsatellite DNA library of O. hupensis was constructed by hybridizing restriction fragments of genomic DNA with biotin-labelled oligonucleotide probes, including (AAT)17, (GA)25, (CCT)17, (AC)25, (CAG)17, (CA)18, (CAC)5, (TC)10, (GT)8 and (TG)18, and the population genetic structure of O. hupensis from the middle and lower reaches of the Yangtze valley was then analysed. Thirdly, the complete mtDNA sequence of a Hunan isolate of O. hupensis was determined by long PCR and walking sequencing together with SubPCR and clone sequencing. In addition, the relationship between the genetic variation of O. hupensis from different landscapes and geographical isolation was explored, in line with the theory of landscape genetics, by sequencing the mitochondrial 16S gene and the ribosomal ITS1-ITS2 fragment.

1. Establishment of a management system for geospatial genetic information on O. hupensis
1.1 In line with the theory of landscape genetics, and with the aim of investigating the geospatial distribution and population genetics of O. hupensis, a management system for geospatial genetic information on O. hupensis was programmed. The system comprises 2 parts. One is the basic database, made up of 3 sub-datasets: collection sites, samples and genetic information. The other is the information management system, which provides functions for accessing the datasets.
1.2 The database was preliminarily established with 73 collection sites, 676 sample records and the relevant genetic information of the collected O. hupensis. Through secondary encoding and effective indexing, the system supports data query, filtering, amendment, import and export, and it leaves room for further data expansion and online data entry. The system provides a query service that facilitates study design and statistical analysis. It is therefore worth applying to studies of the distribution and population genetics of other samples in addition to O. hupensis.

2. Genetic diversity of O. hupensis based on microsatellite DNA
2.1 A total of 209 valid sequences were obtained, of which 79 were perfect repeats (37.8%), 101 were imperfect repeats (48.33%) and 29 were compound repeats (13.88%). Among the microsatellite DNA sequences, dinucleotide repeats were the most common and trinucleotide repeats came second, followed by repeats of longer motifs. (CA)n and (GT)n repeats were the most abundant, and the longest (CA)n tract contained 98 repeats.
2.2 Based on the classification of microsatellite DNA sequences, 67 primer pairs were designed and 20 were selected for testing. Of these, 16 gave clear specific amplification with bands of the expected size, and 14 loci were polymorphic (70% of the 20 loci tested). Gene scanning of 7 randomly selected loci gave good signals for 6 of them: P84, T5-13, T5-11, T4-22, T6-27 and P82. Of these 6 loci, only P84 showed low observed heterozygosity and polymorphism information content (PIC), at 0.1667 and 0.1813, respectively. For the others, observed heterozygosity ranged from 0.36 to 0.8929 and PIC from 0.8437 to 0.9289, indicating good polymorphism.
2.3 The 6 microsatellite DNA loci were used to assess genetic diversity in 5 populations of O. hupensis. Among the 6 loci, P84, T5-11 and T4-22 were unbalanced to some degree. A total of 188 alleles were detected, with an average of 15.83 per locus across populations, and the alleles showed no obvious concentration in any population. Population genetic analysis gave observed heterozygosity, expected heterozygosity and PIC values over all loci of 0.637, 0.811 and 0.777, respectively. Genetic variation of O. hupensis was highest in the Jiangsu population and lowest in the Hubei population. Genetic differentiation was high between the Jiangsu and Jiangxi populations and low between the Anhui and Hunan populations. Gene exchange was infrequent, and heterozygosity was accordingly high. However, the low differentiation coefficient indicated that genetic variation arose mainly among individuals.

3. Landscape genetics of O. hupensis based on the mitochondrial genome
3.1 The complete mtDNA sequence of O. hupensis (GenBank accession no. FJ997214) was determined. It is 15 182 bp long and forms a closed circular molecule with an A+T content of 67.32%. It encodes 37 genes, comprising 13 protein-coding genes, 22 tRNA genes and 2 rRNA genes, and contains an A+T-rich region. Eight tRNA genes are encoded on the light strand and the remaining genes on the heavy strand.
3.2 All 13 protein-coding genes use ATG as the start codon and TAA or TAG as the stop codon, except ND1, which ends with a putative incomplete stop codon T. All protein-coding genes are transcribed in the same direction, and codon usage shows a strong A+T bias. The 21 intergenic spacers total 145 bp and range from 1 to 30 bp in length, and there are 2 short gene overlaps of 4 bp and 7 bp. A total of 22 transfer RNAs were found in the mtDNA. All of them form the typical cloverleaf structure except the 2 tRNASer (AGN), tRNAGln and tRNAIle, and a distinctive tRNA (tRNASeC) is present.
3.3 The genetic diversity of landscape populations was analysed using ITS1-ITS2 sequences of ribosomal DNA and mtDNA 16S sequences. O. hupensis in mainland China could be divided into 4 groups: the population in the middle and lower reaches of the Yangtze valley, the mountainous population in Yunnan and Sichuan, the inland hilly population in Guangxi and the coastal hilly population in Fujian, all of which correspond to landscape ecological types. For both DNA markers, genetic differences among collection sites showed obvious geographical clustering. There was a significant positive correlation between geographical distance and genetic variation (R(ITS1-ITS2) = 0.784, R(16S) = 0.717, P<0.01), showing that the population genetic distribution is consistent with the isolation-by-distance model.
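The reported correlation between geographical distance and genetic difference can be illustrated with a Mantel-style permutation test, a common way to assess isolation by distance. This is a minimal sketch under stated assumptions: the abstract does not say which test or software produced R and P, and the function names `upper_triangle`, `pearson` and `mantel`, as well as the example matrices, are hypothetical.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(geo, gen, permutations=999, seed=0):
    """One-sided Mantel test: (correlation, permutation p-value).

    geo, gen: symmetric pairwise distance matrices over the same sites.
    """
    rng = random.Random(seed)
    n = len(geo)
    geo_flat = upper_triangle(geo)
    observed = pearson(geo_flat, upper_triangle(gen))
    hits = 0
    for _ in range(permutations):
        # Relabel the sites of the genetic matrix (rows and columns together).
        order = list(range(n))
        rng.shuffle(order)
        permuted = [[gen[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(geo_flat, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical example: four sites on a line, genetic distance grows with distance.
positions = [0.0, 1.0, 3.0, 6.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[0.01 * d for d in row] for row in geo]
print(mantel(geo, gen))
```

Permuting site labels rather than individual matrix cells keeps the dependence among pairwise distances intact, which is why this test is preferred over a plain significance test of the correlation.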
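- 【Online publication contributor】 Chinese Center for Disease Control and Prevention 【Online publication year/issue】 2012, Issue 02
- 【Classification code】 R532.21; Q953
- 【Times cited】 6
- 【Downloads】 449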