
The Analysis of Tandem Repeat Sequences in the Genome of Chinese Shrimp (Fenneropenaeus chinensis), and the Development and Application of Molecular Markers

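【Author】 Gao Huan (高焕)

【Advisor】 Kong Jie (孔杰)

【Author information】 Graduate School of the Chinese Academy of Sciences (Institute of Oceanology), Marine Biology, 2006, PhD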

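【Summary】 Tandem repeats are highly repetitive sequences that are widely distributed in the genomes of eukaryotes and some prokaryotes. They comprise microsatellites, whose repeat units are 1-6 bp long, and minisatellites, whose repeat units are 7 bp or longer. This study analyzed the tandem repeats in randomly sequenced genomic fragments of the Chinese shrimp in order to characterize their composition and distribution in the genome, and to develop polymorphic microsatellite loci for identifying individuals and families of this species. The specific contents and results are as follows.

1. Microsatellite and minisatellite repeats in the Chinese shrimp genome were analyzed. A total of 2 597 000 bp of sequence was obtained, equivalent to about 1.23‰ of the whole genome, in which 3 888 microsatellites and 700 minisatellites were found. The repeats had a combined length of 305 555 bp, or 11.72% of the total sequence; microsatellites alone accounted for 232 979 bp, or 8.97% of the total, a proportion markedly larger than that of microsatellites in genomes such as those of human and Anopheles mosquitoes. The relative proportions of the repeat types by count were broadly consistent with their relative proportions by cumulative length. Among microsatellites, dinucleotide repeats were the most numerous, followed by trinucleotide and tetranucleotide repeats, and then by mononucleotide, hexanucleotide and pentanucleotide repeats; among minisatellites, repeats with 12-bp units were the most numerous. The dominant motif also differed between repeat types: A was the most frequent mononucleotide motif, AT the most frequent dinucleotide motif, AAT the most frequent trinucleotide motif, and so on. Pentanucleotide repeats were far fewer, and had far fewer motif classes, than the neighboring tetranucleotide and hexanucleotide repeats. Further analysis showed that repeat types whose unit length is a prime number (5, 7, 11, 13 and so on) had fewer repeats and fewer motif classes than the repeat types of adjacent unit lengths. A preliminary interpretation is that this pattern may reflect an evolutionary relationship between microsatellites and minisatellites: many minisatellites may have arisen from microsatellite repeat units, especially units of 1-3 bp, through further duplication and mutation. In terms of frequency distribution by copy number, dinucleotide repeats were approximately normally distributed, whereas for the other types the number of repeats declined steadily as copy number increased.

2. From 1 900 clone sequences, 12 microsatellite primer pairs were designed that showed relatively high genetic polymorphism and gave stable amplification products. Analysis of the core sequences of all microsatellites that yielded amplification products showed that the polymorphic loci were mainly dinucleotide repeats, and that repeat copy number was correlated to some degree with the number of alleles (that is, with genetic polymorphism), although the correlation was not significant.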
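As an illustration of the unit-length classification used above (1-6 bp units for microsatellites, 7 bp or longer for minisatellites), the following Python sketch scans a DNA string for perfect tandem repeats and reports the fraction of the sequence they cover. It is not the software used in the thesis. The function names, the single `min_copies` threshold applied to every unit length, and the 12 bp upper limit on unit length are assumptions chosen for the example; real analyses usually apply different copy-number thresholds per unit length and tolerate imperfect repeats.

```python
def is_primitive(unit):
    """Return True if `unit` is not a shorter motif repeated (e.g. 'ATAT' is 'AT' x 2)."""
    n = len(unit)
    return not any(n % d == 0 and unit[:d] * (n // d) == unit for d in range(1, n))


def find_tandem_repeats(seq, max_unit=12, min_copies=3):
    """Find perfect tandem repeats; returns sorted (start, unit, copies, kind) tuples.

    Assumed thresholds: unit lengths 1..max_unit, at least min_copies copies.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_unit + 1):
        i = 0
        while i + k * min_copies <= len(seq):
            unit = seq[i:i + k]
            # Skip ambiguous bases and units that are really a shorter motif.
            if set(unit) <= set("ACGT") and is_primitive(unit):
                copies = 1
                while seq[i + copies * k:i + (copies + 1) * k] == unit:
                    copies += 1
                if copies >= min_copies:
                    # Classification by unit length, as defined in the text.
                    kind = "microsatellite" if k <= 6 else "minisatellite"
                    hits.append((i, unit, copies, kind))
                    i += copies * k
                    continue
            i += 1
    return sorted(hits)


def repeat_fraction(seq, hits):
    """Fraction of sequence positions covered by at least one repeat."""
    covered = set()
    for start, unit, copies, _kind in hits:
        covered.update(range(start, start + copies * len(unit)))
    return len(covered) / len(seq)


if __name__ == "__main__":
    # Hypothetical toy sequence, not data from the thesis.
    demo = "GGC" + "AT" * 6 + "CCG" + "A" * 8 + "GTC" + "ACGTTGCA" * 3 + "TT"
    found = find_tandem_repeats(demo)
    for hit in found:
        print(hit)
    print("repeat fraction:", repeat_fraction(demo, found))
```

Counting covered positions rather than summing repeat lengths avoids double counting when a short repeat lies inside a longer repeat unit.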

【Abstract】 Tandem repeat sequences, which include microsatellites and minisatellites, are highly repetitive sequences that are widely distributed in the genomes of all eukaryotes and some prokaryotes. In general, the repeat unit of a microsatellite is 1 to 6 bp long, and that of a minisatellite ranges from 7 bp to more than 200 bp. This study had three objectives: first, to characterize the composition and distribution of tandem repeats in the genome of the Chinese shrimp Fenneropenaeus chinensis by analyzing genomic sequences obtained by random sequencing; second, to screen for polymorphic SSR markers; and third, to use these SSR markers to identify individuals and parentage in F. chinensis. The detailed contents and results are as follows.

1. For the first time, the distribution and frequency of microsatellite and minisatellite repeats were studied at the whole-genome level. A total of 3 888 microsatellites and 700 minisatellites were found in random genomic sequences with a cumulative length of 2 597 000 bp, about 1.23‰ of the entire genome. The cumulative length of the tandem repeats was 305 555 bp, or 11.72% of the total sequence length, of which microsatellites accounted for 232 979 bp, or 8.97% of the total. This proportion was greater than that in other organisms such as human and mosquito. The relative abundance of each repeat type by count was similar to its relative abundance by cumulative length. Among microsatellites, dinucleotide repeats were the most abundant, followed by trinucleotide, tetranucleotide, mononucleotide, hexanucleotide and pentanucleotide repeats; among minisatellites, repeats with 12-nucleotide units were the most common. The dominant motif also differed between repeat types: A among mononucleotide repeats, AT among dinucleotide repeats, AAT among trinucleotide repeats, and so on. Pentanucleotide repeats, however, had fewer repeat sequences and fewer motif classes than tetranucleotide and hexanucleotide repeats. Furthermore, repeat types with prime-number unit lengths (such as 7, 11 and 13) had markedly fewer repeat sequences and motif classes than the repeat types of adjacent unit lengths. This pattern may point to an evolutionary relationship between microsatellites and minisatellites: many of the long repeat units that make up minisatellites might derive from microsatellite repeat units, especially mononucleotide, dinucleotide and trinucleotide units. The frequency distribution of dinucleotide repeats by copy number was concentrated slightly left of center (resembling a normal distribution), and differed notably from that of the other repeat types, in which fewer sequences were seen with increasing copy number.

2. Twelve primer pairs that amplified highly polymorphic products were screened from 1 900 random clone sequences. Analysis of the core repeat types of the highly polymorphic microsatellite loci indicated that most polymorphic loci were dinucleotide repeats, and that polymorphism was correlated to some extent with the copy number of the core repeat unit, although the correlation was not significant.

3. To use the microsatellite primer pairs efficiently, multiplex PCR was first established in F. chinensis, and multiplex PCR with GeneScan analysis was then used to identify individuals and parentage. Finally, building on the established diplex and triplex PCR systems, the triplex PCR was successfully applied to identify, from four candidate males, the two males that had mated with two females; to correctly distinguish the offspring among seven half-sibs, seventeen full-sibs, and thirty-two full-sibs in total, respectively; and to assign any offspring to its corresponding parents.
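Parentage identification with codominant microsatellite markers rests on Mendelian inheritance: at every locus an offspring carries one allele from its mother and one from its father. A minimal Python sketch of this exclusion logic is given here. It is an illustration only, not the GeneScan workflow of the thesis; the locus names, allele sizes and function names are hypothetical, and the check ignores null alleles, mutation and genotyping error.

```python
def compatible(offspring, dam, sire):
    """True if, at every locus, one offspring allele can come from the dam
    and the other from the sire (simple Mendelian exclusion)."""
    for locus, (a, b) in offspring.items():
        dam_alleles = set(dam[locus])
        sire_alleles = set(sire[locus])
        if not ((a in dam_alleles and b in sire_alleles)
                or (b in dam_alleles and a in sire_alleles)):
            return False
    return True


def candidate_sires(offspring, dam, sires):
    """Names of candidate males that are not excluded as the father."""
    return [name for name, genotype in sires.items()
            if compatible(offspring, dam, genotype)]


if __name__ == "__main__":
    # Hypothetical genotypes: locus -> (allele size in bp, allele size in bp).
    dam = {"L1": (150, 154), "L2": (200, 210), "L3": (98, 98)}
    sires = {
        "M1": {"L1": (152, 156), "L2": (204, 204), "L3": (100, 102)},
        "M2": {"L1": (150, 158), "L2": (206, 210), "L3": (98, 104)},
        "M3": {"L1": (160, 160), "L2": (200, 202), "L3": (98, 100)},
        "M4": {"L1": (154, 162), "L2": (208, 212), "L3": (96, 98)},
    }
    offspring = {"L1": (154, 158), "L2": (200, 206), "L3": (98, 104)}
    print(candidate_sires(offspring, dam, sires))
```

With only three loci, as in a triplex PCR, more than one candidate may remain unexcluded when parents share alleles, so the resolving power depends on how polymorphic the chosen loci are.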

  • 【Classification No.】 S917.4; Q78
  • 【Citations】 10
  • 【Downloads】 519