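Genotyping of Rice Recombinant Inbred Lines and Comparative Genome Sequence Analysis of Cultivated Rice and the Medicinal Wild Rice Oryza officinalis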

High-Throughput Genotyping of Rice Recombinant Inbred Lines (RILs) and Comparative Analysis of Oryza AA and CC Genomes

【Author】 Feng Qi (冯旗)

【Advisors】 Lin Zhixin (林志新); Han Bin (韩斌)

【Author information】 Shanghai Jiao Tong University, Biochemistry and Molecular Biology, 2010, PhD

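【Summary】 With the development of biotechnology, second-generation sequencing technologies have made up for many shortcomings of traditional biological methods. The steadily growing number of genome sequences creates an opportunity to redesign genotyping strategies for more effective genetic map construction and genome analysis. In this study we used the Solexa high-throughput genome analysis system, for the first time, to develop a bar-coded sequencing method that genotypes recombinant inbred lines from low-coverage whole-genome sequences of rice. By detecting single nucleotide polymorphisms between the low-coverage whole-genome sequences of 150 recombinant inbred line individuals and their parents, we designed a sliding-window method that detects the genome-wide genotype distribution and recombination breakpoints of each individual and calls its genotype. Using this method, we constructed genetic maps for 150 rice recombinant inbred lines; the genotype calling accuracy reached 99.94%, and recombination breakpoints were resolved to within 40 kb on average. Compared with the rice genetic map we previously constructed from 287 genetic markers using conventional PCR amplification, the Solexa high-throughput sequencing method was 20 times more efficient in data acquisition and 35 times more precise in identifying recombination breakpoints. Using the genetic map built by whole-genome sequencing, we mapped a quantitative trait locus controlling plant height on rice chromosome 1 to a 100 kb region. Computer simulation shows that the whole-genome-sequencing-based genotyping method is well suited to genetic mapping in many kinds of populations and can also be applied to species with larger genomes and lower genetic polymorphism. As sequencing technology continues to advance, this method may replace the conventional marker-based PCR amplification approach and become a powerful tool for large-scale gene discovery and for addressing a wide range of biological questions.

Comparative analysis of genome structure and sequence among closely related plant species can inform studies of plant gene function and evolution. In this study we analyzed 103,844 BAC end sequences (equivalent to ~73.8 Mb of genomic sequence) from the medicinal wild rice Oryza officinalis (CC genome type) and compared the structural features of the CC genome with those of the cultivated japonica rice Nipponbare genome (AA genome type). More than 45% of the O. officinalis genome sequence consists of repeats, higher than the proportion in Nipponbare (~38.87%). To further understand the structural differences between the AA and CC genomes, including differences in gene structure and genome size, we selected BAC contigs from two collinear segments of the AA and CC genomes for finished-quality sequencing. Of the 57 genes predicted in the collinear AA genome segments, 39 have homologs in the CC genome. Alignment of the collinear segments also showed that the CC genome has expanded through transposon insertions and other mechanisms in both genic and intergenic regions, making it larger than the AA genome. Retrotransposon insertions are especially prominent in the CC genome: RNA transposable elements account for 17.95% and 1.78% of the CC and AA genomes, respectively, which largely explains why the collinear segment is nearly 100 kb longer in the CC genome than in the AA genome.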

【Abstract】 With the development of biotechnology, next-generation sequencing technology compensates for many shortcomings of conventional biological methods. Coupled with the growing number of genome sequences, it opens the opportunity to redesign genotyping strategies for more effective genetic mapping and genome analysis. We developed, for the first time, a bar-coded sequencing strategy and a high-throughput genotyping method for recombinant populations, using the Illumina Genome Analyzer to generate low-coverage whole-genome sequences of rice. After detecting single nucleotide polymorphisms (SNPs) between the recombinant inbred lines (RILs) and their parents, we designed a sliding-window approach that examines genome-wide SNPs collectively for genotype calling and recombination breakpoint determination. Using this method, we constructed a genetic map for 150 rice RILs with an expected genotype calling accuracy of 99.94% and recombination breakpoints resolved to within 40 kb on average. Compared with the genetic map constructed for the same rice population with 287 PCR-based markers, the sequencing-based method was approximately 20 times faster in data collection and 35 times more precise in recombination breakpoint determination. Using the sequencing-based genetic map, we located a quantitative trait locus of large effect on plant height in a 100 kb region containing the rice 'green revolution' gene. Through computer simulation, we demonstrate that the method is robust for different types of mapping populations derived from organisms with genome sequences of variable quality, and feasible for organisms with large genomes and low polymorphism. With continuous improvement of sequencing technologies, this genome-based method may replace the traditional marker-based genotyping approach and provide a powerful tool for large-scale gene discovery and for addressing a wide range of biological questions.

Comparative analyses of genome structure and sequence in closely related species have yielded insights into the evolution and function of plant genomes. A total of 103,844 BAC end sequences, representing ~73.8 Mb of the genome of O. officinalis, which belongs to the CC genome type of the genus Oryza, were obtained and compared with the genome sequence of the rice cultivar O. sativa ssp. japonica cv. Nipponbare. We found that more than 45% of the O. officinalis genome consists of repeat sequences, a higher proportion than in Nipponbare. To further investigate the evolutionary divergence of the AA and CC genomes, two BAC contigs of O. officinalis were compared with the collinear genomic regions of Nipponbare. Of 57 genes predicted in the orthologous regions of the AA genome, 39 had orthologs in the corresponding regions of the CC genome. Alignment of the orthologous regions indicated that the CC genome has expanded in both genic and intergenic regions, primarily through retroelement insertion. In particular, the density of RNA transposable elements was 17.95% in O. officinalis and 1.78% in O. sativa. This explains why the orthologous region is about 100 kb longer in the CC genome than in the AA genome.
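The sliding-window genotype calling mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes each SNP in a line has already been assigned to a parent ("A" or "B"), and the window size (15 SNPs), the majority threshold (11 of 15), the "H" label for mixed windows, and the function names `call_windows` and `find_breakpoints` are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def call_windows(snps: List[str], window: int = 15, min_major: int = 11) -> List[str]:
    """Call one genotype per sliding window of consecutive SNPs.

    `snps` holds one parental-allele call per SNP: "A" (parent 1) or "B" (parent 2).
    A window is called "A" or "B" when at least `min_major` of its SNPs agree,
    otherwise "H" (mixed or undetermined). Window size and threshold are
    illustrative assumptions, not values taken from the abstract.
    """
    if len(snps) < window:
        raise ValueError("need at least `window` SNPs")
    calls = []
    b_count = snps[:window].count("B")
    for start in range(len(snps) - window + 1):
        if start > 0:
            # Update the count incrementally as the window slides by one SNP.
            b_count -= snps[start - 1] == "B"
            b_count += snps[start + window - 1] == "B"
        a_count = window - b_count
        if a_count >= min_major:
            calls.append("A")
        elif b_count >= min_major:
            calls.append("B")
        else:
            calls.append("H")
    return calls


def find_breakpoints(calls: List[str]) -> List[Tuple[int, str, str]]:
    """Return (window_index, previous_call, new_call) wherever the call changes."""
    return [
        (i, calls[i - 1], calls[i])
        for i in range(1, len(calls))
        if calls[i] != calls[i - 1]
    ]
```

Because each call pools many SNPs, an isolated miscalled SNP inside a long parental block does not change the window genotype. That is the reason for examining SNPs collectively instead of one at a time when coverage is low.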
