Characteristics, Function and Genetic Mapping of EST-SSRs Related to Fiber Development in Gossypium

(Original title: 棉纤维发育相关EST-SSR的特征、功能及其定位)

【Author】 Han Zhiguo (韩志国)

【Advisor】 Zhang Tianzhen (张天真)

【Author information】 Nanjing Agricultural University, Genetics, 2006, PhD

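【Abstract, translated from the Chinese】 To increase the number of microsatellite markers available for constructing a saturated molecular genetic map of cotton, and to apply functional genomics to elucidating the mechanism of cotton fiber development and to cotton genetics and breeding, this study developed SSR primers from a large number of fiber-development-related ESTs of Gossypium arboreum and G. hirsutum, and analyzed the characteristics and functions of the SSRs, their polymorphism between G. hirsutum and G. barbadense, and their distribution in the At and Dt subgenomes of tetraploid cotton.

From 1,187 ESTs derived from a cDNA library of G. arboreum fibers at 7~10 days post anthesis, 763 EST-SSR primer pairs were designed; 605 of the underlying ESTs contained simple SSRs and 158 contained compound SSRs. Among the 605 ESTs, hexanucleotide and trinucleotide repeats were the most frequent, accounting for 36% and 31.3% respectively, followed by dinucleotide (20.4%), pentanucleotide (7.9%) and tetranucleotide (4.6%) repeats. Among the repeat motifs of the 605 ESTs containing simple SSRs, AT/TA was the most frequent dinucleotide repeat, at 12.6%. Of the 763 EST-SSR primer pairs, 687 (90%) amplified products in the allotetraploids G. hirsutum TM-1 and G. barbadense Hai7124. Of these, 120 primer pairs produced polymorphic banding patterns between the two materials and were used for genetic mapping. The 120 EST-SSR primer pairs yielded 143 polymorphic loci in total, 135 of which were integrated by mapping software into the existing genetic map of 511 SSR loci. The distribution of these 135 loci was not random: 84 were located in the At subgenome and 51 in the Dt subgenome.

Random sequencing of cDNA libraries constructed from fibers of line 7235 at 5~25 days, and from ovules at 0~5 days post anthesis and fibers at 3~22 days of Xuzhou142, yielded 13,505 ESTs, or 5,811 after removal of redundancy. Of these, 966 contained one or more microsatellites (SSRs). Applying the EST-SSR primer design criteria gave 489 EST-SSR primer pairs. In this set of ESTs, trinucleotide repeats were the most frequent (59.1%), followed by dinucleotide (30%), tetranucleotide (6.4%) and hexanucleotide (2.7%) repeats; pentanucleotide repeats were the least frequent (1.8%). Among all repeat motifs, AT/TA had the highest proportion, about 18.4%, followed by CTT/GAA (5.3%), AG/TC (5.1%), AGA/TCT (4.9%), AGT/TCA (4.5%) and AAG/TTC (4.5%). Of the 489 primer pairs, 114 were polymorphic between the mapping parents TM-1 and Hai7124, producing 130 polymorphic loci. Of these, 129 loci were integrated into the existing genetic map, 66 in the At subgenome and 63 in the Dt subgenome.

The 32,190 ESTs from -3~3 dpa ovules of the G. hirsutum genetic standard line TM-1, downloaded from the GenBank dbEST database, were reduced to 12,463 non-redundant ESTs, from which 454 EST-SSR primer pairs were designed using SSRIT and Primer3. Among the repeat motifs of these ESTs, hexanucleotide and trinucleotide repeats were the most frequent, at 47.3% and 38.8% respectively; di-, tetra- and pentanucleotide repeats accounted for 7.5%, 3.2% and 3.2%. Among all repeat motifs, the trinucleotide AGA/TCT had the highest proportion, about 4.3%. AG/TC was the most frequent dinucleotide repeat (2.6%) and AAAC/TTTG the most frequent tetranucleotide repeat (about 1.5%); among pentanucleotide and hexanucleotide repeats, CCCAA and CCACCT/GGTGGA were the most frequent, at 0.4% and 1.1% respectively. When the 454 primer pairs were used to amplify TM-1 and Hai7124, the parents of the interspecific BC1 mapping population, 84 pairs produced polymorphic bands suitable for mapping. These 84 primer pairs detected 90 polymorphic loci, of which 84 (including 8 loci with segregation distortion) were integrated into the framework genetic map of the mapping population. The 84 loci were distributed over 26 chromosomes, 40 in the At subgenome and 44 in the Dt subgenome.

Of 616 MUSS/MUCS EST-SSR primer pairs, 22 produced polymorphic bands, giving 23 polymorphic loci. Of these, 22 loci (including 3 with segregation distortion) were integrated into the framework genetic map, with 11 polymorphic loci each in the At and Dt subgenomes of the tetraploid cotton genome.

From 87,154 EST sequences of G. arboreum and G. hirsutum, 39,507 non-redundant sequences with a total length of 28,005.9 kb were obtained using the blastclust software. These contained 2,146 SSRs consisting of di- to hexanucleotide repeats of at least 18 bp, an average of one SSR per 13.05 kb. Among all repeat types, hexanucleotide repeats were the most frequent (36.8%), and AT/TA was the most frequent motif (7.4%).

Among the 386 marker loci obtained in this study, 24 EST-SSR primer pairs amplified duplicated loci located on the corresponding homoeologous chromosomes, and 16 EST-SSR primer pairs produced duplicated loci on non-homoeologous chromosomes or on the same chromosome. The integrated map contains 1,052 loci and spans 6,321 cM in total, with an average distance of 6.0 cM between adjacent loci. Chromosomes A3, A11, D1 and D11 each comprise two linkage groups. The number of markers per chromosome ranges from 22 to 58, and the map length per chromosome from 144.5 to 383.5 cM. Of the 43 EST-SSR markers with segregation distortion integrated into the genetic map, 22 were skewed toward the parent TM-1 and 21 toward Hai7124.

Using the Blast2GO software, ESTs of known function were classified into three main categories: cellular component, molecular function and biological process. In the cellular component category, most were assigned to cell (41%) and organelle (36%); in the molecular function category, catalytic activity and binding accounted for about 22% and 24% respectively; in the biological process category, most were assigned to physiological process (38%) or cellular process (36%). Some genes related to carbohydrate metabolism, transcription factors and signal transduction were mapped to chromosomes: for example, sucrose synthase was mapped to chromosome A6, E6 to chromosomes A5 and D5, and MYB60 to chromosome A13. In addition, a salt-tolerance protein gene, NAU920, was mapped to chromosome A8.

Many kinds of molecular markers have been developed for cotton genetic mapping, but relatively few SNP markers have been developed from cloned cotton genes. The sequence of the gene FbL2A was amplified by PCR from G. hirsutum TM-1 and G. barbadense Hai7124, and the single nucleotide polymorphisms (SNPs) between the two materials were analyzed. Based on the FbL2A sequencing results for TM-1 and Hai7124, analysis with the NEBcutter software led to the choice of the restriction enzyme BstUI to digest the amplified products, and FbL2A was mapped to chromosome D2 with the Mapmaker v3.0 mapping software. FIF1 is a gene preferentially expressed in cotton fibers, and studies indicate that it may play an important regulatory role in fiber development. Specific primers were designed from the published G. arboreum FIF1 gene and used to clone the FIF1 genes of G. hirsutum TM-1 and G. barbadense Hai7124. Based on the SNP variation between the FIF1 genes of the two tetraploid cottons, the gene was mapped to chromosome A8 using the SNAP and CAPS methods. The results show that developing SNP markers in cotton is feasible and effective. The development of EST-SSR and SNP markers will make it easier to apply ESTs or genes to cotton map construction, QTL mapping and cloning, and evolutionary and comparative genomic studies of diploid and tetraploid cotton.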
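The SSR screen described above (di- to hexanucleotide repeats of at least 18 bp, one SSR per 13.05 kb) can be illustrated with a short sketch. The study itself used SSRIT; the regular-expression search below is only a stand-in, and the function names, the per-motif minimum repeat counts derived from the 18 bp threshold, and the example sequence are all assumptions made for illustration.

```python
import math
import re

MIN_SSR_LEN = 18  # minimum SSR length in bp, as stated in the abstract


def is_primitive(motif):
    """Return False if the motif is itself a repeat of a shorter unit (e.g. ATAT)."""
    n = len(motif)
    for d in range(1, n):
        if n % d == 0 and motif[:d] * (n // d) == motif:
            return False
    return True


def find_ssrs(seq, min_len=MIN_SSR_LEN):
    """Find perfect di- to hexanucleotide repeats of at least min_len bp.

    Returns a sorted list of (start, motif, repeat_count) tuples.
    Hypothetical helper; not the SSRIT algorithm.
    """
    seq = seq.upper()
    hits = []
    for k in range(2, 7):
        # Smallest repeat count whose total length reaches min_len.
        min_repeats = math.ceil(min_len / k)
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def kb_per_ssr(total_kb, ssr_count):
    """Average sequence length (kb) per SSR."""
    return total_kb / ssr_count


# Made-up example sequence: an (AT)9 and an (AAG)6 repeat with short flanks.
example = "GGC" + "AT" * 9 + "CCT" + "AAG" * 6 + "T"
print(find_ssrs(example))
# Figures reported in the abstract: 28,005.9 kb and 2,146 SSRs.
print(round(kb_per_ssr(28005.9, 2146), 2))
```

The density call reproduces the reported figure of one SSR per 13.05 kb. The motif reported by this sketch depends on the reading frame in which a repeat is first encountered, so complementary or shifted motifs (for example AAG versus GAA) would need to be merged before tabulating motif frequencies.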

【Abstract】 To increase the number of microsatellites available for constructing a genetic map, and to facilitate the use of functional genomics in elucidating fiber development and in cotton breeding, we sampled microsatellite sequences from expressed sequence tags (ESTs) transcribed during fiber development in the A-genome species Gossypium arboreum and the AD-genome species Gossypium hirsutum, and evaluated their characteristics, putative function, level of polymorphism and distribution in the At and Dt subgenomes of tetraploid cotton.

From 1,187 ESTs derived from a cDNA library of G. arboreum fibers at 7~10 days post anthesis (dpa), 605 containing simple sequence repeats (SSRs) and 158 containing complex sequence repeats (CSRs) were identified; 763 EST-SSR primer pairs were developed, and 687 (90%) amplified PCR products from allotetraploid cotton (G. hirsutum cv. TM-1 and G. barbadense cv. Hai7124). Among the 605 SSR-ESTs, hexanucleotides (36%) were the most abundant motif type, followed by trinucleotides (31.3%), dinucleotides (20.4%), pentanucleotides (7.9%) and tetranucleotides (4.6%). AT/TA (12.6%) was the most frequent repeat among all motif types. However, only 120 (17.4%) of the 687 were polymorphic and segregating in our interspecific BC1 mapping population [(TM-1×Hai7124)×TM-1]. One hundred and thirty-five of the 143 loci detected with these 120 EST-SSRs were integrated into our backbone map of 511 SSR loci. The distribution of the EST-SSRs appeared to be non-random, since linkage tests anchored 84 loci to the At and 51 to the Dt subgenome of allotetraploid cotton.

From 13,505 ESTs developed from cDNA libraries of 7235 fibers at 5~25 dpa and of Xuzhou142 ovules at 0~5 dpa and fibers at 3~22 dpa, 5,811 were non-redundant and 966 contained one or more SSRs. From these, 489 EST-SSR primer pairs were developed. Among the EST-SSRs, 59.1% were trinucleotides, followed by dinucleotides (30%), tetranucleotides (6.4%), hexanucleotides (2.7%) and pentanucleotides (1.8%). AT/TA (18.4%) was the most frequent repeat, followed by CTT/GAA (5.3%), AG/TC (5.1%), AGA/TCT (4.9%), AGT/TCA (4.5%) and AAG/TTC (4.5%). One hundred and thirty EST-SSR loci were produced from 114 informative EST-SSR primer pairs, which were polymorphic between our two mapping parents. Of these, 129 were integrated into our allotetraploid cotton genetic map, 66 in the At subgenome and 63 in the Dt subgenome.

From 32,190 ESTs generated from -3~3 dpa ovules of G. hirsutum cv. TM-1 and downloaded from GenBank dbEST, 12,463 non-redundant ESTs were obtained with the blastclust software, and 454 EST-SSR primer pairs were developed with the software SSRIT and Primer3. Hexanucleotides and trinucleotides were observed at the highest frequencies, 47.3% and 38.8% respectively, followed by di-, tetra- and pentanucleotides at 7.5%, 3.2% and 3.2% respectively. AGA/TCT (4.3%) was the most frequent repeat. AG/TC (2.6%) was the most frequent motif among dinucleotides, AAAC/TTTG (1.5%) among tetranucleotides, CCCAA (0.4%) among pentanucleotides and CCACCT/GGTGGA (1.1%) among hexanucleotides. Eighty-four primer pairs were polymorphic between the parents TM-1 and Hai7124 and produced 90 loci. Eighty-four loci, including 8 distorted loci, were integrated into the cotton genetic map, 40 in the At subgenome and 44 in the Dt subgenome.

Of 616 MUSS/MUCS EST-SSR primer pairs, 22 produced polymorphic bands between TM-1 and Hai7124. Twenty-two loci, including 3 distorted loci, were integrated into the genetic map, with 11 loci each in the At and Dt subgenomes.

From a total of 87,514 ESTs developed from G. arboreum and G. hirsutum, we obtained 39,507 non-redundant ESTs, in which the SSRIT software found 2,146 SSRs, or one SSR per 13.05 kb. Hexanucleotides were the most abundant motif type (36.8%), and AT/TA occurred at the highest frequency (7.4%).

Among the 386 loci in this study, 24 EST-SSRs produced duplicated loci distributed on the corresponding homoeologous chromosomes, and 16 EST-SSRs produced duplicated loci on non-homoeologous chromosomes. The final genetic map contained 1,052 loci on 26 chromosomes, with a total genetic distance of 6,321 cM (an average of 6.0 cM between loci). Chromosomes A3, A11, D1 and D11 each contain two linkage groups. The number of loci per chromosome ranges from 22 to 58, and the genetic distance per chromosome from 144.5 to 383.5 cM. There were 43 distorted EST-SSR loci, 22 skewed toward TM-1 and 21 toward Hai7124.

All ESTs were categorized into three main classes, namely Molecular Function, Biological Process and Cellular Component, using the Blast2GO software online. Within Cellular Component, 41% were classified as cell and 36% as organelle. Within Molecular Function, Catalytic Activity and Binding accounted for 22% and 24% respectively. Within Biological Process, 38% belonged to Physiological Process and 36% to Cellular Process.

Some genes involved in carbohydrate metabolism, transcription factors and signal transduction were mapped. For instance, sucrose synthase was mapped to A6, E6 to A5 and D5, and MYB60 to A3. A gene encoding a salt-tolerance protein was mapped to A8.

Many kinds of molecular markers are used in cotton genetic mapping, but there are few reports on the development of SNP markers from cotton genes. FbL2A genes were isolated from TM-1 and Hai7124 by a PCR-based method. Based on single nucleotide polymorphisms (SNPs) in the FbL2A gene sequences of TM-1 and Hai7124, the amplified products were digested with BstUI, selected using the NEBcutter software online, and the gene was mapped to D2 with the Mapmaker v3.0 mapping software.

FIF1 is predominantly expressed early in developing cotton fibers and has been identified as a key regulator of cotton fiber development. In this study, FIF1 genes from G. hirsutum cv. TM-1 and G. barbadense cv. Hai7124 were cloned on the basis of the published sequence from G. arboreum L. Based on the SNPs between the FIF1 genes of the two tetraploid cottons, SNAP and CAPS markers were developed, and the gene was mapped to A8. This result indicates that SNP marker development is feasible and effective in Gossypium.

These EST-SSR and SNP markers can be used in genetic mapping, identification of quantitative trait loci (QTLs), and comparative genomics studies of diploid and tetraploid cotton.
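The At/Dt locus counts reported in the abstract can be compared against an even split with a chi-square goodness-of-fit test. This is a minimal sketch and not necessarily the test the author applied: the abstract does not name one, the 1:1 expectation between subgenomes is an assumption, and the function name is hypothetical.

```python
import math


def chi_square_equal_split(at_loci, dt_loci):
    """Chi-square goodness-of-fit test of At/Dt counts against a 1:1 split.

    Returns (chi2, p_value) with one degree of freedom. For one degree of
    freedom the chi-square survival function equals erfc(sqrt(chi2 / 2)).
    The 1:1 expectation is an assumption, not a statement from the abstract.
    """
    expected = (at_loci + dt_loci) / 2
    chi2 = ((at_loci - expected) ** 2 + (dt_loci - expected) ** 2) / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# At/Dt locus counts reported in the abstract for each primer set.
reported_counts = {
    "G. arboreum fiber ESTs": (84, 51),
    "7235 / Xuzhou142 ESTs": (66, 63),
    "TM-1 ovule ESTs": (40, 44),
    "MUSS/MUCS primers": (11, 11),
}

for label, (at_loci, dt_loci) in reported_counts.items():
    chi2, p = chi_square_equal_split(at_loci, dt_loci)
    print(f"{label}: chi2 = {chi2:.2f}, p = {p:.4f}")
```

Under this assumption only the 84 versus 51 split departs from 1:1 at the 0.01 level, which matches the abstract's remark that the loci from the G. arboreum set were not randomly distributed; the other three sets are close to even.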
