Construction of a cDNA Library from Cucumber Leaves Induced by Pseudoperonospora cubensis and Analysis of Expressed Sequence Tags (ESTs)

【Author】 Niu De (牛德)

【Supervisor】 Wang Lijuan (王丽娟)

【Author information】 Northeast Agricultural University (东北农业大学), Botany, 2010, Master's thesis

【Abstract】 Cucumber (Cucumis sativus L.) is one of the ten major vegetable crops cultivated worldwide, and China now ranks first in the world in both total planting area and total yield of cucumber. Cucumber downy mildew, one of the main foliar diseases in cucumber-growing regions around the world, seriously threatens cucumber production. Because few molecular studies of cucumber downy mildew exist and ESTs analysis offers clear advantages for functional genomics, constructing a cDNA library from cucumber leaves infected with Pseudoperonospora cubensis and analyzing its ESTs is valuable for studying gene expression during the interaction between cucumber and P. cubensis, for cloning downy mildew resistance genes, and for clarifying the resistance mechanism of cucumber.

In this study, ten commonly used RNA extraction methods were compared for isolating total RNA from different cucumber tissues (roots, stems, leaves and young fruit). The improved SDS method was then used to extract total RNA from leaves of the downy mildew-resistant cultivar '649' at 4 h, 8 h, 16 h, 24 h, 48 h and 72 h after inoculation with P. cubensis. The RNA samples were mixed in equal amounts and used as the reverse transcription template, and a full-length cDNA library representing the early stage after inoculation was constructed with the Creator™ SMART™ cDNA Library Construction Kit. Positive clones were picked at random from the library on a large scale and sequenced, and the resulting sequences were subjected to comprehensive bioinformatic analysis. The main results are as follows:

1. Cucumber is a plant from which RNA is easily extracted. All ten methods could generally recover RNA from the four tissues, although extraction quality differed among them. Overall, extraction from leaves and young fruit was better than extraction from roots and stems. The RNAplant Reagent method and the improved SDS method gave relatively good results for leaves; the Trizol method, the SDS method and the improved SDS method gave relatively good results for young fruit. RNA extracted from roots and stems by any of the ten methods was of low purity, although its integrity was largely preserved.
2. During library construction, the number of LD-PCR cycles was set at 21, and ligation worked best when cDNA and the pDNR-LIB vector were ligated at a ratio of 1.5:1. Quality checks showed a primary library titer of 5.5×10⁶ pfu·mL⁻¹ with a recombination rate of about 99%, and a titer of 6.5×10⁹ pfu·mL⁻¹ after amplification. Inserts ranged from 0.5 to 2.0 kb; most were around 1.0 kb, and the largest was about 2.0 kb.
3. A total of 3360 positive clones were picked at random from the primary plasmid cDNA library and sequenced, yielding 3091 ESTs. After SeqClean was used to remove vector sequences, repeat sequences and sequences shorter than 100 bp, 2903 high-quality ESTs remained. They ranged from 101 to 604 bp in length, with a mean length of 414.6 bp and a mean GC content of 42.44%.
4. Clustering and assembly of the 2903 ESTs with CAP3 produced 2507 non-redundant sequences (unigenes), comprising 211 contigs and 2296 singlets; non-redundant sequences made up 86.36% of all sequences. Unigene length ranged from 101 to 1426 bp, with a mean of 422.73 bp and a mean GC content of 38.21%; 53 unigenes were longer than 600 bp.
5. Expression abundance analysis showed that, among the 2507 unigenes, 18 were high-abundance genes (expression frequency ≥ 5), about 0.72% of the total, and 139 were medium-abundance genes (5 > expression frequency ≥ 2), about 7.69% of the total. The remaining low-abundance genes accounted for about 91.60%, indicating that most genes in cucumber leaves are expressed at low abundance.
6. Homology source analysis showed that 1653 of the 2507 unigenes (65.94%) had matching sequences in the NCBI non-redundant nucleotide database (nt). The homologous sequences came from more than 110 species, most often cucumber (Cucumis sativus, 23.29%), grape (Vitis vinifera, 13.43%), poplar (Populus, 13.19%) and soybean (Glycine max, 10.41%).
7. blastx searches of the 2507 assembled unigenes against the NCBI and Swiss-Prot non-redundant protein databases identified more than 500 kinds of proteins of known function, which fall into four classes by protein category. The 1547 unigenes with known or putative function were assigned protein functions using the UniProt database, and the results were used to construct a gene expression profile. Counting genes with more than one function, 2231 functional assignments were made, of which 427 (19.16%) were involved in disease resistance/defense and 56 (2.51%) in signal transduction.
8. The 2507 assembled unigenes were compared against the COG database by blastp. A total of 662 unigenes received functional annotation, covering 23 protein functions: 351 (53.01%) in information storage and processing, 207 (31.27%) in metabolism, 44 (6.65%) in cellular processes, 5 (0.76%) in defense, and 55 (8.31%) of unclear function.
9. The 2507 assembled unigenes were compared against the KEGG database. A total of 1851 unigenes received annotation, involving 44 non-redundant metabolic pathways. Pathway analysis showed that the annotated pathways were concentrated in the synthesis and metabolism of various amino acids and of organic compounds, and also included 3 signal transduction pathways, 3 disease pathways, 1 antigen processing pathway and 1 Vibrio cholerae infection pathway.
10. Comparison against the NCBI, Swiss-Prot, KEGG and COG databases showed that 468 unigenes had no match in any of the four databases; these were identified as new genes.
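
The summary statistics in results 3 to 5 (GC content, the share of non-redundant sequences, and the high/middle/low abundance classes) can be illustrated with a minimal Python sketch. It is not the pipeline used in the thesis, which relied on SeqClean and CAP3. The function names and the toy input are hypothetical, and it assumes that "expression frequency" means the number of ESTs assembled into a unigene and that GC content is computed over unambiguous bases only.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def abundance_class(n_ests: int) -> str:
    """Thresholds quoted in the abstract: high (>= 5), middle (2 to 4), low (1)."""
    if n_ests >= 5:
        return "high"
    if n_ests >= 2:
        return "middle"
    return "low"


def summarize(cluster_sizes):
    """cluster_sizes: ESTs per unigene (assumption: 1 for a singlet, >= 2 for a contig)."""
    n_unigenes = len(cluster_sizes)
    n_ests = sum(cluster_sizes)
    classes = Counter(abundance_class(n) for n in cluster_sizes)
    return {
        "unigenes": n_unigenes,
        "ests": n_ests,
        # Share of non-redundant sequences = unigenes / ESTs.
        "nonredundant_pct": round(100.0 * n_unigenes / n_ests, 2),
        "class_pct": {
            name: round(100.0 * classes[name] / n_unigenes, 2)
            for name in ("high", "middle", "low")
        },
    }


if __name__ == "__main__":
    toy_clusters = [7, 3, 2, 1, 1, 1, 1, 1]  # hypothetical data, not from the thesis
    print(summarize(toy_clusters))
    print(round(gc_content("ATGCGC"), 2))
```

Applied to the counts reported above, the same ratio gives round(100 * 2507 / 2903, 2) = 86.36, which matches the non-redundant share stated in result 4.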
