

Expressed Sequence Tag Analysis and Gene Expression Profile of Heading Leaf of Chinese Cabbage (Brassica rapa L. ssp. pekinensis)

【Author】 Dai Dapeng (戴大鹏)

【Supervisors】 Cao Mingqing (曹鸣庆); Ma Rongcai (马荣才)

【Author information】 Capital Normal University (首都师范大学), Botany, 2004, PhD

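【Abstract, translated from the Chinese】 Heading Chinese cabbage (Brassica rapa L. ssp. pekinensis) originated in China and is one of the most important vegetables in northern China. Under the combined action of long-term natural evolution and artificial selection, it developed the leaf head, an organ whose function is to store nutrients. Studying leaf head formation is therefore important for improving the yield and quality of Chinese cabbage, yet research on the heading mechanism is still at an early stage worldwide. To examine gene expression in heading-stage leaves at the genomic level, our laboratory had previously constructed a cDNA library from the wrapping leaves and head leaves of Chinese cabbage at the early heading stage. Further processing and analysis of this library gave the following main results.

By infecting the host strain DH10B with λZipLox phage, 2967 positive phage clones from the cDNA library were converted into plasmid clones, and plasmids were extracted from all of the transformants. Plasmids were picked at random for sequencing, giving 2802 good-quality reads, of which 1361 were sequenced in-house and 1441 were sequenced by the Shanghai Sangon (生工) and Dingguo (鼎国) biotechnology companies. An examination of the factors affecting the sequencing reaction showed that the purity and amount of the plasmid template were the key determinants of sequencing success.

Preliminary analysis found 268 sequences whose inserts were shorter than 150 bp or consisted entirely of polyA, and 162 empty-vector sequences with no insert at all. After removing these two classes, 2372 valid EST sequences remained for further analysis, 84.6% of all sequences. Their total length was 1088467 bp and the average length per EST was 459 bp, higher than the average lengths reported in EST studies of comparable species.

The SeqMan II module of the DNAStar software was used for contig analysis of the 2372 ESTs and yielded 1641 distinct contigs with a total length of 782616 bp and an average length of 477 bp. Of these, 1247 contigs consisted of a single EST (singletons), corresponding to 52.6% of the ESTs. Combined with the BLAST results, the number of non-redundant sequences was estimated at about 1499, so redundant sequences made up about 36.8% of all ESTs.

Comparison at the protein level against the NCBI non-redundant protein database (nr) with blastx showed that 1232 contigs, comprising 1810 ESTs, had homologs in the database, 76.3% of the valid ESTs. Sequences with an alignment score below 80 or without any homolog were then compared against the est_others database; a further 295 contigs, comprising 432 ESTs, found homologous ESTs, while the remaining 114 contigs, comprising 131 ESTs, still found none. These sequences can be regarded as ESTs reported for the first time.

In terms of species of origin, the great majority of the homologous proteins of known function came from Arabidopsis thaliana, yet after comparison against the est_others database only 51% of the best-matching ESTs came from Arabidopsis. This indicates that the gene expression patterns of Chinese cabbage and Arabidopsis differ considerably. The sequences with low similarity are of particular value for studying the distinctive developmental and metabolic pathways of Chinese cabbage and for genetic mapping.

ESTs whose homologs had a known function were classified by the biological role or biochemical function of the homologous protein, following the method used in Arabidopsis research. Genes involved in protein synthesis were the most numerous (22.5%), followed by those involved in energy metabolism ([?].7%), cell defense ([?].2%), signal transduction (7.1%), cell structure ([?].9%), primary metabolism (6.[?]%), and protein modification, processing and storage (6.4%). Compared with published functional classifications of leaf ESTs from several dicotyledonous plants, proteins involved in protein synthesis formed the largest share in the heading leaf of Chinese cabbage, whereas in the other dicot leaf EST studies proteins involved in energy metabolism were clearly dominant. This difference in expression pattern may be related to changes in the growth environment of the cabbage (such as low-temperature stress) and to the vigorous growth of the heading leaves.

The global cDNA amplification technique (polyA PCR) was further optimized by raising the primer and dNTP concentrations in the reaction mixture. Hybridization experiments showed that the amplification products of leaf cDNA from the late rosette stage and the mid heading stage preserved the relative concentrations of the starting mRNAs and could be used for subsequent hybridization analysis. Using universal primers flanking the vector insert, the plasmids of all 1438 selected cDNA clones were amplified by PCR. Grouped by gene function, they were spotted onto 16 pairs of nylon membranes with a Bio-Rad 96-well Bio-Dot apparatus, and all membranes were hybridized with labeled cDNA amplification products as probes.

Analysis of the hybridization results identified 152 spots whose expression differed clearly between rosette-stage and mid-heading-stage leaves: 94 were candidates for up-regulation at the heading stage and the other 58 were candidates for down-regulation. Among the candidate genes of known function, the up-regulated ones were concentrated in two pathways: the ubiquitin/26S proteasome pathway and the low-temperature stress response pathway. Changes in hormone levels within the plant at the heading stage and the fall in ambient temperature are presumed to be the main causes of the altered expression in these two pathways. Northern hybridization of two candidates, a metallothionein-like protein gene and a gene of unknown function, agreed with the dot-blot results: the former was up-regulated at the heading stage and the latter was down-regulated. The full-length cDNA of the metallothionein-like protein gene was also obtained, and Southern hybridization indicated that the gene is probably present as a single copy in the genome of the orange-red heart Chinese cabbage (桔红心白菜).

In summary, by analyzing expressed sequence tags (ESTs) from leaves of Chinese cabbage at the early heading stage, this study constructed a virtual gene expression profile of heading-stage leaves and used large-scale dot-blot hybridization to study gene expression in those leaves. The results provide theoretical guidance for research on the heading mechanism of Chinese cabbage and, in practice, should also benefit the molecular breeding of many other vegetables with a heading habit.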
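The summary statistics quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: every count is copied from the abstract, while the variable names and the two denominators marked as assumptions (redundancy and the singleton share taken relative to the 2372 usable ESTs) are not stated in the source and are inferred because they reproduce the reported percentages.

```python
# Counts quoted in the abstract.
sequenced = 2802             # good-quality sequencing reads
short_or_polya = 268         # inserts < 150 bp or polyA only
empty_vector = 162           # clones without a cDNA insert
total_est_bp = 1_088_467     # summed length of usable ESTs
contigs = 1641               # contigs from the assembly
total_contig_bp = 782_616    # summed contig consensus length
singletons = 1247            # contigs built from a single EST
nonredundant_estimate = 1499 # estimated non-redundant sequences
nr_hit_ests = 1810           # ESTs with a blastx homolog in nr
arabidopsis_top_hits = 1205  # ESTs whose best EST match is Arabidopsis


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


usable = sequenced - short_or_polya - empty_vector

summary = {
    "usable_ests": usable,
    "usable_share_pct": pct(usable, sequenced),
    "mean_est_length_bp": total_est_bp / usable,
    "mean_contig_length_bp": total_contig_bp / contigs,
    # Assumption: the 52.6% figure is singletons relative to usable ESTs.
    "singleton_share_pct": pct(singletons, usable),
    # Assumption: redundancy = (usable - non-redundant) / usable.
    "redundancy_pct": pct(usable - nonredundant_estimate, usable),
    "nr_hit_share_pct": pct(nr_hit_ests, usable),
    "arabidopsis_top_hit_pct": pct(arabidopsis_top_hits, usable),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the script reproduces the reported 2372 usable ESTs, the mean lengths of 459 bp and 477 bp, and the 52.6%, 36.8%, 76.3% and 51% figures. The usable share computes to about 84.65%, which the abstract gives as 84.6%.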

【Abstract】 China is a center of origin of Chinese cabbage (Brassica rapa L. ssp. pekinensis), one of the most important vegetables in the Far East. Under the combined effects of natural evolution and artificial selection, the leaf head has formed as a nutrient storage organ. Studying the mechanism of the heading process is of great practical importance, especially for increasing the yield and quality of Chinese cabbage. Research on the mechanism of head formation, however, is still at its very beginning. To explore the gene expression pattern of the heading leaf at the genomic level, a λZipLox cDNA library had previously been constructed from leaves of Chinese cabbage at the early heading stage. Our work analyzed that library further. The main results and conclusions were as follows.

After transfecting Escherichia coli DH10B with the λZipLox cDNA library, we obtained 2967 plasmid clones and extracted plasmid from all of them. In total, 2802 good-quality sequences were acquired, either on our own DNA sequencer (Beckman Coulter CEQ 8000 Genetic Analysis System) or with the help of two biotechnology companies, Shenggong and Dingguo. We also studied the factors influencing the sequencing reaction and found that high purity and an appropriate amount of plasmid template were vital for obtaining good sequencing results.

The sequencing results indicated that 268 clones had inserts shorter than 150 bp or consisting only of polyA, and that 162 clones had no cDNA insert. The sequences of the remaining 2372 clones, 84.6% of all sequencing results, were used for further analysis. The total length of the usable sequences was 1088467 bp and the average length per EST was 459 bp, longer than the averages reported in comparable EST studies.

The SeqMan II module of the DNAStar software was used for contig analysis of the 2372 ESTs, and a total of 1641 different contigs were obtained. The total length of the contig consensus sequences was 782616 bp, an average of 477 bases per contig. Further analysis revealed that 1247 contigs were singletons, each containing only one EST. Combining this with the later BLAST alignment results, we estimated that the real number of non-redundant sequences was around 1499, so redundant sequences accounted for about 36.8%.

When compared with the NCBI non-redundant protein (nr) database using the blastx program, 1232 contigs comprising 1810 ESTs shared significant similarity with sequences in the database. Sequences with a score below 80, or with no homolog in the nr database, were searched with the blastn program against the NCBI est_others database for homologous ESTs. The search showed that 295 contigs comprising 432 ESTs found homologs, while the remaining 114 contigs, accounting for 131 ESTs, still found no homologous ESTs. These ESTs may be sequences reported here for the first time.

Analysis of the source organisms of the homologous proteins showed that the majority of the proteins of known function came from Arabidopsis thaliana; at the nucleotide level, however, only 51% (1205/2372) of the ESTs showed their highest homology to ESTs from Arabidopsis thaliana. This result suggests that the gene expression profile of Chinese cabbage differs from that of Arabidopsis thaliana. Sequences with lower similarity to Arabidopsis nucleotide sequences could be especially useful for studying the unique developmental processes and metabolic pathways of Chinese cabbage, and they are also important for future genetic mapping.

For the ESTs whose homologous proteins had known functions, functional assignment was carried out following the method used in the Arabidopsis thaliana genome sequencing initiative. The results revealed that ESTs of genes involved in protein synthesis made up the highest proportion, 22.5% of all analyzed ESTs, followed in turn by the ESTs

