
Transcriptome Study of Aspergillus oryzae RIB40 Based on RNA-Seq Technology

【Author】 Wang Bin (王斌)

【Advisor】 Guo Yong (郭勇)

【Author information】 South China University of Technology, Fermentation Engineering, 2010, PhD

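【Summary】 Aspergillus oryzae is a valued strain in China's traditional brewing industry. It has a very strong capacity for protein expression and secretion and has important applications in food, beverages, cosmetics, and medicine. With the gradual establishment of genetic transformation systems, A. oryzae has become a highly promising expression system for heterologous recombinant proteins. Transcriptome research on A. oryzae has nevertheless progressed slowly: studies relying on EST and microarray technology are constrained by the limited detection range and sensitivity of those methods and can hardly reveal the functional complexity of the A. oryzae transcriptome. In addition, A. oryzae shows markedly different growth phenotypes in solid and liquid culture, which lead to differences in protein expression and secretion, and the causes remain unclear. Heterologous proteins are also expressed in A. oryzae at much lower levels than homologous proteins, for reasons that remain to be analyzed. To address these basic biological questions, the author used the recently developed RNA-Seq technology to characterize the A. oryzae transcriptome and studied the questions at the transcriptome level.

RNA samples from A. oryzae grown under different culture conditions were sequenced by RNA-Seq. The resulting sequence fragments (reads) covered the A. oryzae genome as much as 145-fold, and more than 90% of them were high-quality reads that mapped uniquely to the genome. A genome-wide transcription map of A. oryzae was drawn from the mapping of RNA-Seq reads to the genome and its genes. Based on RPKM, the normalized measure of gene transcription level, 11263 of the 12074 protein-coding genes in the A. oryzae genome were found to be transcribed, bringing the annotation rate of A. oryzae genes to 93.28%.

Transcript structure was analyzed across the genome using the RNA-Seq data. The 5' untranslated regions (5'UTRs) of 4198 genes and the 3' untranslated regions (3'UTRs) of 4357 genes were determined, and the UTR length distribution and its relationship to gene function were analyzed. Upstream open reading frames (uORFs) were identified in 1345 genes and upstream start codons (uATGs) in 272 genes. Genes with a uORF make up 11.14% of all A. oryzae protein-coding genes, twice the proportion in yeast. In addition, 1166 novel transcripts and 800 novel exons (located in 513 known genes) were found, and prediction with the Augustus software indicated that 700 of the novel transcripts may encode new proteins.

Alternative splicing (AS) of gene transcripts was also analyzed across the genome using the RNA-Seq data. In total, 1375 AS events were identified in 1032 A. oryzae genes. Genes with AS account for 8.55% of all A. oryzae genes, the largest number of AS events detected in any fungus to date. Retained intron (RI) is the most common form of AS in A. oryzae, at 91.56% of all events, whereas RI is a very rare form of AS in mammals; this distinguishes AS in A. oryzae from AS in higher organisms. From the ratio of the number of genes with RI to the number of genes with cassette exons (CE), and from the relationship between RI and CE lengths, it is inferred that intron definition (ID) is the main mechanism of splice-site recognition in A. oryzae alternative splicing.

Using RNA-Seq data from different culture conditions, gene expression was compared between the two basic culture modes, solid-state culture (SC) and liquid culture (LC). This analysis yielded 4628 differentially expressed genes (p-value < 0.001), of which 2355 were up-regulated in solid-state culture and 2273 were up-regulated in liquid culture. GO functional enrichment analysis and KEGG pathway analysis indicated that protein translation and modification, energy metabolism, and related functions are stronger in solid-state culture than in liquid culture. The protein secretion pattern of A. oryzae under endoplasmic reticulum (ER) stress was analyzed by comparing differentially expressed genes under ER stress induced by dithiothreitol (DTT) treatment. From the analysis of secretion regulation under ER stress, it is inferred that A. oryzae has a stronger capacity to regulate protein secretion in solid-state culture than in liquid culture.

This study used RNA-Seq to characterize the A. oryzae transcriptome in depth, examining gene structure, novel transcripts, metabolic pathways, alternative splicing, and protein expression and secretion. It provides reference data for basic biological research, genetic engineering, and industrial application of A. oryzae.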

【Abstract】 Aspergillus oryzae, a valued strain in the traditional Chinese brewing industry, has a powerful capacity for protein expression and secretion and has important applications in food, beverages, cosmetics, and medicine. With the development of its genetic transformation system, A. oryzae has become an attractive expression system for heterologous recombinant proteins. However, transcriptome research on A. oryzae has progressed slowly. Studies based on EST and microarray technology, limited by their depth of coverage and sensitivity, could not elucidate the functional complexity of the A. oryzae transcriptome. Furthermore, it is still unknown why growth phenotype and protein expression and secretion differ so markedly between solid-state culture (SC) and liquid culture (LC) of A. oryzae, and why heterologous genes are expressed in A. oryzae at lower levels than homologous genes. To address these fundamental issues, high-throughput RNA sequencing (RNA-Seq) was used to interrogate the A. oryzae transcriptome, with the aim of providing clues to the above problems at the transcriptome level.

RNA samples of A. oryzae cultured under different conditions were sequenced by RNA-Seq. The sequence fragments obtained (reads) amount to 145-fold coverage of the A. oryzae genome, and uniquely mapped reads account for more than 90% of all reads. A genome-wide transcriptome map of A. oryzae was drawn from the matches of reads to the A. oryzae genome and genes. Based on the RPKM value, 11263 (93.28%) of the 12074 A. oryzae protein-coding genes showed transcriptional activity.

The transcript structures of A. oryzae genes were analyzed on the genome scale based on the RNA-Seq data. The results defined or extended 5′UTRs for 4198 transcribed genes and 3′UTRs for 4357 transcribed genes in A. oryzae, and the length distribution of UTRs and their functional categories were studied. Upstream open reading frames (uORFs) were detected in 1345 genes and upstream initiation codons (uATGs) in 272 genes within the identified 5′UTRs. Genes with a uORF accounted for 11.14% of all A. oryzae protein-coding genes, twice the proportion in yeast. The results also identified 1166 novel transcripts and 800 new exons in 513 annotated genes. Of the 1166 novel transcripts, 700 were likely to be protein-coding according to prediction with the Augustus software.

Alternative splicing (AS) of A. oryzae gene transcripts was analyzed on the genome scale based on the RNA-Seq data. In total, 1375 AS events were detected in 1032 A. oryzae genes (8.55% of all A. oryzae genes), the highest number reported among fungi to date. Retained intron (RI) accounted for the largest share of AS events in A. oryzae (91.56%), whereas RI is a rare AS event in mammals. The predominance of RI is thus a feature that distinguishes AS in A. oryzae from AS in higher eukaryotes. A comparison of the numbers and length distributions of RIs and cassette exons (CEs) suggests that A. oryzae recognizes splice sites predominantly by the intron definition (ID) mechanism.

Genes differentially expressed between A. oryzae grown in solid-state culture (SC) and in liquid culture (LC) were analyzed based on the RNA-Seq data. There were 4628 differentially expressed genes (p-value < 0.001) between the two basic culture conditions, with 2355 up-regulated in SC and 2273 up-regulated in LC. GO functional enrichment analysis and KEGG pathway analysis of these genes indicated that the capacities for protein translation and modification and for energy metabolism were considerably greater in SC than in LC. By comparing the genes differentially expressed between SC and LC under ER stress (DTT treatment), a model of protein secretion in A. oryzae under ER stress was described. The regulation of protein secretion under ER stress was also analyzed, and the results indicated that the capacity of A. oryzae to regulate protein secretion was considerably greater in SC than in LC.

In this research the A. oryzae transcriptome was interrogated by RNA-Seq. The results provide information on gene structures, novel transcripts, metabolic pathways, alternative splicing events, and protein expression and secretion, which is valuable for fundamental biological research, genetic engineering, and industrial applications of A. oryzae.
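
The abstract relies on RPKM (reads per kilobase of transcript per million mapped reads) to decide which genes are transcribed, and it reports several proportions of the 12074 protein-coding genes. The following is a minimal sketch of the standard RPKM formula together with an arithmetic check of the reported percentages. The function names and the read counts in the usage note are hypothetical and chosen only for illustration; the abstract does not state the RPKM cutoff or the software used, so none is assumed here.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Standard RPKM: reads mapped to a gene, normalized by gene length
    (in kilobases) and by sequencing depth (in millions of mapped reads)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Gene counts taken from the abstract.
TOTAL_PROTEIN_CODING_GENES = 12074
transcribed_pct = percent(11263, TOTAL_PROTEIN_CODING_GENES)  # transcribed genes
uorf_pct = percent(1345, TOTAL_PROTEIN_CODING_GENES)          # genes with a uORF
as_pct = percent(1032, TOTAL_PROTEIN_CODING_GENES)            # genes with AS events
differential_total = 2355 + 2273                              # up in SC + up in LC
```

As a usage example with made-up numbers, `rpkm(1000, 2000, 10_000_000)` describes a 2 kb gene with 1000 mapped reads in a library of 10 million mapped reads. The three percentages computed above can be compared directly with the 93.28%, 11.14%, and 8.55% stated in the abstract, and `differential_total` with the 4628 differentially expressed genes.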
