Transcriptome Analysis of the Biceps Brachii of Small-tailed Han Sheep and Dorper Sheep and Structural Characteristics of the Myosin Light Chain Gene Family

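【Author】 Zhang Chunlan (张春兰)

【Advisor】 Wang Jianmin (王建民)

【Author information】 Shandong Agricultural University, Animal Genetics, Breeding and Reproduction, 2014, PhD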

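【Abstract (Chinese version)】 Sheep are an important source of mutton. As mutton prices have risen steadily, increasing meat yield and improving meat quality have become pressing problems. Studying the whole transcriptome and differential gene expression of skeletal muscle in sheep with different growth performance will help reveal the molecular mechanisms of muscle growth and development. In recent years, transcriptome sequencing (RNA-Seq) on second-generation high-throughput platforms has made it possible to obtain, quickly and comprehensively, almost all transcripts of a given tissue in a given state. In this study, the biceps brachii of Small-tailed Han sheep and Dorper sheep with markedly different growth performance was used as experimental material. A transcriptome library was constructed for each breed and sequenced on the Illumina HiSeq2000 high-throughput platform. The reads were aligned to the sheep reference genome and reference genes to characterize the two libraries comprehensively, including gene expression, gene annotation, differentially expressed genes, alternative splicing, novel transcripts, cSNPs and SSRs. Twelve genes selected at random from the data were quantified by relative qRT-PCR to verify the reliability of the high-throughput sequencing data. Full-length cDNA sequences of the MYL1, MYL2, MYL3 and MYL4 genes of Small-tailed Han and Dorper sheep were cloned by RACE and related methods. The cDNA sequences and the structural features of the encoded proteins were analyzed with GENSCAN and other bioinformatics software, and the mRNA expression profiles of these genes across sheep tissues were examined. The main results are as follows:

(1) RNA-Seq yielded 50,264,608 and 52,794,216 high-quality 90 bp reads from the Small-tailed Han and Dorper libraries, respectively. In each library, about two-thirds or more of the reads mapped to the sheep reference genome, and 42.77% and 33.10% of the reads, respectively, mapped to the 20,236 sheep reference genes. In sheep biceps brachii, only about 1% of the reference genes were highly expressed (RPKM≥1000), whereas more than 93% were low-expression genes (RPKM≤1000).

(2) A total of 1,300 differentially expressed genes (FDR≤0.001 and |log2Ratio|≥1) were found between the two libraries, of which 554 were up-regulated and 746 down-regulated in Small-tailed Han sheep. In GO functional annotation, 1,066, 1,137 and 1,067 genes were annotated to biological process, cellular component and molecular function terms, respectively. In addition, 1,152 differentially expressed genes were annotated to 240 KEGG pathways, among which the metabolic pathways (114 genes) and the actin cytoskeleton pathway (21 genes) contained the most genes. Combining the two sets of annotations, 31 differentially expressed genes were found to be related to muscle cell development and differentiation and to skeletal muscle growth. Protein interaction network analysis showed that, among the differentially expressed genes, the extracellular matrix structural components integrin ITGA8 and collagen COL4A1 had the most interactions with other proteins, whereas myosin, troponin and several regulators closely related to growth and development (IGFBP5, TGFBR3 and VEGF) had fewer interactions with other genes.

(3) The Small-tailed Han and Dorper libraries contained 40,481 and 38,851 potential cSNPs, respectively, as well as 4,721 SSRs. A→G was the predominant cSNP type and (AC)n the predominant SSR type. The cSNPs were located mainly on sheep chromosomes 1, 2 and 3.

(4) Alternative splicing occurred in 37.72% and 39.03% of the expressed genes in the Small-tailed Han and Dorper libraries, respectively, with A3SS as the main splicing type. Based on the sequencing reads, one or both ends of 6,989 reference genes were extended and refined. In total, 123,678 novel transcript units with an average length of 343 bp were found.

(5) Full-length cDNA sequences of the sheep MYL1, MYL2, MYL3 and MYL4 genes were cloned. MYL1 comprised two cDNA sequences of different lengths that differed at the 5′ end but shared the same 3′ end. The sequences were submitted to GenBank under accession numbers KJ700419, KJ710701, KJ710702, KJ710703, KJ710704, KJ710705, KJ710706, KJ710707 and KJ768855. For each gene, the cDNA sequences were more similar between the two breeds than to the predicted sequences in NCBI; the differences lay mainly in the length and sequence of the 5′ and 3′ ends.

(6) The sheep MYL1, MYL2, MYL3 and MYL4 genes each contained 5 exons. The 5′ flanking sequences upstream of their start codons contained numerous binding sites for the myocyte enhancer factor MEF2 and the myogenic regulatory factors MyoD and MyoG.

(7) The sheep MYL1a, MYL1b, MYL2, MYL3 and MYL4 proteins differed from the predicted sequences in NCBI by only 1 to 2 amino acids. Sequence analysis indicated that all were mildly acidic, hydrophilic proteins without signal peptides and with multiple glycosylation and phosphorylation sites. Their secondary structure consisted mainly of α-helix and random coil. For tertiary structure, human myosin light chain II was the best template, and the two ends formed symmetric barrel shapes. Clustering showed that sheep and goat, human and giant panda, squirrel monkey and macaque, mouse and brown rat, cattle and yak, and cat and dog were close to each other, whereas zebrafish and Xenopus were the most distant from the other vertebrates.

(8) The sheep MYL1 gene was expressed mainly in the longissimus dorsi. Of its two splice variants, MYL1a was expressed at a higher level in Dorper than in Small-tailed Han sheep, whereas MYL1b was expressed at a higher level in Small-tailed Han sheep than in Dorper. The MYL2, MYL3 and MYL4 genes were expressed mainly in cardiac muscle. MYL4 was expressed in the longissimus dorsi of Dorper but not in that of Small-tailed Han sheep. MYL1 and its alternative splice variants, together with MYL4, are associated with skeletal muscle growth in sheep.

In summary, the high-throughput sequencing analysis of the biceps brachii transcriptomes of Small-tailed Han and Dorper sheep provides data for further screening of genes related to muscle growth and development. The cloning and characterization of the cDNA sequences of the sheep myosin light chain gene family lay a foundation for later studies of gene function and regulatory mechanisms.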
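The expression measure used in result (1), RPKM (reads per kilobase of transcript per million mapped reads), can be illustrated with a minimal sketch. The abstract does not state the exact formula or software used, so the code below assumes the standard definition, RPKM = 10^9 × C / (N × L), where C is the number of reads mapped to a gene, N the total number of mapped reads and L the gene length in bp. The function names and the example numbers are hypothetical.

```python
def rpkm(gene_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM definition (assumed; the thesis abstract gives no formula)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (per kilobase) * 1e6 (per million mapped reads)
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


def is_highly_expressed(value: float, threshold: float = 1000.0) -> bool:
    """High-expression cutoff quoted in the abstract (RPKM >= 1000)."""
    return value >= threshold


if __name__ == "__main__":
    # Hypothetical gene: 2,000 reads, 1,000 bp long, 20 million mapped reads.
    value = rpkm(gene_reads=2000, gene_length_bp=1000, total_mapped_reads=20_000_000)
    print(value, is_highly_expressed(value))
```

Normalizing by both gene length and library size is what makes expression values comparable between genes and between the two libraries, which have different read totals.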

【Abstract】 Sheep are the main source of mutton. As mutton prices rise, improving mutton production and quality has become an important issue. Investigating gene expression in ovine muscle across breeds would significantly advance our understanding of muscle growth and development. RNA-Seq is a recently developed approach for transcriptome analysis based on high-throughput sequencing. In this experiment, two cDNA libraries were constructed from the biceps brachii of Dorper (DP) and Small-tailed Han sheep (SH), whose growth performance differed significantly. The two libraries were sequenced on the Illumina HiSeq2000 sequencing platform. After the reads were mapped to the O. aries genome and reference genes, the characteristics of the two transcriptomes were analyzed by bioinformatics methods, including gene expression, annotation, differentially expressed genes (DEGs), alternative splicing (AS), novel transcript units (nTUs), SNPs and SSRs. Twelve genes were tested by qRT-PCR to verify the reliability of the sequencing data. Then, with muscle of SH and DP as experimental material, the cDNAs of MYL1, MYL2, MYL3 and MYL4 were cloned using RACE methods. In addition, the structure, characteristics and mRNA expression profiles of these genes were analyzed. The main results were as follows:

(1) For the SH and DP libraries, a total of 50,264,608 and 52,794,216 clean reads were obtained, respectively. Approximately two-thirds of these reads could be mapped to the O. aries genome, and 42.77% and 33.10% of the reads, respectively, could be aligned to 20,236 reference genes. Only about 1% of these genes were highly expressed (RPKM≥1000), while the majority were low-expression genes (RPKM≤1000).

(2) A total of 1,300 significantly differentially expressed genes (DEGs), with FDR≤0.001 and an absolute value of log2Ratio≥1, were found between the two libraries (554 were up-regulated and 746 down-regulated in SH). When annotated against the Gene Ontology (GO) database, 1,066, 1,137 and 1,067 genes were assigned to biological process, cellular component and molecular function terms, respectively. When annotated against the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database, 1,152 genes were assigned to 240 pathways, among which the metabolic and actin cytoskeleton pathways were the largest, with 114 and 21 genes, respectively. Combining the GO and KEGG annotations, a total of 31 DEGs were identified as related to muscle cell development, differentiation and skeletal muscle growth. In the protein interaction network analysis, COL4A1 and ITGA8 had numerous interactions with other proteins, whereas myosin, troponin, IGFBP5, TGFBR3 and VEGF had few.

(3) 40,481 and 38,851 cSNPs were identified in the SH and DP libraries, respectively. These cSNPs were mainly distributed on chromosomes 1, 2 and 3. A total of 4,721 SSRs were observed in the two libraries. A→G and (AC)n were the main cSNP and SSR types, respectively.

(4) 37.72% and 39.03% of genes had undergone alternative splicing (AS) in the SH and DP libraries, respectively, and A3SS was the main type. Based on the sequenced clean reads, 6,989 reference genes were extended and optimized. A total of 123,678 novel transcript units with an average length of 343 bp were discovered in the two libraries.

(5) The cDNA sequences of MYL1, MYL2, MYL3 and MYL4 were cloned and submitted to NCBI (GenBank accession numbers KJ700419, KJ710701, KJ710702, KJ710703, KJ710704, KJ710705, KJ710706 and KJ768855). MYL1 had two cDNA sequences with different 5′ ends. The homology of the five cDNA sequences was higher between SH and DP than between either breed and the predicted sequences published in NCBI. The differences were concentrated mainly at the 5′ and 3′ ends.

(6) Each of the MYL1, MYL2, MYL3 and MYL4 genes had 5 exons. Many binding sites for MEF2, MyoD and MyoG were found upstream of their start codons, which suggests that the transcription of the MYLs might be regulated by these transcription factors.

(7) One or two amino acid differences were found between the MYL1a, MYL1b, MYL2, MYL3 and MYL4 proteins and the predicted sequences in NCBI. All of these proteins were acidic and hydrophilic, with many glycosylation and phosphorylation sites, but they had no signal peptide. Nearly half of their secondary structure was α-helix. Human MYL2 protein, with a symmetric barrel structure, was the best tertiary structure template for the five predicted proteins. According to the clustering results, sheep and goat, human and chimpanzee, squirrel monkey and macaque, mouse and brown rat, cattle and yak, and cat and dog were similar, while zebrafish and Xenopus laevis were distant from the other vertebrates.

(8) The MYL1 gene was expressed mainly in the longissimus of sheep. It had two alternatively spliced isoforms, MYL1a and MYL1b. The MYL2, MYL3 and MYL4 genes were expressed in the heart.

In conclusion, the RNA-Seq analysis of the two transcriptome libraries in this study provides a fundamental database for further selection of genes related to skeletal muscle. Meanwhile, the characterization of the cDNA and protein sequences of the MYLs offers a foundation for future studies of their function and regulatory mechanisms.
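The DEG criterion in result (2), FDR≤0.001 and |log2Ratio|≥1, can be sketched as a simple filter. This is an illustration only, not the pipeline used in the thesis: the direction of the ratio (SH over DP), the small pseudocount that avoids division by zero, and all names and example values are assumptions.

```python
import math
from typing import NamedTuple, Optional


class GeneStat(NamedTuple):
    gene_id: str      # hypothetical identifier
    rpkm_sh: float    # expression in the Small-tailed Han library
    rpkm_dp: float    # expression in the Dorper library
    fdr: float        # false discovery rate from the differential test


def classify_deg(
    gene: GeneStat,
    fdr_cutoff: float = 0.001,
    min_abs_log2_ratio: float = 1.0,
    pseudocount: float = 0.01,  # assumption: guards against zero expression
) -> Optional[str]:
    """Return 'up_in_SH', 'down_in_SH', or None if the gene is not a DEG."""
    log2_ratio = math.log2(
        (gene.rpkm_sh + pseudocount) / (gene.rpkm_dp + pseudocount)
    )
    if gene.fdr > fdr_cutoff or abs(log2_ratio) < min_abs_log2_ratio:
        return None
    return "up_in_SH" if log2_ratio > 0 else "down_in_SH"


if __name__ == "__main__":
    examples = [
        GeneStat("geneA", 40.0, 10.0, 0.0001),  # about 4-fold higher in SH
        GeneStat("geneB", 10.0, 40.0, 0.0001),  # about 4-fold lower in SH
        GeneStat("geneC", 12.0, 10.0, 0.0001),  # fold change too small
        GeneStat("geneD", 40.0, 10.0, 0.01),    # FDR too high
    ]
    for g in examples:
        print(g.gene_id, classify_deg(g))
```

A threshold of |log2Ratio|≥1 corresponds to at least a two-fold difference in expression between the libraries; requiring a low FDR as well keeps large but statistically unsupported fold changes out of the DEG list.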
