
Systematic Mining and Functional Analysis of Proteases and Their Inhibitors in Four Tapeworms

Genome-wide Analysis of Proteases and Inhibitors Sequences Identified through Bioinformatics Data Mining in Four Tapeworms

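【Author】 Yan Hongbin (闫鸿斌)

【Supervisor】 Cai Xuepeng (才学鹏)

【Author information】 Chinese Academy of Agricultural Sciences, Preventive Veterinary Medicine, 2013, PhD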

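【Abstract (translated from the Chinese)】 Tapeworms belong to the class Cestoda of the phylum Platyhelminthes and live as parasites. More than 5,000 tapeworm species have been recorded, and some of them, such as Echinococcus granulosus, Echinococcus multilocularis and Taenia solium, seriously threaten human and animal health. The metacestode (larval) stage of Echinococcus parasitizes the liver, lungs and other organs of humans and animals and causes hepatic and pulmonary hydatid disease; the larvae of T. solium can lodge in the brain and muscle tissue of humans and animals and cause neurocysticercosis or muscular cysticercosis. Controlling these parasitic diseases is of major public health importance. Screening for drug targets and for vaccine and diagnostic candidate antigens is therefore a focus of cestode research and a key step in controlling cestode diseases. The sequencing and analysis of the genomes of T. solium, E. granulosus, E. multilocularis and Hymenolepis microstoma provide rich data resources for research on and development of prevention and control technologies for cestode diseases.

Proteases take part in regulating most physiological processes of living organisms by controlling the activation, synthesis and folding of target proteins. They are also essential for the replication and spread of viral, bacterial and parasitic pathogens. Proteases and their inhibitors have therefore become important targets for vaccine and drug development in medicine. Taking the genome data of the four tapeworms and the proteomes deduced from them as the object of study, and proteases and their inhibitors as the targets, this study used bioinformatics methods together with several specialized databases to identify and analyze comprehensively the number, types and potential functions of the proteases and protease inhibitors in the four tapeworms. The results are as follows.

1. From the deduced protein sequences of T. solium (Asian strain), E. granulosus, E. multilocularis and H. microstoma, 199, 179, 189 and 172 proteases were identified, accounting for about 1.67%, 1.75%, 1.8% and 1.70% of all protein-coding genes, respectively. Redundant sequences, homologs without protease activity and possible pseudogenes are excluded.

2. The identified proteases fall into five superfamilies: aspartic, cysteine, metallo-, serine and threonine proteases. Metalloproteases make up the largest share (about 33%-35%), followed by cysteine proteases (25%-29%) and serine proteases (20%-28%), while aspartic proteases (2.2%-12%) and threonine proteases (7.5%-8.4%) make up smaller shares. These proportions are broadly consistent with those of protease genes in other species. Of the five classes, aspartic proteases vary most among the four tapeworms, reaching 12% in T. solium but only 2.2%-3.7% in the other three; the proportions of the other four classes vary little among the four species. Comparison with related groups (trematodes and nematodes) shows that the proportion of threonine proteases in the four tapeworms is clearly higher than in Schistosoma mansoni (6%) and Caenorhabditis elegans (5%), and that the proportion of aspartic proteases in T. solium is markedly higher than in S. mansoni (4%), C. elegans (5%) and the other three tapeworms. Within the cysteine protease superfamily, all four tapeworms have large numbers of cathepsins and of ubiquitination and deubiquitination proteases. Within the serine protease superfamily, all four have large numbers of trypsin-like proteases and subtilisin-like serine endopeptidases.

3. Analysis with KAAS (KEGG Automatic Annotation Server) showed that 117, 163, 146 and 165 of the proteases identified in T. solium, E. granulosus, E. multilocularis and H. microstoma, respectively, could be assigned an ortholog and a KEGG (Kyoto Encyclopedia of Genes and Genomes) functional pathway. Proteases associated with human diseases were the most numerous, at 37, 59, 50 and 62 in the four tapeworms, respectively. Proteases involved in metabolism (such as energy, nucleotide, amino acid and drug metabolism) numbered 24, 20, 20 and 27; those involved in genetic information processing numbered 19, 22, 22 and 19; those involved in cellular processes (such as transport, cell motility, cell growth and death, and cell communication) numbered 21, 27, 25 and 25; those involved in environmental information processing (such as signal transduction and signaling molecules and their interactions) numbered 8, 10, 9 and 13; and those involved in organismal systems (such as the immune, nervous, digestive, circulatory and secretory systems) numbered 8, 25, 20 and 19. Comparison among the four tapeworms shows that T. solium has markedly fewer proteases assigned to human diseases and organismal systems than the other three.

4. From the deduced proteomes of T. solium, E. granulosus, E. multilocularis and H. microstoma, 35, 38, 36 and 27 protease inhibitors were identified, respectively, excluding redundant sequences and homologs without inhibitory activity. They fall mainly into 6-7 protease inhibitor families. Of these inhibitors, 44%-72% contain an N-terminal signal peptide and 15%-31% contain a transmembrane α-helix.

5. The identified protease inhibitors mainly comprise Kunitz-type (KU) serine protease inhibitors, serine protease inhibitors (serpins), caspase inhibitors (BIR) and cystatins. In the four deduced protein databases, 20, 21, 20 and 14 proteins, respectively, contain at least one Kunitz domain. Most Kunitz-type serine protease inhibitors contain a single Kunitz domain, a few (2-3) contain several, and the great majority carry an N-terminal signal peptide. Almost all Kunitz domains contain the three conserved disulfide bonds, although in a very few domains a cysteine residue at a disulfide position is mutated. The 21 Kunitz inhibitors identified in the E. granulosus protein database include the 8 Kunitz serine protease inhibitors (EgKU1-8) previously identified by Gonzalez et al. In addition, 5, 6, 4 and 6 serpins were identified in the four tapeworm protein databases, all with a fairly conserved reactive center loop. One protein identified in E. multilocularis (EmuJ001193100) is identical in sequence to the E. multilocularis serine protease inhibitor serpinEmu reported by Merckelbach and Ruppel in 2007.

This study offers new ideas for research on parasite pathogenesis and host immunosuppression, and provides important target molecules for developing new, efficient drugs or vaccines against cysticercosis, hydatid disease and other helminth diseases.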

【Abstract】 Tapeworms belong to the class Cestoda of the phylum Platyhelminthes. There are approximately 5000 species on earth, and some of them, such as Echinococcus granulosus, E. multilocularis and Taenia solium, pose a great threat to human and animal health. Metacestode larvae of E. granulosus and E. multilocularis live in the liver, lungs and other organs of humans and animals and cause liver and/or lung hydatid disease. Neurocysticercosis or muscle cysticercosis occurs when metacestodes of T. solium lodge in the brain or muscle of humans and animals. Prevention and treatment of these parasitic diseases are of great public health significance. Research on novel, efficient chemotherapeutic agents and immunoprophylactic vaccines against these parasites has become a hotspot in the field, and screening and identifying chemotherapy targets and vaccine candidates is a key step in developing such drugs and vaccines. The genomes of T. solium, E. granulosus, E. multilocularis and Hymenolepis microstoma have recently been sequenced and annotated. These data provide rich resources for further studies on prevention and treatment of cestode infections.

Proteases, also termed proteinases or peptidases, are proteolytic enzymes involved in various physiological processes of all living organisms through regulating the activation, synthesis and folding of target proteins. Proteases are also essential for the replication and dissemination of viral, bacterial and parasitic pathogens, and are therefore important targets for vaccines and drugs.

In the present study, we identified and described the numbers, types and potential functions of the proteases and their inhibitors in the putative proteomes of four tapeworms, using bioinformatics methods combined with several databases. The results are as follows.

1. After removing redundant sequences, inactive homologs and putative pseudogenes, we identified 199, 179, 189 and 172 proteases, corresponding to 1.67%, 1.75%, 1.8% and 1.70% of the predicted proteins of T. solium (Asian strain), E. granulosus, E. multilocularis and H. microstoma, respectively.

2. As expected, the proteases identified here fall into five catalytic classes in different proportions across the four tapeworms: 2.2%-12% aspartic proteases, 25%-29% cysteine proteases, 33%-35% metalloproteases, 20%-28% serine proteases and 7.5%-8.4% threonine proteases. These proportions are largely consistent with those in other organisms. Among the five classes, the proportion of aspartic proteases in T. solium (12%) is greater than in the other three tapeworms (2.2%-3.7%), whereas the proportions of the other four classes differ little among the four tapeworms. The proportions of threonine proteases in the four tapeworms are higher than those in Schistosoma mansoni (6%) and Caenorhabditis elegans (5%). We also note an obvious expansion in the relative proportion of aspartic proteases in T. solium compared with S. mansoni (4%) and C. elegans (5%). In the cysteine protease class, a large number of cathepsins and of ubiquitination and deubiquitination proteases were identified. In the serine protease class, many trypsin-like proteases and subtilisin-like serine endopeptidases were observed.

3. KAAS (KEGG Automatic Annotation Server) assigned orthology and KEGG (Kyoto Encyclopedia of Genes and Genomes) functional pathways to 117, 163, 146 and 165 of the T. solium, E. granulosus, E. multilocularis and H. microstoma proteases, respectively. Among them, the largest number of proteases is assigned to human diseases: 37, 59, 50 and 62 in the four tapeworms, respectively. A further 24, 20, 20 and 27 proteases are assigned to metabolism (including energy, nucleotide, amino acid and drug metabolism), 19, 22, 22 and 19 to genetic information processing, and 21, 27, 25 and 25 to cellular processes (such as transport, cell motility, cell growth and death, and cell communication). Finally, 8, 10, 9 and 13 proteases are assigned to environmental information processing and 8, 25, 20 and 19 to organismal systems. Among the four tapeworms, fewer proteases are assigned to human diseases and to organismal systems in T. solium than in the other three cestodes.

4. We identified 35, 38, 36 and 27 protease inhibitors in the T. solium, E. granulosus, E. multilocularis and H. microstoma genomes, respectively, excluding inactive homologs and putative pseudogenes. These inhibitors are classified into 6-7 inhibitor families; 44%-72% of them contain an N-terminal signal peptide and 15%-31% contain a transmembrane α-helix.

5. The protease inhibitors identified here include Kunitz-type serine protease inhibitors, serine protease inhibitors (serpins), caspase inhibitors (BIR), cystatins and others. We found 20, 21, 20 and 14 proteins containing at least one Kunitz domain in the four cestode genomes, respectively, and the majority of these Kunitz-type serine protease inhibitors contain an N-terminal signal peptide. Almost all of the Kunitz domains contain the three conserved disulfide bonds. In addition, we identified 5, 6, 4 and 6 serpins in the four tapeworms, each containing a conserved reactive center loop. One of the E. multilocularis serpins identified here (EmuJ001193100) is identical in sequence to serpinEmu, reported by Merckelbach and Ruppel in 2007.

In conclusion, these comprehensive analyses not only complement the growing knowledge of proteolytic enzymes, but also provide a foundation for expanding our knowledge of cestodes and for exploring potential targets for the development of new chemotherapies and immunoprophylaxis.
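
The KEGG category counts reported in result 3 can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch, not part of the thesis: the counts are copied from the abstract, the category labels follow the six groupings named there, and the names `KEGG_CATEGORIES`, `category_totals` and `category_shares` are hypothetical helpers introduced only for illustration.

```python
# Counts as reported in the abstract, in the order:
# T. solium, E. granulosus, E. multilocularis, H. microstoma.
SPECIES = ["T. solium", "E. granulosus", "E. multilocularis", "H. microstoma"]

# Proteases to which KAAS assigned an ortholog and a KEGG pathway.
KEGG_ANNOTATED = [117, 163, 146, 165]

# Per-category counts (illustrative labels for the six groupings in the text).
KEGG_CATEGORIES = {
    "Human diseases": [37, 59, 50, 62],
    "Metabolism": [24, 20, 20, 27],
    "Genetic information processing": [19, 22, 22, 19],
    "Cellular processes": [21, 27, 25, 25],
    "Environmental information processing": [8, 10, 9, 13],
    "Organismal systems": [8, 25, 20, 19],
}


def category_totals():
    """Sum the six category counts for each species."""
    return [
        sum(counts[i] for counts in KEGG_CATEGORIES.values())
        for i in range(len(SPECIES))
    ]


def category_shares(category):
    """Percentage of KEGG-annotated proteases in one category, per species."""
    counts = KEGG_CATEGORIES[category]
    return {
        species: round(100 * n / total, 1)
        for species, n, total in zip(SPECIES, counts, KEGG_ANNOTATED)
    }


if __name__ == "__main__":
    print("Category sums:", category_totals())
    for name in KEGG_CATEGORIES:
        print(name, category_shares(name))
```

With these figures the six category counts add up exactly to the annotated totals for each species (117, 163, 146 and 165), and T. solium has the lowest count in both the human diseases and the organismal systems categories, which agrees with the comparison stated in result 3.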
