Research on Methods for Identifying Essential Proteins Based on Protein Networks

Identifying Essential Proteins Based on Protein Interaction Networks

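【Author】 Wang Huan (王峘)

【Advisor】 Wang Jianxin (王建新)

【Author information】 Central South University, Computer Science and Technology, 2011, Master's thesis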

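【Abstract (translated from the Chinese original)】 Essential proteins are proteins necessary for an organism's survival and reproduction, and they play an important role in life activities. Identifying essential proteins is of great significance for life science research and has important applications in disease diagnosis and treatment and in drug design. In the post-genomic era, as high-throughput technologies have developed, protein interaction data have become increasingly abundant, and identifying essential proteins from protein networks has become a new research focus. Starting from network topology and building on an analysis of the topological features of nodes, this thesis explores the characteristics of protein networks in depth and designs effective methods for identifying essential proteins. The main research work is as follows. Current topology-based identification methods, which rely mainly on centrality measures, reflect only node features and cannot characterize the importance of edges. To address this shortcoming, the concept of the edge clustering coefficient is introduced to construct SoECC, a measure that combines the properties of both nodes and edges, and SoECC is applied to essential protein identification. Experimental results on the yeast protein interaction network show that the prediction accuracy and efficiency of SoECC are generally higher than those of six centrality measures, and that the essential proteins predicted by SoECC show a clear clustering effect; this phenomenon reflects the meaning of the edge clustering coefficient and is consistent with the conclusions of earlier researchers. Existing identification methods also do not explore biological significance and biological function deeply enough. To address this drawback, protein complex information is introduced to construct a new measure, SoID, for identifying essential proteins. Experimental results show that SoID generally predicts more essential proteins than the six centrality measures, has some advantage on metrics such as sensitivity and specificity, and can effectively identify low-degree essential proteins. Because currently available protein interaction data contain a large number of false positives, a new method for weighting interactions is proposed, and six classic centrality measures are applied to the weighted network to predict essential proteins. Experimental results show that the accuracy and efficiency of every centrality measure are generally higher on the weighted protein network than on the corresponding unweighted network. The accuracy of topology-based identification methods depends to a large extent on the reliability of the network and the authenticity of the data, and weighting the network can improve prediction performance. By introducing several kinds of information, the identification methods proposed in this thesis effectively improve identification accuracy and offer new ideas for research on essential protein identification.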

【Abstract】 Essential proteins are proteins that are indispensable to the viability and reproduction of an organism, and they play an important role in cellular activity. Identifying them matters not only for life science research but also for practical purposes such as disease diagnosis and treatment and drug design. In the post-genomic era, high-throughput technologies have produced a wealth of protein-protein interaction data, and identifying essential proteins from protein interaction networks has consequently become an active research topic.

This thesis starts from network topology: building on an analysis of the topological characteristics of nodes, it explores the characteristics of protein interaction networks and designs effective methods for identifying essential proteins. The main contributions are as follows.

First, current topology-based methods for identifying essential proteins, which rely mainly on centrality measures, capture only the features of nodes and cannot characterize the importance of edges. To address this, we introduce the edge clustering coefficient and propose a new measure, SoECC, that combines the characteristics of edges and nodes. Experiments on the yeast protein interaction network show that the accuracy and efficiency of SoECC are generally higher than those of six centrality measures. In addition, the essential proteins identified by SoECC show a clear clustering effect, which agrees with previous studies.

Second, existing methods for identifying essential proteins mostly ignore the biological significance and function of proteins. To address this drawback, we incorporate protein complex information and construct a new measure, SoID. Experiments indicate that, compared with the six conventional centrality measures, SoID has some advantage in sensitivity and specificity and generally identifies more essential proteins. SoID can also effectively discover low-connectivity essential proteins.

Third, because currently available protein interaction datasets contain many false positives, we propose a new method for weighting interactions and predict essential proteins using the six classic centrality measures on the weighted protein interaction network. Experiments show that the accuracy and efficiency of every centrality measure are generally higher on the weighted network than on the corresponding unweighted network. The accuracy of topology-based identification methods depends heavily on the reliability of the network and the quality of the underlying data, and weighting the protein interaction network can improve the identification of essential proteins.

By incorporating several kinds of information, the methods proposed in this thesis effectively improve the accuracy of essential protein identification and offer a new approach to the problem.
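
The abstract does not give the formula for SoECC, so the following is only a minimal sketch under stated assumptions. It assumes the commonly used definition of the edge clustering coefficient, ECC(u, v) = z / min(d_u - 1, d_v - 1), where z is the number of triangles that contain edge (u, v) and d is node degree, and it assumes that a node's score is the sum of ECC over its incident edges. The function names, the toy network, and the top-k hit count used as a stand-in for prediction accuracy are illustrative and are not taken from the thesis.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map, skipping self-loops and duplicates."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles on (u, v) / min(d_u - 1, d_v - 1).

    Returns 0.0 when either endpoint has degree 1, since no triangle
    can contain the edge in that case.
    """
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_triangles <= 0:
        return 0.0
    triangles = len(adj[u] & adj[v])  # common neighbours close triangles
    return triangles / max_triangles


def sum_of_ecc(adj):
    """Score each node by summing ECC over its incident edges (assumed SoECC)."""
    return {
        u: sum(edge_clustering_coefficient(adj, u, v) for v in adj[u])
        for u in adj
    }


def top_k_hits(scores, essential, k):
    """Count known essential proteins among the k highest-scoring nodes."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return sum(1 for n in ranked[:k] if n in essential)


# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
adj = build_adjacency(toy_edges)
scores = sum_of_ecc(adj)
```

In this toy network, A and D both have degree 2, but only A lies on a triangle, so A receives a positive score while D scores zero. That is the kind of distinction a pure degree ranking cannot make, which is the motivation the abstract gives for bringing edge information into the measure.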

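  • 【Online publication contributor】 Central South University
  • 【Online publication year and issue】 2012, Issue 04
  • 【Classification codes】 Q51; O157.5
  • 【Times cited】 1
  • 【Downloads】 287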