Research on Algorithms of Protein Structure Prediction

【Author】 Qu Yi (曲毅)

【Advisor】 Tang Haoxuan (唐好选)

【Author information】 Harbin Institute of Technology, Computer Science and Technology, 2007, Master's thesis

【Abstract】 The native conformation of a protein is determined by its amino-acid sequence, and its biological function depends largely on that conformation. Protein structure prediction is therefore a long-studied yet still challenging problem in protein research, and one of the major research topics of the life sciences in the post-genome era. Studies have shown that the native conformation of a protein is entirely encoded in the amino-acid sequence of the molecule, and this view provides the theoretical basis for computational structure prediction. Many algorithms have been proposed for protein structure prediction under the HP model and have achieved some success. The most representative are Monte Carlo algorithms, genetic algorithms, approximation algorithms, the importance-sampling-based SISPER algorithm, and the PERM algorithm, which is based on a pruning and replication strategy. In terms of solution efficiency, however, these algorithms still leave considerable room for improvement. This thesis applies an improved Ant Colony Optimization (ACO) algorithm and an improved Simulated Annealing (SA) algorithm to the prediction problem under two simplified HP models, respectively, and thereby further improves solution efficiency.

For protein structure prediction on the lattice model, we use the improved ACO algorithm. Earlier algorithms readily generate infeasible conformations and have long running times. To address these problems, we propose a "clone" method for handling infeasible conformations and a "single-point mutation with forward reconstruction" method for the local search phase to shorten the running time. Experiments show that both improvements are correct and effective.

Protein structure prediction on the off-lattice model can be treated as a continuous function optimization problem, so we solve it with the improved SA algorithm. Based on the characteristics of SA, we propose three improvements: adding a memory function, restricting the acceptance of deteriorated solution vectors, and searching the neighborhood again. Numerical experiments show that the algorithm has a simple structure and reaches the optimal solution efficiently, which makes it a good heuristic for continuous global optimization.

The conformations obtained with the two algorithms show that the HP model, although simple, reflects some basic properties of folded protein conformations: in the native conformation, hydrophobic residues are always surrounded by polar residues and form a hydrophobic core. This indicates that the two improved algorithms are feasible and effective for protein structure prediction. Finally, we implement a simple graphical simulation system for protein folding conformations, which further verifies the feasibility and effectiveness of the two algorithms given in the thesis.
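
As an illustration of the lattice HP model mentioned in the abstract, the following is a minimal sketch of how a conformation could be scored, assuming the commonly used 2D square-lattice formulation in which each contact between two hydrophobic (H) residues that are lattice neighbors but not adjacent in the chain contributes -1 to the energy. The move encoding (`U`, `D`, `L`, `R`) and the function names `fold_to_coords` and `hp_energy` are hypothetical choices for this example and are not taken from the thesis. A self-intersecting walk corresponds to what the abstract calls an infeasible conformation.

```python
from typing import List, Optional, Tuple

# Unit moves on the 2D square lattice (an assumed encoding, not from the thesis).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_to_coords(moves: str) -> Optional[List[Tuple[int, int]]]:
    """Place residues along the walk; return None if the walk self-intersects."""
    coords = [(0, 0)]
    occupied = {(0, 0)}
    for m in moves:
        dx, dy = MOVES[m]
        x, y = coords[-1]
        nxt = (x + dx, y + dy)
        if nxt in occupied:
            # Two residues would share a lattice site: infeasible conformation.
            return None
        coords.append(nxt)
        occupied.add(nxt)
    return coords


def hp_energy(sequence: str, moves: str) -> Optional[int]:
    """Return minus the number of non-consecutive H-H lattice contacts.

    Returns None for an infeasible (self-intersecting) conformation.
    """
    if len(moves) != len(sequence) - 1:
        raise ValueError("need exactly one move per bond")
    coords = fold_to_coords(moves)
    if coords is None:
        return None
    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each neighbor pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts
```

For example, `hp_energy("HPPH", "RUL")` folds the chain around a unit square so that the two H residues touch and returns -1, while `hp_energy("HPH", "RL")` returns `None` because the walk steps back onto an occupied site. A search procedure such as ACO would look for the move string that minimizes this value.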
