节点文献

求解蛋白质结构预测问题及矩形packing问题的启发式算法

Heuristic Algorithms for the Protein Structure Prediction Problem and the Rectangles Packing Problem

【作者】 刘景发

【导师】 黄文奇;

【作者基本信息】 华中科技大学 , 计算机软件与理论, 2007, 博士

【摘要】 求解NP难度问题一直是计算机科学技术的一个瓶颈任务。近年来的研究表明,对于NP难度问题可能根本不存在既完整严格又不太慢的求解算法。因此,这类问题的求解方法多为启发式方法。蛋白质结构预测问题和矩形packing问题都是NP难度问题。设计求解它们的快速而有效的算法具有深刻的理论意义和普遍的实用价值。受自然界和人类社会中智慧的启发,提出了求解这两类NP难度问题的有效的启发式算法,所做的主要工作有:对于一个具有两种氨基酸(疏水氨基酸和亲水氨基酸)的三维非格点的蛋白质AB模型,受物理世界物体间相互作用的规律的启发,提出了该模型的蛋白质结构预测问题的拟物算法。结合本问题的一些启发式知识,获得了高效的启发式策略,包括初始构形的产生和跳坑策略,使得搜索过程尽量避免过早陷入局部极小点,即使到了局部极小点,能够使搜索过程跳出,将其引向前景更好的区域。结合这些启发式策略以及拟物算法得到了启发式拟物算法。该算法对文献中的氨基酸序列进行了实算,所得构形的最低能量比PERM+算法(即非格点的PERM(Pruned-Enriched Rosenbluth method)结合共轭梯度法)所得构形的最低能量更低。提出了AB模型的蛋白质结构预测问题的启发式共轭梯度(HCG)算法。算法在三维欧氏空间中利用启发式的方法产生初始构形,然后用共轭梯度法优化低能状态,一旦计算陷入极小值的陷阱,则采用启发式的跳坑策略跳出局部极小点。将HCG算法应用到蛋白质AB模型中去预测和发现蛋白质结构。计算结果表明,对于链长为55的序列,HCG算法所得结果比当今国际文献中最先进的几个算法所得结果更好。受PERM+算法的启发,给出了AB模型的基于面心立方体(FCC)格点的PERM++算法。该算法利用基于FCC格点的PERM算法产生初始构形,然后用共轭梯度法进一步优化低能状态。数值实验表明,PERM++算法的计算结果优于PERM+算法。ELP方法在应用中存在一些缺陷,针对这些缺陷提出了基于拟物策略的局部搜索方法,将此方法同ELP方法相结合得到了改进的ELP算法(ELP+)。计算结果表明,对于部分氨基酸序列,ELP+算法找到了比PERM+算法、多正则(MUCA)Monte Carlo方法、势能曲面变平(ELP)方法、构形空间退火(CSA)算法等国际上最先进的一些算法给出的最低能量状态具有更低能量的构形。使用拟人的策略,提出了基于欧氏距离的占角最大穴度优先的放置方法,为矩形packing问题的快速求解提供了一种高性能的启发式算法。算法的高效性通过应用于标准电路MCNC和GSRC得到了验证,其计算结果优于当今国际上已公开发表了的最先进的几个算法的结果。基于拟人的思想,提出了带有预放置模块的布局问题的穴度算法及其改进的穴度算法。用标准电路MCNC对算法性能进行了测试,实算结果表明,改进的穴度算法比基于角模块表(CBL)和B*树型结构(B*-tree)表示的随机优化算法的计算结果都要好。简而言之,我们对两类NP难度问题——蛋白质结构预测问题和矩形packing问题进行了分析和研究。受自然界、人类社会和具体问题的启发,为它们提出了一系列的有效求解算法。实践证明,这种模仿自然界物质运动和人类行为的方法是一个寻求高效算法的有效途径。这些方法对于研究其它NP难度问题的现实求解具有普遍的实际意义。然而求解蛋白质结构预测问题和矩形packing问题是复杂的开放性课题,其中还有很多问题有待于进一步研究和讨论。

【Abstract】 Solving NP-hard problems is always the bottleneck task in the field of computer science and technology. In recent years, investigations show that for NP-hard problems, there may not exist an algorithm that is both complete and rigorous and not too slow. So its solution methods are generally heuristic. The protein structure prediction problem and the rectangles packing problem are NP-hard problems. Solving them has profound theoretical significance and universal practical value. Inspired by the wisdom of nature and human being, we proposed some effective heuristic algorithms to solve these two NP-hard problems. The main works are as follows:A three-dimensional off-lattice protein AB model was studied, where two species of amino acids, hydrophobic and hydrophilic, were considered. Enlightened by the law of forces among things in the physical world, a corresponding quasi-physical algorithm was put forward for the protein structure prediction problem. By observing and learning from problem structure, high efficient heuristic strategies were presented, including producing the initial configurations and jumping out of traps. These approaches could effectively avoid falling into local minima, and even if the search fell into local minima, the calculation could get out of them and reached better prospects positions. The heuristic quasi-physical algorithm was proposed by integrating the heuristic strategies into the quasi-physical algorithm. For all instances in literature, the proposed algorithms found the configurations with lower energy than those obtained by PERM+, which integrated the off-lattice PERM (Pruned-Enriched Rosenbluth method) and conjugate gradient minimization.A heuristic conjugate gradient (HCG) algorithm was presented for the protein structure prediction problem of the AB model. The algorithm firstly produced initial configurations in three-dimensional Euclidian space by the heuristic method, and the conjugate gradient minimization was then used to optimize low-energy configurations. Once the computation fell into the local minima, the heuristic off-trap strategy was used to get out of the predicament. The HCG algorithm was applied to the AB protein model to predict protein structure. The computing results showed that for sequence with length n=55, the HCG algorithm found the configurations with lower energy than those nowadays given in literature.Inspired by the PERM+ algorithm, a so-called PERM++ algorithm based on face-centered-cubic (FCC) lattice was put forward to solve the protein structure prediction problem of the AB model. In PERM++, the initial configurations were produced by applying the PERM to the FCC lattice, and conjugate gradient minimization was then used to reach the minimum energy states. The numerical results showed that the performance of the PERM++ algorithm outperformed that of the PERM+.A technical shortcoming in ELP was revealed. In order to overcome it, a local search method based on the quasi-physical strategy was presented. An improved ELP algorithm (ELP+) was proposed as the integration of the local search method and the ELP method. The calculating results showed that for several of amino acids chains studied, the ELP+ algorithm found the configurations with lower energy than those by PERM+, by multicanonical (MUCA) Monte Carlo method, by energy landscape paving (ELP) method, and by conformational space annealing (CSA) algorithm, respectively.Based on the quasi-human strategy, we proposed the so-called corner-occupying and the largest hole degree first placement policy based on Euclidian distance. An effective heuristic algorithm was presented, and this algorithm could obtain the solution to the rectangles packing problem quickly. Experimental results on MCNC and GSRC benchmark circuits showed promising performance. As compared with other public algorithms, the proposed algorithms could get better results.Finally, based on the quasi-human strategy, the hole degree algorithm and the improved hole degree algorithm were proposed to solve the modules placement problem with pre-placed modules. Experimental results on MCNC benchmark circuits demonstrated that the proposed heuristic deterministic algorithm was quite effective in solving the problem and outperformed the stochastic optimization placement algorithms based on corner block list (CBL) and B*-tree type representations, respectively.In a word, two NP-hard problems, the protein structure prediction problem and the rectangles packing problem were studied in this dissertation. Inspired by nature, human being and problem structure, we put forward some effective algorithms to solve them. Experimental results showed that simulating the moves of things in nature and the actions of human being was an effective way of designing high-performance algorithms. These approaches have universal practical value for studying and solving other NP-hard problems. Since the protein structure prediction problem and the rectangles packing problem are open problems, there are still some other questions need to study.

节点文献中: 

本文链接的文献网络图示:

本文的引文网络