Research on a Motif Discovery Method Based on an Immune Genetic Algorithm

【Author】 王汀 (Wang Ting)

【Advisor】 骆嘉伟 (Luo Jiawei)

【Author Information】 Hunan University, Computer Application Technology, 2010, Master's degree

【Abstract】 Identifying motifs in biological sequences is an active research topic in bioinformatics and one of the fundamental computational problems in studying gene regulation mechanisms. Because motifs are short, are not perfectly conserved, and occur in highly complex biological data, identifying them computationally remains a major difficulty in the study of gene transcriptional regulation. Genetic algorithms (GAs), owing to their robustness, stochastic and global search, and suitability for parallel processing, have in recent years been applied increasingly widely to motif discovery and have become an important direction of development. Although GAs form a fairly mature algorithmic framework, shortcomings such as premature convergence and random roaming limit their application. Research in biology has found that the biological immune system maintains population diversity well, suppressing premature convergence and limiting random roaming. Immune principles can therefore be used to improve the performance of genetic algorithms.

To address the lack of a diversity-preserving strategy in genetic algorithms, and drawing on the strengths of the biological immune system, this thesis introduces a concentration regulation mechanism into the genetic algorithm, proposes an immune genetic algorithm based on this concentration mechanism, and applies it to motif discovery. New formulas for antibody affinity and antibody concentration are designed according to the definition of the motif discovery problem and the representation of motifs. Simulating immune behavior, a concentration regulation factor is added to the GA's proportional selection operator to suppress the reproduction of high-concentration antibodies, so that the algorithm effectively maintains population diversity and avoids premature convergence. Experimental results show that the algorithm identifies motifs well: it can find motifs in relatively long sequences and can identify multiple motifs in a single run.

To overcome the population degradation caused by random search during evolution, this thesis simulates the artificial immune system, introduces the regulating role of vaccines, and proposes a vaccine-based motif discovery algorithm. Vaccine extraction, vaccination, and immune selection suppress population degradation and speed up convergence. Experiments on simulated and real biological data indicate that this algorithm further improves motif discovery.
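
The abstract names the two mechanisms but does not give their formulas, so the following is only a minimal Python sketch of the general idea, not the thesis's method. Everything specific in it is an assumption made for illustration: antibodies are fixed-length candidate motif strings, affinity is the sum of best window matches over the input sequences, concentration is the fraction of the population within a small Hamming radius, the concentration factor is `exp(-alpha * concentration)`, and the vaccine is a per-position consensus of the highest-affinity antibodies. All function names and parameters (`alpha`, `radius`, `top`) are hypothetical.

```python
import math
import random
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def affinity(antibody, sequences):
    """Assumed affinity: for each sequence, take the best-matching window and
    count matching positions; sum over sequences. Each sequence is assumed to
    be at least as long as the antibody."""
    w = len(antibody)
    return sum(
        max(w - hamming(antibody, seq[i:i + w]) for i in range(len(seq) - w + 1))
        for seq in sequences
    )


def concentration(index, population, radius=1):
    """Assumed concentration: share of the population (including the antibody
    itself) within `radius` mismatches of population[index]."""
    me = population[index]
    similar = sum(1 for other in population if hamming(me, other) <= radius)
    return similar / len(population)


def selection_weights(population, sequences, alpha=2.0):
    """Proportional-selection weights damped by a concentration factor, so
    that crowded (high-concentration) antibodies reproduce less often."""
    return [
        (affinity(ab, sequences) + 1e-9) * math.exp(-alpha * concentration(i, population))
        for i, ab in enumerate(population)
    ]


def select(population, sequences, k, rng=random):
    """Draw k parents with probability proportional to the damped weights."""
    weights = selection_weights(population, sequences)
    return rng.choices(population, weights=weights, k=k)


def extract_vaccine(population, sequences, top=3):
    """Assumed vaccine: per-position majority base among the `top` antibodies
    with the highest affinity."""
    elite = sorted(population, key=lambda ab: affinity(ab, sequences), reverse=True)[:top]
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*elite))


def vaccinate(antibody, vaccine, positions):
    """Overwrite the chosen positions of an antibody with the vaccine's bases."""
    chars = list(antibody)
    for p in positions:
        chars[p] = vaccine[p]
    return "".join(chars)


def immune_select(original, vaccinated, sequences):
    """Keep the vaccinated antibody only if its affinity did not get worse."""
    if affinity(vaccinated, sequences) >= affinity(original, sequences):
        return vaccinated
    return original
```

In this sketch, an antibody that dominates the population receives a smaller selection weight than a rarer antibody with somewhat lower affinity, which is the diversity-preserving effect the abstract describes; the immune-selection step rejects any vaccination that lowers affinity, which is one simple way to guard against degradation.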

  • 【Online Publication Contributor】 Hunan University
  • 【Online Publication Year/Issue】 2011, Issue 04