基于混沌的化学计量学新算法研究及基于量子化学计算的构效关系研究

New Chemometric Algorithms Based on Chaotic Concept and QSAR Aided Quantum Chemical Calculations

【Author】 吕庆章 (Lü Qingzhang)

【Supervisors】 俞汝勤 (Yu Ruqin); 沈国励 (Shen Guoli)

【Author information】 Hunan University, Analytical Chemistry, 2003, PhD

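【Abstract, translated from the Chinese original】 Quantitative structure-activity relationship (QSAR) research is an important branch of chemometrics. Neural networks are an excellent nonlinear modeling tool, but their prominent drawback is faulty prediction caused by premature convergence to a local optimum or by overfitting. Genetic algorithms (GA) are used in optimization to avoid being trapped in local optima. In a traditional GA, however, only a few of the best members are retained to produce offspring, so after a certain number of generations the diversity of the population drops markedly and the algorithm is also likely to converge prematurely to a local optimum. To overcome these shortcomings, chaotic dynamic models were introduced into the study of chemometric algorithms. A chaotic map is inherently sensitive to its initial state and runs ergodically: it looks random, yet it follows a deterministic orbit that never returns to a point already visited. These properties offer promise for better chemometric methods. A chaotic dynamic model can be used directly as a search algorithm to train a neural network, but this approach still needs further study. Using a chaotic dynamic model as the mutation operation of a GA resolves the gradual loss of population diversity in the traditional GA. It overcomes the inbreeding that can occur in a traditional GA, which is precisely what causes premature convergence to a local optimum. This strategy not only largely overcomes the local-optimum problem of traditional genetic methods; the unpredictable behavior of chaos also lets the algorithm effectively overcome overfitting in neural network training. The effectiveness of the method was shown in four applications: predicting the vibration frequencies of tetrahedral halides from physico-chemical and quantum chemical parameters, predicting the vibration frequencies of octahedral hexahalides from physico-chemical parameters, predicting fluorous-phase partition coefficients from physico-chemical parameters, and predicting the atmospheric lifetimes of chlorofluorocarbon substitutes from quantum chemical density functional calculations. In the atmospheric lifetime prediction in particular, the data are poorly structured and the traditional GA overfits easily, whereas the GA with chaotic mutation avoids this well.

A new prepotency evolution algorithm is proposed, based on the new concept of "prepotency". The method converges quickly: its strategy of casting a net over the whole search space and then converging rapidly avoids the time that a GA wastes wandering near local optima. This behavior may be related to an underlying directional selectivity. Chaos was then combined with the algorithm to initialize its population, giving a repeated prepotency evolution method with chaotically initialized populations. Because chaos guarantees that every prepotency evolution run starts from a population that differs from all earlier ones, the method keeps the fast convergence of prepotency evolution while the chaotic map ensures high population diversity. It can therefore probe many local optima quickly, which greatly improves the chance of finding the global optimum. Training a neural network with it to predict the vibration frequencies of tetrahalides fully demonstrated the advantage of this strategy. The method can be applied to many optimization problems and is of considerable reference value.

Based on quantum chemical simulation, a QSAR study was made of the catalytic activity of iron tetraphenylporphyrin and seven of its halogen-substituted derivatives in the catalytic oxidation of alkanes. The electronic structures of these molecules and their relationship to catalytic activity were analyzed, and the simulated activation of the oxygen molecule on the surface of these molecules was discussed in detail. A lower HOMO energy level and a higher electron density on the iron atom were found to favor higher catalytic activity of the iron porphyrin. Keeping the planes of the four phenyl groups crossed with the saddle-shaped surface of the porphyrin also favors high catalytic activity.

Using quantum chemical parameters improves the results of QSAR studies and gives QSAR research one more tool. Quantum chemical molecular simulation can raise QSAR research to a new theoretical level, and it can also support more targeted and deeper studies that build on QSAR results. The value of combining the two is fully shown in the systems studied above.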

【Abstract】 Quantitative structure-activity relationship (QSAR) research is one of the important branches of chemometrics. The artificial neural network (ANN), a powerful method for nonlinear statistical modeling, has drawbacks such as possible premature convergence to local optima and overfitting to the samples in the training set. One motive for using a genetic algorithm (GA) in an optimization procedure is to avoid sinking into local optima. In practice, however, only a few of the fittest members of a generation survive GA's selection operation. After some generations the population diversity is greatly reduced, and the algorithm may converge prematurely to a local optimum. To circumvent these shortcomings, chaotic dynamic systems are introduced into the study of chemometric algorithms. A chaotic map is very sensitive to its initial state: it resembles statistical randomness, yet its underlying patterns are deterministic. The ability of chaos to search the space of interest exhaustively is employed to improve the performance of the chemometric algorithms studied. We hope that the use of the chaotic concept will shed some light on algorithm studies. Although a chaotic mapping model can be used directly as an algorithm for training an ANN, the efficiency of such an approach seems to be insufficient. A chaotic system applied as the mutation operation in GA can significantly enhance GA's ability to maintain population diversity during the evolutionary process. This scheme effectively prevents the inbreeding that occurs during the evolution of a general GA and leads to misleading local optima.
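The chaotic mutation idea can be illustrated with a minimal sketch. The abstract does not say which chaotic map the thesis uses, so the logistic map with r = 4 is an assumption here, and the names `logistic_map` and `chaotic_mutation`, the mutation rate, and the gene bounds are all hypothetical. The sketch only shows how a deterministic chaotic orbit, rather than a pseudo-random draw, could supply the replacement gene values.

```python
import random


def logistic_map(x, r=4.0):
    """One step of the logistic map x -> r*x*(1-x).

    Assumption: r = 4, a standard chaotic regime on (0, 1);
    the thesis abstract does not name its map.
    """
    return r * x * (1.0 - x)


def chaotic_mutation(chromosome, state, rate=0.1, lower=-1.0, upper=1.0, rng=None):
    """Mutate a real-valued chromosome using a chaotic orbit.

    Each gene is selected for mutation with probability `rate`; a selected
    gene is replaced by the next point of the chaotic orbit, rescaled from
    [0, 1] to [lower, upper]. Returns the mutated copy and the updated
    chaotic state, so the orbit continues across calls.
    """
    rng = rng or random.Random()
    mutated = list(chromosome)
    for i in range(len(mutated)):
        if rng.random() < rate:
            state = logistic_map(state)
            mutated[i] = lower + (upper - lower) * state
    return mutated, state


if __name__ == "__main__":
    child, state = chaotic_mutation([0.0, 0.0, 0.0], state=0.3, rate=0.5)
    print(child, state)
```

Carrying `state` from one call to the next is the design point: the orbit is never restarted, so successive mutations keep drawing new values from the same deterministic trajectory.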
The effectiveness of this chaos-based algorithm has been clearly demonstrated in overcoming both convergence to local optima and the overfitting to training samples that often appears in traditional neural network training. Its advantage over the traditional GA in training ANNs has been demonstrated in predicting the vibration frequencies of tetrahedral tetrahalide species, the vibration frequencies of octahedral hexahalide species, fluorophilicity, and the atmospheric lifetimes of substitutes for chlorofluorocarbons (CFCs), based on physico-chemical and semi-empirical quantum chemical parameters and density functional theory (DFT) calculations.

The newly proposed prepotency evolution (PE) algorithm, based on the concept of "prepotency", converges faster than the conventional genetic algorithm. PE begins with an initial population distributed over the whole space of interest, converges at high speed, and does not wander around local optima as traditional genetic algorithms do. This behavior may result from an underlying directional selection in its evolution. To take advantage of chaotic mapping, a repeated prepotency evolution algorithm with a population initialized by chaotic numbers is developed, in which the chaotic map acts as a seeding mechanism that produces initial 'chromosome' populations that have never appeared before for the PE algorithm. The combination of chaotic mapping and PE enables the resulting algorithm (PECA) to probe many optima rapidly and effectively, owing to the fast convergence of PE and the relatively high population diversity guaranteed by chaotic mapping. When this scheme is tested in ANN training to predict the vibration frequencies of tetrahedral tetrahalide species, the results show that it greatly increases the chance of finding the global optimum.
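The chaotic seeding of repeated runs can be sketched as follows. This is not the PE algorithm, whose update rules the abstract does not give: `pick_best` is an explicit stand-in for the inner optimizer, and the logistic map, the seed value, the bounds, and all function names are assumptions made for illustration. The sketch shows only the outer loop, in which one continuing chaotic orbit supplies a different initial population for every run.

```python
def logistic_map(x, r=4.0):
    """One step of the logistic map (assumed map, r = 4)."""
    return r * x * (1.0 - x)


def chaotic_population(state, pop_size, n_genes, lower, upper):
    """Build one initial population from consecutive points of a chaotic orbit.

    Returns the population and the final chaotic state, so the next call
    continues the same orbit instead of restarting it.
    """
    population = []
    for _ in range(pop_size):
        member = []
        for _ in range(n_genes):
            state = logistic_map(state)
            member.append(lower + (upper - lower) * state)
        population.append(member)
    return population, state


def repeated_search(objective, inner_search, n_runs, pop_size, n_genes,
                    lower=-5.0, upper=5.0, seed=0.2024):
    """Run `inner_search` n_runs times, each from a fresh chaotic population.

    `inner_search(objective, population)` must return one candidate solution;
    the best candidate over all runs (lowest objective value) is returned.
    """
    state = seed
    best = None
    for _ in range(n_runs):
        population, state = chaotic_population(state, pop_size, n_genes, lower, upper)
        candidate = inner_search(objective, population)
        if best is None or objective(candidate) < objective(best):
            best = candidate
    return best


def pick_best(objective, population):
    """Hypothetical stand-in for the inner optimizer; NOT the PE algorithm."""
    return min(population, key=objective)


if __name__ == "__main__":
    def sphere(v):
        return sum(g * g for g in v)

    print(repeated_search(sphere, pick_best, n_runs=3, pop_size=5, n_genes=2))
```

Any fast-converging optimizer with the same call signature could replace `pick_best`; the outer loop keeps the best result over all restarts.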
This scheme can also be introduced into many other optimization procedures and may yield better results.

The QSAR study of the catalytic activity of iron-tetraphenylporphyrin chloride (Fe(TPP)Cl) and its 7 halogenated complexes in the oxidation reaction of isobutene was performed based on quantum chemistry calculations. The electronic structural characteristics, and their relationship wit

  • 【Online publication contributor】 Hunan University
  • 【Online publication issue】 2004, No. 03