开放骨架磷酸铝合成反应预测研究
Prediction Research on Open-framework Aluminophosphate Syntheses
【Author】 齐妙 (Qi Miao);
【Advisor】 吕英华 (Lü Yinghua);
【Author information】 东北师范大学 (Northeast Normal University), Physical Chemistry, 2010, PhD
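【Abstract (translated from the Chinese)】 The applications of inorganic microporous materials are closely tied to their pore structures: differences in the dimensionality, shape and area of the channels lead to large differences in application. Because of their unique, regular channel structures, inorganic microporous crystals are widely used in catalysis, adsorption, separation and ion exchange, so the design and synthesis of microporous crystals with novel structures and the development of new synthesis routes have long attracted attention. Among them, open-framework metal phosphates have been studied extensively in China and abroad because of their structural diversity and potential applications. The synthesis of inorganic microporous crystals is highly complex, and crystallization is affected by many factors, such as the source materials, gel composition, pH, template, solvent, and crystallization temperature and time. The main difficulty in studying and analyzing these syntheses is that the synthesis process is hard to control and the crystallization mechanism is complex and hard to understand and model. In the past few years, researchers have tried to build predictive models for new synthesis methods, in particular by applying statistical methods to the rational design of target materials, hoping to obtain well-performing models for specific structures that can be used to predict new synthetic materials. Although statistical methods have been widely applied in the analysis of chemical materials and have produced the expected results, comparatively little work has addressed analysis and prediction for open-framework aluminophosphate synthesis experiments.

In view of the rich structural chemistry of open-framework aluminophosphates, this thesis applies statistical machine learning theory and methods to extensive structure analysis and prediction for aluminophosphate molecular sieves. The main applications are: mining how strongly the synthesis parameters influence a specific structural feature of the product, which helps explain how that structure forms; and building models that predict the pore ring number and the type of the product from the synthesis parameters, which raises the success rate of rational synthesis experiments. The work is divided into two parts.

Part I. Several statistical machine learning methods are used to analyze and predict the relationship between the synthesis parameters and the product structures in the database:
1. Because the synthesis parameters in the database are strongly correlated, and partial least squares (PLS) handles multicollinearity among variables well, PLS is used to analyze how strongly the synthesis parameters influence the prediction of specific structural features of the product. Principal component analysis (PCA) is used to extract combined information on certain structural features, and regression models from the synthesis parameters to these features are built.
2. For synthesis reactions that use the same template, a back-propagation (BP) neural network is used to analyze how strongly the gel composition and combinations of its components influence the prediction of the product type.
3. Because the support vector machine (SVM) copes well with nonlinearity, high dimensionality and local minima, it is used to predict the pore ring number and the type of the product and to mine the template parameters that matter for producing materials with a specific pore ring number and structure type. Cross-validation is used to further improve the reliability of the classifier.
4. Because multiple linear regression requires that the variables not be strongly correlated, ridge regression is used to build a model that predicts the product type from the synthesis parameters, and the effect of the choice of ridge parameter and threshold on prediction performance is studied in detail.
5. A statistical method that combines PLS with logistic regression (PLS-LR) is also used to predict the product type from the synthesis parameters. PLS first removes the correlation among the synthesis parameters and yields new low-dimensional variables; logistic regression then predicts the product type from these variables; finally, the number of PLS components is chosen according to its effect on the prediction results.
Extensive experiments and analysis show that these methods can mine the influence of the synthesis parameters on specific structural features of the product and yield well-performing models that predict specific structures and types from the synthesis parameters.

Part II. New sampling methods are proposed to address the class imbalance in the aluminophosphate synthesis database. Class imbalance degrades classifier performance. For the imbalance found in the prediction experiments (for example, a ratio of 1:3 between the two classes), and building on the unsupervised fuzzy c-means method, this thesis proposes two guided over-sampling methods, FCMP1 and FCMP2, and two guided combined-sampling methods, FCMP1+Tomek and FCMP2+Tomek. These methods take into account both between-class and within-class imbalance, which overcomes the blind sampling of existing methods. The combined-sampling methods also remove noisy or borderline samples from both classes at the same time, which makes the two classes more separable. Experimental results show that predictions on the resampled data sets are clearly better than those on the original data, and that the proposed methods outperform several existing sampling methods.

Using statistical machine learning methods, this thesis builds a series of models that predict specific product structures from the synthesis parameters in the aluminophosphate synthesis database, and proposes new sampling methods that improve prediction performance under class imbalance. The work should make the rational design of molecular sieve frameworks more direct and efficient and reduce experimental cost, and it offers guidance for designing frameworks with special structures to meet functional requirements.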
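Item 4 of Part I describes ridge regression with a tunable ridge parameter and a decision threshold for predicting the product type. The record does not give the thesis's implementation, so the following is only a minimal sketch of that general idea in plain Python. The function names (`solve`, `ridge_fit`, `ridge_predict`), the parameter names (`lam`, `threshold`), the unpenalized intercept and the toy data are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Closed-form ridge fit: solve (X^T X + lam * I) w = X^T y.

    A constant column is appended for the intercept, which is left
    unpenalized (an assumption of this sketch).
    """
    Xb = [list(row) + [1.0] for row in X]
    p = len(Xb[0])
    A = [[sum(r[i] * r[j] for r in Xb) for j in range(p)] for i in range(p)]
    for i in range(p - 1):  # penalize every weight except the intercept
        A[i][i] += lam
    b = [sum(r[i] * yi for r, yi in zip(Xb, y)) for i in range(p)]
    return solve(A, b)


def ridge_predict(w, X, threshold=0.5):
    """Score each sample with the linear model, then cut at the threshold."""
    scores = [sum(wi * xi for wi, xi in zip(w, list(row) + [1.0])) for row in X]
    return [1 if s >= threshold else 0 for s in scores]


if __name__ == "__main__":
    # Hypothetical toy data: the two features are perfectly collinear, so
    # ordinary least squares is singular, while ridge (lam > 0) is solvable.
    X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
    y = [0, 0, 1, 1]
    w = ridge_fit(X, y, lam=0.1)
    print(ridge_predict(w, X, threshold=0.5))
```

In a sketch like this, `lam` trades fit against coefficient shrinkage and `threshold` shifts the boundary between the two predicted classes, which is why both would normally be tuned on held-out data.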
【Abstract】 The applications of microporous inorganic materials are closely related to their porous structures. For example, differences in the dimensionality, shape and volume of the pores lead to large differences in application. Microporous inorganic crystals are widely used in catalysis, adsorption, separation and ion exchange because of their unique, regular pore structures. The design and synthesis of microporous crystals with novel structures and the development of new synthesis routes have therefore long attracted attention. Among these materials, open-framework metal phosphates have been studied extensively, in China and abroad, owing to their structural diversity and potential applications. The synthesis of microporous inorganic crystals is very complex, and crystallization is affected by many factors, such as the source materials, the gel composition, the pH value, the template, the solvent, and the crystallization temperature and time. The main difficulty in studying and analyzing these syntheses is that the synthesis process is hard to control and the crystallization kinetics are complex and hard to understand and model. In recent years, researchers have tried to establish prediction models for new synthesis methods. In particular, they have applied statistical methods to the rational design of target materials in order to obtain good prediction models for specific structures, which can then guide the synthesis of new materials. Although such statistical methods have been widely used in the analysis of chemical materials and have given good predictive results, analysis and prediction for the synthesis of open-framework aluminophosphates (AlPOs) have received comparatively little study. In view of the rich structural chemistry of open-framework AlPOs, this thesis applies statistical machine learning theories and methods to analyze and predict the structures of AlPO molecular sieves.
These methods are applied mainly in two ways: to mine the influence of the synthesis parameters on specific structural features of the product, which provides a rational interpretation of how those structures form; and to establish models that predict the pore ring and type of the product from the synthesis parameters, which raises the success rate of rational synthesis experiments. The study is divided into two parts.

Part I: Using statistical machine learning methods, a series of analyses and predictions relating the synthesis parameters to the resultant structures is carried out on the AlPOs database, as follows.
1. Because the synthesis parameters are strongly correlated, partial least squares (PLS), which handles strong correlation among variables, is used to analyze the influence of the synthesis parameters on the prediction of specific resultant structures. Principal component analysis (PCA) is then used to extract combined information on some of these structures, and regression models from the synthesis parameters to the specific resultant structures are established.
2. For syntheses that use the same template, back-propagation neural networks (BPNNs) are used to analyze the influence of the gel compositions and their combinations on the prediction of the resultant type.
3. Because the support vector machine (SVM) copes well with nonlinearity, high dimensionality and local minima, it is used to predict the resultant pore ring and type. The template attributes that matter for producing materials with a specific pore ring and type are also mined, and cross-validation is used to further improve the reliability of the classifier.
4. Because multiple linear regression (MLR) requires that the variables not be strongly correlated, ridge regression (RR) is used instead to establish a model that predicts the resultant type from the synthesis parameters. The effect of the choice of ridge parameter and threshold on prediction performance is studied in detail.
5. A statistical method that combines PLS with logistic regression (LR), named PLS-LR, is also used to predict the resultant type from the synthesis parameters. First, PLS removes the correlation among the synthesis parameters and yields new low-dimensional variables. Then, LR predicts the resultant type from these low-dimensional variables. Finally, the number of PLS components is determined by analyzing how the prediction results change with the number of components.
Extensive experiments and analysis demonstrate that these statistical machine learning methods can mine the influence of the synthesis parameters on specific resultant structures and establish good models that predict specific resultant structures and types from the synthesis parameters.

Part II: Novel resampling methods are proposed to address the class imbalance in the AlPOs database. Class imbalance degrades classifier performance. Because class imbalance is present in the prediction experiments (for example, a ratio of 1:3 between the two classes), this thesis proposes two guided over-sampling methods based on fuzzy c-means (FCM), named FCMP1 and FCMP2, and two guided combined-sampling methods, named FCMP1+Tomek and FCMP2+Tomek. These methods consider both inter-class and intra-class imbalance, which overcomes the shortcoming of blind resampling. Moreover, the combined-sampling methods remove noisy or borderline samples from both classes simultaneously, which makes the two classes easier to separate. The experimental results demonstrate that predictions on the resampled datasets are better than those on the original dataset. Furthermore, compared with some existing resampling methods, the proposed methods give much better predictive results.

In this thesis, statistical machine learning methods are used to establish a series of models that predict specific resultant structures from the synthesis parameters in the AlPOs database, and novel resampling methods are proposed to improve predictive performance under class imbalance. This research should make the rational design of molecular sieve frameworks more straightforward and efficient and reduce experimental cost. In particular, it offers guidance for the rational design of molecular sieve frameworks with specific structures.
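The combined-sampling methods in Part II pair an over-sampling step with a Tomek cleaning step that removes noisy or borderline samples from both classes. The FCMP1 and FCMP2 over-sampling procedures are not specified in this record, so the sketch below covers only the generic Tomek-link step: two samples form a Tomek link when they are each other's nearest neighbour and carry different class labels. The function names, the Euclidean distance and the toy data are assumptions made for illustration, not the thesis's code.

```python
import math


def nearest_neighbor(i, X):
    """Index of the sample closest to X[i] (Euclidean), excluding i itself."""
    return min((j for j in range(len(X)) if j != i),
               key=lambda j: math.dist(X[i], X[j]))


def tomek_links(X, y):
    """Pairs (i, j), i < j, that are mutual nearest neighbours with different labels."""
    nn = [nearest_neighbor(i, X) for i in range(len(X))]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link, i.e. clean both classes."""
    drop = {k for pair in tomek_links(X, y) for k in pair}
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


if __name__ == "__main__":
    # Hypothetical toy data: two clusters plus one cross-class borderline pair.
    X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (2.0, 2.0),
         (2.2, 2.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
    y = [0, 0, 0, 0, 1, 1, 1, 1]
    print(tomek_links(X, y))
    print(remove_tomek_links(X, y))
```

Dropping both members of each link, rather than only the majority-class member, follows the abstract's statement that samples are removed from both classes; other variants of Tomek cleaning exist.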
【Key words】 Microporous Inorganic Materials; Aluminophosphate Syntheses; Machine Learning; Synthesis Analysis and Prediction; Cross Validation; Resampling;