有毒有机污染物正辛醇/空气分配系数(KOA)的定量预测方法
Quantitative Predictive Methods on Octanol-Air Partition Coefficient (KOA) of Toxic Organic Pollutants
【Author】 Li Xuehua (李雪花);
【Supervisor】 Chen Jingwen (陈景文);
【Author information】 Dalian University of Technology, Environmental Engineering, 2008, PhD
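【Abstract (translated from the Chinese)】 The octanol/air partition coefficient (KOA) is a key parameter describing the partitioning of pollutants between air and environmental organic phases, and it is important for assessing properties such as long-range environmental transport potential and bioaccumulation. Experimental determination of KOA is costly and time-consuming, and reference standards are still unavailable for some compounds. Simple and accurate theoretical methods are therefore needed to estimate the KOA of toxic organic pollutants. This thesis develops fragment constant models and a 3D quantitative structure-activity relationship (QSAR) model based on Dragon descriptors for predicting KOA, and comprehensively evaluates three existing KOA prediction methods together with the two models established here.
1. A fragment constant model was first established for predicting the KOA of halogenated aromatic compounds at different temperatures. Using the 5 fragment constants and 1 structural correction factor defined by the model, KOA values of halogenated aromatic compounds can be predicted between 10℃ and 40℃. The training set consisted of aromatic compounds containing C, H, O, Cl and Br atoms, so the model applies to chlorinated and brominated aromatic compounds such as chlorobenzenes (CBs), polychlorinated naphthalenes (PCNs), polychlorinated biphenyls (PCBs), polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs), polycyclic aromatic hydrocarbons (PAHs) and polybrominated diphenyl ethers (PBDEs), all of which are typical persistent toxic substances (PTS). Internal validation (jackknife test) and external validation (316 data points) showed that the model is robust and has good predictive ability. Compared with QSAR models based on quantum chemical descriptors, the KOA model established here is simpler and more accurate.
2. To extend the applicability of the KOA prediction model, experimentally determined logKOA values at ambient temperature (25℃) were comprehensively collected for 272 compounds. The data set includes CBs, PCBs, PCNs, PCDD/Fs, PBDEs, PAHs, organochlorine pesticides, polyfluorinated sulfonamides, hydroxy alkyl nitrates, sulfonamidoethanols, telomer alcohols, halogenated hydrocarbons, ethers, ketones, aldehydes, acids, esters and other organic pollutants. Based on these 272 experimental logKOA values, the stepwise regression-partial least squares (SR-PLS) variable selection method was used to determine the best combination of atom-centered fragments, and a fragment constant model was established for predicting the KOA of toxic organic pollutants at a single temperature. The optimal model, containing 23 atom-centered fragments, explains 97.7% of the total variance of the dependent variable, with a root mean square error (RMSE) of prediction of 0.43. Internal validation (leave-one-out and leave-many-out) and external validation showed that this fragment constant model is robust and has good predictive ability for compounds within its application domain. The single-temperature fragment constant model has a larger application domain and can be used to predict KOA at 25℃ for a wide variety of organic pollutants.
3. Based on the 272 logKOA values, a 3D-QSAR model was established using Dragon descriptors. After SR-PLS variable selection, the optimal model includes 9 molecular structure descriptors (X1sol, GATS2p, C006, C025, H050, Mor04p, L3s, C005, N072), explains 98.2% of the total variance of the dependent variable, and has a prediction RMSE of 0.38. Mechanistic analysis of the optimal model indicates that the main factors governing KOA are the dispersion interactions of the molecule in octanol, the hydrogen-bonding ability of specific structural fragments, 3D structural features relating to molecular shape and symmetry, and the electronic effects of conjugated systems. Internal validation (leave-many-out, leave-one-out and Y-randomization tests) and external validation showed that the 3D-QSAR model is robust and has very good predictive ability for compounds within its application domain.
4. From an application perspective, the single-temperature fragment constant model and the Dragon-descriptor 3D-QSAR model established in this thesis were evaluated and compared with three existing quantitative prediction methods for KOA: the direct calculation method based on the octanol/water partition coefficient (KOW) and the Henry's law constant (KH), the theoretical calculation method based on the solvation free energy, and the QSAR model based on quantum chemical descriptors. The results show the following. (1) The KOW-KH direct calculation method has a very large theoretical application domain and is fairly robust. However, measured KOW and KH values are lacking for most compounds, so its accuracy depends heavily on the accuracy of predicted KOW and KH values, whose errors may combine to enlarge the uncertainty of the predicted KOA. (2) The theoretical calculation method based on the solvation free energy has a very large application domain, but the relatively large error in the calculated solvation free energy reduces the accuracy of the predicted KOA. (3) In the QSAR model based on quantum chemical descriptors, the molecular structure descriptors have clear physicochemical meaning and can distinguish isomers. (4) The fragment constant method has a well-defined algorithm, fragment assignment is simple and fast, and it predicts well for compounds within its application domain, but its scope is limited by how well the training set covers the compounds of interest. (5) The Dragon-descriptor 3D-QSAR model distinguishes isomers very well, its molecular structure descriptors are easy to interpret mechanistically, and it gives more accurate predictions for compounds within its application domain.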
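As a rough illustration of how a fragment constant model of the kind summarized above is applied, the following minimal Python sketch sums fragment contributions to obtain logKOA. The fragment names, the constants and the intercept are hypothetical placeholders chosen for illustration; they are not the values fitted in the thesis, and the thesis's structural correction factor is not reproduced here. The sketch also rejects fragments outside its table, which mirrors the point that the application domain is limited by the coverage of the training set.

```python
from typing import Dict

# Hypothetical fragment constants (illustrative values only, NOT from the thesis).
FRAGMENT_CONSTANTS: Dict[str, float] = {
    "aromatic_C": 0.40,
    "aromatic_Cl": 0.55,
}
INTERCEPT = 0.5  # hypothetical regression intercept


def predict_log_koa(
    fragment_counts: Dict[str, int],
    constants: Dict[str, float] = FRAGMENT_CONSTANTS,
    intercept: float = INTERCEPT,
) -> float:
    """Additive fragment model: logKOA = intercept + sum(n_i * f_i)."""
    # A fragment missing from the table means the compound lies outside
    # the model's application domain, so refuse to extrapolate.
    unknown = set(fragment_counts) - set(constants)
    if unknown:
        raise ValueError(f"fragments outside application domain: {sorted(unknown)}")
    return intercept + sum(n * constants[name] for name, n in fragment_counts.items())


# Example: a hypothetical dichlorinated benzene ring (6 aromatic C, 2 aromatic Cl).
example = predict_log_koa({"aromatic_C": 6, "aromatic_Cl": 2})
```

With real fitted constants, the same additive form would be evaluated once per compound; a temperature-dependent variant would need temperature-specific constants, which this sketch does not attempt.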
【Abstract】 The octanol-air partition coefficient (KOA) is a key physicochemical parameter for describing the partitioning of toxic organic pollutants between air and environmental organic phases. KOA plays an important role in evaluating the global distribution, transport and biomagnification of toxic organic compounds. Experimental determination of KOA is costly, time-consuming, and restricted by the lack of sufficiently pure chemicals, so there is a need for a simple and accurate method to estimate KOA. In this thesis, fragment constant models and a 3D quantitative structure-activity relationship (QSAR) model based on Dragon descriptors were developed, and five predictive methods for KOA were evaluated and compared. Firstly, a fragment constant method based on 5 fragment constants and 1 structural correction factor was developed for predicting logKOA at temperatures ranging from 10℃ to 40℃. Because the training set consisted of aromatic compounds containing C, H, O, Cl and Br atoms, the fragment constant model can be applied to a wide range of chlorinated and brominated aromatic pollutants, such as chlorobenzenes (CBs), polychlorinated naphthalenes (PCNs), polychlorinated biphenyls (PCBs), polychlorinated dibenzo-p-dioxins and dibenzofurans (PCDD/Fs), polycyclic aromatic hydrocarbons (PAHs), and polybrominated diphenyl ethers (PBDEs), all of which are typical persistent toxic substances (PTS). Internal validation (jackknife test) and external validation indicated that the fragment constant model has good robustness and predictive ability. 
Compared with a QSAR model based on quantum chemical descriptors, the present model has the advantage of being much easier to implement. Secondly, to extend the utility of the fragment constant method, 272 experimental values of logKOA at 25℃ were collected from the literature. The KOA data set covered a wide range of compound classes, such as PCBs, CBs, PCNs, PCDD/Fs, PBDEs, PAHs, organochlorine pesticides, hydroxy alkyl nitrates, polyfluorinated sulfonamides, sulfonamidoethanols, telomer alcohols, halogenated hydrocarbons, alcohols, ketones, aldehydes, acids, esters and ethers. Using these 272 compounds as the training set, a fragment constant model was developed for predicting logKOA values at ambient temperature (25℃). The best combination of 23 atom-centered fragments was selected by the stepwise regression-partial least squares (SR-PLS) variable selection method. For the training set, R2 = 0.977 and root mean square error (RMSE) = 0.43. Internal and external validation indicated that the fragment constant model is well suited to predicting logKOA of new compounds within its application domain (AD). This fragment constant model can be used to estimate KOA at 25℃ for a wide, heterogeneous set of organic compounds. Thirdly, based on the same training set of 272 compounds, a 3D-QSAR model for predicting logKOA was developed using Dragon descriptors. The best combination of 0D to 3D Dragon descriptors was selected by the SR-PLS variable selection method. The optimal model contained nine descriptors (X1sol, GATS2p, C006, C025, H050, Mor04p, L3s, C005, N072), giving R2 = 0.982 and RMSE = 0.38. The optimal QSAR model indicates that the main factors governing KOA are dispersive interactions in octanol solution, hydrogen-bonding potential, 3D structural features of symmetry and shape, and the electronic effects of conjugated systems. Internal and external validation showed that the 3D-QSAR model has good robustness and good predictive power for compounds within the AD. Currently, there are five predictive methods for KOA: (a) the fast estimation method that employs the octanol-water partition coefficient (KOW) and the Henry's law constant (KH), (b) a direct method that computes the solvation free energy of organic molecules in octanol (ΔGs) using quantum chemical solvent models, (c) the QSAR model that employs quantum chemical descriptors, (d) the fragment constant model, and (e) the QSAR model based on Dragon descriptors. The five methods were evaluated and compared in terms of goodness of fit, robustness, predictive power, AD and algorithm. The KOW-KH method has a broad AD, but it is limited by the lack of sufficient data on KOW, KH and their temperature dependence. In addition, experimental or estimation errors in KOW and KH values can be propagated or magnified by the division. In theory the AD of the ΔGs method can be extended to every compound; nevertheless, the method relies on accurate calculation of ΔGs. The quantum chemical descriptors of the QSAR method are favorable for mechanistic interpretation, and this method can distinguish isomers (high resolution). The fragment constant model is a simple and transparent method, but its AD is restricted by the coverage of the training set. The 3D-QSAR model based on Dragon descriptors gives more accurate predictions for compounds within the AD and has high resolution for isomers.
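The KOW-KH estimation mentioned in the abstract can be sketched as follows. The sketch assumes the commonly used relationship KOA = KOW / KAW, where KAW = KH / (RT) is the dimensionless air-water partition coefficient and KH is given in Pa·m3/mol. It ignores the difference between water-saturated and dry octanol, and it assumes independent errors in logKOW and logKAW when combining uncertainties. The function names are illustrative and do not come from the thesis.

```python
import math

R = 8.314462618  # gas constant, J/(mol*K), equivalently Pa*m3/(mol*K)


def log_koa_from_kow_kh(log_kow: float, kh_pa_m3_per_mol: float,
                        temp_k: float = 298.15) -> float:
    """Estimate logKOA as logKOW - logKAW, with KAW = KH / (R * T)."""
    if kh_pa_m3_per_mol <= 0 or temp_k <= 0:
        raise ValueError("KH and temperature must be positive")
    kaw = kh_pa_m3_per_mol / (R * temp_k)  # dimensionless air-water coefficient
    return log_kow - math.log10(kaw)


def combined_log_uncertainty(sd_log_kow: float, sd_log_kaw: float) -> float:
    """Standard deviation of logKOA, assuming independent input errors.

    Dividing KOW by KAW subtracts their logarithms, so the variances add.
    """
    return math.sqrt(sd_log_kow ** 2 + sd_log_kaw ** 2)


# Hypothetical inputs: logKOW = 5.0, KH = 10 Pa*m3/mol, at 25 degrees C.
estimate = log_koa_from_kow_kh(5.0, 10.0)
spread = combined_log_uncertainty(0.3, 0.4)  # larger than either input error
```

Because the variances add, the uncertainty of the estimated logKOA is never smaller than the larger of the two input uncertainties, which is one way to read the abstract's remark that errors are propagated or magnified by the division.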
【Key words】 Octanol-air partition coefficient; QSAR; Fragment constant model; Dragon descriptors; Evaluation; Stepwise regression; PLS; Validation; Domain;