
Multi-way Calibration for Quantitative Analysis in Complex Systems and Studies of the Interactions between Small Molecules and DNA

【Author】 Zou Hongyan (邹鸿雁)

【Supervisor】 Wu Hailong (吴海龙)

【Author information】 Hunan University, Analytical Chemistry, 2009, PhD

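【Abstract (translated from the Chinese)】 This thesis explores methods and applications for several important problems in the use of multi-way calibration for quantitative analysis in chemometrics. Second-order calibration methods are also used to investigate the mechanisms by which small molecules interact with DNA. The thesis covers the following aspects:

1. A new trilinear decomposition algorithm is proposed: the robust information self-extracting asymmetric trilinear decomposition algorithm (RISEATD). The method incorporates total least squares theory into the decomposition of the trilinear model and extracts useful information quickly and effectively. It also combines the iterative features of parallel factor analysis (PARAFAC-ALS) and alternating trilinear decomposition (ATLD), resolving the three underlying loading matrices of a three-way data array in an asymmetric manner. Its distinguishing feature is that, when the system is noisy or strongly collinear, the resolved profiles of the analytes of interest remain very stable and convergence is relatively fast. The method was tested on simulated fluorescence data arrays and on real excitation-emission fluorescence data, and compared with the conventional PARAFAC, PARAFAC-ALS and ATLD algorithms. The results show that RISEATD performs very well in three-way data decomposition.

2. Second-order calibration methods, which possess the "second-order advantage", were combined with three-dimensional excitation-emission fluorescence spectroscopy so that "mathematical separation" replaces "chemical separation". On this basis a new fluorimetric method is proposed for the direct determination of terazosin hydrochloride in plasma and tablet samples. PARAFAC, alternating penalty trilinear decomposition (APTLD) and RISEATD were used to resolve the three-way fluorescence data, achieving quantitative determination of terazosin hydrochloride in blood samples. The method is fast and simple, requires no complex sample pretreatment, is inexpensive, and gives satisfactory quantitative results. In addition, PARAFAC, APTLD and RISEATD were combined with the standard addition method to quantify terazosin hydrochloride in real tablet samples. The results compare satisfactorily with those of a standard chromatographic method.

3. This chapter proposes a method that combines excitation-emission matrix fluorescence with second-order calibration to determine dextromethorphan and quinidine in plasma and urine. Because the fluorescence of these two drugs overlaps severely with the spectra of the plasma and urine matrices, determining them in blood and urine samples directly by spectroscopy without separation is almost impossible. Three second-order calibration methods, PARAFAC, self-weighted alternating trilinear decomposition (SWATLD) and APTLD, were used to exploit the second-order advantage, replacing chemical separation with mathematical separation and quantifying dextromethorphan and quinidine under severe interference.

4. With the development of modern high-order analytical instruments and data acquisition techniques, and especially the application of second-order calibration methods to three-way data arrays, it has become possible to study the interaction of drugs with DNA. Even when complex chemical equilibria exist in the mixture, the interaction mechanism between the components of interest and DNA can be predicted conveniently. The most notable advantage of second-order calibration is that the decomposition of a three-way array is usually unique, so the relative concentrations and spectra of the components of interest in a complex system can be resolved directly. This chapter uses UV and fluorescence analysis combined with second-order calibration to study the interaction of pirarubicin with DNA. The fluorescence data were resolved with the PARAFAC algorithm and the alternating normalization-weighted error (ANWE) algorithm, yielding the excitation spectra, emission spectra and relative concentrations of each component in the dynamic equilibrium system. This provides more direct and useful information for studying the mechanism of the pirarubicin-DNA interaction, which is of great help for understanding the anticancer mechanisms of anticancer drugs and for the design and synthesis of new drugs.

5. With the large-scale, sustained use of pesticides, residues may be present in crops, vegetables, fruits and even animals, threatening human life and health. Studying the potential damage that pesticides cause to DNA therefore has practical significance for protecting people from pesticide hazards. Three-dimensional fluorescence combined with the second-order calibration methods PARAFAC and APTLD was used to study the interaction of carbaryl with DNA, providing useful information on the mechanism of this interaction.

6. The least squares support vector machine (LS-SVM) is increasingly widely used in multivariate calibration modeling because of its strong performance. However, its performance depends to a large extent on the homogeneity of the model errors and the uniformity of the data distribution. This work examines the representativeness of training samples and optimized sample weighting in multivariate calibration modeling. Because the spectral space of the samples is multidimensional and complex and the sample selection process is uncertain, accurately estimating how well the training samples represent the whole sample space remains difficult. To address this, and considering that representativeness is hard to estimate by examining individual samples, the idea of globally optimized sample weighting was combined with LS-SVM to give a new algorithm, the optimized sample-weighted LS-SVM. The algorithm applies non-negative weights to the original training samples and considers both model complexity and predictive ability during calibration modeling; the optimal sample weights are found by particle swarm optimization. Application to real benchmark data sets shows that, when the original calibration samples are poorly representative, the algorithm does improve the predictive performance of the model.

7. In multivariate calibration modeling of spectral data, conventional wavelength selection methods discard certain wavelengths and thereby lose useful information. To obtain more flexible variable selection and modeling, a variable-weighted version of LS-SVM based on particle swarm optimization is proposed for selecting spectral variables in multivariate calibration. The variable weighting strategy avoids manually deleting or retaining variables and instead allows non-negative weighting of the variables. Using particle swarm optimization to obtain non-negative variable weights can essentially be viewed as an optimized rescaling of the wavelength variables. If particle swarm optimization is also used to optimize the other model parameters at the same time, the variable-weighted support vector machine becomes a fully automatic modeling method that needs no manual parameter tuning; it is therefore more flexible and more intelligent than conventional variable selection and modeling methods, and it is computationally fast. Application to real benchmark data sets shows that the variable-weighted LS-SVM does achieve an optimized scaling of the variables in multivariate calibration models and retains more structural information, which helps to obtain regression models with better training and prediction ability.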
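The trilinear model behind the second-order calibration methods named in this abstract can be illustrated with a minimal PARAFAC-ALS sketch in Python. It is not the RISEATD algorithm of the thesis, whose update equations are not given here; it only shows how a three-way array is resolved into three loading matrices by alternating least squares. The function names, the random initialization and the stopping rule are assumptions made for illustration.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product: (I x N) and (J x N) -> (I*J x N)."""
    n = U.shape[1]
    return np.einsum("in,jn->ijn", U, V).reshape(-1, n)


def parafac_als(X, n_components, n_iter=1000, tol=1e-12, seed=0):
    """Fit x[i,j,k] ~ sum_n a[i,n] * b[j,n] * c[k,n] by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    # Random non-negative starting values (an assumption; other starts are possible).
    A = rng.random((I, n_components))
    B = rng.random((J, n_components))
    C = rng.random((K, n_components))
    # Unfold the array along each of its three modes.
    X1 = X.reshape(I, J * K)
    X2 = X.transpose(1, 0, 2).reshape(J, I * K)
    X3 = X.transpose(2, 0, 1).reshape(K, I * J)
    total = np.sum(X ** 2)
    prev = np.inf
    for _ in range(n_iter):
        # Update one loading matrix at a time while the other two are held fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X3 @ np.linalg.pinv(khatri_rao(A, B).T)
        loss = np.sum((X1 - A @ khatri_rao(B, C).T) ** 2)
        # Stop when the residual sum of squares no longer changes appreciably.
        if abs(prev - loss) <= tol * total:
            break
        prev = loss
    return A, B, C


# Usage on a simulated, noise-free two-component array (hypothetical sizes).
rng = np.random.default_rng(1)
A0, B0, C0 = (rng.standard_normal((d, 2)) for d in (6, 7, 8))
X = np.einsum("in,jn,kn->ijk", A0, B0, C0)
A, B, C = parafac_als(X, n_components=2)
X_hat = np.einsum("in,jn,kn->ijk", A, B, C)
print("relative fit error:", np.linalg.norm(X - X_hat) / np.linalg.norm(X))
```

In a second-order calibration setting, one mode would index the samples and the other two the excitation and emission wavelengths. The sample-mode loadings of the analyte for the calibration samples would then be regressed against their known concentrations to predict the unknowns.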

【Abstract】 This research focuses on multi-way calibration for quantitative analysis in complex chemical systems and on the mechanisms by which small molecules interact with DNA.

The robust information self-extracting asymmetric trilinear decomposition algorithm (RISEATD) has been developed. It is based on the total least squares principle and uses an asymmetric scheme with partial bilinearization to calculate the three underlying profile matrices when resolving the trilinear model, so that useful information is obtained quickly. The proposed algorithm combines the decomposition strategies of PARAFAC-alternating least squares (PARAFAC-ALS) and alternating trilinear decomposition (ATLD). Results on simulated data and on real excitation-emission spectral data sets show that RISEATD retains the second-order advantage, quantifying the analyte(s) of interest in the presence of potentially unknown interferents even when noise and collinearity are high. Compared with the PARAFAC, PARAFAC-ALS and ATLD algorithms, the developed method gives acceptable results.

Collecting excitation-emission matrix (EEM) fluorescence spectra of a mixture and combining them with second-order calibration methods makes it possible to quantify analytes even in the presence of uncalibrated interferences, a property known as the "second-order advantage". Using "mathematical separation" in place of "chemical separation", three second-order calibration methods, based on PARAFAC, alternating penalty trilinear decomposition (APTLD) and RISEATD respectively, were coupled with excitation-emission matrix fluorescence spectroscopy for the direct determination of terazosin hydrochloride (THD) in human plasma samples.
Meanwhile, the three algorithms, combined with the standard addition procedure, were applied to the determination of terazosin hydrochloride in tablets, and the results were validated by high-performance liquid chromatography with fluorescence detection.

Three second-order calibration methods were presented for the accurate and reliable quantitative analysis of dextromethorphan and quinidine in biological fluids (human plasma and urine) by excitation-emission matrix fluorescence. The PARAFAC, self-weighted alternating trilinear decomposition (SWATLD) and APTLD algorithms were applied and their performance was compared. All three methods gave good results.

With the development of high-order analytical instruments and chemometric algorithms, it has become easier to obtain and resolve multi-dimensional data from complex systems. The combination of excitation-emission matrix fluorescence and second-order calibration methods provides a powerful tool for studying parallel competitive binding reactions of multiple chemical components with DNA in the presence of interferents. The relative equilibrium concentrations of the components can be obtained directly. In this work, UV-vis spectroscopy and fluorescence were combined to study the binding of DNA with the anthracycline antibiotic pirarubicin (THP). Ethidium bromide (EB) was used as a fluorescence probe to study the competitive binding of THP with DNA by excitation-emission matrix fluorescence coupled with PARAFAC and the alternating normalization-weighted error (ANWE) algorithm, both of which have the second-order advantage. The relative equilibrium concentrations of EB-DNA, EB and THP in the equilibrium system can be obtained directly, which makes it possible to determine the reaction pattern of different interacting pairs in a mixed medium.

Pesticides are used indiscriminately.
Residues may remain in fruits, vegetables, and ground and surface waters, posing a potential hazard to consumers, and this toxicity has raised public concern about ecosystem and human health. Investigating genotoxicity and genetic damage through the interaction of DNA with insecticides is therefore very important. Competitive binding of carbaryl and the fluorescence probe EB with DNA was studied by excitation-emission fluorescence spectroscopy, yielding a three-dimensional excitation-emission fluorescence data array. Second-order calibration methods based on the PARAFAC and APTLD algorithms were used to resolve this data array.

The least squares support vector machine (LS-SVM) has been introduced into multivariate calibration by many investigators because of its attractive features and promising empirical performance. However, model performance depends strongly on the homogeneity of the model errors and the uniformity of the data sampling. The representativeness of training samples for multivariate calibration is discussed, and the concept of weighted sampling is introduced to multivariate calibration. Owing to the high dimensionality and complexity of the spectral data space and the uncertainty involved in the sampling process, the representativeness of training samples in the whole sample space is difficult to evaluate, and selecting representative training samples depends largely on experience. If the training samples fail to represent the sample space, predictions for new samples can be degraded. To solve this problem, a new algorithm for multivariate calibration is developed by combining optimized sampling with the least squares support vector machine, in which the original training samples are non-negatively weighted and the complexity and predictive ability of the model are considered simultaneously.
Two real data sets are investigated, and the results demonstrate that sample-weighted LS-SVM models can improve predictive ability when the original calibration samples are poorly representative.

In multivariate calibration, all wavelength variables may carry some molecular information, so it seems more advisable to consider all possible variables rather than apply traditional variable selection. Based on the particle swarm optimization (PSO) algorithm, a more flexible variable selection and modeling method, the variable-weighted least squares support vector machine, is proposed. The variable weighting strategy assigns non-negative weights to the variables rather than removing or retaining them outright. Using PSO to seek the non-negative variable weights can be seen as an optimized rescaling of the variables in a certain sense. If PSO is also used to search for the other parameters of the LS-SVM model at the same time, the variable-weighted LS-SVM becomes a fully automatic modeling approach and is therefore more flexible and intelligent than traditional variable selection methods.
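The variable-weighting idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation. The weights are applied by rescaling each variable inside an RBF kernel, the objective is the RMSE on a held-out validation set, `gamma` and `sigma` are fixed rather than optimized, and the PSO constants (inertia 0.7, acceleration 1.5) are generic textbook choices. All function names are hypothetical.

```python
import numpy as np


def weighted_rbf(X1, X2, w, sigma):
    """RBF kernel evaluated after rescaling each variable by its weight."""
    Z1, Z2 = X1 * w, X2 * w
    d2 = (Z1 ** 2).sum(1)[:, None] + (Z2 ** 2).sum(1)[None, :] - 2.0 * Z1 @ Z2.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))


def lssvm_fit(X, y, w, gamma=100.0, sigma=1.0):
    """Solve the LS-SVM linear system [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = weighted_rbf(X, X, w, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual coefficients alpha


def lssvm_predict(X_train, X_new, w, b, alpha, sigma=1.0):
    return weighted_rbf(X_new, X_train, w, sigma) @ alpha + b


def validation_rmse(w, X_tr, y_tr, X_va, y_va, gamma=100.0, sigma=1.0):
    """Assumed objective: prediction error on held-out validation samples."""
    b, alpha = lssvm_fit(X_tr, y_tr, w, gamma, sigma)
    pred = lssvm_predict(X_tr, X_va, w, b, alpha, sigma)
    return float(np.sqrt(np.mean((pred - y_va) ** 2)))


def pso_minimize(objective, dim, n_particles=20, n_iter=40, seed=0):
    """Basic global-best PSO over non-negative weight vectors."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.0, 2.0, (n_particles, dim))
    pos[0] = 1.0  # include the unweighted model as one candidate
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = pbest[np.argmin(pbest_val)].copy()
    g_val = float(pbest_val.min())
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (g - pos)
        pos = np.clip(pos + vel, 0.0, None)  # keep every weight non-negative
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        if pbest_val.min() < g_val:
            g_val = float(pbest_val.min())
            g = pbest[np.argmin(pbest_val)].copy()
    return g, g_val
```

Because the all-ones weight vector is one of the starting particles, the returned weights never score worse on the validation set than the unweighted model. A weight driven towards zero effectively removes a variable, while intermediate weights rescale it, which is the sense in which weighting generalizes variable selection.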

  • 【Online publication contributor】 Hunan University
  • 【Online publication year and issue】 2011, Issue 05