Research on the Application of Isomap to Online Near-Infrared Spectroscopic Determination in Traditional Chinese Medicine Production

【Author】 覃锋

【Supervisor】 杨辉华

【Author information】 Guilin University of Electronic Technology, Computer Applications, 2008, Master's thesis

【Abstract】 Quality control is the core issue in the modernization of Traditional Chinese Medicine (TCM). Current practice relies on TCM fingerprint techniques for quality control, but their long analysis time rules out online quality analysis. Near-infrared (NIR) spectroscopy is fast and reflects multiple physicochemical properties of the analyte, which makes it well suited to online detection in TCM production processes.

This dissertation mainly studies regression modeling methods for TCM NIR spectra, that is, building quantitative prediction models that relate NIR spectra to chemical component contents and quality information. It first applies the conventional algorithm, partial least squares (PLS), to NIR modeling, and an NIR modeling software package was developed in Visual C++. The software provides a range of spectral preprocessing and wavelength selection algorithms and supports both offline and online modeling.

Because the relationship between NIR spectra and the physicochemical properties to be measured is nonlinear, the dissertation focuses on introducing manifold learning into NIR modeling and proposes several improved algorithms. Manifold learning is a recently proposed class of nonlinear dimensionality reduction methods with broad application prospects; it can reveal meaningful low-dimensional structure in high-dimensional data. One such algorithm, isometric mapping (Isomap), is studied, introduced into NIR modeling, and improved. First, the distance formula and the choice of the number of neighbors K in Isomap are addressed through three extensions: kIsomap introduces a kernel function to modify the distance formula; dIsomap selects the number of neighbors according to the sample distribution density; and kdIsomap integrates kIsomap and dIsomap. Combined with PLS, a new NIR modeling method is proposed: the Isomap variants first perform nonlinear dimensionality reduction on the NIR spectral data, and PLS then performs linear dimensionality reduction and builds the calibration model. Applied to two public benchmark NIR datasets, these methods gave better modeling results than PLS alone.

Isomap was proposed as a nonlinear dimensionality reduction method; it cannot process new samples and cannot be used for supervised learning. Drawing on the ability of the recently proposed Kernel Isomap algorithm to process new samples, and on the relationship between Isomap and kernel principal component analysis (KPCA) as well as that between KPCA and kernel principal component regression (KPCR), the dissertation extends Isomap into a supervised algorithm, SKIsomap, which can both process new samples and perform regression, broadening the range of applications of Isomap. SKIsomap was applied to build regression calibration models for stilbene glycoside and icariin during the extraction of Anshen Bunao Ye (安神补脑液), with good results.

Two other manifold learning algorithms were also studied: locally linear embedding (LLE) and Laplacian regularized least squares (LapRLS). An LLE-PLS nonlinear modeling method and a LapRLS semi-supervised regression method for NIR spectra were proposed and applied to build a regression calibration model for salvianolic acid B content during the column chromatography of Salvianolate (丹参多酚酸盐).

The dissertation integrates TCM fingerprint techniques, NIR online detection, manifold learning algorithms, and automatic control in the TCM production process. This makes possible real-time monitoring of the groups of chemical components in the drug system and real-time control of the production process, which is important for ensuring uniform, stable, and controllable product quality.
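As a rough illustration of the step that the kIsomap and dIsomap extensions modify, the sketch below computes the graph (geodesic) distances that standard Isomap uses in place of straight-line distances. It is a minimal sketch under stated assumptions, not the dissertation's implementation: it uses plain Euclidean distances and a fixed K, and the function name `geodesic_distances` and the semicircle test data are hypothetical.

```python
import math


def geodesic_distances(points, k):
    """Approximate geodesic distances as in the first stages of standard Isomap.

    Assumptions (not from the dissertation): Euclidean input distances,
    a fixed neighbor count k, and a symmetrized k-nearest-neighbor graph.
    """
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]

    # Start with no edges: infinite distance everywhere except the diagonal.
    graph = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]

    # Connect each point to its k nearest neighbors (undirected edges).
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: euclid[i][j])
        for j in others[:k]:
            graph[i][j] = euclid[i][j]
            graph[j][i] = euclid[i][j]

    # Floyd-Warshall: shortest path lengths through the neighborhood graph.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph


# Seven points on a unit semicircle: a one-dimensional curve embedded in 2-D.
points = [(math.cos(i * math.pi / 6), math.sin(i * math.pi / 6)) for i in range(7)]
geo = geodesic_distances(points, k=2)
```

For the two endpoints of the semicircle, the straight-line distance is about 2, while the graph distance `geo[0][6]` follows the curve and is therefore closer to the arc length π. Standard Isomap would then apply classical multidimensional scaling to this distance matrix to obtain the low-dimensional coordinates.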
