
近红外光谱技术在多组分检测及模式识别中的应用研究

Study on Multi-Component Determination and Pattern Recognition Analysis Using Near Infrared Spectra Technique

【Author】 Liu Boping (刘波平)

【Advisor】 Wang Junde (王俊德)

【Author information】 Nanjing University of Science and Technology, Applied Chemistry, 2011, PhD

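【Abstract (translated from the Chinese original)】 This thesis combines near infrared spectroscopy with chemometric methods and studies, in some depth, methods for the rapid determination of multi-component content in foods, feedstuff and compound fertilizer, and for pattern recognition and discriminant analysis. The methods established here were successfully applied to product analysis and quality control, bringing out the speed, accuracy and practicality of near infrared spectroscopy. The chemometric methods were also developed further in the course of solving these specific problems. Because the products are complex in composition and their components interfere strongly with one another, the near infrared absorbance differences available for analysis are usually very small, the signals are weak, and the absorption peaks are broad and overlap. As a result, no characteristic peak free of interference can be found for a target component, which makes simultaneous multi-component quantification and identification difficult. For this reason, PLS, KPLS, PLS-BP, GRNN and Elman methods are proposed for building linear and nonlinear multi-component quantitative models, and PCA-Mahalanobis distance is proposed for building pattern recognition models. The main research content is as follows.

1. Application and study of partial least squares near infrared spectroscopy in multi-component analysis

A near infrared PLS method was established for determining the content of 7 fatty acids in lean meat. When the PLS model is built, the raw spectra must first be processed mathematically to filter noise and raise the signal-to-noise ratio. Experiments showed that light scattering is the main factor affecting the near infrared spectra. In addition, because PLS represents both the dependent-variable matrix and the independent-variable matrix by principal components when it extracts them, it reduces dimensionality effectively and removes possible multicollinearity among the independent variables, which clearly improves the reliability and accuracy of the results. A kernel function was also used to build a nonlinear multi-component KPLS model for the three components N, P2O5 and K2O in compound fertilizer. Through a nonlinear mapping into a high-dimensional space, KPLS extracted nonlinear information hidden in the spectra. Compared with the linear PLS system, KPLS improved both the prediction accuracy and the correlation for the multi-component contents, most clearly for P2O5 and K2O.

2. Partial least squares and BP neural networks for multi-component quantitative analysis by near infrared spectroscopy

To address over-fitting and slow learning in BP networks, PLS was used to compress the spectral data fed to the BP network, and a PLS-BP method was established for the simultaneous determination of four components in feedstuff (moisture, ash, protein and phosphorus) and of four amino acids in feedstuff. Compared with the BP method, PLS-BP feeds less data into the network, which greatly increases computation speed and reduces the number of training iterations, and the prediction accuracy of the model is also better than that of the BP model. The study further proposes using PLS to extract the principal components and weights of the spectra X and the components Y, which addresses the problem that the number of hidden-layer nodes and the initial weights of the input and output layers of a near infrared BP model are otherwise chosen by experience. A PLS-BP near infrared multi-component prediction model was built for three nutritional components of potato: crude fibre, starch and protein. Compared with a plain BP network, the combined PLS and BP network trained better, ran faster, reached an optimal network, and gave higher accuracy. This work offers some theoretical and practical guidance for constructing BP network structures for near infrared spectroscopy.

3. Partial least squares and generalized regression neural networks for multi-component quantitative analysis by near infrared spectroscopy

The GRNN method was introduced into near infrared multi-component analysis. PLS was used to compress the spectral data fed to the network, and a method was established for determining three components in feedstuff: water-soluble chloride, crude fibre and fat. PLS-GRNN was compared with BP and GRNN networks. The PLS-GRNN and GRNN models needed clearly fewer training steps than the BP network and less training time. PLS-GRNN gave better prediction accuracy and fitting performance than both GRNN and the BP network. The method was also applied successfully to predict the total sugar and total acid content of Nanfeng mandarin oranges. This work provides a new approach to near infrared multi-component analysis.

4. Multi-component quantitative analysis by near infrared spectroscopy based on the Elman neural network

The combination of Elman networks with near infrared spectroscopic analysis was developed by introducing the Elman network model, which can process dynamic information, into near infrared multi-component analysis. Principal components were extracted by PLS compression and an internal feedback signal was added, which strengthened the network's own ability to process dynamic information and simplified its node structure, thereby speeding up modelling and prediction. In the determination of four components in feedstuff (phenylalanine, lysine, tyrosine and cystine), the BP and Elman networks were compared. With the same learning error for both models, and even though the mean fitting residual (MRE) of the Elman network was worse than that of the BP model, the Elman network gave the higher prediction accuracy. This indicates that the Elman network can adapt to the time-varying behaviour of a dynamic system. The method was also applied successfully to predict the fat, protein and lactose content of fresh milk, showing that the Elman neural network is a novel and reliable prediction method. It offers a new route to the simultaneous determination of multi-component, dynamic, nonlinear systems with overlapping near infrared spectra.

5. Identification of adulterated milk by near infrared spectroscopy combined with PCA-Mahalanobis distance

Discriminant analysis models based on PCA-Mahalanobis distance and near infrared spectroscopy were built for pasteurized milk versus reconstituted milk, and for fresh milk versus milk adulterated with vegetable cream or with whey powder. The optimal modelling conditions were established. For samples adulterated with reconstituted milk at 0.50%-100%, vegetable cream at 0.50%-10%, and whey powder at 0.20%-3.3%, the discrimination success rate reached 100%. This explores a new method for identifying milk adulterated with vegetable cream and whey powder.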
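Part 3 of the abstract describes feeding PLS-compressed spectra into a generalized regression neural network (GRNN). As a rough illustration only, a GRNN prediction can be written as a Gaussian-kernel-weighted average of the training targets. The sketch below assumes the spectra have already been reduced to a few PLS scores. The function name `grnn_predict`, the smoothing parameter `sigma`, and all toy numbers are hypothetical and are not taken from the thesis.

```python
import math


def grnn_predict(train_scores, train_targets, query, sigma=0.5):
    """Predict component contents for one sample with a GRNN.

    train_scores  : list of PLS score vectors (assumed precomputed)
    train_targets : list of target vectors, one per training sample
    query         : PLS score vector of the sample to predict
    sigma         : Gaussian smoothing parameter (hypothetical default)
    """
    # Pattern layer: one Gaussian kernel per training sample.
    weights = []
    for scores in train_scores:
        dist2 = sum((a - b) ** 2 for a, b in zip(scores, query))
        weights.append(math.exp(-dist2 / (2.0 * sigma ** 2)))

    # Summation and output layers: kernel-weighted mean of the targets.
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training samples for this sigma")
    n_outputs = len(train_targets[0])
    return [
        sum(w * t[j] for w, t in zip(weights, train_targets)) / total
        for j in range(n_outputs)
    ]


# Toy data (made up): two PLS scores per sample, two predicted contents.
toy_scores = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_targets = [[10.0, 1.0], [12.0, 1.5], [11.0, 2.0], [13.0, 2.5]]
prediction = grnn_predict(toy_scores, toy_targets, [0.5, 0.5])
```

In this formulation the only quantity to tune is `sigma`, and there is no iterative weight training. That is at least consistent with the abstract's remark that the GRNN models needed fewer training steps than the BP network.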

【Abstract】 This thesis investigates in depth the combination of near infrared spectroscopy (NIRS) with chemometric techniques for the rapid determination of multi-component content in foods, feedstuff and compound fertilizer, and for pattern recognition analysis. The methods established in the thesis have been successfully applied to product analysis and quality control, making use of the speed, accuracy and practicality of NIRS. In solving these specific problems, the chemometric methods themselves were also developed further. Because product composition is complex and the components interfere strongly with one another, the absorbance differences available for NIRS analysis are usually small, the signals are weak, and the absorption peaks are broad and overlapping. No characteristic peak free of interference can be identified for a target component, which makes simultaneous multi-component quantitative analysis and pattern recognition analysis very difficult. The thesis therefore proposes PLS, KPLS, PLS-BP, GRNN and Elman methods for building linear and nonlinear multi-component models, and PCA-Mahalanobis distance for building pattern recognition models. The main contributions are summarized below.

1. Investigation of multi-component quantitative analysis and its applications using PLS near infrared spectroscopy

A method combining NIRS with partial least squares (PLS) was established to determine the content of 7 fatty acids in pork. Before a PLS model is built, the raw spectra must be pre-processed mathematically to filter noise and raise the signal-to-noise ratio. Experiments indicated that light scattering was the main factor influencing the NIR spectra. When PLS extracts principal components, it represents the dependent-variable matrix and the independent-variable matrix by principal components simultaneously. This effectively reduces dimensionality, removes possible multicollinearity among the independent variables, and thus improves the reliability and accuracy of the results. A kernel function was also introduced to build a nonlinear multi-component KPLS (kernel partial least squares) model for N, P2O5 and K2O in compound fertilizer. By mapping the data nonlinearly into a high-dimensional space, the KPLS model extracted nonlinear information hidden in the spectra. Compared with the linear PLS model, KPLS improved both the correlation and the accuracy of the multi-component predictions, especially for P2O5 and K2O.

2. Investigation of the combined use of PLS and BP neural networks in NIRS multi-component quantitative analysis

To address slow learning and over-fitting in BP neural networks, a PLS-BP method was established in which PLS compresses the spectral data fed to the BP network. The PLS-BP method was applied to the simultaneous determination of moisture, ash, protein and phosphorus content in feedstuff, and of four amino acids in feedstuff. Compared with the BP method, PLS-BP feeds less data into the network, which greatly increases computation speed and reduces the number of training iterations. The predictions of the PLS-BP model were also more precise than those of the BP model. The study further proposes using PLS to extract the principal components and weights of the spectra X and the components Y. This addresses the problem that the number of hidden-layer nodes and the initial weights of the input and output layers are otherwise selected only by experience. A PLS-BP prediction model was also established for crude fibre, starch and protein, three nutrients found in potato. Compared with a plain BP network, the combined PLS-BP model improved training, increased computation speed and gave higher precision. This research has theoretical as well as practical implications for constructing BP networks for NIRS.

3. Using partial least squares and generalized regression neural networks for NIRS multi-component quantitative analysis

In this part of the thesis, the GRNN method was introduced into NIRS multi-component analysis. PLS was used to compress the spectra before they were passed to the GRNN, and a method was established for determining water-soluble chloride, crude fibre and fat content in feedstuff. When PLS-GRNN was compared with BP and GRNN networks, the PLS-GRNN and GRNN models needed significantly fewer training steps and less training time than the BP network. PLS-GRNN gave better prediction precision and better fitting than the GRNN and BP networks. The method was also applied successfully to predict the total sugar and total acid content of Nanfeng mandarin oranges. This research provides a new approach to NIRS multi-component quantitative analysis.

4. Investigation of the use of the Elman neural network in NIRS multi-component quantitative analysis

The thesis combines the Elman neural network with NIRS by introducing the Elman network, which can process dynamic information, into NIRS multi-component analysis. After PLS compression of the original spectral data and the addition of an internal feedback signal, the ability of the Elman network to process dynamic information was enhanced. This simplified the node structure of the Elman network and thus increased the speed of model building and prediction. In the determination of phenylalanine (Phe), lysine (Lys), tyrosine (Tyr) and cystine (Cys) content in feedstuff, the BP and Elman models were compared. With the same learning error for both models, and although the mean fitting residual (MRE) of the Elman network was worse than that of the BP network, the Elman network gave higher prediction precision. This result shows that the Elman neural network can adapt to the time-varying behaviour of a dynamic system. The method was also applied successfully to predict the content of fat, protein, lactose and total solids in fresh milk, which shows that the Elman neural network is a novel and reliable prediction method. It offers a new route to the simultaneous determination of multiple components with overlapping NIR spectra in a dynamic nonlinear system.

5. Identification of adulterated milk by using PCA-Mahalanobis distance and NIRS

Pattern recognition models based on PCA-Mahalanobis distance and NIRS were built for pasteurized milk versus reconstituted milk, and for fresh milk versus milk adulterated with vegetable cream or with whey powder. The optimal modelling conditions were determined. The discrimination success rate reached 100% for samples adulterated with reconstituted milk, vegetable cream and whey powder at concentrations of 0.50%-100%, 0.50%-10% and 0.20%-3.3%, respectively. This provides a new method for identifying milk adulterated with whey powder or vegetable cream.
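The PCA-Mahalanobis distance discrimination named in part 5 can be sketched roughly as follows, under stated assumptions. A reference class (for example, genuine milk) is modelled by its leading principal components, and a new sample is scored by its Mahalanobis distance in that score space. Because PCA scores are uncorrelated, the distance reduces to a sum of squared scores divided by the corresponding eigenvalues. The abstract does not give the actual modelling conditions, so the single reference class, the distance threshold, the function names and the two-variable toy "spectra" below are all hypothetical. Power iteration with deflation stands in for a proper eigen-solver to keep the example dependency-free.

```python
import math


def fit_reference(rows, k, iters=500):
    """Fit a k-component PCA model to the reference-class samples."""
    n, p = len(rows), len(rows[0])
    mean = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, mean)] for row in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]

    pairs = []  # (eigenvalue, eigenvector), largest first
    for _ in range(k):
        v = [1.0 + 0.1 * i for i in range(p)]  # arbitrary start vector
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                  for i in range(p))
        pairs.append((lam, v))
        # Deflate so the next pass finds the next component.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    return mean, pairs


def mahalanobis(model, sample):
    """Mahalanobis distance of a sample in the PCA score space."""
    mean, pairs = model
    d = [x - m for x, m in zip(sample, mean)]
    return math.sqrt(sum(
        sum(di * vi for di, vi in zip(d, v)) ** 2 / lam for lam, v in pairs))


def is_member(model, sample, threshold=3.0):
    """True if the sample falls within the (hypothetical) distance limit."""
    return mahalanobis(model, sample) <= threshold


# Toy reference class (made up): absorbance at two wavelengths per sample.
genuine = [[1.0, 2.0], [2.0, 3.9], [3.0, 6.1], [4.0, 8.0], [5.0, 10.0]]
model = fit_reference(genuine, k=2)
```

With this toy model, a sample such as `[3.5, 7.0]` that follows the trend of the reference class stays well inside the limit, while `[3.0, 9.0]`, which departs from it, lies far outside. In practice the number of components and the threshold would have to be chosen from calibration data.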
