转基因番茄的可见/近红外光谱快速无损检测方法
Nondestructive Detection of Transgenic Tomato Based on Visible and Near Infrared Spectroscopy
【Author】 Xie Lijuan (谢丽娟);
【Advisor】 Ying Yibin (应义斌);
【Author information】 Zhejiang University, Agricultural Electrification and Automation, 2009, Ph.D.
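【Abstract (translated from the Chinese)】 Rapid progress in biotechnology, and in genetic engineering in particular, allows exogenous genes to be inserted into the genome of a recipient species, giving it stress resistance, insect resistance and other new traits. This affects many aspects of human life, including agriculture, poultry farming, industry and pharmaceuticals. The planted area and the variety of genetically modified (GM) crops grow every year, yet the problems this technology may cause for the environment, human health and ethics remain unclear, so techniques for detecting and identifying GM organisms need to be studied. Current detection methods fall into two main classes, DNA-based and protein-based, but they have many shortcomings: they are complicated to operate and costly, and protein-based methods are suitable only for unprocessed products. Near infrared (NIR) spectral information arises from the absorption of combination bands and overtones of the vibrations of hydrogen-containing groups (C-H, O-H and N-H) in organic molecules. NIR detection is rapid, needs no complex sample pretreatment, is inexpensive, and is easy to implement on-line. Transgenic tomato was the first GM agricultural product approved for the market. Drawing on spectral analysis, molecular biology, plant physiology, optics and chemometrics, this study takes transgenic tomato (leaves, fruit, juice and seeds) and its parent as research objects. It investigates the rapid nondestructive discrimination of transgenic tomato and the nondestructive determination of physiological indexes (chlorophyll content and ethylene production) based on visible/near infrared (Vis/NIR) spectroscopy. Several pattern recognition methods were used for the qualitative analysis of transgenic tomato and its parent: discriminant analysis (DA), discriminant partial least squares (DPLS), soft independent modeling of class analogy (SIMCA), back propagation neural network (BPNN), radial basis function neural network (RBFNN) and least squares support vector machine (LS-SVM). The correlation between Vis/NIR spectra and physiological indexes (leaf chlorophyll content and fruit ethylene production) was also studied, and quantitative models were built with principal components regression (PCR), partial least squares regression (PLSR) and multiple linear regression (MLR). The aim is to demonstrate the feasibility of using Vis/NIR spectroscopy for the rapid nondestructive discrimination of transgenic tomato and for the quantitative determination of leaf chlorophyll content and fruit ethylene production, to establish a Vis/NIR-based system of rapid nondestructive detection methods for transgenic tomato, and to provide a methodological basis for developing high-throughput rapid detection instruments for transgenic tomato with independent intellectual property rights. The main contents, results and conclusions are as follows:
(1) The effects of spectrometer resolution and number of scans on NIR spectra and modeling results were analyzed. The tomato fruit discrimination models built at resolutions of 4 cm⁻¹ and 8 cm⁻¹ had the highest recognition rate, 78.89%. As the number of scans increased, the spectra became smoother and both the root mean square noise and the variance decreased. The recognition rate rose with the number of scans and was highest at 128 scans. At the α = 0.05 level, resolution and number of scans had a significant effect on the noise of tomato fruit NIR spectra but no significant effect on absorbance.
(2) The spectral differences between transgenic tomato leaves, fruit, juice and seeds and those of their parents were analyzed. 1) Diffuse reflectance spectra collected with a Nexus intelligent FT-NIR spectrometer fitted with an InGaAs detector showed that the absorbance of transgenic leaves was lower than that of the parent below 1380 nm, but the trend changed above 1380 nm. Vis/NIR diffuse reflectance spectra collected with the same spectrometer fitted with a Si detector showed that the absorbance of transgenic leaves was lower than that of the parent. Transmission spectra collected with a USB4000 miniature fiber optic spectrometer showed that the transmittance of transgenic leaves was higher than that of the parent. 2) NIR diffuse reflectance spectra of ripe tomatoes at the same ripeness stage showed higher absorbance for transgenic tomatoes than for their parents. Vis/NIR diffuse reflectance spectra of tomatoes at different ripeness stages showed higher absorbance for transgenic red tomatoes than for their parents, while the spectra of transgenic green tomatoes crossed those of their parents at 578 nm: absorbance was higher than the parent below 578 nm and lower above it. 3) The NIR transmission spectra of transgenic tomato juice and its parent were extremely similar; slight differences were visible on magnification. 4) The raw NIR spectra of transgenic tomato seeds and their parents showed slight differences. In all four sample types (leaf, fruit, juice and seed), the spectra do differ from those of the parent.
(3) The effects of different instruments and detection modes on discrimination results were compared. 1) For leaves, the model built from diffuse reflectance spectra collected with the USB4000 had the highest discrimination accuracy and was the most suitable for discriminating transgenic leaves from their parents. 2) For transgenic tomatoes and their parents at the same ripeness stage, the model built from USB4000 diffuse reflectance spectra had the highest accuracy. For tomatoes at different ripeness stages, models built from USB4000 spectra were again more accurate than models built from FT-NIR spectra.
(4) The effects of different pattern recognition methods on the qualitative analysis of transgenic tomato leaves, fruit, juice and seeds were compared. 1) For leaves, DA and DPLS combined with derivative or standard normal variate pretreatment classified all samples correctly. 2) For tomatoes at the same ripeness stage, the DA model performed best, with an overall discrimination accuracy of 94%; the BPNN and LS-SVM models gave the same overall result as each other. For tomatoes at different ripeness stages, the SIMCA model performed best overall, at 86.08%. 3) For juice, RBFNN and LS-SVM models built on second-derivative spectra had the highest prediction accuracy, both reaching a recognition rate of 100%. 4) For seeds, the DA model built over the full band on spectra smoothed with a 25-point window performed best, with an overall accuracy of 95.81%. These results show that transgenic tomato can be discriminated from its parent using Vis/NIR spectra.
(5) Discrimination models were rebuilt after extracting characteristic wavelengths with the successive projections algorithm (SPA). Discrimination accuracy increased with the number of wavelengths, but so did computing time. Compared with the models before rebuilding, for tomatoes at the same ripeness stage the performance of the rebuilt DPLS model improved and discrimination became 45 times faster. For tomatoes at different ripeness stages, the SIMCA model was the best, with an overall accuracy of 84.36%. Rebuilding the discrimination models on SPA-selected characteristic wavelengths can therefore increase discrimination speed.
(6) The chlorophyll content of tomato leaves and the ethylene production of tomato fruit were analyzed quantitatively. After spectral outliers and concentration outliers were removed, models were built with PLSR, PCR and MLR, and the effects of different spectral pretreatments were compared; for ethylene production, the effect of different modeling wavebands was also studied. 1) The PLSR chlorophyll model built from raw leaf spectra over the full band (670 to 1100 nm) performed best: the correlation coefficient r, root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP) and root mean square error of cross validation (RMSECV) were 0.961, 1.50, 2.25 and 3.00, respectively. 2) The PLSR model built over the full band on spectra treated with multiplicative scatter correction was the most suitable for quantifying ethylene production, with a correlation coefficient r of 0.904. These results show that Vis/NIR spectroscopy combined with chemometrics can quantify the chlorophyll content of tomato leaves and the ethylene production of tomato fruit.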
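The figures of merit quoted for the quantitative models (r, RMSEC, RMSEP) can be illustrated with a minimal sketch. It assumes the plain definition of root mean square error with a divisor of n; chemometrics software sometimes applies a degrees-of-freedom correction to RMSEC, and the record does not say which convention the thesis used. The function names and all sample values are hypothetical and are not data from the thesis.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((r - mean_ref) * (p - mean_pred)
              for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    return cov / math.sqrt(var_ref * var_pred)


# Hypothetical chlorophyll readings (arbitrary units), NOT thesis data.
calibration_reference = [30.0, 35.0, 40.0, 45.0, 50.0]
calibration_predicted = [31.0, 34.0, 41.0, 44.0, 50.0]
prediction_reference = [32.0, 38.0, 47.0]
prediction_predicted = [34.0, 37.0, 45.0]

# RMSEC uses the samples the model was fitted on;
# RMSEP uses an independent prediction set.
rmsec = rmse(calibration_reference, calibration_predicted)
rmsep = rmse(prediction_reference, prediction_predicted)
r_cal = pearson_r(calibration_reference, calibration_predicted)

print(f"r = {r_cal:.3f}, RMSEC = {rmsec:.3f}, RMSEP = {rmsep:.3f}")
```

RMSECV is computed with the same `rmse` function, applied to predictions obtained while each sample (or group of samples) is left out of the calibration in turn.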
【Abstract】 Recent progress in biotechnology, particularly in genetic engineering, has made it possible to introduce exogenous sequences that confer new characteristics, such as herbicide tolerance and insect resistance, and that address other problems in commercial agriculture. This has affected every aspect of human activity, including agriculture, livestock production, industry and medicine. The area and number of genetically modified (GM) crops planted each year continue to grow. However, the potential environmental, ethical and religious impacts of GM organisms (GMOs) are unknown. Research on GMO detection techniques is therefore necessary, and GMO detection is among the most important consumer concerns regarding food safety and quality. Commonly used GMO testing protocols are mainly nucleic acid-based or protein-based. Their drawbacks include multi-step procedures and high cost, and protein-based methods are suited only to the inspection of raw materials. Near infrared (NIR) spectroscopy is sensitive to the major organic compounds because it records the overtones and combination bands of C-H, O-H and N-H vibrations. It has been increasingly adopted because it is rapid, requires no complex sample preparation or processing, is low in cost, and is suitable for on-line process monitoring and quality control. Transgenic tomato was the first transgenic produce to receive administrative approval. Drawing on spectral analysis, molecular biology, optics and chemometrics, and taking transgenic tomatoes and their parents as the objects of study, this research focuses on non-invasive discrimination methods and on the detection of physiological properties (chlorophyll content and ethylene content) based on visible/near infrared (Vis/NIR) spectroscopy.
In this dissertation, the classification of transgenic tomato leaf, fruit, juice and seed against their parents was studied using Vis/NIR spectroscopy and pattern recognition methods. The relationship between physiological property indexes and Vis/NIR spectra was analyzed, and quantitative models were established from Vis/NIR spectra for determining chlorophyll and ethylene content. The study aims to demonstrate the feasibility of discriminating transgenic products and to build a rapid, non-invasive method for quantifying chlorophyll and ethylene content based on spectroscopy. It provides a basis for developing high-throughput, rapid detection equipment for discriminating transgenic tomato. The main results and conclusions are as follows:
(1) The influence of spectral acquisition parameters on spectra and modeling results was analyzed. Taking an FT-NIR spectrometer as an example, the influence of resolution and scan number on tomato fruit spectra and modeling results was examined. The discriminant rate was highest, at 78.89%, for resolutions of 4 cm⁻¹ and 8 cm⁻¹. As the scan number increased, the spectra became smoother, the root mean square noise (RMSN) and the variance decreased, and the discriminant rate increased. Resolution and scan number had no significant effect on spectral absorbance (α = 0.05) but a significant effect on RMSN (α = 0.05).
(2) The differences between the spectra of transgenic tomato leaf, fruit, juice and seed and those of their parents were analyzed. 1) In the Vis/NIR diffuse reflectance spectra, transgenic leaves absorbed less light than their parents below 1380 nm; the trend changed above 1380 nm.
The transmittance of transgenic leaves was higher than that of their parents. 2) In the NIR diffuse reflectance spectra, transgenic red tomato fruits at the same ripeness stage absorbed more light than their parents. In the Vis/NIR diffuse reflectance spectra, transgenic red tomato fruits also absorbed more light than their parents. The spectra of transgenic green tomato fruits crossed those of their parents at 578 nm: transgenic green fruits absorbed more light than their parents below 578 nm and less above it. 3) The original transmission spectra of transgenic tomato juice and its parent were similar; slight differences could be seen on magnification. 4) Slight differences existed between the diffuse reflectance spectra of transgenic tomato seeds and those of their parents. These results indicate that the spectra of transgenic tomato leaf, fruit, juice and seed do differ from those of their parents.
(3) The influence of different spectrometers (or detectors) and detection modes on discriminant accuracy was compared. 1) For transgenic tomato leaves and their parents, models based on diffuse reflectance spectra from a miniature fiber optic spectrometer (USB4000) reached a discriminant accuracy of 100%, making this configuration the most suitable for discriminating transgenic leaves from their parents. 2) For transgenic tomatoes and their parents at the same ripeness stage, models using the USB4000 had the highest discriminant accuracy.
For tomatoes at different ripeness stages, models based on diffuse reflectance spectra from the USB4000 were more accurate than models using an FT-NIR spectrometer.
(4) Different pattern recognition methods were compared: discriminant analysis (DA), soft independent modeling of class analogy (SIMCA), discriminant partial least squares (DPLS), back propagation neural networks (BPNN), radial basis function neural networks (RBFNN) and least squares support vector machines (LS-SVM). 1) For transgenic tomato leaves and their parents, DA and DPLS after derivative or standard normal variate pretreatment classified both transgenic and non-transgenic leaves with 100% accuracy. 2) For transgenic tomato fruits and their parents at the same ripeness stage, DA models had the highest discriminant accuracy, 94%; the BPNN and LS-SVM models gave identical results to each other. For tomatoes at different ripeness stages, SIMCA models had the highest discriminant accuracy, 86.08%. 3) For juice, RBFNN and LS-SVM models built on second derivative spectra gave the highest classification rate, reaching an overall accuracy of 100% for both transgenic and non-transgenic tomato juice. 4) For seed, the DA model built on diffuse reflectance spectra in the 800-2500 nm range after 25-point smoothing gave the best results, with an overall classification accuracy of 95.81%. It can be concluded that transgenic tomatoes (leaf, fruit, juice and seed) can be distinguished from their parents using Vis/NIR spectroscopy.
(5) The successive projections algorithm (SPA) was used to extract characteristic wavelengths.
Discriminant accuracy for transgenic and non-transgenic tomatoes increased with the number of wavelengths, but so did the modeling time. Models were rebuilt using several characteristic wavelengths. For transgenic tomatoes and their parents at the same ripeness stage, the calibration and validation accuracy of the rebuilt DPLS model increased, whereas that of the other methods decreased. For transgenic and non-transgenic tomatoes at different ripeness stages, models were rebuilt using 7 characteristic wavelengths; the discriminant accuracy of both the linear and the non-linear pattern recognition methods decreased, and SIMCA models were the most accurate, at 84.36%. These results indicate that discriminant models for transgenic tomatoes and their parents can be built from SPA-selected characteristic wavelengths alone, which shortens classification time.
(6) Quantitative models were established from Vis/NIR spectra for determining chlorophyll content in tomato leaves and ethylene content in tomato fruits. After spectral and concentration outliers were removed, models were built with partial least squares regression (PLSR), principal components regression (PCR) and multiple linear regression (MLR). The influence of different spectral pretreatments on model performance was analyzed, as was the influence of different wavebands on the ethylene content models. 1) The PLSR chlorophyll content model built from original spectra over 670 to 1100 nm performed best, with a correlation coefficient (r) of 0.961.
The root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP) and root mean square error of cross validation (RMSECV) were 1.50, 2.25 and 3.00, respectively. 2) After multiplicative scatter correction (MSC) pretreatment, the PLSR ethylene content model built over the full spectral region performed best, with a correlation coefficient of 0.904. It can be concluded that chlorophyll content in tomato leaves and ethylene content in tomato fruits can be determined from Vis/NIR spectra.
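Multiplicative scatter correction, the pretreatment named for the ethylene model, can be sketched as follows. This is a minimal illustration of the textbook form of the technique, assuming the mean spectrum of the set serves as the reference; it is not the implementation used in the thesis, and the function names and toy spectra are hypothetical.

```python
def mean_spectrum(spectra):
    """Point-by-point mean of a list of equally long spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def msc(spectra, reference=None):
    """Multiplicative scatter correction.

    Each spectrum s is regressed on the reference (s ~ intercept + slope * ref)
    by ordinary least squares, then corrected as (s - intercept) / slope.
    """
    if reference is None:
        reference = mean_spectrum(spectra)  # assumption: mean spectrum as reference
    n_points = len(reference)
    ref_mean = sum(reference) / n_points
    ref_ss = sum((x - ref_mean) ** 2 for x in reference)

    corrected = []
    for spectrum in spectra:
        s_mean = sum(spectrum) / n_points
        slope = sum((x - ref_mean) * (y - s_mean)
                    for x, y in zip(reference, spectrum)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(y - intercept) / slope for y in spectrum])
    return corrected


# Toy absorbance curve and three copies distorted by additive offset and
# multiplicative scaling, which is the kind of effect MSC is meant to remove.
base = [0.20, 0.35, 0.50, 0.42, 0.30]
raw = [
    [0.10 + 1.2 * x for x in base],
    [-0.05 + 0.8 * x for x in base],
    [0.00 + 1.0 * x for x in base],
]

corrected = msc(raw)
reference = mean_spectrum(raw)
for row in corrected:
    print([round(value, 4) for value in row])
```

Because the toy spectra differ only by offset and scale, all three corrected spectra collapse onto the mean spectrum. With real spectra, the chemical differences between samples remain after correction.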
【Key words】 transgenic tomato; visible/near infrared; rapid; non-invasive detection;