Research on Dimension Reduction Algorithms Based on Locally Linear Embedding and Their Applications in Precision Agriculture

【Author】 Yan Qing (阎庆)

【Advisor】 Liang Dong (梁栋)

【Author Information】 Anhui University, Circuits and Systems, 2014, Ph.D.

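【Abstract (translated from the Chinese)】 The traditional extensive mode of agricultural production is inefficient and seriously pollutes the environment, and it no longer meets the needs of agricultural development in the new century. Modern agriculture is gradually breaking free of the constraints of primitive, traditional, and industrialized agriculture and entering a new stage of knowledge agriculture, whose main characteristic is a high concentration of knowledge. Precision agriculture, which applies modern information technology, biotechnology, and engineering equipment technology to agricultural production, has become an important production form of modern agriculture. The application of image processing, machine vision, and related technologies is one of the main features of precision agriculture in practice. Intelligent analysis of optical or hyperspectral images effectively improves operating efficiency. However, optical image data provide limited information, which restricts them in many applications. Hyperspectral remote sensing images, with their many bands and high spectral and spatial resolution, distinguish ground objects more accurately; they offer advantages in precision agriculture that other data cannot match and have become the main data form for future precision agriculture applications. Although these new means of data analysis have brought revolutionary changes to agricultural production, the huge data volume makes storage and transmission difficult and poses great challenges for data analysis and processing. How to effectively reduce the dimensionality and volume of the data is therefore an important topic in precision agriculture image analysis. This thesis mainly studies the application of the locally linear embedding (LLE) algorithm to dimension reduction of precision agriculture data. Driven by the needs of problems such as weed identification, it investigates in depth a supervised implementation of LLE, adaptive selection of the neighbor parameter, and the design of a suitable classification algorithm. The main research work and innovations are as follows:

(1) One of the main applications of information technology and pattern recognition in precision agriculture is the automatic identification of crop attributes from image and spectral data. Conventional LLE, however, is an unsupervised algorithm and often performs poorly when applied directly to classification. To address this defect, a supervised LLE algorithm based on the Fisher criterion is proposed. The algorithm first applies a Fisher projection to the training samples to find the optimal projection direction, along which the classes are maximally separable. The neighborhood structure is then constructed from the projection distances of the training samples along this axis, so that the supervision information in the training samples guides dimension reduction as much as possible and the recognition rate improves. Experimental results show that the Fisher-criterion supervised LLE reduces dimension better than conventional LLE, and that a simple classifier then suffices to achieve a high recognition rate.

(2) When LLE is applied to classification, its accuracy is also affected by another factor: the neighbor parameter κ, one of the main parameters of LLE. Whether this parameter is chosen appropriately strongly affects the recognition result. No particularly mature selection algorithm exists yet; in most cases the value is found by repeated manual trials based on experimental results. This has become a bottleneck in the development of LLE. Considering the characteristics of the data handled in precision agriculture and the effect of LLE neighborhood construction on recognition, an algorithm that adaptively adjusts the neighbor parameter, built on supervised LLE, is designed. Experimental results show that the method determines the neighbor parameter automatically from the distribution of the collected data, and that it improves the stability and practicality of the algorithm while maintaining a high recognition rate.

(3) Dimension reduction is only the first step in data processing; the choice of classification algorithm is another important factor in ensuring a high recognition rate. With LLE, new test samples must be retrained together with the training samples to complete dimension reduction before they can be classified, which is computationally expensive and inefficient. Because LLE constructs its neighborhood structure from reconstruction errors, the magnitudes of a test sample's reconstruction errors with respect to the positive-class and negative-class manifolds are used as the basis for deciding the sample's class. This classification method is built directly on the characteristics of the data manifold itself and introduces no new unknown parameters, so it is convenient to apply. Experiments confirm that combining supervised LLE with this classifier ensures high recognition accuracy.

(4) Weed identification is one of the main problems in precision agriculture applications. Because of natural biodiversity, plants of the same species differ somewhat in shape and color, while plants of different species may resemble each other. Traditional machine vision methods that rely on features such as color and shape achieve limited accuracy and are easily affected by the natural environment. This thesis addresses weed identification on image data collected in corn fields. The scenes in these images are complex, with corn growing alongside several weed species. A morphology-based method is designed to segment weeds and corn automatically; supervised LLE then reduces the dimension of the segmented images, and a support vector machine performs the classification, yielding good experimental results. This demonstrates that the Fisher-criterion LLE also adapts well outside the laboratory.

(5) Hyperspectral data combine the strengths of spectral analysis and image processing and have been applied successfully to pest and disease monitoring, quality inspection, and other precision agriculture problems. For imaging hyperspectral data of wheat leaves infected with stripe rust, collected in the laboratory, an image texture analysis tool, the gray-level co-occurrence matrix (GLCM), is analyzed jointly with the spectral information, following the idea of "unity of image and spectrum", to exploit the advantages of imaging spectral data fully. Experimental results show that combining traditional image analysis with spectral data better reveals how severely a crop is affected by disease; in particular, at the early stage of infection, also called the latent stage, recognition is better than with traditional spectral analysis.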
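The Fisher-criterion supervised locally linear embedding (LLE) in item (1) is described only in outline, so the following is a minimal sketch under stated assumptions, not the thesis's implementation. It assumes two classes labeled 0 and 1, takes the Fisher direction as the regularized solution of S_w w = m1 - m0, selects each sample's neighbors by distance along that projection, and then applies the standard LLE weight and embedding steps. All function names, the regularization constants, and the synthetic data are hypothetical.

```python
import numpy as np

def fisher_direction(X, y, reg=1e-6):
    """Two-class Fisher direction, w proportional to Sw^{-1} (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Small ridge term (an assumption) keeps Sw invertible.
    Sw += reg * np.trace(Sw) / X.shape[1] * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

def fisher_neighbors(X, y, k):
    """Pick each sample's k neighbors by distance along the Fisher axis."""
    z = X @ fisher_direction(X, y)
    dist = np.abs(z[:, None] - z[None, :])
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def lle_weights(X, neighbors, reg=1e-3):
    """Standard LLE step: reconstruction weights that sum to one per row."""
    n, k = neighbors.shape
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]          # neighbors centered on sample i
        G = Z @ Z.T                         # local Gram matrix
        G += (reg * np.trace(G) + 1e-12) * np.eye(k)
        w = np.linalg.solve(G, np.ones(k))
        W[i, neighbors[i]] = w / w.sum()
    return W

def lle_embed(W, d):
    """Standard LLE step: bottom eigenvectors of (I - W)^T (I - W)."""
    n = W.shape[0]
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)             # eigenvalues in ascending order
    return vecs[:, 1:d + 1]                 # skip the constant eigenvector

# Synthetic two-class data, for illustration only.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (20, 6)), rng.normal(3.0, 1.0, (20, 6))])
y = np.repeat([0, 1], 20)
neighbors = fisher_neighbors(X, y, k=5)
W = lle_weights(X, neighbors)
Y = lle_embed(W, d=2)
```

Because neighbors are chosen along a class-separating axis, the neighborhood graph may split into per-class components; in that case several eigenvalues are close to zero and the leading embedding coordinates mostly encode class membership.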

【Abstract】 China is an agricultural country with a population of more than one billion, and agriculture has always been one of the primary concerns of government at all levels. Traditional agriculture in China is known for "intensive cultivation". This gives agricultural production the advantage of a high yield per mu, but relying purely on manual intensive cultivation inevitably leads to low productivity. Modern agriculture is breaking free of the constraints of primitive, traditional, and industrial agriculture and entering the stage of knowledge agriculture, whose main characteristic is knowledge intensity. Precision agriculture, which applies modern information technology, biotechnology, and engineering equipment technology to agricultural production, has become the main production form of knowledge agriculture worldwide in the new century. When image processing and machine vision are applied in precision agriculture, the results of intelligent image analysis can guide robots to carry out some field work automatically, which greatly improves efficiency. However, optical image data carry limited information, which is insufficient for many applications. Hyperspectral remote sensing images have many bands and high spectral and spatial resolution, so they provide more accurate information about ground objects and have an incomparable advantage over other data. In recent years their application in precision agriculture has become increasingly widespread. For example, satellite remote sensing can measure the fertility of different plots across hundreds of hectares and control agricultural machinery to apply fertilizer in quantities suited to local conditions; ground-based remote sensing devices can also capture the spectral characteristics of crops.
This information can be used to distinguish weeds from crops or to judge the degree of disease damage, which is hard to achieve with traditional agricultural methods. These new means of data analysis have brought revolutionary improvements to agricultural production. On the other hand, the huge data size makes storage and transmission difficult and poses greater challenges for data analysis and processing. How to effectively reduce the dimension and size of the data is therefore an important research subject in precision agriculture image analysis. This thesis mainly studies the application of the locally linear embedding algorithm to data dimension reduction in precision agriculture. To meet the needs of classification problems in precision agriculture, such as weed identification, it studies how to use the supervised information of the training samples in locally linear embedding, the adaptive selection of parameters, and the design of a suitable classification algorithm. The main research work and innovative results are as follows:

(1) The basic theory and development of manifold learning methods are introduced. The influence of the neighbor parameter, intrinsic dimension, noise, and other factors on dimension reduction is studied. The characteristics of commonly used manifold methods are analyzed, and their sensitivity to parameters is compared. An important research issue in precision agriculture is the automatic identification of crop properties by intelligent techniques, a typical application of information technology and pattern recognition. However, conventional locally linear embedding is an unsupervised algorithm, so applying it directly to identify crop varieties or diseases is often ineffective.
A supervised locally linear embedding algorithm based on the Fisher criterion is proposed. First, a Fisher projection is applied to the training samples to find the best projection direction, along which samples of different classes have maximum separability. The projection distances of the training samples along this direction are used to construct the neighborhood structure, so that the supervision information of the training samples guides dimension reduction and improves the recognition rate. Experimental results show that the supervised locally linear embedding algorithm based on the Fisher projection outperforms the conventional algorithm, so a simple classification algorithm suffices to achieve a high recognition rate.

(2) Once the supervision problem is solved, another factor affects identification accuracy when locally linear embedding is applied to identification problems in precision agriculture: the neighbor parameter, one of the main parameters of the algorithm. Whether this parameter is chosen appropriately seriously affects the recognition result, and the choice is directly related to the characteristics of the dataset being processed. There is currently no mature theory to guide the selection; in most cases the value is obtained manually through many repeated experiments. This has become a bottleneck in the development of locally linear embedding. Considering the characteristics of the data processed in precision agriculture and the influence of the neighborhood structure on recognition, an adaptive neighbor-parameter selection algorithm based on supervised locally linear embedding is designed.
Experimental results show that this algorithm determines the neighbor parameter automatically from the distribution of the dataset; it improves efficiency while maintaining a high recognition rate, which makes it more practical.

(3) For classification problems, dimension reduction is only the first step; another important factor in ensuring a high recognition rate is the choice of classification algorithm. With locally linear embedding, new test samples must go through all the steps again together with the training samples to complete dimension reduction before classification, which is computationally expensive and inefficient. Because the neighborhood structure in locally linear embedding is established from the reconstruction error, a classification algorithm is used that computes the reconstruction errors of a test sample with respect to the positive and negative manifolds and assigns the sample's category according to these errors. This classification method is based directly on the characteristics of the data manifold itself and introduces no new unknown parameters, so it is easy to apply.

(4) Weed identification is one of the main problems in precision agriculture applications. Because of biological diversity in nature, plants of the same species differ somewhat in color and shape, while plants of different species may look very similar. Traditional machine vision methods that rely on features such as color and shape do not achieve high identification accuracy and are easily affected by the natural environment. For images acquired in corn fields, where weeds and corn grow together in a complex environment, a method based on image morphology is designed to segment weeds and corn automatically.
Supervised locally linear embedding is then used to reduce the dimension of the segmented images, and good experimental results are obtained. This shows that the locally linear embedding algorithm based on the Fisher projection also adapts well to natural environments.

(5) For hyperspectral data of wheat leaves with rust disease collected in the laboratory, an image texture analysis method, the gray-level co-occurrence matrix (GLCM), is introduced following the idea of "the unity of image and spectrum", and the GLCM features are analyzed jointly with the spectral information so that the advantages of imaging spectral data are fully exploited. Experimental results show that combining traditional image analysis with the spectral method recognizes crops affected by the disease, and especially in the early stage, which can also be called the latent period, it identifies them much better than traditional spectral analysis.
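The reconstruction-error classifier in item (3) can be illustrated with a short sketch. The abstract does not specify how the error is computed, so this version makes assumptions: a test sample is reconstructed from its k nearest training samples in each class, using the usual sum-to-one weights of locally linear embedding, and is assigned to the class with the smaller squared reconstruction error. The function names, the 1/0 labels for the positive and negative classes, the regularization constant, and the synthetic data are all hypothetical.

```python
import numpy as np

def reconstruction_error(x, X_class, k, reg=1e-3):
    """Squared error of rebuilding x from its k nearest samples in one class."""
    d2 = np.sum((X_class - x) ** 2, axis=1)
    idx = np.argsort(d2)[:k]                 # k nearest samples of this class
    Z = X_class[idx] - x                     # neighbors centered on x
    G = Z @ Z.T                              # local Gram matrix
    G += (reg * np.trace(G) + 1e-12) * np.eye(len(idx))
    w = np.linalg.solve(G, np.ones(len(idx)))
    w /= w.sum()                             # weights sum to one
    # Since the weights sum to one, x - sum_j w_j x_j equals -(w @ Z).
    return float(np.sum((w @ Z) ** 2))

def classify(x, X_pos, X_neg, k=4):
    """Return 1 (positive) or 0 (negative), whichever class rebuilds x better."""
    e_pos = reconstruction_error(x, X_pos, k)
    e_neg = reconstruction_error(x, X_neg, k)
    return 1 if e_pos < e_neg else 0

# Synthetic 10-dimensional training data, for illustration only.
rng = np.random.default_rng(1)
X_pos = rng.normal(8.0, 1.0, (30, 10))
X_neg = rng.normal(0.0, 1.0, (30, 10))
label = classify(np.full(10, 8.0), X_pos, X_neg)
```

No embedding is recomputed for the test sample, which is the efficiency argument made in item (3). The errors are informative only when k is smaller than the data dimension; with more neighbors than dimensions, the neighbors generally span the whole space and any point can be reconstructed almost exactly.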

  • 【Online Publication Contributor】 Anhui University
  • 【Online Publication Year/Issue】 2014, Issue 09