基于判别学习的图像目标分类研究
Image Object Classification Research Based on Discriminative Learning
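【Author】 Chen Hailin;
【Advisor】 Wu Xiuqing;
【Author information】 University of Science and Technology of China, Signal and Information Processing, 2009, Ph.D.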
【Abstract】 Image content analysis and understanding is an important part of visual intelligence, and image object classification is an active research topic within it, with important practical applications. The current approach to image object classification is to first build a description of the image object, then learn the object classes with machine learning methods, and finally use the learned model to classify or recognize unseen image objects. The semantic gap between the low-level features represented by computers and the high-level semantic features understood by humans makes image object classification very challenging, and further research is needed. Because discriminative learning performs well in practice, this thesis studies how to fuse image object description with discriminative learning and apply the result to image object classification. The problem is studied from two broad angles: general image object classification and specific image object classification. For general image object classification, local-feature-based image descriptions are fused with discriminative learning algorithms. For specific image object classification, different global features are extracted according to the characteristics of the specific objects and then combined with suitable discriminative classification methods. The main work and contributions of this thesis are as follows:
1. By exploiting the structure of local features in feature space, a density-guided tree-structured kernel is proposed. The kernel is non-parametric, has computational complexity linear in the number of feature points, computes partial matches between two feature point sets of unequal cardinality, has good matching ability, requires no user-specified parameters, and satisfies the positive definiteness condition. It can therefore be used in kernel-based learning algorithms, fuses the image object description with a discriminative classifier well, and can be used to locate or recognize image objects. Experiments show that the kernel has good partial matching performance and image object classification ability.
2. By studying the positional correlation of local features in image space, a local feature spatial correlation kernel is proposed. The kernel describes the relative positions of local features in the image well, satisfies the positive definiteness condition, can be embedded in kernel-based learning algorithms, and is time-efficient. Experimental results show that it has good classification performance.
3. By studying the relations among local features in both image space and feature space, a bi-space pyramid matching kernel is proposed. The kernel satisfies the positive definiteness condition, has linear computational complexity, and can be embedded in kernel-based learning algorithms. Experimental results show that it has good classification performance.
4. Based on a careful analysis of the semantic content of remote sensing images, a hierarchical model of that semantic content is designed, which can be applied to remote sensing image classification and retrieval and to object detection and recognition. An airplane detection method for medium- and low-resolution remote sensing images based on corner distribution features is also proposed. The method uses the corner distribution of an airplane to obtain a fast coarse localization of the object, which reduces the computation needed for the subsequent classification, and then discriminates airplanes using simple and effective spatial structure features and a decision tree. The experiments give good results.
5. For the automatic identification of the language type (Chinese or English) of characters in camera-based images, a cascade classifier based on posterior probability estimation is proposed. Its node classifiers use discriminative learning algorithms, and its node thresholds are designed in two ways: independent threshold design and dependent threshold design. A cascade classifier that meets the overall requirements is derived theoretically, which provides a theoretical method for designing classifiers with a high classification rate. To capture the structural differences between Chinese and English characters, three features are proposed: horizontal and vertical stroke vectors and a gradient orientation correlogram, both based on pixel gradient information, and a Census Transform histogram based on the relative gray levels of position-correlated pixels. These features are robust to illumination, noise, and resolution, and can be applied to camera-based images. Theoretical analysis and experimental results show that dependent threshold design gives the cascade classifier a higher classification rate and that the proposed method classifies the language type of camera-based Chinese and English characters well.
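To make the Census Transform histogram mentioned in the abstract concrete, here is a minimal sketch in plain Python. It assumes the common 3x3 variant with 8-bit codes; the abstract does not state the neighbourhood size, bit order, or border handling used in the thesis, so those choices, and the function name `census_transform_histogram`, are illustrative assumptions.

```python
from typing import List


def census_transform_histogram(image: List[List[int]]) -> List[int]:
    """Return a 256-bin histogram of 3x3 Census codes for a grayscale image.

    Border pixels are skipped because they lack a full 3x3 neighbourhood
    (an assumption of this sketch, not a detail taken from the thesis).
    """
    # Neighbour offsets as (row, column). The order is arbitrary, but it must
    # stay fixed so that codes are comparable across images.
    offsets = [(-1, -1), (-1, 0), (-1, 1),
               (0, -1),           (0, 1),
               (1, -1),  (1, 0),  (1, 1)]
    histogram = [0] * 256
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for dr, dc in offsets:
                # Append a 1 bit when the neighbour is darker than the centre.
                bit = 1 if image[r + dr][c + dc] < centre else 0
                code = (code << 1) | bit
            histogram[code] += 1
    return histogram


if __name__ == "__main__":
    patch = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    hist = census_transform_histogram(patch)
    print([code for code, count in enumerate(hist) if count])
```

Because each code depends only on whether a neighbour is darker than the centre pixel, any strictly increasing change of gray levels (for example a uniform gain and offset) leaves the histogram unchanged, which is consistent with the robustness to illumination that the abstract describes.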
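The kernels in items 1 and 3 are described as matching two feature sets of unequal cardinality at a cost linear in the number of points. The sketch below is not either of the kernels proposed in the thesis; it is the standard pyramid match from the literature, included only to illustrate that kind of partial matching. It assumes feature coordinates scaled to [0, 1), and the function name `pyramid_match` and the `levels` parameter are hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def pyramid_match(x: Sequence[Point], y: Sequence[Point], levels: int = 4) -> float:
    """Pyramid match score between two point sets with coordinates in [0, 1).

    Matches are counted by histogram intersection on a grid that gets coarser
    at each level; matches first found at level i are weighted by 1 / 2**i.
    """
    score = 0.0
    previous = 0  # matches already counted at finer levels
    for level in range(levels):
        # Level 0 uses the finest grid; the cell side doubles at each level.
        cells = 2 ** (levels - 1 - level)
        hx = Counter(tuple(int(v * cells) for v in p) for p in x)
        hy = Counter(tuple(int(v * cells) for v in p) for p in y)
        # Histogram intersection: points that can be paired within one cell.
        matches = sum(min(n, hy[cell]) for cell, n in hx.items())
        score += (matches - previous) / (2 ** level)
        previous = matches
    return score


if __name__ == "__main__":
    a = [(0.1,), (0.6,)]
    b = [(0.12,), (0.9,), (0.3,)]
    print(pyramid_match(a, b, levels=3))
```

Each level makes one pass over both sets, so the cost grows linearly with the number of points for a fixed number of levels. The two sets need not have the same size: unmatched points in the larger set simply contribute nothing to the score.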
【Key words】 Support vector machine; local feature; kernel; hierarchical model; corner distribution; cascade classifier;