
Research on Local Invariant Feature Based Image Classification

【Author】 Huang Fei (黄飞)

【Advisor】 Jing Xiaojun (景晓军)

【Author Information】 Beijing University of Posts and Telecommunications, Communication and Information Systems, 2013, Master's thesis

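【Abstract (Chinese original, translated)】 With the development of digital technology, how to effectively organize, retrieve, and classify large volumes of image data has become an active research topic, and image classification is one of its key technologies. This thesis studies image classification based on local invariant features, covering two main aspects: image classification based on a multi-codebook bag-of-words model, and an improved image classification algorithm based on probabilistic latent semantic analysis. The main contributions are as follows. First, a class-specific codebook generation method based on a local-feature saliency criterion is proposed, and a multiclass classifier based on multiple codebooks is designed. The bag-of-words (BoW) model of images makes effective use of local invariant features and has been widely applied in image classification and related fields in recent years. Codebook generation is an important technique in the BoW model; the proposed class-specific codebook generation algorithm avoids, to some extent, the loss of codebook discriminability caused by clustering directly with K-means and similar methods. The multi-codebook multiclass classifier exploits the discriminability of each codebook while reducing the dimension of the BoW vector, and it also makes adding new categories easier. Second, probabilistic latent semantic analysis (pLSA) combined with spatial information is proposed and applied to scene classification. In the BoW model, visual words are obtained by clustering features, which inevitably produces some synonymous and polysemous words. pLSA can effectively handle synonymy and polysemy in BoW. However, typical pLSA-based image classification methods do not use the spatial information of the image. Spatial information is very important for image classification, so this work adds spatial information to pLSA and thereby improves classification accuracy. The spatial information has two parts: the neighboring words of each visual word, used as context, and the spatial position of each latent topic. Finally, simulation experiments are designed on this theoretical basis; the results are as expected and support both the conclusions of the theoretical study and the feasibility of practical application.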
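The multi-codebook idea can be illustrated with a minimal sketch. It is not the thesis's algorithm: the saliency criterion is not specified in the abstract, so plain k-means run separately on each class's descriptors stands in for it here. All function names (`kmeans`, `build_class_codebooks`, `bow_histogram`, `encode_per_class`) are hypothetical, and descriptors are assumed to be plain lists of floats.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns k centroids (lists of floats)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            clusters[nearest].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                dim = len(members[0])
                centroids[j] = [
                    sum(m[d] for m in members) / len(members) for d in range(dim)
                ]
    return centroids


def build_class_codebooks(descriptors_by_class, words_per_class):
    """One small codebook per class (stand-in: k-means per class)."""
    return {
        label: kmeans(descs, words_per_class)
        for label, descs in descriptors_by_class.items()
    }


def bow_histogram(descriptors, codebook):
    """Normalized histogram of nearest-codeword assignments."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda j: sq_dist(d, codebook[j]))
        hist[nearest] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def encode_per_class(descriptors, codebooks):
    """Encode one image against every class codebook separately."""
    return {label: bow_histogram(descriptors, cb) for label, cb in codebooks.items()}
```

Because each image is encoded against one small codebook at a time, each vector has only `words_per_class` dimensions, and supporting a new category only requires building one more codebook, which matches the two benefits the abstract claims for the multi-codebook classifier.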

【Abstract】 With the rapid development of digital technology in recent years, how to effectively organize, search, and classify vast quantities of images has become a valuable research subject, and image classification is one of its most important parts. In this dissertation, we discuss image classification techniques based on local invariant features, specifically the bag-of-words (BoW) model and probabilistic Latent Semantic Analysis (pLSA). The main work is as follows. First, we propose a new method for building class-specific codebooks based on feature significance, together with a multiclass classifier based on class-specific codebooks. The BoW model, which has been widely used in image classification, makes efficient use of local invariant features. The codebook is an important part of BoW, but the k-means-like clustering methods commonly used to build it may lower its discriminative ability. The technique introduced here alleviates this loss. The multiclass classifier based on class-specific codebooks takes advantage of the discriminative ability of each class-specific codebook and lowers the dimension of the BoW vector. Second, we propose a pLSA variant that incorporates spatial information and apply it to scene classification. In BoW, the code words come from feature clustering, which may produce polysemous and synonymous words. pLSA can address polysemy and synonymy and has been used successfully in scene classification as an intermediate representation of images. However, it does not use the spatial information of an image, which is important for scene classification. To improve classification accuracy, we propose a new method that incorporates into pLSA the spatial information carried by neighboring words and by the positions of topics. Finally, an image can be represented by the position distribution of each latent topic, and we then train a classifier on the resulting topic position distribution vector of each image.
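For orientation, standard pLSA (without the spatial extensions proposed in the thesis) can be fitted by EM on an image-by-visual-word count matrix. The sketch below is a minimal illustration under that assumption; the function name `plsa` and its parameters are hypothetical and do not come from the thesis.

```python
import random


def plsa(counts, n_topics, iters=50, seed=0):
    """Standard pLSA fitted by EM.

    counts[d][w] is the number of times visual word w occurs in image d.
    Returns (p_w_given_z, p_z_given_d) as nested lists of probabilities.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalized(row):
        s = sum(row)
        # Fall back to a uniform distribution if a row received no mass.
        return [x / s for x in row] if s > 0 else [1.0 / len(row)] * len(row)

    # Random positive initialization of P(w | z) and P(z | d).
    p_w_z = [
        normalized([rng.random() + 1e-3 for _ in range(n_words)])
        for _ in range(n_topics)
    ]
    p_z_d = [
        normalized([rng.random() + 1e-3 for _ in range(n_topics)])
        for _ in range(n_docs)
    ]

    for _ in range(iters):
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(w | z) * P(z | d).
                post = normalized(
                    [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                )
                # M-step accumulators: expected counts per topic.
                for z in range(n_topics):
                    new_w_z[z][w] += n * post[z]
                    new_z_d[d][z] += n * post[z]
        p_w_z = [normalized(row) for row in new_w_z]
        p_z_d = [normalized(row) for row in new_z_d]
    return p_w_z, p_z_d
```

In the plain setting, each row of `p_z_given_d` is the topic mixture of one image and can serve as its feature vector for a classifier. The thesis instead describes representing an image by the position distribution of each latent topic, which this sketch does not attempt to reproduce.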
