Feature Extraction Based on Graph Embedding and Visual Attention (基于图嵌入与视觉注意的特征抽取)

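【Author】 Zhao Cairong (赵才荣)

【Advisor】 Liu Chuancai (刘传才)

【Author information】 Nanjing University of Science and Technology, Computer Application Technology, 2011, PhD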

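【Summary】 In pattern recognition, finding an effective low-dimensional representation of high-dimensional data is a core problem, and feature extraction is the key step in solving it. This dissertation studies the theory and algorithms of feature extraction based on graph embedding and visual attention. The main work and results are as follows:

(1) In feature extraction algorithms based on graph embedding, constructing the neighborhood graph is the central problem of the whole algorithm. This dissertation improves the construction of the inter-class penalty graph and designs an inter-class repulsion graph. The improved inter-class penalty graph captures more local margin information, while the inter-class repulsion graph describes global margin information. Combining the strengths of the two graphs helps the algorithm find the best discriminant margin while optimizing its objective function. On this basis, a local maximal margin embedding algorithm fused with the inter-class repulsion graph (RLMME) is proposed. Experiments on the YALE, ORL, and AR face databases and the USPS handwritten digit database confirm that its recognition performance is better than that of PCA, LDA, LPP, and MFA.

(2) When constructing the neighborhood graph, it is important to design an edge weight function that correctly reflects the relationships among samples. In a neighborhood graph, the edge weight function is in essence a measure of the similarity or dissimilarity between samples. This dissertation proposes the fuzzy local maximal marginal embedding feature extraction algorithm (FLMME). In this algorithm, a new fuzzy gradual weight function is designed: among same-class neighbors, the closer the neighbor, the larger its weight; among neighbors from different classes, the closer the neighbor, the smaller its weight. Based on this weighting criterion, a fuzzy gradual intra-class neighborhood graph and a fuzzy gradual inter-class neighborhood penalty graph are constructed. In the projection subspace obtained from the newly constructed graphs, neighboring samples of the same class become more compact, while neighboring samples of different classes move farther apart. Experiments on the WINE data set, the YALE, ORL, and AR face databases, and the USPS handwritten digit database show that the algorithm is more effective than PCA, LDA, LPP, and RLMME.

(3) Over the past two decades, many computational models of visual attention have been proposed. These models still face the problems of selecting suitable early features and choosing a feature fusion strategy. To address this, the dissertation proposes an improved saliency-based visual attention algorithm with sparse embedding that incorporates edge information. In the early stage of visual feature extraction, edge features are introduced to strengthen the description of global contour information. In addition, taking into account differences in the saliency of different features, a sparse saliency factor is proposed to measure how salient a feature is: the sparser the feature, the higher its saliency. According to their degree of saliency, the different feature maps are recombined into a sparse-embedded saliency map. Experiments on natural color images show that, compared with traditional visual attention algorithms, the improved algorithm delineates salient regions more accurately and reasonably. Recognition experiments on the Sheffield buildings database also show that Gist features obtained with the proposed algorithm outperform those of the traditional method, which further demonstrates its effectiveness.

(4) Building on traditional building recognition methods, the dissertation proposes a multi-scale Gist feature manifold method and describes, stage by stage, the building recognition scheme based on it. In the feature extraction stage, a multi-scale Gist feature is extracted to describe the global structural information of building images. Because high-dimensional Gist features are intrinsically embedded in a low-dimensional structure, in the dimensionality reduction stage an enhanced fuzzy local maximal marginal embedding algorithm (EFLMME) is proposed to reduce the dimensionality of the Gist features. Experiments on the Sheffield buildings database show that, compared with traditional building recognition methods, the proposed method is robust to illumination changes, rotation, and occlusion, and that the recognition rate on building images is significantly improved.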
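The graph construction described in items (1) and (2) can be illustrated with a minimal sketch. It is not the RLMME or FLMME algorithm from the dissertation, whose objective and weight functions are not given in this record. It assumes a generic margin criterion (inter-class penalty graph Laplacian minus intra-class graph Laplacian) and two hypothetical weight functions that merely follow the stated rule: closer same-class neighbors get larger weights, closer different-class neighbors get smaller weights. The names `knn_graph`, `margin_embedding`, and the parameters `k` and `t` are illustrative only.

```python
import numpy as np

def knn_graph(X, y, k, same_class, weight):
    """Symmetric weighted k-NN graph over same-class or different-class pairs."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples (rows of X).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        mask = (y == y[i]) if same_class else (y != y[i])
        mask[i] = False  # a sample is never its own neighbor
        idx = np.flatnonzero(mask)
        nearest = idx[np.argsort(d2[i, idx])[:k]]
        W[i, nearest] = weight(d2[i, nearest])
    return np.maximum(W, W.T)  # symmetrize

def laplacian(W):
    """Unnormalized graph Laplacian D - W."""
    return np.diag(W.sum(axis=1)) - W

def margin_embedding(X, y, k=2, t=1.0, dim=1):
    """Linear projection maximizing penalty-graph scatter minus intra-class scatter."""
    # Hypothetical weights: closer same-class neighbor -> larger weight.
    Wc = knn_graph(X, y, k, True, lambda d2: np.exp(-d2 / t))
    # Hypothetical weights: closer different-class neighbor -> smaller weight.
    Wp = knn_graph(X, y, k, False, lambda d2: 1.0 - np.exp(-d2 / t))
    M = X.T @ (laplacian(Wp) - laplacian(Wc)) @ X  # symmetric d x d matrix
    vals, vecs = np.linalg.eigh(M)                 # eigenvalues in ascending order
    return vecs[:, ::-1][:, :dim]                  # eigenvectors of the largest ones

# Toy data: two classes separated along the first axis, spread along the second.
spread = np.arange(-2.0, 3.0)
X = np.vstack([np.column_stack([np.zeros(5), spread]),
               np.column_stack([np.full(5, 5.0), spread])])
y = np.array([0] * 5 + [1] * 5)
P = margin_embedding(X, y)
Z = X @ P  # one-dimensional projected features
```

On this toy data the leading direction is dominated by the first axis, which is the one that separates the two classes. With orthonormal projection vectors the criterion reduces to a symmetric eigenproblem, which is why `numpy.linalg.eigh` suffices here.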

【Abstract】 Finding an effective low-dimensional representation of high-dimensional data is one of the most important problems in pattern recognition, and feature extraction is a key step in solving it. This dissertation presents an in-depth study of feature extraction theory and algorithms based on graph embedding and visual attention. The main work and results are as follows:

(1) In graph embedding algorithms for feature extraction, the construction of the neighborhood graph is an essential problem. The dissertation improves the construction of the inter-class penalty graph and designs a novel inter-class repulsion graph. The improved inter-class penalty graph characterizes more local margin information and the inter-class repulsion graph describes global margin information, which together help find the optimal discriminant margin. Based on these local and global inter-class graphs, we propose a local maximal margin embedding algorithm combined with the inter-class repulsion graph (RLMME). Experimental results on the Yale, ORL, and AR face databases and the USPS handwritten digit database show that the proposed algorithm outperforms PCA, LDA, LPP, and MFA.

(2) When constructing the neighborhood graph, it is crucial to design an edge weight function that correctly reflects the relationships among the samples. In the neighborhood graph, the edge weight is, in essence, a measure of the similarity or dissimilarity between samples. The dissertation presents an improved algorithm called fuzzy local maximal marginal embedding (FLMME) for linear dimensionality reduction. Unlike existing graph-based algorithms, FLMME constructs two novel fuzzy gradual graphs, which pull near neighbors of the same class closer together and push near neighbors on the margin between different classes farther apart when the samples are projected into the feature subspace. The FLMME algorithm is evaluated on the WINE database, the Yale, ORL, and AR face image databases, and the USPS handwritten digit database. The results show that FLMME outperforms PCA, LDA, LPP, and RLMME.

(3) Numerous computational models of visual attention have been proposed during the last two decades. Challenges remain, however, such as which early visual features should be extracted and how to combine these different features into a single "saliency" map. To address these challenges, we propose a sparse embedding visual attention system combined with edge information, described as a hierarchical model. In the first stage, we extract edge information in addition to color, intensity, and orientation as early visual features, adding global edge information to the saliency maps. In the second stage, we present a novel sparse embedding feature combination strategy based on a sparse saliency factor. Results on scene images show that our model outperforms other computational models of visual attention. In addition, experimental results on the Sheffield buildings database show that gist features based on the proposed method achieve better performance than those of the traditional method, which further confirms the effectiveness of the proposed method.

(4) A multi-scale gist (MS-gist) feature manifold for building recognition is presented, described as a two-stage model. In the first stage, we extract multi-scale gist features that represent the structural information of the building images. Since the MS-gist features are extrinsically high dimensional and intrinsically low dimensional, in the second stage an enhanced fuzzy local maximal marginal embedding (EFLMME) algorithm is proposed to project the MS-gist feature manifold to a low-dimensional subspace. To evaluate the model, experiments were carried out on the Sheffield buildings database. Results show that the proposed model is superior to other models for building recognition and handles rotations, varying lighting conditions, and occlusions very well.
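The feature combination idea in item (3), weighting each feature map by how sparse it is, can be sketched as follows. The dissertation's actual sparse saliency factor is not defined in this record, so the sketch substitutes Hoyer's sparseness measure as a stand-in, and the function names `hoyer_sparseness` and `combine_feature_maps` are hypothetical. The feature maps are assumed to be already normalized to a common range and to contain more than one pixel.

```python
import numpy as np

def hoyer_sparseness(feature_map, eps=1e-12):
    """Hoyer sparseness in [0, 1]: 1 for a single active pixel, 0 for a flat map.

    Used here only as a stand-in for the (unspecified) sparse saliency factor.
    """
    x = np.abs(np.asarray(feature_map, dtype=float)).ravel()
    n = x.size
    l1 = x.sum()
    l2 = np.sqrt((x ** 2).sum())
    if n < 2 or l2 < eps:
        return 0.0  # degenerate map: treat as not salient
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1.0)

def combine_feature_maps(maps):
    """Weighted sum of feature maps, with weights proportional to sparseness."""
    maps = [np.asarray(m, dtype=float) for m in maps]
    factors = np.array([hoyer_sparseness(m) for m in maps])
    total = factors.sum()
    if total > 0:
        weights = factors / total
    else:
        weights = np.full(len(maps), 1.0 / len(maps))  # fall back to equal weights
    saliency = sum(w * m for w, m in zip(weights, maps))
    return saliency, weights

# Toy example: one map with a single strong response, one uniformly active map.
peaked = np.zeros((8, 8))
peaked[3, 4] = 1.0
flat = np.ones((8, 8))
saliency, weights = combine_feature_maps([peaked, flat])
```

In this toy case the peaked map receives all of the weight and the flat map none, so the combined map keeps its maximum at the single active location. A real system would compute such maps from color, intensity, orientation, and edge channels before combining them.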
