Research on Facial Expression Recognition Algorithms Based on Spectral Graph Theory

【Author】 Zhi Ruicong

【Supervisor】 Ruan Qiuqi

【Author information】 Beijing Jiaotong University, Signal and Information Processing, 2010, PhD

【Abstract】 With the rapid development of information and computer technology, facial expression recognition has received increasing attention. It is an important foundation of intelligent human-computer interaction and draws on image processing, motion tracking, pattern recognition, physiology, and psychology. It is currently an active research topic in pattern recognition and artificial intelligence. This thesis studies several problems in facial expression feature extraction. Based on spectral graph theory, it analyzes the intrinsic characteristics of facial expression images and proposes features that represent facial expressions effectively for classification. The main contributions are as follows.

First, to discover the intrinsic structure of facial expression image samples, a supervised spectral analysis (SSA) algorithm is used to extract expression features. The samples are represented as a graph, and the structure of this graph is processed with spectral graph analysis. Compared with traditional spectral clustering and other dimensionality reduction methods, SSA has three advantages: (1) it does not suffer from the small-sample-size problem, because it applies the matrix transformation directly to the sample vectors and needs no other dimensionality reduction method as preprocessing; (2) it uses the class labels of the samples and treats the data points and their relationships as a connected graph, so that the projected data preserve the structure of the original graph well; (3) it can reflect the underlying nonlinear structure of the data. Experimental results show that SSA extracts facial expression features effectively and improves recognition accuracy.

Second, to enhance the discriminant power of spectral analysis, a discriminant spectral analysis (DSA) algorithm is proposed. Spectral analysis mainly preserves the nonlinear intra-locality structure of the data, that is, the neighborhood relationships between samples of the same class, but it ignores the relationships between different expression classes, which degrades classification. To address this, discriminant information is introduced into the spectral analysis algorithm. By considering both the nonlinear intra-locality structure and the nonlinear inter-locality structure of the data, the method preserves the neighborhood relationships among expression classes as well as those among data points, and thus yields more discriminative facial expression features.

Third, vector-based dimensionality reduction methods suffer from very high-dimensional data matrices and a high computational cost. To overcome these problems, Two-dimensional Fuzzy Discriminant Locality Preserving Projections (2D-FDLPP) is proposed. Fuzziness and discriminant information are introduced into supervised locality preserving projections, and the algorithm is extended to operate on two-dimensional image matrices. A matrix-based method extracts features directly from the image matrices without converting each two-dimensional image into a one-dimensional vector. It therefore avoids problems such as matrix singularity, and the extracted features retain more image information. Building on two-dimensional locality preserving projections, a fuzzy k-nearest neighbor classifier is used to compute the class membership degree of each sample and to construct a fuzzy weight matrix, which separates the similar features shared by similar expression classes. In addition, the weighted between-class scatter, which characterizes the neighborhood relationships among expression classes, is introduced into the objective function. The objective thus accounts for locality preservation both among neighboring samples and among expression classes, and produces highly discriminative expression features.

Fourth, a graph-preserving sparse non-negative matrix factorization (GSNMF) algorithm is proposed and applied to facial expression feature extraction. The factor matrices produced by commonly used matrix-factorization-based dimensionality reduction methods often contain negative values, which have no physical meaning in facial expression image analysis. Following the idea of non-negative matrix factorization, a non-negativity constraint is therefore added to the factorization. In addition, based on spectral graph theory, a graph-preserving constraint and a sparseness constraint are introduced into non-negative matrix factorization. The constrained factorization yields parts-based basis images that represent individual facial regions, and a whole expression image is represented as a linear combination of these basis images. Furthermore, a projected gradient framework for solving constrained non-negative matrix factorization is proposed. The projected gradient method is used to compute the factor matrices so that the limit points are stationary, which ensures that the solution satisfies the optimality conditions. Extensive experiments show that GSNMF is effective for facial expression recognition and is robust, to a certain degree, to partially occluded facial expression images.
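The fourth contribution can be illustrated with a minimal sketch. The abstract does not state the GSNMF objective, so the code below assumes a generic graph-regularized sparse NMF objective, 0.5*||V - WH||_F^2 + 0.5*alpha*tr(H L H^T) + beta*sum(H), and should not be read as the formulation used in the thesis. The k-nearest-neighbor graph, the weights alpha and beta, and the function names knn_graph_laplacian and graph_sparse_nmf are all illustrative assumptions. Each update is a gradient step followed by projection onto the non-negative orthant, with the step size set to the reciprocal of the Lipschitz constant of the block gradient so that the objective cannot increase.

```python
import numpy as np


def knn_graph_laplacian(X, k=3):
    """Unnormalized Laplacian L = D - A of a symmetric k-NN graph.

    X holds one sample per column (assumed layout). Requires k < X.shape[1].
    """
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-loops
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest per sample
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    A = np.maximum(A, A.T)                            # symmetrize
    return np.diag(A.sum(axis=1)) - A


def objective(V, W, H, L, alpha, beta):
    """Assumed objective: reconstruction + graph smoothness + sparsity of H."""
    return (0.5 * np.linalg.norm(V - W @ H, "fro") ** 2
            + 0.5 * alpha * np.trace(H @ L @ H.T)
            + beta * H.sum())


def graph_sparse_nmf(V, L, rank, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Alternating projected gradient for the assumed objective.

    V: non-negative data matrix (features x samples).
    Returns basis W (features x rank), codes H (rank x samples),
    and the objective value after each iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank))
    H = rng.random((rank, n))
    L_norm = np.linalg.norm(L, 2)  # spectral norm of the Laplacian
    history = []
    for _ in range(iters):
        # W step: gradient of the reconstruction term, then project onto W >= 0.
        grad_W = (W @ H - V) @ H.T
        step_W = 1.0 / max(np.linalg.norm(H @ H.T, 2), 1e-12)
        W = np.maximum(W - step_W * grad_W, 0.0)

        # H step: add the graph and sparsity gradients, then project onto H >= 0.
        grad_H = W.T @ (W @ H - V) + alpha * (H @ L) + beta
        lip_H = np.linalg.norm(W.T @ W, 2) + alpha * L_norm
        H = np.maximum(H - grad_H / max(lip_H, 1e-12), 0.0)

        history.append(objective(V, W, H, L, alpha, beta))
    return W, H, history


if __name__ == "__main__":
    # Random non-negative data standing in for 20 vectorized 8x8 images.
    V = np.random.default_rng(1).random((64, 20))
    L = knn_graph_laplacian(V, k=3)
    W, H, history = graph_sparse_nmf(V, L, rank=5)
    print(history[0], history[-1])
```

In this sketch the columns of W play the role of the basis images and each column of H holds the non-negative coefficients that combine them into one image. A fixed 1/Lipschitz step is the simplest choice that keeps the objective from increasing; a line search would be a natural refinement.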
