
基于张量表示的人脸表情识别算法研究

Research on Facial Expression Recognition Algorithms Based on Tensor Representation

【Author】 Liu Shuai

【Advisor】 Ruan Qiuqi

【Author information】 Beijing Jiaotong University, Signal and Information Processing, 2012, PhD

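【Abstract (translated from the Chinese)】 Facial expression is a basic way for humans to express emotion and an important part of nonverbal communication. In recent years, with the development of natural human-computer interaction and intelligent robots, automatic facial expression recognition by computer has attracted growing attention. This thesis studies feature extraction from appearance-based facial expression images, mainly using tensor representation and decomposition of image data combined with graph-preserving manifold learning methods, with the extracted features ultimately used for expression recognition. The main contributions are:

1. A tensor subspace model is abstracted, and on this basis the tensor projections in traditional tensor algorithms are orthogonalized, yielding orthogonal tensor manifold learning algorithms. These achieve better recognition results, and a theoretical explanation is given.
1) From an analysis of a large number of dimensionality reduction algorithms based on "tensor-to-tensor" mappings, a unified tensor subspace model is proposed, which defines in detail the basis of a tensor subspace (basis tensors), projection, and reconstruction. The model is a natural extension of the vector subspace model and offers a new perspective on tensor dimensionality reduction. Under this model, concepts and properties from vector dimensionality reduction algorithms (such as orthogonality, non-negativity, and sparsity) can be introduced naturally into tensor algorithms, improving the performance of the corresponding tensor algorithms.
2) Based on the tensor subspace model, the orthogonalization of tensor projections is studied, and existing tensor manifold learning algorithms are given orthogonal versions: Orthogonal Tensor Neighborhood Preserving Embedding (OTNPE) and Orthogonal Tensor Marginal Fisher Analysis (OTMFA). Theoretical analysis and experimental results show that orthogonalizing the projection lets traditional tensor manifold learning algorithms better preserve the manifold structure of facial expressions, which improves both the representation and the recognition of facial expressions.

2. Tensor rank-one decomposition is combined with the graph-preserving manifold learning criterion to give the Tensor Rank-One Differential Graph Preserving Analysis algorithm (TR1DGPA). TR1DGPA first constructs a penalty graph that captures pairwise inter-class discrimination and, following Locally Linear Embedding (LLE), an intra-class neighbor graph. The two are combined as a difference to form the differential graph preserving objective. Under this objective, each tensor sample is decomposed into a weighted linear combination of a common set of rank-one tensors, and the weights form the low-dimensional feature of that sample. TR1DGPA has the following properties: (1) it preserves the spatial arrangement information inside each tensor sample; (2) it preserves the local manifold within each class; (3) it strengthens pairwise discrimination between classes. The algorithm converges well and has lower computational complexity than both "tensor-to-tensor" mapping algorithms and vector dimensionality reduction algorithms. Experiments show that TR1DGPA is more effective for appearance-based facial expression recognition than a number of earlier related algorithms.

3. The Orthogonal Tensor Rank-One Differential Graph Preserving Projections algorithm (OTR1DGPP) is proposed. OTR1DGPP aims to solve, under the differential graph preserving objective and following the tensor rank-one decomposition principle, for a set of orthogonal basis tensors used for projection. The algorithm introduces a new orthogonalization scheme that is more flexible than those of earlier similar algorithms and converges well. Experiments show that OTR1DGPP achieves better facial expression recognition results than a number of earlier orthogonal algorithms.

4. Non-negative Tensor Factorization (NTF) is combined with the graph-preserving manifold learning criterion to give the Discriminant Neighborhood Preserving Non-negative Tensor Factorization algorithm (DNPNTF). NTF decomposes a set of non-negative tensor samples into a linear combination of non-negative basis tensors and weight coefficients. However, NTF is optimal only in the reconstruction sense and takes no account of the manifold structure or discriminative information of the original sample set. DNPNTF adds a graph-preserving constraint to NTF, so that the resulting non-negative basis tensors preserve the local manifold within each class and the separation between different classes at the same time. The solution uses gradient descent, from which multiplicative update rules are constructed to guarantee non-negativity. The convergence of the algorithm is proved in detail. Experiments show that DNPNTF recognizes facial expressions more effectively than other related non-negative algorithms, and that the non-negative basis images it produces are sparser.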
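The tensor subspace model in contribution 1 can be illustrated for second-order tensors (images). The following is a minimal sketch, not the thesis's algorithm: it assumes the projection takes the common bilinear form Y = U1^T X U2 with orthonormal columns in U1 and U2, and it draws those matrices at random rather than learning them from a graph-preserving objective. The function names (`orthonormal_basis`, `project`, `reconstruct`) and the image size are hypothetical.

```python
import numpy as np

def orthonormal_basis(rows, cols, rng):
    """Return a rows x cols matrix with orthonormal columns (via reduced QR)."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q

def project(X, U1, U2):
    """Tensor-to-tensor projection of a second-order tensor: Y = U1^T X U2."""
    return U1.T @ X @ U2

def reconstruct(Y, U1, U2):
    """Map the low-dimensional tensor back to image space: X_hat = U1 Y U2^T."""
    return U1 @ Y @ U2.T

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 24))   # stand-in for a 32x24 face image
U1 = orthonormal_basis(32, 5, rng)  # mode-1 projection matrix (assumed, random)
U2 = orthonormal_basis(24, 4, rng)  # mode-2 projection matrix (assumed, random)

Y = project(X, U1, U2)              # 5x4 feature tensor
X_hat = reconstruct(Y, U1, U2)      # projection of X onto the tensor subspace

# Each coefficient Y[i, j] is the inner product of X with one rank-one
# basis tensor, the outer product of column i of U1 and column j of U2.
i, j = 2, 3
basis_ij = np.outer(U1[:, i], U2[:, j])
coeff_ij = np.sum(X * basis_ij)
```

Because the columns are orthonormal, the basis tensors are mutually orthogonal, so projecting the reconstruction `X_hat` again returns the same `Y`. This is the sense in which the vector-subspace notions of basis, projection, and reconstruction carry over to tensors.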

【Abstract】 As a basic way of displaying inner emotions, facial expression is an important part of nonverbal communication between people. In recent years, with the development of natural human-machine interaction and intelligent robots, automatic facial expression recognition has attracted increasing attention. This thesis studies feature extraction from appearance-based facial expression images for recognition, using tensor representation and decomposition techniques combined with graph-preserving manifold learning methods. The innovative work of this thesis includes:

1. We formulate a tensor subspace model, use it to orthogonalize the tensor projections of traditional tensor algorithms, and propose orthogonal tensor manifold learning algorithms.
1) Through an analysis of several "tensor-to-tensor" projection based dimensionality reduction algorithms, we derive a generalized tensor subspace model that explicitly defines the basis of the tensor subspace (basis tensors), the tensor subspace projection, and the reconstruction. The model can be regarded as a natural extension of the vector subspace model, and it lets us consider tensor dimensionality reduction algorithms from a new perspective. It makes concepts and properties of vector dimensionality reduction algorithms (e.g. orthogonality, non-negativity, and sparseness) available to tensor algorithms, so that the performance of the corresponding tensor algorithms may be improved.
2) Under this tensor subspace model, we investigate the orthogonality of the tensor projection and derive orthogonal versions of existing tensor-based manifold learning algorithms, proposing Orthogonal Tensor Neighborhood Preserving Embedding (OTNPE) and Orthogonal Tensor Marginal Fisher Analysis (OTMFA). Both the theoretical analysis and the experimental results show that orthogonalization makes traditional tensor-based manifold learning algorithms preserve the facial expression manifold much better, so that facial expression representation and recognition are improved.

2. We propose the Tensor Rank-One Differential Graph Preserving Analysis (TR1DGPA) algorithm by combining the tensor rank-one decomposition technique with the graph-preserving manifold learning criterion. First, a penalty graph representing pairwise inter-class discrimination is constructed, and an intra-class affinity graph is built following Locally Linear Embedding (LLE). The differential graph preserving objective is then formed as the difference of these two graphs. Finally, under this objective, TR1DGPA decomposes each original tensor sample into a linear combination of rank-one tensors, whose coefficients form the low-dimensional feature of the sample. TR1DGPA has the following characteristics: (1) it preserves the inner spatial structure of the original tensor samples; (2) it preserves the intra-class local manifold; (3) it enhances pairwise inter-class separability. TR1DGPA converges well and has lower computational complexity than both the vector representation based algorithms and the "tensor-to-tensor" projection based algorithms. In the experiments, TR1DGPA proves more effective for appearance-based facial expression recognition than several earlier related algorithms.

3. We propose the Orthogonal Tensor Rank-One Differential Graph Preserving Projections (OTR1DGPP) algorithm. Based on the differential graph preserving objective function, OTR1DGPP aims to obtain a set of orthogonal rank-one basis tensors for projection according to the tensor rank-one decomposition principle. The algorithm introduces a novel and effective orthogonalization process that converges well and is more flexible than those of earlier similar algorithms. The experiments show that OTR1DGPP obtains better facial expression recognition results than several earlier related orthogonal algorithms.

4. We propose the Discriminant Neighborhood Preserving Non-negative Tensor Factorization (DNPNTF) algorithm by combining Non-negative Tensor Factorization (NTF) with the graph-preserving manifold learning principle. NTF decomposes an ensemble of non-negative tensors into a group of non-negative basis tensors and the corresponding weight coefficients. However, NTF is based on optimal reconstruction and does not consider the manifold structure or the discriminative information in the original samples. DNPNTF adds a graph-preserving constraint to NTF, so that the resulting non-negative basis tensors preserve the intra-class manifold and maintain inter-class separability. To solve the problem, we adopt the gradient descent method and construct multiplicative update rules, which ensure the non-negativity of the solutions. We also give a detailed proof of the convergence of the algorithm. The facial expression recognition experiments verify that DNPNTF is more effective than related non-negative algorithms, and the non-negative basis images obtained by DNPNTF are sparser.
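The multiplicative update idea mentioned in contribution 4 can be sketched with a simpler, well-known relative. The code below is plain non-negative matrix factorization with the standard Lee-Seung updates for the squared reconstruction error, applied to vectorized samples. It is not DNPNTF: it works on matrices rather than tensors and omits the graph-preserving constraint entirely. The function name `nmf_multiplicative`, the rank, and the random data are assumptions made for illustration.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (pixels x samples) as W @ H.

    Uses the standard Lee-Seung multiplicative updates. Each update
    multiplies by a ratio of non-negative terms, so W and H stay
    non-negative without any explicit projection step.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # non-negative basis vectors (columns)
    H = rng.random((rank, m)) + eps   # non-negative weight coefficients
    errors = []
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(V - W @ H))  # Frobenius reconstruction error
    return W, H, errors

rng = np.random.default_rng(1)
V = rng.random((64, 30))  # 30 synthetic non-negative "images" of 64 pixels, one per column
W, H, errors = nmf_multiplicative(V, rank=6)
```

In this sketch the columns of `W` play the role of non-negative basis images and the columns of `H` are the low-dimensional features. A constrained variant such as the one the abstract describes would change the update ratios to account for the extra graph terms, which is not attempted here.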
