Research on Highly Accurate Face Recognition Algorithms

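【Author】 Deng Weihong (邓伟洪)

【Advisor】 Guo Jun (郭军)

【Author information】 Beijing University of Posts and Telecommunications, Signal and Information Processing, 2009, Ph.D.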

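【Abstract (translated from the Chinese)】 Face recognition is an old yet young academic problem whose study has spanned three centuries. As early as 1888, Nature published the first article on identifying people by their faces, opening the exploration of face recognition. In 2008, 120 years later, Science published a technical comment on "100% automatic face recognition accuracy", pointing out that highly accurate face recognition in real-world environments remains a distant goal that calls for sustained, innovative research. Face recognition is a special and complex pattern recognition problem, characterized by: 1) high-dimensional training sets with small sample sizes; 2) within-class image variations far larger than between-class variations; and 3) a complex sample structure in feature space. Most existing recognition algorithms, however, start from a particular objective function and give little consideration to the intrinsic properties of faces. Although they can describe data sets with fixed variations well mathematically, and thus achieve high recognition accuracy on them, they cannot solve face recognition in complex real-world environments. To raise the technical level fundamentally, so that algorithms can cope with the complexity and unpredictability of real environments, this thesis introduces two new ideas. First, starting from how face classes are distributed in feature space, it uses prior knowledge of that distribution to design new feature extraction algorithms adapted to the intrinsic properties of faces. Second, it draws on interdisciplinary knowledge from psychology and neuroscience and uses multi-level biomimetic techniques to design biologically inspired face feature extraction algorithms. Guided by these two ideas and aimed at highly accurate face recognition in real environments, the thesis carries out the following research.
(1) The class distribution in face image space is explored experimentally from four perspectives: global scatter analysis, local overlap analysis, manifold structure analysis, and classification error. The face images of different classes show similar structures in the measurement space and overlap heavily. Based on this special class structure, two prior assumptions about the class distribution of face space are proposed. First, every class has the same covariance matrix, and the within-class scatter is much larger than the between-class scatter, so that each class covariance matrix approximately equals the global covariance matrix. Second, the between-class scatter among different people lies in a common subspace, and the principal directions of this between-class subspace do not "conflict" with the principal directions of the within-class scatter subspace. These two priors free the algorithm from training sets of specific people, so that the feature extraction model can instead be trained on a large-scale generic database. Large-scale experiments on the FERET database show that, within this generic learning framework based on class-structure priors, classical linear discriminant analysis alone achieves accuracy far above the best results of the international evaluation.
(2) Through an in-depth analysis of current feature extraction algorithms, an algorithmic framework based on projection pursuit is proposed. It unifies mainstream feature extraction algorithms such as principal component analysis, independent component analysis, linear discriminant analysis, locality preserving projections, and unsupervised discriminant projection, and gives a novel, comprehensive formulation of feature extraction in face recognition. In the projection pursuit framework, the algorithm first preprocesses the data with a whitening transform and then uses optimization to find low-dimensional projection vectors with optimal projection properties. Experiments within this framework show that none of the current unsupervised feature extraction algorithms finds good low-dimensional projections; their high performance in face recognition comes mainly from the way whitening transforms the face class structure, not from the low-dimensional projection credited in earlier studies. To make full use of the prior knowledge of face class structure, a locality pursuit algorithm is proposed that preserves the local neighborhood properties of samples as far as possible in the whitened space. Experiments on the AR and FERET databases show that the unsupervised and supervised versions, and the linear and kernel versions, of locality pursuit all achieve higher accuracy than current algorithms of the same type.
(3) Starting from the geometric intuition of "sample uniformization", a locality dispersing pursuit algorithm is proposed to address face recognition with a single training sample per person. The algorithm aims to disperse classes that lie close together in feature space, reducing the risk of confusion between classes and thereby improving accuracy. It introduces two new methods of sample uniformization. First, a new whitening transform theory based on singular value decomposition: in the small-sample case, this transform maps the training set to an orthonormal set, making the spacing between samples completely uniform. Second, locality dispersing projection: a localized modification of principal component analysis that reduces dimensionality while making originally locally clustered samples sparse. The kernel version further improves the algorithm's ability to learn nonlinear structure and reaches the same accuracy with fewer features. On the large-scale FERET database of 1196 people, locality dispersing pursuit, trained with only a single sample per person, achieves accuracy far above the best results of the FERET'97 evaluation. To improve accuracy further, a new method that fuses a generic feature model with person-specific feature models is also proposed; it achieves over 90% recognition accuracy on the duplicate probe set of the FERET evaluation, the highest accuracy published on this test set to date.
(4) Starting from the intersection of psychology, neuroscience, and pattern recognition, a multi-level biologically inspired feature extraction method is proposed. It has three levels. First, in low-level feature generation, a set of biologically inspired features derived from the response properties of the early visual pathway is used to represent face images. These features target several visual properties, including spatial locality, spatial frequency selectivity, edge orientation selectivity, and chromatic selectivity. Second, the mapping from low-level to high-level features uses an incremental robust discriminant analysis model: incremental principal component analysis first maps the high-dimensional low-level features to generic mid-level features, and a robust discriminant analysis model then maps these to high-level features dedicated to identity recognition. Third, different face features are encoded independently, forming independent visual pathways. The final recognition decision fuses the similarities of the multiple face codes obtained from the different visual pathways. The biologically inspired face recognition system emulates the complex information fusion strategy of humans and effectively integrates multiple information sources to obtain stable recognition performance. In FRGC version 2 Experiment 4, the system achieves a verification rate above 93%, which not only exceeds the best performance of the FRGC 2005 evaluation but is also the highest accuracy published for this standard experiment to date.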
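
The SVD-based whitening mentioned in item (3) can be illustrated with a small numerical sketch. This is not the thesis implementation: the function name `svd_whiten`, the tolerance `tol`, the absence of mean-centering, and the random stand-in data are all assumptions made for illustration. It only demonstrates the general linear-algebra fact that, when there are fewer samples than dimensions and the samples are linearly independent, whitening through the thin SVD maps them to an orthonormal set, so every pair of samples ends up the same distance apart.

```python
import numpy as np

def svd_whiten(X, tol=1e-10):
    """Whiten the rows of X (n_samples x n_features) via the thin SVD.

    Hypothetical helper: with X = U S V^T, the map W = V S^{-1} sends
    X to U, whose rows are orthonormal when the samples are linearly
    independent and n_samples <= n_features.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    keep = s > tol * s.max()          # drop numerically null directions
    W = Vt[keep].T / s[keep]          # (n_features, rank) whitening matrix
    return X @ W, W

# Small-sample setting: 5 "images" in a 50-dimensional space (random stand-ins).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 50))

Z, W = svd_whiten(X)
gram = Z @ Z.T                        # identity if the rows are orthonormal

# Pairwise Euclidean distances between the whitened samples.
diffs = Z[:, None, :] - Z[None, :, :]
dists = np.sqrt((diffs ** 2).sum(axis=-1))
```

After this transform the Gram matrix of the samples is the identity, and all off-diagonal distances are equal, which is one way to read "completely uniform sample spacing". Whether the thesis centers the data first or regularizes the singular values is not stated in the abstract.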

【Abstract】 Humans have an excellent aptitude for face recognition, and it is our dream to give machines the same intelligent recognition ability. This dream, together with curiosity, has driven continuing research on automatic face recognition. With the development of modern information technology, automatic face recognition has gained importance in broad fields such as military, commercial, and security applications, by virtue of its good applicability and non-intrusive nature. Face recognition has become one of the most representative and challenging research topics in pattern recognition. It is an old but young academic problem that people have thought about across three centuries. As early as 1888, Nature published the first academic paper on face recognition, starting the exploration of the problem. In 2008, 120 years later, Science published a technical comment on "100% automatic face recognition accuracy", which pointed out that highly accurate face recognition in real-world applications is still an ambitious goal whose solution requires us to continue innovative research. Face recognition is a special, complex pattern recognition problem whose particularities lie in its high feature dimensionality versus small sample size, its large within-class variations versus small between-class variations, and so on. Because of these particularities, mature pattern classification theory cannot be applied directly to face recognition, which makes feature extraction the determinant of the accuracy level. Unfortunately, traditional feature extraction methods often start from a specific objective function, which lets the algorithm adapt to the specific variations contained in a certain data set and thus achieve high recognition accuracy on it. However, they cannot solve the real-world recognition problem under complex conditions. To improve on state-of-the-art performance, we introduce two core ideas in this thesis. First, we capitalize on prior knowledge of the class configuration of the face feature space to design algorithms that adapt to the intrinsic properties of the face pattern. Second, we take an interdisciplinary approach, drawing on the vast accumulated knowledge of both the computer vision and psychology communities to solve the face recognition problem. Inspired by these two core ideas, the major research contents of this thesis are as follows.
(1) We explore the class configuration of the face space through global scatter analysis, local overlap analysis, manifold structure analysis, and classification error, and conclude that the face classes share a common structure in the measurement space and that their structures overlap heavily. Based on these characteristics, we suggest two important prior assumptions about the face space. First, all classes share the same covariance matrix, and the within-class scatter is much larger than the between-class scatter, so that the class-conditional covariance is roughly equal to the global covariance of the whole data set. Second, the differences between persons reside in a common between-class subspace, and the principal directions of the between-class subspace do not conflict with those of the within-class subspace. These two prior assumptions allow us to train the algorithm on a large generic data set. Large-scale experiments on the FERET database show that a traditional algorithm that makes use of these prior assumptions can largely outperform the best results reported in the international evaluation.
(2) A family of linear feature extraction methods, such as PCA, ICA, LDA, LPP, and UDP, has proved effective in addressing this challenging problem. In this thesis we unify these methods into a projection pursuit framework with different projection indices, and suggest that their effectiveness in face recognition comes mainly from the whitening process. We propose a locality pursuit algorithm, which pursues the optimum projection that preserves or dissipates the local clusters in the whitened space. Experiments on the AR and FERET databases show that the proposed method achieves better face recognition performance than other unsupervised methods.
(3) Inspired by the geometric intuition of "sample uniformity", we propose a locality pursuit algorithm that aims to solve the challenging problem of face recognition from a single image per person. The idea of the algorithm is to disperse samples that are close in the measurement space, and thus reduce the risk of recognition error. Experiments on the FERET database, which contains 1196 persons, show that the locality pursuit algorithm outperforms the best result of the FERET'97 evaluation by a large margin. To further improve face recognition accuracy, we propose to fuse the generic model with identity-specific models. The new method achieves over 90% accuracy on the FERET duplicate probe set, the highest accuracy reported on this challenging probe set.
(4) Face recognition technology is of great significance for applications involving national security and crime prevention. Despite enormous progress in this field, machine-based systems are still far from matching the versatility and reliability of human face recognition. In this thesis, we show that a simple system designed by emulating biological strategies of the human visual system can largely surpass state-of-the-art performance on uncontrolled face recognition. In particular, the proposed system integrates dual retinal texture and color features for face representation, an incremental robust discriminant model for high-level face coding, and a hierarchical cue-fusion method for similarity qualification. We demonstrate the strength of the system on the large-scale face verification task following the evaluation protocol of FRGC version 2 Experiment 4. The results are surprisingly good: its modules significantly outperform their state-of-the-art counterparts, such as Gabor image representation, local binary patterns, and the enhanced Fisher model. Furthermore, the integrated system reduces the recognition error rate by 71.3 percent relative to the FRGC 2005 best result.
In summary, the algorithms proposed in this thesis are simple but effective, with clear theoretical meaning. They achieve excellent results on the FERET and FRGC experiments, the two major evaluations in the literature, which clearly shows that the core ideas of this thesis are highly applicable in real-world settings and worth further research.
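
The first prior assumption in item (1) rests on the standard scatter decomposition: total scatter equals within-class scatter plus between-class scatter. When the within-class term dominates, each class covariance is close to the global one. The sketch below checks this on synthetic data; it is only an illustration, not the thesis experiment. The helper `scatter_matrices`, the class counts, and the noise scales are all hypothetical choices.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return within-class (Sw), between-class (Sb) and total (St) scatter.

    Hypothetical helper; X is (n_samples, n_features), y holds class labels.
    """
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)            # spread around the class mean
        diff = (mc - mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)          # spread of class means
    St = (X - mean).T @ (X - mean)
    return Sw, Sb, St

# Synthetic "faces": class centers close together, large within-class noise.
rng = np.random.default_rng(0)
n_classes, per_class, dim = 20, 10, 8            # assumed sizes
centers = rng.normal(scale=0.3, size=(n_classes, dim))
y = np.repeat(np.arange(n_classes), per_class)
X = centers[y] + rng.normal(scale=1.0, size=(len(y), dim))

Sw, Sb, St = scatter_matrices(X, y)
ratio = np.trace(Sw) / np.trace(St)              # share of scatter that is within-class
```

With these assumed scales, most of the total scatter is within-class, so the pooled class covariance and the global covariance nearly coincide. This is the condition under which whitening with the global covariance, computable from a generic data set, approximates whitening with the class covariance.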

  • 【Classification code】TP391.41
  • 【Times cited】17
  • 【Downloads】2251