基于学习的人脸识别研究

Research on Learning-Based Face Recognition

【Author】 Kong Wanzeng (孔万增)

【Advisor】 Zhu Shan'an (朱善安)

【Author information】 Zhejiang University, Control Theory and Control Engineering, 2008, PhD

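【Abstract (translated from the Chinese)】 Face recognition is one of the key technologies of biometric identification, and its two core steps are face detection and recognition. Its main task is to locate faces accurately in images or video and determine their identity. Taking a learning-based perspective, this dissertation builds on machine learning methods such as clustering, manifold learning, and subspace learning to propose improved or new face recognition methods, which compare favorably with other face recognition methods. The main topics studied are as follows.

1) For face detection, face targets are first separated from the background by skin color, and two localization methods are then proposed: the integral projection-Gaussian curve method for single-face detection and the improved subtractive clustering method for multi-face detection. The integral projection-Gaussian curve method integrally projects the binary face image onto the X and Y axes, computes a Gaussian curve from each projection curve, and quickly obtains a fairly accurate face region by solving the Gaussian equations. The improved subtractive clustering method uses a new definition of distance, pre-estimates the algorithm's key parameters from statistics of the face targets in the image, and terminates the face search automatically. Downsampling reduces the computational cost of subtractive clustering and speeds up the algorithm; the work also verifies that subtractive clustering performs well at locating moving targets in video. Accurate face detection requires estimating the pose angle of faces in different poses. Building on skin-region extraction, the dissertation proposes an objective function for pose angle estimation and discusses two optimization methods, gradient descent and sub-global enumeration search, for estimating the angle. The region is rotated by the estimated angle to correct the pose; in the corrected region, maps are constructed from the chrominance and luminance characteristics of the eyes and mouth, from which the eyes and mouth are extracted and the face is verified.

2) For face recognition by manifold learning, the work centers on three essential elements of manifold learning methods: (1) how to construct the neighborhood graph; (2) what distance measure to use when determining the neighbors of a face sample; (3) what objective criterion to follow when constructing the low-dimensional embedding. From these three elements, two new manifold learning methods are derived: center-based neighborhood embedding and discriminant vector angle embedding. Unlike classical locally linear embedding and locality preserving projection, center-based neighborhood embedding is a supervised linear dimensionality reduction method. It first computes the center of each class and uses the center-based neighborhood distance, instead of the direct distance between two samples, as the input of the weight function; it then embeds the high-dimensional data into a low-dimensional coordinate system while preserving the center-based neighborhood geometry. Discriminant vector angle embedding constructs an adjacency graph with positive and negative edges and measures edge weights by the vector angle rather than the vector norm. This removes the need to estimate the parameter t of the heat kernel weight function used in traditional methods and also reduces the effect of brightness differences between image samples on the recognition rate.

3) To make face recognition independent of feature extraction, a method based on orthogonal complement faces is proposed. Based on the theory of orthogonal decomposition of a space, it first applies Gram-Schmidt orthogonalization to the original training samples of each class, so that the orthogonalized bases span a separate subspace per class, and then decomposes a test sample into its projection onto a subspace and its orthogonal complement with respect to that subspace. The norm of the orthogonal complement reflects the distance from the test sample to each class subspace and serves as the basis for classification.

4) For face recognition with a single sample per person, a sub-block principal component analysis method based on partitioning the single sample is proposed. It cuts the single face image into several non-overlapping sub-blocks of equal size, which form a new sample set. Principal component analysis (PCA) is applied to all sub-blocks to extract features, and the feature coefficients of the sub-blocks belonging to the same face serve as the basis for classification.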
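The orthogonal-complement classification rule in item 3) can be illustrated with a minimal, dependency-free sketch. It is not the dissertation's implementation: the function names (`gram_schmidt`, `complement_norm`, `classify`), the tolerance used to skip linearly dependent samples, and the toy 3-D vectors standing in for face images are all assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(samples, tol=1e-10):
    """Return an orthonormal basis for the span of one class's training vectors."""
    basis = []
    for s in samples:
        r = list(s)
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = math.sqrt(dot(r, r))
        if norm > tol:  # assumption: drop samples that are numerically dependent
            basis.append([ri / norm for ri in r])
    return basis


def complement_norm(x, basis):
    """Norm of the component of x orthogonal to the subspace spanned by basis."""
    r = list(x)
    for q in basis:
        c = dot(r, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return math.sqrt(dot(r, r))


def classify(x, class_bases):
    """Choose the class whose subspace leaves the smallest orthogonal complement."""
    return min(class_bases, key=lambda label: complement_norm(x, class_bases[label]))


# Hypothetical toy data: each "image" is a 3-D vector.
train = {
    "A": [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]],  # spans the x-y plane
    "B": [[0.0, 0.0, 1.0]],                   # spans the z axis
}
bases = {label: gram_schmidt(vecs) for label, vecs in train.items()}
print(classify([2.0, 3.0, 0.1], bases))  # prints A
```

The query lies 0.1 away from the x-y plane but about 3.6 away from the z axis, so class A is chosen. No feature extraction step appears anywhere: the raw sample vectors define the class subspaces directly.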

【Abstract】 Face recognition is a key technique in biometric identification, and its two core steps are face detection and recognition. Its aim is to locate faces in images or video accurately and determine their identities. This dissertation takes a learning-based approach, building on machine learning methods such as clustering, manifold learning, and subspace learning. Its contributions are as follows.

1) For face detection, two methods are proposed on top of skin-color detection: integral projection-Gaussian curves for single-face detection and modified subtractive clustering for multi-face detection. In the integral projection-Gaussian curve approach, the binary image is integrally projected onto the X and Y axes to give two curves, a Gaussian curve is computed from each, and an accurate face region is then found rapidly by solving the Gaussian equations. The modified subtractive clustering algorithm uses a new definition of distance for multi-face detection, and its key parameters are estimated in advance from statistical information about the face objects in the image. Downsampling reduces the computation of the clustering and speeds up the method. The approach also performs well at locating moving objects in video sequences. To estimate the pose angle accurately, a cost function is proposed, and gradient descent and sub-global enumeration are used to search for the angle. The image is rotated by the estimated angle to correct the pose. An eye map and a mouth map are then constructed from the chrominance and luminance characteristics of the eyes and mouth in the candidate region, and the eyes and mouth are extracted for face validation.

2) For manifold learning in face recognition, the work focuses on three essentials: (1) how to construct the neighborhood graph; (2) which measure to use to estimate the true distance between two face samples; (3) which cost function is suitable for embedding into the subspace. From these, two novel manifold learning algorithms are derived: center-based neighborhood embedding (CNE) and discriminant vector angle embedding (DVAE). Unlike classical methods such as locally linear embedding (LLE) and locality preserving projection (LPP), CNE is a supervised linear dimensionality reduction method. It first computes the center of each sample class and uses the center-based neighborhood (CN) distance, rather than the direct distance between two samples, as the input of the weight function. The high-dimensional data are then embedded into a low-dimensional space while preserving the CN geometric structure. DVAE, in turn, constructs a graph with both positive and negative edges. It measures edge weights by the angle between two vectors rather than by the vector modulus used in traditional methods, which removes the need to estimate the parameter of the heat kernel weight function. Once a test sample has been embedded into the low-dimensional space, a classifier called angle nearest neighbor is used for face recognition.

3) To free face recognition from feature extraction, a method called orthogonal complement faces (OC-faces) is presented. The method is based on the orthogonal decomposition theorem. First, Gram-Schmidt orthogonalization is performed on the original training data of each class. Second, the orthogonal basis of each class spans a corresponding subspace. The query sample can therefore be decomposed into the sum of two components: its orthogonal projection onto a class subspace and its component in the orthogonal complement of that subspace. The norm of the orthogonal complement indicates the distance between the query sample and the subspace of each class, so it can be used for classification.

4) For face recognition with one sample per person, a method called sub-block principal component analysis (PCA), based on partitioning the sample, is presented. It first divides the sample into several non-overlapping sub-blocks of equal size and treats all the sub-blocks as a new sample set. PCA is then performed on all the sub-blocks to extract features. Classification is done according to the projection coefficients of the sub-blocks of each person.
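The sub-block PCA pipeline in item 4) can be sketched roughly as follows, assuming NumPy is available. The abstract does not say how the sub-block coefficients are compared, so the nearest-neighbor rule on concatenated coefficients used here is an assumption, as are the function names, the block size, the number of components `k`, and the random 8x8 arrays standing in for face images.

```python
import numpy as np


def split_blocks(image, bh, bw):
    """Cut a 2-D array into non-overlapping bh x bw blocks, one flattened block per row."""
    h, w = image.shape
    if h % bh or w % bw:
        raise ValueError("image size must be a multiple of the block size")
    return (image.reshape(h // bh, bh, w // bw, bw)
                 .swapaxes(1, 2)
                 .reshape(-1, bh * bw))


def fit_subblock_pca(gallery, bh, bw, k):
    """gallery maps each person to a single training image."""
    blocks = np.vstack([split_blocks(img, bh, bw) for img in gallery.values()])
    mean = blocks.mean(axis=0)
    # Rows of vt are the principal directions of the centered sub-block set.
    _, _, vt = np.linalg.svd(blocks - mean, full_matrices=False)
    comps = vt[:k]
    feats = {label: ((split_blocks(img, bh, bw) - mean) @ comps.T).ravel()
             for label, img in gallery.items()}
    return mean, comps, feats


def classify(image, bh, bw, mean, comps, feats):
    """Assumed rule: nearest neighbor on the concatenated sub-block coefficients."""
    q = ((split_blocks(image, bh, bw) - mean) @ comps.T).ravel()
    return min(feats, key=lambda label: np.linalg.norm(q - feats[label]))


# Hypothetical toy data: random 8x8 arrays stand in for face images.
rng = np.random.default_rng(0)
gallery = {"p1": rng.random((8, 8)), "p2": rng.random((8, 8))}
mean, comps, feats = fit_subblock_pca(gallery, bh=4, bw=4, k=4)
probe = gallery["p1"] + 0.01 * rng.random((8, 8))  # slightly perturbed copy of p1
print(classify(probe, 4, 4, mean, comps, feats))   # nearest gallery identity
```

Partitioning turns one image per person into several training vectors, which is what makes PCA usable in the single-sample setting: here two images yield eight 16-dimensional sub-block samples.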

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year/issue】 2009, Issue 07
  • 【Classification number】 TP391.41
  • 【Citations】 12
  • 【Downloads】 1323