单目摄像机实现的注视方向估计研究

Research on Gaze Estimation Based on One Monocular Camera

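【Author】 葛宏志 (Ge Hongzhi)

【Advisor】 陈熙霖 (Chen Xilin)

【Author information】 Harbin Institute of Technology (哈尔滨工业大学), Computer Application Technology, 2011, PhD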

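【Abstract (translated from Chinese)】 Gaze direction estimation is one of the active research problems in computer vision and pattern recognition, with important theoretical significance and application value. Systematic study of gaze estimation techniques can advance these fields and has important application prospects in human-computer interaction, psychology research, and other areas. In recent years intrusive gaze estimation has made great progress, but non-intrusive gaze estimation is still immature. Building a truly robust and practical non-intrusive system for automatic gaze estimation and tracking still requires solving many key problems, in particular efficient descriptive features for the human eye and gaze estimation algorithms that allow free head movement. This thesis studies non-intrusive gaze estimation with a single camera as the means of information acquisition, including single-camera data acquisition with automatic annotation of the ground truth, representation of eye appearance features, and gaze estimation under free head movement. In summary, the main work of the thesis is as follows:
1. Designed a method for synchronously capturing gaze direction, head pose data, and face images, and built the corresponding apparatus. In algorithms based on statistical learning, system performance depends on large amounts of annotated training data, so an annotated dataset is the foundation and prerequisite of gaze estimation research. The data collection method designed here can synchronously capture images, pose, gaze direction, and the spatial relationships among the targets in complex environments. The collected data support the training and testing in the subsequent experiments.
2. Proposed a gaze estimation method based on directional binary pattern features. As the gaze direction changes, the relative position of the sclera and the iris within the eye socket changes as well. These changes can be viewed as horizontal and vertical movements of the iris, which cause corresponding changes in the texture of the eye image. Targeting these vertical and horizontal movements, the Directional Binary Pattern (DBP) representation is proposed. By computing difference information along four directions, the DBP feature contains not only local texture information but also binary difference information for specific directions. The DBP feature is therefore well suited to capturing the texture changes in the eye image caused by relative iris movement. DBP is also robust to illumination changes and reduces the computational error caused by lighting.
3. Proposed a gaze estimation method based on hybrid features. The hybrid feature consists of a model feature and an appearance feature. The model feature is extracted as geometric vectors between feature points; the appearance feature is the Gabor Directional Binary Pattern (GDBP) extracted from the eye image. The two features are fused with the Support Vector Regression (SVR) algorithm to obtain the gaze direction under a given head pose. DBP is used to encode the Gabor magnitude features of the image to represent the appearance feature, and it achieves good performance. The hybrid-feature method has the following characteristics: (1) the eye image is binarized according to different computation directions; (2) the DBP operator is successfully combined with Gabor magnitude features, and spatial histogram features are extracted as the discriminative features; (3) it exploits the good statistical properties of the appearance feature and also benefits from the robustness of the model feature to illumination changes.
4. Proposed a gaze estimation method that allows free head movement. Gaze estimation based on image features involves two important problems: head pose and eye gaze direction. At present, most gaze estimation methods that allow free head movement first determine the head pose and then estimate the gaze direction. This thesis proposes a distributed algorithm for gaze estimation with a movable head, which estimates the head pose and the eye gaze direction separately. On this basis, a gaze estimation method based on feature-level fusion of face and eye features is proposed. Experiments verify the effectiveness of the method.
Through the above work, this thesis studies several problems involved in gaze estimation with a single camera. The results show that the model features and appearance features of the eye image describe gaze direction from different perspectives, and that fusing the two efficiently yields more stable estimates. In addition, a prototype system was implemented based on the proposed methods. The experimental results show that the proposed methods have potential application value.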

【Abstract】 Gaze estimation is an active research topic in computer vision and pattern recognition, with both theoretical significance and practical value. Progress in gaze estimation can push these fields forward, and the technique can be applied in human-computer interaction (HCI) and psychology research. Although intrusive gaze estimation has made considerable progress in recent years, non-intrusive gaze estimation is still at a preliminary stage for practical use. Building a robust non-intrusive gaze tracking system still requires overcoming several key problems; in particular, it needs effective eye features and estimation methods that work under free head movement. This thesis focuses on several problems in non-intrusive gaze estimation with a monocular camera: data collection with automatic labeling of the ground truth, representation of eye appearance features, and gaze estimation under free head movement. The main contributions of the thesis are as follows:
1. A data collection method that captures gaze direction, head pose, and face images simultaneously, together with a capture studio built on this method. The performance of a statistical learning algorithm relies on large amounts of labeled data, so labeled data are the foundation of gaze estimation research. The proposed method works in complex environments and synchronously collects the images, head pose, gaze direction, and the spatial positions of the targets. The collected data support the training and testing in the later experiments.
2. A novel feature for gaze estimation named the Directional Binary Pattern (DBP). As the gaze direction changes, the sclera and the iris change their relative position within the eye socket. The change can be regarded as horizontal and vertical movement of the iris, which changes the texture of the eye image. The directional binary pattern is proposed to characterize this vertical and horizontal movement. By computing differences along four directions, DBP contains not only local texture information but also binary difference information for specific directions. DBP is therefore well suited to describing the texture changes of the eye image caused by iris movement. DBP is also robust to illumination changes and reduces the computational error caused by lighting.
3. A hybrid-feature method for gaze estimation. The hybrid feature combines a model-based feature and an appearance-based feature. The model-based feature consists of geometric vectors between feature points; the appearance-based feature is the Gabor Directional Binary Pattern (GDBP) extracted from the eye image. The two features are fused by the Support Vector Regression (SVR) algorithm, so that each hybrid feature corresponds to a gaze direction under a fixed head pose. For the appearance-based feature, the DBP operator encodes the Gabor magnitude information, which gives good performance. The hybrid-feature approach has the following characteristics: (1) the eye image is binarized along different computation directions; (2) the DBP operator is combined with the Gabor magnitude information, and spatial histograms are extracted as the final discriminative feature; (3) it exploits the good statistical properties of the appearance-based feature and also benefits from the robustness of the model-based feature to illumination changes.
4. A gaze estimation method that allows free head movement. Image-based gaze estimation has two important components: head pose and eye gaze direction. At present, most algorithms achieve gaze tracking under free head motion by computing the head pose first and the gaze direction afterwards. This thesis presents a distributed framework that estimates the head pose and the gaze direction separately, which enables gaze tracking under free head pose. On this basis, it proposes a gaze estimation algorithm that fuses head and eye features. Experimental results show that the method is effective.
In conclusion, this dissertation studies several problems in gaze estimation with a monocular camera. The experimental results show that the appearance-based feature and the model-based feature carry complementary discriminative information about the same gaze direction, and that effectively combining the two yields more stable estimates. Moreover, the gaze direction can be estimated with only one camera under free head motion, and the proposed methods are implemented in a prototype gaze estimation system. The experimental results indicate that the proposed methods have practical value.
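The abstract describes DBP only in words (binary difference information along four directions, followed by spatial histograms), so the following Python sketch is an illustration under stated assumptions rather than the thesis's actual operator. It assumes that each direction yields one binary map, set to 1 where a pixel is at least as bright as its neighbour one step along that direction, and that the "spatial histogram" is a count of ones per non-overlapping block. The function names, the neighbour offsets, and the block size are all hypothetical.

```python
# Hypothetical sketch: the abstract does not give the exact DBP definition.
# Assumption: one binary map per direction, set to 1 where the pixel is at
# least as bright as its neighbour one step along that direction.

# (row offset, column offset) of the neighbour for 0, 45, 90 and 135 degrees.
DIRECTIONS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def directional_binary_maps(image):
    """Return {angle: binary map} for a 2-D list of grey levels.

    Border pixels are skipped so that every neighbour lies inside the
    image; each map is therefore (rows - 2) x (cols - 2).
    """
    rows, cols = len(image), len(image[0])
    maps = {}
    for angle, (dr, dc) in DIRECTIONS.items():
        maps[angle] = [
            [1 if image[r][c] >= image[r + dr][c + dc] else 0
             for c in range(1, cols - 1)]
            for r in range(1, rows - 1)
        ]
    return maps


def block_histogram(binary_map, block_rows, block_cols):
    """Count the ones in each non-overlapping block, row by row.

    Blocks at the right and bottom edges may be smaller than the rest.
    """
    rows, cols = len(binary_map), len(binary_map[0])
    counts = []
    for r0 in range(0, rows, block_rows):
        for c0 in range(0, cols, block_cols):
            counts.append(sum(
                binary_map[r][c]
                for r in range(r0, min(r0 + block_rows, rows))
                for c in range(c0, min(c0 + block_cols, cols))
            ))
    return counts


def dbp_feature(image, block_rows=2, block_cols=2):
    """Concatenate the block counts of the four maps in angle order."""
    maps = directional_binary_maps(image)
    feature = []
    for angle in sorted(maps):
        feature.extend(block_histogram(maps[angle], block_rows, block_cols))
    return feature


# Toy 4x4 "eye" with a dark vertical stripe standing in for the iris.
toy_eye = [
    [9, 9, 1, 9],
    [9, 9, 1, 9],
    [9, 9, 1, 9],
    [9, 9, 1, 9],
]
toy_feature = dbp_feature(toy_eye)
```

Because each bit depends only on the sign of a local difference, any increasing affine change of brightness (for example doubling every pixel and adding an offset) leaves the maps unchanged, which is one simple way to read the abstract's claim of robustness to illumination. In the hybrid method the same kind of encoding would be applied to Gabor magnitude images rather than raw grey levels; that step is not shown here.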
