Moving Human Tracking and Characteristic Behavior Recognition

Kinetic People Tracking and Characteristic Behavior Identifying

【Author】 Su Bochao (苏伯超)

【Advisor】 Che Rensheng (车仁生)

【Author information】 Harbin Institute of Technology, Instrument Science and Technology, 2009, PhD

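【Abstract (translated from the Chinese)】 Intelligent vision systems that automatically track human motion in video sequences and recognize human behavior from the tracking results have long been a challenging research focus because of their broad application prospects and practical value. Research on this problem generally falls into two parts: video tracking of moving people and recognition of characteristic behaviors. Many methods exist for both parts. However, human motion is composed of the motions of individual limbs, and limb motion is complex and variable, so whole-body motion is diverse and intricate and accurate tracking is hard to achieve. Characteristic behavior recognition is a higher-level computer vision task that aims to understand and describe human behavior, and it likewise poses many research difficulties.

Building on a thorough review of previous work, this thesis combines human video tracking with behavior recognition and builds an intelligent vision system that automatically tracks human motion in video sequences and automatically describes the characteristic behaviors that appear in them. The system has to address real-time tracking of moving people, learning a human model from video sequences, tracking human motion with the learned model, and recognizing characteristic behaviors. The thesis is organized as follows.

First, existing human video tracking methods from China and abroad are analyzed. The tracking methods are classified, their capabilities are analyzed, and the reasons they are unsuitable for this system are identified. Existing behavior recognition methods are also reviewed and analyzed. On this basis, the overall framework of the video tracking and recognition system is proposed, and the methods adopted in the system are introduced.

To address the shortcomings of classical background subtraction for detecting moving objects, Haar-like operators and the integral image are generalized so that the various human limbs can be detected quickly in video. By their nature, Haar-like features are suited to detecting rectangular or near-rectangular image regions, and the integral image makes it possible to locate limbs in the image quickly with those features, which completes the detection of the various limbs.

Given the detected limbs, a clustering algorithm in color space classifies the feature points, separating the detections of different limbs while removing noise. Because traditional clustering algorithms (K-means, Gaussian mixture) are unsuitable for the video tracking system studied here, a kernel-based non-parametric clustering algorithm is introduced. It computes the gradient at each feature point directly in color space and iteratively moves the point towards a local extremum of the feature space, which classifies the feature points and removes noise at the same time.

On the basis of the clustered data, the human model is learned within a probabilistic inference framework. When a traditional hidden Markov model (HMM) is used to track human motion, it easily loses the target, which ultimately causes learning of the human model to fail. The HMM is therefore modified, dynamic programming is used for inference on the modified HMM to obtain the optimal limb sequence, and a clustering algorithm then searches for the limb template that best represents that class of limb. Finally, the constraints imposed by the human skeleton are used to extend the modified HMM into an inference framework that learns the whole human model from video sequences.

Once the human model has been obtained, moving people are tracked in the video sequence by detecting the human model in every frame. Detection is carried out by matching the model against each frame. The matching process uses a cost function to measure how well the human model matches the image, and this cost function can be optimized by dynamic programming. Dynamic programming has quadratic computational complexity, so the distance transform is introduced to reduce it. In addition, an improved way of selecting the root node of the human model significantly shrinks the search space of the matching process and thus speeds up the matching of the human model to the image.

The pose parameters obtained by tracking are used to recognize characteristic human behaviors in the video. An HMM for inference is constructed from an annotated motion library and the tracked pose parameters, and the Viterbi algorithm is then used to synthesize the optimal motion sequence that matches the poses obtained in the tracking stage. Because this motion sequence has been annotated in advance, the human behavior in the video can be recognized. In addition, for certain characteristic behaviors, a two-dimensional behavior recognition system is built on the tracked pose parameters. It is efficient and fast, but its range of application is relatively narrow.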
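The integral-image step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a grayscale frame stored as a list of rows of numbers; the function names and the simple two-rectangle feature are illustrative assumptions, not the generalized limb detectors developed in the thesis.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of all pixels above and to the left of (y, x), inclusive.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Hypothetical Haar-like feature: left half minus right half.

    The width is assumed to be even so that both halves have equal area.
    """
    half = width // 2
    left_part = rect_sum(ii, top, left, height, half)
    right_part = rect_sum(ii, top, left + half, height, half)
    return left_part - right_part
```

Because every rectangle sum costs four lookups regardless of the rectangle's size, a feature like `two_rect_feature` can be evaluated at many positions and scales in constant time per window, which is what makes this kind of detector fast.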

【Abstract】 An intelligent vision system that combines tracking of moving people with identification of their behaviour from the resulting tracks has long been a challenging frontier problem. The research generally consists of two parts: tracking people and identifying their behaviour.

Many methods exist for people tracking and behaviour identification. Nevertheless, human motion is composed of the movements of individual body segments, and those movements are complex. As a result, human motion is highly varied and hard to track accurately. Behaviour identification, which aims to understand and describe human behaviour, is a higher-level computer vision task and presents many difficulties of its own.

Building on previous research, this thesis unites tracking with identification and develops an intelligent vision system that tracks people and identifies their characteristic behaviour automatically. The system is meant to solve the following problems: real-time tracking, automatic learning of a people model, tracking moving people with the people model, and identifying their behaviour from the tracking results. The thesis is arranged as follows.

We analyze tracking methods from China and abroad, classify them into categories, and explain why they do not fit our system. We also analyze existing behaviour identification methods. On that basis, we propose our framework for tracking and identification and introduce the algorithms we use.

To address the weaknesses of classical background subtraction, we generalize the Haar-like feature and the integral image, previously used for face detection, and use them to detect the different body segments in video. By their nature, Haar-like features are suited to localizing rectangular or near-rectangular image regions, and the integral image lets us locate body parts with those features rapidly.

We classify points in color space with a clustering method and separate the detections from noise. Traditional clustering algorithms cannot be used in our system, so we bring in a non-parametric clustering method based on kernel functions: it computes the gradient at each point, moves the point iteratively towards a mode, and assigns it to the corresponding cluster. The points are thus classified and the noise removed at the same time.

Based on the clustering, we learn the people model by probabilistic inference. A traditional HMM easily loses the target while tracking people, which causes learning to fail. We modify the HMM, run inference with dynamic programming to obtain the optimal segment sequence, and search for the template of that segment with a clustering algorithm. We then extend the modified HMM using the geometric constraints of the human body and build an inference framework that learns a complete people model from video.

We track moving people in video with the learned people model by detecting the model in each frame. The model is detected by matching it to the image, with a cost function as the measure of match. We optimize the cost function with dynamic programming and introduce the distance transform to reduce the computational complexity. We also improve the selection of the root node of the people model, which markedly reduces the search space of the matching procedure and speeds up the matching of the people model to each video frame.

We identify people's behaviour from the poses obtained in the tracking stage. Using an annotated motion library and the tracked poses, we build an HMM for inference and, with the Viterbi algorithm, synthesize the optimal motion sequence that best matches the tracked poses. Because the motion sequence is annotated in advance, we can identify the behaviour in the video.

Furthermore, we develop an identification system based on the tracked poses that targets the properties of certain behaviours. It is effective and fast, but its scope of application is relatively narrow.
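The Viterbi step used for behaviour identification can be sketched on a toy model. Everything below is an assumption made for illustration: the two annotated motions (`walk`, `wave`), the quantized pose observations (`legs`, `arms`), and all probabilities are hypothetical and do not come from the thesis, whose HMM is built from an annotated motion library and tracked pose parameters.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path for an observation sequence."""
    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            # Best predecessor of state s at time t.
            prev = max(states, key=lambda p: best[t - 1][p] + log_trans[p][s])
            best[t][s] = best[t - 1][prev] + log_trans[prev][s] + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def logs(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(p) for j, p in row.items()} for k, row in table.items()}


# Hypothetical toy model: two annotated motions, two quantized pose symbols.
states = ["walk", "wave"]
log_start = {"walk": math.log(0.5), "wave": math.log(0.5)}
log_trans = logs({"walk": {"walk": 0.9, "wave": 0.1},
                  "wave": {"walk": 0.1, "wave": 0.9}})
log_emit = logs({"walk": {"legs": 0.8, "arms": 0.2},
                 "wave": {"legs": 0.2, "arms": 0.8}})

labels = viterbi(["legs", "legs", "arms", "arms"],
                 states, log_start, log_trans, log_emit)
```

For this toy input the decoded labels are `walk, walk, wave, wave`: each frame inherits the annotation of the motion it is matched to, which is the sense in which an annotated library lets the decoded sequence name the behaviour.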
