3D Human Motion Analysis and Action Recognition (三维人体运动分析与动作识别方法)
【Author】 Cai Meiling (蔡美玲);
【Advisor】 Zou Beiji (邹北骥);
【Author information】 Central South University, Computer Science and Technology, 2013, PhD
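【Abstract (translated from the Chinese)】 As motion capture technology has matured and spread, acquiring large 3D motion datasets quickly and efficiently has become a reality. Because 3D motion capture data preserve motion details well, record motion trajectories faithfully, and offer high precision, they are widely used in computer animation, film and television production, digital entertainment, sports simulation, medical physiotherapy, and other fields. Against this background, human motion analysis based on motion capture data has become a hot topic in computer graphics in recent years. Within it, keyframe extraction, automatic recognition, and classification of 3D human motion data are important research topics and are prerequisites for managing and reusing motion capture data effectively. Keyframe extraction starts from the raw 3D motion capture data and extracts the few key postures that best represent the semantic content of a motion sequence. It is an important means of data compression, dimensionality reduction, and feature extraction and representation, and it is widely used in keyframe animation authoring and in human motion analysis and reuse. Human motion recognition extracts and analyzes motion feature parameters in order to analyze and understand human motions and behaviors automatically. It has broad application prospects and considerable economic and social value in advanced human-computer interaction, rehabilitation engineering, motion-sensing game control, and content-based retrieval. Based on captured 3D human motion data, this dissertation addresses three topics: keyframe extraction from motion data, action recognition and motion segmentation, and continuous action recognition with rejection capability. Specifically: (1) Keyframe extraction from motion data. Taking reconstruction ability and compression rate, the two main factors that affect keyframe extraction quality, as optimization objectives, two keyframe extraction methods are proposed. The first method divides keyframe extraction into two phases, frame pre-selection and refinement based on reconstruction-error optimization. It first extracts the "extreme postures" of the motion sequence as candidate keyframes. In the second phase, it defines the frame decimation error as the measure of keyframe importance, takes the reconstruction error as the optimization objective while also considering the compression rate objective, and uses frame decimation to extract a keyframe sequence that satisfies either the reconstruction error requirement or the compression rate requirement. The main advantage of this method is that the target reconstruction error or compression rate can be set directly, in a simple and intuitive way. The second method recognizes that reconstruction error and compression rate are competing, conflicting objectives, casts keyframe extraction as a constrained multi-objective optimization problem, and proposes a keyframe extraction method based on a Pareto multi-objective genetic algorithm. Its main advantage is that it yields a set of Pareto-optimal candidate keyframe sequences without requiring the user to specify any parameters. Experimental results demonstrate the effectiveness of the proposed methods. (2) Action recognition and segmentation based on probabilistic principal component analysis. Human motion data of the same type should share the same intrinsic dimensionality and a similar structure and thus form a distinct class, so each motion type can be represented by a single distribution model. Probabilistic principal component analysis is used to build a Gaussian probability distribution model for each action class, with the model parameters learned by expectation maximization. From the Gaussian models of the known classes, a classification decision rule is defined according to minimum-error-rate Bayes decision theory, and an action classification algorithm is implemented. Because models built with probabilistic principal component analysis can capture how the motion changes, the classification algorithm is extended into an online recognition and automatic segmentation algorithm for long motion sequences that contain multiple actions. Experimental results demonstrate the effectiveness of the proposed method. (3) Action recognition and rejection based on a self-organizing incremental motion map. For recognizing long motion sequences that contain action samples of types absent from the training set, a recognition framework with rejection capability is proposed that combines support vector machines with a self-organizing incremental motion map. A method for applying support vector machines to online recognition of motion data is proposed, and the dissertation analyzes why the margin information of a support vector machine cannot simply be used directly for rejection. It also analyzes the ability of the self-organizing feature map to describe the structural distribution of sample data. To overcome the lack of adaptivity in traditional self-organizing feature map learning, a self-organizing incremental motion map learning algorithm is proposed, which learns a motion map whose structure and size adapt to the intrinsic complexity of each motion type. A rejection rule is then defined on the learned motion map. Finally, the key pattern set extracted from the self-organizing incremental motion map is used to verify the classification result of each segment, which improves recognition accuracy. Because the method combines the strengths of support vector machines and the motion map, it can both discriminate among samples of known classes and effectively reject samples of unknown classes. Experimental results demonstrate the effectiveness of the proposed method.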
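The two-phase keyframe extraction in item (1) can be illustrated with a minimal sketch. The abstract gives no formulas, so every concrete choice here is an assumption rather than the dissertation's definition: poses are plain coordinate lists, an "extreme posture" is taken to be a frame where any coordinate reverses direction, reconstruction is linear interpolation between neighbouring keyframes, and the hypothetical helpers `preselect`, `reconstruction_error`, and `refine` are named for illustration only.

```python
from typing import List, Optional, Sequence

Pose = Sequence[float]


def reconstruction_error(frames: Sequence[Pose], keys: List[int]) -> float:
    """Mean Euclidean distance between each frame and its linear
    reconstruction from the surrounding keyframes (assumed error measure)."""
    total = 0.0
    for left, right in zip(keys, keys[1:]):
        for i in range(left, right):
            t = (i - left) / (right - left)
            approx = [a + t * (b - a) for a, b in zip(frames[left], frames[right])]
            total += sum((p - q) ** 2 for p, q in zip(frames[i], approx)) ** 0.5
    return total / len(frames)


def preselect(frames: Sequence[Pose]) -> List[int]:
    """Phase 1 (assumed criterion): keep the end frames plus every frame
    where at least one coordinate changes direction."""
    keys = [0]
    for i in range(1, len(frames) - 1):
        prev, cur, nxt = frames[i - 1], frames[i], frames[i + 1]
        if any((c - p) * (n - c) < 0 for p, c, n in zip(prev, cur, nxt)):
            keys.append(i)
    keys.append(len(frames) - 1)
    return keys


def refine(frames: Sequence[Pose], keys: List[int],
           max_error: Optional[float] = None,
           max_keys: Optional[int] = None) -> List[int]:
    """Phase 2: greedily drop the candidate whose removal yields the smallest
    reconstruction error. Stops at max_keys if given (it takes priority),
    otherwise as soon as the next removal would exceed max_error."""
    if max_error is None and max_keys is None:
        raise ValueError("set max_error or max_keys")
    keys = list(keys)
    while len(keys) > 2:
        if max_keys is not None and len(keys) <= max_keys:
            break
        err, j = min(
            (reconstruction_error(frames, keys[:j] + keys[j + 1:]), j)
            for j in range(1, len(keys) - 1)
        )
        if max_keys is None and err > max_error:
            break
        del keys[j]
    return keys


# A one-dimensional zigzag "motion": the turning points are frames 3 and 6.
frames = [[0.0], [1.0], [2.0], [3.0], [2.0], [1.0], [0.0], [1.0], [2.0]]
candidates = preselect(frames)
by_error = refine(frames, candidates, max_error=0.01)
by_count = refine(frames, candidates, max_keys=3)
```

The two keyword arguments mirror the point made in the abstract that the user can set either the reconstruction error target or the compression target directly. The greedy loop re-evaluates the full error after each trial removal, which is simple but quadratic in the number of candidates.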
【Abstract】 As motion capture technology has matured and spread, it has become feasible to acquire large 3D motion datasets quickly and efficiently. Because 3D motion capture data preserve motion details well, record motion trajectories faithfully, and offer high precision, they are widely used in computer animation, film production, digital entertainment, sports simulation, and medical physiotherapy. As a result, human motion analysis based on captured data has become a hot topic in computer graphics in recent years. Within it, keyframe extraction, automatic recognition, and classification of 3D human motion data are key research topics and are prerequisites for the efficient management and reuse of captured motion data. Keyframe extraction takes the raw 3D motion capture data and extracts the key postures that serve as an abstract representation of the raw motion sequence. As an important means of data compression, dimensionality reduction, feature extraction, and data representation, it is widely applied in keyframe animation authoring and in human motion analysis and reuse. Human motion recognition aims to analyze and understand human motions and behaviors automatically by extracting and analyzing motion feature parameters. It has broad application prospects and considerable economic and social value in advanced human-computer interaction, rehabilitation engineering, motion-sensing game control, and content-based retrieval. This dissertation addresses three topics: keyframe extraction from motion data, action recognition and motion segmentation, and continuous action recognition with rejection capability. The details are as follows: (1) Keyframe extraction from motion data.
Two keyframe extraction methods are proposed that optimize reconstruction ability and compression rate, the two main factors that affect keyframe extraction quality. In the first method, keyframe extraction proceeds in two phases: a pre-selection phase and a refinement phase. In the first phase, the "extreme postures" of the motion sequence are extracted as candidate keyframes. In the second phase, the importance of each keyframe is measured by its decimation error, and the reconstruction error is optimized while the compression rate is also taken into account. As a result, a keyframe sequence is extracted that satisfies either the reconstruction error requirement or the compression rate requirement. The advantage of this method is that the target reconstruction error or compression rate can be set directly, in a simple and intuitive way. The second method recognizes that reconstruction error and compression rate are competing, conflicting objectives and models keyframe extraction as a constrained multi-objective optimization problem. A keyframe extraction method based on a Pareto multi-objective genetic algorithm is then presented, which yields a set of Pareto-optimal candidate keyframe sequences without any user-specified threshold parameters; this is its main advantage. Experimental results demonstrate the effectiveness of both methods. (2) Action recognition and segmentation based on Probabilistic Principal Component Analysis (PPCA). Motion data of the same type should share the same intrinsic dimensionality and a similar structure, so each motion type can be represented by a single distribution model. For each motion type, PPCA is used to build a Gaussian distribution model whose parameters are learned by Expectation-Maximization (EM).
From these learned models, a decision rule is defined according to minimum-error-rate Bayes decision theory, which gives a multi-class classifier (polychotomizer) for single actions whose discriminant functions are determined by the models. An algorithm based on this classifier is then presented to recognize an input motion. Because PPCA can model the changes from one action to the next, this algorithm is extended into an online recognition and automatic segmentation algorithm for long motion sequences that contain different actions. Experimental results demonstrate the effectiveness of the proposed method. (3) Action recognition and rejection based on a self-organizing incremental motion map. For long motion sequences that contain motion types absent from the training dataset, we present a motion recognition framework with rejection capability that combines a Support Vector Machine (SVM) with a self-organizing incremental motion map. We apply the SVM to online recognition of motion data and explain why the SVM margin information cannot simply be used directly for rejection. We also analyze how a self-organizing map (SOM) describes the structural distribution of the samples. To overcome the lack of adaptivity in traditional SOM learning, a novel self-organizing incremental motion map learning algorithm is proposed, which learns a map whose structure and size adapt to the intrinsic complexity of each motion type. A rejection rule is then defined on the learned motion map. In the last step, the key patterns learned from the same motion map by a genetic algorithm are used to verify that each recognized segment really belongs to its assigned type, which improves recognition accuracy.
Because it combines the strengths of the SVM and the motion map, the proposed method can both discriminate among the motion types in the training dataset and reject motion types that lie outside it. Experimental results demonstrate the effectiveness of the proposed method.
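The "classify, then verify against a per-class map" idea in item (3) can be illustrated with a minimal sketch. The abstract does not state the rejection rule, so the rule used here is an assumption: a predicted label is accepted only if the sample lies within a per-class distance threshold of the nearest node of that class's map. The map nodes are hand-written stand-ins for a learned motion map, the label is assumed to come from any upstream classifier such as an SVM, and `fit_threshold` and `accept_prediction` are hypothetical names.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def nearest_distance(x: Point, nodes: List[Point]) -> float:
    """Euclidean distance from x to the closest map node."""
    return min(math.dist(x, node) for node in nodes)


def fit_threshold(samples: List[Point], nodes: List[Point],
                  scale: float = 1.5) -> float:
    """Assumed heuristic: the largest training-sample distance to the map,
    widened by a safety factor."""
    return scale * max(nearest_distance(s, nodes) for s in samples)


def accept_prediction(x: Point, predicted_label: str,
                      maps: Dict[str, List[Point]],
                      thresholds: Dict[str, float]) -> bool:
    """Return True to keep the classifier's label, False to reject the
    sample as belonging to no trained class."""
    return nearest_distance(x, maps[predicted_label]) <= thresholds[predicted_label]


# Toy maps for two trained motion types (stand-ins for learned map nodes).
maps = {
    "walk": [(0.0, 0.0), (1.0, 0.0)],
    "jump": [(0.0, 5.0), (1.0, 5.0)],
}
training = {
    "walk": [(0.1, 0.1), (0.9, -0.2)],
    "jump": [(0.2, 5.1), (1.0, 4.8)],
}
thresholds = {label: fit_threshold(training[label], maps[label]) for label in maps}

known = accept_prediction((0.9, 0.1), "walk", maps, thresholds)
unknown = accept_prediction((10.0, 10.0), "walk", maps, thresholds)
```

A discriminative classifier must assign every input to one of its trained classes, so a sample far from all training data still receives a label. Checking the distance to a model of where each class's data actually lie is one way to catch such cases.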