Research on Human Action Recognition for Multi-modal Human and Robot Interaction

【Author】 曹雏清 (Cao Chuqing)

【Supervisor】 李瑞峰 (Li Ruifeng)

【Author Information】 Harbin Institute of Technology, Mechatronic Engineering, 2012, Ph.D. dissertation

【Abstract (Chinese)】 With the rapid development of robot technology, service robots are gradually entering household life and the service sector. To integrate into human society, a service robot must possess natural human-robot interaction capabilities, yet existing interaction technology still falls well short of practical application. It is necessary to learn from the multi-modal interaction patterns between humans, raise the level of service-robot interaction technology, and ensure that real interaction is natural and efficient. Given the advantages of human action recognition in service-robot interaction, this dissertation starts from three principal interaction modes within body motion, namely static hand postures, upper-limb gestures, and gait, builds a multi-channel human-robot interaction method, and studies multi-modal, natural, and universal human-robot interaction based on human action recognition.

First, addressing the difficulties of static hand posture recognition in real human-robot interaction, a monocular-vision recognition method is proposed. A multi-feature fusion representation describes a static posture by its color, texture, and contour information, achieving fast and accurate recognition with strong robustness to complex backgrounds, partial occlusion, and user independence. The method is validated on public databases and compared against methods from the existing literature.

Next, upper-limb gesture recognition fusing monocular and depth images is studied. The depth and VGA cameras are calibrated to obtain the coordinate relationship between the depth and RGB images. A human skeleton model is then built to obtain the spatial coordinates of the main joints, and a three-dimensional histogram distribution over a spherical coordinate system is established; different upper-limb gestures are recognized from the distributions of the main joints. Compared with existing methods, this approach effectively eliminates the temporal and spatial variability of individual users' gesture expression and performs well under complex environmental conditions.

A gait recognition technique based on laser radar is then proposed, which quickly extracts a user's gait information, without contact, from laser data covering a relatively large indoor area. After preprocessing the laser data, the segments corresponding to the feet are extracted and the foot positions obtained; from the change of foot positions across consecutive frames, a walking-gait model yields gait features such as walking speed, step length, step time, and cadence, providing important information for human-robot interaction. Experiments verify the effectiveness of the method, and the gait characteristics of male and female walkers are analyzed.

Finally, a multi-modal human-robot interaction system based on human action recognition is built on an intelligent service-robot hardware platform. A semantic understanding model of multi-modal interaction information is established, and interaction experiments combining body motion, facial expression, and other modalities let the robot and the user interact through multiple channels, verifying the performance of the system; an evaluation mechanism for multi-modal interaction is also established and the interaction effect analyzed experimentally.

Aimed at the requirements of natural human-robot interaction for service robots in practical applications, this dissertation improves service-robot interaction performance through research on human action recognition and multi-modal interaction systems, which benefits the practical application and industrialization of robots. The evaluation of multi-modal interaction in practice also provides a useful reference for research on multi-modal interaction patterns and has practical reference value for service robots.

【Abstract】 With the development of robot technologies, the applications of robots have been extended to the service field and the human world. To work in human society, a service robot needs natural human-robot interaction ability. Currently, some problems remain unsolved in human-robot interaction; for natural and efficient communication, the interaction ability of service robots must be improved on the basis of human interaction modalities. Considering the advantages of human action recognition for the human-robot interaction of service robots, we systematically researched human action recognition technology, including hand posture recognition, upper-body gesture recognition, and gait recognition, and set up a multi-modal human-robot interaction system based on human action recognition.

In the dissertation, we first proposed a vision-based method to overcome the problems of hand posture recognition in human-robot interaction. We cast hand posture recognition as a sparse representation problem and proposed a novel approach, the joint feature sparse representation classifier, for efficient and accurate sparse representation based on multiple features. By integrating different features for sparse representation, including gray-level, texture, and shape features, the proposed method fuses the benefits of each feature and is hence robust to partial occlusion and varying illumination. Additionally, a new database optimization method was introduced to improve computational speed. Experimental results on public and self-built databases showed that our method performs well compared with state-of-the-art methods.

By fusing information from the color image and the depth image, a new upper-body gesture recognition method was proposed. By means of VGA-camera and depth-camera calibration, the coordinate transformation between the color image and the depth image was estimated. Key points of the upper body were extracted based on human skeleton modeling, and the coordinates of the key points in a spherical coordinate system were represented by a 3D histogram. Using the model-based sparse classification method, 10 upper-body gestures were recognized in the experiments. Compared with commonly used methods, our method achieved better results, especially with complex backgrounds and different users.

A method for gait feature extraction based on laser range data is also given. The contours of the legs are quickly picked up from laser data covering a large area. To collect gait features such as walking speed, step length, step time, and step velocity, the position of the person is located from the leg positions extracted from laser data in continuous frames. Experimental results show that our method performs well for gait feature extraction.

Finally, a multi-modal human-robot interaction system was built on our service robot. To achieve natural and universal human-robot interaction, a new interaction architecture was proposed on the basis of semantic understanding. By fusing human action information, facial expressions, and voice information, users were able to interact naturally with the service robot. Following the evaluation mechanism for the multi-modal interaction system, the system was tested in several experiments.

In this thesis, to achieve natural and universal human-robot interaction, a mobile robot platform was developed with a multi-modal interaction system based on human action recognition, which improves the interaction performance of the robot. The experiments presented in this thesis verify that these techniques improve the performance of service robots and possess practical reference value.
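
The abstract names a joint feature sparse representation classifier but gives no implementation details. The sketch below shows generic sparse-representation classification (SRC) over concatenated features; the greedy OMP solver, the dictionary layout, and the toy data are all assumptions for illustration, not the dissertation's actual design.

```python
# Minimal sketch of sparse-representation classification over
# concatenated features, in the spirit of the joint feature sparse
# representation classifier named in the abstract.  The solver and
# dictionary layout are illustrative assumptions.
import numpy as np

def omp(D, y, n_nonzero=10):
    """Greedy orthogonal matching pursuit: approximate a sparse x
    with D @ x ~= y.  D must have unit-norm columns."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        # pick the atom most correlated with the current residual
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx not in support:
            support.append(idx)
        # least-squares fit on the current support
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x

def src_classify(D, labels, y):
    """Classify y by the class whose atoms reconstruct it best."""
    x = omp(D, y)
    residuals = {}
    for c in np.unique(labels):
        mask = labels == c
        residuals[c] = np.linalg.norm(y - D[:, mask] @ x[mask])
    return min(residuals, key=residuals.get)

# Toy usage: 3 classes x 20 training samples; each column of D stands
# for the concatenated (hypothetical) gray-level, texture and shape
# feature vectors of one training posture image.
rng = np.random.default_rng(0)
D = rng.normal(size=(96, 60))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
labels = np.repeat(np.array([0, 1, 2]), 20)
query = D[:, 5] + 0.05 * rng.normal(size=96)   # noisy class-0 sample
print(src_classify(D, labels, query))          # typically prints 0
```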
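
Mapping between the depth and color images, once the two cameras are calibrated, is a standard projection chain: back-project the depth pixel to 3D, transform by the stereo extrinsics, and re-project with the color intrinsics. A minimal sketch, with all intrinsic and extrinsic values assumed as placeholders rather than taken from the dissertation:

```python
# Sketch: mapping a depth pixel to the color image, given intrinsics
# K_d, K_c and extrinsics (R, t) from a stereo calibration.  All
# matrix values below are placeholder assumptions; real values come
# from calibrating the VGA and depth cameras.
import numpy as np

def depth_to_color(u, v, depth_m, K_d, K_c, R, t):
    """Project depth pixel (u, v) with range depth_m (meters) into
    color-image pixel coordinates."""
    # back-project to a 3D point in the depth camera frame
    p_d = depth_m * (np.linalg.inv(K_d) @ np.array([u, v, 1.0]))
    # transform into the color camera frame
    p_c = R @ p_d + t
    # perspective projection with the color intrinsics
    uvw = K_c @ p_c
    return uvw[:2] / uvw[2]

K_d = np.array([[580.0, 0, 320], [0, 580.0, 240], [0, 0, 1]])  # assumed
K_c = np.array([[525.0, 0, 320], [0, 525.0, 240], [0, 0, 1]])  # assumed
R, t = np.eye(3), np.array([0.025, 0.0, 0.0])  # assumed ~2.5 cm baseline
print(depth_to_color(300, 200, 2.0, K_d, K_c, R, t))
```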
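
The upper-body descriptor is described as a 3D histogram of main-joint coordinates in a spherical coordinate system. A minimal sketch of such a histogram, assuming joints are given relative to a reference joint (e.g., the torso) and that the bin counts and radius cap are free design choices not specified in the abstract:

```python
# Sketch: 3D histogram of joint positions in spherical coordinates.
# The reference point, bin counts and normalization are assumptions.
import numpy as np

def spherical_histogram(points, bins=(4, 8, 8), r_max=1.0):
    """points: (N, 3) array of joint positions relative to a reference
    joint.  Returns a flattened, normalized (r, theta, phi) histogram."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arccos(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))  # polar [0, pi]
    phi = np.arctan2(y, x)                                          # azimuth [-pi, pi]
    hist, _ = np.histogramdd(
        np.stack([r, theta, phi], axis=1),
        bins=bins,
        range=[(0.0, r_max), (0.0, np.pi), (-np.pi, np.pi)],
    )
    hist = hist.ravel()
    return hist / max(hist.sum(), 1.0)   # normalize to a distribution

# Usage: accumulate one histogram per tracked key joint over a gesture
# sequence, then concatenate the histograms as the gesture descriptor.
seq = np.random.default_rng(1).normal(size=(30, 3)) * 0.3
print(spherical_histogram(seq).shape)    # (4 * 8 * 8,) = (256,)
```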
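
The gait features listed (walking speed, step length, step time, cadence) follow from foot positions tracked over consecutive laser frames. The sketch below assumes foot detection already yields one 2D position per foot per frame at a known frame rate, and uses local maxima of the inter-foot distance as step events; that heuristic is an illustrative stand-in for the dissertation's walking-gait model.

```python
# Sketch: gait features from per-frame foot positions extracted from
# laser range data.  Frame rate, step-detection heuristic and feature
# definitions are illustrative assumptions.
import numpy as np

def gait_features(left, right, frame_rate=10.0):
    """left, right: (N, 2) arrays of foot positions in meters."""
    center = (left + right) / 2.0
    duration = (len(center) - 1) / frame_rate
    path = np.linalg.norm(np.diff(center, axis=0), axis=1).sum()

    # A step event: local maximum of the distance between the feet
    # (the feet are furthest apart around heel strike).
    gap = np.linalg.norm(left - right, axis=1)
    steps = [i for i in range(1, len(gap) - 1)
             if gap[i] >= gap[i - 1] and gap[i] > gap[i + 1]]
    step_times = np.diff(steps) / frame_rate if len(steps) > 1 else []

    return {
        "speed_mps": path / duration,
        "step_length_m": float(np.mean([gap[i] for i in steps])) if steps else 0.0,
        "step_time_s": float(np.mean(step_times)) if len(step_times) else 0.0,
        "cadence_sps": len(steps) / duration,   # steps per second
    }

# Usage with synthetic straight-line walking data at 10 Hz.
t = np.arange(0, 5, 0.1)
left = np.stack([1.2 * t + 0.3 * np.sin(2 * np.pi * t),
                 np.full_like(t, 0.1)], axis=1)
right = np.stack([1.2 * t - 0.3 * np.sin(2 * np.pi * t),
                  np.full_like(t, -0.1)], axis=1)
print(gait_features(left, right))
```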
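
The semantic understanding model that fuses action, expression, and voice channels is only named in the abstract, so any concrete form is speculative. As a purely illustrative baseline, a confidence-weighted vote over per-channel semantic labels might look like this:

```python
# Sketch: naive late fusion of multi-modal recognition results.  The
# dissertation's semantic understanding model is not described in the
# abstract; this confidence-weighted vote is purely illustrative.
from collections import defaultdict

def fuse(channel_outputs):
    """channel_outputs: list of (semantic_label, confidence) pairs,
    one per modality.  Returns the label with the highest total
    confidence."""
    scores = defaultdict(float)
    for label, conf in channel_outputs:
        scores[label] += conf
    return max(scores, key=scores.get)

print(fuse([("greet", 0.8),       # hand posture channel
            ("greet", 0.6),       # upper-body gesture channel
            ("approach", 0.4)]))  # gait channel -> prints "greet"
```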
