
基于非接触观测信息的机器人行为模仿学习

Robot Imitation Learning Based on Non-contact Observation

【Author】 Ma Le

【Supervisors】 Yang Junyou; Toshio Fukuda; Wang Shuoyu

【Author Information】 Shenyang University of Technology, Artificial Intelligence and Electrical Motion Control, 2014, Doctoral dissertation

【Abstract】 Behavioral learning is one of the key technologies for putting intelligent robots to practical use. Imitation learning enables a robot to autonomously transform a demonstrator's behavior into its own actions, and is a necessary and effective route to robot behavior learning. Behavior observation and representation-reproduction are its central problems. Most existing observation methods are contact-based; they demand specialized equipment and expert knowledge from the demonstrator and have limited practicality. Existing representation and reproduction models cannot accommodate behaviors of different types and levels. This dissertation therefore studies in depth imitation learning from non-contact, primarily visual, observation.

To address modeling, research efficiency, and safety, a human-robot relationship for imitation learning under visual observation was established, and a real-time control algorithm for end-effector differential motion based on the Jacobian matrix was proposed, realizing end-effector velocity control. An imitation learning platform combining 3D simulation with physical robots was built to improve research efficiency.

For visual behavior observation in general unmarked scenes, two local invariant feature descriptors were first proposed: the main-sub key-point descriptor (MSKD) and binary intensity discrete sampling (BIDS). MSKD constructs its descriptor from the relations between a main key point and auxiliary points, avoiding computation on irrelevant points while effectively characterizing the key point's local neighborhood. BIDS constructs binary features from discretely sampled local intensities, which overcomes illumination effects and speeds up feature matching. Both descriptors outperform traditional methods in per-point computation and matching time. Next, a feature training method based on affine warping of sample images was proposed: affine transforms simulate images observed from different viewpoints, and the features of the same key point across the transformed images are integrated; the trained MSKD and BIDS descriptors surpass traditional methods in matching accuracy. Finally, real-time object recognition and positioning based on RGB-D images was developed. Depth-based foreground segmentation masks irrelevant background regions, accelerating feature extraction, and the depth camera model computes accurate real-time object positions in the camera coordinate frame. Experiments show that the proposed methods effectively handle visual behavior observation in general unmarked environments.

To represent and reproduce behaviors of different types and levels, a cybernetic graph model (CGM) was proposed. The model represents a behavior as a graph whose nodes are behavior primitives with specific meanings. A B-Spline-based primitive representation and a dynamic-programming-based real-time primitive execution algorithm realize the representation and execution of primitives with different trajectories. Simulations and experiments verify that CGM effectively represents and reproduces behaviors of different types and levels on different robot platforms.

For robot imitation learning from visual observation, a CGM learning method suited to visual observation sequences was proposed. First, scale normalization and smoothing filters resolve the scale differences and jitter in observation sequences. Second, a segmentation method based on a correlation function splits the sequences into sub-sequences to be learned. Third, a primitive trajectory learning algorithm based on gradient descent with an arc-length constraint represents the observation sequences as B-Spline curves. Finally, an RBF-network-based generalization boosting algorithm functionalizes the primitive parameters to improve the model's generalization ability. Simulations verify the effectiveness of the learning algorithms, and multi-instance imitation learning experiments from visual observation on the RCA87A and Yamaha robot platforms confirm the effectiveness, generality, and practicality of the methods established in this dissertation.
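The end-effector velocity control described in the abstract maps a desired Cartesian velocity to joint rates through the manipulator Jacobian. A minimal sketch of that idea, using a hypothetical planar two-link arm rather than the dissertation's actual robot models:

```python
import numpy as np

def jacobian_2link(q, l1=1.0, l2=1.0):
    """Geometric Jacobian of a planar two-link arm (illustrative
    stand-in for the dissertation's manipulator model)."""
    s1, c1 = np.sin(q[0]), np.cos(q[0])
    s12, c12 = np.sin(q[0] + q[1]), np.cos(q[0] + q[1])
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])

def velocity_step(q, x_dot, dt=0.01):
    """One control cycle: map the desired end-effector velocity x_dot
    to joint rates via the Jacobian pseudoinverse, then integrate."""
    q_dot = np.linalg.pinv(jacobian_2link(q)) @ x_dot
    return q + q_dot * dt

q_next = velocity_step(np.array([0.3, 0.5]), np.array([0.1, 0.0]))
```

Iterating `velocity_step` at the control rate drives the end-effector along the commanded velocity; near singular configurations the pseudoinverse keeps the joint rates bounded.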
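BIDS builds binary features from discretely sampled local intensities and matches them quickly. The following sketch shows that general idea only; the sampling pattern here is a hypothetical set of random offset pairs, not the dissertation's actual BIDS pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical fixed sampling pattern of point pairs around the key
# point; the abstract does not specify the actual BIDS pattern.
OFFSETS = rng.integers(-8, 9, size=(256, 2, 2))

def binary_descriptor(img, kp):
    """256-bit binary descriptor: compare image intensities at the
    sampled point pairs and pack the comparison bits into bytes."""
    y, x = kp
    bits = [img[y + dy1, x + dx1] < img[y + dy2, x + dx2]
            for (dy1, dx1), (dy2, dx2) in OFFSETS]
    return np.packbits(np.array(bits, dtype=np.uint8))

def hamming(d1, d2):
    """Descriptor distance: popcount of the XOR, which is what makes
    binary descriptors fast to match."""
    return int(np.unpackbits(d1 ^ d2).sum())
```

Because each bit is an intensity comparison, the descriptor is unchanged under brightness and contrast shifts, which is one way such descriptors overcome lighting effects.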
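The RGB-D stage combines depth-based foreground segmentation with a camera-model back-projection to get object positions in the camera frame. A sketch under assumed Kinect-style intrinsics (the real values come from calibration and are not given in the abstract):

```python
import numpy as np

# Hypothetical intrinsics (focal lengths and principal point).
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def foreground_mask(depth, near=0.3, far=1.5):
    """Depth-based foreground segmentation: keep only pixels inside
    the working range, masking the irrelevant far background before
    feature extraction."""
    return (depth > near) & (depth < far)

def backproject(u, v, z):
    """Pinhole depth-camera model: pixel (u, v) at depth z maps to a
    3-D point in the camera coordinate frame."""
    return np.array([(u - CX) * z / FX, (v - CY) * z / FY, z])
```

Restricting feature extraction to the foreground mask is what yields the speed-up the abstract reports, since background pixels are never touched.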
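Behavior primitives represented as B-Spline curves are evaluated pointwise during execution. A self-contained de Boor evaluation sketch; the clamped knot vector and control points below are chosen for illustration, not taken from the dissertation:

```python
import numpy as np

def deboor(t, knots, ctrl, p=3):
    """Evaluate a degree-p B-Spline at parameter t (de Boor's algorithm)."""
    k = int(np.searchsorted(knots, t, side='right')) - 1
    k = min(max(k, p), len(ctrl) - 1)          # clamp to a valid span
    d = [np.asarray(ctrl[j], dtype=float) for j in range(k - p, k + 1)]
    for r in range(1, p + 1):
        for j in range(p, r - 1, -1):
            i = j + k - p
            denom = knots[i + p - r + 1] - knots[i]
            alpha = 0.0 if denom == 0.0 else (t - knots[i]) / denom
            d[j] = (1.0 - alpha) * d[j - 1] + alpha * d[j]
    return d[p]

# Clamped cubic B-Spline: endpoints interpolate the first and last
# control point, which suits trajectories with fixed start/goal poses.
knots = np.array([0, 0, 0, 0, 0.5, 1, 1, 1, 1], dtype=float)
ctrl = np.array([[0, 0], [1, 2], [2, 2], [3, 0], [4, 1]], dtype=float)
```

Sampling `deboor` over a time-scaled parameter gives the reference points a real-time executor tracks; the curve always stays inside the convex hull of its control points.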
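The RBF-network generalization step functionalizes primitive parameters so the primitive can be regenerated for unseen task values. A minimal ridge-regularized RBF regression sketch; the input variable, centers, widths, and target curve are all assumptions for illustration:

```python
import numpy as np

def rbf_features(x, centers, gamma=10.0):
    """Gaussian RBF activations for 1-D inputs."""
    return np.exp(-gamma * (x[:, None] - centers[None, :]) ** 2)

def rbf_fit(x, y, centers, gamma=10.0, reg=1e-6):
    """Ridge-regularized least-squares fit of the RBF output weights."""
    Phi = rbf_features(x, centers, gamma)
    return np.linalg.solve(Phi.T @ Phi + reg * np.eye(len(centers)),
                           Phi.T @ y)

def rbf_predict(x, centers, w, gamma=10.0):
    return rbf_features(x, centers, gamma) @ w

# Treat one primitive parameter as a function of a task variable, so
# the primitive generalizes to task values never demonstrated.
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2.0 * np.pi * x)        # stand-in for an observed parameter curve
centers = np.linspace(0.0, 1.0, 12)
w = rbf_fit(x, y, centers)
```

Querying `rbf_predict` at a new task value then yields the corresponding primitive parameter without re-demonstration, which is the generalization boost the abstract describes.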
