Research on Obstacle Avoidance of Robotic Manipulator Based on Reinforcement Learning

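【Author】 Zhang Shangwei

【Advisors】 Li Shiqi; Fu Yan

【Author Information】 Huazhong University of Science and Technology, Industrial Engineering, 2007, Master's thesis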

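【Summary】 Practical applications demand ever more intelligent robots, and reinforcement learning has become an important means of enhancing robot intelligence. Reinforcement learning needs no prior knowledge: the robot or agent explores its environment, observes the environment's responses, and after a period of learning acquires enough knowledge to avoid obstacles. Because reinforcement learning relies only on what the robot's sensors perceive, and does not require an accurate model of the environment or of the manipulator, it is widely used in robotics and other fields. Building on a study of existing robot obstacle-avoidance methods and of reinforcement learning theory, this thesis introduces reinforcement learning into the manipulator collision-avoidance problem and builds a multi-agent collision-avoidance system for a planar three-degree-of-freedom manipulator. Each agent perceives two quantities, the distance to the nearest obstacle and the deviation angle of the current pose; these are also the state variables of the system, and the agent moves under their combined influence. To meet the real-time requirements of manipulator control, Sarsa(λ), one of the main reinforcement learning methods and one that learns online, is used as the basic control strategy of the collision-avoidance system, and the concrete implementation procedure of the algorithm is given. Simulation experiments demonstrate that the reinforcement learning approach is feasible and effective for manipulator collision avoidance. Because the collision-avoidance system has a continuous state space, a hard (fixed) partition of that space often fails to reflect the true nature of the states. This thesis therefore combines clustering with the reinforcement learning algorithm, using K-means clustering to partition the continuous state space adaptively; simulation experiments show that, in the same environment, adaptive partitioning improves collision-avoidance ability compared with hard partitioning. A simulation platform for the reinforcement-learning-based collision-avoidance system of the planar 3-DOF manipulator was developed on the Microsoft .NET platform to display the results of the avoidance trials and to analyze in detail the performance indicators and individual stages of the algorithm. A range of simulation experiments verifies that the system has strong collision-avoidance capability: even in some complex environments the manipulator avoids the obstacles and reaches the target.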
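The text above names Sarsa(λ) as the control strategy but does not reproduce the update rule. As a rough illustration only, the following is a minimal tabular sketch of one Sarsa(λ) backup with accumulating eligibility traces. The function names, the table layout (a list of rows, one per discretized state), and the parameter values are assumptions made for this example and are not taken from the thesis.

```python
import random


def epsilon_greedy(q_row, epsilon, rng):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))


def sarsa_lambda_update(Q, E, s, a, reward, s_next, a_next, done,
                        alpha=0.1, gamma=0.9, lam=0.8):
    """Apply one on-policy Sarsa(lambda) backup in place and return the TD error.

    Q and E are hypothetical tables indexed as table[state][action]:
    Q holds action values, E holds accumulating eligibility traces.
    """
    # The target uses the action actually chosen in the next state (on-policy).
    target = reward if done else reward + gamma * Q[s_next][a_next]
    delta = target - Q[s][a]
    E[s][a] += 1.0  # accumulating trace for the visited state-action pair
    for i in range(len(Q)):
        for j in range(len(Q[i])):
            Q[i][j] += alpha * delta * E[i][j]  # credit recently visited pairs
            E[i][j] *= gamma * lam              # decay every trace
    return delta


# Hypothetical usage: 3 discretized states, 2 actions per state.
Q = [[0.0, 0.0] for _ in range(3)]
E = [[0.0, 0.0] for _ in range(3)]
rng = random.Random(0)
next_action = epsilon_greedy(Q[1], epsilon=0.1, rng=rng)
sarsa_lambda_update(Q, E, s=0, a=1, reward=1.0, s_next=1, a_next=next_action,
                    done=False, alpha=0.5)
```

Because the update is applied after every step rather than at the end of an episode, it fits the online, real-time setting the text describes; the traces let a single reward adjust the values of several recently visited state-action pairs at once.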

【Abstract】 This thesis addresses the obstacle avoidance problem for robotic manipulators. As practical applications demand more intelligent robots, reinforcement learning (RL) has become an important means of enhancing robot intelligence. After a period of learning, an RL robot or agent with no prior knowledge can avoid obstacles by relying only on its own exploration and the environment's responses. RL is applied to this domain because it relies on the sensors' perception of the environment rather than on an accurate model of the environment and the robot itself. A multi-agent obstacle avoidance system was built for a 3-DOF planar manipulator. The system combines a repelling influence, related to the distance between the manipulator and nearby obstacles, with an attracting influence, produced by the angular difference, to drive the manipulator's motion. To meet the real-time demands of manipulator control, the Sarsa(λ) algorithm, a major RL method, was selected as the basic control strategy for its on-policy, online learning and its efficiency. The implementation procedure of the algorithm is given, and a simulation experiment shows the feasibility and effectiveness of the RL method. Because the obstacle avoidance problem for robotic manipulators has a continuous state space, the partitioning of that space is an important factor in the applicability and efficacy of reinforcement learning algorithms. The k-means clustering algorithm is used to partition the state space, and a series of simulations demonstrates the practical value and performance of the proposed algorithms in solving robot motion planning problems. A simulation platform for the obstacle avoidance problem of a 3-DOF planar manipulator was developed on the Microsoft .NET platform. It is used to show the results of the simulation trials and to analyze the obstacle avoidance algorithm. A series of experiments demonstrates that the system has a strong capacity for collision avoidance, even in some complex environments.
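To make the state-space partitioning idea concrete, here is a small sketch of how k-means centers could serve as discrete state indices for a tabular learner. It is not the thesis's implementation: the two-feature state (distance to the nearest obstacle, deviation angle), the sample values, and the helper names `nearest` and `kmeans` are assumptions for illustration, and the features are taken to be already scaled to comparable ranges.

```python
import random


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns k centers as lists of floats."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


# Hypothetical samples of (normalized obstacle distance, normalized deviation angle).
samples = [(0.10, 0.00), (0.20, 0.10), (0.15, 0.05),
           (0.90, 1.00), (1.00, 0.90), (0.95, 0.95)]
centers = kmeans(samples, k=2)
state_index = nearest((0.12, 0.02), centers)  # discrete state id for a new reading
```

The returned index can be used as the row of a value table, so the cell boundaries follow where the sampled states actually concentrate instead of a fixed grid.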
