
Multi-Agent System Learning and Cooperation Research in RoboCup

【Author】 Yang Baoqing (杨宝庆)

【Supervisor】 Liu Guodong (刘国栋)

【Author Information】 Jiangnan University, Control Theory and Control Engineering, 2008, Master's thesis

【Abstract (translated from the Chinese)】 With the development of computer technology, the theory and application of multi-agent systems (MAS: Multi-agent System) in distributed artificial intelligence have become a research focus in AI. RoboCup (Robot World Cup), the robot soccer world championship, is a typical multi-agent system, characterized by a dynamic environment, the coexistence of cooperation and competition among multiple agents, limited communication bandwidth, and random noise injected by the system. As a test platform of general significance, it allows the various theories and algorithms of multi-agent systems to be studied and evaluated in depth, with results that generalize to many other fields. The main work of this thesis is as follows:

1) To address the complexity of the agent's decision task in RoboCup, a decision framework based on layered learning is designed. The framework divides the agent's decision task into several layers from high level to low level; the decisions at each layer are realized by an appropriate machine-learning method and build on the learning results of the layer below. To counter the error accumulation inherent in a layered structure, an improved structure is adopted that adds a coordination layer, which evaluates decision information and corrects obviously erroneous information.

2) To improve the intelligence of the agent's individual skills, a genetic neural network is trained offline to realize the agent's ball-interception skill. Experiments show that this technique largely overcomes the interference caused by noise. The agent's kicking skill is trained offline with Q-learning.

3) For the learning problem of the agent team's offensive decision-making, the single-agent Q-learning algorithm is extended. The main idea is to introduce a learning agent and to combine statistical learning with reinforcement learning: the learning agent learns the other agents' behavioral policies from statistics over the agents' joint actions. The experiments in this thesis were carried out in the RoboCup simulation-league environment; the results show that the proposed learning algorithms effectively realize intelligent agent decision-making in a complex environment.
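The extended Q-learning idea in point 3) — a learning agent that estimates other agents' policies from statistics over joint actions — can be sketched as a joint-action learner. This is an illustrative reconstruction of the general technique, not the thesis's implementation; all class, method, and action names are assumptions:

```python
import random
from collections import defaultdict

class JointActionLearner:
    """Sketch: Q-values are kept over joint actions; the other agent's
    policy is estimated empirically from action counts, and the learning
    agent values its own actions against that estimated distribution."""

    def __init__(self, my_actions, other_actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.my_actions = my_actions
        self.other_actions = other_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # Q[state][(my_action, other_action)] -> value
        self.Q = defaultdict(lambda: defaultdict(float))
        # Statistical learning: counts of the other agent's actions per state
        self.counts = defaultdict(lambda: defaultdict(int))

    def other_policy(self, state):
        """Empirical distribution over the other agent's actions in `state`."""
        total = sum(self.counts[state].values())
        if total == 0:  # no observations yet: assume uniform
            return {a: 1.0 / len(self.other_actions) for a in self.other_actions}
        return {a: self.counts[state][a] / total for a in self.other_actions}

    def expected_value(self, state, my_a):
        """Value of my_a averaged over the estimated other-agent policy."""
        dist = self.other_policy(state)
        return sum(dist[oa] * self.Q[state][(my_a, oa)] for oa in self.other_actions)

    def choose(self, state):
        """Epsilon-greedy choice against the expected joint-action values."""
        if random.random() < self.epsilon:
            return random.choice(self.my_actions)
        return max(self.my_actions, key=lambda a: self.expected_value(state, a))

    def update(self, state, my_a, other_a, reward, next_state):
        """Record the observed joint action, then do a Q-learning backup."""
        self.counts[state][other_a] += 1
        best_next = max(self.expected_value(next_state, a) for a in self.my_actions)
        q = self.Q[state][(my_a, other_a)]
        self.Q[state][(my_a, other_a)] = q + self.alpha * (reward + self.gamma * best_next - q)
```

As the counts accumulate, the expected values converge toward the other agents' actual action frequencies, so the greedy choice approximates a best response to their observed policies.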

【Abstract】 With the development of computer technology, research on the theory and application of multi-agent systems (MAS) has become a hot spot in artificial intelligence. The Robot World Cup (RoboCup) is a typical MAS, characterized by a dynamic environment, the coexistence of cooperation and competition among several agents, limited communication bandwidth, and a noisy environment. On this general test platform, various MAS theories can be studied and applied to many fields.

Considering the complexity of the agent decision task in RoboCup, a decision framework based on layered learning is designed. The framework divides the full decision task into several layers from high level to low level. To solve error accumulation among the layers, an improved layer structure with a coordination layer is adopted, which evaluates decision-making information and corrects inaccurate information.

To improve the intelligence of individual skills, offline learning is adopted for basic techniques such as ball interception. After analyzing two different solutions, an improved dichotomy algorithm based on a neural network and a genetic algorithm is proposed to achieve ball interception. Q-learning is adopted to train the basic skill of ball kicking.

For the learning problem of agent team cooperation, the basic Q-learning algorithm is extended by introducing the concept of a learning agent. The agent learns other agents' action policies by observing and counting their joint actions; a concise but useful hypothesis is adopted to represent the other agents' optimal policies, and the full joint probability distribution over policies guarantees that the learning agent chooses the optimal action.

All experiments were performed on the RoboCup simulation platform. The results show that the proposed learning methods effectively improve the intelligence of agent decision-making in a complex domain.
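The layered decision framework with a coordination layer described above can be sketched in a few lines: each layer refines the decision of the layer above it, and the coordinator vetoes obviously wrong proposals to limit error accumulation. This is an illustrative sketch, not the thesis's implementation; the layer policies, validity check, and fallback are placeholder assumptions:

```python
class Layer:
    """One decision layer: maps its input to a proposal for the layer below."""
    def __init__(self, name, policy):
        self.name, self.policy = name, policy

    def decide(self, x):
        return self.policy(x)

class Coordinator:
    """Coordination layer: evaluates each layer's proposal and replaces
    an obviously wrong one with a fallback, limiting error accumulation."""
    def __init__(self, is_valid, fallback):
        self.is_valid, self.fallback = is_valid, fallback

    def filter(self, proposal, context):
        return proposal if self.is_valid(proposal, context) else self.fallback(context)

def run_hierarchy(layers, coordinator, world_state):
    """Propagate the decision from the high-level layer down to the
    low-level layer, checking each proposal on the way."""
    x = world_state
    for layer in layers:
        x = coordinator.filter(layer.decide(x), x)
    return x
```

A toy usage: a strategy layer that chooses "attack" or "defend", and a skill layer that maps the strategy to a primitive action; when the skill layer produces nothing usable, the coordinator substitutes a safe fallback instead of passing the error downward.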

  • 【Online Publication Contributor】 Jiangnan University
  • 【Online Publication Year/Issue】 2009, No. 03