具有认知能力的智能机器人行为学习方法研究

Research on Behavior Learning Methods for Intelligent Robot with Cognitive Ability

【Author】 Wang Zuowei (王作为)

【Advisor】 Zhang Rubo (张汝波)

【Author information】 Harbin Engineering University, Computer Application Technology, 2010, Ph.D.

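【Abstract (translated from the Chinese)】 Behavior learning is one of the key technologies in intelligent robot design. At present, robot behavior learning methods are limited to learning reflexive behaviors: the knowledge representation structure of the task is specified by hand, parameters are adjusted continually from training samples, and the system must be reprogrammed once the task changes. A system with this kind of behavior learning has no cognitive ability and cannot produce complex intelligent behavior. Research on robot systems with cognitive ability has become an important direction in robotics, drawing on cognitive psychology, cognitive science, and ethology. This thesis focuses on the cognitive mechanisms of robots and analyzes in depth the importance of the cognitive model to the development of robot intelligence. It proposes an architecture for intelligent robots with cognitive ability, studies knowledge representation and learning methods within the cognitive model, and finally uses these results to achieve spatial cognition of the environment, with multi-task planning behavior emerging bottom-up. The main work of the thesis is as follows.

First, the thesis reclassifies robot paradigms from the perspective of how intelligence arises. The new classification covers the traditional system paradigms, completes the cognitive hierarchy of intelligent robots, distinguishes different levels of intelligence, and clarifies the place of cognitive ability within robot system paradigms. On this basis, the thesis proposes an architecture for intelligent robots with cognitive ability. The architecture learns autonomously: only basic reflexive behaviors need to be given, and all high-level cognitive abilities can be acquired through autonomous learning without reprogramming. The modules depend on one another and can learn simultaneously, which gives the system real-time learning capability.

Second, the thesis studies self-organized extraction of environmental features, using "active perception behavior" and "sensory-motor coordination" to obtain those features. It gives a design method for active neurons based on change detection and activation intensity, and uses a dynamically growing self-organizing feature map (GDSOM) to achieve self-organized landmark extraction and landmark recognition. Experiments show that this landmark extraction and recognition method needs neither precise positioning control nor a sensor metric model, has good robustness and computation speed, and effectively solves the "perceptual variability" problem, laying a foundation for cognitive ability.

Third, the thesis studies knowledge representation and learning methods for spatio-temporal experience. It discusses a cognitive mathematical model, the observation-driven Markov decision process (ODMDP), and proposes a corresponding solution strategy. Drawing on the properties of biological neurons, it proposes a new biological neural network model, the spatio-temporal associative memory network (STAMN). The network achieves incremental learning of states and actions and solves the state localization problem of the ODMDP. STAMN is used to achieve spatial cognition of the environment, and experiments show that the network can be used to solve the simultaneous global localization and mapping (SLAM) problem in cyclic environments.

Finally, the thesis studies reinforcement learning methods with cognitive ability. For the multi-task learning problem that robots face, it proposes a reinforcement learning model with cognitive ability, together with a Sarsa algorithm with k-step memory and k-step prediction, (k-M)(k-P) Sarsa, suited to multi-task learning. The reinforcement learning model solves the policy learning problem of the ODMDP and has a good convergence speed. Experiments in a maze environment verify the effectiveness of the intelligent robot's multi-task learning.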

【Abstract】 Behavior learning is one of the key techniques in intelligent robot design. At present, robot behavior learning methods are limited to learning reflex behaviors. The knowledge representation structure of a task is specified by a human beforehand, training samples are used for parameter tuning, and reprogramming is needed once the task changes. Systems with this kind of behavior learning capability have no cognitive ability and cannot produce complex intelligent behavior. Research on robotic systems with cognitive ability is becoming an important direction in robotics, and it is closely related to cognitive psychology, cognitive science, and animal behavior. This thesis focuses on the cognitive mechanism of robots and thoroughly analyzes the importance of the cognitive model to the development of robot intelligence. An architecture for intelligent robots with cognitive ability is presented, and the knowledge representation and learning methods of the cognitive model are thoroughly studied. Finally, the results are used to achieve environmental spatial cognition, with multi-task planning behavior emerging in a bottom-up way. The main contributions are as follows.

Firstly, the paradigms of robot architecture are reclassified from the viewpoint of intelligence acquisition. The new classification not only covers the traditional paradigms but also completes the cognitive levels of intelligent robots, differentiates the intelligence levels of robot systems, and specifies the place of cognitive ability in the paradigms of robot architecture. Based on this, the thesis presents an architecture for intelligent robots with cognitive ability. The architecture learns autonomously: it needs only the fundamental reflex behaviors and acquires high-level cognitive abilities through autonomous learning instead of reprogramming. The modules depend on each other and learn simultaneously, and so the system is capable of real-time learning.

Secondly, the self-organized extraction of environmental features is studied. "Active exploration behavior" and "sensory-motor coordination" are used to acquire environmental features. A design method for active neurons based on change detection and activation intensity is presented. The growing dynamic self-organizing feature map (GDSOM) is used to extract and recognize landmarks. Experimental results show that this landmark extraction method needs neither exact location control nor a sensor metric model, offers good robustness and a low computational burden, effectively solves the problem of "perception variability", and builds a foundation for cognitive ability.

Thirdly, the knowledge representation and learning methods for spatio-temporal experience are studied. A cognitive mathematical model, the observation-driven Markov decision process (ODMDP), is discussed, and a solving strategy is proposed. Drawing on the characteristics of biological neurons, a new biological neural network model, the spatio-temporal associative memory network (STAMN), is proposed to realize incremental learning of states and actions. This resolves the state localization problem of the ODMDP. STAMN is then applied to achieve environmental spatial cognition. Experimental results show that the network can effectively solve the SLAM problem for large-scale cyclic environments.

Finally, reinforcement learning methods with cognitive ability are studied. For the multi-task learning problem of robots, a reinforcement learning model that resolves the policy learning problem of the ODMDP is proposed, together with a (k-M)(k-P) Sarsa algorithm, that is, Sarsa with k-step memory and k-step prediction. Their feasibility and effectiveness are validated by multi-task experiments in a maze environment.
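As a rough illustration of the "k-step memory" idea mentioned above, the following Python sketch runs standard tabular Sarsa in which the agent's state is the tuple of its last k observations. This is not the thesis's (k-M)(k-P) Sarsa algorithm, whose details the abstract does not give; the k-step prediction part is omitted entirely. The corridor environment, the reward values, and all function names are assumptions made for illustration only.

```python
import random
from collections import defaultdict

# Hypothetical 5-cell corridor; the three middle cells look identical ("open"),
# so a single observation does not identify the robot's position.
CORRIDOR_OBS = ["wall", "open", "open", "open", "goal"]
ACTIONS = [-1, +1]  # step left, step right


def push_memory(memory, observation, k):
    """Append an observation and keep only the most recent k as a tuple."""
    return (memory + (observation,))[-k:]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.95):
    """Standard Sarsa rule: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    target = reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


def run_episode(q, k, rng, epsilon=0.1, max_steps=100):
    """Run one episode from cell 0; return the number of steps taken."""
    pos = 0
    state = push_memory((), CORRIDOR_OBS[pos], k)
    action = epsilon_greedy(q, state, epsilon, rng)
    for step in range(1, max_steps + 1):
        pos = min(max(pos + action, 0), len(CORRIDOR_OBS) - 1)
        done = CORRIDOR_OBS[pos] == "goal"
        reward = 1.0 if done else -0.01  # assumed reward scheme
        next_state = push_memory(state, CORRIDOR_OBS[pos], k)
        next_action = epsilon_greedy(q, next_state, epsilon, rng)
        # Terminal memory states are never updated, so their Q-values stay 0
        # and the bootstrap term vanishes at the goal.
        sarsa_update(q, state, action, reward, next_state, next_action)
        if done:
            return step
        state, action = next_state, next_action
    return max_steps


def train(k=2, episodes=200, seed=0):
    """Train on the corridor and return the learned Q-table."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        run_episode(q, k, rng)
    return q
```

Calling `train(k=2)` returns a Q-table keyed by `((previous_observation, current_observation), action)`. Raising `k` lets the agent tell apart positions that produce the same single observation, at the cost of a larger table.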
