Study on Dynamic Multi-Agent Model and Decision (动态多智能体建模与决策问题研究)

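【Author】 Yao Hongliang (姚宏亮)

【Advisors】 Zhang Yousheng (张佑生); Wang Hao (王浩)

【Author Information】 Hefei University of Technology, Computer Application Technology, 2007, Ph.D.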

【Abstract】 Complex dynamic decision problems are an important part of complex-systems research in artificial intelligence. Based on Bayesian techniques and decision theory, this dissertation proposes a dynamic decision model with stronger knowledge-representation ability, the Multi-Agent Dynamic Influence Diagram (MADID), for modeling multi-agent systems in dynamic environments. It also studies approximate computation of the probability distribution of MADIDs, inference algorithms, and multi-agent coordination. The main contents and contributions are as follows:

(1) A structural decomposition method for Influence Diagrams (IDs) is given, which decomposes an ID into a probabilistic network structure part and a utility structure part. An MDL scoring criterion that incorporates prior knowledge of the network structure is proposed to reduce the dependence of the traditional MDL score on data, and a PS-EM algorithm built on this score is proposed for model selection of the probabilistic structure part. The joint utility function is expressed as the sum of local utility functions, and a BP (back-propagation) neural network is constructed to learn the local utility functions, which realizes learning of the utility structure part. Experimental results show that this model selection method is effective.

(2) Based on an analysis of related probabilistic decision models, Multi-Agent Influence Diagrams (MAIDs) are extended over time to obtain a new decision model, MADIDs, which represents coordination relationships among agents in dynamic environments. To compute the probability distribution of MADIDs efficiently, a hierarchical decomposition method for the distribution is given, guided by the strategic relevance among agents, and the error of the approximate distribution is analyzed using the KL divergence.

(3) To address the high computational complexity of the 1.5-slice junction tree exact inference algorithm and the large error of the BK (Boyen-Koller) approximate inference algorithm, an extended BK (EBK) algorithm is proposed. EBK improves computational efficiency by hierarchically decomposing the probability distribution of MADIDs, reduces inference error by introducing separator cliques, and adds inference over utility nodes and decision nodes. To address the excessive dimensionality of particle filter inference and the large error of factored particle filter inference, the advantages of particle filtering and junction tree inference are combined in a junction tree factored particle (JFP) inference algorithm. JFP converts the probability distribution of MADIDs into a local factored form to improve computational efficiency, and propagates factored particles on the junction tree to reduce inference error. The algorithms above are verified and compared experimentally on a local coordination model in the RoboCup robot soccer simulation environment.

(4) Building on multi-agent coordination based on Coordination Graphs (CGs), roles are introduced into the CG to give an extended Coordination Graph that reduces communication during coordination. A multi-agent coordination method based on MADIDs is given, in which coordination is achieved through inference about the environment and computation of local utilities, and communication for local coordination is avoided by modeling the opponent.
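
The error analysis mentioned in items (2) and (3) compares an exact distribution with a factored approximation using the KL divergence. The following is a minimal sketch of that general idea only, not the hierarchical decomposition or the EBK algorithm of the dissertation: it projects a joint belief over two variables onto the product of its marginals (the projection step used in BK-style approximate inference) and measures the resulting KL error. The function names and the example beliefs over two agents' binary states are hypothetical and chosen for illustration.

```python
import math
from itertools import product


def marginal(joint, axis):
    """Marginal of a two-variable joint table {(a, b): p} for variable `axis` (0 or 1)."""
    out = {}
    for assignment, p in joint.items():
        key = assignment[axis]
        out[key] = out.get(key, 0.0) + p
    return out


def factored_approximation(joint):
    """Project the joint onto the product of its two single-variable marginals."""
    pa, pb = marginal(joint, 0), marginal(joint, 1)
    return {(a, b): pa[a] * pb[b] for a, b in product(pa, pb)}


def kl_divergence(p, q):
    """KL(p || q) in nats; entries with p(x) = 0 contribute nothing."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0.0)


# Hypothetical beliefs over the binary states of two agents (illustration only).
correlated = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
independent = {(0, 0): 0.42, (0, 1): 0.28, (1, 0): 0.18, (1, 1): 0.12}

# Error introduced by treating the two agents as independent.
err_correlated = kl_divergence(correlated, factored_approximation(correlated))
err_independent = kl_divergence(independent, factored_approximation(independent))

print(f"KL error, correlated belief:  {err_correlated:.6f}")
print(f"KL error, independent belief: {err_independent:.6f}")
```

In this sketch the factored form loses information only when the two variables are actually dependent, which is consistent with the abstract's use of dependence among agents to guide where the distribution is split.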
