
Research on Decision-Making Algorithms Based on Artificial Neural Networks

Use of Neural Networks as Decision Makers in Strategic Situations


【Author information】 Shanghai Jiao Tong University, Circuits and Systems, 2009, Master's thesis

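【Chinese abstract】 Intelligence includes the ability to make correct decisions in a particular situation in order to achieve a specific goal. To date, most intelligent systems can only imitate one particular reasoning process, and few can automatically find their own way of thinking according to their environment. Moreover, neural networks have never been adopted in this field. This thesis introduces a new intelligent system that automatically makes decisions according to its environment in order to achieve a specific goal. That is, when faced with a situation in which a goal must be reached, the system has to adjust itself and find the best strategy on its own. In most cases, the parameters describing a particular environment must be mapped to the final decision in a nonlinear way. This mapping can be performed by an artificial neural network. In this thesis, we use an artificial neural network as the decision maker. We show that a carefully designed artificial neural network can behave like a human and make reasonable decisions in complex environments (for example, in competition with other intelligent systems). The thesis adopts a new artificial neural network architecture. We introduce and test this architecture and show that the network can make decisions as intelligently as a human does. In addition to the new human-like architecture, the thesis introduces a new training method, which lets the human-like network evolve continuously and eventually converge to an optimal decision. The method is inspired by the human learning process and consists of a new stochastic unsupervised reinforcement-learning rule using BP (Back-Propagation). We also prove the effectiveness of this training method mathematically. More importantly, unlike many other reinforcement training methods, it can be used in applications with non-discrete outputs and therefore has broader prospects for practical application. To validate the new architecture and training method, we implement the human-like network in software and test it. The test frameworks are mathematical models of real-life situations, such as those provided by game theory, in particular the Iterated Prisoner's Dilemma. Because game-theoretic models are often used to test new human-like artificial intelligence models, they let us validate the design of our network and training method and ultimately show that the network can be used to build machines with intelligent behavior. The tests lead us to conclude that the network can make intelligent decisions as a human does, which supports the idea of using artificial neural networks to make decisions according to the environment.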

【Abstract】 Intelligence consists of the ability to make the right decisions in a given situation in order to achieve a certain goal. Until now, most intelligent systems have only been able to copy some reasoning process; very few could find their own way of thinking by themselves, and none of them consisted solely of neural networks. This thesis introduces a new intelligent agent that is capable of intelligent behavior, meaning that it can adapt to its environment and make its own decisions in order to achieve a predetermined objective. Thus, confronted with strategic situations (situations in which one has to make the right decisions in order to achieve a given goal), such an agent will be able to adapt and to find its own optimal strategy.

Most of the time, effective decision-making in strategic situations requires a nonlinear mapping between a stimulus and the appropriate decision. This sort of mapping can be provided by Artificial Neural Networks. Therefore, in this thesis, we describe the use of Artificial Neural Networks as decision makers, and we demonstrate that, if they are well designed, they are capable of intelligent behavior in complex situations, such as competitive situations against other intelligent agents.

The Artificial Neural Network designed in this thesis has a new architecture, which is introduced, explained, and then tested in order to assess the ability of such an agent to make decisions as humans do. Like every other neural network, it must then evolve according to a learning rule in order to converge to good decision-making. The new learning rule introduced in this thesis is inspired by the human learning process and consists of a new stochastic unsupervised reinforcement-learning rule using Back-Propagation. Its effectiveness is also demonstrated mathematically. Furthermore, unlike most reinforcement-learning rules, it is designed so that it can be used even with continuous outputs, which makes it suitable for many different real-life applications.

Finally, to validate the architecture and the human-inspired reinforcement-learning rule that we introduce, the Human-Like Artificial Neural Network is tested and is shown to be able to evolve and to make decisions as humans do. The frameworks used for these tests are mathematical models of real-world situations such as those provided by Game Theory, in particular the Iterated Prisoner's Dilemma, which has been used several times in the last few years to test new models in artificial intelligence. Thus, Game Theory provides a framework that validates the design of our Human-Like Artificial Neural Network and of the new reinforcement-learning rule, and it allows us to demonstrate that Artificial Neural Networks can be used to design machines that are capable of intelligent behavior.
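To make the setting in the abstract concrete, the following is a minimal sketch of a stochastic reinforcement-learning agent playing the Iterated Prisoner's Dilemma. It is not the thesis's architecture or learning rule, which this record does not specify. Everything here is an assumption made for illustration: the payoff values (the conventional 5, 3, 1, 0), the single sigmoid neuron, the tit-for-tat opponent, and the standard REINFORCE-style update with a running-average baseline. The action is also sampled as a discrete move, whereas the thesis targets continuous outputs.

```python
import math
import random

# Conventional Prisoner's Dilemma payoffs (assumed, not from the thesis).
# They satisfy T > R > P > S and 2R > T + S.
# Keys are (my_move, opponent_move); 1 = cooperate, 0 = defect.
PAYOFF = {(1, 1): 3, (1, 0): 0, (0, 1): 5, (0, 0): 1}


def sigmoid(x):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def train(rounds=5000, lr=0.05, seed=0):
    """Train a one-neuron stochastic policy against tit-for-tat.

    Returns the learned weight (on the opponent's last move) and bias.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0            # hypothetical single-neuron parameters
    baseline = 0.0             # running average reward, reduces variance
    opp_last, my_last = 1, 1   # start as if both had just cooperated
    for _ in range(rounds):
        x = float(opp_last)
        p = sigmoid(w * x + b)              # probability of cooperating
        action = 1 if rng.random() < p else 0
        opp_move = my_last                  # tit-for-tat copies our previous move
        reward = PAYOFF[(action, opp_move)]
        # REINFORCE: d log pi(action) / d logit = action - p
        grad = (reward - baseline) * (action - p)
        w += lr * grad * x
        b += lr * grad
        baseline += 0.01 * (reward - baseline)
        opp_last, my_last = opp_move, action
    return w, b


if __name__ == "__main__":
    w, b = train()
    print("P(cooperate | opponent cooperated) =", sigmoid(w + b))
    print("P(cooperate | opponent defected)   =", sigmoid(b))
```

Because this update uses only the immediate payoff of each round, the sketch says nothing about whether long-horizon cooperation emerges; it only illustrates how a stochastic policy can be adjusted by a reward signal through the gradient of its output unit.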

【Key words】 Neural Networks; Reinforcement Learning; Back-Propagation; Game Theory; Iterated Prisoner's Dilemma

【Online publication contributor】 Shanghai Jiao Tong University

【Classification number】 TP183
【Citation count】 2
【Download count】 239