基于认知科学的计算机围棋博弈问题的研究

Study on Problems for Computer Go Based on Cognitive Science

【Author】 Yu Lei (余磊)

【Advisor】 Liu Jingao (刘锦高)

【Author information】 East China Normal University, Communication and Information Systems, 2011, PhD

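【Abstract (translated from Chinese)】 Cognitive science is a landmark emerging discipline of twentieth-century science. As a frontier field that investigates how the human brain works, it has attracted wide attention from scientists around the world. Because Go is visually abstract and complex, computer Go is becoming an important direction and tool for cognitive science research. In recent years, most researchers have approached Go play with statistical algorithms that contain no Go knowledge, such as Monte Carlo (MC), Upper Confidence bounds applied to Trees (UCT), and All Moves As First (AMAF). This thesis takes a different angle. By studying the knowledge structure, forms of expression, and thinking models of Go players, it proposes a series of computer Go solutions that simulate how human players think and quantify their fuzzy judgments. These solutions help raise the playing strength of Go programs and also deepen the understanding of human cognitive ability, which in turn supports research in cognitive science. The specific contributions are as follows:

1. Liberties of groups are graded into levels. From the grading, the program can judge how safe a group is, choose a capture target, and generate and rank candidate points. The candidates are then searched in order with the Memory-enhanced Test Driver (MTD(f)) algorithm to find the correct capturing move. Experiments show that the method works well and can be used in actual computer Go play. It also prepares the ground for determining the number of effective stones when thickness value is computed in Chapter 3.

2. The value of thickness is quantified. Unlike the traditional approach of counting fully controlled points, the thesis treats the influence of thickness as control probability diffusing over two-dimensional space, and takes the sum of the control probabilities of all empty points as the value of the thickness. A stone influence function is designed for this purpose and a mathematical model for computing thickness value is built. The value of thickness is split into a basic value and an added value, so that the judgment of thickness can be adjusted dynamically for different playing strengths, simulating a human player's feel for thickness. A simple genetic algorithm optimizes the model parameters level by level, yielding a thickness quantization model for each strength level. Experiments show that the model reaches high accuracy and can be used in the opening, middle game, and endgame modules of a Go program. It provides the basis for the position-quantification program in Chapter 4.

3. A method for quantitatively evaluating the whole-board position is proposed, with winning probability as the quantified result. Simulating how human players judge a position, the method computes winning probability from the lead in points and the progress of the game, and optimizes the model parameters with a multi-level population competition-and-extinction algorithm. When the strength of the Go program changes, only the model parameters need to be adjusted, so the model applies to different strength levels and is portable and general. It performed well in experiments and lays the foundation for selecting and deciding the best move in Chapter 5. The approach of building a model from Go knowledge and fixing its parameters with a genetic algorithm can also be applied to other computer game problems.

4. A middle-game move strategy for computer Go is proposed, covering how moves are generated, evaluated, and decided. Candidate moves receive an initial evaluation from their territory value, strategic value, shape value, and follow-up value. Moves are divided by purpose into attacking and defending moves. Using the winning probability, the program computes a weight that adjusts attack and defense intensity, and so dynamically adjusts the evaluations of attacking and defending moves to find the best move. The method simulates a human player's thinking before playing a stone while exploiting the computer's strength in calculation, and it combines dynamic analysis, static search, and the use of a knowledge base.

5. A computer Go system, CognitiveGo, is developed to integrate the above. Before each move, CognitiveGo first looks up possible points in a pattern library, then judges both sides' territory, thickness, and group strength according to its own playing strength, and uses the result to adjust the goal of the next move. In this process, the pattern library influences which candidate moves are recommended, and the positional judgment influences switching between goals.

In summary, the thesis studies computer Go problems mainly by simulating the thinking process of human players and building corresponding models. Its methods and results have practical and theoretical value for improving machine game-playing intelligence and for advancing cognitive science.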
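The thickness idea in item 2 (control probability diffusing over the board, summed over all empty points) can be illustrated with a minimal Python sketch. The abstract does not give the thesis's influence function, so everything specific here is an assumption made for illustration: the exponential decay with Manhattan distance, the `decay` parameter, the independence rule used to combine several stones, and the names `control_probability` and `thickness_value`. Opponent stones, the basic/added value split, and the genetic-algorithm tuning are left out.

```python
import math

def control_probability(point, stones, decay=2.0):
    """Chance that `point` is controlled by the given friendly stones.

    Assumption: each stone controls a point at Manhattan distance d with
    probability exp(-d / decay), and stones act independently.
    """
    p_uncontrolled = 1.0
    for sx, sy in stones:
        d = abs(point[0] - sx) + abs(point[1] - sy)
        p_uncontrolled *= 1.0 - math.exp(-d / decay)
    return 1.0 - p_uncontrolled

def thickness_value(stones, size=19, decay=2.0):
    """Sum of control probabilities over every empty point of the board."""
    occupied = set(stones)
    return sum(
        control_probability((x, y), stones, decay)
        for x in range(size)
        for y in range(size)
        if (x, y) not in occupied
    )

if __name__ == "__main__":
    wall = [(3, 3), (3, 4), (3, 5)]  # a short hypothetical wall
    print(round(thickness_value(wall), 2))
```

Under these assumptions, a longer wall scores higher than a single stone because nearby empty points receive influence from several stones at once. Fitting `decay` per strength level would stand in for the parameter optimization the abstract describes.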

【Abstract】 Cognitive science is a landmark emerging field of twentieth-century science. As a frontier discipline that explores how the human brain works, it has attracted wide attention around the world. Because Go is visually abstract and complex, computer Go has become an important tool in cognitive science research. In recent years, researchers have mostly addressed Go play with statistical algorithms that contain no Go knowledge, such as MC, UCT, and AMAF. This thesis instead analyzes the knowledge structure, forms of expression, and thinking models of Go players, and presents a series of computer Go solutions that simulate how human players think and quantify their fuzzy judgments. The work helps improve the strength of Go programs and also deepens the understanding of human cognitive ability, which gives it both innovative significance and practical value for cognitive science research. The contributions are as follows:

1. The liberties of groups are graded into levels. From these levels, the safety of a group can be judged, a capture target can be chosen, and candidate moves can be generated and ranked. The candidates are searched in order with the MTD(f) algorithm to find the correct capturing move. Experiments show that the method works well and can be used in actual computer Go play. It also lays the groundwork for determining the number of effective stones when the value of thickness is calculated.

2. The value of thickness is quantified. Instead of the traditional approach of counting fully controlled points, the thesis proposes that the influence of thickness resembles the diffusion of control probability in two-dimensional space, and that the sum of the control probabilities of all unoccupied points is the value of the thickness. A stone influence function and a mathematical model are built to compute this value. The value of thickness is divided into a basic value and an added value, so that the judgment of thickness can be adjusted dynamically for different playing strengths, simulating a human player's sense of thickness. A simple genetic algorithm optimizes the model parameters level by level, yielding thickness quantization models for the different strength levels. Experiments show that the model is highly accurate, so it can be used in the opening, middle game, and endgame of computer Go, and it provides the basis for the later program that quantifies the game situation.

3. A quantitative method for evaluating the game situation is presented, with winning probability as the quantified result. Simulating how human players judge a position, the method computes winning probability from the lead in points and the progress of the game, and optimizes the model parameters with a multi-level population competition-and-extinction algorithm. When the strength of the Go program changes, only the model parameters need to be adjusted, so the model applies to different strength levels and is portable and general. It obtained satisfactory results in experiments and lays the foundation for the later stage that selects and decides the best move. The approach of building a model from Go knowledge and fixing its parameters with a genetic algorithm can also be applied to other computer games.

4. A middle-game strategy for computer Go is presented, covering the generation, evaluation, and selection of moves. Candidate moves are first evaluated by their territory value, strategic value, shape value, and follow-up value. Moves are divided by purpose into attacking and defending moves. Using the winning probability, a weight for the intensity of attack and defense is calculated, and the evaluations of attacking and defending moves are adjusted dynamically with it to find the best move. The method simulates the thinking process of human players while exploiting the computer's strength in calculation, and it combines dynamic analysis, static search, and a knowledge base.

5. CognitiveGo, a computer Go system, is developed to integrate the above. Before each move, CognitiveGo looks up possible moves in a pattern library, then judges the territory, thickness, and group strength of both sides according to its own playing strength. It then adjusts the goal of the next move based on that judgment. In this process, the pattern library influences which candidate moves are recommended, while the positional judgment influences switching between goals.

In conclusion, this thesis studies computer Go problems mainly by building models that simulate the thinking process of human players. The research methods and results have theoretical and practical value for improving computer game intelligence and advancing cognitive science.
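Item 1 names MTD(f) as the search used on candidate capture moves. As a hedged illustration of that algorithm only (not the thesis's capture search), the sketch below runs MTD(f) on a toy game tree. The tree encoding (nested lists with integer leaves scored for the root player), the path-keyed bounds table, and the function names are assumptions chosen for brevity; a real Go program would key the table on a position hash and generate moves from the board.

```python
import math

def alphabeta_with_memory(node, alpha, beta, maximizing, table, path=()):
    """Fail-soft alpha-beta that stores (lower, upper) value bounds per node."""
    lower, upper = table.get(path, (-math.inf, math.inf))
    if lower >= beta:
        return lower
    if upper <= alpha:
        return upper
    alpha, beta = max(alpha, lower), min(beta, upper)

    if isinstance(node, int):            # leaf: exact score
        value = node
    elif maximizing:
        value, a = -math.inf, alpha
        for i, child in enumerate(node):
            if value >= beta:            # beta cutoff
                break
            value = max(value, alphabeta_with_memory(
                child, a, beta, False, table, path + (i,)))
            a = max(a, value)
    else:
        value, b = math.inf, beta
        for i, child in enumerate(node):
            if value <= alpha:           # alpha cutoff
                break
            value = min(value, alphabeta_with_memory(
                child, alpha, b, True, table, path + (i,)))
            b = min(b, value)

    if value <= alpha:                   # fail low: value is an upper bound
        table[path] = (lower, value)
    elif value >= beta:                  # fail high: value is a lower bound
        table[path] = (value, upper)
    else:                                # exact value inside the window
        table[path] = (value, value)
    return value

def mtdf(root, first_guess=0):
    """Converge on the minimax value with repeated zero-window searches."""
    table = {}
    g = first_guess
    lower, upper = -math.inf, math.inf
    while lower < upper:
        beta = g + 1 if g == lower else g
        g = alphabeta_with_memory(root, beta - 1, beta, True, table)
        if g < beta:
            upper = g
        else:
            lower = g
    return g

if __name__ == "__main__":
    print(mtdf([[3, 5], [2, 9]]))  # prints 3
```

Each pass is a zero-window search that only answers whether the true value is at least `beta`. The bounds table lets later passes reuse earlier work, which is why MTD(f) depends on a good first guess and integer-valued evaluations.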
