分布估计学习算法研究

Research on Estimation of Distribution Learning Algorithms

【Author】 Fan Jiancong (樊建聪)

【Advisor】 Liang Yongquan (梁永全)

【Author information】 Shandong University of Science and Technology, Computer Software and Theory, 2010, PhD

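【Abstract (Chinese version)】 Machine learning is a discipline grounded in empirical data whose main concern is the design and development of algorithms that allow computers to evolve their behavior. The empirical data used in machine learning mainly comprise data streams and data stored in databases. Research on machine learning focuses on producing automatic learning behaviors, such as recognizing complex patterns and making intelligent decisions from empirical data. The estimation of distribution algorithm (usually abbreviated EDA) is a new research branch in the field of evolutionary computation. EDA replaces the search operators of classical evolutionary algorithms with a two-step process: estimating the distribution of the individuals in the population, then sampling from that distribution. The main goal of EDA is to dispense with operators (recombination, mutation, etc.) altogether and instead find the probability distribution of meaningful (best) individuals through explicit modelling. This thesis studies EDA-based learning algorithms together with their theory and applications. The main contributions are: (1) A general learning framework for estimation of distribution, FrEDL, is designed. FrEDL consists of four steps: initialization, estimation, evolution, and evaluation. Its probabilistic foundations and its properties in terms of algebra and algebraic systems are analyzed. (2) An EDA-based semi-supervised learning algorithm, EDA-SSL, is designed. EDA-SSL uses a small amount of labeled data to estimate the class distribution of a large amount of unlabeled data; that is, it uses samples of known class to estimate the overall density distribution of the data set. An analysis of EDA-SSL on low-dimensional data explains the basic process of probability-based evolutionary learning, and the algorithm is validated and analyzed on real high-dimensional data. (3) An EDA-based unsupervised clustering algorithm, EDA-USL, is designed. EDA-USL is described and analyzed in detail, drawing on techniques that improve its performance, such as the measurement methods used in clustering and correlation analysis of the data set's attributes. The algorithm is validated and analyzed on real data sets; the experimental results show that it is highly stable and also performs well in clustering accuracy. (4) A probability-based multi-agent niche competitive learning strategy is proposed, in which capture between agents is realized during competitive learning. EDA is used to solve the capture problem among multiple agents, in which multiple pursuers capture multiple evaders. Over repeated competitions, the performance of the evolutionary process of multi-agent competition is analyzed, showing that the EDA-based solution to the capture problem outperforms other methods in both running time and number of iterations.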

【Abstract】 Machine learning is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as sensor data or databases. A major focus of machine learning research is automatically learning to recognize complex patterns and make intelligent decisions based on data. The estimation of distribution algorithm (EDA) is a relatively new branch of evolutionary algorithms. EDA replaces search operators with two steps: estimating the distribution of the selected individuals, then sampling from that distribution. The aim of EDA is to avoid arbitrary operators (mutation, crossover) in favour of explicitly modelling and exploiting the distribution of promising individuals. This thesis studies EDA-based learning algorithms, their theory, and their applications. The main achievements are summarized as follows: (1) A general learning framework based on EDA (FrEDL) is designed from the perspective of evolutionary computation based on probability estimation. FrEDL consists of four steps: initialization, estimation, evolution, and evaluation. The explicit probabilistic basis of FrEDL is analyzed, and the mathematical properties of the implicit algebra and algebraic system underlying FrEDL are examined. (2) A semi-supervised learning algorithm based on EDA (EDA-SSL) is presented. EDA-SSL uses a few labeled data samples to estimate the class distributions of a large number of unlabeled data instances. EDA-SSL is compared with several classification algorithms, and also with genetic algorithms (GA), in terms of classification error rate. The experimental and analytical results show that EDA-SSL is better than or comparable to the other algorithms in classification accuracy. (3) An unsupervised clustering algorithm based on EDA (EDA-USL) is designed for analyzing data sets without labels. EDA-USL is described and analyzed in detail in terms of attribute measurement methods and methods for analyzing the correlation between attributes. EDA-USL is validated and analyzed on real-world data sets. The experimental results show that EDA-USL is highly stable and performs well in clustering accuracy. (4) The capture problem among multiple agents is solved by EDA. In the capture problem, several pursuers pursue several evaders along part of their trajectories. The performance of the probabilistic evolutionary process of the agents over repeated competitions is analyzed. The analysis shows that the EDA-based solution to the multi-agent capture problem is better than other methods in several respects.
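
The estimate-then-sample loop that the abstract attributes to EDA can be illustrated with a minimal sketch. The code below is not FrEDL, EDA-SSL, or EDA-USL from the thesis; it is a generic univariate EDA (in the style of UMDA) applied to a toy OneMax problem, and every name and parameter in it (`onemax`, `umda`, `pop_size`, `n_select`, the clamping bounds) is an assumption made for illustration only.

```python
import random


def onemax(bits):
    """Toy fitness function (assumed for illustration): count of ones."""
    return sum(bits)


def umda(n_bits=20, pop_size=60, n_select=20, generations=40, seed=0):
    """Generic univariate EDA sketch; not the thesis's FrEDL framework."""
    rng = random.Random(seed)
    # Initialization: start from uniform marginal probabilities.
    probs = [0.5] * n_bits
    best = None
    for _ in range(generations):
        # Sampling: draw a new population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Evaluation: rank individuals and keep the most promising ones.
        population.sort(key=onemax, reverse=True)
        selected = population[:n_select]
        if best is None or onemax(selected[0]) > onemax(best):
            best = selected[0]
        # Estimation: refit each marginal from the selected individuals.
        probs = [sum(ind[i] for ind in selected) / n_select
                 for i in range(n_bits)]
        # Clamp the marginals so no bit becomes permanently fixed.
        low, high = 1.0 / n_bits, 1.0 - 1.0 / n_bits
        probs = [min(max(p, low), high) for p in probs]
    return best, probs


if __name__ == "__main__":
    best, probs = umda()
    print("best fitness:", onemax(best))
```

No crossover or mutation operator appears anywhere in the loop: new candidates come only from sampling the fitted model, which is the property the abstract highlights. A model that captures dependencies between variables would replace the independent marginals used in this sketch.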
