Bayesian Network Structure Learning and Its Applications
【Author】 Hu Chunling
【Author Information】 Hefei University of Technology, Computer Application Technology, 2011, PhD
【摘要 (Abstract)】 Uncertainty is pervasive in the real world, and building effective models is the key to making correct decisions on uncertain problems. For the uncertain relationships among variables in a problem domain, Bayesian networks provide a compact, intuitive, and effective graphical representation. Efficient and stable learning algorithms are essential if Bayesian networks are to be applied in practice, and Bayesian network learning and its applications have long been active research topics in China and abroad. Building on a comprehensive review of the state of the art, this thesis addresses the two main problems currently facing structure learning, slow convergence and convergence to local optima, by studying structure learning with both complete and missing data, and further investigates applications of Bayesian networks to sensitivity analysis and frequent pattern mining. The main contents are as follows.

1. Bayesian network structure learning
① Structure learning with complete data: the Markov chain produced by MCMC sampling is ergodic and guaranteed to converge to a stationary distribution, so MCMC methods achieve good accuracy. The Metropolis-Hastings sampler (MHS) is one of the most widely used MCMC methods, but its sampling process mixes poorly and converges slowly. This thesis improves MHS in three respects, the initial values, the proposal distribution, and the sampling of network substructures, and proposes a structure learning algorithm, PCMHS, which runs multiple MH samplers simultaneously, building several parallel Markov chains that converge to a Boltzmann distribution. PCMHS first initializes all Markov chains from the mutual information between nodes; in each iteration it derives the proposal distribution for the next generation from the population sampled in parallel by the previous generation, and produces the next generation by sampling both arcs and substructures. Because PCMHS converges to the stationary distribution over network structures it attains good learning accuracy, and by making its initial and proposal distributions approximate the stationary distribution it markedly speeds up convergence. Experiments on benchmark datasets confirm that PCMHS clearly outperforms the classical MHS and PopMCMC algorithms in both learning efficiency and accuracy.
② Structure learning with missing data: when a large fraction of the data is missing, existing methods for learning Bayesian network structure from incomplete data suffer from low efficiency and a tendency to fall into local optima. This thesis develops BC-ISOR, which uses the Bound and Collapse method to learn the probability distributions of different variable subsets from the incomplete dataset and then learns the network structure by dependency analysis. For datasets with at most 30 attributes, BC-ISOR obtains the counts of all observed and possible instances in a single scan, so the cost of handling missing data is independent of the missing rate; it further reduces the number and order of conditional independence tests by using a heuristic cut-set search during structure learning and by orienting all edges before testing for redundant edges, and therefore performs well. Experiments on benchmark datasets show that the algorithm has good learning efficiency and accuracy.

2. Applications of Bayesian networks. The purpose of learning a Bayesian network is to apply it through Bayesian network inference.
① Sensitivity analysis of Bayesian networks: sensitivity analysis of Bayesian networks builds on junction tree inference and mainly comprises evidence importance analysis and parameter sensitivity analysis. The Shafer-Shenoy and Hugin algorithms use two different message propagation schemes on the junction tree; Hugin is the more efficient of the two, but when zero factors occur in the junction tree it cannot guarantee that sensitivity analysis can be carried out by local computation. To address this, the thesis introduces a zero-factor flag and a zero-factor handling mechanism into Hugin message propagation, yielding R-Hugin, an improved Hugin algorithm for sensitivity analysis, and demonstrates the correctness and effectiveness of R-Hugin both theoretically and experimentally.
② Frequent pattern discovery based on Bayesian networks: representing domain knowledge as a Bayesian network, the thesis proposes BN-EJTR, a method for computing the interestingness of frequent itemsets and frequent attribute sets and for pruning them, whose aim is to discover knowledge inconsistent with current domain knowledge and thereby address the interestingness and redundancy problems faced by frequent pattern mining. To meet the need for batch inference when computing interestingness, BN-EJTR provides a Bayesian network inference algorithm based on extended junction tree elimination that computes the support of large numbers of itemsets in the network, together with a pruning algorithm based on an interestingness threshold and topological interestingness. Experiments show that BN-EJTR has good time performance compared with methods of the same kind and prunes effectively; analysis confirms that the pruned frequent attribute sets and itemsets meet the interestingness requirement relative to the domain knowledge.
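To make the sampling idea behind the PCMHS description above concrete, the following is a minimal single-chain sketch of Metropolis-Hastings search over directed acyclic structures. It is a generic illustration, not the thesis's PCMHS: PCMHS runs multiple parallel chains with mutual-information initialization and substructure moves, whereas this sketch uses one chain, a toy three-node space, and an artificial score invented only so the example is self-contained.

```python
import itertools
import math
import random

def is_acyclic(n, edges):
    """Check a directed edge set over nodes 0..n-1 for acyclicity (Kahn's algorithm)."""
    indeg = {v: 0 for v in range(n)}
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    stack = [v for v in range(n) if indeg[v] == 0]
    seen = 0
    while stack:
        u = stack.pop()
        seen += 1
        for w in adj[u]:
            indeg[w] -= 1
            if indeg[w] == 0:
                stack.append(w)
    return seen == n

def neighbors(n, edges):
    """All acyclic structures reachable by adding, deleting, or reversing one arc."""
    out = []
    for u, v in itertools.permutations(range(n), 2):
        if (u, v) in edges:
            out.append(edges - {(u, v)})               # delete arc
            out.append((edges - {(u, v)}) | {(v, u)})  # reverse arc
        elif (v, u) not in edges:
            out.append(edges | {(u, v)})               # add arc
    return [g for g in out if is_acyclic(n, g)]

def mh_structure_sampler(n, score, iters=1500, temp=1.0, seed=1):
    """Metropolis-Hastings over DAGs: propose a single-arc move, accept with the
    Hastings-corrected Boltzmann ratio, and track the best structure seen."""
    rng = random.Random(seed)
    current = frozenset()
    best, best_score = current, score(current)
    for _ in range(iters):
        nbrs = neighbors(n, current)
        proposal = rng.choice(nbrs)
        ratio = math.exp((score(proposal) - score(current)) / temp)
        ratio *= len(nbrs) / len(neighbors(n, proposal))  # Hastings correction
        if rng.random() < min(1.0, ratio):
            current = proposal
        if score(current) > best_score:
            best, best_score = current, score(current)
    return best

# Toy target: score is the negated disagreement with a known arc set.
TRUE_ARCS = {(0, 1), (1, 2)}
toy_score = lambda g: -len(set(g) ^ TRUE_ARCS)
learned = mh_structure_sampler(3, toy_score)
```

With real data the toy score would be replaced by a decomposable network score (e.g. BIC) evaluated on the dataset, which is what makes the Boltzmann stationary distribution concentrate on well-fitting structures.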
【Abstract】 There are many types of uncertainty in the real world, and building effective uncertainty models is crucial for making correct decisions on uncertain problems. A Bayesian network provides a compact, intuitive, and effective graphical representation of the uncertain relationships among variables in application domains. Efficient and reliable structure learning algorithms are essential for Bayesian network applications, and both Bayesian networks and their applications have been hot research topics in China and abroad in recent years. After an extensive review of related work, this thesis addresses two major problems faced by existing structure learning algorithms, low convergence rates and convergence to local optima, by studying structure learning under two scenarios: when the data is complete and when there are missing data items. In addition, the thesis investigates applications of Bayesian networks in sensitivity analysis and frequent pattern mining. The main contributions of the thesis are as follows.

1. Bayesian network structure learning
① Bayesian network structure learning with complete data. Existing research has shown that Markov chain Monte Carlo (MCMC) is a stochastic simulation method that ensures ergodicity and reaches solutions with a long-run frequency equal to the Boltzmann distribution. The Metropolis-Hastings sampler (MHS) is the most frequently used MCMC method, but the Markov chain of MHS mixes poorly and converges slowly. This thesis improves the MHS algorithm in its initial structure, proposal distribution, and substructure sampling, and presents an improved algorithm, PCMHS, for learning Bayesian networks. PCMHS runs multiple Metropolis-Hastings samplers simultaneously, each a Markov chain simulating a system that converges to the Boltzmann distribution. PCMHS first initializes all Markov chains from the mutual information between nodes. In each iteration, the algorithm derives the proposal distribution for the next generation from the population produced by the parallel samplers, and generates the individuals of the next generation by edge sampling and substructure sampling. PCMHS converges to a stationary distribution and therefore attains good learning accuracy; in addition, by providing an initial distribution and a proposal distribution as close as possible to the stationary distribution, it improves the convergence rate significantly. Experimental results on benchmark datasets demonstrate that PCMHS outperforms the state-of-the-art algorithms MHS and PopMCMC in both learning accuracy and efficiency.
② Bayesian network structure learning with missing data. Existing methods for learning Bayesian networks from missing data tend to get stuck in local optima and learn inefficiently when a large percentage of the data is missing. To solve these problems, an algorithm, BC-ISOR, is proposed for learning Bayesian networks from datasets with missing values. BC-ISOR first estimates the probability distributions of variable subsets from the incomplete data using the Bound and Collapse method, and then learns the network structure by dependency analysis. When a dataset has no more than 30 attributes, BC-ISOR can obtain the counts of realistic instances and probable instances in one scan of the dataset, so its missing-data processing cost is independent of the missing rate. In addition, by using heuristics to search for cut-sets and by orienting all edges before removing redundant edges, BC-ISOR reduces the number and order of conditional independence tests, and thus achieves good learning performance. Experimental results on benchmark datasets show that BC-ISOR learns more efficiently and accurately than the well-known SEM algorithm.

2. Applications of Bayesian networks
① Sensitivity analysis of Bayesian networks.
Sensitivity analysis of Bayesian networks is based on junction tree inference and mainly includes evidence sensitivity analysis and parameter sensitivity analysis. The Shafer-Shenoy and Hugin algorithms provide two different message propagation schemes for junction-tree-based reasoning and analysis. Compared with the Shafer-Shenoy algorithm, the Hugin algorithm is more efficient, but in the case of zero division it cannot guarantee that sensitivity analysis can be performed by local computation. To overcome this limitation, a refined Hugin algorithm, R-Hugin, is proposed in this thesis for sensitivity analysis; it introduces a zero-division flag and a zero-division processing mechanism into the message propagation of the Hugin algorithm. The correctness and efficiency of R-Hugin are validated by both theoretical analysis and experiments.
② Frequent pattern discovery based on Bayesian networks. Representing background knowledge as a Bayesian network, this thesis presents BN-EJTR, a method for computing the interestingness of frequent itemsets and frequent attribute sets and for pruning them. BN-EJTR seeks to find knowledge inconsistent with the background knowledge, thereby addressing the uninterestingness and redundancy problems faced by frequent pattern mining. To meet the demand for batch inference when computing interestingness, BN-EJTR provides a reasoning algorithm based on extended junction tree elimination that computes the support of a large number of itemsets in a Bayesian network. In addition, BN-EJTR is equipped with a pruning mechanism based on an interestingness threshold and topological interestingness. Experimental results demonstrate that BN-EJTR performs well in running time compared with methods of the same kind and prunes effectively. Our analysis indicates that both the pruned frequent attribute sets and the pruned frequent itemsets are uninteresting with respect to the background knowledge.
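The support computation at the heart of BN-EJTR can be illustrated in miniature. The sketch below computes the support of an itemset as its marginal probability under a hypothetical three-variable chain network; it uses plain enumeration of the joint distribution rather than the extended junction tree elimination the thesis proposes, and the network structure and CPT numbers are invented for the example.

```python
import itertools

# Hypothetical chain network A -> B -> C with binary variables.
# All conditional probability tables are illustrative, not from the thesis.
P_A = {1: 0.3, 0: 0.7}
P_B_GIVEN_A = {1: {1: 0.8, 0: 0.2}, 0: {1: 0.2, 0: 0.8}}  # P_B_GIVEN_A[a][b]
P_C_GIVEN_B = {1: {1: 0.9, 0: 0.1}, 0: {1: 0.1, 0: 0.9}}  # P_C_GIVEN_B[b][c]

def joint(a, b, c):
    """Factorized joint probability P(A=a, B=b, C=c) of the chain network."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_B[b][c]

def support(itemset):
    """Support of an itemset such as {'B': 1, 'C': 1}: the marginal probability
    of its attribute-value pairs under the network, by full enumeration."""
    total = 0.0
    for a, b, c in itertools.product((0, 1), repeat=3):
        assignment = {'A': a, 'B': b, 'C': c}
        if all(assignment[attr] == val for attr, val in itemset.items()):
            total += joint(a, b, c)
    return total

sup = support({'B': 1, 'C': 1})  # = P(B=1) * P(C=1 | B=1) = 0.38 * 0.9
```

Enumeration is exponential in the number of variables; replacing it with junction-tree-based elimination, as BN-EJTR does, is what makes support queries for large batches of itemsets feasible.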
【Key words】 Bayesian Network; Structure Learning; Stochastic Sampling; Join Tree; Frequent Pattern
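The zero-division problem that R-Hugin addresses arises in the Hugin absorption step, where a clique potential is multiplied by the elementwise ratio of the new to the old separator potential. The sketch below shows one guarded absorption step using the convention 0/0 := 0 together with a flag reporting that a zero factor was met; the table layout and flag handling are illustrative assumptions in the spirit of the abstract, not the thesis's R-Hugin implementation.

```python
import numpy as np

def hugin_absorb(clique_pot, sep_old, sep_new, axis):
    """One Hugin absorption step: multiply a clique potential by the ratio
    sep_new / sep_old of the separator potential, broadcast along `axis`.
    In a consistent junction tree, wherever the old separator entry is zero
    the new entry is zero as well, so the ratio 0/0 is taken as 0.
    Returns the updated potential and a flag saying whether any zero factor
    was encountered (so a caller can fall back to a global recomputation)."""
    zero = (sep_old == 0)
    safe_old = np.where(zero, 1.0, sep_old)          # avoid dividing by zero
    ratio = np.where(zero, 0.0, sep_new / safe_old)  # 0/0 := 0 convention
    shape = [1] * clique_pot.ndim
    shape[axis] = ratio.size
    return clique_pot * ratio.reshape(shape), bool(zero.any())

pot = np.ones((2, 2))
updated, saw_zero = hugin_absorb(pot, np.array([2.0, 0.0]), np.array([4.0, 0.0]), axis=0)
```

The flag is the key design point: plain Hugin propagation silently loses the information needed for local sensitivity computations once a zero appears, whereas recording the zero factor lets the analysis detect and handle that case explicitly.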