基于支持向量机的集成学习研究

Ensemble Learning Based on Support Vector Machines

【Author】 Li Ye (李烨)

【Supervisor】 Xu Xiaoming (许晓鸣)

【Author Information】 Shanghai Jiao Tong University, Control Theory and Control Engineering, 2007, PhD

【Abstract】 The support vector machine (SVM) is a machine learning method built on statistical learning theory. Owing to its good generalization performance, it has been applied successfully in many fields. Nevertheless, several shortcomings remain in practice. First, approximation algorithms must be adopted to reduce the time and space complexity of solving the underlying optimization problem. Second, the kernel function is usually chosen by experience and the classifier parameters are tuned by cross validation; neither guarantees optimality, yet no better solution is available so far. Third, the SVM is in essence a binary classifier and has to be extended to handle multi-class problems, but whether several binary SVMs are combined or all classes are considered in a single optimization problem, the resulting performance is less impressive than in the binary case, and some of the methods are overly complex to implement. These defects degrade the stability and generalization ability of SVMs.

By training and combining a set of accurate yet diverse classifiers, ensemble learning offers a new route to better generalization for classification systems and has become one of the main research directions in machine learning over the past decade. Research on ensembles of neural networks, decision trees, and similar base classifiers has made great progress in China and abroad, whereas work on SVM ensembles started relatively late and still requires much further study. This dissertation therefore develops effective ensemble learning methods for SVMs. The main contributions are:

1) The principle and algorithms of the SVM, together with its multi-class extensions, are introduced; the general methods of ensemble learning are summarized in detail from the two aspects of constructing and combining base classifiers; and the current state of research on SVM ensembles, in China and abroad, is reviewed.

2) An ensemble learning method based on attribute reduction is proposed (see the first code sketch below). Attribute reduction from rough set theory can serve as a preprocessing step that removes redundant data before learning, but owing to data noise and discretization it often lowers the classification performance of SVMs and similar learners. Reducing a decision table that contains redundant attributes yields multiple distinct reducts; these attribute subsets usually retain good discriminative power while differing from one another, so they can be used to construct an SVM ensemble. The reduct-based ensemble fuses the complementary and redundant information in the training data and thereby counteracts the harmful influence of data noise and discretization on SVM performance.

3) A method for constructing base classifiers by attribute discretization is proposed, with three possible implementation strategies: choosing cut points at random; using a single discretization algorithm with different numbers of cut points; or using several discretization algorithms to obtain different cut-point sets. This work follows the first strategy and initially builds SVM ensembles with the RSBRA discretization algorithm (see the second sketch below). Because RSBRA may degrade SVM performance considerably, the level of consistency from rough set theory is introduced to modify RSBRA so that the discretized data preserve enough classification information, and an ensemble learning method based on the modified algorithm is then proposed.

4) Current search-based ensemble learning methods need performance measures to evaluate base classifiers, but such measures either fail to strike a reasonable balance between accuracy and diversity or do not directly reflect the generalization performance of the ensemble. To address this, a direct genetic ensemble method is proposed that uses a genetic algorithm to search the space of ensembles for one with good classification performance (see the third sketch below). The method readily yields selective ensembles, and the study shows that it achieves better classification results than the traditional ensemble methods Bagging and AdaBoost while combining fewer classifiers.

5) Classifier combination architectures for multi-class problems are studied, and a simplified architecture is proposed that overcomes the defects of existing ones by avoiding unnecessary information loss during classifier combination. Within this architecture, measurement-level combination based on evidence theory is investigated: basic probability assignment functions are defined from the posterior probability outputs and classification accuracies of the SVMs and then combined by a fusion rule (see the fourth and fifth sketches below). In particular, when the one-against-one multi-class extension is used, the evidence may conflict heavily and the classic Dempster combination rule no longer applies. A new combination rule is therefore proposed on the premise that conflicting information is partly usable: the usable part of the conflict is determined from the global effectiveness of the whole body of evidence and then redistributed among the focal elements according to the weighted average of the basic probability assignments, which effectively resolves the evidence-conflict problem.
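
To make contribution 2) concrete, the following is a minimal sketch of a reduct-based SVM ensemble, assuming scikit-learn. The attribute subsets standing in for rough-set reducts are invented for illustration; in the thesis they would come from attribute reduction of the decision table.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                               n_redundant=4, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Hypothetical reducts: distinct, individually discriminative attribute
    # subsets (placeholders for the output of rough-set reduction).
    reducts = [[0, 1, 2, 5], [1, 3, 4, 6, 8], [0, 2, 7, 9, 10]]

    # One base SVM per reduct, each seeing only its own attribute subset.
    members = [(r, SVC(kernel="rbf", gamma="scale").fit(X_tr[:, r], y_tr))
               for r in reducts]

    # Majority vote over the members' crisp predictions.
    votes = np.stack([clf.predict(X_te[:, r]) for r, clf in members])
    y_pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
    print("ensemble accuracy:", (y_pred == y_te).mean())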
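
For contribution 3), this sketch follows the first strategy (random cut points), with a consistency check in the spirit of the rough-set level of consistency used to modify RSBRA. The RSBRA algorithm itself is not reproduced, and the 0.95 threshold is a placeholder.

    import numpy as np
    from collections import defaultdict
    from sklearn.svm import SVC

    def discretize(X, cuts):
        # cuts: one sorted array of thresholds per feature; values map to bin indices
        return np.stack([np.searchsorted(c, X[:, j]) for j, c in enumerate(cuts)],
                        axis=1)

    def consistency(Xd, y):
        # Fraction of objects whose discretized attribute vector occurs with a
        # single class only (the positive region of the decision table).
        labels, counts = defaultdict(set), defaultdict(int)
        for row, lab in zip(map(tuple, Xd), y):
            labels[row].add(lab)
            counts[row] += 1
        return sum(n for r, n in counts.items() if len(labels[r]) == 1) / len(y)

    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)

    members = []
    while len(members) < 7:
        cuts = [np.sort(rng.uniform(X[:, j].min(), X[:, j].max(), size=3))
                for j in range(X.shape[1])]
        Xd = discretize(X, cuts)
        if consistency(Xd, y) < 0.95:   # reject cut sets that lose too much
            continue                    # class information (placeholder threshold)
        members.append((cuts, SVC().fit(Xd, y)))
    print("trained", len(members), "base classifiers on different discretizations")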
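
Contribution 4) can be pictured as a genetic algorithm searching over binary membership masks, with the voted validation accuracy of the selected members as fitness, so the search optimizes ensemble-level performance directly. The encoding and operators below are generic GA stand-ins, not the thesis's exact design.

    import numpy as np

    rng = np.random.default_rng(1)

    def fitness(mask, votes, y_val):
        # Voted validation accuracy of the member subset selected by `mask`.
        if mask.sum() == 0:
            return 0.0
        sel = votes[mask.astype(bool)]
        maj = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, sel)
        return float((maj == y_val).mean())

    def genetic_select(votes, y_val, pop=30, gens=40, p_mut=0.05):
        n = votes.shape[0]
        P = rng.integers(0, 2, size=(pop, n))          # random initial masks
        for _ in range(gens):
            f = np.array([fitness(m, votes, y_val) for m in P])
            P = P[np.argsort(f)[::-1]][: pop // 2]     # truncation selection
            kids = []
            while len(P) + len(kids) < pop:
                a, b = P[rng.integers(len(P))], P[rng.integers(len(P))]
                cut = rng.integers(1, n)
                child = np.concatenate([a[:cut], b[cut:]])   # one-point crossover
                child[rng.random(n) < p_mut] ^= 1            # bit-flip mutation
                kids.append(child)
            P = np.vstack([P, kids])
        f = np.array([fitness(m, votes, y_val) for m in P])
        return P[f.argmax()]

    # Toy stand-in for a pool of 12 trained SVMs: each is right on a given
    # validation sample with probability 0.7.
    y_val = rng.integers(0, 2, size=50)
    votes = np.where(rng.random((12, 50)) < 0.7, y_val, 1 - y_val)

    best = genetic_select(votes, y_val)
    print("selected members:", np.flatnonzero(best))
    print("ensemble validation accuracy:", fitness(best, votes, y_val))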
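
For contribution 5), the measurement-level inputs are SVM posterior probabilities. In scikit-learn such posterior estimates can be obtained via Platt scaling with probability=True, as in this minimal example.

    from sklearn.datasets import make_classification
    from sklearn.svm import SVC

    X, y = make_classification(n_samples=300, n_features=8, n_informative=5,
                               n_classes=3, random_state=0)
    # probability=True fits Platt scaling on top of the SVM decision values,
    # giving per-class posterior estimates for measurement-level fusion.
    clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)
    print(clf.predict_proba(X[:3]))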
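
Finally, a toy illustration of the evidence-theory combination in contribution 5): basic probability assignments (BPAs) are built from each SVM's posterior output scaled by its accuracy, with the remaining mass on the whole frame, and the conflict mass is only partly discarded, the usable part being redistributed over focal elements in proportion to the averaged BPAs. The fixed usable fraction epsilon and the exact redistribution weights are assumptions; the thesis derives the usable part from the global effectiveness of the evidence.

    from itertools import product

    THETA = frozenset({"c1", "c2", "c3"})   # frame of discernment (three classes)

    def bpa_from_svm(posterior, accuracy):
        # Singleton masses are the posterior scaled by the classifier's
        # validation accuracy; the remainder goes to the whole frame (ignorance).
        m = {frozenset({c}): accuracy * p for c, p in posterior.items()}
        m[THETA] = 1.0 - sum(m.values())
        return m

    def combine(m1, m2, epsilon=0.5):
        # epsilon is the usable fraction of the conflict; fixed here, whereas
        # the thesis determines it from the global effectiveness of the evidence.
        fused, K = {}, 0.0
        for (A, a), (B, b) in product(m1.items(), m2.items()):
            C = A & B
            if C:
                fused[C] = fused.get(C, 0.0) + a * b
            else:
                K += a * b                   # mass committed to conflicting pairs
        # Redistribute the usable part of K in proportion to the averaged BPAs;
        # the unusable part stays uncommitted on the whole frame.
        avg = {}
        for m in (m1, m2):
            for A, a in m.items():
                avg[A] = avg.get(A, 0.0) + a / 2.0
        for A, q in avg.items():
            fused[A] = fused.get(A, 0.0) + epsilon * K * q
        fused[THETA] = fused.get(THETA, 0.0) + (1.0 - epsilon) * K
        return fused

    m1 = bpa_from_svm({"c1": 0.7, "c2": 0.2, "c3": 0.1}, accuracy=0.9)
    m2 = bpa_from_svm({"c1": 0.1, "c2": 0.8, "c3": 0.1}, accuracy=0.8)
    for A, v in sorted(combine(m1, m2).items(), key=lambda kv: -kv[1]):
        print(sorted(A), round(v, 3))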

  • 【CLC Number】 TP18
  • 【Cited By】 14
  • 【Downloads】 1579