
Research and Application of Robust Least Squares Support Vector Machines

【Author】 Liu Jingli (刘京礼)

【Supervisors】 Xu Weixuan (徐伟宣); Shi Yong (石勇)

【Author Information】 University of Science and Technology of China, Management Science and Engineering, 2010, PhD

【Abstract (translated from Chinese)】 Binary classification is an important problem in statistical learning theory, machine learning, and artificial intelligence. The support vector machine (SVM) builds a classifier using the structural risk minimization principle and the kernel method; the model is simple and its solution is unique. The least squares support vector machine (LS-SVM) takes the sum of squared errors as its objective, turning the quadratic program into a system of linear equations and avoiding the heavy computational cost of solving the SVM quadratic program. However, the equality constraints and the squared-error objective cost the LS-SVM solution its sparseness and reduce its robustness. Because of random and non-random processes, real-world data often carry noise and uncertainty, which degrade the performance of statistical learning classifiers, lowering both classification accuracy and the generalization ability of the model. Moreover, both SVM and LS-SVM use a fixed-norm objective function; such models cannot adapt well to diverse data structures, which weakens their adaptability. To strengthen the robustness and sparseness of the LS-SVM, improve its generalization ability, and let the model adjust itself to the data structure, this thesis carries out the following work:

1. A systematic review of methods in the literature for improving the robustness of SVM and LS-SVM, pointing out the problems and shortcomings of these improved models. This yields the main research question of the thesis: to substantially improve the original model with the goal of strengthening the sparseness, robustness, and interpretability of the LS-SVM, and to give effective LS-SVM-based binary classification models.

2. Addressing the causes of the LS-SVM's loss of sparseness and robustness, the thesis proposes using kernel principal component analysis (KPCA) to remove noisy features from the sample data and, drawing on earlier methods for enhancing LS-SVM sparseness, compresses the features, yielding a bi-level L1-norm LS-SVM model, KPCA-L1-LS-SVM. KPCA performs effective feature extraction, while the L1-norm objective suppresses the influence of noisy points on the model's generalization ability and makes the solution sparser, reducing computational complexity. Tests on simulated and benchmark data sets show that the method is effective.

3. In practical binary classification, noisy points or noisy features can make sample labels uncertain. A classification model should automatically distinguish the relatively important points from the samples heavily affected by noise, and exclude the latter when constructing the classification function. Fuzzy membership can describe this label uncertainty. Combining an L1-norm objective with fuzzy membership yields a sparse and robust LS-SVM-based classifier, fuzzy L1-LS-SVM. Tests on benchmark data sets show that this model likewise removes the influence of noisy points and has good interpretability.

4. Different samples play different roles in constructing the classification function: the more important the discriminative information a sample carries, the larger its contribution to the model. To distinguish these roles, samples containing important information can be given larger weights and those containing secondary information smaller weights; this weighting also removes the influence of noisy points and makes the model robust. Both SVM and LS-SVM use a fixed Lp norm in the objective, a prior-knowledge-based modeling choice that cannot adapt to complex and varied data structures. To fit the data better, the thesis proposes a re-weighted robust LS-SVM model, RW-Lp-LS-SVM. Tests on simulated data and UCI benchmark data sets show that the model is robust, sparse, and well interpretable.

5. Credit evaluation data sets are of a special type, with highly unbalanced classes. To examine the classification performance of the three proposed models, they are tested on three credit data sets; the results show that the models adapt well to the class imbalance of credit data and can serve as candidate models for credit risk evaluation.
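The key computational point above — that LS-SVM training reduces to one linear solve instead of a quadratic program — can be illustrated with a minimal NumPy sketch. The RBF kernel, the toy data, and the parameter values `gamma` and `sigma` are illustrative choices, not taken from the thesis:

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Solve the LS-SVM KKT conditions, which form the linear system
        [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1],
    where Omega_ij = y_i * y_j * K(x_i, x_j)."""
    n = len(y)
    Omega = np.outer(y, y) * rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)  # one linear solve, no QP
    return sol[1:], sol[0]         # alpha, b

def lssvm_predict(X_train, y, alpha, b, X_new, sigma=1.0):
    # Decision function: sign(sum_i alpha_i y_i K(x, x_i) + b)
    K = rbf_kernel(X_new, X_train, sigma)
    return np.sign(K @ (alpha * y) + b)

# Toy two-class problem (illustrative data)
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
alpha, b = lssvm_train(X, y)
pred = lssvm_predict(X, y, alpha, b, X)
print(pred)  # recovers the training labels on this separable toy set
```

Note that every training point receives a nonzero `alpha` here — exactly the loss of sparseness that the thesis sets out to repair.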

【Abstract】 Binary classification is a widely studied topic in statistical learning theory, machine learning, and artificial intelligence. The support vector machine (SVM) adopts the structural risk minimization principle and the kernel method; it is a simple quadratic program with a unique solution. The objective of the least squares support vector machine (LS-SVM) is a sum-of-squared-errors (SSE) term, so its solution is obtained by solving a system of linear equations, which is easier to compute. The drawback of the LS-SVM is that, because of the equality constraints and the SSE objective, sparseness is lost and the solution is less robust. Real data sets are often accompanied by noise and uncertainty arising from random and non-random processes. Noise and uncertainty can have a great impact on a classification model, reducing its accuracy and generalization ability. Both SVM and LS-SVM adopt a fixed-norm objective function, a modeling choice based on prior knowledge; it may not adapt to various kinds of data sets, which worsens generalization. This thesis focuses on improving the sparseness and robustness of the LS-SVM and increasing its generalization ability.

1. The thesis gives a systematic review of methods for improving the robustness of SVM and LS-SVM, and points out the drawbacks of the existing models. From this we derive the main research topics: how to obtain efficient binary classification models based on the LS-SVM, and how to improve their sparseness, robustness, and interpretability.

2. Addressing the limited robustness and sparseness of the LS-SVM, we propose using kernel principal component analysis (KPCA) to reduce the noisy features of the data and, building on earlier work on increasing LS-SVM sparseness, we give a bi-level L1-norm LS-SVM model, KPCA-L1-LS-SVM.
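The KPCA step just introduced projects the data onto the leading kernel principal components, so directions with small eigenvalues (often noise) are discarded. A self-contained NumPy sketch of plain KPCA follows; the RBF kernel, bandwidth, and component count are illustrative, and this is only the feature-extraction stage, not the thesis's full KPCA-L1-LS-SVM pipeline:

```python
import numpy as np

def kpca_features(X, n_components=2, sigma=1.0):
    """Project X onto its leading kernel principal components (RBF kernel)."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    # Double-centre the kernel matrix (centring in feature space)
    one = np.ones((n, n)) / n
    Kc = K - one @ K - K @ one + one @ K @ one
    # Eigendecompose; keep the largest n_components eigenpairs
    vals, vecs = np.linalg.eigh(Kc)
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    # Scale eigenvectors so that projecting gives sqrt(vals) * vecs
    alphas = vecs / np.sqrt(np.maximum(vals, 1e-12))
    return Kc @ alphas

X = np.random.RandomState(0).randn(20, 5)  # illustrative data
Z = kpca_features(X, n_components=3)
print(Z.shape)  # (20, 3): 5 original features compressed to 3 kernel PCs
```

The compressed features `Z` would then feed the downstream L1-norm classifier in place of the raw inputs.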
KPCA can efficiently extract features from the original feature set, and the L1 norm in the objective lets KPCA-L1-LS-SVM suppress the effect of noisy data on the model while reducing computational complexity. Tests on simulated and benchmark data sets demonstrate the efficiency of KPCA-L1-LS-SVM.

3. The presence of noisy data and noisy features makes sample labels uncertain in binary classification. An efficient classification model should automatically determine which data are relatively important and which are less so; the less important data should play a smaller role in constructing the separating hyperplane. The idea of fuzzy membership can be used to describe the uncertainty of the labels. By adopting fuzzy membership together with the L1 norm in the objective function, we propose a new model, fuzzy-L1-LS-SVM. Numerical tests show that it removes the impact of noisy data on the solution and has good interpretability.

4. Different samples play different roles in the construction of the decision function: the more important the information a sample contains, the more it should contribute to the separating plane. To differentiate these roles, the thesis proposes assigning heavy weights to the more important samples and small weights to the less important ones. The weighting also removes, to some extent, the negative impact of noise on the classification model, making it robust. The fixed Lp norm in the objectives of SVM and LS-SVM is not data-driven, which makes those models less suitable for various complex data structures. To adapt better to the data structure, a re-weighted robust LS-SVM model, RW-Lp-LS-SVM, is proposed. Tests on simulated data and UCI benchmark data sets show that the model is robust, sparse, and well interpretable.
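The fuzzy memberships of point 3 (and, similarly, the sample weights of point 4) are often derived from each sample's distance to its class centre, so probable outliers receive weights near zero. The sketch below shows one common heuristic of this kind; it is an illustrative scheme, not necessarily the exact membership function used in the thesis:

```python
import numpy as np

def fuzzy_memberships(X, y, delta=1e-3):
    """Membership s_i = 1 - d_i / (d_max + delta), where d_i is the distance
    of sample i to its class centre; the farthest point in each class gets a
    weight near zero by construction."""
    s = np.empty(len(y))
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        centre = X[idx].mean(axis=0)
        d = np.linalg.norm(X[idx] - centre, axis=1)
        s[idx] = 1.0 - d / (d.max() + delta)
    return s

# Illustrative data: the third point is an outlier inside class -1
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0],
              [1.0, 1.0], [1.1, 1.0], [0.95, 1.1]])
y = np.array([-1.0, -1.0, -1.0, 1.0, 1.0, 1.0])
s = fuzzy_memberships(X, y)
print(np.round(s, 3))  # the outlier's membership is close to 0
```

In a weighted LS-SVM these memberships multiply each sample's squared error in the objective, so noisy points barely influence the separating hyperplane.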
5. Credit evaluation data sets have a very special structure, with highly unbalanced classes. To demonstrate the efficiency of the three proposed models, we tested them on two UCI credit data sets and a credit data set from an anonymous American bank. The results show that the models handle this kind of unbalanced data well and can serve as an alternative tool in credit risk evaluation.
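On unbalanced credit data, plain accuracy is a misleading score: a classifier that labels every applicant "good" already looks accurate. Per-class rates and their geometric mean (G-mean) are commonly reported instead. A small illustrative sketch with hypothetical numbers (not results from the thesis):

```python
def imbalance_metrics(y_true, y_pred):
    """Confusion-matrix rates for labels in {+1 (bad risk), -1 (good risk)}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    sens = tp / (tp + fn) if tp + fn else 0.0   # bad-risk detection rate
    spec = tn / (tn + fp) if tn + fp else 0.0   # good-risk detection rate
    return {"accuracy": (tp + tn) / len(y_true),
            "sensitivity": sens,
            "specificity": spec,
            "g_mean": (sens * spec) ** 0.5}

# 90% good borrowers; a degenerate classifier that always predicts "good":
y_true = [-1] * 9 + [1]
y_pred = [-1] * 10
m = imbalance_metrics(y_true, y_pred)
print(m)  # accuracy 0.9, but g_mean 0.0: every bad risk is missed
```

A G-mean of zero exposes the failure that the 90% accuracy hides, which is why balanced metrics matter when comparing classifiers on credit data.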
