Study on Application of Machine Learning Based on Support Vector Machine

【Author】 Luo Yu

【Supervisor】 He Dake

【Author Information】 Southwest Jiaotong University, Traffic Information Engineering and Control, 2007, Ph.D.

【Abstract】 Over the last decade, the support vector machine (SVM), grounded in statistical learning theory, has become an important research direction in machine learning. Unlike traditional methods based on the empirical risk minimization principle, the SVM follows the structural risk minimization principle and therefore strikes a good balance between training error and classifier capacity; it offers global optimality, strong adaptability, and good generalization ability. Nevertheless, SVMs still face problems, such as long training times and the difficulty of selecting kernel parameters, which have become bottlenecks limiting their application. This dissertation focuses on these two issues, and the results are validated on several internationally used benchmark data sets. The main contributions are as follows:

1) A systematic study of SVM training methods. Current SVM training algorithms are typified by sequential minimal optimization (SMO), in which working-set selection is the key to the implementation. This dissertation surveys and reorganizes working-set selection strategies based on Zoutendijk's maximal descent direction method and on function approximation, and re-derives them with mathematical rigor. The analysis shows that when the Gram matrix of the quadratic programming problem is not positive definite, existing working-set selection algorithms have certain deficiencies.

2) Reduction of large-scale training sets. SVMs outperform other learning algorithms on small samples, but this does not mean they are limited to small-sample settings; most real-world problems involve large-scale data. Even with fast training algorithms such as SMO, training on large-scale sets remains too time-consuming to meet real-time requirements. Based on the geometric distribution of the support vectors, two methods are proposed for pre-selecting support vectors, one in the original input space and one in the high-dimensional feature space. The input-space method, inspired by the nearest-neighbor rule, combines the geometric distribution of the support vectors with a Delaunay triangulation to find a boundary set that contains them. The feature-space method, inspired by clustering, pre-selects support vectors using the class centroids of the samples. Experiments show that both pre-selection strategies are effective: they reduce training time sharply with essentially no loss of the SVM's generalization ability or prediction accuracy.

3) Model selection for SVMs. A kernel function maps samples from the input space into a high-dimensional feature space (a Hilbert space), where a linear discriminant hyperplane is sought. Different kernels correspond to different feature spaces, and SVM training results differ under different kernel mappings. By estimating the degree of linear separability of the mapped samples and the complexity of the model, this dissertation identifies feature spaces that give the learning machine good generalization ability, and uses this criterion to select the kernel. Once the feature space is fixed, the relationship between the penalty factor and the margin width is analyzed, and the penalty factor is chosen by means of the margin width. Rather than seeking an analytic expression relating the kernel function, the penalty factor, and generalization ability, the proposed method estimates the effect of the parameters on generalization indirectly and uses this estimate to guide model selection.

4) Practical applications of machine learning. For face recognition, an important machine learning problem, a recognition method based on key facial components is proposed. Because the one-against-rest multi-class classification scheme lacks a solid theoretical foundation, a multi-class algorithm is developed that takes the posterior probability as the SVM output and uses similarity as the discrimination criterion. Simulation experiments on the ORL and YALE face image databases show that the method is robust to variations in expression, pose, and viewing angle. The dissertation also studies a typical financial application of SVMs, personal credit evaluation, examining the practical effect of SVM-based feature selection and extraction methods (a genetic algorithm and principal component analysis). Empirical analysis shows that, on small credit data samples, the SVM is significantly more accurate and generalizes better than a BP neural network, and that the genetic-algorithm-based SVM enables banks to detect the key determinants of credit rating. These results are of practical significance for personal credit evaluation by Chinese banks.
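The working-set selection discussed in contribution 1) is commonly realized as a maximal-violating-pair rule over the SMO dual problem. The following minimal Python sketch illustrates that rule on a toy state; it is an illustration of the standard scheme, not a reproduction of the dissertation's own derivation (the index sets and gradient convention follow the usual dual formulation, assumed here).

```python
# Maximal-violating-pair working-set selection for SMO (toy sketch).
# Dual problem: min 0.5*a'Qa - e'a  s.t.  0 <= a_t <= C,  y'a = 0.
# I_up  = { t : (y_t = +1 and a_t < C) or (y_t = -1 and a_t > 0) }
# I_low = { t : (y_t = -1 and a_t < C) or (y_t = +1 and a_t > 0) }

def select_working_set(alpha, y, grad, C):
    """Return the maximal-violating pair (i, j), or None if KKT-optimal."""
    I_up = [t for t in range(len(alpha))
            if (y[t] == 1 and alpha[t] < C) or (y[t] == -1 and alpha[t] > 0)]
    I_low = [t for t in range(len(alpha))
             if (y[t] == -1 and alpha[t] < C) or (y[t] == 1 and alpha[t] > 0)]
    # i maximizes -y_t * grad_t over I_up; j minimizes it over I_low.
    i = max(I_up, key=lambda t: -y[t] * grad[t])
    j = min(I_low, key=lambda t: -y[t] * grad[t])
    if -y[i] * grad[i] <= -y[j] * grad[j]:  # no violating pair: KKT holds
        return None
    return i, j

# Toy example at the initial point alpha = 0, where grad = -e.
alpha = [0.0, 0.0, 0.0, 0.0]
y = [1, 1, -1, -1]
grad = [-1.0, -1.0, -1.0, -1.0]
pair = select_working_set(alpha, y, grad, C=1.0)  # a violating (i, j) pair
```

The selected pair (one index from each class here) is the pair of variables SMO optimizes analytically in the next iteration; the loop repeats until no violating pair remains.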
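The class-centroid pre-selection of contribution 2) can be sketched with a simple rule: samples lying closest to the opposite class's centroid are boundary candidates and are kept for training, while interior samples are discarded. This is a hypothetical simplification in the original input space with a plain Euclidean distance and an assumed `keep_ratio` parameter; the dissertation's criterion operates in the kernel-induced feature space and is not reproduced here.

```python
# Toy sketch of centroid-based pre-selection of likely support vectors:
# keep, from each class, the samples nearest the other class's centroid.

def centroid(points):
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def preselect(pos, neg, keep_ratio=0.5):
    """Keep the keep_ratio fraction of each class nearest the other centroid."""
    c_pos, c_neg = centroid(pos), centroid(neg)
    k_pos = max(1, int(len(pos) * keep_ratio))
    k_neg = max(1, int(len(neg) * keep_ratio))
    pos_kept = sorted(pos, key=lambda p: dist2(p, c_neg))[:k_pos]
    neg_kept = sorted(neg, key=lambda p: dist2(p, c_pos))[:k_neg]
    return pos_kept, neg_kept

# Two separable clusters: the kept points are those facing the other class.
pos = [[2.0, 2.0], [3.0, 3.0], [1.0, 1.0], [4.0, 4.0]]
neg = [[-2.0, -2.0], [-3.0, -3.0], [-1.0, -1.0], [-4.0, -4.0]]
pos_kept, neg_kept = preselect(pos, neg, keep_ratio=0.5)
```

Training the SVM only on the kept boundary candidates is what yields the reported reduction in training time, since the discarded interior points would not have become support vectors anyway.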
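The similarity-based multi-class decision of contribution 4) can be sketched as follows: each one-against-rest machine's decision value is mapped to an approximate posterior probability, and the class with the largest posterior wins. The sigmoid mapping below is a Platt-style stand-in with assumed parameters `A` and `B` (in practice they are fitted on held-out data); the dissertation's exact posterior model is not reproduced.

```python
import math

# Toy sketch of one-against-rest multi-class decision using posterior
# probabilities (similarity scores) as the SVM output.

def posterior(decision_value, A=-1.0, B=0.0):
    """Map an SVM decision value to an approximate class posterior."""
    return 1.0 / (1.0 + math.exp(A * decision_value + B))

def classify(decision_values):
    """Pick the class whose one-vs-rest posterior is largest."""
    probs = [posterior(d) for d in decision_values]
    return max(range(len(probs)), key=lambda k: probs[k]), probs

# Three one-vs-rest machines; class 1 has the largest decision value.
label, probs = classify([-0.8, 1.5, 0.2])
```

Comparing calibrated posteriors, rather than raw decision values on incomparable scales, is what gives the one-against-rest scheme a consistent discrimination criterion across the per-class machines.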
