
Research on Several Variants of the Support Vector Machine

【Author】 杜喆 (Du Zhe)

【Advisor】 刘三阳 (Liu Sanyang)

【Author information】 Xidian University, Applied Mathematics, 2009, PhD

【Abstract】 Statistical learning theory is a theoretical framework for machine learning from small samples. Over several decades of development it has grown into a relatively complete body of theory. The support vector machine (SVM) is a machine learning method built on this theory. Following the structural risk minimization (SRM) principle, it uses a kernel function to construct an optimal linear decision function in a high-dimensional feature space, which avoids the curse of dimensionality, reaches a global optimum, and gives good generalization ability. Because of this performance, SVM has been widely applied to pattern classification, function approximation, and density estimation. It has attracted growing attention from researchers, has seen major progress in theory, algorithms, and applications, and has become a hot topic in machine learning. To reduce the computational complexity of the standard SVM and to improve its training speed and generalization ability, this dissertation studies the theory and application of several SVM variants. The main contents are as follows.

1. The current state of SVM research is reviewed, followed by a brief introduction to the fundamentals of SVM.

2. A class of SVM variants based on the kernel minimum squared error (MSE) criterion is studied. First, a geometric interpretation of the least squares SVM (LSSVM) classifier is given. Second, the proximal SVM (PSVM), originally designed for classification, is extended to regression, yielding the proximal support vector regression machine, together with a fast algorithm based on Cholesky decomposition. The equivalence of the new model for classification and regression problems is also proved. Then, combining the advantages of LSSVM and PSVM, a new model called the direct support vector machine (DSVM) is proposed. It applies to both classification and regression, is simpler to solve, and trains quickly without loss of generalization ability. Compared with LSSVM, it strengthens the convexity of the problem and guarantees a globally optimal solution. Compared with PSVM, it removes the inconsistency between the linear and nonlinear models and is faster at test time. Finally, numerical experiments verify the feasibility and effectiveness of these results.

3. Fuzzy SVM (FSVM) is studied. First, the equivalence of FSVM and SVM with multiple penalty factors is proved theoretically, which provides the theoretical basis for treating the fuzzy memberships as model selection parameters to be determined adaptively. Second, for the design of fuzzy memberships, two new and more reasonable design methods are proposed: one based on the geometric relationship between the SVM separating hyperplane and the distribution of the samples, the other based on the nature of SVM classification. Numerical experiments verify the effectiveness of both methods and compare them with existing methods.

4. The generalization ability of fuzzy SVM with a quadratic (L2) loss function (L2-FSVM) is studied. First, the L2-FSVM model is given and proved equivalent to a hard-margin SVM with a new kernel function, in which the fuzzy memberships become parameters of the new kernel. Second, four methods for estimating the generalization ability of the hard-margin SVM are extended to L2-FSVM. Finally, theoretical analysis and numerical experiments systematically compare their estimation performance and identify the best generalization bound for L2-FSVM, laying the groundwork for later model selection.

5. Model selection for the dual-penalty-factor SVM with L2 loss function, and its application, are studied. The L2-loss SVM is applied to the identification of lesions in breast X-ray (mammography) images. Because the two classes of data are imbalanced, a different penalty factor is used for each class. Then, building on the preceding results, a method is given for determining the model parameters automatically by minimizing an upper bound on the generalization error. Experiments on mass detection and microcalcification cluster detection in mammograms show that the method is effective and that it outperforms other ways of setting the hyperparameters in terms of generalization ability.
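Item 2 of the abstract mentions a fast Cholesky-based algorithm for least-squares-type SVM regression but does not state the formulation. The following is a minimal sketch under an assumed model, not the dissertation's own algorithm: a bias-regularized least-squares kernel regression, min ½(‖w‖² + b²) + (C/2)Σξᵢ² subject to yᵢ − (w·φ(xᵢ) + b) = ξᵢ. Its optimality conditions reduce to the symmetric positive definite system (K + 11ᵀ + I/C)α = y, which a Cholesky factorization can solve. All function names, the RBF kernel, and the parameters `C` and `gamma` are illustrative assumptions.

```python
import math


def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as lists of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def cholesky_solve(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def fit(X, y, C=10.0, gamma=1.0):
    """Solve (K + 1 1^T + I/C) alpha = y for the assumed model."""
    n = len(X)
    A = [[rbf_kernel(X[i], X[j], gamma) + 1.0 + (1.0 / C if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return cholesky_solve(cholesky(A), y)


def predict(X, alpha, x, gamma=1.0):
    """f(x) = sum_i alpha_i * (k(x_i, x) + 1); the '+ 1' carries the bias b."""
    return sum(a * (rbf_kernel(xi, x, gamma) + 1.0) for a, xi in zip(alpha, X))


if __name__ == "__main__":
    X = [[0.0], [1.0], [2.0], [3.0]]
    y = [0.0, 1.0, 4.0, 9.0]
    alpha = fit(X, y, C=100.0, gamma=1.0)
    print([round(predict(X, alpha, xi), 3) for xi in X])
```

Because the bias term is regularized along with w, the system matrix is positive definite for any C > 0 (with a positive semidefinite kernel), so the factorization cannot break down; this is one way to read the abstract's remark about strengthened convexity, although the dissertation's exact model may differ.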
