Research on Parameter Optimization of Support Vector Machine (支持向量机参数优化问题的研究)

【Author】 胡俊 (Hu Jun)

【Advisor】 薛小平 (Xue Xiaoping)

【Author Information】 Harbin Institute of Technology, Fundamental Mathematics, 2009, Master's thesis

【Abstract】 As a data-mining technique, the support vector machine (SVM), developed within the framework of statistical learning theory, is a relatively new machine learning method with excellent performance, and it is regarded as one of the most popular and successful approaches in the field. When SVM is applied to practical problems, the first issue to be faced is the choice of model parameters, namely the penalty parameter and the kernel parameters. This choice directly determines the training efficiency and the performance of the classifier, and is therefore the central question in applying SVM. Based on optimization theory and using mathematical programming as the main tool, this thesis studies the parameter optimization problem of SVM in practical applications. The work is divided into three parts.

In the first part, the basic models and algorithms of SVM and the fuzzy one-class support vector machine (FOC-SVM) are analyzed within the framework of statistical learning theory. Combined with the principle of structural risk minimization, the influence of the penalty parameter and the kernel parameter on the performance of the classifier is analyzed.

In the second part, addressing the parameter optimization problem of SVM, different optimization and update rules are applied to the penalty parameter and the kernel parameter from different perspectives. First, a separation index is defined to describe the separation between the two classes of samples in a given data set, which yields an unconstrained optimization problem in the kernel parameter. Next, with the kernel parameter fixed at the optimal value obtained above, a constrained optimization problem in the penalty parameter is formulated from the viewpoint of the generalization ability of the SVM classifier. Finally, a genetic algorithm is used to solve these optimization problems; comparative numerical experiments against the grid method verify the effectiveness of the proposed approach.

In the third part, for the FOC-SVM model, the physical meaning of the penalty parameter is explained theoretically, its admissible range is given, and the relationship between SVM and FOC-SVM is analyzed. Based on the genetic algorithm, an unconstrained optimization problem in the kernel parameter and a constrained optimization problem in the penalty parameter are then established to determine the optimal parameter values. Numerical results demonstrate the advantages of the proposed parameter optimization methods.
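For orientation, the two parameters discussed in the first part can be read off the textbook soft-margin SVM primal with an RBF kernel, sketched below. This is only the standard formulation, not the thesis's FOC-SVM model or its specific derivations; C and γ are the conventional symbols for the penalty parameter and the kernel parameter.

```latex
% Textbook soft-margin SVM primal with an RBF kernel (standard form, shown
% for notation only): C is the penalty parameter, \gamma the kernel parameter.
\[
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_{i} \\
\text{s.t.}\quad & y_{i}\bigl(w^{\top}\phi(x_{i}) + b\bigr) \ge 1 - \xi_{i},
\qquad \xi_{i} \ge 0,\ \ i = 1,\dots,n,
\end{aligned}
\qquad
K(x_{i}, x_{j}) = \exp\!\bigl(-\gamma\,\lVert x_{i} - x_{j}\rVert^{2}\bigr)
\]
```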
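The second part compares a genetic algorithm with the grid method for selecting (C, γ). The abstract does not spell out the thesis's own objectives (the separation index for the kernel parameter and the constrained problem on the penalty parameter), so the minimal sketch below substitutes cross-validation accuracy as the fitness and uses a toy evolutionary search (truncation selection plus Gaussian mutation, no crossover); the dataset, search ranges, and population settings are illustrative assumptions, not the thesis's experimental setup.

```python
# Minimal sketch of the two selection strategies the abstract compares:
# grid search vs. an evolutionary-style search over (C, gamma).
# Cross-validation accuracy stands in for the thesis's own objectives.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Baseline: exhaustive grid search over a log-spaced lattice of (C, gamma).
grid = GridSearchCV(
    SVC(kernel="rbf"),
    {"C": np.logspace(-2, 3, 6), "gamma": np.logspace(-4, 1, 6)},
    cv=5,
)
grid.fit(X, y)
print("grid search :", grid.best_params_, grid.best_score_)

# Toy evolutionary search over log10(C) in [-2, 3] and log10(gamma) in [-4, 1].
rng = np.random.default_rng(0)

def fitness(ind):
    c, g = 10.0 ** ind
    return cross_val_score(SVC(kernel="rbf", C=c, gamma=g), X, y, cv=5).mean()

pop = rng.uniform([-2, -4], [3, 1], size=(10, 2))       # initial population
for _ in range(10):                                      # generations
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-4:]]               # keep the fittest
    children = parents[rng.integers(0, 4, size=(6,))] + rng.normal(0, 0.3, (6, 2))
    pop = np.vstack([parents, np.clip(children, [-2, -4], [3, 1])])

best = max(pop, key=fitness)
print("genetic algo:", {"C": 10.0 ** best[0], "gamma": 10.0 ** best[1]}, fitness(best))
```

The point of the comparison in the thesis is that the genetic algorithm searches a continuous parameter space rather than a fixed lattice; the grid call above is only the baseline against which it is measured.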

  • 【CLC Number】 TP18
  • 【Cited By】 11
  • 【Downloads】 591