
支持向量机算法的研究及应用

Research and Applications of Support Vector Machines

【Author】 Wang Fang

【Advisor】 Yang Huizhong

【Author information】 Jiangnan University, Detection Technology and Automation Devices, 2008, Master's thesis

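【Abstract (translated from the Chinese)】 The support vector machine (SVM), grounded in statistical learning theory, is a relatively new learning method. It adopts the structural risk minimization principle and is formulated as a convex quadratic optimization problem, which guarantees that the extremum found is the global optimum. As a result, it captures statistical regularities well and generalizes better when few samples are available, provides a framework for small-sample, nonlinear, and high-dimensional learning problems, and has helped solve many problems that other learning methods handle poorly. This thesis studies the theory and application of SVMs as follows. First, building on a detailed analysis of the SVM algorithm and its properties, and exploiting the sparsity of the SVM solution, it proposes a data-reduction SVM algorithm based on fuzzy kernel clustering. Using a nonlinear mapping and the kernel trick, the algorithm maps the data into a high-dimensional feature space and clusters them there with fuzzy kernel clustering in order to find the data lying close to the optimal separating surface. The training set is reduced to these data, which shrinks the SVM problem and speeds up learning without greatly affecting generalization. Experimental results confirm the feasibility and effectiveness of this data-reduction algorithm. Second, to further improve the generalization performance of SVMs, the thesis proposes an ε-insensitive support vector regression ensemble algorithm based on an improved Adaboost. The algorithm trains multiple support vector machines and coordinates their outputs according to a learning rule, which improves generalization. The method is applied to soft-sensor modeling of a quality index in the bisphenol A production process, and simulation results show that the ensemble algorithm is feasible and effective. Third, parameter selection is one of the important problems in SVM research. The parameters of support vector regression (SVR) strongly affect the generalization ability of the model, yet no mature theory currently guides their selection. To address this, the thesis proposes a kernel-parameter solution path algorithm based on the bisection method. As the parameter is updated, the optimal solution for the current parameter value is derived from the solution already obtained for a previous value, and the parameter value at which the objective function reaches its extremum is taken as the optimal parameter. Numerical functions and a practical application example show that the method quickly finds the parameter corresponding to the model with the best generalization ability.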

【Abstract】 The support vector machine (SVM) is a novel machine learning method based on statistical learning theory. It employs the structural risk minimization principle and is formulated as a convex quadratic programming problem, which guarantees that the extremum found is the global optimum. It can therefore extract statistical regularities from limited data and generalize well, and it provides a framework for small-sample, nonlinear, and high-dimensional problems that most traditional learning methods cannot solve. This thesis presents a series of studies on the theory and application of support vector machines. After a detailed analysis of SVM theory, and exploiting the sparseness of the SVM solution, a data reduction algorithm for SVMs based on fuzzy kernel clustering is proposed. Through a nonlinear mapping and the kernel trick, the data are mapped from the original space into a high-dimensional feature space and clustered there by the fuzzy kernel clustering algorithm. The data most likely to be support vectors can then be found in the sub-clusters located near the optimal classification hyperplane, so the training set for the SVM becomes much smaller. The training time is reduced greatly without substantially compromising generalization capability. Simulations show that the new method is effective. To further improve the generalization of SVMs, an ε-insensitive support vector regression ensemble algorithm based on an improved Adaboost is proposed. By training a series of support vector regressors and combining their outputs according to a learning rule, the algorithm improves regression performance. The proposed algorithm is applied to a soft-sensor model of a quality index in the bisphenol A production process, and simulations demonstrate that it is effective. Parameter selection is one of the most important issues in SVM research. Previous research shows that the generalization capacity of support vector regression (SVR) is strongly affected by its parameters, but there is so far little theory to guide their selection. A solution path algorithm with respect to the kernel parameter, based on the bisection method, is proposed. As the parameter is updated, the current solution is computed from an already obtained one, and the parameter value at which the objective function reaches its extremum is taken as the optimal one. Simulations using artificial and real data show that the algorithm can quickly find the model with the best generalization.
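
The fuzzy kernel clustering step described in the abstract relies on measuring distances in the feature space without computing the mapping explicitly. The following is a minimal sketch of that kernel-trick distance only, not the thesis's algorithm. It assumes a Gaussian (RBF) kernel, a fuzzifier `m = 2`, and a cluster center defined as the membership-weighted mean of the mapped points; none of these choices is stated in the abstract, and the function names `rbf_kernel` and `feature_space_sq_distance` are hypothetical.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def feature_space_sq_distance(x, points, memberships, m=2.0, gamma=1.0):
    """Squared feature-space distance from phi(x) to a fuzzy cluster center.

    Assumed center: v = sum_j w_j * phi(x_j) / sum_j w_j, with w_j = u_j ** m.
    Expanding ||phi(x) - v||^2 leaves only kernel evaluations:
        K(x, x) - 2 * sum_j w_j K(x, x_j) / W + sum_ij w_i w_j K(x_i, x_j) / W^2
    where W = sum_j w_j.
    """
    weights = [u ** m for u in memberships]
    total = sum(weights)
    if total <= 0:
        raise ValueError("memberships must contain at least one positive value")
    # Cross term between phi(x) and the center.
    cross = sum(w * rbf_kernel(x, p, gamma) for w, p in zip(weights, points)) / total
    # Squared norm of the center itself.
    center_norm = sum(
        wi * wj * rbf_kernel(pi, pj, gamma)
        for wi, pi in zip(weights, points)
        for wj, pj in zip(weights, points)
    ) / (total ** 2)
    return rbf_kernel(x, x, gamma) - 2.0 * cross + center_norm


# Illustrative data (made up): two points with high membership, one with low.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
memberships = [0.9, 0.8, 0.1]
near = feature_space_sq_distance((0.5, 0.0), points, memberships)
far = feature_space_sq_distance((5.0, 5.0), points, memberships)
```

In a data-reduction scheme of the kind the abstract outlines, distances like these would drive the clustering, and only samples from clusters judged to lie near the separating surface would be passed to the SVM solver. How the thesis makes that judgment is not given in the abstract.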

  • 【Online publication contributor】 Jiangnan University
  • 【Online publication year and issue】 2009, Issue 03