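Research and Application of Support Vector Machines Combined with Rough Sets

【Author】 Ye Man (叶蔓)

【Advisor】 Zhao Zhigang (赵志刚)

【Author information】 Qingdao University, Computer Application Technology, 2009, Master's degree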

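【Summary】 The support vector machine (SVM) is a statistical learning method, a new type of learning machine developed on the basis of statistical learning theory. SVM is currently regarded as a powerful tool for classification and regression problems, and it has become a new research focus in machine learning after neural networks. Grounded in the structural risk minimization principle and VC dimension theory, it seeks the best trade-off between model complexity and learning ability given limited sample information, in order to obtain the best generalization ability. The SVM is regarded as a sound advance over traditional classifiers and shows many distinctive advantages on small-sample, nonlinear, and high-dimensional machine learning problems. It is well known that using support vectors for linear or nonlinear programming has the advantage of global convergence; however, when an SVM is applied to multi-class problems the conversion process is cumbersome, the computational load is heavy, and a great deal of training time is required. To address this, a neighborhood-based SVM training algorithm is proposed: neighborhood computation is used to reduce the number of training samples, which saves training time and lowers the computational load. To reduce redundancy while preserving classification accuracy, rough set theory is also introduced into training: it is used to perform attribute reduction on the data, which further reduces the computation required to solve the SVM. Practical results demonstrate the effectiveness of the method. The main problems addressed in this thesis are: (1) The SVM was proposed for two-class classification, so a multi-class problem requires some conversion before the SVM can be applied. This thesis adopts a conversion that recasts a multi-class problem as a single two-class problem and improves the space mapping so that the within-class distance of the new classes is smaller and the between-class distance is larger, which improves the separability of the samples. The improvement is verified on UCI data sets by computing within-class and between-class scatter. (2) Combining the theories of rough sets and SVM, rough set theory is used to reduce the attributes of the data: while keeping the classification ability of the knowledge base unchanged, irrelevant or unimportant attributes are deleted according to their equivalence relations, which simplifies the decision table and reduces, to some extent, the computation and processing time needed to solve the SVM. Finally, attribute reduction is combined with the neighborhood concept and the support vector regression algorithm and applied to power system load forecasting, and a comparison with the traditional algorithm demonstrates the advantages of the improved algorithm.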
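The record does not say how the within-class and between-class scatter mentioned in point (1) are defined. As a minimal sketch, assuming the standard Fisher-style definitions (sum of squared distances of samples to their class mean, and class-size-weighted squared distances of class means to the overall mean), the two quantities could be computed as follows. The function name `scatter_traces` and the toy data are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def scatter_traces(samples, labels):
    """Return (within_class, between_class) scatter as scalar traces.

    Assumption: standard Fisher-style definitions, which may differ
    from the exact measures used in the thesis.
    """
    dim = len(samples[0])
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)

    def mean(rows):
        # Component-wise mean of a list of equal-length vectors.
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def sq_dist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    overall = mean(samples)
    within = 0.0
    between = 0.0
    for rows in groups.values():
        mu = mean(rows)
        within += sum(sq_dist(r, mu) for r in rows)
        between += len(rows) * sq_dist(mu, overall)
    return within, between


# Toy data: two well-separated classes in the plane.
points = [[0, 0], [0, 2], [4, 0], [4, 2]]
classes = [0, 0, 1, 1]
print(scatter_traces(points, classes))  # (4.0, 16.0)
```

Under these definitions, a mapping that makes classes easier to separate should lower the first value and raise the second.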

【Abstract】 The support vector machine (SVM) is a statistical learning method, a new type of learning machine built on statistical learning theory. SVM is now viewed as a powerful tool for solving classification and regression problems, and it has become a research focus in the field of machine learning after neural networks. On the basis of structural risk minimization and the VC dimension, it seeks the optimal trade-off between model complexity and learning ability under limited samples in order to achieve the best generalization. SVM is regarded as a good advance over traditional classifiers and shows many unique advantages in solving small-sample, nonlinear, and high-dimensional machine learning problems. It is well known that SVM has the advantage of global convergence, but the process becomes more complex when it handles multi-class problems, and the computation cost and training time increase. Therefore, an improved algorithm based on neighborhood theory is proposed to handle these problems. Rough set theory is also introduced to reduce the features of the data, so that the time complexity decreases. Practical results demonstrate the validity of the method. The main work of this thesis is as follows. First, SVM was originally a two-class classifier, so it needs a conversion when applied to a multi-class problem. The traditional conversion reconstructs the classes and the outputs of the SVM classifier, which improves the recognition rate. Based on this method, we propose an improved algorithm that changes the distribution of the new samples through a mapping, after which samples of the same class are more compact and samples of different classes are farther apart, which benefits classification. Within-class scatter and between-class scatter are then calculated on UCI data sets for validation. Second, rough sets (RS) are combined with SVM: rough set theory is used to reduce the features of the data by deleting irrelevant features according to the equivalence relation, so that the time complexity decreases. Finally, attribute reduction and neighborhood theory are combined with support vector regression; the improved algorithm is applied to electric power system load forecasting and compared with the traditional method to demonstrate its advantages.
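The attribute reduction step described above can be illustrated with classical rough set theory. The following is a minimal sketch, assuming a discrete decision table and a greedy search driven by the dependency degree (the fraction of objects whose equivalence class under the chosen attributes has a single decision value). The thesis may use a different reduction algorithm; the names `dependency` and `greedy_reduct` are hypothetical.

```python
from collections import defaultdict


def dependency(rows, decisions, attrs):
    """Dependency degree: share of rows in decision-consistent classes."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        # Rows with equal values on `attrs` are indiscernible.
        classes[tuple(row[a] for a in attrs)].append(i)
    consistent = sum(
        len(idx)
        for idx in classes.values()
        if len({decisions[i] for i in idx}) == 1
    )
    return consistent / len(rows)


def greedy_reduct(rows, decisions):
    """Greedy forward selection followed by a redundancy check.

    Assumption: a heuristic that returns one attribute subset preserving
    the dependency degree of the full attribute set, not necessarily a
    minimum-size reduct.
    """
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, decisions, all_attrs)
    chosen = []
    while dependency(rows, decisions, chosen) < target:
        best = max(
            (a for a in all_attrs if a not in chosen),
            key=lambda a: dependency(rows, decisions, chosen + [a]),
        )
        chosen.append(best)
    # Drop attributes made redundant by later additions.
    for a in list(chosen):
        rest = [b for b in chosen if b != a]
        if dependency(rows, decisions, rest) == target:
            chosen = rest
    return sorted(chosen)


# Toy decision table: the decision equals attribute 0; attribute 2 is constant.
table = [(0, 0, 5), (0, 1, 5), (1, 0, 5), (1, 1, 5)]
labels = [0, 0, 1, 1]
print(greedy_reduct(table, labels))  # [0]
```

The reduced attribute set would then be used to select the input columns before training the SVM or support vector regression model, which is where the saving in computation comes from.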

  • 【Online publication contributor】 Qingdao University
  • 【Online publication issue】 2009, Issue 11