基于1-SVM的多球体分类器理论及其应用研究

Research on 1-SVM Based Multi-sphere Classifier and Its Application

【Author】 Xu Lei

【Advisor】 Zhao Guangzhou

【Author information】 Zhejiang University, Control Theory and Control Engineering, 2008, PhD

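【Abstract (Chinese version)】 As important branch algorithms of the support vector machine, the one-class support vector machine and multi-sphere clustering algorithms have been applied successfully in unsupervised fields such as anomaly detection and cluster learning. Building on an in-depth study of one-class support vector machine theory, this thesis presents an active-set training algorithm for the one-class support vector machine, proposes a multi-sphere theoretical framework starting from an improved kernel clustering algorithm, and further fuses that framework with supervised classifiers to improve training speed and classification accuracy, so that it suits the current trend toward large-sample training and real-time decision making. The main contributions are:

(1) An active-set training algorithm for the one-class support vector machine is proposed, and a recursive method is introduced to solve the linear equations of the unconstrained optimization. The algorithm searches for the distribution of the support vectors at the optimum, does not need to approximate the objective function, avoids the use of a KKT tolerance, and yields an analytical optimal solution, which improves the training efficiency of the one-class support vector machine.

(2) To address the distance-parameter problem in kernel clustering based on the one-class support vector machine, a fuzzy kernel clustering algorithm is proposed. A fuzzy membership function with support-vector characteristics is defined to replace the distance parameter, and the shift of the cluster centers is suppressed by penalizing the weights of boundary samples, which avoids the parameter search without loss of robustness. A multi-sphere theoretical framework is also proposed on the basis of the kernel clustering algorithm.

(3) The multi-sphere framework is extended to supervised learning to construct a multi-sphere classifier, and a compact one-vs-rest classifier is introduced to separate the overlapping samples inside the spheres. A combined classifier is built as a weighted combination of these two complementary classifiers, and a cross-validation-based strategy for weight estimation and parameter search is given. Compared with the traditional one-vs-rest algorithm, the combined classifier significantly reduces training time and decision time and improves classification accuracy.

(4) To address the real-time performance of the pairwise coupling decision rule in the one-vs-one algorithm, the multi-sphere classifier is used to obtain the fuzzy memberships between a sample and the classes, and a pre-classification step selects the classes with higher memberships to take part in the decision, which significantly reduces the decision cost. Two pre-classification algorithms are given: fixed candidate-set size and K-means. The former fixes the number of classes that take part in the decision and balances decision time against classification accuracy by adjusting a tolerance parameter, gaining decision speed at the cost of some accuracy. The latter uses K-means clustering to find the classes with higher memberships and takes into account the fuzzy membership characteristics of each sample, so its classification accuracy shows no obvious decline.

(5) For a license plate recognition project, linear image transformations are used to enrich the license plate character sample library, and the classifiers based on the multi-sphere framework proposed in this thesis are applied to the license plate character recognition module. After comparative experiments, the one-vs-one pairwise coupling algorithm with K-means pre-classification was selected for the project.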

【Abstract】 As important algorithms derived from the support vector machine (SVM), the one-class SVM (1-SVM) and multi-sphere clustering have been applied successfully to novelty detection, clustering, and other unsupervised tasks. Building on a study of 1-SVM theory, this thesis proposes an active-set method for 1-SVM training and introduces a multi-sphere framework with fuzzy memberships to solve supervised classification problems. The research consists of the following parts:

(1) A recursive training method based on the active set is proposed for 1-SVM training. It focuses on the optimal distribution of the support vectors rather than on the convergence of the objective function, so an analytical solution is obtained without sensitivity to a KKT tolerance.

(2) To solve the problem of distance parameters in 1-SVM based clustering, an improved clustering algorithm based on the fuzzy 1-SVM (1-FSVM) is proposed. The distance parameters are replaced with fuzzy membership functions that have SVM characteristics, and the cluster center is prevented from being shifted by abnormal data, so robustness against irregularly distributed data is improved without an extra search over distance parameters. The multi-sphere framework is then derived from this kernel-based clustering algorithm.

(3) The multi-sphere classifier (MSC) is proposed by extending the multi-sphere framework to supervised learning, and a compact one-vs-rest (1VR) classifier is introduced to separate the mixed samples inside the spheres. These two complementary classifiers are combined into a weighted classifier. Cross validation is used to estimate the weights and to search for the optimal training parameters. Compared with the traditional 1VR classifier, the combined classifier achieves higher accuracy with less training time and decision time.

(4) To reduce the decision time of the one-vs-one (1V1) classifier with pairwise coupling (PWC), the MSC is used to obtain the fuzzy memberships between the samples and the candidate classes. Two pre-classification methods are introduced to select the subset of classes with higher fuzzy memberships for the subsequent PWC decision. The first method fixes the number of classes involved in the final decision and trades off decision time against accuracy by adjusting a tolerance parameter; it saves much of the decision cost at a small loss of accuracy. The second method uses K-means to obtain the cluster of classes with higher fuzzy memberships; because it accounts for the differences between samples, the accuracy remains almost the same.

(5) In the license plate recognition project, linear image transformations are used to generate virtual license plate character data. The multi-class classifiers based on the multi-sphere framework proposed in this thesis are applied to the on-line character recognition module of the project. After comparison, the 1V1 classifier with PWC and K-means based pre-classification is selected for the project.
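The fixed candidate-set pre-classification described in part (4) can be illustrated with a minimal Python sketch. It is not the thesis implementation and rests on several stated assumptions: each class sphere is a plain centroid and radius in input space rather than a kernel 1-SVM sphere, the membership function `exp(-distance / radius)` is hypothetical, each pairwise decision is a nearest-centroid stand-in for a trained binary SVM, and simple majority voting replaces pairwise coupling. All function names are illustrative.

```python
import numpy as np
from itertools import combinations

def fit_spheres(X, y):
    """One sphere per class: centroid and radius (largest training distance).
    Assumption: input-space spheres stand in for kernel 1-SVM spheres."""
    spheres = {}
    for c in np.unique(y):
        pts = X[y == c]
        center = pts.mean(axis=0)
        radius = np.linalg.norm(pts - center, axis=1).max()
        spheres[int(c)] = (center, max(float(radius), 1e-12))
    return spheres

def memberships(spheres, x):
    """Hypothetical fuzzy membership: 1 at the center, decaying with
    distance measured in units of the sphere radius."""
    return {c: float(np.exp(-np.linalg.norm(x - center) / radius))
            for c, (center, radius) in spheres.items()}

def preclassify_fixed(mem, k):
    """Keep the k classes with the highest membership (fixed candidate set)."""
    return sorted(mem, key=mem.get, reverse=True)[:k]

def pairwise_vote(spheres, x, candidates):
    """Majority vote over candidate pairs only.
    Assumption: nearest centroid replaces each trained pairwise SVM."""
    votes = {c: 0 for c in candidates}
    for a, b in combinations(candidates, 2):
        dist_a = np.linalg.norm(x - spheres[a][0])
        dist_b = np.linalg.norm(x - spheres[b][0])
        votes[a if dist_a <= dist_b else b] += 1
    return max(votes, key=votes.get)

def classify(spheres, x, k):
    """Pre-classify with sphere memberships, then decide among k candidates."""
    candidates = preclassify_fixed(memberships(spheres, x), k)
    return pairwise_vote(spheres, x, candidates)
```

With n classes, a full one-vs-one decision evaluates n(n-1)/2 pairwise classifiers, whereas the restricted decision evaluates k(k-1)/2. For four classes and k = 2, that is one pairwise decision instead of six, which is where the saving in decision time comes from; a smaller k is faster but risks excluding the true class from the candidate set.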

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year and issue】 2009, Issue 07