节点文献

核机器学习方法研究

Kernel Based Learning Machines

【作者】 周伟达

【导师】 焦李成;

【作者基本信息】 西安电子科技大学 , 电路与系统, 2003, 博士

【摘要】 从上世纪60年代始,人们开始研究基于数据的机器学习问题理论,直至上世纪九十年代,在Vapnik等人的努力下,基于数据的机器学习理论得到了长足的发展,形成了一门比较完善的统计学习理论,并在此基础上创建了一类全新的通用的有效的机器学习算法:支撑矢量机。统计学习理论的精髓在于引入了假设函数集容量控制的概念,学习机为了获得好的推广能力,需在假设函数集容量控制和最小化经验风险之间作一个好的折衷。在统计学习理论出现和完善之前,在机器学习中引入核函数,更广义地说就是引入非线性映射和非线性函数技术早已有之。但核函数真正在机器学习中获得成功应用始于支撑矢量机。其原因就是由于引入了非线性函数,使得学习机假设函数集太大,容易导致学习机的过拟合而降低推广能力。正是统计学习理论和核技术的结合,才触发了从上世纪九十年代中期开始的核机器的出现和快速成功的发展。目前主要的核机器技术包括支撑矢量机、核Fisher分类器和核主分量分析等。本论文的所有工作正是在上述结合点上展开,主要包括两大部份的内容:支撑矢量机算法分析和改进方面以及基于统计学习理论的新核机器算法方面。在支撑矢量机算法分析和改进方面,本论文主要作了以下四方面的工作:第一、分析了支撑矢量机的基本几何性质。我们针对模式识别和回归估计两类支撑矢量机,分别分析和证明了它们的一些基本几何性质,基于这些性质讨论了支撑矢量机对新增样本的推广能力,得到了一些非常有价值的结论。从这些结论可以看出支撑矢量机对新增样本具有良好的推广能力,并且支撑矢量机是一种可积累的学习模型。第二、提出了线性规划支撑矢量机。我们通过对统计学习理论中一些重要结论,特别是线性假设函数集VC维数的分析,得到了一类线性规划支撑矢量机。在线性规划支撑矢量机中,以对VC维数界作适当放宽为代价,从而降低支撑矢量机的求解复杂度。在该章最后对人工和实际样本的实验结果说明了线性规划支撑矢量机采用放宽VC界对学习机推广能力的影响是可以接受的,而在计算复杂度上明显优于原支撑矢量机。第三、提出了无约束规划回归估计支撑矢量机。当采用高斯损失函数时,我们提出了一种无约束支撑矢量机回归估计算法,并证明了该算法具有严格的凸性,不存在局部极小解。该算法较标准支撑矢量机而言,由于不存在线性约束,可以雷达信号处理重点实验室<WP=6>II核机器学习方法研究采用快速的多维搜索数值方法,如最陡下降法、Newton法和共轭梯度法等具有较快的优化速度,而且能够直接推广到复数域中。第四、提出了自适应支撑矢量机算法。通常无线通信信道具有时变性,要求多用户检测算法具有自适应性。我们提出了一种自适应支撑矢量机方法,并把它用于信道时变情况下的多用户检测。一方面由于支撑矢量机引入结构风险,使得支撑矢量机多用户检测的推广能力较好且对训练要求的样本数也大大下降;另一方面由于支撑矢量机的非线性特性可以比线性检测器更好地逼近最佳检测器。在新的基于统计学习理论的核机器方面,本论文主要作了以下四方面的工作。第一、提出了一种新的支撑矢量机模型选择准则。支撑矢量机模型选择由于其高度的非线性一直是一个非常困难的公开问题。我们通过对支撑矢量机推广能力的分析,提出了一种构造性的与样本分布有关的推广能力衡量准则。该准则与统计学习理论中的推广能力准则具有几何上的一致性,由样本的二阶统计量构成,比已有的完全不依赖于样本分布的推广能力上界更能反映学习过程的收敛性和收敛速率。较为重要的一点是该准则在学习过程之前是可处理的,所以它可以用作所有分类器中数据预处理的准则,同时也可以为支撑矢量机模型的选择提供依据。第二、提出了复值支撑矢量机算法。支撑矢量机由于采用了Vapnike-不敏感损失函数和数值优化算法,不能简单地推广到复数域。为了使支撑矢量机适用于复值样本的处理,我们发展了模式识别复值支撑矢量机和回归估计复值支撑矢量机。首先我们受到数字通信中相位调制方法的启示,定义了复平面上的N进制复值符号函数。然后基于所定义的复值符号函数提出并推导了复数域的二分和四分模式识别支撑矢量机。对于复数域的二分模式识别问题,我们证明了二分模式识别复值支撑矢量机与采用增维方法的实值支撑矢量机等效,因而它仅具数学意义。对于复数域的四分模式识别问题,四分模式识别复值支撑矢量机与数字通信中4-QAM解调决策完全一致,因此将具有良好的实用价值。我们进一步在模式识别复值支撑矢量机中通过引入复核函数及其对应的核函数组得到了非线性模式识别复值支撑矢量机,并讨论了几种典型的复核函数和核函数组。另一方面,严格地说复值样本的回归估计并不能简单地分解为分别对实部和虚部的回归估计。我们针对复值样本的回归估计提出了线性回归估计复值支撑矢量机,并类似模式识别复值支撑矢量机进一步通过引入复核函数以及对应的核函数组得到了非线性回归估计复值支撑?

【Abstract】 In the early of 1960s the theory of machine learning based on the databegan to be studied. It had not been well developed until Vanpik et al completed Statistical Learning Theory (SLT) and proposed a new general and efficient machine learning algorithm.Support Vector Machines (SVMs) in the 1990s. The concept of capacity control of a learning machine plays an important role in the SLT. It is necessary for a learning machine to achieve a good balance between theempirical risk and capacity control of it to obtain a good generalization performance. Before SLT was presented and accomplished, kernel functions had been introduced into machines learning. In other words, the techniques of nonlinear mapping and nonlinear function had been used in machine learning. However the first example of thesuccessful applications of kernel functions in machine learning is the SVMs. That is because the introduction of nonlinear functions into machines leaning makes the set ofhypothesis function without capacity control become very wide, thus the overfitting problem and the decrease the generalization occursineluctably. It is the combination of SLT and kernel method that spring the appearance of kernel machine and its rapid development. At present, the main kernel machine includes SVM, kernel Fisher classifier, kernel principal component analysis (PCA), etc. From the viewpoint of the combination of SLT and kernel method, this dissertation studied the analysis and the improvement of SVMs and new kernel machines. In the aspect of the analysis and the improvement of SVMs, four parts work are studiedas follows.1.Some geometry ofSVMs for classification and regression is described and proven. And then the generalization performance of SVMs on new-added samples is discussed. Through the analysis of the property of new-added samples and the effect of them on support vectors and non-support vectors, some valuable results are presented. These enable us to conclude that SVM has a good compatibility, adaptability and generalization performance for new-added samples and is a hereditable learning model.2.Based on the analysis of the conclusions in the statistical learning theory, especially the VC dimension of linear functions, linear programming SVMs are presented. In linear programming SVMs, the bound of the VC dimension is loosened 西安电子科技大学博士学位论文<WP=9>目录Vproperly. Simulation resultsfor both artificial and real data show the generalization performance of our method is a good approximation of SVMs and the computation complex is largely reduced by our method.3.An unconstrained convex quadratic programming for support vector regression (SVR) is proposed, in which Gaussian loss function is adopted.Due to no linear constraint some classical multi-dimensions optimization algorithms such as steepest descend method, Newton method, conjugate gradation method and so on can be used to solve the convex quadratic programming for SVR.Compared with standard SVR, this method has a fast training speed and can be generalized into the complex-valued field directly. Experimental results confirm the feasibility and the validity of our method.4.Generally the wireless channel varies with time. Therefore it is necessary for a multi-user detection algorithm to have acapacity ofadaptation. Adaptive SVMs are presented for multi-user detection.Structural Risk introduced in SVMs leads to the more generalization and less training samples to be required than the other learning models and the nonlinear SVMs can approximate optimum multi-user detector when the adaptive SVMs is used for multi-user detection. In the other aspect of new kernel machines based on SLT, Four algorithms are described. 1.A new constructiveprinciple, which depends on the distribution of examples, for measuring the generalization performance is proposed based on the analysis of the generalization performance of SVMs.Our principle is consistencyin geometry with that in statistical learning theory, composed of two-order statistic of samples and shows the convergence

  • 【分类号】TP181
  • 【被引频次】40
  • 【下载频次】1846
节点文献中: 

本文链接的文献网络图示:

本文的引文网络