
Research and Application of Machine Learning Methods Based on Tensor Data (基于张量数据的机器学习方法研究与应用)

Research and Application of Machine Learning Algorithm Based Tensor Representation

【Author】 杨兵 (Yang Bing)

【Advisor】 经玲 (Jing Ling)

【Author Information】 China Agricultural University, Operations Research and Management, 2014, Ph.D.

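【Summary】 In traditional machine learning, most classical algorithms are designed for data in vector space. In practice, however, many kinds of real data are better represented as tensors. Simply converting tensor data into vectors discards a large amount of structural information, and the learning results suffer accordingly. For this reason, machine learning methods based on tensor data have attracted considerable attention from researchers in recent years. Working with tensor data preserves its distinctive spatial structure, and tensor learning methods also keep the number of variables in the optimization problem under control, which helps overcome the overfitting that often arises in vector-based learning. Machine learning methods based on tensor data are now widely studied and applied, and have become a new research focus in data mining. This thesis studies learning from tensor data from the perspective of optimization, in particular the construction of new models for tensor learning problems and the corresponding optimization methods, and applies them to practical problems. The support vector machine is an effective optimization-based method for mining vector data; taking it as a starting point, this thesis builds new support tensor machine models, together with solution algorithms, for various data mining problems on tensor data. The main results are as follows:

1. A new tensor learning framework: the low-rank support tensor machine. Based on statistical learning theory, the thesis discusses the limitations of the classical support tensor machine and support vector machine models. It removes the rank-one restriction on the weight tensor in the classical support tensor machine and discusses a new low-rank mapping, which yields a new tensor learning framework, the low-rank support tensor machine.

2. Optimization algorithms for the low-rank support tensor machine: tensor gradient descent and the tensor two-step method. To solve the low-rank support tensor machine model, the thesis focuses on two tensor optimization algorithms built on different ideas. The tensor gradient descent algorithm computes the gradient with respect to all optimization variables together, which avoids the many alternating iterations of traditional iterative tensor algorithms and greatly improves solution speed. The tensor two-step method aims at a good approximate solution: it solves, in sequence, two sub-problems whose objective functions and feasible regions are both simpler, and so obtains an approximate solution of the original low-rank support tensor machine model.

3. A low-rank support tensor machine for imbalanced data classification. Building on the low-rank tensor learning idea proposed in the thesis, and by generalizing the classical twin support vector machine model, the thesis constructs a new model, LS-TNPPC, for imbalanced learning on tensor data. The new model enriches the data mining methods available for imbalanced classification, and it also shows that the low-rank tensor model is an effective way to extend traditional vector methods to tensors.

4. Kernel methods for tensor learning and the multi-label kernel support tensor machine. The thesis discusses in detail the principles that should be followed when applying kernel methods to tensor data and, based on the characteristics of tensor data, gives a kernel construction method applicable to tensors. Using this kernel method, it builds an optimization model for the multi-label problem in image scene classification, which has achieved some success on practical problems.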
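As a reading aid, the contrast between the rank-one restriction and its low-rank relaxation can be written out for second-order tensors (matrices). The notation here ($X$, $u_r$, $v_r$, $R$, $b$) is assumed for illustration and is not taken from the thesis. The first line is the standard rank-one support tensor machine decision function; the second is one natural way to relax it and may differ from the thesis's exact formulation.

```latex
% X \in \mathbb{R}^{m \times n} is one sample; u, u_r \in \mathbb{R}^{m}; v, v_r \in \mathbb{R}^{n}
\begin{align}
  % rank-one weight (classical STM)
  f_{\mathrm{STM}}(X) &= u^{\top} X v + b
                       = \langle u v^{\top},\, X \rangle + b \\
  % rank-R weight (low-rank relaxation; R is an assumed hyperparameter)
  f_{\mathrm{LR}}(X)  &= \Big\langle \sum_{r=1}^{R} u_r v_r^{\top},\, X \Big\rangle + b
                       = \sum_{r=1}^{R} u_r^{\top} X v_r + b
\end{align}
```

Under these assumptions a vectorized SVM needs $mn$ weights, the rank-one model needs $m + n$, and the rank-$R$ model needs $R(m + n)$, which is how a low-rank weight keeps the number of optimization variables small while being less restrictive than rank one.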

【Abstract】 In traditional machine learning research, most classical learning algorithms are based on the vector space model. In computer vision, however, many objects are naturally represented as tensors. In earlier work, tensors were typically flattened into vectors, which destroys the data structure and discards a great deal of useful structural information, such as spatial and temporal information. Recently, the advantages of tensor-based algorithms have attracted significant interest from the research community. Compared with vector representation, tensor representation helps overcome the overfitting problem in vector-based learning, and tensor learning algorithms are especially suited to small-sample-size problems. Tensor representation and tensor learning have therefore become a new research focus. This thesis studies tensor learning from the perspective of optimization, focusing on new models, new algorithms, and their applications. The Support Vector Machine (SVM) is a powerful tool for data mining and pattern recognition. In this thesis, SVM algorithms are extended to handle tensors, and new tensor models and algorithms are presented. The main contributions are as follows:

1. A new tensor learning framework, the low-rank Support Tensor Machine. Based on statistical learning theory, the thesis discusses the limitations of the classical Support Vector Machine (SVM) and Support Tensor Machine (STM). The rank-one restriction on the weight tensor is removed and a novel low-rank tensor projection is discussed, leading to a new tensor learning framework, the low-rank Support Tensor Machine (LR-STM).

2. Two novel tensor algorithms for solving the proposed LR-STM. The tensor gradient descent algorithm computes a descent direction for LR-STM using smoothing operations. It avoids the alternating process found in traditional tensor algorithms and solves the model directly and quickly. The tensor two-step algorithm divides the primal LR-STM problem into two sub-problems and combines their solutions to obtain an approximate solution of the LR-STM model.

3. The LS-TNPPC algorithm for imbalanced tensor data classification. Based on the idea of low-rank projection, the classical Twin-SVM algorithm is extended to the classification of imbalanced tensor data, giving a novel algorithm, LS-TNPPC. The new method achieves better prediction accuracy on standard test data, which shows that low-rank projection helps extend traditional vector algorithms to tensor data.

4. A tensor kernel method and the multi-label kernel support tensor machine. Based on the theory of kernel methods, the thesis discusses the application of kernel learning to tensor learning and presents a tensor kernel method. Using this kernel, a novel multi-label kernel support tensor machine is constructed. Experiments on real applications suggest that the method is both efficient and effective.
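The idea in item 2 of updating all factors from one gradient, rather than alternating between them, can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration, not the thesis's algorithm: the weight matrix is parameterized as `U @ V.T`, the squared hinge loss stands in for the unspecified "smoothing operations", and the function names `fit_lowrank_stm` and `predict_lowrank_stm` are hypothetical.

```python
import numpy as np

def fit_lowrank_stm(X, y, rank=2, C=1.0, lr=0.05, n_iter=300, seed=0):
    """Joint gradient descent for a rank-`rank` linear classifier on matrices.

    X: array of shape (N, m, n), one matrix per sample.
    y: array of shape (N,), labels in {-1, +1}.
    Objective (assumed): 0.5 * ||W||_F^2 + C * mean(max(0, 1 - y * f(X))^2),
    with W = U @ V.T and f(X) = <W, X> + b.
    """
    rng = np.random.default_rng(seed)
    N, m, n = X.shape
    # Small random start: U = V = 0 is a stationary point, so avoid it.
    U = 0.1 * rng.standard_normal((m, rank))
    V = 0.1 * rng.standard_normal((n, rank))
    b = 0.0
    history = []
    for _ in range(n_iter):
        W = U @ V.T
        scores = np.einsum("imn,mn->i", X, W) + b
        slack = np.maximum(0.0, 1.0 - y * scores)
        history.append(0.5 * np.sum(W ** 2) + C * np.mean(slack ** 2))
        # Derivative of the loss term with respect to each score.
        coef = -2.0 * C * slack * y / N
        # Gradient with respect to the full weight matrix W.
        G = W + np.einsum("i,imn->mn", coef, X)
        # Chain rule through W = U @ V.T; all factors are updated together.
        grad_U, grad_V, grad_b = G @ V, G.T @ U, coef.sum()
        U -= lr * grad_U
        V -= lr * grad_V
        b -= lr * grad_b
    return U, V, b, history

def predict_lowrank_stm(X, U, V, b):
    """Return labels in {-1, +1} for a batch of matrices X of shape (N, m, n)."""
    scores = np.einsum("imn,mn->i", X, U @ V.T) + b
    return np.where(scores >= 0, 1.0, -1.0)
```

Calling `fit_lowrank_stm(X, y, rank=2)` on a stack of labeled matrices returns the two factors, the bias, and the objective value per iteration. The step size and iteration count are untuned defaults, and the objective is non-convex in `U` and `V`, so this sketch makes no claim of reaching a global optimum.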
