节点文献

流形学习中的增量谱嵌入方法

Incremental Spectral Embedding Methods in Manifold Learning

【作者】 李厚森

【导师】 成礼智;

【作者基本信息】 国防科学技术大学 , 数学, 2010, 硕士

【摘要】 数据的维数约简是数学与计算机科学的新兴交叉研究方向.它所关注的问题是如何将高维数据表示在低维空间中,并由此发现其内在结构.特别地,流形学习方法在理解多维模式结构中取得了巨大成功.然而,大多数的流形学习算法以批处理的方式执行,进而无法高效地实现高维数据流的约简表示.为此,研究人员将目光投向了增量流形学习.本文在增量学习的背景下,讨论了流形学习中谱嵌入算法的处理新样本数据的拓展.具体而言,本文的主要贡献可归纳为以下两个方面.一方面,针对一类重要的流形学习方法——谱嵌入算法,本文给出了一个增量实现的一般框架.该框架不仅囊括了若干现有的增量算法,而且克服了现有算法每次只能处理一个样本的限制,提供了可以一次处理多个样本的新拓展.与此同时,我们给出了增量谱嵌入算法的收敛条件,分析了算法的收敛性.另一方面,利用增量谱嵌入算法的一般框架,我们给出了两种新的增量算法——增量HLLE算法和增量LSE算法,同时分析了这两个增量算法的计算复杂性.通过与原算法计算复杂度的对比,本文从理论上证实了增量算法的高效性.另外,在合成数据集和图像数据集上的数值实验验证了提出的增量算法的准确度和时效性.这些数值实验包括两个方面,一是维数约简,二是在维数约简基础上进行的模式分类.最后,我们对全文的工作进行了总结,并对将来的研究工作做了几点展望.

【Abstract】 Dimensionality reduction is a novel research direction, which integrates the knowl-edge from mathematics and computer science. It mainly focuses on the problem of repre-senting the original high dimensional data in a low dimensional space and discovering theintrinsic structure. Especially, manifold learning methods make a great progress in un-derstanding the structure of multidimensional patterns. Most of these methods, however,operate in a batch mode and cannot be effectively applied when data samples are collectedsequentially. Therefore,incrementalmanifoldlearninghasappearedinresearchers’sights.Under the background of incremental learning, this thesis has exclusively studiedout-of-sample extensions of spectral embedding algorithms in manifold learning. Moreconcretely, the main contributions of this thesis can be summarized into two aspects.On the one hand, a general incremental framework is proposed for spectral embed-ding algorithms, a significant category of manifold learning methods. Not only does thisframework unify several contributions in incremental learning, but also it extends the ex-isting methods to be able to deal with the cases when two or more observations comesimultaneously, conquering the limitation of tackling only one sample per time. Mean-while,wepresenttheconvergentconditionsofincrementalspectralembeddingalgorithmsand analyze their convergence performance.On the other hand, two novel incremental algorithms, consisting of the incrementalHLLE algorithm and the incremental LSE algorithm, are proposed by utilizing the generalframework as a tool. The complexity of computations of these two algorithms is theoret-ically analyzed. Compared with the batch methods, the two incremental methods show adesirable gift in terms of computational costs. Moreover, the efficiency and the accuracyof the proposed algorithms are evaluated on both synthetic and image dataset. All thenumerical simulations are designed from two views: one is dimensionality reduction andthe other is pattern classification based on reduced representation of original data.Finally, we sum up the research results in this thesis and describe several directionsof further research in the fields.

节点文献中: