节点文献

非监督的结构学习及其应用

Unsupervised Structural Learning and Its Applications

【作者】 陈远浩

【导师】 张宏江;

【作者基本信息】 中国科学技术大学 , 模式识别与智能系统, 2008, 博士

【摘要】 在机器学习领域中,数据的表示方式是其中的核心问题。传统的方法经常通过特征向量的方式将数据表示为高维空间中的点。特征向量的表示方式由于简单直观的特性得到广泛的研究。但是,近年来的一些研究表明,单一的特征向量表示很难描述数据的某些特性。因此,基于结构的数据表示方式已经成为研究人员关注的重点。本文的研究重点是通过非监督的结构方式学习数据的结构。由于数据结构空间的搜索是组合问题,会出现组合爆炸现象,因此如何通过近似途径快速地搜索数据的结构空间是非监督学习的重点。根据不同任务的特性,我们提出了不同非监督学习算法。在文本聚类任务中,我们提出了层次谱聚类算法来进行文本的层次聚类和语义树的生成。在图像的物体识别任务中,我们提出了结构推导算法和知识传播策略学习物体的图模型的结构和参数。论文主要研究内容与创新成果如下:1.我们提出基于概率文法和马尔可夫场的物体模型(Probabilistic Grammar-MarkovModel,PGMM)。PGMM模型的学习过程只需要极少量的监督信息,即PGMM模型的结构和参数都可以通过非监督的方式进行学习。关键点三元组被提出作为PGMM模型的基本组成单位。结构推导算法通过对关键点三元组的组合来生成复杂的模型结构。由于PGMM模型的结构采用了联合树的形式,允许动态规划算法的使用,因此PGMM模型可以快速推理和参数学习。实验结果证明,PGMM模型能处理在未知背景中的物体识别和定位。在学习和推理过程中,PGMM模型允许物体在2D范围内的任意变化(位置、旋转和尺寸)。由于概率文法模型的帮助,PGMM模型不但能够处理物体具有不同的形态,还能够处理由不同的物体类别构成的混合类数据。2.我们提出一种学习概率物体模型(Probabilistic Object Model,POM)的新方法。POM模型综合各种视觉特征,能够同时执行图像分类、图像分割和图像物体识别等多个视觉任务的能力。我们通过组合使用互补图像特征的基本的概率物体模型的方式来学习POM模型的结构。在模型的学习过程中,我们提出了知识传播策略。该策略允许一个基本概率物体模型为其它基本概率物体模型提供信息,并且指导它们的学习过程。知识传播策略显著地降低了训练过程对数据的要求,也提高了推理过程的速度。PGMM模型是POM模型中的一个组成部分。相对于PGMM模型,POM模型不仅保留了PGMM模型的所有优点,而且能够执行更多的视觉任务。同时,在图像分类任务中,POM模型也具有更高的性能。3.我们提出一种新颖的层次聚类算法,谱层次聚类算法(Spectral Hierarchi-calClustering,SHC)。SHC算法是基于谱图理论的层次聚类算法。它采用AMG(Algebraic Multi-Grid)数值计算方法,通过迭代地权重融合方式,自底向上地分层合并节点进行聚类。AMG数值计算方法的应用保证了算法能够得到近似全局最优解。实验证明了SHC算法在文本聚类算法中的性能。SHC算法最终得到的自然并且不规则的聚类结构也是其一大特性。基于博客标签的语义树生成实验证明了SHC算法的聚类结构的合理性。它使得用户浏览语义树更为方便自然。综上所述,本文提出新颖的非监督学习模型结构的算法,将它们应用于物体识别和文本聚类任务中,并通过实验证明它们的合理性和有效性。

【Abstract】 The representation of data is one of the key in machine learning.Traditional methods usually represent the data as points in high dimensional space via feature vector. Due to its simplicity,great progress and developments have been made in methods based on feature vector.However,recent research showed that feature vector can not describe some properties of data.Therefore,the representation based on structure attracts more and more attention of researchers.This dissertation focuses on the unsupervised structure learning.Because of the huge space of data structure,how to rapidly search the space by approximate method is the key problem of unsupervised structure learning.According to properties of the tasks,different algorithms are proposed. For document clustering,spectral hierarchical clustering algorithm is proposed to do hierarchical document clustering and blog tag taxonomy construction.For object recognition,structure induction and knowledge propagation are used to learn the structure and parameters of the graphical model of object.The main content and innovations in this dissertation are as follows:1.We introduce a Probabilistic Grammar-Markov Model(PGMM) which couples probabilistic context free grammars and Markov Random Fields.PGMM is a generative model defined over attributed features and is used to detect and classify objects in natural images.PGMM is designed so that it can perform rapid inference,parameter learning,and the more difficult task of structure induction. PGMM can deal with unknown 2D pose(position,orientation and scale) in both inference and learning different aspects of the model.The PGMM can be learnt in an unsupervised manner where the image can contain one objects of different object categories or even be pure background.We first study the weakly supervised case,where each image contains an example of the object category,and then generalize to less supervised cases.The experiments on a subset of the Caltech dataset show that our results are comparable with the current state of the art and our inference is performed in less than five seconds. 2.We present a method to learn probabilistic object models(POMs) with minimal supervision which can exploit different visual cues and perform tasks such as classification,segmentation,and recognition.We formulate this as a structure induction and learning task and our strategy is to learn and combine basic POMs that make use of complementary image cues.We describe a novel structure induction procedure which uses knowledge propagation to enable one POM to provide information to other POM and "teach them"(which greatly reduces the amount of supervision required for training and speeds up the inference). We give detailed experimental analysis on large datasets which show that the POMs is invariant to scale and rotation of the object(for learning and inference) and performs inference rapidly.In addition,the experimental results show that POMs can be applied to learn hybrid objects classes(i.e.when there are several objects and the identity of the object in each image is unknown).We emphasize that these models can match between different objects from the same category and hence enable object recognition.3.We present spectral hierarchical clustering(SHC),a novel hierarchical clustering algorithm.Spectral analysis on SHC is provided by spectral graph theory which is commonly used in flat clustering but novel for hierarchical clustering.SHC uses the numeric techniques of Algebraic Multi-Grid method to perform fine-tocoarse weighted aggregation recursively.We evaluate the proposed algorithm on a number of different benchmark datasets.The comparison results show that our algorithm performs much better than the state-of-the-art hierarchical clustering algorithms.SHC is applied to the application of blog tag taxonomy construction. The results demonstrate that SHC performs more consistently with human judgments than other methods.Moreover,the resulting natural irregular tag hierarchy obtained by SHC is easier for users to browse the structure and locate the tags of interest.

节点文献中: