节点文献
基于图的降维技术研究及应用
Research on Graph-based Dimensionality Reduction and Its Applications
【作者】 乔立山;
【导师】 陈松灿;
【作者基本信息】 南京航空航天大学 , 计算机应用技术, 2009, 博士
【摘要】 高维数据的涌现是模式识别面临的极大挑战,降维技术已成为处理高维数据,克服“维数灾难”的重要途径。研究表明多数降维方法可归结于图的构造及其嵌入方式。然而,现有许多典型的降维算法均依赖于人工预定义的近邻图,如局部保持投影(LPP)及其变体。虽然这类算法在很多实际问题中取得了良好性能,但存在诸如近邻参数选择、噪声敏感、判别力不足、无法自然地合并领域先验等一系列问题。本文围绕图的构建与优化对降维方法进行了研究,主要贡献有:(1)对全局保持和局部性保持(降维)策略的重新认识与评价。以几种典型的局部保持降维算法为例,通过与全局降维方法的对比,获得了一系列新的洞察(特别是,局部保持策略的不足)。进而,从图的构造角度分析了其深层原因,并给出了具体的改进策略和建议。这一方面,澄清了最近某些方法对局部和全局保持策略的误解,为模型选择提供了依据;另一方面,说明现有局部保持降维方法存在很大提升空间,成为本文研究工作的重要动机之一。(2)首次将稀疏表示引入图的构造,设计了稀疏保持投影(SPP)算法。由于采用全局策略构图,SPP在一定程度上克服了局部保持降维方法中近邻参数选择的困难;而SPP隐含的“近邻”通过l 1优化问题自动获取,很好地弥补了局部构图方法无视数据分布,所有样本使用同一近邻数的缺陷。另外,受益于稀疏表示自然的判别能力,SPP在人脸识别等问题上获得了较LPP等局部保持降维方法更优的性能。(3)提出了稀疏保持判别分析(SPDA)算法,并将其应用于单标号图像人脸识别问题。SPDA不仅是SPP的半监督推广,而且进一步将稀疏表示建图思想统一于贝叶斯学习的框架之下,使得先验知识能够自然地引入图的构造过程。此外,通过集成策略加速稀疏建图,设计了集成稀疏保持判别分析(enSPDA)算法。实验表明所提算法不仅较传统的半监督判别分析方法(如SDA)有效,并且需要更少的无标号样本。(4)提出了软局部保持投影(SLPP)方法。传统的局部保持降维技术中,近邻图起着至关重要的作用,但其构造依赖于人为定义,并独立于后续的降维过程。鉴于此,在LPP的基础上,提出了SLPP算法,将图的构造与投影学习整合于单个目标函数,通过交替优化,不仅使图学习过程简洁、高效、易于处理,并且获得了解析的,具有原则性指导意义的图更新公式。标准数据集上的实验表明了SLPP的有效性。(5)搭建了同时降维与图学习的统一框架。受SLPP的启发,提出了一个降维与图更新的同时学习框架,其思想可以应用于几乎所有基于图的降维技术。进一步,为验证此框架的可行性,基于此扩展了经典的LPP,提出了自助式局部保持投影(SdLPP)算法,并在数据可视化、聚类和分类等问题上验证了其有效性。
【Abstract】 The high dimensionality of data is one of the main challenges faced by the current pattern recognition techniques. Dimensionality reduction (DR) has become an important tool to handle high-dimensional data and overcome the“curse of dimensionality”. Recent researches showed that most of the DR algorithms can generally reduce to graph construction and its embedding manners. However, many existing graph-based DR methods rely on artificially pre-defined neighborhood graph, e.g., locality preserving projections algorithm and its variants. Despits their success in many practical applications, those algorithms usually suffer from some limitations such as neighborhood parameter selection, sensitivity to noices, insufficient discriminative power and the difficulty of incorporating prior knowledge. This paper is a research on graph construction and optimation, especially for dimensionality reduction task. The main contributions include:(1) The revisit about globality and locality preserving strategies. In particular, we compare several popular locality-oriented DR algorithms with classical globality-oriented DR ones, and obtain a series of new insights (especially some undesirable characteristics involved in locality preserving strategy). Furthermore, we analyze the reasons for the empirical results from the viewpoint of graph construction, and provide specific tricks and suggestions to address such issues. On one hand, these studies clarify the current misunderstanding involved in locality-oriented methods; on the other hand, these show that there exists large space to improve the current DR methods, and become the important motivations of our following study works.(2) The first attempt to construct graph by sparse representation and the design of Sparsity Preserving Projections (SPP) algorithm. By virtue of global strategy for graph construction, SPP alleviates to a certain extent the difficulty of neighborhood parameter involved in locality preserving methods; SPP captures the“neighbors”by l 1-minimization problem potentially and automatically, instead of artificially predefinition which assigns the same neighborhood size for each sample. In addition, SPP benefits from the natural discriminative power of sparse respresentation, and thus achieves better performance than some popular locality preserving DR algorithms for face recognition application.(3) Proposing Sparsity Preserving Discriminant Analysis (SPDA) method and applying it to single labeled training image face recognition problem. Not only does SPDA extend SPP to semi-supervised version, but also unifies sparse graph construction under Bayesian learning framework which facilitates the incorporation of prior knowledge. In addition, we attempt to speed up SPDA by ensemble strategy and design ensemble SPDA algorithm. The experiments on publicly available data sets show that the proposed algorithms achieve better performance than their competitors such as SDA, and tend to work well just resoring to very few extra unlabeled samples.(4) Presenting Soft Locality Preserving Projections (SLPP) method. It is well known that graph plays an important role in typical locality-based DR techniques. However, the graph construction involved in these methods relies on artificial predefinition and is independent of subsequent DR step. To address this issue, we design SLPP algorithm based on LPP, which integrates graph construction with projection learning into one single objective function. With alternating iterative optimization, we obtain a principle way of graph construction. The feasibility and effectiveness of SLPP are verified on several standard data sets with promising results.(5) A unified framework for simultaneous dimesionality reduction and graph learning. Motivated by SLPP, we establish a simultaneous dimensionality reduction and graph updating framework, which is very general and applicable to most of the current graph-based DR techniques. To verify the feasibility and effectiveness of such framework, we further extend the typical LPP and develop Self-dependent LPP (SdLPP) algorithm. The effectiveness of the proposed algorithm is validated by the experiments including data visulization, clustering and face recognition on publicly avaible data sets.