Incremental Learning and Its Applications to Image Recognition

【Author】 Li Jing (李敬)

【Advisor】 Lu Bao-Liang (吕宝粮)

【Author information】 Shanghai Jiao Tong University, Computer Software and Theory, 2008, Ph.D.

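【Abstract (translated from the Chinese)】 We live in an era of information explosion, and incremental learning has become the only means of handling information that grows every day. At the same time, with the development of parallel computing, modular and parallel incremental learning algorithms have become a new research direction. The min-max modular network with Gaussian zero-crossing functions (M³-GZC) is a classifier with a modular structure, parallel computing ability, and incremental learning ability. However, the number of modules in an M³-GZC network is quadratic in the number of samples, which leads to quadratic time and space complexity and limits the use of the network on large-scale problems. In addition, the incremental learning ability of the M³-GZC network is built on the full instance space, which leads to excessive storage requirements and affects classification accuracy. Through an in-depth analysis of the M³-GZC network, this dissertation proposes a redundant-module removal algorithm and new incremental learning algorithms, and applies them successfully to image recognition problems such as industrial image fault detection, gender recognition, and handwriting input recognition. The main contributions are as follows.
1) We analyze the characteristics of the M³-GZC network: a highly modular structure, a certain degree of incremental learning ability, convergence of learning, and the ability to output "unknown". To understand the network better, we discuss its relationship to two commonly used models, the nearest neighbor algorithm and the radial basis function network.
2) By analyzing the receptive fields of the M³-GZC network, we propose a structure pruning algorithm that removes redundant modules, which reduces storage requirements and speeds up response. We verify the pruning algorithm on several public data sets and apply it successfully to an industrial image fault detection project.
3) To move the incremental learning ability of the M³-GZC network from the full instance space to a partial instance space, we propose a high-threshold incremental check algorithm and a supervised incremental clustering algorithm. The former selectively learns the representative samples among new samples and can further remove redundant samples from an already trained M³-GZC network. The latter clusters the training samples during learning. These two algorithms give the M³-GZC network incremental learning ability in the true sense.
4) To further improve the incremental learning ability of the M³-GZC network, we propose a layered support vector machine from the perspective of learning in concept space. Its main idea is to use prior knowledge to decompose a large, complex problem into several subproblems and then solve them separately. At test time, it first determines which subproblem the test sample belongs to, and the classifier for that subproblem then decides the output. Building on the layered support vector machine, and by merging instance space and concept space, we propose an incremental learning algorithm based on the M³-GZC network and support vector machines. It builds a support vector machine for each training data subset while continually updating the M³-GZC network. At test time, the M³-GZC network first determines which training subproblems the test sample belongs to, and the outputs of the corresponding support vector machines are then combined probabilistically to give the final result. We apply both algorithms successfully to multi-view gender recognition, handwriting input recognition, and other areas.
5) To apply incremental learning algorithms to image recognition more effectively, we propose a multi-scale edge enhancement method for image preprocessing and an adaptive image Euclidean distance for measuring image similarity. In an image, edges usually reflect shape and structure, while the gray levels of non-edge regions usually vary with illumination. We therefore propose a multi-scale edge enhancement algorithm that strengthens edge information during preprocessing and removes the effects of noise and illumination. In image similarity measurement, the commonly used Euclidean distance ignores image structure and does not correctly express the distance between images. The proposed adaptive image Euclidean distance takes into account both the geometric correlation and the gray-level correlation between pixels, and can easily be embedded in a variety of existing pattern recognition algorithms.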

【Abstract】 In the era of information explosion, incremental learning has become the only way of processing the information that accumulates every day. Moreover, with the development of parallel computing, incremental learning based on modular structure and parallelization has become a new research area. The min-max modular network with Gaussian zero-crossing functions (M³-GZC) is a modular classifier that is capable of parallel computing and incremental learning. However, the number of modules in an M³-GZC network is quadratic in the number of training instances, which results in quadratic time and space complexity and limits the application of the M³-GZC network to large-scale problems. In addition, the incremental learning ability of the M³-GZC network is based on full instance memory, which leads to high storage requirements and limits classification accuracy. In this dissertation, we analyze the characteristics of the M³-GZC network thoroughly and propose a redundant-module removal algorithm and several new incremental learning algorithms. We also apply these algorithms to image recognition tasks such as industrial image fault detection, gender classification, and handwritten digit recognition. The main contributions of this dissertation are as follows.
1) We show that the M³-GZC network has the following attractive features: a highly modular structure, incremental learning ability to a certain extent, guaranteed learning convergence, and the ability to answer "unknown" to unfamiliar inputs. For a better understanding of the M³-GZC network, we also discuss its relationship to two traditional models, the nearest neighbor algorithm and the radial basis function network.
2) We propose a structure pruning algorithm that removes redundant modules, based on an analysis of the receptive fields in the M³-GZC network. We validate the algorithm on several benchmark data sets and apply it successfully to an industrial image fault detection project.
3) To change the incremental learning ability of the M³-GZC network from full instance memory to partial instance memory, we propose an enhanced threshold incremental check algorithm and a supervised clustering algorithm. The former selects representative samples from a new training set and prunes redundant modules in an already trained M³-GZC network, while the latter clusters the training data during learning. The proposed algorithms give the M³-GZC network truly incremental learning ability.
4) To further improve the incremental learning of the M³-GZC network, we first propose a layered support vector machine based on learning in concept memory. Its fundamental idea is to divide a complicated, large-scale problem into several easier subproblems according to prior knowledge, and then to solve these subproblems in parallel. At test time, it first decides which subproblem the test sample belongs to, and then gives the final output according to the corresponding support vector machine. Building on the layered support vector machine, we combine instance memory with concept memory and propose an incremental learning algorithm based on the M³-GZC network and support vector machines. It trains a support vector machine on each training data subset and updates an M³-GZC network at the same time. At test time, the M³-GZC network decides which subproblems the test sample belongs to, and the outputs of the corresponding support vector machines are combined. We apply the two algorithms successfully to image recognition tasks such as multi-view gender classification and handwritten digit recognition.
5) We propose a multi-scale edge enhancement algorithm and an adaptive image Euclidean distance for better image recognition. In images, edges are often located at the boundaries of important image structures and reflect shapes, while the gray levels of non-edge areas often change under the influence of illumination. Therefore, we propose a multi-scale edge enhancement algorithm that intensifies edge information and removes the effects of illumination and noise. For image similarity measurement, the most commonly used Euclidean distance discards image structure and is not a reasonable image distance. In contrast, the proposed adaptive image Euclidean distance considers both the spatial relationship and the gray-level relationship between pixels, and can easily be embedded in many existing pattern recognition techniques.
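The quadratic module count and the "unknown" output described in the abstract can be illustrated with a minimal two-class sketch. The abstract gives no formulas, so everything below is an assumption: the Gaussian zero-crossing function is written in the form commonly cited in the M³-GZC literature (a difference of two Gaussians centered on one positive and one negative training sample, with width proportional to the distance between them), and the names `gzc`, `m3_gzc_predict`, `lam`, and `theta` are hypothetical. It is not the dissertation's implementation, and it includes none of the pruning or incremental algorithms.

```python
import math

def gzc(x, ci, cj, lam=0.5):
    """Assumed Gaussian zero-crossing module for one (positive, negative) pair.

    Positive near ci, negative near cj, close to zero far from both.
    Assumes ci != cj; otherwise the width would be zero.
    """
    width = lam * math.dist(ci, cj)
    return (math.exp(-(math.dist(x, ci) / width) ** 2)
            - math.exp(-(math.dist(x, cj) / width) ** 2))

def m3_gzc_predict(x, positives, negatives, lam=0.5, theta=0.1):
    """Min-max combination over all len(positives) * len(negatives) modules.

    Returns 1 or -1, or None for "unknown" when the combined score
    stays inside the band [-theta, theta].
    """
    score = max(
        min(gzc(x, ci, cj, lam) for cj in negatives)  # MIN over negative samples
        for ci in positives                           # MAX over positive samples
    )
    if score > theta:
        return 1
    if score < -theta:
        return -1
    return None

positives = [(0.0, 0.0), (0.0, 1.0)]
negatives = [(3.0, 0.0), (3.0, 1.0)]
print(m3_gzc_predict((0.1, 0.2), positives, negatives))      # near the positive samples
print(m3_gzc_predict((2.9, 0.5), positives, negatives))      # near the negative samples
print(m3_gzc_predict((100.0, 100.0), positives, negatives))  # far from all samples
```

Because each module is defined by one pair of training samples, adding a sample only adds modules and leaves existing ones unchanged, which is one way to read the "incremental learning ability" mentioned above. The nested loops also make the quadratic cost visible, which is the motivation the abstract gives for pruning redundant modules.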

  • 【Classification number】TP181; TP391.41
  • 【Times cited】8
  • 【Downloads】999