视觉注意机制建模及其应用研究

Research on Visual Attention Mechanism Modeling and Its Applications

【Author】 Tian Minghui (田明辉)

【Advisor】 Yue Lihua (岳丽华)

【Author information】 University of Science and Technology of China, Computer Application Technology, 2010, Ph.D.

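【Abstract (translated from the Chinese)】 With the development of information technology, the volume of image data keeps growing, and how to complete image analysis tasks quickly and accurately on such massive data has become a research hotspot. Traditional image analysis methods give every region of an image the same priority. However, the content that many image analysis tasks care about (for example image retrieval, semantic image annotation, scene analysis and understanding, and object recognition) usually occupies only a small part of the image. Such exhaustive processing therefore increases the complexity of the analysis and wastes a great deal of computation. In recent years, many researchers have found that when the human visual system (HVS) faces a complex scene, attention is rapidly drawn to a few salient visual objects, which are then processed first. This process is called visual attention. Introducing this mechanism into image analysis is clearly both necessary and worthwhile: it can indicate the image regions most likely to attract an observer's attention, help allocate computing resources sensibly, and thereby greatly improve the efficiency of existing image analysis systems. However, bringing this rapid selection ability into computation, and building visual attention models that give computers a human-like attentional intelligence, is far from easy. On the one hand, in neuroscience and perception science the working principles of human attention are still unclear; many questions remain unknown or unresolved, so no explicit prototype process is available. On the other hand, image processing techniques still cannot express semantics in a way that matches human semantics, and many concepts still lack reasonably accurate mathematical definitions.

This dissertation first explains the significance of modeling the visual attention mechanism, reviews the state of research in China and abroad, and introduces the characteristics, classical theories, and computational process of visual attention. It then summarizes existing results in visual attention modeling, analyzes the key problems, studies saliency computation, feature fusion strategies, and shifts of the focus of attention, and proposes a bottom-up visual attention model suited to natural scene analysis. The dissertation also addresses applications of visual attention modeling and saliency computation in image analysis: it proposes an object-video retrieval method based on saliency computation, gives a method for detecting salient objects in time-sequence images, and introduces saliency computation into remote sensing image processing, focusing on ship detection against complex sea-surface backgrounds and change detection against noisy backgrounds.

The main contributions are as follows:
(1) A visual attention model suited to natural scene analysis is proposed. It comprises global saliency computation methods for different features, a dynamic multi-feature evaluation and fusion algorithm, and a computational simulation of focus-of-attention shifts based on psychological factors. Compared with existing models, it clearly improves the completeness of salient objects in both contour and semantics, and it is closer to the real human visual attention process.
(2) An object-video retrieval method based on visual saliency computation is proposed. The method extracts the salient objects from a video, computes similarity on feature vectors built for these key objects, and uses the result as the similarity of the whole video. This masks the influence of the large amount of background content in a video and better reflects the subject matter of an object video. Compared with similarity computed on key-frame features, retrieval efficiency improves to some extent. In addition, the saliency computation model for static images is extended with motion feature analysis, giving a salient object detection method suited to time-sequence images. The method effectively detects and classifies the salient objects in time-sequence images and is fairly robust to speckle noise, brightness, and contrast.
(3) Visual saliency computation is applied to remote sensing image processing, focusing on ship detection against complex sea-surface backgrounds and change detection against noisy backgrounds. The former exploits the fact that ships are visually salient relative to the sea surface, and introduces saliency computation to overcome the difficulty that traditional threshold segmentation has in separating targets from a complex sea-surface background. The latter exploits the fact that an object's changes are distributed across different features, and that different features contribute differently to the change: it computes the difference saliency of each feature channel, dynamically fuses the results into one combined difference saliency map, and searches that map for the regions where objects have changed. Compared with threshold segmentation after histogram matching of statistical features, this method detects changes better and is more robust to noise.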
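
The change-detection idea in contribution (3) above (a difference saliency map per feature channel, fused with weights that depend on each channel's contribution) can be illustrated with a minimal sketch. This is not the dissertation's algorithm, whose details the abstract does not give. The names `difference_map`, `channel_weight`, `fuse`, and `changed_pixels`, the "peak minus mean" weighting rule, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

Channel = List[List[float]]  # one feature channel as a 2-D grid


def difference_map(before: Channel, after: Channel) -> Channel:
    """Absolute per-pixel difference of one feature channel, scaled to [0, 1]."""
    diff = [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(after, before)]
    peak = max(max(row) for row in diff)
    if peak == 0:
        return [[0.0 for _ in row] for row in diff]
    return [[v / peak for v in row] for row in diff]


def channel_weight(dmap: Channel) -> float:
    """Assumed rule: peak minus mean. A map with a few strong responses on a
    quiet background scores high; a uniformly changed map scores zero."""
    values = [v for row in dmap for v in row]
    return max(values) - sum(values) / len(values)


def fuse(dmaps: List[Channel]) -> Channel:
    """Weighted average of the per-channel difference maps."""
    weights = [channel_weight(d) for d in dmaps]
    total = sum(weights)
    rows, cols = len(dmaps[0]), len(dmaps[0][0])
    if total == 0:  # no channel singles out any region
        return [[0.0] * cols for _ in range(rows)]
    return [[sum(w * d[r][c] for w, d in zip(weights, dmaps)) / total
             for c in range(cols)] for r in range(rows)]


def changed_pixels(fused: Channel, threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Coordinates whose fused difference saliency reaches the threshold."""
    return [(r, c) for r, row in enumerate(fused)
            for c, v in enumerate(row) if v >= threshold]


# Toy data: one pixel changes in intensity; a second channel shifts uniformly.
intensity_before = [[0.0] * 3 for _ in range(3)]
intensity_after = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
uniform_before = [[0.0] * 3 for _ in range(3)]
uniform_after = [[0.1] * 3 for _ in range(3)]

fused = fuse([difference_map(intensity_before, intensity_after),
              difference_map(uniform_before, uniform_after)])
print(changed_pixels(fused))
```

In this toy case the uniformly shifted channel receives zero weight, so only the pixel that changed locally is reported. A global change, such as a brightness shift between acquisitions, is thus suppressed instead of being flagged everywhere.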

【Abstract】 Technological progress has been accompanied by a rapid proliferation of image data, and the volume of such data keeps growing. Faced with such vast amounts of data, how to complete image analysis tasks quickly and effectively has become a topic of wide concern. In many image analysis tasks, such as image retrieval, scene analysis, surveillance, object detection, and tracking, people attend mainly to the particular regions or objects that interest them. These regions of interest (ROI) usually make up only a small part of an image. Traditional image analysis methods, however, treat every position in an image with the same priority and process all regions sequentially. Such exhaustive processing increases the complexity of the analysis and costs both time and computation. Effective and robust methods that adapt to different image analysis tasks and locate ROI rapidly are therefore needed. In recent years, many researchers have found that the human visual system (HVS) quickly focuses on a few salient objects or regions in a complex scene and processes them first. This is called the "visual attention mechanism", and the salient objects or regions are called the "focus of attention" (FOA). Introducing this mechanism into the field of image analysis is clearly necessary. It gives the analysis process an ability to select, helps plan the allocation of limited computing resources, and improves the efficiency of existing image analysis systems. In psychology, researchers hold that objects producing stronger or more novel visual stimuli, as well as objects that people expect to see, attract more attention. These effects are called "stimulus-driven capture" and "motive selection". Correspondingly, computer vision distinguishes bottom-up attention from top-down attention. Bottom-up attention models are data-driven and independent of the image analysis task, while top-down attention models are intention-driven and depend on the task. This dissertation focuses on bottom-up attention models, and the research has two main parts: visual saliency modelling and salient object detection. First, we introduce the features of the visual attention mechanism and its fundamental theories, and analyze previous methods and classic models. We then construct a bottom-up visual attention model for natural scenes. Based on this model and on saliency computation, several approaches to salient object detection are proposed for different image analysis applications. The contributions of the dissertation are as follows: (1) A visual attention model is proposed for natural scenes. It includes global visual saliency measures for different features, a strategy for dynamic feature evaluation and combination, and a simulation of FOA location shifts. Compared with previous models, this model better preserves the semantic integrity and the contours of objects, and it is closer to the real human visual attention process. (2) An approach to object-video retrieval based on visual saliency is proposed. It builds feature vectors from the salient objects in a video rather than from whole frames, which removes the influence of the background. In addition, we extend our visual attention model with motion analysis for time-sequence images and propose an approach to salient object detection in such sequences. Experimental results indicate that the approach is effective and robust. (3) Visual saliency computation is applied to remote sensing image processing tasks. Based on saliency computation, a ship detection approach for complex sea-surface backgrounds and a change detection method for noisy backgrounds are proposed. Compared with traditional statistics-based segmentation methods, our approaches are more effective and more robust to noise, brightness, and contrast.
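
The bottom-up pipeline outlined in contribution (1) (a global saliency measure, followed by simulated shifts of the focus of attention) can be illustrated with a minimal sketch. The abstract does not specify the dissertation's saliency measures or its psychologically motivated shift simulation. The sketch below therefore substitutes two simple, commonly used stand-ins: deviation from the global mean as the saliency measure, and winner-take-all selection with inhibition of return for the shifts. The names `global_contrast` and `scan_path` and the `radius` parameter are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]  # one feature map, e.g. intensity, as a 2-D grid


def global_contrast(feature: Grid) -> Grid:
    """Stand-in saliency: absolute deviation of each cell from the global
    mean of the feature map, scaled to [0, 1]."""
    values = [v for row in feature for v in row]
    mean = sum(values) / len(values)
    sal = [[abs(v - mean) for v in row] for row in feature]
    peak = max(max(row) for row in sal)
    if peak == 0:
        return [[0.0 for _ in row] for row in sal]
    return [[v / peak for v in row] for row in sal]


def scan_path(saliency: Grid, fixations: int, radius: int = 1) -> List[Tuple[int, int]]:
    """Stand-in FOA shift: repeatedly pick the most salient cell
    (winner-take-all), then suppress its neighbourhood (inhibition of
    return) so that attention moves on to the next location."""
    sal = [row[:] for row in saliency]  # work on a copy
    rows, cols = len(sal), len(sal[0])
    path: List[Tuple[int, int]] = []
    for _ in range(fixations):
        r, c = max(((i, j) for i in range(rows) for j in range(cols)),
                   key=lambda rc: sal[rc[0]][rc[1]])
        path.append((r, c))
        for i in range(max(0, r - radius), min(rows, r + radius + 1)):
            for j in range(max(0, c - radius), min(cols, c + radius + 1)):
                sal[i][j] = 0.0
    return path


# Toy scene: dim background, one bright cell and one moderately bright cell.
scene = [[0.1] * 5 for _ in range(5)]
scene[0][0] = 1.0
scene[4][4] = 0.6

print(scan_path(global_contrast(scene), fixations=2))
```

On this toy scene the first fixation lands on the brightest cell and the second on the moderately bright one, in decreasing order of saliency. A fuller model would compute one such map per feature (colour, orientation, and so on) and combine them before selecting fixations.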

  • 【Classification number】TP391.41
  • 【Times cited】33
  • 【Downloads】2148