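Research on Several Key Technologies and Applications of the Visual Attention Mechanism
【Author】 Shan Lie (单列);
【Advisor】 Liu Zhengkai (刘政凯);
【Author information】 University of Science and Technology of China, Signal and Information Processing, 2008, PhD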
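【Abstract (from the Chinese)】 About 80% of the external information that humans perceive comes from vision, and this central role makes visual information a popular subject of current scientific research. Since the birth of the computer, people have hoped that computers would one day understand the world through visual observation as humans do and adapt to their environment automatically. However, a large gap remains between current computer vision and human vision. To narrow it, scientists have long studied the human visual mechanism and have proposed many computational methods that follow the characteristics of human visual processing in order to improve the visual processing ability of computers. The computational visual attention mechanism is an image information processing method that has emerged in recent years against this background. This dissertation explores the following questions: how to simulate human visual information processing more accurately and build a complete visual attention model for processing image data; and how to combine the visual attention mechanism with specific image analysis and understanding tasks, optimizing the model for the visual task so that the attention mechanism can play a major role in several practical fields. The main work and contributions of this dissertation are as follows: 1. The working principles of the visual attention mechanism are analyzed in depth, laying a solid theoretical foundation for introducing the attention mechanism into image information processing. 2. Current relatively mature visual attention models are studied and implemented, and their key algorithms are extended and optimized. On this basis, a new region-of-interest extraction model and a multiple-salient-object detection model are proposed; comparisons show that the new models extract salient regions more accurately and effectively and agree with human visual habits. 3. The visual attention mechanism is applied to target search in complex backgrounds. The interference resistance and noise robustness of the attention model are analyzed through comparative examples. An object detection framework that fuses the attention mechanism with polarization information detection is proposed and achieves good experimental results in man-made object detection. 4. Building on research into the scale attention model, the attention mechanism is introduced into image retrieval. A new building retrieval method based on the scale attention mechanism and the EMD decision distance is proposed, and a complete building retrieval framework is built on this basis. 5. The application of the visual attention mechanism in MPEG-21 is studied. Based on an in-depth study of digital item adaptation in the MPEG-21 framework, the application of the attention model to the adaptive digital item adaptation mechanism is investigated and experimental results are given. In summary, this dissertation studies the key technologies of the visual attention mechanism in depth and systematically designs and tests several of its applications in image information processing; the algorithms and models were applied to many types of real natural images and achieved good experimental results. Research on the visual attention mechanism is a field with great potential. As research on human vision advances, ideas and techniques for visual attention will keep being renewed and research on its applications will grow richer. It is hoped that the limited work in this dissertation contributes modestly to the development of this technology.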
【Abstract】 Almost 80% of the information we capture when observing the outside world comes from vision, and research on visual information has attracted wide interest because of its importance in the information field. Since the birth of the computer, people have expected computers to understand the outside world by sight as human beings do and to adapt to their surroundings automatically. To reduce the gap between the visual abilities of computers and human beings, scientists have long studied the human visual mechanism and have proposed many information processing methods to improve the visual processing ability of computers. The visual attention mechanism is an image processing method that has emerged recently against this background. This dissertation addresses the following questions: how to simulate human visual processing and build an integrated visual attention model for handling image data; how to apply the visual attention mechanism to practical tasks in image analysis and understanding; and how to optimize the visual attention model for specific visual tasks so that it plays an important role in applications. The main contributions of this dissertation can be summarized as follows: 1. We analyze the working principles of the visual attention mechanism in detail and build a solid theoretical foundation for applying it to image information processing. 2. We study the classic visual attention models and optimize their key techniques. On this basis we put forward a new model for salient region extraction and multiple-salient-object detection. Comparisons show that our extraction results are more accurate and effective and agree better with human visual habits. 3. We apply the visual attention model to object search in complex backgrounds and analyze the noise robustness of the attention model with examples. We put forward an integrated object detection scheme that combines the attention mechanism with polarization information detection and obtains satisfactory results in man-made object search. 4. Based on research into the scale saliency model, we apply the attention model to image retrieval and implement a building retrieval scheme based on scale saliency and the EMD distance measure. 5. We apply the visual attention model to MPEG-21. Based on the digital item adaptation (DIA) mechanism, we study the application of the attention model in DIA and provide experimental results. In conclusion, we studied the key technologies of the visual attention mechanism and designed and tested several of its applications in image processing; good experimental results were achieved with our methods and models. Research on the visual attention mechanism is a promising area. As research on human vision advances, the technology of visual attention can be continually updated with new ideas, and its applications can be enriched. This dissertation is expected to contribute to the advancement of such technologies.
【Key words】 Visual attention; visual feature extraction; data-driven attention mechanism; task-driven attention mechanism; object detection; image retrieval; MPEG-21