
Image Mining in Image Retrieval

【Author】 段曼妮

【Supervisors】 吴秀清; 徐守时

【Author Information】 University of Science and Technology of China, Signal and Information Processing, 2009, Ph.D.

【Abstract (Chinese)】 In recent years, with the rapid development of multimedia technology and computer networks, the volume of digital images worldwide has been growing at an astonishing rate. Both military and civilian equipment produce gigabytes of image data every day. These images capture all kinds of real-world entities, and the image collections formed by these entities carry information about their changes and mutual relationships, as well as the patterns and evolution rules hidden within them. For a human, however, processing an image dataset containing tens of thousands of images and discovering knowledge from it is extremely difficult, if not impossible. Advances in data mining, information retrieval, multimedia databases, and related fields have made it possible to manage and analyze images and to discover useful information from them, and have also given rise to the research field of image mining. Image mining refers to the process of discovering implicit, previously unknown, and potentially useful knowledge and image data relationships from image databases. Because this implicit information and knowledge cannot be obtained by direct human inspection, image mining is expected to push many related fields to a new stage. The main purpose of this thesis is to explore how image mining techniques can be applied to image retrieval. Image retrieval can be broadly divided into example-based image retrieval and text-based image retrieval. Example-based image retrieval can be further divided into category-level image retrieval and near-duplicate image retrieval, while text-based image retrieval can be divided into retrieval based on surrounding text and retrieval based on automatic image annotation. This thesis studies image mining techniques in depth; the main work and contributions are summarized as follows:

1. Chapter 2 first analyzes the main problem in near-duplicate retrieval based on the Bag-of-Words model, namely the phenomenon of visual polysemy and synonymy, and proposes mining visual phrases with an association rule mining algorithm to eliminate it. The chapter constructs a feature-centric transaction database and applies the classical Apriori algorithm to mine associated visual words and build visual phrases. It also compares several ways of using visual phrases and discusses their significance for near-duplicate image retrieval. Experiments on a benchmark dataset demonstrate the effectiveness of the method.

2. Chapter 3 studies the application of image mining to image retrieval based on surrounding text. Using surrounding text for image retrieval is the most common practice in today's commercial search engines, but because the surrounding information often contains a great deal of noise, the retrieval results do not satisfy users' needs very well. Chapter 3 designs an online game for collecting user knowledge and analyzes the game logs with a co-location association rule mining algorithm; the mined knowledge is used to improve geographically related image retrieval results on Live Search. A large-scale user study confirms that the method noticeably improves image retrieval based on surrounding text.

3. Chapter 4 proposes a method for style-specific automatic image annotation via concept mining. Automatic image annotation is the process by which a system automatically generates content-related annotation words for an image through machine learning. The key problem in this field is the semantic gap between low-level features and high-level semantics. The chapter proposes using concept mining to discover the concepts contained in images and the image styles of different users, and then improving annotation results through personalized image annotation. A PLSA model is used to mine concepts from image content and user interests, and the different styles of different users are represented as different distributions of image features and annotation features conditioned on a given concept. Experiments on a dataset from a commercial website show that style-specific image annotation substantially improves automatic annotation precision, providing a possible solution for personalized image retrieval.

【Abstract】 Advances in image acquisition and storage technology have led to tremendous growth in very large and detailed image databases. These images, if analyzed, can reveal useful information to human users. Image mining deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in the images. Image mining is rapidly gaining attention among researchers in data mining, information retrieval, and multimedia databases because of its potential for discovering useful image patterns that may push these research fields to new frontiers. The main purpose of this thesis is to explore the use of image mining techniques in the field of image retrieval. Image retrieval in general can be divided into example-based image retrieval and text-based image retrieval.

Within example-based image retrieval, image near-duplicate (IND) retrieval has a wide range of applications. We first analyze the major problem in IND retrieval based on the Bag-of-Words model, namely the visual polysemy and synonymy phenomenon. To eliminate this phenomenon, we propose using association rule mining to find "visual patterns". We propose and compare different usages of visual patterns. Experiments on a benchmark dataset show that the proposed method is superior to the classic Bag-of-Words model.

Within text-based image retrieval, using surrounding text as an image's keywords and building an index on it is the most common method in commercial image search engines. However, because surrounding text is often associated with a lot of noise, the retrieval results cannot meet users' needs very well. To solve a location-related image retrieval task, we first define a measurement for images, namely geographical relevance, and then use it to rank the returned images. To obtain the images' geographical relevance, we designed an online game to gather users' knowledge about images and locations. We then use a co-location mining algorithm to find similar locations, image geographical relevance, and image-region geographical relevance. A comparison with a commercial search engine (Live Search) confirms that the proposed algorithm is useful in improving the performance of location image retrieval.

Another direction in text-based image retrieval is automatic image annotation. Automatic image annotation (also known as automatic image tagging) is the process by which a computer system automatically assigns metadata, in the form of captions or keywords, to a digital image. The key problem in this field is the semantic gap between low-level features and semantic concepts. Modeling the user's attention is one feasible way to bridge the semantic gap. We use a concept mining technique in an image annotation task. We assume that the images in one group share one "style"; mining and using this style can improve annotation precision and consequently improve image retrieval based on automatic annotation. Experiments on real-world datasets support this assumption.
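
To make the association rule mining step concrete, the sketch below shows a minimal, generic Apriori-style pass that mines frequent visual-word pairs as visual phrases. It is only an illustration under assumed names (the function `mine_visual_phrases`, the toy transactions, and the `min_support` threshold are hypothetical); it is not the feature-centric transaction construction or the exact Apriori implementation used in Chapter 2.

```python
# A minimal Apriori-style sketch (hypothetical names and data): mine frequent
# visual-word pairs as "visual phrases". Each transaction is assumed to hold
# the visual-word IDs found around a single local feature.
from collections import Counter
from itertools import combinations

def mine_visual_phrases(transactions, min_support=0.01):
    """Return frequent visual-word pairs and their support."""
    n = len(transactions)
    # Pass 1: keep only words that are themselves frequent (Apriori pruning).
    word_counts = Counter(w for t in transactions for w in set(t))
    frequent = {w for w, c in word_counts.items() if c / n >= min_support}
    # Pass 2: count pairs built from frequent words only.
    pair_counts = Counter()
    for t in transactions:
        words = sorted(set(t) & frequent)
        pair_counts.update(combinations(words, 2))
    return {pair: c / n for pair, c in pair_counts.items() if c / n >= min_support}

if __name__ == "__main__":
    # Toy transactions: visual-word IDs in the neighbourhood of four features.
    toy = [[3, 17, 42], [3, 17, 99], [17, 42, 3], [5, 8, 13]]
    print(mine_visual_phrases(toy, min_support=0.5))
```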
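The geographical relevance idea can likewise be illustrated with a simple re-ranking sketch. It assumes the game logs have already been reduced to a per-image relevance score in [0, 1] (for example, the fraction of players who linked the image to the query location); the function name `rerank_by_geo_relevance`, the blending weight `alpha`, and the toy data are hypothetical, and the sketch is not the co-location mining algorithm itself.

```python
# A minimal re-ranking sketch (hypothetical names and data): blend the
# original engine rank with a mined per-image geographical relevance score.
def rerank_by_geo_relevance(results, geo_relevance, alpha=0.5):
    """results: image IDs in original engine order.
    geo_relevance: dict image_id -> score in [0, 1] mined from game logs."""
    n = len(results)

    def combined(indexed):
        idx, img = indexed
        rank_score = 1.0 - idx / max(n - 1, 1)      # normalised original rank
        geo_score = geo_relevance.get(img, 0.0)     # mined relevance, 0 if unknown
        return alpha * geo_score + (1 - alpha) * rank_score

    return [img for _, img in sorted(enumerate(results), key=combined, reverse=True)]

if __name__ == "__main__":
    engine_order = ["img_a", "img_b", "img_c"]   # hypothetical result list
    geo = {"img_c": 0.9, "img_a": 0.2}           # hypothetical mined scores
    print(rerank_by_geo_relevance(engine_order, geo))
```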
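The concept mining in Chapter 4 is built on PLSA; the sketch below is a minimal, textbook PLSA trained with EM on an image-by-annotation-word count matrix, where the per-image mixture P(z|d) plays the role of the mined concepts. The names (`plsa`, `n_topics`, `n_iter`) and toy data are assumptions; the thesis model goes further by conditioning image features and annotation features on concepts per user style, which is not reproduced here.

```python
# A minimal, generic PLSA sketch (hypothetical names): EM over an
# image-by-annotation-word count matrix. P(z|d) is the per-image concept
# mixture; P(w|z) is the word distribution of each latent concept.
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """counts: (n_images, n_words) matrix of annotation-word counts."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(n_iter):
        # E-step: responsibility of each concept z for each (image, word) pair.
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]        # shape (d, z, w)
        joint /= joint.sum(1, keepdims=True) + 1e-12
        # M-step: re-estimate P(w|z) and P(z|d) from responsibility-weighted counts.
        weighted = counts[:, None, :] * joint                 # shape (d, z, w)
        p_w_z = weighted.sum(0)
        p_w_z /= p_w_z.sum(1, keepdims=True) + 1e-12
        p_z_d = weighted.sum(2)
        p_z_d /= p_z_d.sum(1, keepdims=True) + 1e-12
    return p_z_d, p_w_z

if __name__ == "__main__":
    toy_counts = np.array([[4, 0, 1], [3, 1, 0], [0, 5, 2]], dtype=float)
    p_z_d, p_w_z = plsa(toy_counts, n_topics=2, n_iter=100)
    print(p_z_d.round(2))
```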
