
Study on Several Issues of Content-based Image Retrieval (original Chinese title: 图像检索中若干问题的研究)

【Author】 Liu Wei (刘伟)

【Advisor】 Tong Qinye (童勤业)

【Author Information】 Zhejiang University, Biomedical Engineering, 2007, Ph.D.

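【Abstract (translated from the Chinese original)】 Images carry richer information than text and play an important role in daily life. In recent years, with the development of Internet technology and the spread of consumer electronics, enormous numbers of digital images are produced and published every day. Finding the required images quickly and effectively in multimedia databases is therefore a highly meaningful problem. Many current industrial image search engines (such as Google™ and Baidu™) do not search by image content itself but by the text associated with the images, so the search results are unsatisfactory. Content-based image retrieval is a key technology that promises to solve this problem. This thesis studies several issues in this technology and obtains the following results.

Texture is an important low-level visual feature widely used in image retrieval. This thesis treats image texture as a signal generated by a nonlinear dynamical system and uses two nonlinear signal analysis methods, the complexity method and the Hilbert-Huang transform (HHT), to extract texture features for texture image retrieval. The results are as follows.

(1) Time series complexity methods were applied to image texture analysis and retrieval. Eight time series complexity methods were compared for image retrieval; methods based on symbolic dynamics and on entropy were found unsuitable for image retrieval. The CO complexity feature, which is based on spectral analysis, is suitable for image retrieval, and its retrieval performance depends on the scanning method used to convert the two-dimensional image into a one-dimensional sequence. Experiments show that the CO complexity feature with Hilbert scanning achieves retrieval results on the Brodatz texture database very close to those of the Gabor feature, while the feature computation time is an order of magnitude lower. Inspired by image thresholding algorithms, a new coarse-graining framework for one-dimensional time series was proposed. Several texture features based on a two-dimensional CO complexity measure were proposed: the complexity histogram and multi-scale complexity histogram, the complexity co-occurrence matrix, the complexity texture spectrum, and the multi-scale complexity feature. Experiments show that the multi-scale complexity feature based on pyramid decomposition retrieves stably across different experimental image databases and is a good texture feature.

(2) The Hilbert-Huang transform was applied to image texture analysis and retrieval. A new clustering-based boundary processing algorithm was proposed to mitigate the boundary effect of empirical mode decomposition (EMD). The two-dimensional Hilbert transform was used to compute the amplitudes of the intrinsic mode functions (IMFs) as the texture features for retrieval. Experiments show that the proposed HHT features achieve image retrieval results fairly close to those of the Gabor feature.

The salient regions of an image are the main carriers of its semantics. This thesis explores the use of a computational model of selective visual attention, grounded in visual physiology and psychophysical experiments, for natural image retrieval. The work and results are as follows.

(1) The visual attention computational model was used to compute interest points in the image, and local features around the interest points were extracted for image retrieval. The proposed retrieval features are the image saliency histogram, the image saliency signature, and the focus of attention (FOA) spatial relationship histogram. Experiments show that a histogram feature jointly encoding the saliency signature and the FOA spatial relationships achieves better retrieval results than global histogram features, and that some region features extracted from the salient regions computed by the model achieve better retrieval results than global features.

(2) A method combining latent semantic indexing with the visual attention computational model was proposed for natural image retrieval.

(3) Bag generators based on the visual attention computational model and on the JSEG image segmentation algorithm were proposed within the multi-instance learning framework and applied to natural image retrieval. Retrieval experiments show that the bag generator based on the JSEG segmentation algorithm achieves better results than some bag generators proposed in the literature.

This thesis proposes the new concept of an "image semantic threshold" and a way of measuring it. Computer experiments and psychophysical experiments lead to the following preliminary conclusions: a semantic threshold exists in the cognition or understanding of natural images; the threshold can be measured using image entropy and image fractal dimension together with a method similar to Weber's law; and the ratio of the metric value of the difference-threshold image to that of its original image is independent of the image's semantic content but depends on the color mode (color or grayscale) and on the image transformation method.

The author also designed and developed an image retrieval experimental platform. The platform makes it easier for researchers to carry out image retrieval experiments, improves their efficiency, and facilitates academic exchange among them. This work has some practical value.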
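The idea of scanning a texture image into a one-dimensional sequence and scoring it with a spectrum-based complexity measure can be sketched as follows. This is only an illustrative sketch, not the thesis's implementation: it assumes that the CO complexity follows one common spectral formulation (keep the Fourier components whose power exceeds `r` times the mean power, and report the energy fraction of what is left over), and the names `hilbert_order`, `hilbert_scan`, `c0_complexity` and the threshold parameter `r` are hypothetical.

```python
import numpy as np


def hilbert_order(n):
    """Return the (row, col) cells of an n x n grid in Hilbert-curve order.

    n must be a power of two.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    cells = []
    for d in range(n * n):
        x = y = 0
        t = d
        s = 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:  # rotate or flip the current quadrant
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        cells.append((y, x))
    return cells


def hilbert_scan(image):
    """Flatten a square grayscale image into a 1-D sequence along a Hilbert curve."""
    image = np.asarray(image, dtype=float)
    if image.ndim != 2 or image.shape[0] != image.shape[1]:
        raise ValueError("expected a square 2-D grayscale image")
    rows, cols = zip(*hilbert_order(image.shape[0]))
    return image[list(rows), list(cols)]


def c0_complexity(x, r=1.0):
    """Energy fraction of the 'irregular' part of x (assumed formulation).

    Components whose power exceeds r times the mean power form the regular
    part; the measure is the residual energy divided by the total energy.
    """
    x = np.asarray(x, dtype=float)
    total = np.sum(x ** 2)
    if total == 0:
        return 0.0
    spectrum = np.fft.fft(x)
    power = np.abs(spectrum) ** 2
    kept = np.where(power > r * power.mean(), spectrum, 0)
    regular = np.fft.ifft(kept).real
    return float(np.sum((x - regular) ** 2) / total)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    texture = rng.standard_normal((32, 32))  # stand-in for a texture patch
    print(c0_complexity(hilbert_scan(texture)))
```

Under this formulation a pure sinusoid scores close to zero and white noise scores noticeably higher, so the value behaves like a scalar texture descriptor. A Hilbert scan keeps consecutive samples spatially adjacent, which is one plausible reason the scanning method affects retrieval performance.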

【Abstract】 A picture is worth a thousand words. Images play a very important role in daily human activities. With the recent development of Internet technology and the spread of consumer electronic devices (such as mobile phones and digital cameras), enormous numbers of digital images are created, distributed and shared every day. Searching images rapidly and efficiently across all kinds of multimedia image databases is a challenging task. However, most current image search engines (such as Google™ and Baidu™) search not by image content but by the text related to the images (for example, the text related to an image in a web page), which leads to poor retrieval accuracy. Content-based image retrieval (CBIR) is the key technique for retrieving useful information from an enormous number of digital images. This thesis studies several issues in CBIR.

Texture is a low-level visual feature widely used in the CBIR community. In this thesis, image texture is regarded as a signal generated by a nonlinear dynamic system. Two nonlinear signal analysis approaches, the time series complexity approach and the Hilbert-Huang Transform (HHT), were used to extract texture features from images, and the extracted features were used in image retrieval experiments.

The time series complexity approach was applied to image retrieval, and the following conclusions were drawn. Retrieval results based on eight different time series complexity algorithms were compared, which led to the conclusion that symbolic dynamics-based and entropy-based complexity algorithms are not suitable for image retrieval. A complexity measure named CO complexity, which is based on Fourier spectrum analysis, is suitable for image retrieval. The CO complexity-based retrieval results depend on the scanning method that converts the two-dimensional image into a one-dimensional time series.
Experimental results showed that the retrieval accuracy of the CO complexity algorithm with the Hilbert scanning method is comparable to that of the Gabor feature, which is an excellent texture feature in image retrieval. Moreover, extracting the CO complexity-based texture feature takes far less time than extracting the Gabor-based texture feature.

Motivated by image binarization algorithms, a novel time series coarse-graining framework was proposed.

Several texture features based on the 2D CO complexity measure were proposed: the complexity histogram and multi-scale complexity histogram, the complexity co-occurrence matrix, the complexity texture spectrum, and the multi-scale complexity feature. Experimental results showed that the multi-scale complexity feature based on pyramid decomposition is a favourable texture feature.

An approach to texture image decomposition and texture feature extraction based on HHT, which decomposes the image into a set of functions called Intrinsic Mode Functions (IMFs) and a residue, was presented. The extracted features were used for texture image retrieval. The Bidimensional Empirical Mode Decomposition (BEMD) method was used to decompose the texture image; the extracted features were the mean and standard deviation of the amplitude of the IMFs and their Hilbert transforms. Furthermore, based on the spatial relationship between local extrema, a novel boundary processing approach using a clustering algorithm was proposed. Preliminary comparison experiments showed that the HHT-based texture image retrieval results were encouraging.

The salient region of a natural image, called the ROI (region of interest), is the main part that describes the image's semantics.
In this thesis, a saliency-based bottom-up computational model of visual attention, motivated by visual physiological and psychophysical experimental results, was used for natural image retrieval.

Interest points in natural images were selected using the visual attention computational model, and local features around the interest points were computed for natural image retrieval. The proposed local salient features are the image salient histogram, the image salient signature and the focus of attention (FOA) spatial relationship histogram. Experimental results showed that a salient local feature combining the salient signature and the FOA spatial relationship histogram can achieve better retrieval accuracy than global features.

A natural image retrieval approach based on the visual attention computational model and latent semantic indexing was also proposed.

Multi-instance learning (MIL) is a new machine learning framework with the ability of "learning from ambiguity". MIL may be a promising approach to the difficult "semantic gap" problem in image retrieval. Designing a good bag generator is an important problem in MIL. Two novel bag generators, based on the visual attention computational model and an efficient image segmentation algorithm named JSEG, were proposed. Natural image retrieval experiments were carried out with MIL and these two bag generators. Experimental results showed that the bag generator based on the JSEG algorithm can achieve better retrieval results than some other bag generators reported in the literature.

The image perception threshold and a method for measuring it were proposed. The image perception threshold was studied on a natural image database, across image color/gray modes and different image transformation approaches, using computer and psychophysical experiments. The image transformation approaches used were scale transformation, Gaussian noise transformation and motion blur transformation.
According to the preliminary experimental results, the following conclusions were drawn. A perception threshold exists when humans perceive natural images. The image perception threshold can be measured by image entropy and image fractal dimension. A law similar to Weber's law holds when humans perceive natural images. The ratio of the metric of the difference image to that of the original image is independent of the image content and depends on both the image color/gray mode and the image transformation approach.

A general-purpose image retrieval experimental platform was also developed. Researchers can use this platform not only to carry out all kinds of image retrieval experiments without any coding, but also to exchange academic results conveniently. This platform is very useful for image retrieval research.
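The entropy-based measurement and the Weber-like ratio mentioned in the abstract can be illustrated with a small sketch. It rests on assumptions the abstract does not confirm: "image entropy" is taken to be the Shannon entropy of the gray-level histogram, and the ratio is taken to be the metric of a transformed image divided by the metric of its original. The names `image_entropy` and `metric_ratio` are hypothetical, and the gray-level quantization in the demo is only a stand-in for the transformations studied in the thesis (scaling, Gaussian noise, motion blur).

```python
import numpy as np


def image_entropy(gray, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram of an image."""
    gray = np.asarray(gray)
    hist = np.bincount(gray.ravel().astype(np.int64), minlength=levels)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum()) + 0.0


def metric_ratio(transformed, original, metric=image_entropy):
    """Ratio of the metric of a transformed image to that of its original."""
    base = metric(original)
    if base == 0:
        raise ValueError("the original image has zero metric value")
    return metric(transformed) / base


if __name__ == "__main__":
    original = np.arange(256, dtype=np.uint8).reshape(16, 16)
    # Hypothetical degradation: keep only 16 gray levels.
    degraded = (original // 16) * 16
    print(image_entropy(original), image_entropy(degraded))
    print(metric_ratio(degraded, original))
```

If a Weber-like law holds, this ratio, evaluated at the point where observers can just tell the two images apart, would stay roughly constant across images for a fixed color mode and transformation, which is the kind of regularity the conclusions above describe.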

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year and issue】 2008, Issue 02