图像检索中自动标注技术的研究

Research on Automatic Annotation in Image Retrieval

【Author】 Zhao Yufeng (赵玉凤)

【Advisor】 Zhao Yao (赵耀)

【Author information】 Beijing Jiaotong University, Signal and Information Processing, 2009, PhD

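【Abstract, translated from Chinese】 With the development of multimedia and computer network technology, the volume of image data that people encounter is growing rapidly. Faced with massive image collections, content-based image retrieval (CBIR) can effectively analyze, organize, and manage image data, and it has therefore become a research focus in the multimedia field. However, CBIR is constrained by the "semantic gap": low-level visual features (such as color, texture, and shape) cannot fully reflect or match the user's query intent, which poses a major challenge for CBIR. Automatic image annotation, developed in recent years, sets out to bridge high-level semantics and low-level features and is one effective way to address the semantic gap. To address the problems and shortcomings of current automatic image annotation techniques, this dissertation explores mining the semantic concepts of image content from several perspectives, namely semi-supervised learning, small-sample learning, pseudo relevance feedback, and multi-view semantic correlation analysis, in order to strengthen the semantic understanding of image content and improve annotation performance. The main results and contributions are as follows:

(1) Automatic image annotation in a semi-supervised setting. The dissertation first examines the characteristics of the annotation problem itself: because an image is annotated with multiple keywords and also contains multiple regions, annotation is a multi-class, multiple-instance learning problem. On this basis, the dissertation proposes performing annotation in a semi-supervised setting. By analyzing each semantic keyword independently within a multiple-instance learning framework, the multi-class classification problem is converted into binary classification problems in a semi-supervised setting. This gives a hierarchical description of semantic granularity, with the aim of effectively mining the intrinsic semantic concepts of images. Experimental results verify the effectiveness of this annotation framework.

(2) Small-sample learning in automatic image annotation. Although image annotation has made considerable progress, the diversity of keyword semantic categories means that training images are relatively scarce for the annotation task. This is the small-sample learning problem, and it leaves annotation results unsatisfactory. To address it, the dissertation focuses on a multiple-instance learning strategy under the Minimum Reference Set (MRS) framework. Using the set of representative instances with the minimum MRS to characterize the semantics of each keyword improves the robustness of multiple-instance learning, so annotation performance improves significantly when training samples are insufficient.

(3) Automatic image annotation under a pseudo relevance feedback framework. From a data-mining perspective, image retrieval and image annotation are to some extent consistent and complementary. To address shortcomings of existing search-based image annotation, such as the low precision of the relevant image set and the heavy burden on users, the dissertation attempts to integrate a pseudo relevance feedback mechanism and builds a pseudo-relevance conditional probability annotation model. The model performs automatic iterative search without manual intervention, with the aim of obtaining a more reliable relevant image set. It also uses text analysis techniques to obtain semantic correlations between keywords, which better serves the annotation task.

(4) Multi-view semantic correlation analysis. How to mine semantics-based multi-view relevance models is an important and pressing research topic in automatic image annotation. From the perspective of probabilistic relevance models, the dissertation analyzes the feasibility of using a hidden Markov model for automatic image annotation. Under the framework of the transductive support vector machine, it establishes the correspondence between images and keywords. By combining keyword co-occurrence with a semantic lexicon, it obtains the semantic correlations between keywords and builds a multi-view relevance model covering both image-keyword and keyword-keyword relations, which helps solve the annotation task.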
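Contribution (1) describes analyzing each keyword independently so that multi-class annotation becomes a set of binary problems over bags of image regions, with unannotated images available for semi-supervised learning. The following is a minimal sketch of that data transformation only, not of the dissertation's learner. The function name, the dictionary layout, and the toy data are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Set, Tuple

Region = Tuple[float, ...]  # hypothetical feature vector for one image region


def binary_bags_per_keyword(
    images: Dict[str, List[Region]],
    labels: Dict[str, Set[str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Build one binary task per keyword.

    Each image is a bag of regions. For a given keyword, annotated images
    that carry the keyword are positive bags, annotated images that lack it
    are negative bags, and images with no annotation at all stay unlabeled.
    """
    vocabulary = sorted(set().union(*labels.values()))
    tasks: Dict[str, Dict[str, List[str]]] = {}
    for keyword in vocabulary:
        tasks[keyword] = {
            "positive": sorted(i for i in images if i in labels and keyword in labels[i]),
            "negative": sorted(i for i in images if i in labels and keyword not in labels[i]),
            "unlabeled": sorted(i for i in images if i not in labels),
        }
    return tasks


# Toy data (assumed): three images, two of them annotated.
images = {
    "a": [(0.1, 0.2), (0.8, 0.9)],
    "b": [(0.2, 0.1)],
    "c": [(0.5, 0.5), (0.4, 0.6)],
}
labels = {"a": {"sky", "sea"}, "b": {"sky"}}
tasks = binary_bags_per_keyword(images, labels)
```

Each entry of `tasks` could then be passed to any binary multiple-instance learner; which learner the dissertation uses is not specified in the abstract.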

【Abstract】 With the development of multimedia technology and computer networks, content-based image retrieval (CBIR) has become increasingly important for organizing, indexing, and retrieving massive image collections in many application scenarios, and it has emerged as an active research topic in recent years. However, progress in CBIR is hindered by the well-known semantic gap between low-level visual features (e.g. color, texture, shape) and high-level semantic concepts. Automatic image annotation (AIA) is a feasible way to narrow the semantic gap, since it attempts to bridge low-level visual features and high-level semantic concepts. To address the problems and difficulties in AIA, this dissertation mines the semantic concepts of images from four perspectives: semi-supervised learning, small-sample learning, pseudo relevance feedback, and multi-view semantic relationship analysis. Strengthening the semantic understanding of image content along these four lines is intended to improve the performance of AIA. The main contributions of the dissertation are as follows:

(1) Automatic image annotation by semi-supervised learning. The dissertation first analyzes the nature of AIA: one image is annotated with several keywords and is segmented into many regions, so AIA is both a multi-class classification problem and a multiple-instance learning (MIL) problem. The dissertation therefore proposes to solve AIA by semi-supervised learning. By analyzing the keywords independently within the MIL framework, the multi-class problem is transformed into binary classification problems, so that a hierarchical description of semantic granularity is obtained and the intrinsic semantic concepts are mined effectively. The experimental results verify the effectiveness of the proposed framework.

(2) Small-sample learning in automatic image annotation. Although recent research has made much progress, the small-sample problem is increasingly prominent in AIA and greatly degrades annotation performance. To address it, the dissertation investigates an MIL strategy based on the minimum reference set (MRS). The set of salient instances with the smallest MRS is used to characterize the semantic content of each keyword. Because this improves the robustness of MIL, annotation quality improves considerably when training samples are scarce.

(3) Automatic image annotation with pseudo relevance feedback. From a data-mining perspective, image annotation and image search are consistent with and complementary to each other. To overcome the difficulties of search-based image annotation, e.g. the low accuracy of the relevant image set and the heavy burden on users, the dissertation integrates pseudo relevance feedback into AIA and builds a pseudo-relevance probability model for automatic image annotation. More reliable relevant images are thus found without human intervention, and the semantic correlations among keywords are mined by textual analysis, which leads to better annotation performance.

(4) Semantic relationship analysis from multiple views. How to build a semantic relevance model from multiple views has recently attracted considerable attention. From the perspective of probabilistic relevance models, the dissertation analyzes the feasibility of applying a Hidden Markov Model (HMM) to AIA. Under the framework of the transductive support vector machine, the image-keyword correspondence is constructed. Moreover, the keyword-keyword semantic relation is mined by combining keyword co-occurrence with WordNet. A multi-view relevance model covering both image-keyword and keyword-keyword relations is then built to improve the quality of AIA.
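Contribution (4) combines image-keyword relevance with keyword-keyword relations derived from co-occurrence. The sketch below illustrates that general idea under loud assumptions: the conditional co-occurrence estimate, the linear blend, the weight `alpha`, and all function names are illustrative choices, not the dissertation's model, and the WordNet component is omitted.

```python
from collections import Counter
from itertools import permutations
from typing import Dict, List, Set, Tuple


def cooccurrence(annotations: List[Set[str]]) -> Dict[Tuple[str, str], float]:
    """Estimate P(w | v) as (images annotated with both v and w) / (images annotated with v)."""
    single: Counter = Counter()
    pair: Counter = Counter()
    for words in annotations:
        single.update(words)
        pair.update(permutations(sorted(words), 2))  # ordered pairs (v, w), v != w
    return {(v, w): n / single[v] for (v, w), n in pair.items()}


def rerank(
    scores: Dict[str, float],
    cooc: Dict[Tuple[str, str], float],
    alpha: float = 0.3,  # assumed blending weight, not taken from the source
) -> List[str]:
    """Blend each keyword's image-based score with support from the other candidates."""
    blended = {}
    for w, s in scores.items():
        support = sum(scores[v] * cooc.get((v, w), 0.0) for v in scores if v != w)
        blended[w] = (1 - alpha) * s + alpha * support
    return sorted(blended, key=blended.get, reverse=True)


# Toy training annotations and toy image-keyword scores (both assumed).
train = [{"sky", "sea"}, {"sky", "sea"}, {"sky", "grass"}, {"tiger", "grass"}]
cooc = cooccurrence(train)
ranking = rerank({"sky": 0.9, "sea": 0.4, "tiger": 0.5}, cooc)
```

In this toy run "sea" moves ahead of "tiger" because it co-occurs with the strongly scored "sky", which is the kind of correction a keyword-keyword view is meant to provide.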
