
Research on Content-based Image Search Reranking (基于内容的图像搜索重排序研究)

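【Author】 Xinmei Tian (田新梅)

【Advisor】 Xiuqing Wu (吴秀清)

【Author information】 University of Science and Technology of China, Signal and Information Processing, 2010, PhD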

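【Abstract (translated from the Chinese)】 With the development of Internet technology and online sharing services, the volume of video and image data on the Web is growing exponentially. To meet the search needs of a very large number of users, building fast and effective video/image search systems has become an urgent problem. To borrow mature techniques from text search and to satisfy the efficiency requirements of search, most current commercial search engines (Bing, Google, Yahoo, Baidu, etc.) search videos and images mainly by indexing their associated textual information. Because this textual information cannot fully describe the rich visual content of videos and images, text-based video/image search results are unsatisfactory. Reranking has been proposed to improve the text-based search results by incorporating visual information, user feedback, and other knowledge. Current reranking methods have made some progress; however, because of the semantic gap between low-level features and high-level semantic concepts, many problems in video/image search reranking remain to be studied. This thesis first proposes an unsupervised Bayesian reranking algorithm, then analyzes several key issues in applying reranking to practical systems, and finally proposes a semi-supervised active reranking method for the case where user feedback is available and a supervised topic-diverse reranking method based on structural learning. The main work and contributions of this in-depth study of content-based reranking are summarized as follows:

1. Based on an analysis of the essential roles that visual and textual information play in reranking, this thesis proposes Bayesian reranking, which, from a Bayesian perspective, treats these two kinds of information as the prior and the likelihood, respectively. Bayesian reranking is a general framework under which many existing reranking algorithms can be unified. To address problems in how existing algorithms describe visual and textual information, a local learning regularization model and a pair-wise preference strength ranking distance are proposed, respectively. Extensive experiments on standard datasets verify the effectiveness of the proposed method.

2. The ultimate goal of reranking research is to apply it successfully in practical search systems and thereby improve text-based video/image search results. This thesis examines, from several angles, the key issues in applying reranking to practical image search systems. Discussion of these issues matters not only for the future practical application of reranking but also as guidance for our further research. Six key questions are distilled from aspects such as algorithms, feature representation, and computational complexity. A Web image dataset is collected from three commonly used commercial search engines, extensive experiments are carried out on it, and answers to the six questions are given by analyzing and summarizing the experimental results.

3. Unsupervised reranking returns the same results to all users, so it cannot satisfy the different search needs of different users, especially when the query term is ambiguous. Studies show that relevance feedback is an effective way to solve this problem, but existing reranking methods based on user interaction cannot accurately learn the user's search intention from the feedback. To address this, the thesis proposes a semi-supervised active reranking method. The method first obtains the user's labeling information through human-computer interaction and then uses a subspace learning algorithm to distinguish images that are relevant to the user's query from those that are not. In learning the user's true search intention, a structural-information-based active sample selection scheme is proposed to reduce the amount of labeling required from the user, and a local-global discriminative subspace learning algorithm is proposed to learn the subspace that reflects the images relevant to the user's query. Experiments on synthetic datasets and a Web image search dataset show that the proposed active reranking method can effectively learn the user's search intention and return results that meet the user's needs.

4. In image search, users want the returned results to have both high relevance and high topic coverage. Topic-diverse reranking is receiving more and more attention, but existing diversified reranking methods are limited in two respects. First, they optimize relevance and diversity in two separate steps and therefore cannot reach a jointly optimal result. Second, they commonly use visual diversity to approximate topic diversity, which rarely gives good results because of the semantic gap. To address these problems, the thesis proposes topic-diverse reranking that jointly optimizes relevance and topic diversity. Under a structural learning framework, the method designs a set of features that describe the relevance and diversity of a ranking result and then, using user labeling information, learns a topic-diverse reranking model from a set of training data. With this model, predictions can be made for unlabeled queries to obtain reranked results with high relevance and high topic diversity. Experiments on a Web image search dataset show that the proposed method improves relevance and topic diversity at the same time.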

【Abstract】 With the rapid development of recording and storage devices, and significant improvements in transmission and compression techniques, the amount of multimedia data (e.g., image, video, and audio) on the Internet is increasing explosively, and video- and image-sharing websites are becoming more and more popular. Efficient and effective multimedia search tools are essential for Web surfing. To meet the requirement of high efficiency and to leverage successful techniques already developed in text search, most frequently used image search engines, e.g., Bing, Google, Yahoo, and Baidu, are implemented by indexing and searching the images' associated textual information, e.g., image file names, URLs, and surrounding text. However, the text-based image search results are not satisfactory because the textual information does not adequately describe an image's rich content. Reranking has therefore been proposed to refine the text-based search results by incorporating the images' visual information, user feedback, and other information.

Although much work has been done on image search reranking, many problems remain to be solved because of the semantic gap between low-level visual features and high-level semantic concepts. In this thesis, we first propose an unsupervised Bayesian reranking method, then distill the six most important problems that should be carefully considered in a practical reranking system, and finally propose a semi-supervised active reranking method with user feedback and a supervised topic-aware reranking method based on structural learning. The thesis makes the following contributions:

1. By analyzing the intrinsic roles of the visual and textual information in reranking, we propose Bayesian reranking, in which the two cues are modeled as the prior and the likelihood, respectively, from a probabilistic perspective. Bayesian reranking is a general framework and can unify several existing reranking methods. To model the visual and textual information well within the Bayesian reranking framework, we also propose a local learning regularizer to model visual consistency and a pair-wise preference strength ranking distance, respectively. Experiments conducted on benchmark datasets demonstrate the effectiveness of the proposed Bayesian reranking method.

2. When a reranking technique is incorporated into a practical image search system, several issues besides the design of the reranking algorithm greatly influence the reranking performance. This thesis distills the six most important problems that should be carefully considered in a practical reranking system. The six aspects are algorithm selection, effective visual feature representation, efficient feature extraction, computational cost, the characteristics of the text-based reranking, and the utilization of the text-based search results. Their effects on the resulting reranking performance are analyzed through comprehensive experiments on a dataset collected from the three most frequently used commercial image search engines. We believe that these analyses and findings will provide useful guidelines for the practical application of, and further research on, Web image search reranking.

3. Unsupervised reranking methods fail to capture the user's search intention when the query term is ambiguous. Relevance feedback has been proven to be an effective way to solve this problem. However, current work on reranking with user interaction cannot learn the user's intention precisely. This thesis proposes a semi-supervised active reranking method to learn the user's intention more extensively and completely. The method first obtains the user's labeling information by interacting with the user and then learns the user's intention by distinguishing relevant images from irrelevant ones via subspace learning. Furthermore, the thesis proposes a structural-information-based sample selection strategy to reduce the labeling effort and a novel local-global discriminative dimension reduction algorithm to localize the user's intention in the visual feature space. Experiments conducted on both synthetic datasets and a Web image search dataset demonstrate the effectiveness of the proposed active reranking method.

4. In image search, the desired result should have both high relevance and high topic diversity. Topic-diverse reranking has drawn increasing attention. However, existing diversified reranking methods suffer from two problems. First, the maximization of diversity and relevance is performed in two steps, which typically does not achieve the joint optimum. Second, visual diversification, which is used in diversified reranking, usually cannot approximate topic diversity well because of the semantic gap. In this thesis, we propose topic-aware reranking, which jointly maximizes relevance and topic diversity. Within a structured learning framework, relevance and diversity are modeled by a set of carefully designed features, and the model is then learned from human-labeled training samples. Experiments conducted on a Web image search dataset demonstrate that the proposed method not only improves topic coverage compared with existing diversified reranking methods but also improves relevance compared with relevance-based reranking methods.
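
To make the prior/likelihood view of Bayesian reranking in contribution 1 more concrete, the following is a minimal sketch and not the thesis's own algorithm. It assumes a plain graph-smoothness term in place of the local learning regularizer (visual consistency, playing the role of the prior) and a squared difference from the text-based scores in place of the pair-wise preference strength ranking distance (playing the role of the likelihood). The function names, the Gaussian similarity, and the parameters `c`, `sigma`, and `iterations` are all hypothetical choices made for illustration.

```python
import math


def visual_similarity(features, sigma=1.0):
    """Gaussian similarity between visual feature vectors (assumed kernel)."""
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            w[i][j] = w[j][i] = math.exp(-sq / (2.0 * sigma ** 2))
    return w


def rerank_scores(initial_scores, features, c=1.0, sigma=1.0, iterations=200):
    """Minimise 0.5 * sum_ij w_ij (r_i - r_j)^2 + c * sum_i (r_i - r0_i)^2.

    The first term rewards visually similar images having similar scores;
    the second keeps the scores close to the text-based ones. Setting the
    gradient to zero gives (D + cI - W) r = c r0, solved here by Jacobi
    iteration, which converges for c > 0 because the system matrix is
    strictly diagonally dominant.
    """
    w = visual_similarity(features, sigma)
    n = len(initial_scores)
    degree = [sum(row) for row in w]
    r = list(initial_scores)
    for _ in range(iterations):
        r = [
            (c * initial_scores[i] + sum(w[i][j] * r[j] for j in range(n)))
            / (c + degree[i])
            for i in range(n)
        ]
    return r


# Toy example: images 0 and 1 are near-duplicates, image 2 is visually isolated.
text_scores = [1.0, 0.5, 0.6]
visual_features = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
scores = rerank_scores(text_scores, visual_features)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
print(ranking)
```

In this toy example the text-based order is 0, 2, 1. After smoothing, image 1 borrows score from its near-duplicate (image 0) and moves above the isolated image 2, whose score stays essentially unchanged. The weight `c` controls how strongly the result is tied to the initial text-based ranking.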
