Research on Video Search Reranking (视频搜索结果的重排序研究)

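【Author】 Liu Yuan (刘媛)

【Advisor】 Wu Xiuqing (吴秀清)

【Author information】 University of Science and Technology of China, Signal and Information Processing, 2009, Ph.D.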

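【Abstract (translated from the Chinese)】 Video data on the Internet has grown explosively in recent years and is now widely distributed, making video search a focus of current video research. Owing to the success of text search, today's general-purpose large-scale video search engines, such as Google, Yahoo!, Live, and Baidu, still rely mainly on the text surrounding video data and perform video search and ranking with text-based search methods. However, video content and the complex meaning it carries are usually difficult to describe and express fully in language. To address this shortcoming of text-only search, the reranking of video search results has gradually attracted the attention of many researchers. Reranking is the process of reordering the initial search results, starting from the initial ranking, by mining the intrinsic relationships in the data or by drawing on external knowledge and human intervention; its aim is to improve search quality and the user's search experience. This thesis first proposes a novel query-independent learning framework, and then studies the key problems of video search reranking in three stages: self-reranking (mining relevant knowledge only from the results themselves), query-example-based reranking (using query examples provided by the user), and CrowdReranking (using knowledge mined from the results of external search engines). These three stages cover most current frameworks and methods for visual information reranking. The main work and contributions of the thesis are summarized as follows: (1) For the query-independent learning framework, the thesis proposes learning relevance relations from "query-shot" pairs. Unlike the conventional query-dependent learning framework, the trained model is not tied directly to any particular query, so training samples can be shared across all queries, which makes the approach better suited to practical applications. Under this framework, various machine learning methods can be extended and applied; the thesis accordingly proposes a fully supervised query-independent learning method based on an SVM model and a semi-supervised query-independent learning method based on a multi-graph model. Extensive experiments confirm that the query-independent learning methods clearly outperform conventional query-dependent learning methods, and that they are also more practical in terms of computational cost. (2) For self-reranking, the thesis proposes a typicality-based method for reranking video search results. Conventional learning-based reranking methods often consider only the relevance or diversity of the training samples and ignore their typicality. The thesis proposes taking typicality into account alongside relevance and diversity. It first defines the typicality of a video or image from the probability distribution of the samples, then treats sample selection as an optimization problem that considers both sample typicality and the initial search results, and finally builds the reranking model with an SVM on the selected high-typicality samples. Experiments show that the model generalizes well and is robust. (3) For query-example-based reranking, the thesis proposes a fully supervised video reranking method based on query examples. Conventional fully supervised video reranking methods usually convert the reranking problem, on empirical grounds, into a binary classification problem and rank samples purely by classification confidence. The thesis argues that reranking should instead be treated as an optimization problem: a list is globally optimal when any two samples in it are correctly ordered, rather than when each sample is simply classified as relevant or not. Under this framework, two reranking algorithms are proposed, straight reranking and insertion reranking. Experiments confirm that the new methods considerably improve the initial search results and compare favorably with several classic reranking methods. (4) CrowdReranking is a new stage of the reranking problem proposed in this thesis; it aims to mine relevant visual patterns from the Internet and use them in reranking. According to an extensive literature survey, CrowdReranking is the first approach to apply crowdsourced data from the Internet to the reranking of search results, and it differs markedly from conventional self-reranking and query-example-based reranking. A set of visual words is first built from the result images returned by multiple search engines; two kinds of visual patterns (salient and concurrent) are then mined from these visual words; finally, the reranking problem is converted into an optimization problem based on these visual patterns, and a closed-form solution is given. Experiments show that CrowdReranking improves the initial search results fairly consistently and yields a clear gain over conventional reranking methods.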

【Abstract】 The explosive growth and widespread accessibility of community-contributed multimedia content on the Internet have led to a surge of research activity in video search. Owing to the great success of text search, most popular video search engines, such as Google, Yahoo!, Live, and Baidu, build upon text search techniques by using the text information associated with video data. This kind of video search approach has proven unsatisfying, as it often entirely ignores the visual content and human perception of the search results. To address this issue, video search reranking has received increasing attention in recent years. It is defined as reordering video shots based on multimodal cues to improve search precision. In this thesis, we first propose a novel query-independent learning based video search framework; we then investigate the key problems of video search reranking in three paradigms: self-reranking, which uses only the initial search results; query-example-based reranking, which leverages user-provided query examples; and CrowdReranking, which aims to mine relevant visual patterns from the search results of external search engines. These three paradigms cover most existing reranking frameworks and approaches. Accordingly, this thesis studies video search reranking in depth and makes the following contributions: (1) We propose a novel query-independent learning (QIL) framework for video search that learns relevance from query-shot pairs. Unlike the conventional query-dependent learning framework, it is more general and better suited to real-world search applications. Various machine learning techniques can be used under this framework. We therefore further propose an SVM-based (Support Vector Machine) supervised query-independent learning approach and a multi-graph-based semi-supervised query-independent learning approach. (2) For self-reranking, we propose a typicality-based video search reranking approach. Conventional learning-based approaches to video search reranking consider only the relevance or diversity of the examples selected for building the reranking model, while video typicality is usually neglected. In this thesis, we propose selecting the most typical samples to build the reranking model; since typicality indicates the representativeness of each sample, a more robust reranking model can be learned. We first define the typicality score of an image or video based on the sample distribution, and then formulate example selection as an optimization scheme that takes into account both the image typicality and the ranking order in the initial search results. Based on the selected examples, we build the reranking model using an SVM. (3) For query-example-based reranking, we present a novel supervised approach to video search reranking with several query examples. Conventional supervised reranking approaches empirically convert reranking into a classification problem in which each document is judged relevant or not, and then reorder the documents according to the classification confidence scores. We argue that reranking is essentially an optimization problem in which the ranked list is globally optimal if any two arbitrary documents from the list are correctly ranked in terms of relevance, rather than a matter of simply classifying each document as relevant or not. Under this framework, we further propose two effective algorithms, called straight reranking and insertion reranking, to solve the problem more practically. (4) We propose a new paradigm for visual search reranking called CrowdReranking, which is characterized by mining relevant visual patterns from the image search results of multiple search engines available on the Internet. To the best of our knowledge, CrowdReranking represents the first attempt to leverage crowdsourcing knowledge for visual reranking, which distinguishes it from existing self-reranking and query-example-based reranking. We first construct a set of visual words based on the local image patches collected from multiple image search engines. We then explicitly detect two kinds of visual patterns, i.e., salient and concurrent patterns, among the visual words. Finally, we formalize reranking as an optimization problem on the basis of the mined visual patterns and propose a closed-form solution.
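
The pairwise optimality criterion stated in contribution (3) can be illustrated with a minimal Python sketch. This is not the thesis's straight reranking or insertion reranking algorithm, neither of which is specified in the abstract; it only counts how many document pairs in a ranked list are out of order with respect to relevance. The function names and the binary relevance labels are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def misordered_pairs(ranked_relevance):
    """Count pairs where a less relevant item is ranked above a more relevant one.

    `ranked_relevance` lists relevance labels in rank order (best rank first).
    combinations() yields pairs in positional order, so `a` always precedes `b`.
    """
    return sum(1 for a, b in combinations(ranked_relevance, 2) if a < b)


def is_pairwise_optimal(ranked_relevance):
    """A list is pairwise optimal when no pair is misordered."""
    return misordered_pairs(ranked_relevance) == 0


# Hypothetical relevance labels (1 = relevant, 0 = irrelevant) in initial rank order.
initial = [1, 0, 1, 0, 1]

# An idealized reranking that sorts by the (normally unknown) true relevance.
reranked = sorted(initial, reverse=True)

print(misordered_pairs(initial), is_pairwise_optimal(reranked))
```

Under this view, a reranking method is judged by how far it reduces the number of misordered pairs, which is a different objective from classifying each document as relevant or not in isolation.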
