Research on Near Duplicate Video Clip Detection Technology Supporting Cartoon Video Analysis

【Author】 邓莉琼 (Deng Liqiong)

【Supervisor】 吴玲达 (Wu Lingda)

【Author Information】 国防科学技术大学 (National University of Defense Technology), Control Science and Engineering, 2012, Doctoral dissertation

【Abstract (Chinese)】 With the rapid development of video technology and the Internet, cartoon video, as an important branch of video data, is receiving growing attention and its industry continues to expand. However, the sheer volume of cartoon video resources makes it difficult for users to quickly locate the video information they need. As an important component of cartoon video data, cartoon video clips carry a large amount of reusable content and have a high reuse rate, which makes them one of the main research objects in the video research field. In recent years, as the sources of cartoon video clips have broadened and their number has grown, near-duplicate cartoon video clip matching has become a popular topic in multimedia retrieval. Finding an effective cartoon video clip matching method that can support automated analysis of cartoon video has become one of the problems that the video analysis field urgently needs to solve.

The practical application of near-duplicate video clip matching is mainly constrained by two factors: matching accuracy and matching speed. Because clip matching involves a large amount of video information across multiple structural and semantic levels, its computational complexity is high, and achieving an optimal balance between accuracy and speed has always been a difficult problem in near-duplicate clip matching research. This thesis therefore studies cartoon video clip matching in depth and proposes a series of related techniques and methods, aiming to provide technical support for the assisted analysis of cartoon clips and thereby help users obtain the cartoon video information they need.

To achieve this goal, the thesis studies the following key technologies from a system perspective: feature extraction and matching for cartoon images, feature extraction and matching for near-duplicate cartoon video clips, content-based real-time detection of cartoon video clips, cartoon video clip annotation, and relation mining among near-duplicate cartoon video clips. These key technologies support the assisted analysis of cartoon video from different aspects; the research proceeds step by step and the methods complement one another, forming a relatively complete system of theory and methods. Specifically, the main contributions of this thesis are as follows.

First, a feature extraction and matching method for cartoon images that incorporates color information is proposed. Targeting the characteristics that distinguish cartoon images from natural images, the traditional global feature extraction and matching method is improved by fusing the correlations among color-composition distributions into the global color descriptor. In addition, to address the loss of color information that is common in local feature extraction, color invariants are taken as the objects of local feature extraction to describe the detailed composition of cartoon images. Finally, a weighted fusion of global and local features is studied so that the two kinds of features complement each other.

Second, several matching methods for near-duplicate cartoon video clips at different structural and semantic levels are proposed. In view of the multi-semantic nature of the near-duplicate clip definition, clip similarity measures are studied at both the low-level feature layer and the mid-level logical layer. For low-level features, a keyframe-based bag-of-words method and a keyframe-based edit distance method are improved: they use a language model and an extended edit distance, respectively, to describe the similarity of the visual and temporal features of keyframe sequences. For mid-level features, a keyframe-based temporal network method and a video-unit-based video distance trajectory method are proposed: the former fuses temporal and visual features through a temporal network and thereby effectively solves the partial alignment problem, while the latter, building on the definition of descriptive parameters for video units, applies the optimal matching technique from graph theory to achieve the best alignment of near-duplicate clips. Finally, similarity fusion across matching methods at different semantic levels is studied; by adjusting the weight coefficients, the fusion can be adapted to different application scenarios and task requirements.

Third, a cartoon video assisted-analysis method based on near-duplicate clip matching is proposed. To realize content-based real-time detection of near-duplicate cartoon clips, an index structure for video clips is first built with an improved locality-sensitive hashing function on top of the clip matching techniques; in addition, to improve the ranking accuracy of detection results, a relevancy graph is used to cluster near-duplicate cartoon clips and re-rank the results. To address the excessive erroneous labels produced by traditional automatic annotation methods, a random walk is used to re-rank annotation information, which enriches the semantic information of clips and realizes retrieval-based automatic annotation of cartoon video clips. Finally, visualization techniques are used to mine and present the relations among near-duplicate cartoon clips from three aspects: cluster relations, feature relations, and evolution history.

Finally, a cartoon video clip assisted-analysis system, NCLIPs, is designed and implemented. Its design rationale, architecture, and module functions are described in detail, and the concrete interface of the prototype system is presented. Supported by the techniques and methods studied in this thesis, NCLIPs realizes content-based analysis, retrieval, and personalized organization and presentation of cartoon video clips.

In summary, this thesis realizes the assisted analysis of cartoon video clips through research on near-duplicate cartoon video clip matching. In practice, the proposed matching methods achieve high matching accuracy and speed; applied to the assisted analysis of cartoon clips, they provide technical support for the analysis, retrieval, and personalized organization and presentation of video clips. In terms of research significance, this work provides an effective way to obtain information about cartoon video clips, and its results are of great value in both theory and practice.

【Abstract】 With the fast development of video technology and the rapid spread of Internet media, cartoon video, an important branch of video data, is attracting more and more attention while the cartoon industry keeps expanding. However, the abundance of cartoon video resources makes it difficult for users to find the information they need in video datasets. As an important component of cartoon video material, cartoon video clips often carry a large amount of repeated information and have a high reuse rate, which makes them a main research object in the video domain. Recently, as the number of cartoon video clips is increasing exponentially, near-duplicate cartoon video clip matching technology has become one of the hot topics in the multimedia retrieval domain. How to obtain an effective matching technology for near-duplicate cartoon video clips and use it to realize the automatic analysis of cartoon video has become one of the problems that urgently need to be solved in the video domain.

Near-duplicate video clip matching technology is affected by two factors: accuracy and speed. Because a large amount of video information is involved across the different structural and semantic levels of a video clip, the computational complexity is also large. Thus, how to achieve a trade-off between accuracy and speed has always been the difficulty in near-duplicate video clip matching research. This thesis explores near-duplicate cartoon video clip matching technology, and a number of related techniques and methods are proposed. The goal is to support the analysis of video clips and to help users obtain the cartoon video information they need.

To achieve these goals, the following key technologies are studied from a system point of view: cartoon image feature extraction and matching, near-duplicate cartoon video clip matching, content-based online cartoon video clip detection, cartoon video clip annotation, and relation mining among near-duplicate cartoon video clips. In detail, the main contributions of this thesis can be summarized as follows.

First, a color-combined visual feature extraction and matching method for cartoon images is proposed. Targeting the characteristics that distinguish cartoon images from natural images, the correlation feature of the image's color distribution is embedded into the global color descriptor, which improves matching accuracy. Meanwhile, to address the loss of color information in local feature extraction, color invariants are used as the input of local feature extraction, so the component details of cartoon images are well described. Finally, weighted fusion methods for the cartoon image's global and local features are studied so that the advantages of the two kinds of features complement each other.

Second, matching methods for near-duplicate cartoon video clips at different structural and semantic levels are proposed. The research covers the low-level feature layer and the mid-level logical layer. At the low level, the keyframe-based bag-of-words method and the keyframe-based edit distance method are improved by using a language model and an extended edit distance to describe the visual and temporal features of keyframe sequences, respectively. At the mid level, a keyframe-based temporal network method and a video-unit-based video distance trajectory method are proposed. The former fuses visual and temporal features by building a temporal network, which solves the partial alignment problem; based on the descriptive parameters of video-unit features, the latter achieves the best alignment by employing the optimal matching technique from graph theory, balancing accuracy against speed. Finally, similarity fusion across the different levels of cartoon video clip matching is studied, so that diverse application scenarios and task demands can be satisfied by adjusting the weighting coefficients.

Third, methods for supporting the analysis of cartoon video clips are proposed. A content-based online near-duplicate video clip detection method is realized, which employs an improved index structure to speed up detection and a relevancy-graph-based re-ranking method to re-rank the detection results. An automatic annotation method for cartoon video clips that employs a random-walk approach is proposed to reduce erroneous labels and to enrich the clips' semantic information. Lastly, several visualization structures are proposed to present the relations among near-duplicate cartoon clips, namely cluster relations, feature relations, and evolution process, providing a new way to mine the deeper information of cartoon video clips.

Finally, a system for supporting the analysis of cartoon video clips, NCLIPs, is designed and implemented. The design idea and each functional module of the NCLIPs system are described in detail, and the implementation of the prototype system is also presented. The NCLIPs system uses the technologies of this thesis as its foundation to implement content-based analysis, retrieval, and personalized organization of cartoon video clips.

In general, the supporting analysis of cartoon video clips is realized based on near-duplicate video clip matching. From a practical point of view, the proposed near-duplicate cartoon video clip matching method achieves good accuracy and speed, and it can serve as technical support for video analysis, retrieval, and organization. In terms of research significance, the achievements of this thesis provide an effective way to obtain cartoon video information, and the results are of value in both theory and practice.
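The first contribution combines a global color descriptor with color-invariant local features through weighted fusion. The Python sketch below illustrates one plausible form of such a fusion; the histogram-intersection global measure, the ratio-test local matcher, and the default 0.5 weight are illustrative assumptions rather than the descriptors actually defined in the thesis.

```python
# Illustrative sketch only: weighted fusion of a global color similarity and a
# local-feature similarity for two cartoon images. The concrete descriptors below
# (coarse RGB histograms, ratio-test matching) are assumptions, not the thesis's own.
import numpy as np

def global_color_similarity(img_a: np.ndarray, img_b: np.ndarray, bins: int = 16) -> float:
    """Histogram-intersection similarity of coarse RGB color histograms (in [0, 1])."""
    def hist(img):
        h, _ = np.histogramdd(img.reshape(-1, 3), bins=(bins,) * 3, range=((0, 256),) * 3)
        return h.ravel() / h.sum()
    return float(np.minimum(hist(img_a), hist(img_b)).sum())

def local_feature_similarity(desc_a: np.ndarray, desc_b: np.ndarray, ratio: float = 0.8) -> float:
    """Fraction of local descriptors in A that find a ratio-test match in B."""
    if len(desc_a) == 0 or len(desc_b) < 2:
        return 0.0
    matched = 0
    for d in desc_a:
        dists = np.sort(np.linalg.norm(desc_b - d, axis=1))
        if dists[0] < ratio * dists[1]:            # nearest neighbour clearly better than second
            matched += 1
    return matched / len(desc_a)

def fused_similarity(img_a, img_b, desc_a, desc_b, w_global: float = 0.5) -> float:
    """Weighted combination of the two complementary similarities."""
    return (w_global * global_color_similarity(img_a, img_b)
            + (1.0 - w_global) * local_feature_similarity(desc_a, desc_b))
```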
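For the low-level matching layer, the thesis improves a keyframe-based edit distance so that sequence order and frame-level visual similarity are measured together. Below is a minimal sketch of that idea, assuming a generic `frame_similarity` function returning values in [0, 1] and a similarity-weighted substitution cost; the thesis's extended edit distance may define its costs differently.

```python
# Illustrative sketch: edit distance over two key-frame sequences with a
# similarity-weighted substitution cost instead of the usual 0/1 cost.
from typing import Callable, Sequence

def keyframe_edit_distance(seq_a: Sequence, seq_b: Sequence,
                           frame_similarity: Callable[[object, object], float]) -> float:
    """Dynamic-programming edit distance between key-frame sequences."""
    m, n = len(seq_a), len(seq_b)
    # dp[i][j] = cost of aligning the first i frames of A with the first j frames of B
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = float(i)                        # deletions
    for j in range(1, n + 1):
        dp[0][j] = float(j)                        # insertions
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 1.0 - frame_similarity(seq_a[i - 1], seq_b[j - 1])   # cheap when frames match
            dp[i][j] = min(dp[i - 1][j] + 1.0,     # delete a frame from A
                           dp[i][j - 1] + 1.0,     # insert a frame from B
                           dp[i - 1][j - 1] + sub) # substitute / keep
    return dp[m][n]

def sequence_similarity(seq_a, seq_b, frame_similarity) -> float:
    """Normalize the distance into a [0, 1] similarity score."""
    longest = max(len(seq_a), len(seq_b)) or 1
    return 1.0 - keyframe_edit_distance(seq_a, seq_b, frame_similarity) / longest
```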
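For content-based online detection, clips are indexed with an improved locality-sensitive hashing function so that candidate near-duplicates can be shortlisted before exact matching. The sketch below shows a plain random-hyperplane LSH table over fixed-length clip signature vectors; the thesis's improved hash family and index layout are not reproduced here, so the signature dimension, bit count, and bucket structure are illustrative assumptions.

```python
# Illustrative sketch: random-hyperplane LSH bucketing of clip-level signatures
# to shortlist candidate near-duplicate clips before exact matching.
from collections import defaultdict
import numpy as np

class ClipLSHIndex:
    def __init__(self, dim: int, n_bits: int = 16, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.planes = rng.normal(size=(n_bits, dim))   # random hyperplanes
        self.table = defaultdict(list)                 # hash key -> clip ids

    def _key(self, vec: np.ndarray) -> tuple:
        return tuple((self.planes @ vec > 0).tolist()) # sign pattern of the projections

    def add(self, clip_id: str, vec: np.ndarray) -> None:
        self.table[self._key(vec)].append(clip_id)

    def candidates(self, vec: np.ndarray) -> list:
        """Clips whose signatures fall into the same bucket as the query."""
        return list(self.table[self._key(vec)])

# Usage: index clip signatures once, then probe with a query clip's signature.
rng = np.random.default_rng(1)
sig = rng.random(128)
index = ClipLSHIndex(dim=128)
index.add("clip_001", sig)
print(index.candidates(sig))                           # ['clip_001']
```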
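For retrieval-based annotation, labels gathered from near-duplicate clips are re-ranked with a random walk so that mutually consistent labels are promoted and spurious ones are suppressed. Below is a minimal sketch of a personalized random walk over a label affinity matrix; the affinity construction and restart weight shown are assumptions for illustration and may differ from the graph actually used in the thesis.

```python
# Illustrative sketch: random-walk re-ranking of candidate labels over a
# label-to-label affinity graph, starting from raw retrieval-based scores.
import numpy as np

def random_walk_rerank(affinity: np.ndarray, initial_scores: np.ndarray,
                       restart: float = 0.15, n_iter: int = 100) -> np.ndarray:
    """Iterate r <- (1 - restart) * P^T r + restart * r0 until convergence."""
    # Row-normalize the affinity matrix into transition probabilities.
    row_sums = affinity.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    P = affinity / row_sums
    r0 = initial_scores / initial_scores.sum()
    r = r0.copy()
    for _ in range(n_iter):
        r_next = (1.0 - restart) * P.T @ r + restart * r0
        if np.abs(r_next - r).sum() < 1e-9:            # converged
            return r_next
        r = r_next
    return r

# Usage: three candidate labels; the two that support each other are promoted.
affinity = np.array([[0.0, 0.9, 0.1],
                     [0.9, 0.0, 0.1],
                     [0.1, 0.1, 0.0]])
scores = np.array([0.3, 0.3, 0.4])                     # raw retrieval-based label scores
print(random_walk_rerank(affinity, scores))
```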
