Research on Convenient Matting Techniques for Images and Video

【Author】 管宇 (Guan Yu)

【Advisor】 彭群生 (Peng Qunsheng)

【Author Information】 Zhejiang University, Applied Mathematics, 2008, Ph.D.

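【Abstract (translated from the Chinese)】 Matting is an important technique in image and video processing and is widely used in medical diagnosis, film special effects, and home entertainment. Traditional digital matting methods require the user to interactively mark the definite foreground region, the definite background region, and the unknown region on the original image as the initial input. This input map is called a trimap, and the matting algorithm optimizes on the basis of it, so the quality of the trimap directly affects the final matting result. However, producing a suitable trimap by hand requires a great deal of interaction. For a complex image, such as a spider web, producing an optimal trimap is harder still, and if trimaps had to be made by hand frame by frame for a video sequence, the workload would be hard to imagine. Targeting applications such as film special effects and home entertainment, this dissertation studies convenient matting techniques for images and video that both reduce the amount of user interaction and guarantee high-quality matting results. To this end, it investigates three topics. First, convenient interaction modes that free the user from the tedious process of creating trimaps. Second, convenient local modification techniques that let the user easily correct matting results locally. Third, convenient video matting techniques that quickly extract the moving foreground alpha matte and the foreground object from large-scale video data and, more importantly, maintain the spatio-temporal consistency of the video matte. Based on these goals, the main contents of the dissertation are as follows. Chapter 1 introduces the significance of image and video matting and the evolution and development of matting techniques. It reviews related work on matting and discusses its shortcomings, then identifies the difficulties of image and video matting and sets out the research objectives and the organization of the dissertation. Chapter 2 proposes a stroke-based convenient image matting system. Using a stroke-based interaction mode and an iterative energy-minimization framework, it extracts high-quality foreground alpha mattes and foreground objects. It further proposes a convenient local matting technique that refines the global matting result locally. More importantly, because of the Dirichlet boundary condition, the locally modified result can be embedded seamlessly into the global matte without visual discontinuities. Chapter 3 extends the convenient image matting algorithm to video and proposes a video matting algorithm based on Markov chains. The video sequence is divided into mutually related frame pairs, and a three-dimensional energy function is constructed to optimize each pair. The user only needs to specify a few foreground and background strokes on key frames, and the system automatically and quickly extracts the foreground alpha matte for the whole video while maintaining local spatio-temporal coherence. Chapter 4 combines a spatio-temporal editing interface for the video volume with the stroke-based interaction mode and, using a volume diffusion algorithm for strokes together with automatic background reconstruction, proposes a new spatio-temporally consistent video matting algorithm. The three-dimensional energy-optimization framework takes the zero-order and first-order continuity of the matting equation as prior knowledge in the energy equation, obtains a globally optimal solution, and reconstructs spatio-temporally coherent foreground alpha and foreground color. Finally, Chapter 5 summarizes the dissertation and discusses directions for future research.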

【Abstract】 Matting is an important operation in image and video processing. With the development of digital technology, matting is widely applied in medical diagnosis, visual special effects, and home entertainment. To build priors, traditional digital matting approaches require the user to supply a hint image that partitions the input image into three regions, "foreground", "background", and "unknown", with the foreground and background regions delineated conservatively. This hint image is called a trimap. To generate good mattes, all these approaches require the user to specify the trimap carefully. However, constructing a good trimap demands considerable interaction even from an experienced user, and manually creating an optimal trimap is almost impossible. When an image contains large semi-transparent foreground regions or pixels with partial coverage, as in the spider web image, creating a trimap by hand is a very tedious process. Constructing trimaps by hand on a per-frame basis for a video sequence is harder still to contemplate. In this dissertation, we focus on convenient and fast image and video matting techniques aimed at domains such as visual special effects and home entertainment. The techniques both simplify the user interaction and extract high-quality mattes and foregrounds. To this end, we have explored the following problems. First, we study how to create a convenient interaction mode that frees the user from the tedious process of constructing a trimap. Second, we explore a local matting technique that allows the user to further improve results locally. Finally, we seek a video matting technique that quickly extracts moving mattes and foregrounds from large amounts of video data. More importantly, the technique explores how to preserve spatio-temporal coherence. Based on the above objectives, the main contents of this dissertation are as follows. Chapter 1 introduces the significance of image and video matting and describes the evolution and development of matting techniques. We then identify the difficulties of image and video matting and set out the research objectives and the organization of this dissertation. Chapter 2 presents a stroke-based Easy Matting system. We propose an iterative energy minimization framework for interactive image matting and extract high-quality mattes and foreground objects. The energy optimization can be further performed in selected local regions to refine the results. Because of the Dirichlet boundary condition, the modified local regions can be seamlessly integrated into the final results. Chapter 3 extends Easy Matting to the video domain and proposes a Markov-chain-based approach to video matting. We partition the video sequence into a series of frame pairs that carry inter-frame correlation and construct a 3D energy function to optimize each frame pair. Only a few strokes on a few key frames are required, after which our system extracts the video mattes automatically, and the final results preserve local temporal coherence. Chapter 4 presents an interactive video matting approach that combines the stroke-based interaction mode with a video-cube editing interface, and proposes a new volume expansion scheme and a novel automatic background estimation algorithm. The 3D energy optimization framework treats the zero-order and first-order continuity of the matte as prior expectations and obtains globally optimal solutions. Most importantly, we reconstruct globally spatio-temporally coherent mattes and foregrounds. Finally, Chapter 5 concludes this dissertation by summarizing our contributions and suggesting future research directions.
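The abstract does not state the model behind the matte and the trimap. As a minimal sketch, assuming the standard compositing model I = αF + (1 − α)B for each pixel, the relation between a matte value and the three trimap regions can be illustrated as follows. The function names `composite` and `trimap_label` and the tolerance `eps` are hypothetical and chosen for illustration only; they are not taken from the dissertation.

```python
def composite(alpha, fg, bg):
    """Blend one pixel with the standard compositing model I = alpha*F + (1-alpha)*B.

    alpha is the matte value in [0, 1]; fg and bg are colour tuples of equal length.
    """
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def trimap_label(alpha, eps=1e-6):
    """Map a matte value to the trimap region it would belong to.

    eps is an illustrative tolerance, not a value from the dissertation.
    """
    if alpha >= 1.0 - eps:
        return "foreground"   # pixel is fully covered by the foreground
    if alpha <= eps:
        return "background"   # pixel shows only the background
    return "unknown"          # mixed pixel: alpha must be estimated


red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
mixed = composite(0.5, red, blue)            # half-covered pixel
labels = [trimap_label(a) for a in (1.0, 0.5, 0.0)]
```

Matting is the inverse of `composite`: given only the observed colour, it estimates alpha (and the foreground colour) for the pixels a trimap or a set of strokes leaves as unknown.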

  • 【Online Publication Contributor】 Zhejiang University
  • 【Online Publication Issue】 2009, No. 07