
Stereo Matching for Three-Dimensional Visual Communication

【Author】 Chai Dengfeng (柴登峰)

【Advisor】 Peng Qunsheng (彭群生)

【Author information】 Zhejiang University, Applied Mathematics, 2006, Ph.D.

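【Abstract (translated from the Chinese)】 Stereo vision is a core research area of computer vision. After decades of effort, research on multiple view geometry has achieved breakthroughs: the theory has been steadily refined and the methods have matured. Research on stereo matching has also advanced considerably. The disparity field is described as a Markov random field, stereo matching is formulated as a pixel labelling problem, and graph cut and belief propagation algorithms are used to estimate the disparity field, with very good experimental results. In recent years, new application areas such as three-dimensional visual communication and image-based rendering have emerged and place new demands on stereo matching. Targeting these applications, this thesis studies stereo matching with a focus on quality and efficiency, using Markov random fields as the modelling tool and graph cut algorithms as the solver. The main contributions are:

1. A bisection approach to pixel labelling. The whole label set is first assigned to every pixel. The label set is then split into two subsets and one of them is discarded, and this step is repeated until the label set contains a single label. A multi-valued labelling problem is thereby converted into a sequence of binary labelling problems, which gives an approximate solution method for an NP-hard problem. The labelling process is given a further interpretation, an objective function is constructed from it, and the constructed function is proven to be optimizable by graph cut. On this basis a bit setting algorithm for pixel labelling is designed. Its complexity is log2 n (n is the number of labels), whereas the expansion algorithm (α-expansion algorithm), currently the most efficient algorithm of its kind, has complexity n*k (k>1). The bit setting algorithm is applied to stereo matching and compared with the expansion algorithm. The results show that, at comparable matching quality, the bisection approach has a strong efficiency advantage. The approach places no special requirements on the stereo images, is highly general, and can also be applied to image restoration, motion estimation and related problems.

2. A bilayer stereo matching method. Existing layered stereo matching methods are reviewed and analysed. For scenes in which the foreground and background are separated from each other and each is continuous, the method first determines the disparity field of the foreground layer and that of the background layer, then combines them into the overall disparity field. The whole matching task is thus decomposed into a series of binary labelling problems, avoiding model fitting and iterative refinement. Within this framework, an objective function is given that fuses colour, contrast and shape information to partition the image into foreground and background regions. Experimental results show that bilayer stereo matching greatly improves matching quality. A comparison with Layered Dynamic Programming shows that bilayer stereo matching has some advantage in both quality and efficiency.

3. Based on the two methods above, solutions and implementation techniques are given for two key technical problems in three-dimensional visual communication systems: gaze correction and foreground/background separation. In particular, a view synthesis algorithm based on the bilayer representation and a foreground/background separation algorithm based on the bisection approach to pixel labelling are proposed. Further experimental results demonstrate the effectiveness of the methods.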

【Abstract】 Stereo vision is a fundamental topic in computer vision. Research on multiple view geometry has achieved a breakthrough over the past decades. At the same time, researchers have described the disparity field as a Markov random field, formulated stereo matching as a pixel labelling problem, applied graph cut or belief propagation algorithms to estimate the disparity field, and obtained very good experimental results. In recent years, three-dimensional visual communication, image-based rendering and similar fields have become new applications of stereo vision, and they require both high quality and high efficiency from the matching. In this thesis, we develop novel approaches to stereo matching to meet these requirements. The main contributions are:

1. We propose a bisection approach to pixel labelling. It first assigns the whole label set to each pixel, then iteratively splits the label set into two subsets and discards the one with the higher cost of assigning it to the pixel, until only one label remains. We present a probabilistic interpretation of the process, construct an energy function to optimize it, and prove that the constructed energy can be minimized exactly via graph cut. Based on the bisection approach, we propose a bit setting algorithm, which sets one bit of each pixel's label at each step. The bit setting algorithm has a complexity of log2 n and is the most efficient among state-of-the-art techniques. We apply it to the stereo correspondence problem. Experimental results demonstrate that both good performance and high efficiency are achieved.

2. We propose bilayer stereo matching for scenes that consist of a foreground and a background. It first determines the disparity fields of the foreground layer and the background layer independently, then combines them to obtain the final disparity field. Unlike previous layered approaches to stereo matching, it needs neither model fitting nor iterative adjustment. We also use the colour and contrast information in one image to determine a better segmentation into foreground and background. Experimental results demonstrate that bilayer stereo matching greatly improves precision and has advantages in both quality and efficiency over Layered Dynamic Programming.

3. Based on the above approaches, we present solutions for gaze correction and foreground/background segmentation, both of which are necessary for three-dimensional visual communication. We propose a view synthesis algorithm based on a bilayer description of the scene, and a foreground/background segmentation algorithm based on the bisection approach to pixel labelling. Further experimental results demonstrate that the techniques proposed in this thesis are effective.
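The control flow of the bisection idea in contribution 1 can be illustrated with a small sketch. It is not the thesis's algorithm: the thesis solves each binary step with a graph cut over an energy function, whereas this sketch assumes a per-pixel data cost only (no smoothness term) and scores a label subset by its smallest data cost. The function name `bit_setting_labels` and the cost layout are hypothetical. What it does show is that n labels are resolved in log2 n binary decisions, one bit per step.

```python
def bit_setting_labels(cost):
    """Pick one label per pixel by repeated bisection of the label set.

    cost[p][l] is the (assumed) data cost of giving label l to pixel p.
    Simplification: no smoothness term, so each binary step is decided
    independently per pixel instead of by a graph cut.
    """
    n = len(cost[0])
    if n < 1 or n & (n - 1):
        raise ValueError("number of labels must be a power of two")

    lo = [0] * len(cost)  # start of each pixel's remaining label interval
    size = n              # current size of every remaining interval
    steps = 0
    while size > 1:
        half = size // 2
        for p, row in enumerate(cost):
            lower = min(row[lo[p]:lo[p] + half])
            upper = min(row[lo[p] + half:lo[p] + size])
            # Discard the half with the higher cost; moving to the upper
            # half corresponds to setting the current bit of the label.
            if upper < lower:
                lo[p] += half
        size = half
        steps += 1
    return lo, steps


# Toy example: 3 pixels, 8 candidate disparities each.
toy_cost = [
    [5, 4, 3, 9, 8, 7, 6, 6],
    [9, 9, 9, 9, 9, 1, 9, 9],
    [0, 2, 2, 2, 2, 2, 2, 2],
]
labels, steps = bit_setting_labels(toy_cost)
print(labels, steps)
```

With a data term alone the result coincides with a per-pixel minimum, so the sketch says nothing about matching quality. The point is only that the number of binary steps grows with log2 n rather than with n.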

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year and issue】 2008, Issue 01