三维视频编码技术研究

Study on Three-Dimensional Video Coding

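【Author】 Yang Haitao (杨海涛)

【Advisor】 Chang Yilin (常义林)

【Author information】 Xidian University (西安电子科技大学), Communication and Information Systems, 2009, Ph.D.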

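【Summary】 Three-dimensional video lets users freely choose the viewpoint and viewing angle and experience three-dimensional visual perception. It can be widely applied in three-dimensional television, entertainment, video calls, video surveillance, art exhibitions, education, medical care, the military, and other fields. Typical three-dimensional video data comprises multi-view video and the corresponding depth image sequences. The huge volume of three-dimensional video data is the bottleneck that restricts its application, so three-dimensional video compression has become a research focus in recent years. In particular, the standardization of three-dimensional video coding based on H.264/AVC has become one of the main activities of the Moving Picture Experts Group (MPEG). This dissertation studies H.264/AVC-based three-dimensional video compression methods and related techniques in depth. The main research content and results are as follows:
1. A depth-feature-based image region segmentation algorithm for multi-view video is proposed, which also estimates the disparity of each image region at the same time. Existing depth-feature-based segmentation algorithms share one trait: they first estimate a pixel-based or block-based disparity field and then segment it into regions at different depth layers. The proposed algorithm avoids computing and segmenting the disparity field. It directly extracts the depth features of the objects in the image to compute regional disparities, and then segments the image into regions at different depth levels based on those regional disparities.
2. Existing predictive coding methods for motion information in ordinary video and scalable video are summarized and analyzed, and an inter-view motion predictive coding method for multi-view video is proposed: an inter-view motion skip mode based on fine-granular motion matching. Motion skip mode is an existing inter-view predictive coding technique that saves the bits needed to code macroblock motion information and so improves the overall efficiency of multi-view video coding. The proposed fine-granular motion matching method searches the pictures of neighboring views for the optimal motion information for the current macroblock and then uses that motion information in the inter-view motion skip mode, which significantly improves the coding efficiency of the existing motion skip mode. This technique has been adopted by the Joint Video Team (JVT) into the multi-view video coding reference software.
3. A video picture and its corresponding depth image are very strongly correlated, which shows as similarity of object boundaries and similarity of object motion. This dissertation therefore proposes a joint video-depth predictive coding method with two mechanisms, video-depth motion information copy and video-depth motion information prediction. It reuses the motion information produced when coding the video pictures while coding the depth images, which improves depth image compression efficiency. In addition, a preliminary study of the joint multi-view video-depth prediction structure is carried out, and a prediction structure is designed that can incorporate the various existing predictive coding tools; using these tools flexibly removes the various kinds of redundancy effectively.
4. Pre-processing before video coding can remove or reduce the noise and distortion introduced during video capture, improve picture quality, and raise the efficiency of the subsequent video compression. This dissertation studies the automatic exposure function in depth and proposes an automatic exposure control method based on the image luminance histogram. The algorithm derives the regions of no interest from the luminance histogram distribution and assigns them relatively small weights to reduce their share in the weighted mean, so that exposure concentrates on the regions the user is interested in and the image brightness is optimized.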
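The last step of item 1, segmenting the image from a small set of regional disparities, can be illustrated with a minimal sketch. It assumes the regional disparities are already known and that each block takes the candidate with the lowest sum of absolute differences (SAD) against a neighboring view. The SAD cost, the block size, and the function names `sad` and `partition_by_disparity` are illustrative assumptions, not details given in the dissertation.

```python
def sad(cur, ref, x0, y0, size, d):
    """Sum of absolute differences between a block and its match shifted by d."""
    total = 0
    width = len(ref[0])
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            xr = min(max(x - d, 0), width - 1)  # clamp at the image border
            total += abs(cur[y][x] - ref[y][xr])
    return total


def partition_by_disparity(cur, ref, candidates, size=2):
    """Label each size x size block with the candidate disparity of lowest SAD.

    Assumption: `candidates` holds the regional disparities that an earlier
    step (not shown here) has already estimated.
    """
    labels = []
    for y0 in range(0, len(cur) - size + 1, size):
        row = []
        for x0 in range(0, len(cur[0]) - size + 1, size):
            best = min(candidates, key=lambda d: sad(cur, ref, x0, y0, size, d))
            row.append(best)
        labels.append(row)
    return labels
```

Blocks that share a label form one depth-layer region. A practical encoder would likely add a smoothness term or a different matching cost, which this sketch leaves out.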

【Abstract】 Three-dimensional video enables viewers to freely choose an arbitrary viewpoint and viewing direction, and provides them with three-dimensional visual perception. It has wide applications in three-dimensional television, entertainment, video telephony, video surveillance, exhibitions, education, medical care, and the military. Typical three-dimensional video data comprises multi-view video and the corresponding depth image sequences. The huge amount of information in three-dimensional video is a bottleneck that restricts its wide application. Therefore, three-dimensional video compression techniques have been studied intensively in recent years. In particular, the standardization of an H.264/AVC-based three-dimensional video coding scheme has recently become one of the main activities of the Moving Picture Experts Group (MPEG). This dissertation investigates H.264/AVC-based three-dimensional video compression algorithms and related techniques. Its major contributions are summarized as follows:
1. A depth-based image region partitioning method is proposed for multi-view video, which estimates the disparity of each image region at the same time. Existing depth-based region partitioning algorithms share one characteristic: a pixel-wise or block-wise disparity field must be estimated first, and region partitioning is then performed by classifying these pixels or blocks into different groups. In contrast, the proposed algorithm directly estimates the disparity of each region with distinct depth characteristics. Region partitioning is then performed by selecting, for each block in the image, an optimal disparity from the estimated regional disparities.
2. Existing predictive coding methods for motion information in ordinary two-dimensional video coding and scalable video coding schemes are first summarized and analyzed.
Then an inter-view motion predictive coding method, namely a motion skip mode based on fine-granular motion matching, is proposed for multi-view video coding. Motion skip mode is an existing inter-view motion predictive coding method that saves the bits needed to code the motion information of a macroblock and thereby improves the compression efficiency of multi-view video coding. The proposed fine-granular motion matching algorithm searches the already encoded neighboring views for the motion that best matches the motion of the current macroblock, and then uses that best-matching motion information in the existing motion skip mode. This significantly improves the coding efficiency of the existing motion skip mode. The proposed technique has been adopted into the multi-view video coding reference software by the Joint Video Team (JVT).
3. Video pictures and their corresponding depth images are strongly similar in the contours and motion of video objects. To exploit this redundancy, a joint video-depth coding scheme is proposed that reuses the motion information of encoded video pictures when coding the corresponding depth images, through two motion reuse mechanisms: motion information copy and motion information prediction. In addition, a preliminary investigation of the prediction structure for joint multi-view video-depth coding is carried out, and a prediction structure is proposed that can incorporate the various existing coding tools used to remove the different kinds of redundancy in multi-view video and depth data.
4. Video pre-processing prior to video coding can remove or reduce the noise and distortion introduced in the video capturing process, and can improve the efficiency of the subsequent video coding. Automatic exposure control (AEC), one of the most important video pre-processing techniques, is studied in the dissertation, and a luminance-histogram-based AEC scheme is proposed.
The proposed algorithm identifies regions of no interest (RONI) in a captured video picture from the luminance histogram distribution, and places the emphasis of exposure on the regions of interest (ROI) by assigning a relatively small weighting factor to the RONI when calculating the weighted luminance average. The exposure of the captured video pictures is thereby optimized.
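The weighting idea in item 4 can be illustrated with a minimal sketch. The abstract does not say how the RONI are derived from the histogram, so this sketch simply assumes that very dark and very bright bins are RONI. The thresholds `low` and `high`, the weight `roni_weight`, the target level, and both function names are hypothetical choices made for illustration only.

```python
def weighted_mean_luminance(pixels, low=16, high=235, roni_weight=0.25):
    """Weighted mean of 8-bit luma values in which RONI pixels count less."""
    # Build the 256-bin luminance histogram.
    hist = [0] * 256
    for y in pixels:
        hist[y] += 1
    # Assumption: bins outside [low, high] are regions of no interest (RONI).
    num = 0.0
    den = 0.0
    for level, count in enumerate(hist):
        w = roni_weight if (level < low or level > high) else 1.0
        num += w * level * count
        den += w * count
    return num / den if den else 0.0


def exposure_gain(pixels, target=118.0, **kwargs):
    """Multiplicative exposure correction toward a target mean luminance."""
    mean = weighted_mean_luminance(pixels, **kwargs)
    return target / mean if mean > 0 else 1.0
```

With a small `roni_weight`, a saturated light source or a deep shadow pulls the measured mean less, so the resulting gain is driven mainly by the remaining picture area.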
