

A Research on Video Coding Technologies for Communication

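【Abstract (translated from the Chinese)】 Video compression coding is one of the key technologies in multimedia communication. Traditional video compression mainly addresses how to improve coding efficiency. Video coding for communication demands still higher coding efficiency and, in addition, requires the compressed video stream to adapt to changes in network bandwidth and to tolerate transmission errors (such as packet loss on the Internet and burst or random errors on wireless networks). In video communication applications, the goal of video coding has shifted from storage to transmission, and the purpose of coding has moved from producing a fixed-size bitstream suited to storage to producing a scalable bitstream suited to a given transmission rate. Video coding for communication has therefore become a key problem that video communication applications urgently need solved, and research on it has both theoretical and practical significance. This thesis presents an in-depth, systematic study of video coding technologies for communication. The main contributions are:

(1) To address the low coding efficiency of the MPEG-4 fine granularity scalable (FGS) video coding scheme, the thesis summarizes a general framework for FGS-based scalable video coding. Guided by this framework and building on leaky prediction coding, it proposes a macroblock-based fine granularity scalable video coding scheme with leaky prediction, called MB-based FGS-LP. The scheme first introduces a leaky prediction inter coding mode for enhancement-layer macroblocks, which trades a small amount of coding efficiency for greater robustness. It also gives a simple algorithm that optimally determines the coding mode and the leaky factor for each enhancement-layer macroblock. Experimental results show that the proposed scheme further improves coding efficiency over the MB-based PFGS video coding scheme while retaining the error recovery capability of MB-based PFGS.

(2) To improve the enhancement-layer coding efficiency of FGS, the thesis brings the subband coding methods used for still images into FGS enhancement-layer coding and proposes a DCT-based embedded block coding method (EBC_DCT). In EBC_DCT, contexts are first defined according to the characteristics of the DCT coefficients, and each sub-block is then coded with bit-plane-based context arithmetic coding. EBC_DCT retains the scalability of the original method and effectively improves enhancement-layer coding efficiency.

(3) Based on the characteristics of FGS bitstreams, the thesis systematically studies FGS-based error resilience methods and proposes a layered forward error correction channel coding method (LFEC) and an advanced FGS enhancement-layer error concealment method (AEC). Experimental results show that, for video communication in high-error-rate environments, these methods effectively improve the error resilience of the bitstream.

(4) Three-dimensional wavelet video coding is another effective solution for communication-oriented video coding. Addressing the shortcomings of existing 3-D wavelet video coding, the thesis proposes a lifting-based motion-compensated 3-D wavelet coding scheme, called MCL-3DWT. The scheme uses motion compensation and wavelet lifting to implement motion-compensated temporal filtering, and coding efficiency improves greatly through the use of multiple reference frames and subpixel motion compensation. Experimental results show that the scheme retains SNR, temporal, and spatial scalability and outperforms classic motion-compensated 3-D wavelet coding schemes such as MC-EZBC.

(5) The lifting scheme is, after multiresolution analysis, another very effective method of constructing wavelet filters. Starting from the general 9-7 wavelet lifting scheme, the thesis gives a new lifting 9-7 wavelet filter that is better suited to hardware implementation (called the D97 filter). Its image compression performance is comparable to that of CDF97, but its computational complexity is greatly reduced by the introduction of a large number of shift operations, which makes it well suited to ASIC implementation. The D97 filter is also applied in the proposed MCL-3DWT coding scheme, which demonstrates its effectiveness in compression performance.

(6) Motion estimation is one of the key technologies in video coding. In both scalable video coding schemes of this thesis, motion estimation is one of the core algorithms and determines the coding efficiency and coding complexity of the whole scheme. Among motion estimation algorithms, MVFAST is the most effective block-matching algorithm and has been accepted by the MPEG-4 standard as the core motion estimation algorithm. Addressing the shortcomings of MVFAST, the thesis proposes a new block-matching motion estimation algorithm called predictive line diamond search (PLDS). PLDS introduces, for the first time, a line search strategy for motion estimation. The algorithm first uses the temporal correlation of motion to decide whether a block is stationary, then classifies the block to determine the search starting point and performs a line search based on the information at that starting point, and finally determines the motion vector with a line diamond search strategy. Experimental results show that, compared with MVFAST, PLDS both reduces the computational complexity of motion estimation and improves its accuracy.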
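
The robustness trade-off of leaky prediction described in item (1) can be illustrated with a small scalar model. This is only a sketch under stated assumptions, not the MB-based FGS-LP scheme itself: each "frame" is a single number, the base layer is a coarse quantization that always arrives intact, the enhancement residual is coded losslessly, and the names `quantize`, `simulate_loss`, and `alpha` are hypothetical. The enhancement-layer prediction is the base reconstruction plus `alpha` times the previous frame's enhancement detail, so a decoder mismatch is scaled by `alpha` at every subsequent frame.

```python
def quantize(x, step=8):
    """Coarse base-layer reconstruction (assumed to reach the decoder intact)."""
    return step * round(x / step)


def simulate_loss(signal, alpha, lost_frame):
    """Return |decoder - encoder| enhancement reconstruction error per frame.

    alpha is the leak factor in [0, 1]; the enhancement residual of
    frame `lost_frame` is dropped at the decoder (hypothetical loss model).
    """
    enc_prev = dec_prev = base_prev = 0.0
    errors = []
    for t, x in enumerate(signal):
        base = quantize(x)
        # Leaky prediction: base layer plus attenuated previous enhancement detail.
        pred_enc = base + alpha * (enc_prev - base_prev)
        residual = x - pred_enc
        enc_rec = pred_enc + residual  # encoder-side reconstruction, equals x
        # The decoder forms the same prediction from its own (possibly drifted) reference.
        pred_dec = base + alpha * (dec_prev - base_prev)
        dec_rec = pred_dec + (0.0 if t == lost_frame else residual)
        errors.append(abs(dec_rec - enc_rec))
        enc_prev, dec_prev, base_prev = enc_rec, dec_rec, base
    return errors


if __name__ == "__main__":
    frames = [10, 13, 21, 18, 30, 27]
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, simulate_loss(frames, alpha, lost_frame=1))
```

In this model the mismatch after a loss shrinks geometrically by the factor `alpha` per frame: `alpha = 1` never recovers, while `alpha = 0` recovers at the next frame but discards the enhancement-layer reference entirely, which is the efficiency cost the leak factor balances.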

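The lifting construction mentioned in item (5) can be sketched in one dimension. The coefficients of the D97 filter are not given in this abstract, so the constants below are the widely published lifting factorization of the standard CDF 9/7 wavelet, used here as a stand-in; the function names, the even-length restriction, the mirrored boundary handling, and the final scaling convention are assumptions made for illustration. A hardware-oriented variant would replace these constants with values that can be applied using shifts and adds.

```python
# Standard CDF 9/7 lifting constants (not the thesis's D97 values).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
ZETA = 1.149604398  # scaling factor; conventions differ between codecs


def _predict(s, d, c):
    """d[i] += c * (s[i] + s[i+1]), mirroring at the right edge."""
    n = len(d)
    return [d[i] + c * (s[i] + s[min(i + 1, n - 1)]) for i in range(n)]


def _update(s, d, c):
    """s[i] += c * (d[i-1] + d[i]), mirroring at the left edge."""
    return [s[i] + c * (d[max(i - 1, 0)] + d[i]) for i in range(len(s))]


def forward_97(x):
    """One level of the 9/7 transform; returns (low band, high band)."""
    if len(x) % 2:
        raise ValueError("this sketch handles even-length signals only")
    s, d = list(x[0::2]), list(x[1::2])
    d = _predict(s, d, ALPHA)
    s = _update(s, d, BETA)
    d = _predict(s, d, GAMMA)
    s = _update(s, d, DELTA)
    return [v * ZETA for v in s], [v / ZETA for v in d]


def inverse_97(low, high):
    """Undo forward_97 by running the lifting steps backwards with negated signs."""
    s = [v / ZETA for v in low]
    d = [v * ZETA for v in high]
    s = _update(s, d, -DELTA)
    d = _predict(s, d, -GAMMA)
    s = _update(s, d, -BETA)
    d = _predict(s, d, -ALPHA)
    x = [0.0] * (2 * len(s))
    x[0::2], x[1::2] = s, d
    return x


if __name__ == "__main__":
    samples = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0]
    low, high = forward_97(samples)
    print(inverse_97(low, high))
```

Because every lifting step only adds a function of one band to the other, the inverse is obtained by subtracting the same quantity in reverse order, so reconstruction is exact up to floating-point rounding whatever constants are chosen. That property is what allows the multipliers to be simplified for hardware without breaking invertibility.
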
【Abstract】 Video coding is one of the key technologies of multimedia communication. The major task of video coding technologies for storage-based applications is to improve coding efficiency. For video communication applications, however, besides the requirement for higher coding efficiency, the generated bitstreams have to adapt to network bandwidth variations and tolerate transmission errors (such as packet losses on the Internet and random or burst errors in wireless networks). The goals of video coding have shifted from storage to transmission, and from fixed bitstreams to scalable bitstreams. Video coding technologies for communication are therefore a key problem for video communication applications, and research on them is important both for video coding theory and for engineering practice. The thesis systematically studies video coding technologies for communication. The main work is as follows:

(1) To solve the low coding efficiency of fine granularity scalable (FGS) coding in the MPEG-4 standard, a universal scalable coding framework based on FGS is derived, and on the basis of this framework a macroblock-based fine granularity scalable video coding scheme with leaky prediction (MB-based FGS-LP) is presented. A leaky prediction INTER mode is first proposed for enhancement-layer macroblock coding. Then a decision-making mechanism is developed to choose the coding mode and the optimal leaky factor for each enhancement-layer macroblock. Simulation results demonstrate that the proposed scheme provides a further coding efficiency improvement over the MB-based PFGS scheme while keeping its error recovery property.

(2) A new method based on the JPEG2000 bit-plane coding idea is proposed for the enhancement-layer coding of FGS, referred to as Embedded Block Coding Based on DCT (EBC_DCT). In EBC_DCT, the data redundancies between the residual DCT coefficients are eliminated by context-based arithmetic coding. Experimental results show that the proposed method achieves higher coding efficiency than the original FGS enhancement-layer coding in MPEG-4.

(3) Based on the properties of FGS bitstreams, a layered FEC framework is presented. To reduce bandwidth use, the FEC data is separated into two or more layers. The sender decides how many layers to send according to feedback statistics collected over a period of time. Moreover, an adaptive error concealment method is applied to the FGS enhancement layer in the decoder. Experimental results show that the error resilience of FGS bitstreams is clearly improved in bit-error environments.

(4) Three-dimensional (3-D) wavelet coding using a motion-compensated temporal filter (MCTF) is emerging as a very effective structure for highly scalable video coding. The thesis describes a video coding system based on motion-compensated 3-D wavelet coding, referred to as MCL-3DWT. In the new system, the motion-compensated temporal filtering process is realized with lifting filters. Coding efficiency is improved by the use of multiple reference frames and subpixel-accurate motion compensation. The proposed system is applied to several test video clips. Its performance exceeds that of MC-EZBC while maintaining SNR, temporal, and spatial scalability.

(5) The lifting scheme is an efficient method for constructing wavelets, following multiresolution analysis. The thesis constructs a lifting 9-7-tap wavelet filter for hardware implementation, called D97, which not only has excellent compression performance but is also very suitable for ASIC implementation. The computational complexity of the new lifting wavelet filter is greatly reduced because of the introduction of the s
