Research on Motion Estimation and Wavelet Methods in Video Compression (视频压缩的运动估计与小波方法研究)

【Author】 Wang Zhendao (王镇道)

【Advisor】 Zhang Jing (章兢)

【Author information】 Hunan University, Control Theory and Control Engineering, 2008, PhD

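【Abstract, translated from the Chinese】 Images are a primary channel through which people obtain information, and image compression plays a very important role in the processing, storage, and transmission of digital images. Motion estimation and motion compensation are the main means of removing temporal redundancy from video signals and are key techniques in video compression coding. This dissertation studies spatial-domain motion estimation, wavelet-domain motion estimation, and the application of the wavelet transform to image compression. The main work and contributions are as follows:

(1) A spatial-domain motion estimation algorithm that adapts to object motion characteristics is proposed. It consists of a search starting-point prediction model and an adaptive search method. Based on the integrity of objects and the continuity of motion, the prediction model adjusts its parameters as the motion correlation between adjacent blocks changes, so that the prediction lies closer to the best motion vector (MV). According to the gradient of the sum of absolute differences (SAD) and the motion characteristics of the object, the adaptive search adjusts the size and shape of the search pattern, which speeds up the search. Experimental results show that the algorithm outperforms the other fast motion estimation algorithms it was compared with in peak signal-to-noise ratio (PSNR), search speed, and subjective quality of the reconstructed image.

(2) A wavelet-domain initial MV prediction method and a cross-search motion estimation method are proposed. The prediction method exploits the multiresolution property of the wavelet transform and the temporal and spatial correlation of blocks, combined with full search in the low-frequency subband, to obtain a more accurate initial MV with very little added computational complexity. Given the characteristics of wavelet subband coefficients, after low-band shifting the cross search proceeds along the horizontal and vertical directions and approaches the optimal MV step by step by adjusting the cross-search center. Motion estimation is carried out level by level of the wavelet transform. At each resolution level the search range is confined to the 4 shifted subbands of the reference frame at the corresponding transform level, so the result is the optimal MV at the current resolution, which guarantees that the decoded image is the best image obtainable from the data received so far. Simulation results verify the effectiveness of the initial MV prediction and the cross-search method.

(3) By analyzing how the update algorithm of motion-compensated temporal filtering (MCTF) affects the PSNR of the reconstructed image, a content-adaptive MCTF inverse motion compensation algorithm is studied. Using the mean sum-of-absolute-differences calculation for a simple quantizer with independent coding of the source samples, the dissertation analyzes how the energy gain factors of the high- and low-frequency subbands at the encoder change when the update step is omitted, and derives the resulting PSNR variation between odd and even frames at the decoder, caused by energy normalization, with and without the update step. Based on the principle that the MVs of the prediction and update steps are inverses of each other, a method for obtaining the inverse motion compensation vectors is proposed. Simulations show that this method improves the PSNR of the reconstructed image. An estimation model for flat image regions is built from the characteristics of the wavelet subband coefficients, and an estimation model for motion estimation accuracy is built from the energy of the high-frequency subbands after motion estimation. Together they yield a content-adaptive MCTF inverse motion compensation algorithm that reduces ghosting artifacts and overcomes the inverse compensation errors in the low-frequency subband that inaccurate motion estimation and the update step can cause. Simulations confirm the effectiveness of the method.

(4) An efficient, low-power VLSI architecture for a two-dimensional wavelet transformer is designed, and a video compression system scheme and a system optimization method that use this transformer are proposed. The transformer uses a separable two-dimensional structure based on the lifting algorithm, with optimized data scheduling for the row and column wavelet processors. With Z-shaped (zigzag) scanning and a small number of temporary registers that buffer the column filtering results, the row and column processors work in parallel, which raises data throughput and hardware utilization. By optimizing the data scheduling of the column processor, each point requires only one data read or one data store, which cuts memory access bandwidth by 50%. Single-port RAM can therefore replace dual-port RAM, greatly reducing the chip area and power consumed by memory. The whole design was verified on an FPGA, synthesized and laid out with the HJTC 0.18 μm process library, and the resulting chip passed tape-out verification.

(5) A video compression system design is proposed for the rotary kiln used in alumina clinker sintering. The system comprises a dedicated multimedia DSP, a two-dimensional discrete wavelet transformer, a camera, and network transmission. Simulation experiments on rotary kiln flame images achieved compression and reconstruction of images of the dark coal-particle region of the flame (黑把子) at different spatial resolutions, temporal resolutions, and PSNR levels, and the region was identified accurately at very low bit rates, which shows that the system can be applied effectively to the compression of rotary kiln flame images.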
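Contribution (1) builds on block matching with a SAD cost and a predicted search starting point. The following is a minimal Python sketch of that general idea only, not the dissertation's algorithm: the component-wise median predictor, the fixed four-point cross pattern, and all function names (`sad`, `median_predictor`, `cross_search`) are assumptions made for illustration, whereas the dissertation describes an adaptive prediction model and a search pattern whose size and shape change.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    `cur` and the block displaced by (dx, dy) in `ref`.
    Returns infinity when the displaced block leaves the reference frame."""
    h, w = len(ref), len(ref[0])
    if not (0 <= bx + dx and bx + dx + n <= w and 0 <= by + dy and by + dy + n <= h):
        return float("inf")
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n)
        for i in range(n)
    )


def median_predictor(neighbour_mvs):
    """Assumed predictor: component-wise median of the neighbouring blocks'
    motion vectors, or (0, 0) when no neighbour is available."""
    if not neighbour_mvs:
        return (0, 0)
    xs = sorted(mv[0] for mv in neighbour_mvs)
    ys = sorted(mv[1] for mv in neighbour_mvs)
    mid = len(xs) // 2
    return (xs[mid], ys[mid])


def cross_search(cur, ref, bx, by, n, start=(0, 0), max_steps=16):
    """Greedy descent on the SAD surface with a fixed four-point cross
    pattern, beginning at the predicted starting point `start`."""
    best = start
    best_cost = sad(cur, ref, bx, by, best[0], best[1], n)
    for _ in range(max_steps):
        candidates = [
            (best[0] + dx, best[1] + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
        ]
        cost, mv = min((sad(cur, ref, bx, by, c[0], c[1], n), c) for c in candidates)
        if cost >= best_cost:  # no neighbour improves the match: stop
            break
        best, best_cost = mv, cost
    return best, best_cost
```

A greedy search like this can stall in a local minimum of the SAD surface, which is one reason a starting point close to the true motion vector matters: the closer the prediction, the fewer steps are needed and the less likely the search is to stall.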

【Abstract】 Images are a main source of information, and image compression is crucial to the processing, storage, and transmission of digital images. Motion estimation and compensation play a vital role in video compression coding by reducing temporal redundancy. This dissertation focuses on motion estimation algorithms in the spatial and wavelet domains and also investigates applications of the wavelet transform in image compression. The main research is as follows.

Based on the motion characteristics of objects in the spatial domain, an adaptive motion estimation (ME) algorithm is proposed, composed of an alterable search pattern and an adaptive model for starting-point prediction. Following the integrity and motion continuity of objects, the parameters of the prediction model are modified as the correlation among adjacent blocks varies, which reduces the difference between the prediction and the best motion vector (MV). The alterable search pattern improves search efficiency by adjusting its size and shape according to the gradient of the sum of absolute differences (SAD), which reflects the motion characteristics of the adjacent blocks. Experimental results demonstrate that the proposed algorithm outperforms the other fast motion estimation algorithms considered in this dissertation in PSNR, search speed, and subjective quality of the reconstructed image.

An initial MV prediction approach is presented together with a cross-search motion estimation algorithm. Combined with full search in the low-frequency subband, the prediction approach obtains a more accurate initial MV with little computational overhead by exploiting the multiresolution characteristic of the wavelet transform and the temporal and spatial correlation between blocks. Owing to the characteristics of the subband coefficients, the cross search is performed only along the horizontal or vertical direction after low-band shifting. The search approaches the best MV by repeatedly adjusting the cross-search center. Motion estimation is performed level by level of the wavelet transform, and the search range is confined to the 4 subbands of the reference frame at the same transform level. As a result, the MV found is the best one at the current resolution, which ensures that the reconstructed image quality is optimal for the data received so far. The simulation results validate the prediction approach and the search algorithm.

Based on the impact of the update step in motion-compensated temporal filtering (MCTF) on the PSNR of the reconstructed image, a content-adaptive inverse motion compensation (IMC) algorithm is proposed. When the update step is skipped, the energy gain factors change correspondingly in both the high- and low-frequency subbands at the encoder side. Using the SAD calculation method of a simple scalar quantizer with independent coding of the source samples, the PSNR fluctuation between even and odd frames can be determined with and without the update step. Additionally, a way of selecting MVs for IMC is studied, based on the principle that the MVs of the prediction and update steps are inverses of each other. Simulations show that the proposed method improves the PSNR of the reconstructed image. Furthermore, a content-adaptive IMC algorithm for MCTF, composed of two estimation models, is put forward to reduce ghosting artifacts. One model estimates low-activity regions from the characteristics of the subband coefficients, and the other estimates ME accuracy from the energy of the high-frequency subbands after ME. Artifacts caused by inaccurate ME or by the update step can be reduced effectively with these estimation models. Experimental results verify the efficiency of the adaptive IMC algorithm.

A high-efficiency, low-power VLSI architecture is designed for the two-dimensional discrete wavelet transform (2D-DWT). In the separable, lifting-scheme-based 2D-DWT architecture, data dispatch is optimized in both the row and column processors. Scanning in zigzag order, the processors work in parallel by way of a few temporary buffers that store the data filtered by the column processor, which improves transform speed and hardware utilization. With the optimized data dispatch in the column processor, only one read or one write operation per clock cycle is needed for the temporary buffer, so the memory access bandwidth is reduced to 50 percent of the usual case, which requires one read and one write. As a result, the line buffer can be single-port RAM instead of dual-port RAM, and chip area and power dissipation decrease significantly. After verification on an FPGA, the implementation was synthesized with the HJTC 0.18 μm cell library, and the layout was designed for tape-out. The fabricated CMOS chip validates the VLSI architecture and its implementation.

Finally, an image compression scheme, comprising a DSP for multimedia processing, a 2D-DWT transformer, a video camera, and network transmission, is designed for rotary kiln alumina production. Experiments compress and reconstruct the flame image in the rotary kiln under different PSNR conditions at various spatial and temporal resolutions. The coal particle region can be identified accurately at very low bit rates. Experimental results show the validity of the proposed scheme for flame image compression in rotary kilns.
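The 2D-DWT described above is a separable, lifting-based transform: a 1-D lifting transform is applied to the rows and then to the columns. The abstract does not say which wavelet filter the design uses, so the sketch below assumes the reversible integer 5/3 (LeGall) filter with symmetric boundary extension. It is a software illustration of lifting only, with hypothetical function names, and does not model the zigzag scan, the buffer scheduling, or any other aspect of the VLSI architecture.

```python
def lift_53_forward(x):
    """One level of the integer 5/3 lifting transform on an even-length list.
    Returns (low, high): the low-pass and high-pass halves."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch handles even-length input only"
    even, odd = x[0::2], x[1::2]
    half = n // 2
    # Predict step: each odd sample minus the floor-average of its two even
    # neighbours (the right edge is mirrored).
    high = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    # Update step: each even sample plus a rounded quarter of the two
    # neighbouring high-pass values (the left edge is mirrored).
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2) for i in range(half)]
    return low, high


def lift_53_inverse(low, high):
    """Exact inverse: undo the update step, then undo the predict step."""
    half = len(low)
    even = [low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2) for i in range(half)]
    odd = [high[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1) for i in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


def dwt2_53(img):
    """Separable one-level 2-D transform: rows first, then columns.
    The low-low subband ends up in the top-left quadrant of the result."""
    row_out = [lo + hi for lo, hi in (lift_53_forward(list(r)) for r in img)]
    col_out = [lo + hi for lo, hi in (lift_53_forward(list(c)) for c in zip(*row_out))]
    return [list(r) for r in zip(*col_out)]  # transpose back to row-major
```

Because each lifting step only adds to one half of the samples a function of the other half, the inverse simply subtracts the same quantities in reverse order, so reconstruction is exact in integer arithmetic.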

  • 【Online publication contributor】 Hunan University
  • 【Online publication year/issue】 2009, Issue 08