Research on Audio Perceptual Coding Model and Related Key Technologies (音频感知编码模型及关键技术的研究)

【Author】 Li Lin (李琳)

【Advisor】 Guo Li (郭立)

【Author information】 University of Science and Technology of China, Circuits and Systems, 2008, Ph.D.

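【Abstract, translated from the Chinese original】 Research on audio perceptual coding models concentrates on two areas: audio compression coding algorithms, and implementation techniques for audio codecs. With the spread of mobile communication networks, audio content is distributed more frequently and conveniently, but mobile terminals have limited computing power and storage capacity. Low-complexity, high-quality audio coding algorithms and their system implementation have therefore become one of the active research topics in digital audio processing. To realize a low-complexity, high-quality audio codec, this dissertation works on two levels. At the algorithm level, it takes the AAC coding system, which has outstanding advantages among audio perceptual coding models, as the object of study and optimizes the algorithms of its three key modules (frequency-domain transform, psychoacoustic analysis, and quantization coding), reducing computational complexity and encoding time while preserving coding quality. At the implementation level, it adopts an SOPC design strategy in which a soft-core microprocessor and dedicated IP cores cooperate through hardware/software co-design, and it implements a low-complexity, high-quality AAC codec system on an FPGA development platform. The main work and contributions are as follows:

(1) The filterbank is a computation-intensive module of the audio perceptual coding model and accounts for a large share of the computation. This dissertation studies fast filterbank algorithms and proposes two improved MDCT/IMDCT schemes, one based on a recursive structure and one based on an N/8-point FFT kernel. Both are suited to IP core design and allow the MDCT and IMDCT to share circuitry. The first scheme has a regular circuit structure, uses few hardware resources, runs fast, and has high throughput; compared with existing recursive algorithms, it needs only N²/16 cycles to complete an N-point MDCT/IMDCT. The second scheme, compared with the currently popular approach based on an N/4-point FFT kernel, adds some adders but requires fewer multipliers, lowers the computation error, and nearly doubles the computation speed.

(2) To eliminate pre-echo, the psychoacoustic module of the audio perceptual coding model performs transient analysis to judge whether the signal is transient, which guides the adaptive switching between long and short blocks in transform coding. Combining the characteristics of human hearing with those of audio coding, this dissertation drafts a fitting-model framework for the auditory perception threshold. It also analyzes the shortcomings of block-type selection based on perceptual entropy and proposes a simple transient analysis method, time-domain peak detection, which quickly determines in the time domain whether an audio signal is transient. Stationary and transient signals are then transformed with different window lengths to obtain good time and frequency resolution. The computation speed of the psychoacoustic model increases with little effect on audio quality.

(3) The audio perceptual coding model uses Brandenburg's two-loop quantization structure, which gives good coding quality but converges slowly, needs many iterations, and cannot run in real time. Following the design idea of the original quantization module, this dissertation proposes a quantization-coding structure based on noise prediction. By determining the constraint between the common scalefactor and the per-band scalefactors, it narrows the iteration range of the quantization step, speeds up convergence, and reduces the computational complexity of the quantization module. Compared with the original two-loop iterative structure, the computation speed doubles with little effect on audio quality. Correspondingly, for the inverse quantization module, an improved look-up table method is proposed that, compared with existing algorithms, reduces storage by 50% and keeps the computation error within the order of 10⁻⁶.

(4) To meet the real-time operation and programmability requirements of embedded systems, this dissertation proposes a programmable experimental model of a digital audio codec system based on an SOPC architecture. With MPEG AAC as the experimental object, algorithmic improvements to the key modules, hardware optimization of some circuits, and hardware/software co-design reduce the computational complexity of encoding and decoding. With coding quality preserved, the encoding speed of the system doubles and real-time decoding is achieved. Evaluation with subjective and objective assessment systems gave good coding-quality scores.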
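Contribution (1) concerns fast MDCT/IMDCT computation. As a point of reference, the following minimal sketch evaluates the textbook MDCT and IMDCT definitions directly, which costs N²/2 multiply-accumulates per N-sample block. It is not the recursive or N/8-point FFT scheme of the dissertation, whose details the abstract does not give; the function names, the sine window, and the test signal are assumptions chosen for illustration. A direct form like this is mainly useful as a baseline against which a fast implementation could be checked.

```python
import math


def sine_window(N):
    # Sine window; satisfies the Princen-Bradley condition w[n]^2 + w[n + N/2]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]


def mdct(block):
    # Direct MDCT: N input samples -> N/2 coefficients.
    N = len(block)
    n0 = 0.5 + N / 4.0
    return [
        sum(block[n] * math.cos(2.0 * math.pi / N * (n + n0) * (k + 0.5))
            for n in range(N))
        for k in range(N // 2)
    ]


def imdct(coeffs):
    # Direct IMDCT: N/2 coefficients -> N time-aliased samples.
    half = len(coeffs)
    N = 2 * half
    n0 = 0.5 + N / 4.0
    return [
        (2.0 / half) * sum(
            coeffs[k] * math.cos(2.0 * math.pi / N * (n + n0) * (k + 0.5))
            for k in range(half))
        for n in range(N)
    ]


def tdac_roundtrip(signal, N):
    # Window, transform, inverse-transform, window again, then overlap-add
    # with a hop of N/2 so that the time-domain aliasing cancels.
    M = N // 2
    w = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - N + 1, M):
        frame = [signal[start + n] * w[n] for n in range(N)]
        rec = imdct(mdct(frame))
        for n in range(N):
            out[start + n] += rec[n] * w[n]
    return out


if __name__ == "__main__":
    sig = [math.sin(0.3 * i) + 0.5 * math.cos(1.1 * i) for i in range(40)]
    rec = tdac_roundtrip(sig, 16)
    # Only samples covered by two overlapping frames are reconstructed.
    print(max(abs(rec[i] - sig[i]) for i in range(8, 32)))
```

The first and last half-blocks are covered by a single frame only, so they are not expected to match the input; a real codec handles this with priming or padding.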

【Abstract】 Research on audio perceptual coding technology concentrates on two aspects: the optimization of audio compression algorithms, and the hardware design and implementation of those algorithms. With the spread of mobile networks, audio products are distributed more frequently and conveniently. Because mobile terminals have limited computing capability and storage capacity, realizing an audio coding system with low complexity and high quality has become one of the most active research topics in digital audio processing. To achieve a high-quality audio codec with low complexity, this dissertation focuses on two improvements. First, the key technologies of AAC (frequency transform, psychoacoustic analysis, and quantization) are optimized at the algorithm level to reduce computational complexity. Second, based on the SOPC design strategy, a real-time MPEG AAC codec system is implemented by combining a soft-core microprocessor with IP cores. The main work and innovations are as follows:

(1) The filterbank module is a computation-intensive part of the audio perceptual coding model and accounts for a large share of the computation. Two methods for accelerating the filterbank are proposed: one based on a recursive structure and the other on an N/8-point FFT kernel. Both are suitable for an IP core design that serves the MDCT and the IMDCT. Compared with other recursive algorithms, the first approach reduces the computation to N²/16 cycles and performs better in computation speed, data throughput, and hardware utilization. Although existing algorithms based on an N/4-point FFT kernel need fewer adders, the second method reduces the number of multipliers required and doubles the computation rate.

(2) To eliminate the impact of pre-echo, the psychoacoustic module of the audio perceptual coding model uses transient analysis to switch the transform length adaptively. Based on the characteristics of human hearing and of audio compression techniques, a hypothesis for a perceptual threshold model is presented. In addition, a time-domain block-switching method is used in place of the PE-based (perceptual entropy) algorithm, and it detects transient signals quickly. As a result, the computing speed of the psychoacoustic model rises with little effect on audio quality.

(3) The quantization module of the audio coding system employs the Brandenburg architecture to obtain good quality, but its high complexity makes it unsuitable for real-time applications. A simplification of the dual-loop structure is proposed on the basis of an approximate noise formula. Using the relation between the common scalefactor and the scalefactors of the individual scalefactor bands, the iteration range of the quantization step is narrowed to speed up convergence. Experimental results show that the sound reconstructed with the proposed approach has almost the same quality as that reconstructed with the original quantization module. In the decoding system, a modified look-up table method is presented for the inverse non-uniform quantization. Compared with existing methods, it reduces storage by 50% and lowers the calculation error.

(4) A programmable model of a digital audio codec is developed on the SOPC architecture. Taking MPEG AAC as an example, hardware/software co-design is used to reduce the computational complexity of the codec system. The FPGA implementation reports show that the codec system achieves a higher coding rate and decodes in real time. The results of both objective and subjective evaluation tests indicate that the codec delivers good audio quality.
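Contribution (3) mentions a look-up table for inverse non-uniform quantization. The sketch below shows only the plain baseline that such a method would improve on: a full table of |q|^(4/3) followed by the scalefactor gain, as AAC dequantization is commonly described. The reduced-storage variant of the dissertation is not specified in the abstract and is not reproduced here. The names `POW43_TABLE` and `dequantize`, and the constants `MAX_Q = 8191` and `SF_OFFSET = 100`, are assumptions based on the usual description of AAC and should be checked against the standard.

```python
import math

MAX_Q = 8191      # assumed largest quantized magnitude in AAC
SF_OFFSET = 100   # assumed scalefactor offset in AAC

# Full baseline table: one entry of q^(4/3) per possible magnitude.
POW43_TABLE = [q ** (4.0 / 3.0) for q in range(MAX_Q + 1)]


def dequantize(q, scalefactor):
    # Inverse non-uniform quantization of one coefficient:
    # sign(q) * |q|^(4/3) * 2^((scalefactor - SF_OFFSET) / 4).
    if abs(q) > MAX_Q:
        raise ValueError("quantized value out of table range")
    magnitude = POW43_TABLE[abs(q)]
    gain = 2.0 ** (0.25 * (scalefactor - SF_OFFSET))
    return math.copysign(magnitude, q) * gain


if __name__ == "__main__":
    for q, sf in [(8, 100), (-8, 104), (0, 120)]:
        print(q, sf, dequantize(q, sf))
```

A table of this size trades memory for the cost of evaluating a fractional power per coefficient, which is the storage pressure that motivates reduced-table methods on embedded targets.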

  • 【Classification number】 TN912.3
  • 【Times cited】 6
  • 【Downloads】 651