
The Mixed Excitation Linear Prediction Speech Coding Algorithm and Its Implementation Based on DSP

【Author】 Wang Jun

【Advisor】 Zhao Jiyin

【Author information】 Jilin University, Communication and Information Systems, 2004, Master's thesis

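【Summary】 Introduction

In mobile, satellite, and military communication systems, speech coding plays an important role in compressing the transmission bandwidth of the speech signal and lowering the channel bit rate, which in turn raises channel utilization. Speech coding has advanced rapidly in recent years, and with progress in signal processing and communications, research has concentrated on the study and implementation of low and very low bit rate coding algorithms. The traditional LPC vocoder uses a simple binary excitation model that cannot model the characteristics of real speech well, so its synthesized speech has poor quality and robustness. The code-excited linear prediction (CELP) low-rate algorithm searches an adaptive codebook and a fixed codebook for the best code vectors to serve as excitation, under a minimum perceptually weighted error criterion. It can synthesize fairly high quality speech at 8 to 16 kbps. When the rate is lowered further, too few bits remain to represent the excitation vector, and the quality of the synthesized speech falls off quickly. In recent years, the main representative algorithms in research on speech coding at 4 kbps and below, in China and abroad, have been AMBE, MELP, WI, and STC. All of them greatly reduce the transmitted bit rate and so save bandwidth. Among current low bit rate approaches, mixed excitation linear prediction (MELP) coding is a comparatively good method, and 2.4 kbps MELP has been adopted as the new US federal speech coding standard. The algorithm combines the strengths of the LPC and MBE algorithms and yields good reproduced speech at a low bit rate. Building on an analysis of the FTR 1024A 2.4 kbps MELP algorithm, this thesis studies its core algorithms in detail with extensive experiments, improves pitch detection, the transmission of LSF coefficients, vector quantization, and speech synthesis, and proposes an improved MELP low-rate speech coding algorithm with a bit rate of about 1.8 kbps.

I. The improved MELP low-rate speech coding algorithm

1. Establishing the MELP model

The standard MELP algorithm builds on the traditional LPC vocoder and adds five features: (1) mixed excitation, (2) aperiodic pulses, (3) adaptive spectral enhancement, (4) pulse dispersion, and (5) Fourier magnitude modeling. These additions substantially improve the excitation source of the original LPC parametric model and remove the mechanical or buzzy tonal noise that sometimes appears in LPC synthesized speech. They let the MELP algorithm model more of the characteristics of natural speech, so that the MELP vocoder can produce high quality speech at low bit rates, making it one of the most promising methods in low-rate speech coding.

Unlike the simple voiced/unvoiced decision of LPC10, MELP uses a mixed excitation source: a bank of band-pass filters splits the speech signal into five sub-bands, a voiced/unvoiced decision is made for each band, and at the synthesis end the five sub-band signals are summed to form the mixed excitation. Its main function is to reduce the buzz of the LPC vocoder. When the input is voiced, the MELP coder can synthesize speech with either periodic or aperiodic pulses. Aperiodic pulses are used mostly in segments where speech transitions between unvoiced and voiced. As a result, the decoder can reproduce irregular glottal pulses without introducing extra tones. The adaptive spectral enhancement filter is a pole/zero filter whose purpose is to give the synthesized speech a better waveform match to natural speech in the resonance (formant) regions. Pulse dispersion post-processes the synthesized speech with a fixed pulse-shaping filter. It spreads the energy of the excitation signal over the whole pitch period. This gives a better waveform match to the original speech outside the resonance regions and helps remove some harsh noise from the synthesized speech. In the encoder, the residual obtained by LPC inverse filtering is Fourier transformed, and the values of its first 10 harmonics are quantized and sent to the decoder, where they are used to synthesize the periodic pulses. This helps improve the naturalness of the synthesized speech, especially for male voices and in the presence of background noise.

2. Speech analysis

The input speech is first preprocessed by a high-pass filter with a 60 Hz cutoff, which suppresses 50 Hz mains interference. The pitch period is then extracted with the normalized pitch detection algorithm proposed in this thesis. The algorithm uses the signal of the previous and following frames together with the long-term average pitch period, which keeps the pitch period continuous across adjacent frames. Fractional pitch is searched by linear interpolation, which improves the precision of the estimate. Classical algorithms sometimes detect a multiple of the true pitch period; this algorithm applies a multiple check to remove that estimation error. Extensive experiments show that the algorithm has the accuracy and reliability of pitch smoothing algorithms while still extracting the pitch estimate in real time within the current frame.

MELP coding is an LPC-based parametric coding method and, like all traditional LPC-based analysis-synthesis methods, analyzes and transmits its parameters frame by frame. The weakness of this practice is that it does not exploit the slow variation of the vocal tract response during speech production, that is, the similarity between adjacent frames. This thesis uses a normalized autocorrelation function to measure the similarity of the LPC coefficients of adjacent frames; when the similarity exceeds a threshold, the LPC coefficients of the current frame need not be sent, and those of the preceding frame are used instead. Experiments show that with this method about 50% of speech frames can have their LPC coefficients substituted in this way, which greatly reduces the bit rate with little effect on the quality of the reproduced speech.

Next, the sub-band voicing strengths and the aperiodic pulse flag are determined, the LPC coefficients are derived with the Durbin algorithm, the peak value of the residual is computed to update the sub-band voicing strengths, and then the gain is computed and the average pitch period updated. The input signal is passed through the linear prediction filter formed from the quantized prediction coefficients to obtain the residual, and the Fourier magnitudes at the first ten pitch harmonics of the residual are computed.

3. Parameter coding and decoding

Speech analysis yields the speech parameters of the algorithm. Table 4-1 gives the bit allocation of the coding scheme. The pitch period is converted to a logarithm and quantized with a 99-level uniform quantizer, and the result is mapped by table lookup to a 7-bit codeword. Gain is quantized with 8 bits in total: 5 bits with a uniform quantizer, and the remaining 3 bits in a separate quantization step. The sub-band voicing strength (Bpvc) is quantized with 4 bits. The standard MELP algorithm uses four-stage vector quantization with 8 search paths. Considering that the codebooks used in the standard MELP algorithm are too large, and that the compensation contributed by the fourth stage of the quantization codebook is still relatively large, this…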
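The logarithmic pitch quantization described under parameter coding can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the lag range of 20 to 160 samples is an assumption (the text states only the 99 levels and the 7-bit codeword), and `to_codeword` is a hypothetical stand-in for the lookup table, which the source does not reproduce.

```python
import math

PITCH_MIN = 20.0   # assumed shortest pitch lag in samples (not given in the text)
PITCH_MAX = 160.0  # assumed longest pitch lag in samples (not given in the text)
LEVELS = 99        # number of uniform levels stated in the abstract

LOG_MIN = math.log10(PITCH_MIN)
LOG_MAX = math.log10(PITCH_MAX)
STEP = (LOG_MAX - LOG_MIN) / (LEVELS - 1)  # uniform step in the log domain


def quantize_pitch(pitch):
    """Return the index (0..98) of the level nearest to log10(pitch)."""
    clipped = min(max(pitch, PITCH_MIN), PITCH_MAX)
    index = round((math.log10(clipped) - LOG_MIN) / STEP)
    return min(max(index, 0), LEVELS - 1)


def dequantize_pitch(index):
    """Map a level index back to a pitch lag in samples."""
    return 10.0 ** (LOG_MIN + index * STEP)


def to_codeword(index):
    """Hypothetical table stand-in: the index written as 7 binary digits."""
    return format(index, "07b")


if __name__ == "__main__":
    idx = quantize_pitch(57.3)
    print(idx, to_codeword(idx), dequantize_pitch(idx))
```

Because the step is uniform in the logarithm, the relative error is the same at short and long lags; with the assumed range it stays near 1%. The 99 levels occupy only 99 of the 128 possible 7-bit codewords, so 29 codewords remain unused in this sketch.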

【Abstract】 Introduction

In mobile, satellite, and military communication systems, speech coding plays an important role in raising channel utilization by compressing the transmission bandwidth and reducing the transmission bit rate of the speech signal. Speech coding has advanced rapidly in recent years. With the development of signal processing and communication technology, research has focused on the study and realization of low and very low bit rate speech coding algorithms. The traditional LPC vocoder models the speech signal too simply, classifying the whole spectrum as either voiced or unvoiced, so the synthesized speech lacks naturalness and robustness. The code-excited linear prediction (CELP) algorithm constructs the LPC excitation signal by choosing vectors from two codebooks, an “adaptive” codebook and a “stochastic” codebook, under a minimum perceptually weighted error criterion. It achieves high synthesized speech quality, but the quality degrades quickly as the coding bit rate is lowered further. In recent years, the representative algorithms for speech coding at 4 kbps and below have included AMBE, MELP, WI, and STC. These algorithms greatly reduce the coding rate and thereby save bandwidth. MELP is a good algorithm for current low bit rate speech coding, and the MELP coder has been adopted as the new US Federal Standard at 2.4 kbps. The algorithm combines the merits of the LPC and MBE algorithms. Careful research and many experiments have been carried out on speech analysis, parameter coding and decoding, and speech synthesis. New methods and improvements are applied to pitch detection, vector quantization, and the transmission of LPC parameters. An improved 1.8 kbps MELP coding algorithm is proposed.

I. An Improved MELP Low Bit Rate Speech Coding Algorithm

1. MELP Model

The MELP coder is based on the LPC model with additional features including mixed excitation, aperiodic pulses, adaptive spectral enhancement, pulse dispersion filtering, and Fourier magnitude modeling. These additions substantially improve the excitation structure of the LPC model and eliminate the mechanical tonal noise that appears in LPC speech synthesis. They allow the MELP vocoder to model natural speech more accurately, so that it can synthesize high quality speech. It has become one of the most promising low bit rate speech coding methods.

Unlike the simple voiced/unvoiced decision of LPC10, the MELP vocoder adopts mixed excitation. Each frame is divided into five bands, and a voiced/unvoiced decision is made in every band. The five sub-band signals are summed to yield the mixed excitation, which reduces the buzz of the LPC vocoder. When the input signal is voiced, the MELP encoder can synthesize the speech with periodic or aperiodic pulses. Aperiodic pulses are mostly used in transitions between unvoiced and voiced speech. They can form irregular glottal pulses without introducing other tones. The adaptive spectral enhancement filter is a pole/zero filter that gives the synthesized speech a better waveform match to natural speech in the resonance regions. Pulse dispersion filtering processes the synthesized speech with a fixed pulse-shaping filter. It spreads the energy of the excitation signal over the whole pitch period. This gives the synthesized speech a better waveform match to natural speech outside the resonance regions and helps remove some harsh noise. In the encoder, a Fourier transform is applied to the residual signal obtained by LPC inverse filtering, and the first 10 harmonic values are quantized and passed to the decoder, where they are used to synthesize the periodic pulses. This helps improve the naturalness of the synthesized speech, especially for male voices and background noise.

2. Speech Analysis

The input signal is first preprocessed by a 60 Hz high-pass filter, with the purpose of suppressing 50Hz p
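The pitch step of the speech analysis stage is described as resting on a normalized correlation measure plus a check against pitch multiples. The sketch below illustrates only those two ideas. It is not the thesis's algorithm, which also draws on the previous and following frames, the long-term average pitch, and fractional-lag interpolation. The lag range of 20 to 160 samples and the 0.9 acceptance ratio are assumed values, and both function names are hypothetical.

```python
import math


def normalized_autocorr(x, lag):
    """Normalized correlation between x[n] and x[n + lag], within [-1, 1]."""
    n = len(x) - lag
    num = sum(x[i] * x[i + lag] for i in range(n))
    e0 = sum(x[i] * x[i] for i in range(n))
    e1 = sum(x[i + lag] * x[i + lag] for i in range(n))
    den = math.sqrt(e0 * e1)
    return num / den if den > 0.0 else 0.0


def estimate_pitch(x, min_lag=20, max_lag=160, ratio=0.9):
    """Pick the best-correlated lag, then test its submultiples.

    x must hold more than max_lag samples. min_lag, max_lag and ratio
    are assumed values chosen for illustration.
    """
    scores = {lag: normalized_autocorr(x, lag)
              for lag in range(min_lag, max_lag + 1)}
    best = max(scores, key=scores.get)
    if scores[best] <= 0.0:
        return best  # no usable periodicity; nothing to refine
    # Multiple check: the strongest peak may sit at 2x, 3x, ... the true
    # period, so accept the shortest submultiple that scores almost as well.
    for divisor in range(best // min_lag, 1, -1):
        candidate = round(best / divisor)
        if candidate >= min_lag and scores[candidate] >= ratio * scores[best]:
            return candidate
    return best


if __name__ == "__main__":
    # Synthetic voiced-like frame with a true period of 40 samples.
    frame = [math.sin(2 * math.pi * n / 40) + 0.5 * math.sin(4 * math.pi * n / 40)
             for n in range(360)]
    print(estimate_pitch(frame))
```

For an exactly periodic frame the correlation is equally high at every multiple of the period, which is why the submultiple test, and not the raw maximum, decides the result.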

  • 【Online publication contributor】 Jilin University
  • 【Online publication issue】 2004, No. 04
  • 【Classification number】 TN912.3
  • 【Times cited】 1
  • 【Downloads】 446