
Research on Speech Problems Based on the Prototype Waveform Interpolation Algorithm (基于原型波形内插算法的语音问题的研究)

Study of Speech Based on Prototype Waveform Interpolation

【Author】 史学晶 (Shi Xuejing)

【Advisor】 赵淑清 (Zhao Shuqing)

【Author information】 Beijing University of Chemical Technology, Detection Technology and Automatic Equipment, 2004, Master's thesis

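【Summary】 This thesis studies speech coding based on the prototype waveform interpolation (PWI) algorithm and the use of that algorithm for tone adjustment in Chinese speech synthesis. The PWI algorithm was first proposed by Dr. W. B. Kleijn of AT&T Bell Laboratories in the United States. It exploits the periodicity of voiced speech, treating voiced speech as a concatenation of slowly varying pitch-cycle waveforms: a single pitch-cycle waveform is extracted every 20 to 30 ms, and the speech signal is then reconstructed by interpolating at the update points. The thesis gives a systematic account of the basic principle of PWI and of its implementation. Then, building on a study of the Regular Pulse Excitation-Long Term Prediction (RPE-LTP) speech coding scheme (13 kb/s), it uses PWI to propose a 4.8 kb/s coding scheme for voiced speech, which greatly lowers the coding rate. Computer simulations show that the quality of the coded speech is comparable to that of the GSM coding scheme. The thesis also studies the application of PWI to speech synthesis, in particular to tone adjustment. The traditional pitch-synchronous overlap-add (PSOLA) algorithm has good prosody-modification capability, but it also has shortcomings: when the pitch frequency is modified by too large an amount, severe spectral-envelope distortion may occur, that is, the formant characteristics change unacceptably. The thesis combines PWI with PSOLA to remedy this defect.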

【Abstract】 This thesis presents a speech coding scheme and a tone modification method for Chinese speech synthesis, both based on the prototype waveform interpolation (PWI) algorithm. PWI was proposed by Dr. W. B. Kleijn while he was working at AT&T Bell Laboratories. Exploiting its periodicity, voiced speech is interpreted as a concatenation of slowly evolving pitch-cycle waveforms. The waveform of a single pitch cycle, referred to as the prototype waveform, is transmitted at regular intervals (of 20-30 ms), and the signal is then interpolated between these update points. The principle and implementation of prototype waveform interpolation are analyzed and described in detail. Building on a study of the Regular Pulse Excitation-Long Term Prediction (RPE-LTP) speech coding scheme (13 kb/s), a voiced-speech coding scheme that uses PWI at 4.8 kb/s is described, which greatly reduces the coding rate. Computer simulation results show that the synthesized speech quality of the PWI scheme is close to that of the original RPE-LTP scheme. In addition, the thesis presents a PWI-based tone modification method for Chinese speech synthesis. The traditional time-domain pitch-synchronous overlap-add (PSOLA) algorithm can transform the prosodic features of Chinese speech, but it has some shortcomings: when the pitch frequency is modified by a large amount, the spectral envelope is distorted, that is, the formant features vary unacceptably. The thesis combines the PWI and PSOLA algorithms and improves the result of tone modification.
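The interpolation step described in the abstract can be illustrated with a minimal sketch. It is not the coder developed in the thesis: the function names `sample_cycle` and `pwi_interpolate` are hypothetical, the 8 kHz sampling rate is an assumption, and the linear interpolation of both the waveform shape and the pitch period is chosen here only for simplicity. Quantization, alignment of successive prototypes, and the handling of unvoiced speech are all left out.

```python
import math


def sample_cycle(proto, phase):
    """Evaluate one periodic pitch-cycle waveform at a phase in cycles.

    `proto` holds one pitch cycle; linear interpolation between
    neighbouring samples is an assumption made for this sketch.
    """
    n = len(proto)
    pos = (phase % 1.0) * n          # position inside the cycle, in samples
    whole = int(pos)
    frac = pos - whole
    i = whole % n
    return (1.0 - frac) * proto[i] + frac * proto[(i + 1) % n]


def pwi_interpolate(proto_a, proto_b, num_samples):
    """Rebuild `num_samples` of voiced signal between two update points.

    proto_a / proto_b are the prototype waveforms at the two update points
    (their lengths are the pitch periods in samples). Shape and pitch period
    are both cross-faded linearly from the first prototype to the second.
    """
    period_a, period_b = len(proto_a), len(proto_b)
    out = []
    phase = 0.0                       # accumulated phase, in cycles
    for k in range(num_samples):
        alpha = k / num_samples       # 0 at the first update point, near 1 at the next
        a = sample_cycle(proto_a, phase)
        b = sample_cycle(proto_b, phase)
        out.append((1.0 - alpha) * a + alpha * b)
        period = (1.0 - alpha) * period_a + alpha * period_b
        phase += 1.0 / period         # advance by one sample of the current period
    return out


# Hypothetical usage: 20 ms between update points at an assumed 8 kHz rate.
proto_start = [math.sin(2 * math.pi * i / 40) for i in range(40)]  # 200 Hz pitch
proto_end = [math.sin(2 * math.pi * i / 50) for i in range(50)]    # 160 Hz pitch
segment = pwi_interpolate(proto_start, proto_end, 160)
```

Only the two prototypes need to be known per update interval in this sketch, which is the property the abstract relies on when it moves from 13 kb/s to 4.8 kb/s for voiced speech.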

  • 【Classification No.】TN912.3
  • 【Citations】1
  • 【Downloads】107