Research on the Naturalness of Speech Synthesis

The Research of Speech Synthesis Naturalness Based on Computer

【Author】 吕鹏 (Lü Peng)

【Advisor】 刘齐跃 (Liu Qiyue)

【Author Information】 Hebei University of Science and Technology, Communication and Information Systems, 2010, Master's thesis

【Abstract】 Speech processing research has produced many results, and the intelligibility of synthesized speech in particular has reached a fairly high level. Naturalness, however, still falls short of what users expect, and this gap seriously hinders the further development of speech synthesis technology. This thesis proposes improvements that target the low naturalness of current speech synthesis. Taking synthesis from a self-recorded speech corpus as the example, it applies waveform concatenation to improve naturalness and verifies the improvement through subjective and objective evaluation. The main contents are as follows:

1) Starting from the basic elements of phonetics, the thesis analyzes the basic elements of speech synthesis, studies the factors that affect synthesis naturalness, and from these analyzes the relationship between speech synthesis and related fields such as speech recognition.

2) A speech corpus is built with the syllable as the unit. Silent segments are processed to remove the overly long pauses that disrupt the joining of speech signals, and the portions not needed for synthesis are identified. Two waveform concatenation algorithms, TD-PSOLA and FD-PSOLA, are then used to adjust the duration and the frequency of the speech, respectively, so that prosody control comes closer to natural pronunciation. Comparing the prosodic parameters, both by listening and through plots, shows the differences among speech before synthesis processing, speech after it, and natural speech, from which the degree of improvement in naturalness is assessed.

3) Finally, a simulation experiment is carried out on the speech synthesis naturalness system. The simulation shows some improvement in naturalness, and the synthesized results are evaluated with subjective and objective methods, with very satisfactory results.

This work provides a sound basis and approach for further research on speech synthesis naturalness.
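The abstract does not say how silent segments are detected. As a minimal sketch of the silence handling in item 2, assuming a simple short-time energy threshold, leading and trailing silence could be trimmed from each syllable unit before the units are joined. The frame length, the relative threshold, and every function name below are illustrative assumptions, not the thesis's method. The sketch performs plain concatenation only and does not implement TD-PSOLA or FD-PSOLA.

```python
import math


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / len(frame))
    return energies


def trim_silence(samples, frame_len=80, rel_threshold=0.01):
    """Drop leading and trailing frames whose energy is below
    rel_threshold times the energy of the loudest frame.
    Both default values are assumed, not taken from the thesis."""
    if not samples:
        return []
    energies = frame_energies(samples, frame_len)
    peak = max(energies)
    if peak == 0.0:
        return []  # the whole unit is silent
    voiced = [i for i, e in enumerate(energies) if e >= rel_threshold * peak]
    first, last = voiced[0], voiced[-1]
    return samples[first * frame_len:(last + 1) * frame_len]


def concatenate_units(units, frame_len=80, rel_threshold=0.01):
    """Trim each unit, then join the units end to end."""
    joined = []
    for unit in units:
        joined.extend(trim_silence(unit, frame_len, rel_threshold))
    return joined


def toy_syllable(silence_before, voiced, silence_after, freq=200.0, rate=8000):
    """Hypothetical stand-in for a recorded syllable: a sine tone
    padded with digital silence on both sides."""
    tone = [math.sin(2 * math.pi * freq * n / rate) for n in range(voiced)]
    return [0.0] * silence_before + tone + [0.0] * silence_after


unit_a = toy_syllable(400, 800, 400)
unit_b = toy_syllable(240, 800, 560)
joined = concatenate_units([unit_a, unit_b])
print(len(unit_a) + len(unit_b), "samples before,", len(joined), "after")
```

Real recordings contain background noise rather than exact zeros, so the threshold would need tuning against the corpus, and joining trimmed units directly can leave audible discontinuities that pitch-synchronous methods such as PSOLA are meant to smooth.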
