

Polyphone Research in the System of Text-to-speech

【Author】 Li Qing (李清)

【Advisor】 Zhang Li (张莉)

【Author Information】 Hebei University, Chinese Language and Philology, 2010, Master's thesis

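【Abstract (translated from Chinese)】 Text-to-speech (TTS) technology is the process of using a computer program to convert given visible text into speech. Such systems draw widely on linguistics, phonetics, computer programming, and digital signal processing, which makes TTS a technical undertaking that integrates many disciplines and fields. Pinyin input methods, pinyin-ordered book retrieval, sorting by pronunciation of all kinds, Chinese pronunciation teaching software, improved read-aloud functions in electronic products, automated announcement and answering systems, the development of products for blind users and of children's toys, and even robot manufacturing and future voice-control systems in various fields all depend on this technology. As a technology that demands a close combination of theory and practice, TTS has attracted great attention from scholars in many disciplines since its inception. How to improve the fluency, naturalness, and accuracy of speech synthesis in TTS systems is the question the technology must address. In particular, the accuracy of automatically annotating the readings of Chinese polyphonic characters is one of the difficulties of text-to-speech conversion. This thesis determines the usage frequency, in the CCL Modern Chinese Corpus, of the 921 polyphonic characters in the Modern Chinese Dictionary (《现代汉语词典》, 5th edition; hereafter 《现汉》) and of their pronunciation items. Taking character frequency as its basis and proceeding from linguistic theory, it proposes a new approach to the polyphone problem in TTS processing. The thesis has three parts. Part one counts the polyphonic characters in 《现汉》, fixes 921 of them as the object of study, and tabulates each character's parts of speech and its number of polyphonic words. Part two counts how often these 921 characters occur in the CCL corpus. Based on the counts and the computed cumulative frequencies, the characters are divided into three frequency levels: high, medium, and low. Within each level, the usage frequency of every pronunciation item is counted and the readings are divided into two levels, high-frequency and low-frequency. Low-frequency polyphones, which account for only 1% of the corpus, are handled by defaulting to the usual reading. Part three classifies the medium- and high-frequency polyphones and handles them by combining multisyllabic-word exclusion, part-of-speech determination, and an attached lexicon of common polyphone words. For polyphones that can stand alone as words, whose pronunciation items have comparable usage frequencies and whose parts of speech do not clearly differ, every corpus instance is examined one by one, the contexts in which each reading occurs are summarized, and rules are constructed for them.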
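The frequency grading described in part two can be illustrated with a minimal sketch. It assumes that characters are ranked by corpus count and graded by cumulative share of all polyphone occurrences. Only the 1% share of the low-frequency level comes from the abstract; the function name `assign_frequency_tiers`, the 80% boundary between the high and medium levels, and the toy counts are hypothetical and are not figures from the thesis.

```python
from typing import Dict


def assign_frequency_tiers(
    counts: Dict[str, int],
    high_cut: float = 0.80,  # hypothetical boundary, not taken from the thesis
    mid_cut: float = 0.99,   # leaves the last 1% of occurrences to the low level
) -> Dict[str, str]:
    """Grade characters as high/mid/low by cumulative share of corpus occurrences."""
    total = sum(counts.values())
    if total == 0:
        return {char: "low" for char in counts}

    tiers: Dict[str, str] = {}
    cumulative = 0
    # Walk the characters from most to least frequent.
    for char, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        # Grade by the share already covered by more frequent characters,
        # so the most frequent character always lands in the high level.
        share_before = cumulative / total
        if share_before < high_cut:
            tiers[char] = "high"
        elif share_before < mid_cut:
            tiers[char] = "mid"
        else:
            tiers[char] = "low"
        cumulative += count
    return tiers


# Toy counts, invented for illustration only.
toy_counts = {"的": 700, "了": 150, "着": 100, "行": 45, "靓": 5}
tiers = assign_frequency_tiers(toy_counts)
```

With these toy counts, 的 and 了 fall in the high level, 着 and 行 in the medium level, and 靓 in the low level, where a system following the abstract would simply emit the usual reading.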

【Abstract】 Text-to-speech (TTS) technology refers to the use of computer programs to convert given visual text into speech. Because such systems involve linguistics, phonetics, computer programming, digital signal processing, and other fields, TTS is a comprehensive, multidisciplinary technical undertaking. Input methods, library retrieval, various sorting tasks, Chinese language teaching software, improved voice-reading functions in all kinds of electronic products, automatic announcement and reply systems, the development of products for the blind and of children's toys, and even robot manufacturing and future voice-control systems in all areas are inseparable from this technology. As a technology that closely combines theory and practice, TTS has received great attention from scholars in various disciplines since it first appeared. How to improve the fluency, naturalness, and accuracy of speech synthesis in TTS has become the focus of attention. Among these issues, the accuracy of automatically tagging the pronunciation of Chinese polyphones has become one of the difficulties of text-to-speech conversion. The object of this paper is to determine the pragmatic frequency, in the CCL modern Chinese corpus, of the 921 polyphones in the "Modern Chinese Dictionary" (5th Edition) (hereinafter referred to as "Modern Chinese") and of their pronunciation items. On the basis of character frequency, and from the perspective of linguistic theory, a new idea is proposed for solving the polyphone problem in TTS systems. The paper consists of three parts. In the first part, I counted the polyphonic characters in "Modern Chinese", took 921 polyphones as the object of study, and tabulated each polyphone's parts of speech and number of polyphonic words. In the second part, I compiled frequency statistics for these 921 polyphones in the CCL modern Chinese corpus. According to the statistics and the calculated cumulative frequencies, the polyphones were divided into high-, medium-, and low-frequency levels. Within each frequency level, the frequency of use of each pronunciation item was counted, and the readings were graded into three levels: regular pronunciation, secondary pronunciation, and rare pronunciation. Low-frequency polyphones, which account for only 1% of the corpus, are handled by defaulting to the regular pronunciation. In the third part, the medium- and high-frequency polyphones were classified and processed by the combined use of multi-syllable word elimination, part-of-speech determination, an attached lexicon of common polyphone words, and other methods. For polyphones that can form a word on their own, whose pronunciation items have similar pragmatic frequencies and whose parts of speech are not distinctive, rules were built from the corpus statistics, separately for each type.
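The combination of methods named in the third part can be pictured as a simple fallback cascade. The sketch below is only one possible reading of the abstract, under stated assumptions: the lookup order (word lexicon first, then part of speech, then the regular pronunciation), the names `WORD_LEXICON`, `POS_READINGS`, `DEFAULT_READING`, and `read_polyphone`, the part-of-speech tags, and the handful of entries are all hypothetical and do not come from the thesis.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lexicon of multi-syllable words with fixed readings per character.
WORD_LEXICON: Dict[str, Tuple[str, ...]] = {
    "银行": ("yín", "háng"),
    "行走": ("xíng", "zǒu"),
    "长大": ("zhǎng", "dà"),
}

# Hypothetical (character, part-of-speech) table; "v" = verb, "a" = adjective.
POS_READINGS: Dict[Tuple[str, str], str] = {
    ("长", "v"): "zhǎng",
    ("长", "a"): "cháng",
}

# Hypothetical regular (most frequent) reading used as the last resort.
DEFAULT_READING: Dict[str, str] = {"行": "xíng", "长": "cháng"}


def read_polyphone(char: str, word: str, index: int, pos: Optional[str] = None) -> str:
    """Pick a reading for `char`, which sits at position `index` of the segmented `word`."""
    if word[index] != char:
        raise ValueError("index does not point at the given character")
    # 1. A multi-syllable word in the lexicon fixes the reading outright.
    if word in WORD_LEXICON:
        return WORD_LEXICON[word][index]
    # 2. Otherwise use the part of speech, when a tag is available and decisive.
    if pos is not None and (char, pos) in POS_READINGS:
        return POS_READINGS[(char, pos)]
    # 3. Fall back to the regular pronunciation.
    return DEFAULT_READING[char]


bank = read_polyphone("行", "银行", 1)             # resolved by the word lexicon
grow = read_polyphone("长", "长", 0, pos="v")      # resolved by part of speech
walk = read_polyphone("行", "行", 0)               # falls back to the default
```

Characters whose readings are equally frequent and share a part of speech would not be resolved by this cascade; those are the cases for which the abstract says context rules are built from the corpus.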

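【Keywords (Chinese)】 多音字 (polyphone); 文-语转换 (text-to-speech conversion); 语料库 (corpus); 字频 (character frequency); 音项 (pronunciation item)
【Key words】 Polyphone; Text-to-Speech; Corpus; Word-frequency; Pronunciation item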
  • 【Online Publication Contributor】 Hebei University
  • 【Online Publication Year/Issue】 2010, Issue 12
  • 【Classification Code】 H08
  • 【Times Cited】 3
  • 【Downloads】 257