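Research on Data Mining and Its Application in Chinese Text-to-Speech Conversion

【Author】 Zhu Tingshao

【Advisors】 Gao Wen; Charles Ling

【Author information】 Graduate School of the Chinese Academy of Sciences (Institute of Computing Technology), Computer Applications, 1999, PhD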

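【Abstract (translated from the Chinese)】 To meet the needs of practical work, knowledge discovery in databases (KDD) has developed steadily in recent years. This thesis studies the KDD process model and applies data mining to the discovery of Mandarin prosodic rules, with good results.

Most current research on KDD focuses on learning algorithms and neglects the overall process. This thesis therefore proposes a KDD process model that supports multiple datasets and multiple learning targets, so that KDD better suits practical needs, the mutual influence between end users and data mining personnel is kept as small as possible, and learning efficiency improves.

Current synthesized speech is not sufficiently natural or continuous, and an important reason for its low quality is that the prosodic control rules now in use are incomplete. This thesis proposes using data mining, on the basis of pitch-synchronous overlap-add, to learn Chinese prosodic rules, and applies the approach to learning prosodic rules for two-character words and for sentences, with good results. No comparable results have been reported in the speech synthesis literature in China or abroad.

The thesis treats the syllable prosodic rules in two-character words as a description of the mapping between a syllable pronounced in isolation and the same syllable pronounced within a word. A neural network is trained to capture this mapping for fundamental frequency and duration, and the trained network is used to compute directly the fundamental frequency needed for synthesis. In experiments, the learning results were good, and the fundamental frequency changes generated by the network agree fully with the generally accepted patterns of tone change.

To learn how syllable prosody varies within sentences, the thesis first uses cluster analysis to derive typical fundamental frequency patterns for syllables in sentences; these patterns correspond to the tone contours in common use. Building on these patterns, the fundamental frequency values in the training data are transformed into a higher-level description, and several data mining methods are combined to learn prosodic rules, with good experimental results. The learned rules fully cover the known tone sandhi rules, and the newly produced rules may also offer some insight for research on tone variation.

On the basis of this work, the thesis develops DMTalker, a research prototype of a data-mining-based Mandarin text-to-speech system, which learns prosodic rules through data mining and applies them in text-to-speech conversion.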

【Abstract】 To meet practical requirements, Knowledge Discovery in Databases (KDD) has emerged in recent years. This thesis studies the KDD process model and uses data mining to find prosodic rules for Mandarin speech synthesis. The results are encouraging.

KDD has attracted more and more attention recently, but most current research on KDD concentrates on data mining, which is only one stage of KDD, and pays little attention to the KDD process model. In practice, however, data mining accounts for only a small part of the work in the whole discovery task. A reasonable KDD process model can organize all the discovery stages into a solid unit, and thus makes it easy for end users to use KDD.

We propose a KDD process model based on an analysis of practice; it supports multi-dataset and multi-thread training. The proposed model is better suited to KDD applications, and it keeps the mutual influence between the data mining expert and the end user as small as possible, so it can make knowledge discovery more efficient.

Current synthesized speech is of low quality, and one of the reasons is that the prosodic rules now in use are unsatisfactory. We propose to learn prosodic models by data mining from an actual speech database, and we have implemented this approach in learning from phrases and sentences.

To extract pitch variation patterns from two-word phrases, a data mining system called SpeechDM has been implemented. In Chinese, the pitches extracted from an isolated syllable differ from those extracted from the same syllable in phrases, and SpeechDM extracts the patterns from the mapping between them. Since the pitch variation patterns are learned from actual speech, it is possible to improve the naturalness of synthesized speech.

The pitch models now used in Mandarin Text-To-Speech are extracted by linguistics experts; they are described qualitatively and with low precision. To acquire more accurate prosodic rules, data mining is