

Research and Implement on Isolated Mandarin Speech Recognition

【Author】 李建宁 (Li Jianning)

【Advisor】 冯宏伟 (Feng Hongwei)

【Author information】 Northwest University (西北大学), Computer Software and Theory, 2007, Master's degree

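【Abstract (translated from the Chinese original)】 Isolated-word speech recognition is simple to implement, technically mature, and widely applicable, and it forms the basis for deeper research on speech recognition. The hidden Markov model (HMM) is currently the most popular speech recognition technique, and many successful speech recognition systems are built on it. This thesis studies isolated-word speech recognition through a small-vocabulary, speaker-independent, isolated-word Mandarin speech recognition system based on the continuous hidden Markov model (CDHMM), implemented in VC++ on the Windows platform. The thesis first reviews the fundamentals of speech recognition, chiefly its principles, the basics of speech signal processing, and the various methods for recognition and training. It then examines the theory of hidden Markov models and their application to speech recognition. On this basis, the main work of the thesis is as follows: 1) A small-vocabulary, speaker-independent, isolated-word Mandarin speech recognition system using continuous hidden Markov models was designed, implemented, and evaluated experimentally. Because signal processing is relatively complicated to implement in VC++, the implementation does not use Mel-frequency cepstral coefficients (MFCC) and instead uses LPC Mel cepstral coefficients (LPCMCC), which approximate MFCC but are comparatively simple to compute, as the feature parameters. 2) Experiments showed that the system's double-threshold endpoint detection method is rather sensitive to noise: when noise is mixed into the speech signal, the detection results become inaccurate. To address this, endpoint detection was studied and an endpoint detection method with variable frame length and adaptive thresholds is proposed. 3) The contribution of each dimension of the feature parameters to speech recognition is analyzed, and methods for improving the noise robustness of the feature parameters are given. 4) Finally, the thesis gives an improved method to address the slow and inefficient HMM parameter estimation of the Baum-Welch algorithm. When HMMs are trained with the Baum-Welch algorithm, the speed and efficiency of the speech recognition system are relatively low, so optimizing the training method is particularly important.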

【Abstract】 Isolated-word speech recognition is easy to implement and technically mature. It is widely applicable and is the foundation for further research on speech recognition. The hidden Markov model (HMM) is currently the dominant approach to speech recognition, and most successful speech recognition systems are based on it. This thesis studies isolated-word speech recognition by implementing a basic small-vocabulary, speaker-independent, isolated-word Mandarin speech recognition system in VC++ on the Windows platform. First, the thesis covers the fundamentals of speech recognition, including the principles of speech recognition, the basics of speech signal processing, and the various methods of training and recognition. It then studies the theory of hidden Markov models and their application to speech recognition. Building on this theory, the main contributions are as follows: 1) The design and implementation of a basic small-vocabulary, speaker-independent, isolated-word Mandarin speech recognition system using the continuous density hidden Markov model (CDHMM), together with experiments on this system. Because speech signal processing is difficult to implement in VC++, the system does not use Mel-frequency cepstral coefficients (MFCC) as feature parameters. It uses LPC Mel cepstral coefficients (LPCMCC) instead, which approximate MFCC and are easier to compute. 2) The experiments show that the two-threshold endpoint detection method is sensitive to noise and cannot locate endpoints accurately when the waveform contains noise. To solve this problem, the thesis studies endpoint detection for speech signals and presents an endpoint detection method based on dynamic frame length and self-adaptive thresholds. 3) An analysis of the contribution of each dimension of the MFCC, with methods for making the feature coefficients more robust to noise. 4) Finally, methods to improve the speed and efficiency of the Baum-Welch algorithm for re-estimating HMM parameters. When the Baum-Welch algorithm is used to train the HMMs, the speech recognition system is slow and inefficient, so an optimized training method is needed.
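The noise sensitivity of two-threshold endpoint detection noted in item 2) can be illustrated with a minimal sketch. It is not the thesis's code, and it is not the proposed variable-frame-length, adaptive-threshold method, which the abstract does not specify. The sketch is written in Python rather than VC++, uses short-time energy only (the classical method usually also uses the zero-crossing rate), and all function names, frame sizes, and threshold values are hypothetical choices for illustration.

```python
import math


def frame_energies(samples, frame_len, hop):
    """Short-time energy (sum of squared samples) of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def detect_endpoints(energies, low, high):
    """Return (first_frame, last_frame) of the detected word, or None.

    Frames above `high` are treated as certain speech. The segment is then
    widened in both directions while the energy stays above `low`.
    Both thresholds are fixed, hand-picked values (an assumption).
    """
    strong = [i for i, e in enumerate(energies) if e > high]
    if not strong:
        return None
    start, end = strong[0], strong[-1]
    while start > 0 and energies[start - 1] > low:
        start -= 1
    while end < len(energies) - 1 and energies[end + 1] > low:
        end += 1
    return start, end


def tone(n_samples, amplitude):
    """Synthetic stand-in for speech or noise: a sine with a 40-sample period."""
    return [amplitude * math.sin(math.pi * n / 20.0) for n in range(n_samples)]


# Clean recording: silence, a loud "word" (frames 4..11), silence.
clean = [0.0] * 400 + tone(800, 1.0) + [0.0] * 400
# Same word, but a weak disturbance occupies frames 2..3 just before it.
noisy = [0.0] * 200 + tone(200, 0.2) + tone(800, 1.0) + [0.0] * 400

LOW, HIGH = 1.0, 10.0  # hypothetical fixed thresholds
print(detect_endpoints(frame_energies(clean, 100, 100), LOW, HIGH))  # (4, 11)
print(detect_endpoints(frame_energies(noisy, 100, 100), LOW, HIGH))  # (2, 11)
```

In the second call the weak disturbance exceeds the fixed low threshold, so the detected start moves two frames earlier than the true word onset. This is the kind of error that motivates adapting the thresholds to the background level.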

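  • 【Online publication contributor】 Northwest University (西北大学)
  • 【Online publication year and issue】 2007, Issue 04
  • 【Classification number】 TN912.34
  • 【Citation count】 14
  • 【Download count】 464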