语音识别特征提取算法的研究及实现

The Research of Feature Extraction Algorithm for Speech Recognition and the Realization

【Author】 Hui Bo (惠博)

【Advisor】 Feng Hongwei (冯宏伟)

【Author information】 Northwest University, Computer Software and Theory, 2008, Master's degree

【Abstract】 Speech signals are strongly time-varying, but over a sufficiently short interval their characteristics can be treated as essentially constant; this is an important starting point for speech signal processing. The recognition rate depends on the accuracy and robustness of speech feature extraction, so feature extraction plays a pivotal role in speech signal processing applications. The thesis first reviews the fundamentals of speech recognition, including its principles, the basics of speech signal processing, and the main methods of recognition and training. On this basis, the work completed is as follows:

1. The thesis focuses on the widely used Mel-frequency cepstral coefficients (MFCC). Taking 24-dimensional MFCC parameters as an example, it adds and removes components to analyze how the absence of higher-order parameters affects the recognition rate, identifies the higher-order MFCC parameters that are insensitive to noise, and derives an optimized combination of the 24-dimensional MFCC parameters with little change in recognition rate.

2. A connected digit-string speech recognition system based on the dynamic time warping (DTW) model was implemented in Visual C++ 6.0 and evaluated experimentally. Its modules follow the basic structural model of a speech recognition system, and MFCC was chosen as the feature parameter.

3. The experiments showed that Mandarin digits are easily confused. Improvements were made to both the template training method and the reference templates: robust training using multiple pairs of feature vector sequences, and construction of reference templates by segmenting each syllable into its initial and final.

4. Finally, the thesis studies acoustic modeling methods for continuous Mandarin speech recognition and gives a method for recognizing easily confused Chinese words.

Through experiments on and study of each part of a practical speech recognition system, the thesis lays groundwork for the further development of practical speech recognition systems.
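The abstract names DTW template matching as the recognition model but gives no algorithmic detail. As a rough illustration only, the following Python sketch shows the textbook form of DTW over sequences of feature vectors (such as MFCC frames) and a nearest-template decision. The function names, the Euclidean frame distance, the unconstrained warping path, and the toy data are all assumptions made for this sketch; the thesis's actual system was written in Visual C++ 6.0 and may differ in every one of these choices.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of feature vectors.

    Assumption: frames are compared with Euclidean distance and the
    warping path may step horizontally, vertically, or diagonally.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # per-frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the reference template closest under DTW.

    `templates` is a hypothetical dict mapping a label to one reference
    sequence of feature vectors.
    """
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


if __name__ == "__main__":
    # Toy 2-dimensional "features"; real MFCC frames would have more dimensions.
    templates = {
        "digit_a": [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)],
        "digit_b": [(5.0, 5.0), (4.0, 4.0), (3.0, 3.0)],
    }
    # A time-stretched version of the "digit_a" template.
    utterance = [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (2.0, 2.0)]
    print(recognize(utterance, templates))
```

Because DTW lets one frame align with several frames of the other sequence, an utterance spoken more slowly than its template can still match it closely, which is the reason this model suits isolated and connected digit recognition with small vocabularies.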

  • 【Online publication contributor】 Northwest University
  • 【Online publication issue】 2008, Issue 08
  • 【Classification number】 TN912.34
  • 【Times cited】 26
  • 【Downloads】 2071