
语音识别中个人特征参数提取研究

Research on Personal Feature Parameter Extraction in Speech Recognition

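【Author】 Zhang Zhixia (张志霞)

【Advisor】 Han Huilian (韩慧莲)

【Author information】 North University of China (中北大学), Signal and Information Processing, 2009, Master's thesis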

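【Abstract (from the Chinese)】 With the continuing development of computers, speech recognition has considerable application prospects. It refers not only to the process by which a machine learns to map speech signals to text symbols; as an interdisciplinary subject it also has far-reaching value for theoretical research. Speech recognition is essentially a process of speech training and pattern recognition, but good recognition performance is inseparable from effective extraction of the feature parameters of the speech signal. Feature parameter extraction aims to obtain the information in the speech signal that represents its characteristics, reducing the amount of data to be processed during recognition while describing the signal as completely and accurately as possible. Guided by the overall framework and techniques of speech recognition, this thesis studies feature parameter extraction algorithms for speech signals, which is of both theoretical and practical significance for speech recognition. First, the thesis introduces the fundamentals of speech recognition and studies speech signal preprocessing, personal feature parameter extraction algorithms, and the model matching and training techniques used in recognition, namely the principle of the dynamic time warping algorithm and the hidden Markov model. It analyzes in detail the dynamic time warping algorithm used in this work and gives an overall scheme for feature parameter extraction. Second, speech signals are collected in an office environment, irregular samples that are clearly disturbed by accidental factors or caused by the speaker are discarded directly, and the collected signals are displayed. The collected signals are then preprocessed, including pre-emphasis, framing and windowing, and endpoint detection. On this basis, feature parameters are extracted from the speech signals, with emphasis on implementing the extraction of linear prediction cepstrum coefficients and Mel-frequency cepstrum coefficients, and the effect of the feature parameters extracted in the office environment on the recognition of individual speakers' speech is analyzed. Finally, for the Mel-frequency cepstrum coefficients, the dynamic time warping algorithm is used to recognize the preprocessed utterances of individual speakers in a simulation experiment, and the results are analyzed. An improvement is proposed to address the shortcomings of the dynamic time warping algorithm.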
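
The preprocessing steps named in the abstract (pre-emphasis, framing and windowing) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the pre-emphasis coefficient `alpha=0.97`, the choice of a Hamming window, and the frame length and hop size are all assumptions made for illustration, and endpoint detection is left out.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is kept unchanged.

    alpha=0.97 is an assumed, commonly used value, not one taken from the thesis.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def hamming(length):
    """Return a Hamming window of the given length (assumed window type)."""
    if length == 1:
        return [1.0]
    return [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        for n in range(length)
    ]


def frame_and_window(signal, frame_len, hop):
    """Split the signal into overlapping frames and multiply each by the window.

    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + n] * window[n] for n in range(frame_len)])
    return frames
```

Each windowed frame would then be passed to a feature extractor such as LPCC or MFCC; the overlap between frames (here `frame_len - hop` samples) keeps the frame-to-frame features smooth.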

【Abstract】 With the continuing development of computer technology, speech recognition has promising applications, and as an interdisciplinary field it also has considerable theoretical value. Speech recognition is essentially a process of training and pattern recognition, but good recognition performance depends on effective extraction of the feature parameters of the speech signal. Feature parameter extraction aims to obtain the information that represents the characteristics of the speech, reducing the amount of data to be processed during recognition while describing the speech signal as completely and accurately as possible. Guided by the overall structure and techniques of a speech recognition system, this thesis studies feature parameter extraction for speech signals, which is of both theoretical and practical significance for speech recognition. First, the thesis introduces the fundamentals of speech recognition and studies speech signal preprocessing, feature parameter extraction algorithms, and the model matching and training techniques used in recognition, including Dynamic Time Warping (DTW) and Hidden Markov Models (HMM). It focuses on the DTW algorithm used in this work and gives an overall scheme for extracting speech signal feature parameters. Second, speech signals are collected in an office environment, irregular samples that are clearly affected by accidental interference or caused by the speaker are discarded, and the collected signals are displayed. The collected signals are then preprocessed. On this basis, feature parameters are extracted, with emphasis on implementing linear prediction cepstrum coefficients (LPCC) and Mel-frequency cepstrum coefficients (MFCC), and the effect of these features on the recognition of individual speakers' speech in the office environment is analyzed. Finally, using the MFCC features, recognition of individual speakers' utterances is implemented with the DTW algorithm. The experimental results are analyzed, and an improved DTW algorithm is proposed.
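
For readers unfamiliar with Dynamic Time Warping, the template-matching step can be sketched as below. This is the textbook DTW recurrence with a Euclidean frame distance, offered as an assumption-laden illustration rather than the thesis's implementation or its improved variant; the names `dtw_distance`, `recognize`, and `templates` are hypothetical, and the feature sequences stand in for per-frame MFCC vectors.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    a and b are lists of equal-dimension float vectors (e.g. one MFCC vector
    per frame). Uses the basic three-way step pattern with no path constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays on the same frame
                cost[i][j - 1],      # b advances, a stays on the same frame
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def recognize(templates, query):
    """Return the label of the stored template closest to the query under DTW."""
    return min(templates, key=lambda label: dtw_distance(templates[label], query))
```

A usage example with toy one-dimensional features: `recognize({"up": [[0.0], [1.0], [2.0]], "down": [[2.0], [1.0], [0.0]]}, [[0.0], [0.0], [1.0], [2.0]])` selects `"up"`, because the repeated first frame of the query is absorbed by the warping path at no cost. The full cost matrix takes time and memory proportional to the product of the two sequence lengths, which is one commonly cited weakness of plain DTW.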

  • 【Online publication submitter】 North University of China (中北大学)
  • 【Online publication year and issue】 2009, Issue 11