Study on the Application of Mathematical Morphology in Speech Recognition

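【Author】 Wang Xia (王霞)

【Advisor】 Zhao Xiaoqun (赵晓群)

【Author Information】 Hebei University of Technology, Microelectronics and Solid-State Electronics, 2008, PhD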

【Abstract】 Noise in real-world environments seriously degrades speech recognition rates, which makes research on noisy speech recognition particularly important. Starting from the nonlinear theory of speech signals, this thesis explores the application of mathematical morphology to improving the noise robustness of speech recognition. It studies the key problems in noisy speech recognition: speech enhancement, feature parameter extraction, and recognition methods. The main research work is as follows:

1. Speech enhancement based on morphological filtering is studied. Noisy speech is enhanced using different morphological filters and structuring elements, the output signal-to-noise ratios (SNRs) under the different conditions are obtained, and the influence of the shape and length of the structuring element on the enhancement is analyzed.
2. Morphological filtering and the wavelet transform are combined to form a morphology-wavelet filter, which is applied to speech signals corrupted by different noises. Experiments show that this filter preserves the shape of the speech signal well while enhancing it, and that it outperforms the morphological filter alone.
3. Based on the idempotency of morphological filters, a morphological predistortion method is used to extract the Mel cepstrum and other feature parameters of clean speech. The distances between the feature parameters of clean, noisy, denoised, and predistorted speech are analyzed and compared, which establishes the feasibility of the predistortion method.
4. Building on morphological filtering, pitch period detection is studied. Based on the characteristics of the short-time average magnitude difference function (AMDF) and the modified autocorrelation function (MACF), a filtering-weighted modified autocorrelation pitch detection method is designed. The method weights the MACF by the exponential form of the normalized AMDF and achieves pitch period detection for noisy speech.
5. Predistortion feature parameters are used as training data for the hidden Markov model (HMM) recognition method. This improves the match between training and recognition, and the recognition rate is considerably higher than that of the traditional method.
6. An improved radial basis function (RBF) neural network speech recognition method based on predistortion parameters is designed. The selection of hidden-layer centers, the computation of weights, and the optimization of the network structure are studied; the influence of different criteria on structure optimization is analyzed; and an improved scheme is determined. The recognition rates of the RBF neural network and of the improved method using predistortion parameters on noisy speech are compared experimentally.
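The pitch detection idea in item 4 can be illustrated with a minimal sketch. The abstract does not give the thesis's formulas, so everything below is an assumption made for illustration: a flat structuring element of length 3, an averaged open-close/close-open filter as the morphological prefilter, the weight exp(-normalized AMDF), and the hypothetical function names. It is not the author's implementation.

```python
import numpy as np


def flat_erode(x, length):
    """Erosion by a flat structuring element (odd length): sliding minimum."""
    pad = length // 2
    padded = np.pad(x, pad, mode="edge")
    return np.min([padded[i:i + len(x)] for i in range(length)], axis=0)


def flat_dilate(x, length):
    """Dilation by a flat structuring element (odd length): sliding maximum."""
    pad = length // 2
    padded = np.pad(x, pad, mode="edge")
    return np.max([padded[i:i + len(x)] for i in range(length)], axis=0)


def flat_open(x, length):
    """Opening = erosion followed by dilation; removes narrow positive peaks."""
    return flat_dilate(flat_erode(x, length), length)


def flat_close(x, length):
    """Closing = dilation followed by erosion; fills narrow negative valleys."""
    return flat_erode(flat_dilate(x, length), length)


def open_close_average(x, length=3):
    """Average of open-close and close-open outputs (assumed prefilter)."""
    oc = flat_close(flat_open(x, length), length)
    co = flat_open(flat_close(x, length), length)
    return 0.5 * (oc + co)


def weighted_autocorr_pitch(frame, fs, fmin=80.0, fmax=400.0):
    """Estimate pitch (Hz) from an autocorrelation weighted by exp(-AMDF)."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    lag_min = int(fs / fmax)
    lag_max = int(fs / fmin)
    win = len(frame) - lag_max  # same number of samples compared at every lag
    if win <= 0:
        raise ValueError("frame too short for the requested pitch range")
    lags = np.arange(lag_min, lag_max + 1)
    ref = frame[:win]
    # Modified autocorrelation: fixed-length reference against a shifted segment.
    acf = np.array([np.dot(ref, frame[k:k + win]) for k in lags])
    # Average magnitude difference over the same segments, normalized to [0, 1].
    amdf = np.array([np.mean(np.abs(ref - frame[k:k + win])) for k in lags])
    amdf_norm = amdf / (amdf.max() + 1e-12)
    # Assumed weighting: the AMDF valley at the pitch lag boosts the ACF peak.
    weighted = acf * np.exp(-amdf_norm)
    return fs / lags[np.argmax(weighted)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fs = 8000
    t = np.arange(480) / fs
    clean = np.sin(2 * np.pi * 125 * t) + 0.5 * np.sin(2 * np.pi * 250 * t)
    noisy = clean + 0.2 * rng.standard_normal(t.size)
    print(weighted_autocorr_pitch(open_close_average(noisy), fs))
```

The weighting exploits the fact that the autocorrelation peaks and the AMDF dips at the same lag for a periodic frame, so multiplying by a decreasing function of the AMDF sharpens the peak at the pitch period. The search range should be chosen so that integer multiples of the true period fall outside it; otherwise octave errors remain possible in this simple form.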
