基于分形维数的语音端点检测算法研究

The Endpoint Detection Algorithm of Speech Based on Fractal Dimension

【Author】 Zhang Zhenhong (张振红)

【Advisor】 Zhang Xueying (张雪英)

【Author information】 Taiyuan University of Technology, Communication and Information Systems, 2008, Master's degree

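【Summary】 Endpoint detection for speech signals means accurately locating the start and end points of speech within a signal that contains speech, thereby separating speech from non-speech. Effective endpoint detection reduces the amount of data collected and the processing time in a speech recognition system, removes interference from silent or noisy segments, and so improves recognition performance; in speech coding it also lowers the bit rate spent on noise and silence segments and improves coding efficiency. Endpoint detection is therefore an important part of speech processing. Accurate endpoint detection is difficult at low signal-to-noise ratios (SNR), especially in silent segments or just before and after an utterance. This thesis first reviews typical existing endpoint detection algorithms: those based on short-time energy and zero-crossing rate, on LPC cepstral features, on entropy functions, on hidden Markov models (HMM), and on sub-band average energy variance. It analyzes the features each algorithm uses and gives simulation results for some of them. These methods perform well in quiet conditions or when the noise is low, but their detection performance degrades quickly and becomes unsatisfactory when the acoustic environment is poor and the SNR is low. Building on earlier work, the thesis then proposes three new endpoint detection algorithms for noisy environments. Algorithm 1 is an endpoint detection method based on fractal dimension. It exploits the advantage of fractal dimension as an endpoint detection parameter under noise and overcomes the difficulty of estimating the decision threshold under noise. Algorithm 2 is an endpoint detection method based on fractal dimension and a fuzzy RBF neural network. It combines the advantage of fractal dimension as an endpoint detection parameter under noise with the advantage of methods based on information entropy and neural networks, which avoid setting a threshold. Simulation results show that it improves endpoint detection accuracy somewhat for low-SNR signals. Algorithm 3 is an endpoint detection method based on a 1/f fractal signal wavelet model and a fuzzy RBF neural network. Simulation results show that it works well in common noise environments, is simple to implement, and adapts well to different environments.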

【Abstract】 Speech endpoint detection accurately determines the starting point and ending point of speech within a section of signal, and thus distinguishes speech from non-speech. Effective endpoint detection not only reduces the amount of data collected and saves processing time, but also eliminates interference from silent and noisy segments, which improves the performance of a speech recognition system. In speech coding it also reduces the bit rate of noise and silent segments and so improves coding efficiency. Endpoint detection is therefore very important in speech processing. It is difficult to detect endpoints accurately at low SNR, especially in silent segments and before and after an utterance. This paper summarizes typical endpoint detection algorithms, including the algorithm based on short-time energy and zero-crossing rate, the algorithm based on LPC cepstrum, the algorithm based on entropy function, the algorithm based on HMM, and the algorithm based on sub-band average energy variance. It analyzes the features they use and presents some of the simulation results. Those algorithms perform well in quiet or low-noise conditions, but their results decline rapidly when the environment is poor and the SNR is low. The paper then proposes three methods for endpoint detection in noisy environments. The first is based on fractal dimension; it exploits the advantages of fractal dimension and overcomes the difficulty of estimating the decision threshold in noise. The second is based on fractal dimension and a fuzzy RBF neural network; it combines the advantages of fractal dimension with those of methods based on information entropy and neural networks, which avoid threshold setting. Simulation results show that this method improves endpoint detection accuracy at low SNR. The third is based on a 1/f fractal signal wavelet model and a fuzzy RBF neural network. Experiments show that it performs well in common noise environments, and that the algorithm is simple and adaptable to the environment.
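
As a rough illustration of using fractal dimension as a per-frame feature, the sketch below computes a fractal dimension for each frame of a signal. The abstract does not say which estimator the thesis uses, so the Katz estimator here is an assumption, as are the helper names `katz_fd` and `frame_fds`, the frame length, and the hop size. It is not the author's method and omits the decision stage entirely.

```python
import math
import random


def katz_fd(frame):
    """Katz fractal dimension of one frame, viewed as the curve (i, frame[i]).

    Assumption: the Katz estimator is used only for illustration; the
    thesis abstract does not name its estimator.
    """
    n = len(frame) - 1  # number of steps along the curve
    if n < 1:
        raise ValueError("frame needs at least two samples")
    # Total curve length: sum of distances between consecutive points.
    length = sum(math.hypot(1.0, frame[i + 1] - frame[i]) for i in range(n))
    # Planar extent: largest distance from the first point to any other point.
    extent = max(math.hypot(i, frame[i] - frame[0]) for i in range(1, n + 1))
    return math.log(n) / (math.log(n) + math.log(extent / length))


def frame_fds(signal, frame_len=256, hop=128):
    """Fractal dimension of each overlapping frame (hypothetical framing values)."""
    last_start = len(signal) - frame_len
    return [katz_fd(signal[s:s + frame_len]) for s in range(0, last_start + 1, hop)]


# Synthetic stand-ins: a smooth tone and uniform white noise.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * i / 64) for i in range(256)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(256)]

fd_tone = katz_fd(tone)
fd_noise = katz_fd(noise)
print(fd_tone, fd_noise)
```

On these synthetic frames the irregular noise yields a larger value than the smooth tone. A detector would compare the per-frame values against a decision rule or feed them to a classifier. The Katz value depends on amplitude scaling, so frames would normally be normalized the same way before comparison.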

  • 【Classification number】TN912.3
  • 【Citations】4
  • 【Downloads】332