Keyword Spotting in Natural Speech over Telephone Channels

【Author】 Liu Xin (刘鑫)

【Advisor】 Wang Bingxi (王炳锡)

【Author Information】 PLA Information Engineering University (中国人民解放军信息工程大学), Signal and Information Processing, 2002, Master's thesis

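【Abstract, translated from the Chinese】 Keyword spotting is a specialized speech recognition technique that aims to detect application-specific words in continuous speech, and it has promising applications in many fields. This thesis briefly reviews the history of keyword spotting and current developments in China and abroad, and discusses in detail three basic problems of keyword spotting: feature extraction and selection, pattern classification methods, and time alignment. The hidden Markov model (HMM) is currently a widely used pattern classification method. The thesis focuses on the basic principles of the HMM, gives simple and practical training methods and recognition strategies, and builds a recognition system with a two-stage recognition-verification structure that achieves keyword spotting without grammar constraints. In addition, the thesis makes new attempts at improving system robustness and recognition speed: the FastICA algorithm is applied for feature transformation and dimensionality reduction; basic algorithms for speaker classification and speaker adaptation are implemented, with speaker classification realized by Gaussian mixture models, which broadens the user population and improves the recognition rate, and speaker adaptation realized by the maximum likelihood linear regression (MLLR) algorithm; and Gaussian selection is used to improve recognition speed. In the utterance verification stage, the thesis also proposes new confidence measures based on information in the recognition results themselves, which can effectively reduce the system's false alarm rate. Finally, directions for further research on keyword spotting are given.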

【Abstract】 Keyword spotting, a specialized area of speech recognition research, aims to detect occurrences of one or more keywords embedded in unconstrained extraneous speech and/or noise. It has promising applications in many areas. This thesis gives a brief history of keyword spotting research and discusses its fundamental principles, identifying three central problems: how to extract and select features, how to characterize keywords and garbage, and how to detect keywords in continuous speech. It describes the basic theory of the hidden Markov model (HMM) and presents simple, practical methods for building HMMs and for time-aligning patterns with models, namely the segmental k-means training algorithm and the frame-synchronous Viterbi algorithm. A recognition-verification system is built that detects keywords in continuous speech without grammar restrictions. Furthermore, feature transformation, speaker clustering, speaker adaptation, and Gaussian selection are explored to improve system robustness and efficiency. Feature transformation is performed with the FastICA algorithm. Speaker clustering is implemented with Gaussian mixture models, which allows the system to serve a wider group of speakers, and speaker adaptation is performed with the maximum likelihood linear regression (MLLR) algorithm. Gaussian selection is used to reduce computation. In the utterance verification phase, new confidence measures based on information from the recognition results are used to reduce the false alarm rate. Finally, the thesis outlines directions for further research in this field.

  • 【Classification Code】TN912.3
  • 【Citations】4
  • 【Downloads】170