基于广义音素的文本无关说话人认证的研究

Investigation on Broad Phone Based Text Independent Speaker Verification

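【Author】 Yang Hao (杨浩)

【Advisor】 Dong Yuan (董远)

【Author Information】 Beijing University of Posts and Telecommunications, Signal and Information Processing, 2008, Master's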

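【Abstract (translated from Chinese)】 Since the 1980s, with advances in technology, text-independent speaker verification, a branch of pattern recognition, has attracted growing interest from researchers. At present, the most popular text-independent speaker verification systems are all based on Gaussian mixture models combined with a background model. Such systems ignore what the speaker says and in which language, which considerably reduces their value in engineering applications. To make up for this shortcoming, speaker verification systems based on broad phones have drawn academic attention over the past two years. Speaker verification with broad phones can combine speech recognition and text-independent speaker verification techniques, and can also bring in techniques from text-dependent speaker verification, which has been comparatively successful in commercial applications; in addition, it handles well the problems caused by the diversity of speakers' languages. In this project, the author studies broad-phone-based speaker verification systems in depth, starting from the definition of broad phones. The author proposes a complete method for defining broad phones and training their models, and designs the overall framework of a broad-phone-based speaker verification system whose performance is comparable to that of the prevailing systems based on Gaussian mixture models combined with a background model. To improve the efficiency of the phone recognizer's front-end processing and of speaker adaptation, the author proposes a fast vocal tract length normalization algorithm and a robust speaker adaptation algorithm, respectively. Beyond extensive work on broad-phone speaker verification based on hidden Markov models, the author also proposes a system that uses eigenvoice speaker adaptation training factors to span a speaker space and applies a support vector machine in that space to make the verification decision; this system complements the decisions of conventional systems well.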

【Abstract】 As an important branch of pattern recognition, text-independent speaker verification has attracted attention from more and more researchers since the 1980s. Currently, speaker verification based on the Gaussian Mixture Model-Universal Background Model dominates the field of text-independent speaker verification. Unfortunately, because it disregards content and language information, this kind of system has limitations when applied to commercial tasks. To compensate for this drawback, researchers have proposed speaker verification using broad phones, which can not only make use of techniques from speech recognition and from text-independent and text-dependent speaker verification, but also address problems introduced by the diversity of languages spoken by the speakers enrolled in the system. In this research work, the author proposes a definition of broad phones, a training method for broad phonetic Hidden Markov Models, and the framework of a speaker verification system based on broad phonetic Hidden Markov Models whose performance is equivalent to that of a Gaussian Mixture Model-Universal Background Model based speaker verification system. To boost the computational efficiency of front-end processing in the phone recognizer, the author proposes a novel rapid Vocal Tract Length Normalization algorithm. The author also proposes an algorithm to improve efficiency in the speaker adaptation phase. In addition, the author introduces the weights of eigenvoice speaker adaptation into a Support Vector Machine and constructs a new kind of speaker verification system that provides information complementary to the conventional Gaussian Mixture Model-Universal Background Model based speaker verification system.

  • 【Classification Number】TN912.34