

Research and Implementation on Speaker Recognition System Based on GMM

【Author】 Chen Qiang (陈强)

【Advisor】 Que Dashun (阙大顺)

【Author Information】 Wuhan University of Technology (武汉理工大学), Signal and Information Processing, 2010, Master's

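【Abstract (translated from Chinese)】 Speaker recognition, also called voiceprint recognition, aims to identify or verify a speaker from the characteristics of the speaker's voice. With the rapid development of network information technology, digital, unobtrusive, and convenient identity verification is becoming increasingly important. As a biometric authentication technology, speaker recognition has broad application prospects in video surveillance, identity verification, forensic investigation, financial security, and other fields, and has become a research hotspot in speech signal processing. The key issues in speaker recognition research are feature extraction from the speech signal and pattern matching. Building on a study of the main current speaker recognition algorithms, this thesis studies cepstral feature extraction methods based on acoustic properties and pattern matching methods based on template matching and probability statistics, implements a speaker recognition system based on vector quantization (VQ), and focuses on the design of a text-independent speaker recognition system based on the Gaussian mixture model (GMM). The main research contents are as follows: (1) The thesis summarizes the development, research hotspots, and difficulties of speaker recognition technology, and analyzes and discusses the main existing speaker recognition algorithms. (2) It studies speech preprocessing for speaker recognition, with a focus on improving the spectral subtraction speech enhancement algorithm; experiments analyze the enhancement effect, and the improvement raises the robustness of the speaker recognition system in noisy environments. It also studies the principles and methods of feature extraction for speaker recognition and implements, in simulation, the extraction of pitch features, LPCC and MFCC parameters, and differential cepstral parameters. (3) Based on an analysis of the basic principles of VQ, the LBG algorithm, and VQ codebook initialization, it designs and implements a VQ-based speaker recognition system, completes model parameter training and matching-based recognition, and analyzes experimentally the recognition performance under different model parameters and different speech sample durations. (4) To improve the recognition rate and stability of the system, it studies the expectation-maximization (EM) algorithm for GMM parameter estimation, model parameter initialization, and the training and recognition processes, designs a GMM-based speaker recognition system, and completes simulation experiments that analyze recognition performance under different model parameters, feature extraction methods, speech sample durations, and signal-to-noise ratios. (5) It analyzes open-set speaker recognition methods and methods for selecting the speaker verification threshold, studies an open-set speaker recognition method that performs identification first and verification second, analyzes the "rejection problem" for out-of-set impostor speakers, and completes a performance analysis and comparison of open-set speaker recognition systems based on the VQ and GMM models.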

【Abstract】 Speaker recognition, also known as voiceprint recognition, aims to identify or verify a speaker from the characteristics of the speaker's voice. With the rapid development of network information technology, digital, unobtrusive, and convenient identity authentication has become increasingly important. As a biometric authentication technology, speaker recognition has broad application prospects in fields such as surveillance, identity verification, forensic investigation, and financial security, and it has become a research hotspot in speech signal processing. The key problems in speaker recognition are feature extraction and pattern matching. Building on a study of the main current speaker recognition algorithms, this thesis examines feature extraction methods based on acoustic properties and pattern matching methods based on template matching and probability statistics. It implements and analyzes a speaker recognition system based on vector quantization (VQ) and, as its main focus, studies and designs a text-independent speaker recognition system based on the Gaussian mixture model (GMM). The main contents are as follows: (1) The thesis summarizes the development, research hotspots, and difficulties of speaker recognition technology, and analyzes and discusses the main existing speaker recognition algorithms. (2) It studies speech signal preprocessing in the front end of a speaker recognition system and improves the spectral subtraction speech enhancement algorithm; experiments show that the robustness of the speaker recognition system in noisy environments is improved. It also studies the fundamental principles of feature extraction for speaker recognition and implements, in simulation, the extraction of pitch, LPCC, and MFCC parameters and their differential parameters. (3) Based on an analysis of the fundamental principles of VQ, the LBG algorithm, and VQ codebook initialization, it designs and implements a VQ-based speaker recognition system, completes model parameter training and matching-based recognition, and analyzes experimentally the system's performance under different model parameters and speech sample durations. (4) To improve the recognition rate and stability of the system, it studies the expectation-maximization (EM) algorithm for GMM parameter estimation, model parameter initialization, and the training and recognition processes, and completes simulation experiments that analyze system performance under different model parameters, feature extraction methods, speech sample durations, and signal-to-noise ratios (SNRs). (5) It analyzes open-set speaker recognition and the rules and methods for selecting the threshold in speaker verification, presents an open-set speaker recognition method in which identification is followed by verification, and addresses the "rejection problem" for impostors. Finally, it analyzes and compares the performance of open-set speaker recognition based on VQ and GMM.
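
The abstract describes the GMM system only at a high level, so the following is a rough illustration of the idea and not the thesis's implementation. It trains one GMM per speaker with EM and then applies an identify-then-verify decision for the open-set case: pick the best-scoring enrolled speaker, then reject the trial if that score falls below a threshold. Several things here are assumptions that do not come from the abstract: the diagonal covariances, the initialization from randomly chosen frames, the variance floor, the synthetic Gaussian vectors standing in for MFCC frames, the speaker names, and the threshold value.

```python
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    m = a.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(a - m).sum(axis=axis, keepdims=True))).squeeze(axis)


def component_log_probs(x, weights, means, variances):
    """log(w_k * N(x_t | mu_k, diag(var_k))) for every frame t and component k."""
    d = x.shape[1]
    diff = x[:, None, :] - means[None, :, :]                  # (T, K, D)
    mahal = (diff ** 2 / variances[None, :, :]).sum(axis=2)   # (T, K)
    log_norm = -0.5 * (d * np.log(2 * np.pi) + np.log(variances).sum(axis=1))
    return np.log(weights) + log_norm - 0.5 * mahal


def train_gmm(x, n_components=4, n_iter=30, var_floor=1e-3, seed=0):
    """Fit a diagonal-covariance GMM with EM (assumed setup, see lead-in)."""
    rng = np.random.default_rng(seed)
    # Assumed initialization: random frames as means, global variance, equal weights.
    means = x[rng.choice(x.shape[0], n_components, replace=False)].copy()
    variances = np.tile(x.var(axis=0) + var_floor, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each frame.
        log_p = component_log_probs(x, weights, means, variances)
        resp = np.exp(log_p - logsumexp(log_p, axis=1)[:, None])
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ x) / nk[:, None]
        second_moment = (resp.T @ (x ** 2)) / nk[:, None]
        variances = np.maximum(second_moment - means ** 2, var_floor)
    return weights, means, variances


def avg_log_likelihood(x, model):
    """Average per-frame log-likelihood of the frames x under one speaker model."""
    return float(logsumexp(component_log_probs(x, *model), axis=1).mean())


def identify_then_verify(x, models, threshold):
    """Pick the best-scoring speaker, then reject (None) if the score is too low."""
    scores = {name: avg_log_likelihood(x, m) for name, m in models.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] >= threshold else None), scores[best]


# Synthetic stand-ins for feature frames; a real system would use MFCC vectors.
rng = np.random.default_rng(1)
dim = 4
train = {"alice": rng.normal(0.0, 1.0, (400, dim)),
         "bob": rng.normal(3.0, 1.0, (400, dim))}
models = {name: train_gmm(frames) for name, frames in train.items()}
threshold = -12.0  # hypothetical value; in practice tuned on held-out data

trials = {"alice": rng.normal(0.0, 1.0, (200, dim)),
          "bob": rng.normal(3.0, 1.0, (200, dim)),
          "impostor": rng.normal(-6.0, 1.0, (200, dim))}
for true_name, frames in trials.items():
    decision, score = identify_then_verify(frames, models, threshold)
    print(true_name, "->", decision, round(score, 2))
```

In this toy setup the threshold is fixed by hand. The abstract states that the thesis analyzes methods for selecting the verification threshold, and those methods would replace the hand-picked constant.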
