Research on Speaker Recognition Based on Neural Networks and HMM

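【Author】 Liang Hui (梁慧)

【Supervisor】 Zeng Shuiping (曾水平)

【Author information】 North China University of Technology (北方工业大学), Detection Technology and Automatic Equipment, 2012, Master's thesis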

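【Abstract (translated from the Chinese)】 Speaker recognition aims to determine a person's identity from their voice. The process first selects a set of acoustic features, then uses a modelling algorithm to build a dedicated template library for each speaker, matches the input against the templates one by one, and returns the best match. The feature parameters widely used in the field each have strengths and weaknesses, and their recognition performance is not fully satisfactory; no feature parameter has yet been found that completely characterises the individual differences between speakers. This thesis discusses several common feature parameters and modelling algorithms, and builds a speaker recognition system from a new wavelet-based feature parameter and an improved neural network algorithm. It first introduces the Mel cepstral coefficient (MFCC), a feature parameter defined in the cepstral domain. Because MFCC is somewhat weak at discriminating speaker-specific characteristics, the thesis uses the cepstrum principle together with the wavelet transform to extract a wavelet MFCC feature. On the modelling side, it describes initial-value algorithms for the hidden Markov model (HMM), analyses the now-common practice of setting the initial values by K-means clustering, and introduces self-organizing neural network clustering algorithms, which are compared with K-means clustering in terms of convergence speed during training. The experiments show that the wavelet MFCC feature greatly reduces the number of computations and yields a system recognition rate of 94.4%, about 7 percentage points higher than the 87.5% obtained with MFCC. In the experiments that use a self-organizing feature map neural network and a self-organizing competitive neural network to improve on K-means clustering, the feature parameters recorded for different speakers, together with the training iteration counts and recognition rates obtained under the different modelling algorithms, serve as the basis for analysing the strengths, weaknesses, and open problems of each algorithm.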

【Abstract】 Speaker recognition aims to identify who is speaking. The recognition process first selects certain acoustic features, then uses a modelling algorithm to establish a unique template library for each speaker, matches against each template in turn, and finally returns the best match. In the field of speaker recognition, the widely used feature parameters each have advantages and disadvantages, and their recognition results are not very satisfactory; for a long time, no feature parameter able to characterize speaker individuality completely has been found. This thesis discusses several common feature parameters and modelling algorithms, and introduces a new wavelet feature parameter and an improved neural network algorithm to make up a speaker recognition system. The thesis first introduces a common feature parameter, the Mel cepstrum coefficient (MFCC), which is based on the cepstral domain. However, its ability to distinguish speakers' individual characteristics is somewhat lacking, so the thesis extracts a wavelet MFCC feature using the cepstrum principle and the wavelet transform. On the modelling side, the thesis analyses the importance of initial values for the hidden Markov model (HMM), describes the commonly used initialization method based on K-means clustering, and introduces self-organizing neural network clustering algorithms, comparing them with K-means clustering in terms of training convergence. Experimental results show that using the wavelet MFCC feature greatly reduces the number of computations, and that the system recognition rate reaches 94.4%, compared with 87.5% when MFCC features are used, an improvement of about 7 percentage points. In addition, in the experiments that use a self-organizing feature map neural network and a self-organizing competitive neural network to improve on K-means clustering, the advantages, disadvantages, and existing problems of the different algorithms are analysed on the basis of the training iteration counts and recognition rates obtained for different feature parameters of different speakers under the different algorithms.
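
Illustrative sketch (not taken from the thesis): the abstract compares K-means clustering with competitive neural network clustering as ways to obtain initial values for HMM training, judged by the number of training iterations. The toy Python below shows only the general idea on synthetic 2-D points that stand in for feature vectors such as MFCC frames. All function names, the data, the learning rate, and the epoch count are assumptions made for illustration; the simple winner-take-all rule shown here is one basic form of competitive learning and may differ from the networks used in the thesis.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(centers, v):
    """Index of the center closest to vector v."""
    return min(range(len(centers)), key=lambda k: sq_dist(centers[k], v))

def kmeans(data, centers, max_iter=100):
    """Lloyd's K-means. Returns (centers, passes), where passes counts the
    assignment passes made, including the final pass that confirms the
    assignments no longer change."""
    centers = [list(c) for c in centers]
    labels = None
    for it in range(1, max_iter + 1):
        new_labels = [nearest(centers, v) for v in data]
        if new_labels == labels:
            return centers, it
        labels = new_labels
        for k in range(len(centers)):
            members = [v for v, l in zip(data, labels) if l == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return centers, max_iter

def competitive(data, centers, epochs=20, lr=0.1, seed=0):
    """Winner-take-all competitive learning: for each sample, only the
    closest center (the winner) is moved toward that sample. The learning
    rate decays linearly over the epochs."""
    rng = random.Random(seed)
    centers = [list(c) for c in centers]
    order = list(data)
    for epoch in range(epochs):
        rng.shuffle(order)
        rate = lr * (1 - epoch / epochs)
        for v in order:
            w = nearest(centers, v)
            centers[w] = [c + rate * (x - c) for c, x in zip(centers[w], v)]
    return centers

def make_blobs(n=50, seed=1):
    """Hypothetical toy data: two Gaussian blobs around (0, 0) and (5, 5)."""
    rng = random.Random(seed)
    def blob(cx, cy):
        return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)) for _ in range(n)]
    return blob(0.0, 0.0) + blob(5.0, 5.0)

if __name__ == "__main__":
    data = make_blobs()
    init = [data[0], data[1]]  # deliberately poor start: both from one blob
    _, passes_plain = kmeans(data, init)
    seeded = competitive(data, init)
    _, passes_seeded = kmeans(data, seeded)
    print("K-means passes from the raw initial centers:", passes_plain)
    print("K-means passes from competitive-learning centers:", passes_seeded)
```

In a full system, cluster centers like these would be one possible starting point for the HMM's initial parameters; the recognition rates quoted in the abstract come from the thesis experiments and cannot be reproduced by this toy.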
