Text-Independent Speaker Recognition (original title: 基于文本无关的说话人识别)
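【Author】 Liu Xueyan (刘雪燕);
【Supervisor】 Li Ming (李明);
【Author information】 Lanzhou University of Technology, Communication and Information Systems, 2007, Master's thesis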
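【Abstract (translated from Chinese)】 Speaker recognition technology has broad application prospects in biometric identification because of its convenience, low cost, and accuracy. Existing speaker recognition techniques work well under ideal conditions, but various factors prevent their widespread use in real environments; one of the most important is the large amount of training required and insufficient real-time performance. How to shorten a system's training and recognition time without lowering the recognition rate has therefore become a research focus in this field. The support vector machine (SVM) is a pattern classification method based on the structural risk minimization principle; it has considerable advantages for nonlinear, high-dimensional samples and performs well in speaker recognition based on speech samples. This thesis studies in depth the problems SVMs face in speaker recognition, namely training on large sample sets and having to match every reference model during recognition, and proposes solutions. The main contributions are as follows: 1. To address large-sample training of the standard SVM in speaker recognition, a speaker identification method based on the multi-reduced support vector machine (MRSVM) is proposed: a PCA transform and fuzzy kernel clustering reduce the dimensionality and the number of training samples, respectively, which lowers the training cost and storage of the standard SVM without affecting the recognition rate. 2. A multi-level speaker identification method based on PCA and MRSVM is proposed to speed up identification. The PCA classifier needs no training and is simple and fast to implement, so during recognition it makes a quick preliminary decision over the registered speakers. The SVM, with its strong classification ability, then evaluates only the subset of MRSVM models selected by that preliminary decision, which reduces identification time. Experimental results show that, compared with traditional recognition methods, the proposed method has a large advantage in time, and the whole system scales well.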
【Abstract】 Owing to its flexibility, economy, and accuracy, speaker recognition technology has broad application prospects in the biometric security field. Speaker recognition techniques perform well under ideal conditions, but many problems remain when they are applied in practice. A major cause is the long computation time needed to train a speaker model or test an utterance; in addition, in the recognition stage the test utterance must be matched against every speaker model. This makes real-time implementation difficult and expensive. Reducing training time and recognition time without degrading recognition performance has therefore become one of the most active research topics in the field.
The support vector machine (SVM) is a method from statistical learning theory. As a pattern classification method based on the structural risk minimization principle, it has a considerable advantage when dealing with nonlinear, high-dimensional samples, and it gives good results in speaker recognition based on speech signal samples. However, training a speaker SVM model on all the speech parameters consumes a large amount of memory and computing time, and in the recognition stage the test utterance must be matched against every speaker model. This thesis systematically investigates existing work by other researchers and proposes two approaches:
1. In speaker recognition, the large volume of extracted feature data is a major difficulty: training an SVM on all the speech parameters consumes a large amount of memory and computing time. This thesis proposes a speaker identification method based on the multi-reduced support vector machine (MRSVM) to reduce the training time and memory required by the SVM. Specifically, PCA and kernel-based fuzzy clustering are used to reduce the dimensionality and the number of training samples, respectively. The experimental results show that the method markedly reduces the training data, training time, and storage without degrading recognition performance, and that the system is more robust.
2. To shorten the recognition time of speaker identification, this thesis proposes a hierarchical speaker identification (HSI) system based on MRSVM and a PCA classifier. The PCA classifier is simple and fast to implement because it needs no training, so it is used to make a coarse decision by quickly scanning all registered speakers. The MRSVM models selected by this first decision are then used to make the final decision. Experiments show that HSI achieves identification performance similar to that of the traditional method while greatly improving identification speed, and that speakers are easy to add to or remove from the system.
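The coarse-to-fine decision described in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the thesis's implementation: the function `hierarchical_identify`, the parameter `top_k`, the speaker names, and the 2-D toy features are all hypothetical, and a simple centroid-distance scorer stands in for both the PCA classifier (coarse stage) and the per-speaker MRSVM models (fine stage).

```python
from typing import Callable, Dict, List, Sequence, Tuple

Frame = Sequence[float]
Scorer = Callable[[List[Frame]], float]  # higher score = better match


def hierarchical_identify(
    frames: List[Frame],
    coarse: Dict[str, Scorer],
    fine: Dict[str, Scorer],
    top_k: int,
) -> Tuple[str, List[str]]:
    """Rank all speakers with the cheap scorers, then run the costly
    scorers only on the top_k shortlisted speakers."""
    coarse_scores = {spk: score(frames) for spk, score in coarse.items()}
    shortlist = sorted(coarse_scores, key=coarse_scores.get, reverse=True)[:top_k]
    fine_scores = {spk: fine[spk](frames) for spk in shortlist}
    return max(fine_scores, key=fine_scores.get), shortlist


def mean_distance_scorer(center: Frame) -> Scorer:
    """Toy stand-in for a classifier: negative mean squared distance
    from the test frames to one speaker's feature centroid."""
    def score(frames: List[Frame]) -> float:
        total = sum(sum((x - c) ** 2 for x, c in zip(f, center)) for f in frames)
        return -total / len(frames)
    return score


# Hypothetical registered speakers with 2-D toy "features".
centers = {
    "alice": (0.0, 0.0),
    "bob": (5.0, 5.0),
    "carol": (0.5, 0.2),
    "dave": (9.0, 1.0),
}
coarse_models = {spk: mean_distance_scorer(c) for spk, c in centers.items()}

fine_calls: List[str] = []  # records which costly models were actually run


def counting_fine_scorer(spk: str) -> Scorer:
    """Stand-in for a costly per-speaker model (an SVM in the thesis);
    it logs each call so the saving is visible."""
    base = mean_distance_scorer(centers[spk])

    def score(frames: List[Frame]) -> float:
        fine_calls.append(spk)
        return base(frames)

    return score


fine_models = {spk: counting_fine_scorer(spk) for spk in centers}

test_frames = [(0.4, 0.3), (0.6, 0.1), (0.5, 0.2)]
winner, shortlist = hierarchical_identify(
    test_frames, coarse_models, fine_models, top_k=2
)
print(winner, shortlist, len(fine_calls))
```

In this toy run only two of the four costly models are evaluated, which is the source of the time saving the abstract reports; the choice of `top_k` trades speed against the risk that the coarse stage drops the true speaker from the shortlist.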
【Key words】 Speaker recognition; Speaker identification; PCA transform; Support vector machine (SVM); Kernel-based fuzzy clustering;
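- 【Online publication contributor】 Lanzhou University of Technology 【Online publication year/issue】 2008, Issue 10
- 【Classification number】 TP391.42
- 【Times cited】 1
- 【Downloads】 117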