
Research on Information Fusion Algorithms in Speaker Recognition

A Research on Information Fusion Algorithms in Voice Biometrics

【Author】 Liu Di (刘镝)

【Supervisors】 Qiu Zhengding (裘正定); Sun Dongmei (孙冬梅)

【Author Information】 Beijing Jiaotong University, Information Security, 2011, PhD

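【Abstract (from the Chinese)】 This thesis studies information fusion algorithms for speaker verification systems at the feature level, the matching-score level, the decision level, and across multiple levels. The aim is to further improve verification accuracy and so better address public information security problems that bear on the national economy and people's livelihood. In response to the problems found at each fusion level, the work proceeds from three angles: establishing a theory of feature-level fusion, proposing a feature selection algorithm for matching-score-level fusion, and proposing a theory of multi-level fusion. The main contributions are:

1. Following the three fusion levels present in information-fusion biometric systems (feature level, matching-score level, and decision level), the thesis surveys existing information-fusion speaker recognition algorithms and divides the matching-score fusion algorithms into subclasses. It then addresses the problems and shortcomings encountered at each level, as described in the following items.

2. For the feature level, the thesis proposes a feature-level fusion algorithm based on a relation measurement fusion framework. With this framework as the theoretical basis, a speaker verification algorithm based on feature-level fusion is established. By introducing the maximum Kullback-Leibler distance to compute the effective information content of feature-level fusion, the thesis explains for the first time, from an information-theoretic standpoint, why feature-level fusion outperforms the matching-score fusion commonly used in speaker recognition. Experimental results show that feature-level fusion captures more effective information than conventional matching-score fusion and performs better than both matching-score fusion and unimodal algorithms. In the best case, the equal error rate (EER) of the feature-level fusion algorithm is 3.88% lower than that of conventional matching-score fusion and 7.3% lower than that of the unimodal algorithm.

3. For the matching-score level, the thesis proposes a feature selection algorithm for matching-score fusion based on the Spearman correlation coefficient. In matching-score fusion, choosing two matching scores with low mutual correlation is the key to improving the performance of the fused system, yet the field lacks a metric for this correlation. The thesis introduces the Spearman correlation coefficient for the first time to measure the correlation between matching scores, and uses polynomial curve fitting to model the relationship between the Spearman coefficient and EER and between the Spearman coefficient and MinDCF, which confirms the effectiveness of the coefficient. The Kullback-Leibler distance is then introduced and its relationship to the Spearman coefficient is analysed, confirming the coefficient's effectiveness a second time. Using the Spearman coefficient, the matching-score correlation of all 15 two-feature fusion schemes formed from 6 speech features is evaluated experimentally, and the combination of MFCC and residual phase is selected as the best fusion scheme. The time cost of the Spearman coefficient is also compared with that of other typical correlation metrics; it is the lowest, which makes the coefficient suitable for fast selection among a large number of speech features.

4. For the decision level, the thesis proposes for the first time a theoretical framework for multi-level fusion in speaker recognition. Within the framework, four multi-level fusion concepts are defined: one strong multi-level fusion and three weak multi-level fusions. The four cases are each discussed for a two-feature fusion example, and a two-feature multi-level fusion algorithm that combines matching-score fusion with decision-level fusion is proposed, which demonstrates the feasibility of the multi-level fusion theory. Experimental results show that the algorithm outperforms both conventional matching-score fusion and unimodal algorithms. In the best case, its EER is 18.63% lower than that of the conventional unimodal system.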
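The abstract reports its results as equal error rates of matching-score fusion against unimodal systems. As a rough illustration of those two notions only, the following sketch computes an approximate EER from genuine and impostor scores and fuses two score streams with a weighted sum. The weighted-sum rule, the threshold sweep, and all function names are assumptions made for this illustration; the abstract does not state which fusion rule or EER procedure the thesis uses.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when a trial is accepted if score >= threshold."""
    frr = sum(s < threshold for s in genuine) / len(genuine)     # genuine rejected
    far = sum(s >= threshold for s in impostor) / len(impostor)  # impostor accepted
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate EER: sweep the observed scores as thresholds and
    report the mean of FAR and FRR where the two are closest."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0)
    return best[1]


def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted-sum fusion of two score lists (assumed rule, not the thesis's)."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(scores_a, scores_b)]


# Made-up toy scores, for illustration only.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine, impostor)
```

In practice the two score streams would normally be normalised to a common range before a weighted sum is taken; that step is omitted here.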

【Abstract】 This thesis aims to improve the performance of voice biometrics systems by investigating feature level fusion, matching-score level fusion, decision-making level fusion and multiple level fusion algorithms, in order to better address problems of public security. Through a discussion of the fusion frameworks at these levels, the thesis improves the accuracy of the system in three respects: the establishment of feature level fusion, feature selection for matching-score level fusion, and multiple level fusion. The main contributions are as follows:

1. According to the three fusion levels, the thesis first summarises current information fusion algorithms for speaker recognition, then defines subcategories of matching-score level fusion. The problems encountered at each fusion level are investigated, leading to the contributions below.

2. For the feature level, a feature level fusion algorithm for speaker verification based on the Relation Measurement Fusion framework is proposed, which is superior to the existing fusion methods. Building on the robustness and effectiveness of the Relation Measurement Fusion framework, feature level fusion for speaker verification is established. To show the advantage of feature level fusion, the Maximum Kullback-Leibler distance is introduced for the first time to measure the information content of feature level and matching-score level fusion. The experimental results indicate that feature level fusion retains more discriminative information and obtains lower EER and MinDCF than the existing matching-score level fusion and unimodal algorithms. In the best case, the EER of the proposed algorithm is lower than that of matching-score level fusion and of the unimodal algorithm by 3.88% and 7.3%, respectively.

3. For the matching-score level, a feature selection algorithm for matching-score level fusion based on the Spearman rank correlation coefficient is proposed. Fusion techniques using different features have been employed before, but so far no metric has been used to measure the correlation of the combined features in matching-score level fusion. The thesis therefore uses the Spearman rank correlation coefficient as a metric of correlation for matching-score level fusion in speaker recognition. With this metric, the combination of MFCC and residual phase is identified as the best selection and achieves good performance. Polynomial curve fitting is then employed to describe the relationships between the Spearman coefficient and EER and between the Spearman coefficient and MinDCF, testifying to the effectiveness of the coefficient. After that, the Kullback-Leibler distance is used to verify the effectiveness of the Spearman coefficient again. Finally, the Spearman coefficient has the lowest time cost among the correlation metrics compared.

4. For the decision-making level, a multiple level fusion framework is proposed. Based on this framework, one strong multiple level fusion and three weak multiple level fusions are defined. After discussing these four multiple level fusion cases, a two-feature multiple level fusion algorithm that combines matching-score level fusion and decision-making level fusion is proposed. The experimental results show that the multiple level fusion theory is feasible and that the algorithm is superior to the current matching-score level fusion and unimodal algorithms, reducing EER by 18.63% compared with the unimodal algorithm in the best case.
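The selection idea in contribution 3 can be sketched as follows: compute the Spearman rank correlation between the matching scores of every pair of features and prefer the pair whose scores are least correlated. This is a minimal sketch under stated assumptions: the rule "smallest absolute coefficient wins", the feature names, and the toy scores are all hypothetical, and the abstract does not specify on which trials the coefficient is computed.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


def least_correlated_pair(scores_by_feature):
    """Return ((name_a, name_b), rho) with the smallest |rho| over all pairs.
    The 'smallest |rho|' criterion is an assumption of this sketch."""
    best = None
    for a, b in combinations(sorted(scores_by_feature), 2):
        rho = spearman(scores_by_feature[a], scores_by_feature[b])
        if best is None or abs(rho) < abs(best[1]):
            best = ((a, b), rho)
    return best


# Made-up scores for three hypothetical features on the same five trials.
toy_scores = {
    "feat_a": [1, 2, 3, 4, 5],
    "feat_b": [2, 4, 6, 8, 10],
    "feat_c": [3, 1, 5, 2, 4],
}
pair, rho = least_correlated_pair(toy_scores)
```

With 6 features, as in the thesis, the same loop would visit the 15 two-feature combinations mentioned in the abstract.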
