多路并行实时说话人识别算法研究与实现

Research and Implementation on Real-time Speaker Recognition Algorithm in Multiplexing Parallel Model

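【Author】 罗晓亭 (Luo Xiaoting)

【Advisor】 吉立新 (Ji Lixin)

【Author information】 PLA Information Engineering University, Military Communications, 2010, Master's thesis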

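【Summary】 With the rapid development of communication technology, telephone communication has increasingly become a platform for contact and information exchange, and multi-channel parallel speaker recognition based on telephone speech has become a widely studied topic. Many current speech recognition systems are built on computer software or on a DSP platform; they are flexible to implement, but their real-time performance is poor for multi-channel parallel telephone-speech applications. FPGA (Field-Programmable Gate Array) chips offer a high clock frequency, low internal latency, and control logic implemented entirely in hardware; they are fast and efficient, and well suited to high-speed transfer control of large data volumes. A speaker recognition system built on DSP+FPGA can exploit the strengths of both chips. Preprocessing and feature-parameter extraction handle large data volumes and demand high processing speed, but their computational structure is relatively simple, so they suit hardware implementation on an FPGA. Template matching handles relatively little data but has a complex algorithmic structure, so it suits a DSP chip, with its high computation speed, flexible addressing modes, and strong communication mechanisms. The DSP+FPGA structure therefore balances speed and flexibility. The main work of this thesis includes: (1) Because a multi-channel parallel real-time speaker recognition system places high demands on data throughput, computation speed, and resource usage, a system implementation scheme based on an FPGA+DSP platform is proposed. (2) The feature parameters and recognition methods commonly used in speaker recognition are studied, and, trading off resource usage against computational complexity according to the characteristics of a parallel speaker recognition system, a multi-channel parallel speaker recognition system based on MFCC+VQ is designed. (3) The classical VQ algorithm for speaker-recognition template matching is studied and, to meet the accuracy and speed requirements of a multi-channel parallel system, improved in two respects: for recognition accuracy, a codeword-separability-weighted VQ algorithm is proposed; for template-matching speed, a fast search algorithm based on the mean inequality (ENNS) is proposed for the codeword-separability-weighted VQ algorithm. Simulation tests show that both recognition accuracy and template-matching speed improve to some extent. (4) The DSP platform design is completed, including the design and testing of the data interfaces (the HPI interface between the DSP and the host, and the EMIFA interface between the DSP and the FPGA) and the configuration of the DSP system registers. The processing flow for multi-channel parallel real-time speaker recognition is designed, and the improved VQ algorithm is designed, implemented, and optimized on the TI DSP6455 platform. Experiments measure the relative error between the fixed-point results on the platform and the VC floating-point results, and the template-matching time. (5) The speaker recognition system is integrated, debugged, and performance-tested. The test results show that the system can process 32 parallel channels of telephone speech in real time with relatively high recognition accuracy, meeting the design requirements.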

【Abstract】 With the rapid development of communication technology, telephone communication has become a main platform for contact and information exchange, and multi-channel parallel speaker recognition based on telephone speech has become a widely studied topic. Many current speaker recognition systems are based on computer software or a DSP chip; they are flexible to implement, but their real-time performance is poor for telephone speech. An FPGA (Field-Programmable Gate Array) chip offers a high clock frequency, low internal delay, and control logic implemented entirely in hardware; it is fast and efficient, and suits high-speed transfer control of large data streams. The DSP+FPGA structure has two characteristics: first, it is flexible, general-purpose, and suited to modular design, which raises computational efficiency and makes it applicable to practical processing systems; second, it has a short development cycle, and the resulting system is easy to maintain and upgrade. The main work of this thesis includes: (1) Considering the data throughput, resource requirements, and computation speed that multi-channel parallel speaker recognition demands, a system based on DSP and FPGA is proposed. (2) After studying feature parameters and matching models, and according to the characteristics of multi-channel parallel speaker recognition, a recognition algorithm based on Mel Frequency Cepstrum Coefficients (MFCC) and Vector Quantization (VQ) is designed. (3) To meet the system's requirements on recognition rate and processing speed, an improved VQ algorithm based on code-vector separability is introduced, which improves recognition performance; to improve matching speed, equal-average nearest neighbor search (ENNS) is added to the improved VQ algorithm. Simulation shows that both recognition performance and matching speed improve. (4) The DSP-side design is completed, including the data interfaces (HPI for DSP-host communication and EMIF for DSP-FPGA communication) and the DSP register configuration; the improved VQ algorithm is implemented and optimized on the TI 6455; and experiments measure the relative error between the DSP fixed-point and VC floating-point results, as well as the matching time. (5) Experiments show that the system can process multi-channel parallel telephone speech and achieves good recognition performance.
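As an illustration of item (3), the following is a minimal Python sketch of VQ template matching with equal-average nearest neighbor search (ENNS). It assumes plain squared Euclidean distortion and does not reproduce the thesis's code-vector separability weighting, whose exact form is not given in the abstract. It also assumes that feature vectors (for example MFCC frames) have already been extracted. The function names `build_index`, `enns_nearest`, and `identify` are hypothetical. The pruning relies on the bound d(x, c) >= k * (mean(x) - mean(c))^2 for k-dimensional vectors, which follows from the Cauchy-Schwarz inequality.

```python
from bisect import bisect_left


def sq_dist(x, c):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, c))


def mean(v):
    """Arithmetic mean of the components of a vector."""
    return sum(v) / len(v)


def build_index(codebook):
    """Sort codewords by component mean; return (means, codewords)."""
    pairs = sorted(((mean(c), list(c)) for c in codebook), key=lambda p: p[0])
    return [m for m, _ in pairs], [c for _, c in pairs]


def enns_nearest(x, means, codewords):
    """Exact nearest-codeword search with mean-based pruning.

    Returns (minimum squared distance, number of full distance evaluations).
    """
    k = len(x)
    mx = mean(x)
    # Start from a codeword whose mean is close to the mean of x.
    start = min(bisect_left(means, mx), len(means) - 1)
    best = sq_dist(x, codewords[start])
    evaluated = 1
    # Walk towards smaller means; once the lower bound reaches `best`,
    # every remaining codeword in this direction can be skipped.
    i = start - 1
    while i >= 0 and k * (mx - means[i]) ** 2 < best:
        best = min(best, sq_dist(x, codewords[i]))
        evaluated += 1
        i -= 1
    # Walk towards larger means with the same stopping rule.
    j = start + 1
    while j < len(means) and k * (means[j] - mx) ** 2 < best:
        best = min(best, sq_dist(x, codewords[j]))
        evaluated += 1
        j += 1
    return best, evaluated


def identify(frames, speaker_codebooks):
    """Pick the speaker whose codebook gives the lowest average distortion."""
    scores = {}
    for name, codebook in speaker_codebooks.items():
        means, codewords = build_index(codebook)
        total = sum(enns_nearest(f, means, codewords)[0] for f in frames)
        scores[name] = total / len(frames)
    return min(scores, key=scores.get), scores
```

Calling `identify(frames, {"A": codebook_a, "B": codebook_b})` returns the best-matching speaker label together with the per-speaker average distortions. The search is exact: pruning only skips codewords whose lower bound already rules them out, so the saving appears as fewer full distance computations rather than as a change in the result.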

【Key words】 Speaker Recognition; DSP; FPGA; Vector Quantization (VQ); Mel-Frequency Cepstral Coefficients (MFCC)