Study on Target Sound Recognition System Based on Auditory Bionics (基于听觉仿生的目标声音识别系统研究)

【Author】 Zhang Wenjuan (张文娟)

【Supervisor】 Dai Ming (戴明)

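【Author information】 Graduate University of Chinese Academy of Sciences (Changchun Institute of Optics, Fine Mechanics and Physics), Optical Engineering, 2012, Ph.D.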

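【Abstract (translated from the Chinese)】 Target sound recognition is an important branch of sound recognition, and its development has greatly improved working efficiency, quality of life, and quality of service. However, because sounds vary over a wide range, a recognition system has difficulty matching them precisely, and recognition performance is easily degraded by volume, timbre, speed, and background noise. It is therefore necessary to study and design a target sound recognition system with a high recognition rate and high robustness. As research on sound signal processing has deepened, it has been found that the human auditory system has unique advantages in identifying objects by sound: it accurately extracts the features of a target sound and precisely identifies its direction, category, and content. Target sound recognition based on bionics of the human ear is consequently receiving growing attention. This thesis therefore studies target sound recognition based on auditory bionics systematically, exploring ear-bionics theory, feature extraction, target sound classification, and FPGA-based hardware implementation of the recognition system. The main research content and results are as follows:

1. By analyzing the physiological structure of the human auditory system and the way it perceives sound, a fairly complete mathematical model of the auditory system is established to simulate how the ear processes sound. Simulation experiments show that the model reproduces well the frequency-dividing filter function of the cochlear basilar membrane and the energy conversion performed by the inner hair cells.

2. Several common sound feature extraction methods are analyzed and compared and, in response to their generally poor robustness, a feature extraction method based on the auditory spectrum is proposed. The method processes the sound signal with the mathematical model of the auditory system, so its principle matches the way the ear processes sound. It extracts sound features well, avoids losing key information, and improves the noise immunity and recognition rate of the system.

3. After a comparative study of common pattern recognition methods, and taking the nonlinear character of sound into account, a BP neural network is chosen for its strong adaptive ability to recognize and classify target sound signals. The approach is intuitive and has a clear mathematical meaning. Simulation experiments show that the classifier designed with the BP neural network reaches an average recognition rate of 93.14% over all test samples, which indicates that the method is effective for classifying target sound features.

4. Building on the auditory system model, the auditory spectrum feature extraction method, and the BP neural network recognition algorithm, and weighing algorithm complexity, required hardware resources, and external interfaces, the thesis proposes an FPGA embedded development platform for the hardware design of the target sound recognition system. The hardware uses VHDL to simulate the frequency-dividing function of the cochlear basilar membrane and to implement the basilar membrane filters, and uses a Nios II soft-core processor to implement the inner hair cell model, the cochlear nucleus model, the auditory-spectrum feature extraction algorithm, and the BP neural network classifier. Finally, repeated recognition experiments were carried out on the FPGA-based system for five target sounds: cannon, ambulance, ship, train, and taxiing aircraft. The results show that, among the five test sets, ambulance samples had the highest recognition rate, 97.14%, cannon samples had the lowest, 85.71%, and the average recognition rate over all test samples was 91.43%. The experiments demonstrate that the auditory bionic system implemented on FPGA hardware recognizes sounds well and that the overall scheme is feasible and effective.

The thesis successfully applies auditory bionics and FPGA hardware technology to a target sound recognition system, providing theoretical support and a technical reference for related research and engineering practice.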

【Abstract】 Target sound recognition is one of the important branches of sound recognition, and its development has greatly improved people's working efficiency, quality of life, and service quality. However, because sound varies over a wide range, it is very difficult for a sound recognition system to match sounds accurately, and recognition performance is easily reduced by volume, timbre, speed, and background noise. It is therefore essential to study and design a target sound recognition system with a high recognition rate and high robustness.

With further study of audio signal processing, it has been found that the human auditory system has a unique superiority in listening and distinguishing: it can extract the features of a target sound accurately and recognize its direction, category, and content precisely. Target sound recognition based on ear bionics is attracting increasing attention. This thesis therefore studies target sound recognition based on auditory bionics systematically, and explores ear bionic theory, feature extraction technologies, target sound classification technology, and FPGA-based hardware implementation of the recognition system. The research work is outlined as follows:

1. By analyzing the physiological structure of the auditory system and its perception process, a comparatively complete mathematical model of the auditory system is established, which simulates the ear's sound processing. Simulation experiments show that the model simulates well the frequency division and filtering of the cochlear basilar membrane and the energy conversion of the inner hair cells.

2. By analyzing and comparing common methods of audio feature extraction, and in light of their widespread poor robustness, an audio feature extraction method based on the auditory spectrum is proposed. The method processes the signal with the mathematical model of the auditory system, so it agrees with the ear's own processing. It extracts audio features well, avoids losing key information, and improves the anti-noise performance and the recognition rate of the system.

3. After comparing common methods of pattern recognition, and considering the nonlinear character of sound, a BP neural network, which has high adaptability, an intuitive form, and clear mathematical significance, is chosen to recognize and classify the target sound. Simulation experiments show that the average recognition rate over all test samples with the BP neural network reaches 93.14%, which suggests that it is an effective method for classifying and recognizing the features of target sound.

4. Based on the mathematical model of the auditory system, the auditory spectrum method, and the BP neural network, and considering algorithm complexity, hardware resources, interfaces, and so on, a hardware design for target sound recognition on an FPGA embedded development platform is proposed. The hardware system simulates the frequency division of the cochlear basilar membrane and implements the basilar membrane filters in VHDL (a hardware description language), and realizes the mathematical models of the inner hair cells and the cochlear nucleus, the feature extraction method based on the auditory spectrum, and the BP neural network on the FPGA. Finally, five different sounds (cannon, ambulance, steamship, train, and aircraft taxiing) are recognized repeatedly on the FPGA-based target sound recognition system. The test results show that the recognition rate is highest for the ambulance samples, 97.14%, and lowest for the cannon samples, 85.71%. The average recognition rate over all test samples reaches 91.43%. The results show that the auditory bionic system implemented on FPGA hardware is effective in sound recognition and that the scheme is feasible.

Auditory bionics and FPGA hardware technology are applied successfully to a target sound recognition system in this thesis, which provides theoretical support and a technical reference for the study of related technology and for engineering practice.
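The processing chain the abstract describes (frequency analysis by a cochlea-like filter bank, an auditory-spectrum feature vector, then a BP neural network classifier) can be illustrated with a small Python sketch. This is not the thesis's model. It assumes a crude stand-in for the auditory spectrum (log energy in log-spaced FFT bands), a one-hidden-layer network trained by backpropagation with a softmax output, and five synthetic noisy tones in place of the five recorded target sounds. All function names, parameters, and data here are hypothetical.

```python
import numpy as np

def auditory_spectrum(signal, sample_rate, n_channels=16, f_lo=100.0, f_hi=4000.0):
    """Assumed stand-in for an auditory spectrum: log energy in log-spaced bands."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)  # log spacing, loosely cochlea-like
    energy = [power[(freqs >= lo) & (freqs < hi)].sum()
              for lo, hi in zip(edges[:-1], edges[1:])]
    return np.log10(np.array(energy) + 1e-12)  # log compression of band energy

def forward(X, params):
    """Sigmoid hidden layer followed by a softmax output layer."""
    W1, b1, W2, b2 = params
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    Z = H @ W2 + b2
    Z = Z - Z.max(axis=1, keepdims=True)  # stabilise the softmax
    P = np.exp(Z)
    P /= P.sum(axis=1, keepdims=True)
    return H, P

def train_bp(X, y, n_hidden=12, n_classes=5, lr=0.5, epochs=1000, seed=0):
    """Batch gradient descent with backpropagated cross-entropy gradients."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0, 0.5, (X.shape[1], n_hidden)), np.zeros(n_hidden),
              rng.normal(0, 0.5, (n_hidden, n_classes)), np.zeros(n_classes)]
    T = np.eye(n_classes)[y]  # one-hot targets
    for _ in range(epochs):
        H, P = forward(X, params)
        dZ = (P - T) / len(X)                        # output-layer error
        dH = (dZ @ params[2].T) * H * (1.0 - H)      # error propagated to hidden layer
        grads = [X.T @ dH, dH.sum(axis=0), H.T @ dZ, dZ.sum(axis=0)]
        for p, g in zip(params, grads):
            p -= lr * g                              # in-place weight update
    return params

def run_demo(seed=0):
    """Train on synthetic noisy tones (hypothetical data) and return test accuracy."""
    rng = np.random.default_rng(seed)
    sample_rate, n = 8000, 2000
    tones = [200.0, 400.0, 800.0, 1600.0, 3200.0]    # one tone per synthetic class
    t = np.arange(n) / sample_rate
    y = rng.permutation(np.repeat(np.arange(5), 40))
    X = np.array([auditory_spectrum(
        np.sin(2 * np.pi * tones[k] * t + rng.uniform(0, 2 * np.pi))
        + 0.3 * rng.normal(size=n), sample_rate) for k in y])
    X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
    mu, sigma = X_train.mean(axis=0), X_train.std()  # normalise with training statistics
    params = train_bp((X_train - mu) / sigma, y_train)
    _, P = forward((X_test - mu) / sigma, params)
    return float((P.argmax(axis=1) == y_test).mean())
```

Calling `run_demo()` returns the fraction of held-out synthetic samples classified correctly. Because the inputs are clean tones rather than real recordings, that figure says nothing about the recognition rates reported in the abstract. The sketch only shows how a band-energy feature vector feeds a backpropagation-trained classifier.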
