Mixed Near-Field and Far-Field Speech Source Localization Based on Microphone Arrays (基于麦克风阵列的近场和远场混合声源定位)

【Author】 姜锦云 (Jiang Jinyun)

【Advisor】 王建英 (Wang Jianying)

【Author Information】 Southwest Jiaotong University, Signal and Information Processing, 2013, Master's thesis

【Abstract】 Sound source localization is a prerequisite for speech recognition and speech enhancement, and it has broad application prospects. With advances in digital signal processing and array signal processing, microphone arrays have been widely used for sound source localization. However, most existing microphone-array localization techniques assume that the sources lie either entirely in the near field or entirely in the far field, and most also assume narrowband sources, whereas real speech is a wideband signal. To address these problems, this thesis studies microphone-array sound source localization when near-field and far-field sources are mixed. The main contents are as follows.

First, the characteristics of speech signals are analyzed, the conventional narrowband and wideband signal processing models are introduced, and the far-field and near-field models of a uniform linear microphone array are studied.

Second, because a microphone array picks up various kinds of noise in addition to the desired speech, the recorded data must be preprocessed. Preprocessing includes pre-filtering, pre-emphasis, normalization, windowing and framing, short-time energy detection, and speech denoising. Voice activity detection is also studied: to combine the respective advantages of time-domain logarithmic energy and frequency-domain subband spectral entropy, a new log-energy subband spectral entropy method is adopted.

Third, the near-field MUSIC algorithm is studied, the signal model for sources located in both the far field and the near field is analyzed, and a MUSIC-based algorithm for localizing wideband speech signals in the mixed field is given. The algorithm first separates the direction of arrival (DOA) of each source from its range and derives a new direction matrix that contains only DOA information. It then applies MUSIC to obtain the DOAs of all sources. Finally, using the estimated DOAs and the range characteristics of the far field, it applies MUSIC again to localize both the far-field and the near-field sources.

Fourth, sound source localization based on sparse decomposition is studied for both the near-field and the mixed-field cases. For the mixed field, a method for constructing an atom dictionary suited to the microphone array is given, based on the mixed-field signal model, and the matching pursuit algorithm is then used to estimate the source directions. Simulation experiments show that this algorithm is robust at low signal-to-noise ratios.
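The dictionary-plus-matching-pursuit idea in the last item can be illustrated with a minimal sketch. It is not the thesis's algorithm: the abstract does not give the dictionary construction or the wideband processing, so everything below is an assumption made for illustration. That includes the function names, the single frequency bin, the speed of sound, the exact spherical-wave array response, and the grid over angle and range in which a far-field source is represented by an atom with a large range.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the thesis


def steering_vector(theta_deg, r, mic_x, freq):
    """Spherical-wave response of a linear array (mics on the x-axis) to a
    source at angle theta_deg from broadside and range r from the origin.
    Amplitude decay is ignored; a large r approximates a far-field source."""
    theta = math.radians(theta_deg)
    src_x, src_y = r * math.sin(theta), r * math.cos(theta)
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND  # wavenumber
    # Phase from the path-length difference relative to the array origin.
    return [cmath.exp(-1j * k * (math.hypot(src_x - x, src_y) - r))
            for x in mic_x]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def build_dictionary(thetas_deg, ranges, mic_x, freq):
    """One unit-norm atom per (angle, range) grid point (hypothetical layout)."""
    norm = math.sqrt(len(mic_x))
    atoms = {}
    for th in thetas_deg:
        for r in ranges:
            a = steering_vector(th, r, mic_x, freq)
            atoms[(th, r)] = [v / norm for v in a]
    return atoms


def matching_pursuit(y, atoms, n_sources):
    """Greedy matching pursuit on one array snapshot y.
    Returns the selected (angle, range) keys and the final residual."""
    residual = list(y)
    picks = []
    for _ in range(n_sources):
        # Select the atom most correlated with the current residual.
        best = max(atoms, key=lambda key: abs(inner(atoms[key], residual)))
        coeff = inner(atoms[best], residual)
        # Remove that atom's contribution from the residual.
        residual = [r_m - coeff * a_m
                    for r_m, a_m in zip(residual, atoms[best])]
        picks.append(best)
    return picks, residual
```

For a single noise-free source that lies on the grid, the first selected atom is the true one, because the correlation is maximized only by the matching atom. With several sources, neighboring atoms are highly correlated (especially along the range axis), so plain matching pursuit on one snapshot may select a nearby grid point instead. How the thesis handles this is not stated in the abstract.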
