Research on Nonlinear Filtering with Application to Speaker Tracking

【Author】 Hou Daiwen (侯代文)

【Author information】 Dalian University of Technology, Signal and Information Processing, 2008, Ph.D.

【Abstract】 Speaker localization is an important technique in acoustic signal processing, with applications in fields such as speech enhancement, video conferencing, human-computer interfaces, and robot navigation. Traditional approaches to speaker localization, which collect data from several microphones and use only the current frame of data to estimate the current source location, may give wrong results because noise and reverberation cause spurious peaks in the localization function. It is therefore necessary to track the speaker location with a state-space approach to improve localization accuracy. Speaker tracking is a typical nonlinear filtering problem. Taking the posterior probability density function as the guiding thread, this dissertation studies nonlinear filtering methods, including Gaussian and non-Gaussian filters, and improves them in accuracy, robustness, and computational complexity. In addition, targeted improvements are proposed for applying nonlinear filters to the speaker tracking problem. The main contributions of this dissertation are as follows:

(1) Under the Gaussian assumption, an iterated sigma point Kalman filter (ISPKF) is proposed that improves estimation accuracy by using the new measurement iteratively. The iteration is formulated as a nonlinear optimization process, and a new update step based on the Levenberg-Marquardt algorithm is proposed, which ensures convergence of the estimate and reduces the mean square estimation error.

(2) Traditionally, Bayesian estimators have been based on minimizing the H₂ norm of the estimation error. This type of estimator assumes that the system model and the noise have known statistical properties. In practice, accurate system models and the statistical nature of the noise processes are not readily available. An H∞ sigma point Kalman filter (HSPKF) is therefore presented under the H∞ performance criterion. Because the sigma point transformation is used instead of a Taylor series expansion, the linearization error of the nonlinear system is reduced. Moreover, noise uncertainty is handled by the H∞ filtering method, which enhances the robustness of the system.

(3) Within the particle filtering framework, a mean shift quasi-Monte Carlo (MS-QMC) method is proposed in which low-discrepancy sequences replace random draws, following a quasi-Monte Carlo integration rule. Using deterministic points makes it possible to choose points that provide the best possible spread over the sample space and thus attain a low integration error. In addition, the mean shift technique is used to estimate the gradient of the approximated density and move particles toward the modes of the posterior. This leads to a more effective allocation of particles, so that fewer particles are needed and the computational demand is reduced.

(4) A particle filter based on sufficient statistics is put forward to deal with sample impoverishment and high computational complexity in particle filtering. If the posterior density function depends on the observed data only through a set of sufficient statistics that is straightforward to update, both problems can be mitigated by propagating the sufficient statistics instead of the posterior density function. Because new samples are drawn from a continuous density rather than a discrete one, resampling is not required in the new filter, which reduces complexity compared with the sequential importance sampling filter with resampling.

(5) A new interacting multiple model (IMM) particle filtering algorithm based on sampling interaction is developed to track a randomly moving speaker according to the speaker's dynamic characteristics. The interaction step in the new algorithm is accomplished by properly selecting the sampling region. As a result, the number of particles in each mode can be controlled, so the degeneracy problem around mode transitions is avoided. The Gaussian assumption on the posterior density function of the state is also abandoned, so the filter can adapt to arbitrary distributions and the robustness of the speaker tracking system is enhanced.

(6) Information fusion techniques are applied to speaker tracking: both the direction of arrival (DOA) and the time difference of arrival (TDOA) of the speech source are used to localize the speaker. Because the two measurement modalities differ in the level of information they provide about the state, a layered sampling method is employed in the proposed filter. The posterior density function of the DOA-based filter is used as the proposal distribution of the TDOA-based filter, so that the new filter exploits the information in the most recent observation and guides the search in the state space effectively. The speaker localization error is thereby decreased.
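As a small illustration of the idea behind contribution (3), the following sketch compares nothing more than a deterministic low-discrepancy point set against a known integral. It is not the MS-QMC filter from the dissertation: the van der Corput sequence, the test integrand x², and the function names `van_der_corput`, `qmc_mean`, and `square` are all assumptions chosen for illustration.

```python
def van_der_corput(index, base=2):
    """Return the index-th element (index >= 1) of the van der Corput sequence.

    The digits of `index` in the given base are mirrored around the radix
    point, which spreads successive points evenly over [0, 1).
    """
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def qmc_mean(func, n_points, base=2):
    """Estimate the integral of func over [0, 1] using low-discrepancy points."""
    total = sum(func(van_der_corput(i, base)) for i in range(1, n_points + 1))
    return total / n_points


def square(x):
    """Illustrative integrand; its exact integral over [0, 1] is 1/3."""
    return x * x


estimate = qmc_mean(square, 1024)
error = abs(estimate - 1.0 / 3.0)
print(f"QMC estimate: {estimate:.6f}, abs error: {error:.2e}")
```

Because the points are deterministic, the estimate is reproducible from run to run. A particle filter built on this idea would additionally need to map such points through the state transition and weight them by the likelihood, which this sketch does not attempt.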

【Key words】 Nonlinear Filtering; Speaker Tracking; Microphone Array; Bayesian Estimation; Kalman Filter; Particle Filter; Monte Carlo; Unscented Transformation

【Online publication contributor】 Dalian University of Technology

【Classification number】 TN912.34
【Times cited】 8
【Downloads】 587