Research and Implementation of Connected Digit Speech Recognition System (连续数字语音识别系统的研究与实现)

【Author】 章学勇 (Zhang Xueyong)

【Advisor】 何丕廉 (He Pilian)

【Author information】 Tianjin University, Computer Software and Theory, 2006, Master's thesis

【Abstract】 With the development of computer and information technology, speech has become an essential means of human-computer interaction. Speech recognition is an important direction in computer technology: it has built up a substantial body of theory, and research on PC-based recognition systems has produced some technical results. Although the underlying theory is fairly mature and the technology has entered commercial use, the diversity of speech means that no general-purpose platform suits every application, so a given domain often requires dedicated development to meet practical needs. This thesis first reviews the state of speech recognition research in China and abroad and analyzes the difficulties of recognizing connected Chinese digits, and on that basis sets out the background and significance of the work. It then introduces the digital model of speech, endpoint detection, and feature extraction, and settles on the algorithms and models used in the system. Recognition is based on the hidden Markov model (HMM); the system is designed and implemented on top of HTK (Hidden Markov Model ToolKit), taking into account the characteristics of remotely broadcast speech signals. The thesis describes in detail the technical difficulties and solutions in three stages: speech acquisition, speech recognition, and automatic plotting. The system uses automatic overlapping of speech segments to reduce segmentation errors and improve recognition accuracy, and it models and recognizes the two broadcast modes, digits and telegraph code, separately. For flight track plotting, the thesis discusses how recognized digit strings are segmented, how track-point data are stored, and how the route is fitted with cubic splines during plotting. It closes with a summary of the Speech Recognition and Skyway Simulation System and an outlook on future work.

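  • 【Online publication contributor】 Tianjin University
  • 【Online publication issue】 2007, Issue 01
  • 【Classification code】 TN912.34
  • 【Citation count】 10
  • 【Download count】 445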