Research on Pattern Recognition Technology and Its Application to Character Recognition

【Author】 汪芳

【Advisor】 康慕宁

【Author information】 Northwestern Polytechnical University, Computer Software and Theory, 2004, Master's thesis

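【Summary】 The goal of pattern recognition research is to build machine systems that simulate, on a computer, the mechanisms by which the human brain recognizes things, so that they can take over classification and identification tasks from people and process information automatically. Pattern recognition has great practical significance in many areas of everyday life and scientific research, and it is already widely used in many fields. With the rapid development of computer technology, artificial intelligence, and research on human thinking, pattern recognition is advancing to higher and deeper levels. Researchers have begun to study how computer systems can interpret images and understand the external world much as the human visual system does. This is known as image understanding or computer vision, and it has produced many important results, character recognition among them. Character recognition is a typical pattern recognition problem and a very important application area within the field. As a means of information processing it has a broad range of applications, and strong market demand is the fundamental driving force behind its rapid development. Research on character recognition is therefore significant both theoretically and practically.

This thesis gives a comprehensive account of the feature extraction and classification methods used in character recognition and analyzes in depth the relationship between integration and classification. Following the basic idea of the comprehensive integration method, it then proposes recognition and integration methods suited to the characteristics of a typical Chinese character set. On this basis, a printed Chinese character recognition system is built.

The Chinese character set is large, structurally complex, and rich in similar characters. Its size makes it difficult to apply a network directly for classification and integration, while the complex structures and many similar characters make it hard for traditional structural analysis and statistical recognition methods to achieve satisfactory results. To address these problems, the thesis improves the proposed network integration method, presents three minimum distance classifiers that extract different local features, and uses the above methods to build an integrated recognition system. Test results show that the recognition rate after integration is higher than that of the best single classifier, which demonstrates the effectiveness of the methods.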

【Abstract】 The aim of pattern recognition technology is to construct a system that automatically classifies, recognizes, and processes information by simulating the mechanisms of human thinking on a computer. Pattern recognition is of great significance in daily life and scientific research, and it has already been applied in many fields. Character recognition is a very important and active research area in pattern recognition. Theoretically, it is not an isolated technique: it involves the problems that every other area of pattern recognition must face. Practically, as a kind of information processing technology, character recognition has a very broad range of applications, and market demand is the fundamental driving force behind its rapid development. Research on character recognition is therefore of both theoretical and practical significance.

In this thesis, the feature extraction and classification methods frequently used in character recognition are described, and the relationship between classification and integration is thoroughly analyzed. Following the basic idea of comprehensive integration, classification and integration methods are developed and a recognition system is built for a typical character set.

The Chinese character set is large, structurally complex, and contains many similar characters. Its size makes it difficult to use a neural network directly for classification and integration. The complex structures and many similar characters make it very hard to obtain satisfactory classification results with traditional structural analysis and statistical methods. To address these problems, the proposed network integration method is improved. Three minimum distance classifiers, each extracting different local features, are proposed and combined into an integrated system using the above methods. The test results show that the recognition rate of the integrated system is higher than that of the best single classifier.
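
As a rough illustration of the final step, the sketch below shows a minimum distance classifier and one simple way to combine several of them. It is not the thesis's method. The names `MinimumDistanceClassifier` and `combine_by_summed_distance`, the Euclidean metric, the class-mean prototypes, and the summed-normalized-distance rule are all assumptions made here for illustration. They stand in for the improved network integration method, which the abstract mentions but does not specify.

```python
import numpy as np


class MinimumDistanceClassifier:
    """Assigns each sample to the class whose mean feature vector is nearest."""

    def fit(self, features, labels):
        features = np.asarray(features, dtype=float)
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)  # sorted, so order is reproducible
        # One prototype (mean feature vector) per class.
        self.prototypes_ = np.stack(
            [features[labels == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def distances(self, features):
        features = np.asarray(features, dtype=float)
        # Euclidean distance from every sample to every class prototype.
        diff = features[:, None, :] - self.prototypes_[None, :, :]
        return np.linalg.norm(diff, axis=2)

    def predict(self, features):
        return self.classes_[self.distances(features).argmin(axis=1)]


def combine_by_summed_distance(classifiers, feature_sets):
    """Hypothetical integration rule (an assumption, not the thesis's network):
    normalize each classifier's distances per sample, add them up, and pick
    the class with the smallest total. All classifiers must have been trained
    on the same label set so that their class orderings match."""
    total = 0.0
    for clf, feats in zip(classifiers, feature_sets):
        d = clf.distances(feats)
        # Guard against division by zero when all distances are zero.
        total = total + d / np.maximum(d.sum(axis=1, keepdims=True), 1e-12)
    return classifiers[0].classes_[total.argmin(axis=1)]
```

Each classifier would be trained on a different local feature of the same characters, and `combine_by_summed_distance` would receive the matching feature arrays for the samples to be recognized. Summing normalized distances is only one of many possible fusion rules. Majority voting or a trained combiner would fit the same interface.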

  • 【Classification number】 TP391.4
  • 【Citations】 21
  • 【Downloads】 1860