

Research of Chinese Character Image Recognition Technology for the Multi-media Learning

【Author】 梁俊娟

【Advisor】 谈国新

【Author information】 华中师范大学 (Central China Normal University), Educational Technology, 2009, Master's degree

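【Abstract (translated from the Chinese)】 Character image recognition is a practical technology that uses computers to identify text in images automatically, and it is widely applied in many fields. Applied to early childhood education, it can offer young children a new form of human-computer interaction. Multimedia technology, for its part, can give children a learning environment for Chinese characters that is rich in pictures, text, and sound, but it lacks the ability to process real-world information. Combining the two therefore yields a Chinese character image recognition technology oriented toward multimedia learning. It not only recognizes character images automatically but also presents the recognized characters as multimedia information such as pictures, sound, video, audio, and animation. This compensates for the rote memorization typical of traditional character teaching, lowers the difficulty of learning characters, improves students' hands-on ability, and encourages children to learn more actively, earnestly, and attentively, providing a more effective path for teaching Chinese characters to young children.

Building on a study of Chinese character image recognition and combining it with multimedia information, this thesis implements the recognition of character images and the representation of multimedia information, and then uses these techniques to develop a prototype system for learning Chinese characters. Testing of the system shows that the proposed algorithm both recognizes Chinese character images effectively and supports learning through multimedia information. The main work is as follows:

(1) Feature selection and classifier design. Based on the characteristics of Chinese characters, stroke density and gray-scale projection are extracted as character features. Feature extraction is dynamic and depends on the stroke range, and a different classifier is then selected for recognition according to the range of the extracted features. For the gray-scale projection feature, the thesis recognizes character images by gray-scale projection template matching.

(2) Conflict handling mechanism. Conflicts among characters that arise during recognition are resolved by a conflict handling mechanism. Conflicts fall into two main categories: stroke-density conflicts and gray-scale projection conflicts. Stroke-density conflicts are in turn of two kinds: one character with multiple codes, and one code shared by multiple characters. A different handling mechanism is applied depending on the case.

(3) Learning mechanism. When a character still cannot be recognized after conflict handling, a learning mechanism is used to learn it, extracting its features and accepting the input of its multimedia information. According to the features extracted during learning, the mechanism is divided into two types: stroke-density learning and gray-scale projection learning. The learning mechanism is also used to determine automatically the threshold required in template matching.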

【Abstract】 Chinese character image recognition is a practical technology that uses computers to recognize Chinese characters in images automatically, and it is widely used in many areas. It can also be applied to early childhood education, where it offers young children a new form of human-computer interaction. In addition, multimedia technology can give children a learning environment for Chinese characters that is rich in images, text, and sound, although it cannot by itself process real-world information. The two technologies can therefore be combined into a Chinese character recognition technology designed for multimedia learning. It not only identifies character images automatically but also presents them as multimedia information such as pictures, sound, video, audio, and animation. This makes up for shortcomings of traditional education, such as the emptiness and uncertainty of the shape and meaning of Chinese characters and the reliance on rote learning. It also reduces the difficulty of learning Chinese characters, strengthens students' practical ability, and encourages children to learn more actively, seriously, and attentively, which together provide a more effective way for children to learn Chinese characters.

Based on research into Chinese character image recognition technology, and combined with multimedia information, this thesis implements character image recognition and the representation of multimedia information. These technologies are then used to develop a prototype system for Chinese character learning, and system testing shows the algorithm to be effective. The main contents are as follows:

(1) Feature selection and classifier design. Based on the characteristics of Chinese characters, stroke density and gray-scale projection are extracted as character features. Feature extraction is dynamic and depends on the stroke range, and different classifiers are then selected according to the range of the extracted features. For the gray-scale projection feature, the thesis uses template matching of gray-scale projections to recognize Chinese character images.

(2) The conflict management mechanism. A conflict management mechanism resolves the conflicts between characters that appear during Chinese character image recognition. Conflicts are of two main types, stroke-density conflicts and gray-scale projection conflicts; stroke-density conflicts in turn arise either as multiple codes for a single character or as multiple characters for a single code. The thesis adopts a different conflict management mechanism for each scenario.

(3) The learning mechanism. For Chinese characters that still cannot be identified during recognition, the thesis uses a learning mechanism to extract the features of the character and to input its multimedia information. According to the feature extracted during learning, the learning mechanism is divided into two kinds: the stroke-density learning mechanism and the gray-scale projection learning mechanism. For the problem of threshold selection in template matching, the thesis uses the corresponding learning mechanism to determine the threshold automatically.

Finally, the thesis develops a prototype system based on the above algorithms. The results show that the system identifies Chinese character images effectively and can support children’s learning with multimedia information.
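To make the idea of gray-scale projection template matching in item (1) more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the thesis's implementation: the function names, the use of an L1 distance between projection profiles, and the fixed `threshold` argument are all hypothetical choices (the thesis determines its threshold through a learning mechanism, which is not reproduced here). Images are assumed to be already normalized to the same size, with pixel values giving ink intensity and 0 meaning background.

```python
from typing import Dict, List, Optional, Sequence

# Assumed input format: ink intensity per pixel, 0 means background.
Image = Sequence[Sequence[float]]


def projection_profile(img: Image) -> List[float]:
    """Row sums followed by column sums, each divided by the total ink."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    total = sum(rows) or 1  # avoid dividing by zero on a blank image
    return [v / total for v in rows + cols]


def profile_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two profiles (an illustrative choice of metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_character(
    img: Image, templates: Dict[str, List[float]], threshold: float
) -> Optional[str]:
    """Return the label of the nearest template, or None if none is close enough."""
    feat = projection_profile(img)
    best_label, best_dist = None, float("inf")
    for label, tmpl in templates.items():
        dist = profile_distance(feat, tmpl)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None


# Toy 5x5 glyphs: a horizontal bar and a vertical bar.
HORIZONTAL = [[0] * 5, [0] * 5, [1] * 5, [0] * 5, [0] * 5]
VERTICAL = [[0, 0, 1, 0, 0] for _ in range(5)]
templates = {
    "一": projection_profile(HORIZONTAL),
    "丨": projection_profile(VERTICAL),
}

# A horizontal bar with one missing pixel should still match "一".
noisy = [[0] * 5, [0] * 5, [1, 1, 1, 1, 0], [0] * 5, [0] * 5]
result = match_character(noisy, templates, threshold=0.5)
```

Returning `None` when no template falls within the threshold loosely mirrors the situation the abstract describes in item (3), where an unrecognized character would be handed to a learning step; how the thesis actually does this is not specified here.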


