
Research on Trainable Online Uyghur Handwritten Character Recognition Technology (基于训练机制的联机维吾尔手写字母识别技术研究)

【Author】 木塔力甫·沙塔尔

【Supervisor】 李春庚

【Author information】 Dalian Maritime University, Computer Science and Technology, 2010, Master's degree

【Abstract】 Magnetic pens offer compact and comfortable input and have come into wide use as portable mobile computing devices have spread, so online handwriting recognition has become a hot research topic in pattern recognition. Online handwriting recognition gives users a natural and convenient means of human-computer interaction. A trajectory-capture device such as a writing tablet records the writer's pen input, and the system recognizes it in real time; the writer can also easily spot and correct misrecognized characters. Compared with offline recognition, the advantage of online recognition is that dynamic information is available while the pen tip moves. Many online handwriting recognition products for Chinese and English are already on the market, but online handwriting recognition for Uyghur is still at a preliminary research stage.

This thesis presents theoretical and experimental research on online recognition of handwritten Uyghur letters, covering trajectory data collection, preprocessing, feature extraction, and classifier design. In data collection, a custom data structure and a corresponding file format are used to store the handwriting samples. In preprocessing, the raw data are first smoothed by filtering; next, to preserve the structural information of Uyghur letters, linear normalization is applied according to how the letters are written; finally, resampling reduces the amount of data, which speeds up the subsequent computation. Feature extraction uses gradient direction features that combine structural and statistical features, which makes the extraction algorithm reasonably robust to distortion and deformation of the characters. Classification is performed with a support vector machine. In tests, as the number of samples increases, the recognition rate reaches 90.62%, 92.86%, 94.53%, and 96.09%, respectively. The experimental results indicate that gradient direction feature extraction gives good results: the highest classification accuracy is 96.09% and the lowest is no less than 90%. This work may also serve as a reference for research on similar scripts used in the Xinjiang Uyghur Autonomous Region, such as Kazakh and Kyrgyz.
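The preprocessing chain named in the abstract (smoothing, linear normalization, resampling) followed by a direction-based feature can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the thesis's implementation: the function names, the moving-average filter, the aspect-preserving scaling, the arc-length resampling, and the 8-bin stroke-direction histogram, which is a simplified stand-in for the gradient direction feature. The support vector machine stage is omitted.

```python
import math

def smooth(points, window=3):
    """Moving-average filter over (x, y) points; the window is truncated at the ends."""
    half = window // 2
    out = []
    for i in range(len(points)):
        seg = points[max(0, i - half):i + half + 1]
        out.append((sum(p[0] for p in seg) / len(seg),
                    sum(p[1] for p in seg) / len(seg)))
    return out

def normalize(points, size=1.0):
    """Linear normalization: shift to the origin and scale both axes by one
    factor, so the longer side equals `size` and the shape is not distorted."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    scale = size / span
    return [((x - min_x) * scale, (y - min_y) * scale) for x, y in points]

def resample(points, n=16):
    """Resample a stroke to n points (n >= 2) equally spaced along its arc length."""
    cum = [0.0]  # cumulative arc length at each input point
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:  # single point or zero-length stroke
        return [points[0]] * n
    out, j = [], 0
    for k in range(n):
        target = total * k / (n - 1)
        while j < len(cum) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0.0 else (target - cum[j]) / seg
        (x0, y0), (x1, y1) = points[j], points[j + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def direction_histogram(points, bins=8):
    """Length-weighted histogram of segment directions, normalized to sum to 1."""
    hist = [0.0] * bins
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue
        angle = math.atan2(dy, dx) % (2 * math.pi)
        idx = int(round(angle / (2 * math.pi / bins))) % bins
        hist[idx] += length
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Hypothetical single stroke captured from a tablet, as (x, y) samples.
stroke = [(0, 0), (5, 0), (10, 0)]
features = direction_histogram(resample(normalize(smooth(stroke)), n=16))
```

In a full system, feature vectors of this kind, one per letter sample, would be passed to the classifier. Scaling both axes by the same factor is one way to keep the letter's structure intact, which is the stated purpose of the normalization step.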
