
基于二叉树多层分类SVM的手写体汉字识别方法研究

Research on an Off-line Handwritten Chinese Character Recognition Method Based on Binary Tree SVM

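【Author】 Zhang Liping (张丽萍)

【Supervisor】 Wang Jianping (王建平)

【Author information】 Hefei University of Technology (合肥工业大学), Detection Technology and Automatic Equipment, 2008, Master's thesis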

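【Abstract (translated from the Chinese)】 Chinese character recognition has long been regarded as a pattern recognition problem of both theoretical significance and practical value, and as the ultimate goal of character recognition research; off-line handwritten Chinese character recognition is currently an active topic in the pattern recognition field. The support vector machine (SVM) is a learning method designed for prediction from limited samples. Built on the principle of structural risk minimization, it is a structured learning method that copes well with constructing high-dimensional models from a limited number of samples. Applying SVM theory to off-line handwritten Chinese character recognition is therefore of considerable theoretical and practical value. The main work of the thesis is as follows: 1) Division of characters by complexity and structure. A pixel-density method divides characters into simple and complex characters; a method combining horizontal and vertical projection histograms with connected components divides them into single-component characters (独体字) and compound characters (非独体字). 2) Construction of a binary tree SVM. For the complex multi-class problem in off-line handwritten Chinese character recognition, a binary-tree-structured SVM model for classifying handwritten characters is built on binary tree and SVM theory and used for coarse classification; implemented with an SVM toolbox, it successfully classifies several character types (simple, complex, single-component, compound, and so on). 3) Handwritten character recognition algorithm. Image features are extracted by combining several feature extraction methods, with a different method chosen according to the characteristics of each class of characters, and a "one-against-rest" SVM performs fine classification and recognition within each class. Experimental results show that combining binary tree SVM coarse classification with one-against-rest SVM fine classification fully exploits the SVM's strength in two-class problems compared with a single SVM method, and effectively improves classification accuracy and speed on the complex multi-class problem of off-line handwritten Chinese character recognition.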
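The two coarse cues named in item 1) can be illustrated with a minimal sketch. The abstract does not give the density threshold or say how the projection and connected-component cues are merged, so the threshold value, the function names, and the decision rule below are all assumptions made for illustration, not the thesis's procedure. Here a projection gap alone decides "compound", and the component count is exposed as a separate cue.

```python
from collections import deque

import numpy as np


def pixel_density(img):
    """Fraction of foreground (stroke) pixels in a binary character image."""
    return float(np.asarray(img, dtype=bool).mean())


def has_projection_gap(img):
    """True if an empty row band or column band lies between inked regions."""
    img = np.asarray(img, dtype=bool)
    for axis in (0, 1):
        inked = img.any(axis=axis)  # axis=0: one flag per column; axis=1: per row
        idx = np.flatnonzero(inked)
        if idx.size and not inked[idx[0]:idx[-1] + 1].all():
            return True
    return False


def count_components(img):
    """Count 8-connected foreground components with a breadth-first search."""
    img = np.asarray(img, dtype=bool)
    seen = np.zeros_like(img)
    h, w = img.shape
    count = 0
    for r in range(h):
        for c in range(w):
            if img[r, c] and not seen[r, c]:
                count += 1
                seen[r, c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny, nx] and not seen[ny, nx]):
                                seen[ny, nx] = True
                                queue.append((ny, nx))
    return count


def coarse_label(img, density_threshold=0.25):
    """Assumed rule: density decides simple/complex, a projection gap decides structure."""
    complexity = "complex" if pixel_density(img) > density_threshold else "simple"
    structure = "compound" if has_projection_gap(img) else "single-component"
    return complexity, structure


# Toy 8x8 glyphs: two separated blocks, and a single vertical stroke.
two_parts = np.zeros((8, 8), dtype=bool)
two_parts[:, 0:3] = True
two_parts[:, 5:8] = True
stroke = np.zeros((8, 8), dtype=bool)
stroke[1:7, 3] = True
```

On these toy glyphs, `coarse_label(two_parts)` gives a complex, compound label and `coarse_label(stroke)` a simple, single-component one. Real handwriting would need binarization and noise removal first, which this sketch leaves out.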

【Abstract】 Chinese character recognition is regarded both as a direction of considerable theoretical and practical value in the pattern recognition field and as an ultimate goal of character recognition research. Chinese character recognition is one area of the pattern recognition field. The Support Vector Machine (SVM) is a learning method designed for small-sample prediction and based on Statistical Learning Theory. It handles well the construction of high-dimensional models from small sample sets. Applying SVM theory to off-line handwritten Chinese character recognition is therefore of considerable theoretical and practical value. The primary contents of this thesis are: 1) Division of Chinese characters by complexity and structure. A method based on pixel density divides Chinese characters into simple and complex characters. A method combining horizontal and vertical projections with connected components divides them into inseparable (single-component) and separable characters. 2) Binary tree SVM. The problems of complex patterns and multi-class classification in off-line handwritten Chinese character recognition are addressed, and a classification and recognition method combining a binary tree SVM with a "one against rest" SVM is presented. The binary tree SVM multi-class classifier performs coarse classification. An SVM toolbox is used for the implementation. Chinese character images of various types are classified successfully with this classifier structure. 3) Handwritten Chinese character recognition. A feature extraction approach that combines six methods is proposed. Because different classes of characters have different characteristics, a different feature extraction method is adopted for each class. Classification based on the "one against rest" SVM is then used for recognition. The experimental results indicate that combining the binary tree SVM with the "one against rest" SVM fully exploits the SVM's advantage in two-class classification over simple SVM algorithms. The generalization ability is greatly improved. The new method yields higher precision and speeds up SVM multi-class classification.
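The coarse-then-fine scheme described above can be sketched as a small tree whose internal nodes are two-class SVMs and whose leaves are one-against-rest SVMs. This is only an illustration under stated assumptions: the thesis used an SVM toolbox that the abstract does not name, so scikit-learn's `SVC` and `OneVsRestClassifier` are swapped in here; the `Node`, `make_leaf`, `make_split`, and `predict` names, the linear kernel, and the toy 2-D features are hypothetical; and one shared feature vector is used at every node, whereas the thesis extracts different features per class.

```python
import numpy as np
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC


class Node:
    """Internal node: binary SVM router. Leaf: one-against-rest SVM over characters."""

    def __init__(self, router=None, left=None, right=None, leaf=None):
        self.router, self.left, self.right, self.leaf = router, left, right, leaf


def make_leaf(X, chars):
    """Fine classification: one binary SVM per character class (one-against-rest)."""
    return Node(leaf=OneVsRestClassifier(SVC(kernel="linear")).fit(X, chars))


def make_split(X, goes_right, left, right):
    """Coarse classification: a single two-class SVM picks the branch (0=left, 1=right)."""
    return Node(router=SVC(kernel="linear").fit(X, goes_right), left=left, right=right)


def predict(node, x):
    """Route one sample down the tree, then classify it at the leaf."""
    x = np.asarray(x, dtype=float).reshape(1, -1)
    while node.leaf is None:
        node = node.right if node.router.predict(x)[0] == 1 else node.left
    return node.leaf.predict(x)[0]


# Toy, well-separated 2-D "features" standing in for real character features.
X_left = np.array([[0, 0], [0, 1], [1, 0],      # class "A"
                   [0, 5], [1, 5], [0, 6],      # class "B"
                   [3, 2], [3, 3], [4, 2.5]])   # class "E"
y_left = np.array(["A"] * 3 + ["B"] * 3 + ["E"] * 3)
X_right = np.array([[10, 0], [10, 1], [11, 0],  # class "C"
                    [10, 5], [11, 5], [10, 6]]) # class "D"
y_right = np.array(["C"] * 3 + ["D"] * 3)

X_all = np.vstack([X_left, X_right])
goes_right = np.array([0] * len(X_left) + [1] * len(X_right))
tree = make_split(X_all, goes_right,
                  make_leaf(X_left, y_left), make_leaf(X_right, y_right))
```

A deeper tree, for example simple/complex at the root and single-component/compound below it, would be built by passing `make_split` nodes as the `left` and `right` children. Each leaf then only has to separate the characters of its own coarse group, which is where the claimed gain in speed would come from.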

  • 【Classification number】TP391.41
  • 【Times cited】1
  • 【Downloads】249