

Hierarchical Dynamic Chinese Character Database Based on Global Affine Transformation

【Author】 俎小娜

【Supervisor】 金连文

【Author Information】 South China University of Technology, Communication and Information Systems, 2008, Master's degree

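【Summary】 Static Chinese fonts have evolved from bitmap fonts through vector fonts to curve-outline fonts, and considerable progress has been made in reducing font storage. Microsoft's TrueType fonts and Adobe's PostScript font family use curve-outline techniques and also render attractive glyphs. However, because the number of Chinese characters is so large, these fonts still have limitations in Chinese information-processing applications. All mature Chinese fonts today are static: they lack stroke-order information and cannot simulate the process of writing a character. In addition, the large number of characters and the variability of their glyph shapes have held back the development of dynamic Chinese font technology. Against this background, this thesis designs and implements a new hierarchical dynamic Chinese character database. The main work includes: (1) Two basic component databases were built: a stroke database and a radical database. The radical database was rebuilt on the basis of the old radical database in reference [7]. (2) A semi-automatic method was designed for splitting characters into components when the radical database changes. (3) Global affine transformation was applied to the construction of the hierarchical Chinese character database, which automates the construction. Explicit formulas for the affine transformation parameters were derived, and experiments show how each kind of preprocessing (skeleton, contour, feature points, centroid) affects the global affine transformation. (4) Structural similarity was used to evaluate the simulated characters, because it judges their quality more objectively and effectively. A method for computing the structural similarity of binary images was derived. (5) Hierarchical Chinese character database technology was applied to the dynamic Chinese character database, and a hierarchical dynamic Chinese character database was built and implemented, which makes dynamic Chinese fonts feasible on embedded devices. (6) A hierarchical dynamic Chinese character database based on the stroke and radical databases, covering the two typefaces “楷体_GB2312” and “仿宋_GB2312”, was built and implemented on the Borland C++Builder 6 platform. By reusing components, the hierarchical database greatly reduces the storage needed for a Chinese character image database. The global affine transformation method used in this thesis makes the construction of the hierarchical database fully automatic. Using hierarchical database technology in the dynamic database substantially reduces its storage. This work is an important step toward making the hierarchical dynamic Chinese character database practical.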
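Item (3) refers to deriving explicit expressions for the affine transformation parameters. The thesis's own derivation is not reproduced in this record, so the following is only a minimal sketch of a related, simpler problem: fitting a 2D affine map by least squares when point correspondences between a stored component and its target position in a character are already known. The function names `fit_affine`, `apply_affine` and `solve3` are hypothetical, and the known-correspondence setup is an assumption that may differ from the global affine transformation procedure actually used.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Row = Tuple[float, float, float]


def solve3(m: Sequence[Sequence[float]], r: Sequence[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [list(row) + [rhs] for row, rhs in zip(m, r)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set (for example, collinear points)")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [x - f * y for x, y in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(src: Sequence[Point], dst: Sequence[Point]) -> Tuple[Row, Row]:
    """Least-squares fit of u = a11*x + a12*y + b1, v = a21*x + a22*y + b2.

    Assumes src[i] corresponds to dst[i]; this is an illustrative
    simplification, not the thesis's published formulas.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three matched point pairs")
    m = [[0.0] * 3 for _ in range(3)]  # normal-equation matrix
    ru = [0.0] * 3                     # right-hand side for the u coordinate
    rv = [0.0] * 3                     # right-hand side for the v coordinate
    for (x, y), (u, v) in zip(src, dst):
        h = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += h[i] * h[j]
            ru[i] += h[i] * u
            rv[i] += h[i] * v
    a11, a12, b1 = solve3(m, ru)
    a21, a22, b2 = solve3(m, rv)
    return (a11, a12, b1), (a21, a22, b2)


def apply_affine(params: Tuple[Row, Row], p: Point) -> Point:
    """Map a single point through the fitted affine transformation."""
    (a11, a12, b1), (a21, a22, b2) = params
    x, y = p
    return (a11 * x + a12 * y + b1, a21 * x + a22 * y + b2)
```

In a component-based database, a fit of this kind would let each character store only a component index plus six parameters per placed component rather than the component's full image, which is the storage saving the text attributes to component reuse.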

【Abstract】 Static computer Chinese fonts have developed from bitmap fonts through vector fonts to curve-contour fonts, which has greatly reduced font storage. Curve-contour fonts, represented by Microsoft's TrueType (TTF) and Adobe's PostScript, also display characters very well. However, the huge number of Chinese characters limits the use of these fonts. The well-known Chinese fonts available today are all static and share the same shortcoming: they contain no temporal (stroke-order) information about how characters are written, so they cannot show how to write a character correctly. The development of the Dynamic Chinese Character Database (DCCD) has been held back by the huge number and variable glyph shapes of Chinese characters. Against this background, a new Hierarchical Dynamic Chinese Character Database (HDCCD) is constructed and implemented in this thesis. The main work includes: (1) Two kinds of component database, a stroke database and a radical database, are constructed. The radical database is rebuilt from the old radical database proposed in reference [7]. (2) A method is proposed for splitting Chinese characters semi-automatically when the radical database changes. (3) Global Affine Transformation (GAT) is used to construct the Hierarchical Chinese Character Database (HCCD), which makes the construction of the HCCD automatic. Explicit expressions for the affine transformation parameters are derived. The effect of different kinds of preprocessing on GAT, including skeleton, contour, feature points and barycenter, is demonstrated by experiments. (4) Structural Similarity (SSIM) is applied to judge the quality of the simulated Chinese characters, because it gives a more objective and more effective judgement. A method for computing the SSIM of binary images is derived. (5) The HDCCD is implemented by applying HCCD technology to the DCCD, which makes it possible to use a DCCD on embedded systems. (6) An HDCCD containing the “KaiStyle_GB2312” and “FangSongStyle_GB2312” fonts, based on the stroke and radical databases, is implemented in Borland C++ Builder 6. The HCCD greatly reduces the storage of a Chinese character image database by reusing components. The GAT method applied in this thesis constructs the HCCD automatically. Applying HCCD technology substantially reduces the storage of the DCCD. This work improves the practicability of the HDCCD.
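Item (4) mentions computing SSIM for binary images. As a hedged illustration only, the sketch below evaluates the standard single-window (global) SSIM formula on two images whose pixels are 0 or 1. For such images the mean is the foreground fraction, the variance is mean × (1 − mean), and the covariance is the fraction of pixels set in both images minus the product of the means. The function name `binary_ssim`, the use of one global window, and the constants K1 = 0.01 and K2 = 0.03 with dynamic range 1 are assumptions; the method derived in the thesis may differ.

```python
from typing import Sequence


def binary_ssim(img_a: Sequence[Sequence[int]],
                img_b: Sequence[Sequence[int]],
                k1: float = 0.01,
                k2: float = 0.03) -> float:
    """Global SSIM of two binary (0/1) images with the same pixel count.

    Illustrative sketch: one window covering the whole image, population
    statistics, dynamic range L = 1 so that C1 = k1**2 and C2 = k2**2.
    """
    flat_a = [px for row in img_a for px in row]
    flat_b = [px for row in img_b for px in row]
    if not flat_a or len(flat_a) != len(flat_b):
        raise ValueError("images must be non-empty and have equal pixel counts")
    n = len(flat_a)

    mu_a = sum(flat_a) / n                     # foreground fraction of image A
    mu_b = sum(flat_b) / n                     # foreground fraction of image B
    both = sum(a * b for a, b in zip(flat_a, flat_b)) / n  # set in both images

    var_a = mu_a * (1.0 - mu_a)                # variance of a 0/1 variable
    var_b = mu_b * (1.0 - mu_b)
    cov = both - mu_a * mu_b                   # covariance of the two images

    c1, c2 = k1 ** 2, k2 ** 2                  # stabilizing constants
    numerator = (2.0 * mu_a * mu_b + c1) * (2.0 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator
```

Under these assumptions, a simulated character that matches the reference glyph pixel for pixel scores 1, and the score falls as the stroke pixels of the two images overlap less.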

