
汉字键盘输入和非键盘输入若干问题研究

Research on Several Problems of Chinese Character Keyboard Input and Non-keyboard Input

【Author】 Zhang Jianxun (张建勋)

【Advisor】 Wu Jianguo (吴建国)

【Author information】 Anhui University, Computer Application Technology, 2003, Master's degree

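【Abstract, translated from the Chinese】 This thesis studies the implementation of natural Chinese character input methods, covering both keyboard input and online handwritten input, and addresses several problems that arise in making such input natural. Here, a natural input method means one that can be mastered without much learning or training. Starting from the structure of Chinese characters, the thesis classifies strokes, encodes the characters of the national standard level-2 character set by their strokes, compiles a stroke coding dictionary, and gathers statistics on stroke information. Based on the dictionary and these statistics, it designs a stroke-coded input method and a keyboard that implements it.

Because Chinese characters have a high average stroke count, entering every stroke of a character makes the input code too long. To allow incomplete input codes, the thesis must solve the problem of matching a string pattern that contains fuzzy input symbols against a set of strings. Chapter 3 proposes a pattern matching algorithm for massive string sets, gives its concrete implementation and complexity analysis, and proposes an optimized retrieval tree (trie) structure that stores the string set in less memory. To speed up the algorithm, it also borrows ideas from KMP pattern matching and finite automaton matching.

To achieve natural input on a keyboard, the thesis proposes a new "stroke simulation" input method, which is particularly suited to the numeric keypads widely used on current information products. Rather than entering strokes directly, the user enters stroke feature points such as the start point, turning points, and end point, according to the stroke's shape and the direction of the pen movement. Strokes can be entered consecutively with no separator key between them, and strokes can be deleted backward after an input error. The method can be seen as a transition from keyboard input to online handwritten input.

Finally, building on the work above, the thesis gives a preliminary implementation of online handwritten Chinese character input. Based on the stroke coding dictionary and following the stroke simulation idea, it recognizes strokes first and characters second: feature points are extracted from the coordinate points on the pen trace, feature points form stroke segments, segments form strokes, and the stroke sequence identifies the character.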
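The matching problem described for incomplete stroke codes can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: it uses a plain trie without the optimized storage or the KMP and finite automaton speedups mentioned in the abstract. The stroke classes (digits 1 to 5), the wildcard symbols (`?` for exactly one stroke, `*` for any run of strokes), and the sample dictionary entries are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.chars: List[str] = []  # characters whose full stroke code ends here


class StrokeTrie:
    """Plain trie over stroke codes with wildcard lookup (illustrative only)."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, code: str, char: str) -> None:
        node = self.root
        for stroke in code:
            node = node.children.setdefault(stroke, TrieNode())
        node.chars.append(char)

    def match(self, pattern: str) -> List[str]:
        """Return all characters whose code matches the pattern.

        Assumed symbols: '?' matches one stroke, '*' matches zero or more.
        """
        results: List[str] = []
        seen: Set[Tuple[int, int]] = set()  # (node id, pattern index) already explored

        def walk(node: TrieNode, i: int) -> None:
            if (id(node), i) in seen:
                return
            seen.add((id(node), i))
            if i == len(pattern):
                results.extend(node.chars)
                return
            symbol = pattern[i]
            if symbol == "*":
                walk(node, i + 1)              # '*' matches nothing
                for child in node.children.values():
                    walk(child, i)             # '*' absorbs one more stroke
            elif symbol == "?":
                for child in node.children.values():
                    walk(child, i + 1)
            else:
                child = node.children.get(symbol)
                if child is not None:
                    walk(child, i + 1)

        walk(self.root, 0)
        return sorted(results)


# Hypothetical dictionary entries: 1=horizontal, 2=vertical, 3=left-falling,
# 4=dot/right-falling, 5=turning. The thesis's own stroke classes may differ.
trie = StrokeTrie()
for code, char in [("12", "十"), ("34", "人"), ("134", "大"), ("1234", "木"),
                   ("1134", "天"), ("121", "土"), ("1121", "王")]:
    trie.insert(code, char)
```

With these sample entries, `trie.match("1?1")` returns `['土']`, and `trie.match("1*4")` returns the three characters whose codes start with stroke 1 and end with stroke 4. The `seen` set keeps patterns with several `*` symbols from revisiting the same trie node at the same pattern position.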

【Abstract】 This thesis mainly studies the implementation of Chinese character input methods that can be mastered without specialized training, including keyboard input and online handwritten Chinese character recognition (OLCCR). After studying the structure of Chinese characters, we classified the strokes and encoded each character as a stroke string. Our basic work also included compiling a stroke coding dictionary for the Chinese National Standard Code for Information Interchange (GB 2312-80) and collecting statistics on stroke information. On the basis of the coding dictionary and the statistical data, we devised a keyboard input method and designed a key arrangement for it.

To avoid entering the complete stroke string for every character, the input code must be allowed to omit some elements. We therefore propose a fast pattern matching algorithm on massive string sets to solve the problem of fuzzy matching between a string pattern and a set of strings. To make the algorithm efficient in both space and time, we developed an optimized trie structure to store the string set and introduced ideas from Knuth-Morris-Pratt (KMP) and finite automaton (FA) string matching. The thesis describes the algorithm in detail and analyzes its space and running time costs.

To input Chinese characters naturally from a keyboard, we next present a new input method named "stroke simulation". The main idea is to input the feature points of a stroke, such as the start point, turning points, and end point, instead of inputting the stroke directly from the keyboard. The trail of feature points extracted from a stroke should be able to reproduce the shape of the stroke. The most attractive feature of this method is that users can input strokes continuously, with no extra key to separate consecutive strokes. In addition, the method supports deleting strokes backward so that users can correct input errors. Besides PCs, the method is suitable for portable devices with numeric keypads, such as mobile phones and electronic dictionaries. It can be viewed as a transitional step towards OLCCR.

The final part of our work is a preliminary implementation of OLCCR based on the stroke coding dictionary described above. Our method extracts feature points from the trails of the strokes; the feature points form stroke segments, and the stroke segments form strokes. We recognize handwritten Chinese characters from the stroke information obtained from the original input points.
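The step from raw pen coordinates to feature points and stroke segments can be sketched in Python as follows. The abstract does not state how the thesis detects feature points, so the rule used here (a point counts as a turning point when the heading changes by at least a threshold angle) is an assumption, as are the function names, the 45 degree threshold, and the eight compass-style direction labels. Real pen data is noisy and would need smoothing first.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def feature_points(trace: List[Point], angle_threshold_deg: float = 45.0) -> List[Point]:
    """Return the start point, turning points, and end point of one pen trace."""
    if len(trace) < 3:
        return list(trace)
    points = [trace[0]]
    for prev, cur, nxt in zip(trace, trace[1:], trace[2:]):
        heading_in = math.atan2(cur[1] - prev[1], cur[0] - prev[0])
        heading_out = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
        # Smallest absolute difference between the two headings, in degrees.
        turn = abs(math.degrees(heading_out - heading_in))
        turn = min(turn, 360.0 - turn)
        if turn >= angle_threshold_deg:
            points.append(cur)
    points.append(trace[-1])
    return points


def segment_directions(points: List[Point]) -> List[str]:
    """Label each segment between consecutive feature points with a coarse direction.

    Assumes screen coordinates, where y grows downward, so "S" means downward.
    """
    names = ["E", "SE", "S", "SW", "W", "NW", "N", "NE"]
    labels = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 360.0
        labels.append(names[int((angle + 22.5) // 45) % 8])
    return labels


# A hypothetical trace of a "horizontal then down" turning stroke.
trace = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
corners = feature_points(trace)
directions = segment_directions(corners)
```

For this trace, `corners` is `[(0, 0), (3, 0), (3, 3)]` and `directions` is `['E', 'S']`. A sequence of such direction labels is one possible input for classifying a stroke, which could then be looked up in a stroke coding dictionary.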

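  • 【Online publication contributor】 Anhui University
  • 【Online publication year/issue】 2004, Issue 01
  • 【Classification number】 TP391.43
  • 【Citation count】 3
  • 【Download count】 214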