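Research on Finger-Written Virtual Chinese Character Recognition and Its Application in a Multimodal Short-Message Interaction System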

Study on Finger-written Virtual Chinese Characters Recognition

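【Author】 Yang Duanduan (杨端端)

【Advisor】 Yin Junxun (尹俊勋)

【Author information】 South China University of Technology (华南理工大学), Circuits and Systems, 2007, PhD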

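【Abstract, translated from the Chinese original】 Video-based human-computer interfaces can increase a computer's perceptual ability and have therefore become an important research direction among new kinds of human-computer interface. If a machine that captures information through video can understand the interaction modes people are used to, such as gestures, body language, writing, and sketches, it can provide a natural, friendly, and effective interface. On the other hand, handwriting recognition modules are already widely embedded in machines to ease human-computer interaction. Capturing handwriting generally requires a touch screen (or tablet) and a stylus, but on portable devices the screen size is usually limited, so handwriting can only be used on a small screen. Capturing handwriting through video removes the constraint of the touch screen (or tablet) and lets users interact with the machine using traditional pen and paper, which is convenient and natural.

Pen and paper are very familiar to people, and video-based pen-and-paper interaction is a reasonable choice, but because it consumes a lot of paper and requires the user to carry a pen, it is neither environmentally friendly nor convenient enough. If the user's finger replaces the pen, an ordinary flat surface replaces the paper, and computer vision techniques let the user achieve pen-style interaction by writing with a finger, the result is a novel human-computer interface. Because it needs no paper in use, no touch screen in construction, and no pen to carry, it is environmentally friendly, convenient, and requires only simple equipment. Researchers have previously proposed applications that use the finger as an interaction medium, such as gesture recognition and finger mice, but no literature was found describing an interface for entering characters into a computer with a finger. Because writing a Chinese character with a finger on an ordinary surface leaves no stroke trace, the characters appear virtual and invisible, so this dissertation calls them finger-written virtual Chinese characters (FWVCCs). The dissertation studies the recovery, reconstruction, coding, and recognition of FWVCCs and proposes a series of methods.

1. Because writing an FWVCC is a process of gestural expression, the dissertation first studies foreground segmentation, with the goal of segmenting the hand from the video images. Considering the real-time requirement of the whole recognition system, it proposes a simple but effective improved single-Gaussian background model and uses it to segment the hand. Experiments show that this background model segments the user's hand well.

2. To reconstruct FWVCCs in real time, a new fingertip predictor is proposed: an improved Flat Functional-link Neural Network (FFNN). The improved FFNN and a Kalman filter are each used to predict the writing trajectory in real time. Experimental results show that the improved FFNN performs well on trajectory prediction and outperforms the Kalman filter.

3. To obtain the fingertip trajectory used for reconstruction, a fingertip localization algorithm based on contour analysis and circular feature matching is proposed. The method first extracts the hand contour, then uses grid sampling to find the approximate fingertip position, and finally determines the exact position by circular feature matching. Experiments show that fingertip localization accuracy can reach 98.5%. When the method is used in the FWVCC recognition system, the character recognition rate reaches about 93%.

4. A method for reconstructing FWVCCs based on Bezier curves is proposed. Connecting the fingertip positions of successive frames with straight lines amounts to splitting the trajectory into many line segments, so the resulting character looks polygonal and is not smooth. Reconstruction with a curve-outline technique (Bezier curves) makes the strokes smoother and yields high-quality characters. It also greatly reduces the storage needed per character, which makes it easier to build a large database.

5. Recognition of FWVCCs is studied, including preprocessing, feature extraction, and classifier design. Three new statistical classifiers are proposed: a hierarchical LDA classifier based on sets of similar classes; a kernel modified quadratic discriminant function (KMQDF) classifier based on kernel pattern recognition methods and the modified quadratic discriminant function (MQDF); and a two-stage classifier combining a modified linear discriminant function (MLDA) with LDA. The best recognition rate for FWVCCs reaches 93%.

In summary, FWVCC recognition is a comprehensive research project spanning several fields (gesture recognition, image processing, handwritten Chinese character recognition, and others). This research can realize a novel, natural, and friendly human-computer interface and has significant theoretical and practical value.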
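The foreground segmentation step in item 1 can be illustrated with a minimal sketch of a textbook per-pixel single-Gaussian background model. This is not the dissertation's improved model, whose details the abstract does not give. The class name, the grayscale flattened-frame representation, and the parameter values (`alpha`, `k`, `init_var`, `min_var`) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SingleGaussianBackground:
    """Textbook single-Gaussian background model (hypothetical helper, grayscale)."""
    alpha: float = 0.05     # learning rate (assumed value)
    k: float = 2.5          # foreground threshold in standard deviations (assumed)
    init_var: float = 25.0  # initial per-pixel variance (assumed)
    min_var: float = 4.0    # variance floor so the threshold never collapses to zero
    mean: List[float] = field(default_factory=list)
    var: List[float] = field(default_factory=list)

    def apply(self, frame: List[float]) -> List[bool]:
        """Return a foreground mask for a flattened frame and update the model."""
        if not self.mean:
            # The first frame initialises the background; nothing is foreground yet.
            self.mean = [float(p) for p in frame]
            self.var = [self.init_var] * len(frame)
            return [False] * len(frame)
        mask = []
        for i, p in enumerate(frame):
            d = p - self.mean[i]
            # Foreground if the pixel deviates by more than k standard deviations.
            is_fg = d * d > (self.k ** 2) * self.var[i]
            mask.append(is_fg)
            if not is_fg:
                # Update only background pixels so the hand is not absorbed.
                self.mean[i] += self.alpha * d
                self.var[i] = max(self.min_var,
                                  (1 - self.alpha) * self.var[i] + self.alpha * d * d)
        return mask
```

In use, each video frame would be flattened to a list of gray levels and passed to `apply`; the returned mask marks candidate hand pixels. Updating only background pixels is one common design choice, and it trades slower adaptation to lighting changes for less risk of learning a slowly moving hand as background.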

【Abstract】 Because vision-based human-computer interfaces (HCIs) can give computers a form of "sense", they have become a popular research field. If a computer with a camera can read our preferred means of communication, such as gestures, body language, handwriting, and sketches, it can provide a natural, friendly, and very effective vision-based HCI. On the other hand, handwriting recognition has been widely integrated into machines for the convenience of human-machine interaction. Handwriting is usually captured using a touch screen and a special pen, which is not convenient for mobile applications where the screen size is limited. Using a camera to capture handwriting removes the limits of the touch screen, and users can communicate with a computer through the familiar means of pen and paper.

Communication based on pen and paper is familiar to people, but it is not convenient or environmentally friendly enough, because it requires users to bring a pen and uses a lot of paper. If people can write characters virtually, just by moving a fingertip on an ordinary plane, and the computer can recognize those characters, this will provide an interesting wireless character input modality for HCI applications. Because these finger-written characters cannot be seen without ink information, we call them finger-written virtual Chinese characters (FWVCCs). This dissertation studies the reconstruction, coding, and recognition of FWVCCs and presents a series of methods.

1. Background modeling is an important computer vision problem. To obtain the finger movement information, we need to segment the hand from the video images. Considering that the FWVCC recognition system is a real-time system, we propose a simple but effective single-Gaussian background model to segment the hand. Experiments show that the segmentation results based on this model are acceptable.

2. To reconstruct FWVCCs in real time, this dissertation studies the application of the Flat Functional-link Neural Network (FFNN) to predicting FWVCC trajectories. To solve the prediction problem for a non-stationary time series, conventional neural networks need a lot of time and many samples to train, whereas the FFNN handles this problem very well. Considering the structure of Chinese characters, the dissertation makes some improvements to the FFNN and chooses appropriate training samples, and promising experimental results have been obtained. Furthermore, the predictions of the FFNN are compared with those of a Kalman filter. Experiments suggest that the improved FFNN predictor works better for predicting the trajectories of handwritten Chinese characters.

3. To obtain the finger trajectories, this dissertation presents a fast and robust fingertip detection method. First, based on an analysis of samples of the hand contour, a candidate region for fingertip localization is selected. Then the fingertip is located by circle feature matching within the candidate region. To demonstrate the strength of the method, it was run on several sequences with varying lighting conditions, different degrees of background clutter, and different speeds of finger movement. Experiments show that the correct rate can reach 98.5%.

4. We propose an FWVCC reconstruction method based on Bezier curve coding. With this method, the strokes of FWVCCs appear smoother than when the fingertip positions are connected by straight lines. The method also makes it possible to build a database that needs little storage yet contains a large number of characters.

5. Finally, we study FWVCC recognition, which includes preprocessing, feature extraction, and classifier design. Three new statistical classifiers are proposed: a hierarchical LDA classifier based on similar Chinese character categories, a Kernel Modified Quadratic Discriminant Function classifier, and an MLDA+LDA classifier (modified linear discriminant analysis + LDA). The highest recognition rate for FWVCCs reaches 93%.

In conclusion, research on FWVCC recognition is a multidisciplinary, comprehensive research topic. It can realize a new, more natural video-based approach to inputting handwritten Chinese characters, and it has far-reaching significance in theory and application.
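The idea of coding a stroke as a Bezier curve can be sketched as follows. This minimal example only evaluates a curve from its control points with de Casteljau's algorithm; how the dissertation derives control points from the fingertip trajectory is not stated in the abstract, so the control points and the function names `de_casteljau` and `sample_stroke` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def de_casteljau(control: List[Point], t: float) -> Point:
    """Evaluate a Bezier curve of any degree at parameter t in [0, 1]."""
    pts = list(control)
    # Repeatedly interpolate between neighbouring points until one remains.
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def sample_stroke(control: List[Point], n: int = 20) -> List[Point]:
    """Rebuild a smooth stroke as n >= 2 points sampled along the curve."""
    return [de_casteljau(control, i / (n - 1)) for i in range(n)]


# A stroke stored as four control points (assumed values) instead of
# one fingertip position per video frame.
stroke = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
points = sample_stroke(stroke, n=20)
```

The storage benefit mentioned in item 4 comes from keeping only the control points per stroke segment; the smooth stroke is regenerated by sampling whenever the character is displayed or passed to the recognizer.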

  • 【Classification code】TP391.41;TN929.5
  • 【Times cited】10
  • 【Downloads】538