多区域图像的分割和倾斜检测方法研究

Research on Segmentation and Skew Detection of Multi-region Document Images

【Author】 Yue Ning (岳宁)

【Advisor】 Duan Huichuan (段会川)

【Author information】 Shandong Normal University, Computer Software and Theory, 2008, Master's degree

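【Summary】 In today's information society, computers have entered every area of life and the Internet is increasingly widespread. People rely more and more on computers to obtain information, and a great deal of processing work has moved onto computers. How to convert traditional paper text into electronic text has therefore become a topic of interest. Daily life and work involve processing large quantities of documents, including not only text-only documents but also documents that mix text and graphics and image documents, so the need to enter documents into a computer quickly and accurately has become pressing.

This thesis studies segmentation and skew detection methods for multi-region images. It surveys commonly used text image segmentation algorithms and describes the strengths and weaknesses of each. Text image processing algorithms fall broadly into two classes: geometric analysis and texture analysis. Geometric analysis can be further divided into top-down, bottom-up, and hybrid methods. The thesis describes in detail two top-down segmentation algorithms, the run-length smoothing algorithm and the projection profile algorithm, and two bottom-up methods, the neighborhood line density method and connected component analysis. It also lists several other common image segmentation algorithms.

Drawing on these basic methods, the thesis proposes an improved projection profile algorithm for multi-region images. The algorithm addresses the inability of the ordinary projection profile algorithm to segment complex, skewed multi-region images. The image is first binarized, and the erosion-dilation operation of mathematical morphology is used to reduce noise. The improved projection profile algorithm is then applied to the resulting image: even when there are no valley points along the X and Y axes, it finds cut points from the distribution of the image pixels, cuts the image into small blocks, and applies projection analysis to each block, repeating the process until every region of the image has been separated.

Methods for detecting the skew angle of a document fall broadly into five categories: Hough-transform-based methods, cross-correlation-based methods, projection-based methods, Fourier-transform-based methods, and K-nearest-neighbor clustering. The Fourier-transform-based methods are computationally very expensive and are therefore rarely used. Document images usually suffer some loss when scanned into a computer, and their edges are quite irregular. Finding the image contour with an ordinary edge extraction method adds a large amount of unnecessary computation. To address the high computational cost of typical skew detection algorithms, the thesis proposes a simple way of finding the edges: the contour of the document image does not need to be located precisely, only the region that contains the image, which is its circumscribed rectangle, i.e. the bounding box. The thesis introduces a genetic algorithm (GA) to detect the skew angle, using the area of the bounding box as the fitness function value. Only the top, bottom, left, and right coordinate values of the image need to be found, which greatly reduces the amount of computation. Experimental results show that the algorithm detects the skew angle with high accuracy.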
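The projection profile cut described above can be illustrated with a minimal sketch. It shows only the ordinary top-down variant, which splits a binary image wherever a row or column projection is empty. It is not the improved algorithm proposed in the thesis, whose cut-point rule is not given in this record. The function names `projection`, `runs`, and `xy_cut`, and the representation of the image as nested lists of 0/1 values, are assumptions made for illustration.

```python
def projection(img, axis):
    """Count foreground pixels per row (axis=0) or per column (axis=1)."""
    if axis == 0:
        return [sum(row) for row in img]
    return [sum(col) for col in zip(*img)]


def runs(profile):
    """Return end-exclusive (start, end) pairs of consecutive non-zero entries."""
    out, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of content begins
        elif not value and start is not None:
            out.append((start, i))         # an empty line (valley) ends the run
            start = None
    if start is not None:
        out.append((start, len(profile)))
    return out


def xy_cut(img, top=0, left=0):
    """Recursively split a binary image at empty rows, then at empty columns.

    Returns a list of (top, left, bottom, right) boxes, end-exclusive,
    in the coordinates of the original image.
    """
    row_runs = runs(projection(img, 0))
    if not row_runs:                       # blank block: nothing to report
        return []
    col_runs = runs(projection(img, 1))
    if len(row_runs) > 1:                  # horizontal valleys: cut into bands
        boxes = []
        for r0, r1 in row_runs:
            boxes += xy_cut(img[r0:r1], top + r0, left)
        return boxes
    if len(col_runs) > 1:                  # vertical valleys: cut into columns
        boxes = []
        for c0, c1 in col_runs:
            sub = [row[c0:c1] for row in img]
            boxes += xy_cut(sub, top, left + c0)
        return boxes
    (r0, r1), (c0, c1) = row_runs[0], col_runs[0]
    return [(top + r0, left + c0, top + r1, left + c1)]
```

On a small page with two blocks side by side above a full-width block, `xy_cut` returns one box per block. Because the sketch depends on completely empty rows or columns, it fails on skewed or tightly packed layouts, which is the limitation the thesis sets out to address.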

【Abstract】 In the modern information society, computer technology is involved in every field of our lives. The Internet has become increasingly popular, we depend on computers for information more than ever before, and a great deal of work has shifted onto computers. How to convert traditional paper documents into electronic text has therefore become a topic of concern. In daily life there are a large number of documents to be handled. These include not only text-only documents but also images and mixed text-and-image documents, so entering them into a computer efficiently and accurately has become an urgent requirement.

The main purpose of this thesis is to study algorithms for page segmentation and skew detection of multi-region document images. The thesis surveys the common page segmentation algorithms and gives the advantages and disadvantages of each. Page segmentation methods can generally be classified into two types: structural analysis and texture analysis. Structural analysis includes top-down methods, bottom-up methods, and hybrids of the two. The thesis presents two top-down methods, run-length smoothing and projection profile cut, and two bottom-up methods, neighborhood line density and connected component analysis. In addition, it describes several other algorithms commonly used in image segmentation.

Building on these algorithms, the thesis presents an improved projection profile cut algorithm. It solves the problem that the ordinary projection profile cut algorithm cannot deal with complicated documents containing skewed multi-regions. First, the image is binarized and then denoised with the erosion and dilation operations of mathematical morphology. The improved projection profile cut algorithm is then applied to the document image. It can find cut-off points even in images that have no valley points along the X-axis and Y-axis. With these cut-off points the image is cut into small pieces, and the same operation is repeated on each piece until all regions are separated.

Skew estimation methods can be classified into five general categories: Hough transform, cross-correlation, projection profile, Fourier transform, and nearest-neighbor clustering. Of these, the Fourier transform is rarely used because of its high computational cost. During document scanning, some loss in the image is inevitable, and the edges are irregular. Using ordinary edge detection to find the contour adds a large amount of unnecessary computation. The thesis proposes a simple method for finding the outline of the image: the edges do not need to be located accurately, only the area that contains the image. This area is called the bounding box. The thesis uses a genetic algorithm (GA) to detect the skew angle of the image. The method uses the area of the bounding box as its fitness function, for which only the four extreme coordinate values (top, bottom, left, and right) of the image need to be found. This greatly reduces the amount of computation. Experimental results show that the proposed algorithm detects the skew angle with high accuracy.
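The bounding-box fitness described above can be sketched as follows. The fitness is the area of the axis-aligned bounding box of the foreground pixels after rotating them by a candidate angle, and the area is smallest when a rectangular region is upright. Note the substitution: the thesis searches the angle with a GA, whose settings are not given in this record, so this sketch replaces it with a plain exhaustive search over a grid of angles. The names `rotate`, `bbox_area`, `fitness`, and `best_correction`, the ±15° range, and the 0.5° step are all assumptions made for illustration.

```python
import math


def rotate(points, deg):
    """Rotate (x, y) points about the origin by `deg` degrees."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in points]


def bbox_area(points):
    """Area of the axis-aligned bounding box; only four extremes are needed."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def fitness(points, deg):
    """Bounding-box area after rotating by `deg`; smaller is better."""
    return bbox_area(rotate(points, deg))


def best_correction(points, max_deg=15.0, step=0.5):
    """Grid search (a stand-in for the GA) for the rotation minimizing the area.

    The returned angle is the correction to apply; the skew is its negative.
    """
    n = int(round(2 * max_deg / step))
    candidates = [-max_deg + i * step for i in range(n + 1)]
    return min(candidates, key=lambda d: fitness(points, d))
```

For a filled rectangle of points rotated by 7°, `best_correction` returns -7.0, the rotation that brings the rectangle back upright. A GA would explore the same fitness function by evolving candidate angles instead of enumerating them.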

  • 【Classification number】TP391.41
  • 【Times cited】3
  • 【Downloads】184