基于图段的彩色地图线要素智能识别
Intelligent Recognition of Line in Scanned Color Maps Based on Segment
【Author】 Li Huarong (李华蓉);
【Supervisor】 Du Qingyun (杜清运);
【Author information】 Wuhan University, Cartography and Geographic Information Systems, 2010, PhD
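【Abstract (from the Chinese)】 Acquiring geographic data is a key step in building a geographic information system (GIS); it accounts for more than 80% of the total system development workload and is a bottleneck for GIS development. Intelligent recognition of map elements is the core problem of geographic data acquisition. It is a combined application of image processing, pattern recognition, artificial intelligence and other disciplines, and it is also an important topic in computer vision. It directly addresses industry needs and has considerable theoretical significance and practical value. Years of theoretical research and practice have laid a good foundation for deeper study of the topic, but many problems urgently need solving, and recognition theory and methods await a breakthrough. A topographic map is an organized collection of point, line and surface symbols, that is, a vector line drawing, whereas a scanned topographic map image is simply a collection of raster pixels. To aggregate raster pixels into map element symbols, this dissertation starts from knowledge and applies layered processing and self-organizing reasoning, seeking to strengthen the integrity of the representation units, make the representation more hierarchical, attend to all kinds of associated information, and select and organize the recognition data according to heuristic information. The dissertation extends the constraints on the segment so that it satisfies color, width and topological consistency, which makes the segment an image representation unit with a single meaning. Based on the relationships between pixels, the raster image is first run-length encoded to build scan strings that carry attribute information. Segments are then formed from scan strings according to width, color and topological consistency, giving a segment representation of the image, while the connections between segments are extracted to build segment connected components. Point, line and surface connected components are extracted according to the attributes of the segment connected components, and the lines are vectorized to obtain line symbols and the connections between them, from which the adjacency relationships between line symbols are built. Using the associations between line symbols, all line symbols belonging to the same map element are found, the relationships among them are extracted, and the map element is reconstructed. Following the way the human brain organizes object information when reading a topographic map, the dissertation proposes a three-level model of topographic map objects: the map element model, the symbol model and the image model. The symbol model links the real-world model and the image model. It is based on an integrated vector-raster data representation and is feature-oriented, so it has the characteristics of vector data; at the same time, all pixels that a feature passes through can be found from the feature's identification number, so it also has all the properties of raster data. Recognizing and extracting features on the basis of the symbol model makes it possible to use the local characteristics of every pixel of a feature together with global characteristics such as the feature's topological relationships. The information extracted during recognition is divided into three levels: segments, symbols and map elements. Data at the same level are associated with each other, as are data at different levels, so both horizontal and vertical associations are taken into account. During recognition, local pixel characteristics are obtained first and the overall structure of the segment connected components is generated from them; the overall structure is then used to guide and correct the local characteristics. High-level information is derived from low-level data and in turn guides the low-level data, so the whole recognition is completed by combining bottom-up and top-down reasoning. To make full use of the color information in topographic maps, and building on an analysis of earlier color separation methods, the dissertation proposes increasing the number of color classes at the pixel level, which effectively resolves the ambiguity in separating transition colors. At the scan string level, the color combinations are grouped into 16 cases, each represented by the sum of its color codes, so that a scan string carries color information and provides the basis for extracting the color of a segment. Run-length encoding is used to perform the conversion pixel matrix → scan string → segment. A segment is a set of adjacent scan strings that satisfy color, width and topological consistency, and it can directly represent line sections and intersections. Single-plate segment connected components are built from the color information and adjacency relationships of the segments, which achieves automatic color separation of the topographic map image; because both color attributes and spatial relationships are considered, the method compares favorably in processing efficiency and noise suppression. The segment connected components are then analyzed on the basis of the single-plate connected component graph. Starting from the shape, size and topological relationships of map symbols, the dissertation identifies features suitable for description, such as characteristic size, aspect ratio, black-to-white ratio and node density, and uses the differences in these features to separate line symbols from point and surface symbols. The extracted line symbols are vectorized piecewise at node segments, and homologous lines are then detected from the adjacency relationships of the node segments, which yields all vector line sections belonging to the same map element and forms complete vector information. The detection conditions are adjusted adaptively for different map elements to better extract vector data for dashed roads, rivers and contour lines. Based on these recognition methods and algorithms, a prototype system for intelligent recognition of topographic maps was designed and developed. The software is built on VC++ with an object-oriented design and uses a general-purpose database to manage the data.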
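The idea of representing a color combination by the sum of its color codes can be illustrated with a minimal sketch. It assumes four base colors whose codes are distinct powers of two, which would give exactly 16 possible sums; the color names, the code values, and the function names below are hypothetical and are not taken from the dissertation.

```python
# Hypothetical color codes: the abstract does not list the colors or their
# values. Distinct powers of two make every sum identify one combination.
COLOR_CODES = {"black": 1, "blue": 2, "brown": 4, "green": 8}


def combination_code(colors):
    """Return the sum of the codes of the distinct colors in one scan string."""
    return sum(COLOR_CODES[c] for c in set(colors))


def colors_in(code):
    """Recover the set of color names encoded in a summed code."""
    return {name for name, bit in COLOR_CODES.items() if code & bit}
```

Under these assumptions a scan string containing black and brown pixels gets the code 1 + 4, and `colors_in` recovers both colors from that single integer.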
【Abstract】 Geographic data acquisition is a key step in building a Geographical Information System (GIS): it accounts for more than 80% of the total development workload and remains a bottleneck for GIS development. Intelligent recognition of map elements is the core problem of geographic data acquisition. It draws on image processing, pattern recognition, artificial intelligence and other disciplines, and it is also an important topic in computer vision. The research therefore has both theoretical significance and practical value. Earlier theoretical and practical work has laid a solid foundation for further study, but many problems still need to be solved, and new recognition approaches are required. A topographic map is a vector line drawing made up of point, line and surface symbols, whereas a scanned map is only a collection of pixels. To aggregate pixels into symbols, the recognition process in this dissertation is organized in levels and driven by self-organizing reasoning, with the aim of capturing more global features. The constraints on a segment are extended so that it satisfies color, width and topological consistency, which makes the segment an image representation unit with a single meaning. First, based on the relationships between pixels, the image is run-length encoded into scan strings that carry attribute information. Second, segments are formed from scan strings that are consistent in color, width and topology, giving a segment representation of the image; at the same time, the connections between segments are extracted to build segment connected components. Third, line symbols are separated from point and surface symbols according to the properties of these components.
The adjacency relationships between the vectorized line symbols are then built so that all line symbols belonging to the same map element can be found and the map element reconstructed. Following the way the human brain organizes topographic map information, this dissertation proposes a three-level model of topographic map objects: the map element model, the symbol model and the image model. The symbol model links the real-world model and the image model. It is feature-oriented and has the characteristics of vector data; at the same time, all pixels of a feature can be found from its identification number, so it also has the characteristics of raster data. Recognizing and extracting features through the symbol model makes it possible to use both the local characteristics of every pixel and global characteristics such as topological relationships. Data at the same level are associated with each other, as are data at different levels, so both horizontal and vertical associations are taken into account. High-level information is derived from low-level data and in turn guides it, so recognition is completed by combining bottom-up and top-down reasoning. To make full use of the color information in topographic maps, the dissertation proposes increasing the number of color classes at the pixel level, which effectively resolves the ambiguity in separating transition colors. At the scan string level, color combinations are grouped into 16 cases, each represented by the sum of its color codes, so that a scan string carries color information and provides the basis for obtaining the color of a segment. Run-length encoding is used to convert the pixel matrix into scan strings and then into segments. A segment consists of adjacent scan strings that satisfy color, width and topological consistency, and it can directly represent line sections and intersections.
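The conversion from pixel matrix to scan strings, and the linking of scan strings in neighboring rows, can be sketched as follows. This is a simplified illustration, not the dissertation's algorithm: it assumes the image is a list of rows of integer color labels with 0 as background, it links runs only when they share a color and overlap in column range, and it omits the width and topological consistency checks. The names `scan_strings` and `connected_components` are hypothetical.

```python
from itertools import groupby


def scan_strings(image, background=0):
    """Run-length encode each row into (row, start_col, end_col, color) runs."""
    runs = []
    for r, row in enumerate(image):
        col = 0
        for color, group in groupby(row):
            length = len(list(group))
            if color != background:
                runs.append((r, col, col + length - 1, color))
            col += length
    return runs


def connected_components(runs):
    """Group runs that touch in adjacent rows and share a color (union-find)."""
    parent = list(range(len(runs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (r1, s1, e1, c1) in enumerate(runs):
        for j, (r2, s2, e2, c2) in enumerate(runs):
            # Link a run to a same-colored run in the next row if their
            # column ranges overlap.
            if r2 == r1 + 1 and c1 == c2 and s2 <= e1 and s1 <= e2:
                parent[find(i)] = find(j)

    groups = {}
    for i, run in enumerate(runs):
        groups.setdefault(find(i), []).append(run)
    return list(groups.values())


image = [
    [0, 1, 1, 0, 2],
    [0, 1, 0, 0, 2],
    [0, 0, 0, 1, 0],
]
components = connected_components(scan_strings(image))
```

The pairwise loop is quadratic in the number of runs and is kept only for brevity; a row-indexed lookup would be the natural choice for real map images.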
Single-plate connected components are built from the color information and adjacency relationships of the segments, which achieves automatic color separation of the topographic map image. Because it considers both color attributes and spatial relationships, the method compares favorably in processing efficiency and noise suppression. By analyzing the shape, size and topology of the connected components, the dissertation identifies descriptive features such as characteristic size, aspect ratio, black-to-white ratio and node density, and uses the differences in these features to separate line symbols from point and surface symbols. The line symbols are vectorized piecewise at node segments, and homologous lines are then detected from the adjacency relationships of the node segments. In this way all vector lines belonging to the same map element are obtained and complete vector information is established. The detection conditions are adjusted adaptively for different map elements to better extract vector data for dashed roads, rivers and contour lines. Based on these recognition methods, a prototype system for recognizing topographic maps has been designed and developed. The software follows an object-oriented design and manages its data with a general-purpose database.
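Two of the shape features named above, aspect ratio and black-to-white ratio, can be computed directly from the runs of one connected component. The sketch below is an assumption-laden illustration: it takes runs as (row, start_col, end_col, color) tuples, defines the aspect ratio as bounding-box height over width, and defines the black-to-white ratio as foreground pixels over remaining pixels inside the bounding box. The abstract does not give the dissertation's exact definitions or any thresholds, and the name `shape_features` is hypothetical.

```python
def shape_features(runs):
    """Compute bounding-box features for one component given as a run list."""
    rows = [r for r, _, _, _ in runs]
    starts = [s for _, s, _, _ in runs]
    ends = [e for _, _, e, _ in runs]

    height = max(rows) - min(rows) + 1
    width = max(ends) - min(starts) + 1
    black = sum(e - s + 1 for _, s, e, _ in runs)  # foreground pixel count
    white = height * width - black                 # rest of the bounding box

    return {
        "height": height,
        "width": width,
        "aspect_ratio": height / width,
        "black_white_ratio": black / white if white else float("inf"),
    }


# A thin horizontal stroke with a one-pixel spur below its left end.
features = shape_features([(0, 0, 9, 1), (1, 0, 0, 1)])
```

A long thin stroke tends to produce an extreme aspect ratio, while a filled surface symbol tends to produce a high black-to-white ratio; any decision thresholds would have to be tuned on real map data.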
【Key words】 map recognition; scan string; segment connected graph; line symbol; contour; road; water;