

The Study of Information Extraction Technology for Remote Sensing Based on Feature Knowledge

【Author】 高伟 (Gao Wei)

【Advisor】 吴信才 (Wu Xincai)

【Author information】 China University of Geosciences (中国地质大学), Cartography and Geographic Information Engineering, 2010, Ph.D.

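【Abstract, translated from the Chinese】 The spatial, spectral, and temporal resolution of remote sensing data keeps improving, which supplies abundant data for all kinds of remote sensing applications. Yet while the capability to acquire data has grown, the rich information hidden in the data has not been fully mined or used, and remote sensing applications have become "data rich but information poor". Raising the degree of automation of thematic information extraction, so that remote sensing data can be converted quickly and accurately into the thematic information that industry applications need, has become an urgent requirement.

Remote sensing is the process of acquiring the relevant features of a target through sensors, without contact with the object, and extracting useful information from them (for example man-made structures, land-use types, vegetation, temperature, and other targets of interest). Remote sensing information extraction obtains from image data the thematic information that applications require; its objects of study are the geographic entities and related phenomena that exist in surface space. Information in surface space is multidimensional and unbounded, whereas remote sensing can only record it as multispectral arrays. This information asymmetry between the surface and the data makes extraction (that is, geospatial analysis and process inversion) fuzzy and open to multiple solutions, so complete feature descriptions and domain knowledge must be added to support accurate extraction.

When extraction relied on visual interpretation, experts combined direct interpretation keys such as tone, shape, size, shadow, texture, pattern, and location with indirect keys such as layout and spatial topological relations, and drew on non-remote-sensing data for comprehensive analysis and logical reasoning, thereby reaching high extraction accuracy. This approach, however, depends mainly on the experts' knowledge, is hard to reuse at scale, and takes a long time. Current automatic methods mostly use statistical models. They apply geoscience knowledge mainly by adding feature-knowledge dimensions to the statistical model, or by using neural networks, decision trees, and similar methods to bring geoscience rules and knowledge into the extraction process. These methods have had some success but do not truly meet application needs, for two main reasons: 1) features are mostly expressed per pixel, so shape, semantic, and similar features are hard to express completely, the features used to identify ground objects are incomplete, and the separability of ground objects is insufficient; 2) feature knowledge rules mostly rely on prior knowledge and are expressed implicitly during recognition, which does not match how humans interpret images and makes it hard to adapt to changes in sensors and imaging conditions.

Goodchild proposed a generalized ontological concept of geographic representation: all geographic information can be expressed in one very basic form, the geo-atom, a tuple of the form (x, Z, Z(x)), where x is the spatial coordinate, Z is its attribute, and Z(x) is the rule associated with the attribute; a geographic object can then be represented as a set of geo-atoms. Each pixel of an image can be regarded as a geo-atom, and pixels with sufficiently similar features form an image object, which can be viewed as the manifestation of a geographic entity in image space. Based on this idea, the thesis proposes representing a geographic entity as a semantic object, a set of pixels in image space characterized by features and rules, and building a knowledge base (the feature knowledge base) from semantic objects to support remote sensing information extraction.

Addressing the application need for intelligent extraction, the thesis studies a feature knowledge base model and management scheme suited to remote sensing information extraction and, combined with object-oriented remote sensing analysis, establishes an extraction framework based on the feature knowledge base. On this basis it studies how to optimize parameters such as the segmentation scale in image-object construction and solves the problems of object-based multi-feature combination optimization and target recognition, forming a dynamic "apply, evaluate, feed back and update" mechanism for feature knowledge and thus providing a solution for efficient, intelligent extraction and application of remote sensing image information.

The research concentrates on four aspects:
1) An extraction framework based on the feature knowledge base: analyze the correspondence between geographic objects and image objects; study the methods and key problems of remote sensing information extraction, with emphasis on the role and application modes of geoscience feature knowledge; on this basis seek a way to build semantic objects from "features + rules" and a feature knowledge base that records the related geoscience knowledge on the basis of semantic objects; and propose an extraction framework centered on the feature knowledge base.
2) Management and update mechanism for feature knowledge of geographic entities: analyze the characteristics of the kinds of knowledge used in extraction and design a feature knowledge base for extraction applications; study how features and rules are quantified and stored; and design a hierarchical indexing strategy, based on how the knowledge base is used, to manage feature knowledge efficiently.
3) Object construction under feature knowledge constraints: analyze existing multi-scale segmentation methods for constructing image objects; study how segmentation parameters such as scale and shape factor affect the final image objects; summarize evaluation criteria for image objects; and study how to optimize the construction parameters against those criteria, so that image objects are constructed with a higher degree of automation.
4) Feature selection for ground object recognition: for the fuzzy reasoning used to convert image objects into geographic targets, focus on multi-feature combination within fuzzy reasoning; for the need to update feature rules in the knowledge base, study how to acquire feature rules from image objects and how to select feature combinations, so that updates are fed back into the feature knowledge base.

The chapters are arranged as follows:
Chapter 1 is the introduction. It explains why research on intelligent remote sensing information extraction is necessary, reviews the state of research, trends, and open problems in extraction methods and object-oriented image analysis, and sets out the research content and structure of the thesis.
Chapter 2 proposes the technical framework for extraction based on the feature knowledge base. It first analyzes the basic problems of remote sensing information extraction and introduces its systematic methods; it then examines the key issues from three angles: the extraction workflow, the image information representation model, and the application modes of geoscience knowledge; finally it proposes the extraction framework based on the feature knowledge base. The following four chapters address the key issues in this framework: management of the feature knowledge base, object construction, and multi-feature combination for target recognition.
Chapter 3 describes how the image feature knowledge base is constructed. It first analyzes the composition, key issues, and knowledge representation model of a feature knowledge base for extraction; it then details the expression and storage of image features and of rules; on this basis it gives a knowledge organization model and a dynamic indexing mechanism suited to extraction.
Chapter 4 analyzes the construction of image objects, the basic model for expressing features. For multi-scale segmentation it gives evaluation criteria for segmented image objects and uses them to analyze the parameters that affect the result, such as scale, shape factor, and band weights; it then introduces a genetic algorithm to optimize the scale and segmentation parameters.
Chapter 5 studies feature combination in ground object recognition. It first introduces knowledge-based fuzzy classification in connection with fuzzy logic reasoning; it then studies multi-feature combination in fuzzy classification and introduces the nearest-neighbor and SVM classifiers for high-dimensional features; on this basis it proposes a multi-classifier based on hierarchical classification and gives its algorithm flow.
Chapter 6 addresses the simplification and updating of features in the knowledge base. It analyzes object-based feature acquisition algorithms and methods for optimizing feature combinations, and briefly introduces association rule mining algorithms.
Chapter 7 gives the conclusions and outlook. It summarizes the work completed, identifies the problems and shortcomings of the thesis, and indicates the focus and direction of further research.

Centered on the extraction framework based on the feature knowledge base and its key technical problems, the thesis completed the following work:
1) It studied and analyzed the remote sensing data acquisition process and pointed out that surface spatial information is multidimensional and continuous, whereas remote sensing data, limited by acquisition conditions, takes the form of two-dimensional discrete data, so information is simplified and lost during acquisition. The goal of extraction is to obtain application-relevant thematic information from the data; from the viewpoint of information transfer it is the inverse of data acquisition, and the information lost in imaging makes it difficult. Bringing supplementary geoscience knowledge into the extraction process is the key to improving its accuracy and degree of automation.
2) It summarized the issues in applying knowledge to extraction. Based on the general extraction workflow, it analyzed the systematic methods of extraction, image information representation models, and the application modes of geoscience knowledge. It pointed out that a knowledge base built purely from production rules is unsuited to the application and feedback updating of knowledge during extraction, and that a pixel-based representation limits the complete expression of geographic entity features in image space.
3) It proposed the extraction framework based on the feature knowledge base. To build a knowledge-based intelligent extraction system, the framework takes the feature knowledge base as its core, organizes the knowledge base in an object-oriented manner, and uses image objects as the recognition units in image space, raising pixel-level analysis to the object level, where multiple features can be expressed. By building classifiers that combine multiple features, it can not only extract information automatically at a higher level but also acquire and update feature knowledge through sample learning.
4) It established a construction model for a feature knowledge base aimed at extraction applications, analyzed in detail and proposed solutions for the expression and storage of image features and of rules, determined the representation model for feature knowledge, and, based on extraction needs, proposed a feature knowledge indexing strategy that is primarily hierarchical and supports dynamic grouping.
5) It studied segmentation methods for constructing multi-scale image objects. Experimental analysis showed that every image object has an optimal scale that best expresses its features, and that the scale parameter, color factor, shape factor, and other parameters all affect the resulting image objects. In current practice these parameters are chosen by manual trial. To address this, the thesis proposed a segmentation parameter optimization algorithm based on a genetic algorithm and analyzed it on the basis of an implementation; the results show that the method can effectively obtain optimal segmentation parameters.
6) For the conversion from image objects to geospatial targets, it studied knowledge-based fuzzy logic reasoning, focusing on feature combination in high-dimensional feature spaces, and analyzed the two methods commonly used for this problem: nearest neighbor and support vector machines. Based on the idea of hierarchical classification, it proposed combining clustering with fuzzy-logic classifiers, using different classification methods at different levels to form a multi-classifier. Experiments show that this effectively reduces the number of features a single classifier needs and also improves classification accuracy and reliability.
7) It briefly analyzed feature acquisition and rule acquisition for updating feature knowledge. For acquiring features from image objects, where the key problem is determining the object boundary, it introduced a vector-based method and a raster-scan method. For optimizing feature combinations, it analyzed the feature selection problem and existing methods and proposed a method based on a genetic algorithm. For acquiring association rules, it briefly analyzed an Apriori-based rule extraction algorithm.

Responding to the applications and needs of remote sensing information extraction, the thesis carries out research on intelligent extraction. By establishing a feature knowledge base better suited to image information extraction and combining it with the object-oriented image analysis model, it proposes an extraction framework based on the feature knowledge base. By studying key problems such as the management of feature knowledge, a parameter optimization model for image-object construction, multi-feature combination strategies for ground object recognition, and the optimization of feature knowledge, it realizes a preliminary model of intelligent remote sensing information extraction. The results have practical value for making extraction more intelligent and automated and for improving the efficiency of converting remote sensing data into information. They can be used in fields such as land resource surveys and the updating of basic geographic information, and the methods and models can also be applied to content-based querying of remote sensing data, so they have good application prospects.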
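
The chain from geo-atom to image object to semantic object described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. All class names, the `trapezoid` membership function, the band names, and the thresholds are hypothetical. The sketch stores Z(x) as the attribute value recorded at x, which is an assumption made here for simplicity, and it combines rule memberships with the minimum operator, one common fuzzy AND.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class GeoAtom:
    """One pixel attribute viewed as a geo-atom (x, Z, Z(x))."""
    x: Tuple[int, int]  # spatial coordinate (row, col)
    z: str              # attribute name, e.g. a band (hypothetical names)
    value: float        # Z(x), stored here as the attribute value at x (assumption)


def trapezoid(a: float, b: float, c: float, d: float) -> Callable[[float], float]:
    """Fuzzy membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    def mu(v: float) -> float:
        if v <= a or v >= d:
            return 0.0
        if v < b:
            return (v - a) / (b - a)
        if v <= c:
            return 1.0
        return (d - v) / (d - c)
    return mu


@dataclass
class ImageObject:
    """A set of geo-atoms (pixels) grouped by similarity."""
    atoms: List[GeoAtom]

    def features(self) -> Dict[str, float]:
        # Object-level features: per-attribute mean plus pixel count (area).
        sums: Dict[str, float] = {}
        counts: Dict[str, int] = {}
        for atom in self.atoms:
            sums[atom.z] = sums.get(atom.z, 0.0) + atom.value
            counts[atom.z] = counts.get(atom.z, 0) + 1
        feats = {f"mean_{z}": sums[z] / counts[z] for z in sums}
        feats["area"] = float(len({atom.x for atom in self.atoms}))
        return feats


@dataclass
class SemanticObject:
    """A knowledge-base entry: a class name plus fuzzy rules over features."""
    name: str
    rules: Dict[str, Callable[[float], float]]

    def membership(self, obj: ImageObject) -> float:
        # Fuzzy AND (minimum) over all rules; a missing feature scores 0.
        feats = obj.features()
        return min(
            (mu(feats[f]) if f in feats else 0.0 for f, mu in self.rules.items()),
            default=0.0,
        )


# A 2x2 image object with two hypothetical bands.
pixels = [(0, 0), (0, 1), (1, 0), (1, 1)]
atoms = [GeoAtom(p, "nir", v) for p, v in zip(pixels, [0.62, 0.58, 0.60, 0.60])]
atoms += [GeoAtom(p, "red", v) for p, v in zip(pixels, [0.10, 0.12, 0.08, 0.10])]
obj = ImageObject(atoms)

# Hypothetical rules: high near-infrared mean and low red mean.
vegetation = SemanticObject(
    "vegetation",
    {
        "mean_nir": trapezoid(0.3, 0.5, 1.0, 1.1),
        "mean_red": trapezoid(-0.1, 0.0, 0.15, 0.3),
    },
)
print(vegetation.name, vegetation.membership(obj))
```

Keeping the rules as explicit membership functions attached to a named class, rather than buried inside a trained model, reflects the abstract's point that implicit rules are hard to inspect and reuse.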

【Abstract】 With the continuous improvement of the spatial, spectral, and temporal resolution of remote sensing data, a large volume of data has become available for many kinds of remote sensing applications. However, the abundant information contained in remote sensing data has not been fully explored and used, which wastes remote sensing resources and hinders further application. Developing remote sensing information extraction and recognition has therefore become an urgent requirement.

Remote sensing information extraction obtains useful information (such as buildings, land-use types, vegetation, temperature, and other targets of interest) from remote sensing image data. Its objects of study are geographic entities and related geo-phenomena. The information in surface space, the object of remote sensing study, is multidimensional and unbounded, whereas the remote sensing data obtained through information transmission is two-dimensional and simplified. This asymmetry between surface information and remote sensing data makes extraction (the process of geospatial analysis and inversion) fuzzy and open to multiple solutions. To achieve accurate, automated extraction, the remote sensing data must be used to the full, and the feature rules, which change with the data source and application conditions, must be reused flexibly, with additional features mined and added to form a complete description.

When extraction relied on visual interpretation, experts combined direct interpretation keys (color, shape, size, shadow, texture, pattern, location, and so on) with indirect keys (distribution and spatial topological relations), and drew on non-remote-sensing data for comprehensive analysis and logical reasoning, thereby achieving high extraction accuracy. However, this method is labor-intensive and time-consuming, and it relies mainly on expert knowledge, which is difficult to reuse. Existing automatic methods bring rules and knowledge into extraction either by adding feature-knowledge dimensions to a statistical model or by using neural networks, decision trees, and similar methods. Although such methods have achieved good results in some applications, they are hard to generalize, mainly for two reasons: 1) features are usually expressed per pixel, so shape and semantic features are difficult to express completely; 2) feature knowledge rules usually rely on prior knowledge and are expressed implicitly in the recognition process, which does not match the habits of human interpretation and makes them hard to reuse. Moreover, such methods cannot adapt to changes in sensors and imaging conditions.

Object-oriented image analysis lifts the analysis from the pixel level to the object level, which makes shape, spatial relations, and other non-spectral features available. By exploiting the relationships between objects, it can express higher-level semantic features and is a better information carrier than pixel-level image analysis. Based on the object-oriented image analysis model, this thesis analyzes how the features of image objects and knowledge rules are expressed, and establishes a reusable feature knowledge base to manage and update the features of geographic entities. On this basis it analyzes how the feature knowledge base is applied in constructing image objects and recognizing targets, and studies key issues such as sample-based rules for optimal feature combination in target recognition, so as to realize a framework for remote sensing information extraction based on feature knowledge.
The main content of this thesis includes:
(1) A framework for remote sensing information extraction based on feature knowledge: It analyzes the role of geoscience feature knowledge in remote sensing information extraction, provides a feature-knowledge-based extraction framework, and studies application models built on that framework.
(2) Management and update mechanism for feature knowledge of geographic entities: It analyzes the features, rules, and knowledge used in remote sensing information extraction and their numerical expression, and designs a dynamic hierarchical indexing strategy to manage feature knowledge.
(3) Object construction under the constraint of feature knowledge: It analyzes how the parameters of multi-scale image segmentation affect the resulting objects and builds evaluation criteria for object construction based on feature knowledge. By optimizing the parameters against those criteria, it achieves automatic construction of image objects.
(4) Feature selection for recognizing geographic entities: It studies the recognition of geographic entities based on image objects and establishes a way to learn geographic entities automatically from a training set. Addressing the need to update feature rules in the feature knowledge base, it studies methods for acquiring feature rules from image objects and strategies for optimizing feature combinations, so that updates are fed back into the feature knowledge base.

The thesis is organized as follows:
1) Chapter 1 explains the significance of studying remote sensing information extraction and the pressing need to raise the level of thematic information extraction and recognition. It points out that knowledge-based extraction is a trend that is unlikely to reverse and that object-oriented methods provide a foundation for intelligent extraction. Finally, it presents the research content and the structure of the thesis.
2) Chapter 2 describes the technical framework for remote sensing information extraction based on the feature knowledge base. It first introduces the characteristics of geographic entities and how they appear in images by describing the remote sensing data acquisition process, analyzes the role of feature knowledge in information extraction, and summarizes its application modes. On this basis it lays out the feature-knowledge-based extraction framework and analyzes the key issues.
3) Chapter 3 covers the construction of the image feature knowledge base. It first introduces the key elements of knowledge bases and of the feature knowledge base, and how the latter is realized, and then analyzes the features, rules, and knowledge used in remote sensing information extraction. On this basis it gives a storage scheme for features and knowledge and an indexing method suited to information extraction.
4) Chapter 4 describes the knowledge-constrained approach to building objects. It first introduces multi-scale segmentation of image objects, highlighting the FNEA (Fractal Net Evolution Approach) segmentation method. It then analyzes experimentally the parameters that affect object construction, including scale, shape factor, and band weights. By introducing evaluation criteria and a genetic algorithm, it determines the relevant parameters automatically and establishes an automatic segmentation process that uses sample-based evaluation criteria. Finally, it assesses the new segmentation method on experimental data.
5) Chapter 5 gives the mechanism for optimizing feature combinations in ground object recognition. Object-based geographic entities are recognized by combining various types of features. Building on statistical classification and fuzzy classification, the chapter proposes using multiple classifiers to further improve classification accuracy.
6) Chapter 6 presents sample-based extraction methods, focusing on the application requirement for automatic, sample-based rule extraction. It introduces methods for extracting the spectral, shape, semantic, and other features of image objects and for extracting association rules, and on this basis proposes methods for optimizing feature combinations.
7) Chapter 7 summarizes the main content of the thesis, identifies its problems and shortcomings, and indicates the focus and direction of further research.

Through an in-depth study of the object-oriented information extraction model, this thesis brings feature knowledge into the remote sensing information extraction process. It addresses key issues such as the management and update mechanism for feature knowledge of geographic entities, knowledge-driven automatic construction of image objects, and the combination and optimization of features for target recognition. The work provides a basis for rapid, intelligent extraction of remote sensing information and can be applied in land resource surveys, land monitoring, the updating of basic geographic information, and the protection of environmental resources. Improved spatial information in these fields can supply more accurate baseline data for analysis and decision-making, so the approach has good prospects and a wide range of applications.
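
The genetic-algorithm search over segmentation parameters mentioned for Chapter 4 can be sketched as follows. This is a generic real-coded genetic algorithm, not the thesis's algorithm. The operators (tournament selection, arithmetic crossover, Gaussian mutation, elitism), the parameter bounds, and the function names are assumptions made for illustration. In particular, `toy_quality` is a hypothetical stand-in: in practice the score would come from segmenting the image with the candidate parameters and evaluating the resulting image objects against the chosen criteria.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical search ranges for (scale, shape_weight).
BOUNDS: List[Tuple[float, float]] = [(10.0, 200.0), (0.0, 0.9)]


def toy_quality(params: Sequence[float]) -> float:
    """Stand-in for a segmentation quality score (higher is better).

    Assumption: peaks at scale=80, shape_weight=0.3. A real score would
    require running the segmentation and evaluating the objects.
    """
    scale, shape = params
    return -(((scale - 80.0) / 190.0) ** 2 + (shape - 0.3) ** 2)


def optimize_segmentation_params(
    fitness: Callable[[Sequence[float]], float],
    bounds: Sequence[Tuple[float, float]],
    pop_size: int = 30,
    generations: int = 40,
    mutation_rate: float = 0.2,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Maximize `fitness` over the box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)

    def tournament(pop: List[List[float]]) -> List[float]:
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    for _ in range(generations):
        children = [best[:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            w = rng.random()
            # Arithmetic crossover: a random convex combination of the parents.
            child = [w * g1 + (1.0 - w) * g2 for g1, g2 in zip(p1, p2)]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    # Gaussian mutation scaled to 10% of the parameter range.
                    child[i] += rng.gauss(0.0, 0.1 * (hi - lo))
                child[i] = max(lo, min(hi, child[i]))  # keep within bounds
            children.append(child)
        pop = children
        best = max(pop, key=fitness)

    return best, fitness(best)


best_params, best_score = optimize_segmentation_params(toy_quality, BOUNDS)
print("scale=%.1f shape_weight=%.2f score=%.4f" % (*best_params, best_score))
```

Because each fitness evaluation would involve a full segmentation run in a real setting, the population size and number of generations are the main cost drivers, and caching scores for already-evaluated parameter sets would be a natural refinement.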
