Research on Web-based Spatial Ontology Construction Methods (基于Web的空间本体构建方法研究)

【Author】 Zhong Mei (钟美)

【Advisor】 Du Qingyun (杜清运)

【Author Information】 Wuhan University, Cartography and Geographic Information Engineering, 2010, Ph.D.

【Abstract】 Although ontology engineering tools have matured over the past decade, manually acquiring spatial ontologies remains a time-consuming, expensive, highly skilled, and sometimes cumbersome task that easily leads to a knowledge acquisition bottleneck in the geographic domain. These problems resemble those that knowledge engineers have dealt with over the past two decades while working on knowledge acquisition methodologies and workbenches for defining knowledge bases. Integrating data mining and machine learning techniques has proved beneficial for geographic knowledge acquisition. The techniques that use knowledge acquisition to reduce the cost of spatial ontology construction are called spatial ontology learning techniques. Spatial ontology learning acquires geographic domain knowledge from existing knowledge sources in order to construct or update spatial ontologies in a (semi-)automatic way, and it can support the extraction of spatial ontologies from data already available on the Web.

As one of the most important applications on the Internet, the Web (World Wide Web) provides a convenient mechanism for publishing and accessing documents and has gradually become a gathering place for all kinds of information resources. Because text is the most abundant resource on the Web, research on Web-based ontology learning has focused mainly on acquiring ontologies from free text. Free text expresses specific semantics according to certain syntactic rules, so a knowledge engineer can understand its meaning with the help of some background knowledge. However, because free text lacks structure, a machine can automatically understand plain text and extract the required knowledge from it only if the text is first preprocessed with natural language processing (NLP) techniques; statistical, machine learning, and other methods are then used to acquire knowledge from it.

Web-based ontology learning methods typically comprise term extraction, semantic interpretation, and domain ontology creation, and Web-based spatial ontology learning comprises the same three aspects. In the past, spatial ontologies were built by hand from scratch: ontologies were constructed according to different application needs, their concepts, relations, and axioms were formally defined, various ontology construction tools were used to build them and to verify them by reasoning, and ontology applications such as ontology-based spatial search engines were studied. By contrast, there has been little systematic research on semantic disambiguation in spatial concept learning, a basic step in constructing spatial ontologies from terms extracted from the Web. Starting from natural language understanding, this thesis studies the theories and techniques of semantic interpretation for spatial concept learning; it aims at a deep understanding of spatial concept semantics from a methodological perspective and studies the construction of spatial ontologies. The thesis makes three innovative contributions:
1. It systematically analyzes the connections and differences between the lexical semantics of natural language and the lexical semantics of spatial information, and analyzes in detail the spatial ontology concepts in WordNet.
2. It disambiguates the lexical semantics of spatial information from the perspective of semantic interpretation, introducing a theory based on a thematic role labeling system and, on that basis, applying techniques such as selectional restrictions and statistical word sense disambiguation to resolve semantic ambiguity.
3. To extract spatial ontologies from Web pages automatically, it proposes a spatial ontology learning model that mainly comprises Web document preprocessing, spatial domain term extraction, spatial concept learning, and spatial relation learning.
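
The four stages of the learning model named in the abstract (Web document preprocessing, spatial term extraction, spatial concept learning with disambiguation, and spatial relation learning) can be illustrated with a minimal Python sketch. The abstract does not specify any algorithms, so everything below is an assumption made for illustration: the sense inventory `SENSES`, the restriction table `SUBJECT_RESTRICTIONS`, all function names, and the regular-expression relation pattern are hypothetical, and simple counting and pattern matching stand in for whatever methods the thesis actually uses.

```python
import re
from collections import Counter
from html import unescape

# Hypothetical sense inventory: each term maps to candidate senses, each
# tagged with a coarse semantic class (illustrative, not from the thesis).
SENSES = {
    "bank": [("bank/river_side", "landform"), ("bank/institution", "organization")],
    "river": [("river/watercourse", "water_body")],
    "city": [("city/settlement", "settlement")],
}

# Hypothetical selectional restrictions: the semantic classes that a verb
# accepts for its subject argument (illustrative only).
SUBJECT_RESTRICTIONS = {
    "flows": {"water_body"},
    "erodes": {"landform", "water_body"},
    "lends": {"organization"},
}


def preprocess(html):
    """Stage 1: strip tags, decode entities, split into lowercase sentences."""
    text = unescape(re.sub(r"<[^>]+>", " ", html))
    return [s.strip().lower() for s in re.split(r"[.!?]", text) if s.strip()]


def extract_terms(sentences, vocabulary):
    """Stage 2: count occurrences of candidate domain terms."""
    counts = Counter()
    for sentence in sentences:
        for token in re.findall(r"[a-z]+", sentence):
            if token in vocabulary:
                counts[token] += 1
    return counts


def disambiguate(term, verb):
    """Stage 3: keep only the senses whose class satisfies the verb's restriction."""
    senses = SENSES.get(term, [])
    allowed = SUBJECT_RESTRICTIONS.get(verb)
    if allowed is None:
        # No restriction known for this verb: every sense remains a candidate.
        return [name for name, _ in senses]
    return [name for name, cls in senses if cls in allowed]


def extract_relations(sentences):
    """Stage 4: match simple 'the X <relation> the Y' spatial patterns."""
    pattern = re.compile(r"the (\w+) (flows through|is in|borders) the (\w+)")
    return [m.groups() for s in sentences for m in pattern.finditer(s)]


if __name__ == "__main__":
    page = "<p>The river flows through the city. The bank erodes slowly.</p>"
    sentences = preprocess(page)
    print(extract_terms(sentences, SENSES))
    print(disambiguate("bank", "erodes"))
    print(extract_relations(sentences))
```

In this sketch the verb "erodes" only accepts landforms or water bodies as its subject, so the institutional sense of "bank" is filtered out. When a restriction leaves several senses, a statistical word sense disambiguation step, which the abstract also mentions, would be needed to choose among them.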

  • 【Online Publication Contributor】 Wuhan University
  • 【Online Publication Issue】 2010, Issue 10