
基于本体技术的语义检索及其语义相似度研究

Research on Semantic Retrieval & Its Semantic Similarity Based on Ontology Technology

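【Author】 Zou Wenke (邹文科)

【Advisor】 Meng Xiangwu (孟祥武)

【Author information】 Beijing University of Posts and Telecommunications, Computer Application Technology, 2008, Master's thesis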

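【Abstract, translated from the Chinese】 With the development of network technology and the rapid growth of information on the Internet, information retrieval systems, as an important component of the network information platform, play an important role in helping users obtain accurate information online. Traditional information retrieval relies only on simple syntactic matching and lacks the ability to represent, process, and understand knowledge. The underlying cause is that information resources lack a unified semantic description, so users have difficulty finding information relevant to their needs and related information cannot be fused semantically. The key to the problem is to raise information retrieval from simple syntax-based matching to the level of semantic knowledge. The Semantic Web is the next-generation World Wide Web advocated by Tim Berners-Lee, the inventor of the WWW, and aims to represent information in a machine-processable form. Its goal is to let computers "understand" information on the Web and, on that basis, process and use it better in order to serve people better. An ontology has a well-defined concept hierarchy and supports logical reasoning; it can express the semantics of concepts through the relations between them, thereby representing information semantically, and is therefore well suited to information retrieval. Unlike traditional keyword retrieval, ontology-based retrieval uses an ontology knowledge base to strengthen the intrinsic links between concepts, and logical reasoning can uncover implicit and unstated information among concepts, enabling semantic intelligent information retrieval. This thesis first analyzes traditional information retrieval technology and finds that the root cause of its low retrieval quality is its syntax-based matching, which lacks semantic understanding of the information being retrieved; it then explores applying ontology technology to information retrieval to achieve semantic intelligent retrieval. Second, it studies the Semantic Web and ontology technology, including their origins and definitions, architecture, research status, and applications. The Semantic Web is an extension and evolution of the existing World Wide Web; by expressing semantics and knowledge through metadata and ontologies, it provides semantic information rich enough for machines to understand and so to process information automatically. The thesis also analyzes in detail the application of ontology technology in the telecommunications field, including an ontology-based integrated information model for network and system management, Semantic Web technology applied to context-aware intelligent mobile Web services, and the construction of telecommunications domain ontologies. Next, it focuses on the key technologies of ontology-based semantic intelligent information retrieval, including ontology technology, intelligent retrieval methods, domain ontology construction, and the system workflow. Building on the analysis of the shortcomings of traditional retrieval and on ontology technology, a domain-ontology-based semantic intelligent retrieval system is designed. The retrieval systems of current online mobile phone shopping websites are analyzed, a framework model for an ontology-based semantic intelligent retrieval system is proposed, a mobile phone product ontology is built for the experimental system, and the semantic reasoning of the intelligent retrieval system is analyzed. On the basis of the preceding theory and system design, an ontology-based Mobile Phone Product Semantic Retrieval System (MPPSRS) is implemented. This experimental system takes the mobile phone product domain as its retrieval target. Through ontology-based semantic reasoning it can uncover the implicit associations among retrieved information and provide users with semantic retrieval services, thereby addressing at its root the lack of semantic information about resource objects in traditional retrieval and finding the mobile phone product information users need more accurately and completely. The thesis then reviews current research on concept similarity and, on the basis of the constructed domain ontology, proposes an improved domain-ontology-based semantic similarity computation model. The model combines distance-based and attribute-based semantic similarity; the distance-based component takes into account several factors in the ontology class hierarchy, such as semantic overlap, semantic depth, semantic distance, semantic density, and the corresponding adjustment factors, to compute the semantic similarity between concepts within the domain ontology. Finally, based on the improved model discussed in the preceding chapter, an ontology-based Ballasts & Lamps Product Retrieval Recommendation System (BLPRRS) for electronic ballasts and fluorescent lamps is designed and implemented. After analyzing the actual requirements of a company and the characteristics of its products, the electronic ballast and fluorescent lamp products that the company develops and sells are extracted to build an ontology base, on which the experimental system is implemented. By tuning the adjustment factors in the experimental system and comparing the experimental data with experts' subjective judgments, the effect of the improved semantic similarity method is analyzed and verified, showing that the ontology-based semantic similarity model can help expand query concepts and provide effective product retrieval results.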

【Abstract】 With the development of network technology and the rapid growth of information on the Internet, information retrieval systems play an important role in connecting users with resources on the network. Traditional information retrieval is based only on syntactic matching and lacks the ability to represent, handle, and understand knowledge. The key problem is that information resources lack semantic description, so it is hard for users to retrieve the information they really want and impossible to associate information resources by their semantic features. The essential solution is to raise information retrieval from the traditional syntax-based level to a knowledge-based semantic level.
The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. An ontology has a well-defined hierarchical structure of concepts and supports logical reasoning, and semantic information can be expressed through the semantic relationships between concepts, so ontology technology is well suited to information retrieval. Ontology-based information retrieval differs from traditional keyword search: because the ontology knowledge base strengthens the intrinsic links between concepts, implicit and unstated information can be deduced through logical reasoning, which makes semantic intelligent information retrieval possible. This thesis analyzes traditional information retrieval technology and concludes that its low retrieval quality stems fundamentally from syntax-based matching and the lack of semantic understanding of the retrieved information; it therefore proposes applying ontology technology to information retrieval. In addition, applications of ontology technology in the telecommunications field are analyzed in detail, including an ontology-based integrated information model for network management systems, Semantic Web technologies in context-aware smart mobile Web services, and ontology construction in the telecommunications field.
The thesis then focuses on several key technologies of ontology-based semantic intelligent information retrieval, including ontology technology, the method of semantic intelligent information retrieval, the domain ontology building process, and the system workflow. Based on the analysis of traditional information retrieval and of ontology technology, an ontology-based semantic intelligent retrieval system is designed. After analyzing the retrieval systems of current online mobile phone shops, a framework model for an ontology-based semantic intelligent retrieval system is proposed. A mobile phone product ontology is then constructed for the experimental system, and its semantic reasoning is analyzed.
Next, the Mobile Phone Product Semantic Retrieval System (MPPSRS) is developed on the basis of the theory and system design presented in the previous chapters. Mobile phone products are the retrieval objects of this experimental system. Through ontology-based semantic reasoning, the system can uncover associations that are only implicit in the retrieved information. It offers a semantic retrieval service that addresses the fundamental shortcoming of traditional information retrieval, namely that information resources lack semantic information, and it returns more accurate and comprehensive results for users' queries.
In the final two chapters, traditional models for computing the semantic similarity of concepts are analyzed and, based on a domain ontology, an improved semantic similarity algorithm is proposed that integrates distance-based and attribute-based semantic similarity. For distance-based semantic similarity, several factors implied by the domain ontology are taken into account, such as semantic ancestors, semantic depth, semantic distance, semantic density, and the related adjustment factors. An ontology base of ballasts and lamps is then built from the products of an actual company, and an experimental semantic similarity retrieval system, the Ballasts & Lamps Product Retrieval Recommendation System (BLPRRS), is developed. The experimental results demonstrate that this semantic similarity computation model can help extend the set of query concepts and provide effective product retrieval results.
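The abstract names the ingredients of the similarity model (distance-based plus attribute-based similarity, with ancestor overlap, depth, distance, and adjustment factors) but does not give its formulas. The following is only a minimal Python sketch of how such a combination could look, not the thesis's model. The toy class hierarchy, the attribute sets, the parameters `alpha`, `beta`, and `w`, and the way the factors are multiplied together are all assumptions made for illustration, and semantic density is left out for brevity.

```python
# Hypothetical is-a hierarchy (child -> parent); not taken from the thesis.
PARENT = {
    "Product": None,
    "Ballast": "Product",
    "Lamp": "Product",
    "ElectronicBallast": "Ballast",
    "MagneticBallast": "Ballast",
    "FluorescentLamp": "Lamp",
}

# Hypothetical attribute sets attached to some concepts.
ATTRS = {
    "ElectronicBallast": {"input_voltage", "power_factor", "lamp_wattage"},
    "MagneticBallast": {"input_voltage", "lamp_wattage"},
    "FluorescentLamp": {"lamp_wattage", "color_temperature", "length"},
}


def path_to_root(concept):
    """Return the list [concept, parent, ..., root]."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def distance_similarity(a, b, alpha=1.0, beta=0.5):
    """Assumed distance-based score in [0, 1]; alpha and beta are adjustment factors."""
    pa, pb = path_to_root(a), path_to_root(b)
    shared = set(pa) & set(pb)
    # Lowest common ancestor: first node on a's path that also lies on b's path.
    lca = next(node for node in pa if node in shared)
    distance = pa.index(lca) + pb.index(lca)            # edges between a and b
    overlap = len(shared) / len(set(pa) | set(pb))      # shared-ancestor ratio
    depth = len(path_to_root(lca)) / max(len(pa), len(pb))  # deeper LCA scores higher
    closeness = alpha / (alpha + distance)              # shrinks as distance grows
    return closeness * (beta * overlap + (1 - beta) * depth)


def attribute_similarity(a, b):
    """Jaccard overlap of the two concepts' attribute sets."""
    xa, xb = ATTRS.get(a, set()), ATTRS.get(b, set())
    if not xa and not xb:
        return 0.0
    return len(xa & xb) / len(xa | xb)


def similarity(a, b, w=0.6):
    """Weighted mix of the two components; w is an assumed weight."""
    return w * distance_similarity(a, b) + (1 - w) * attribute_similarity(a, b)


if __name__ == "__main__":
    for pair in [("ElectronicBallast", "MagneticBallast"),
                 ("ElectronicBallast", "FluorescentLamp")]:
        print(pair, round(similarity(*pair), 3))
```

In this sketch two sibling ballast classes score higher than a ballast and a lamp, because they share a deeper common ancestor, lie closer together in the hierarchy, and have more attributes in common. Tuning `alpha`, `beta`, and `w` against expert judgments would correspond loosely to the adjustment-factor experiments the abstract describes.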

  • 【Classification number】TP391.3
  • 【Citations】29
  • 【Downloads】2228