

Research and Design of Searching Engine Based on Medical Domain Ontology

【Author】 吴迪 (Wu Di)

【Advisor】 李万龙 (Li Wanlong)

【Author information】 Changchun University of Technology (长春工业大学), Computer Application Technology, 2011, Master's degree

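【Abstract, translated from the Chinese】 In recent years, Web resources on the Internet have grown steadily, and in the two decades and more since its birth in 1989 the World Wide Web has developed at an unprecedented pace. It is now an indispensable tool in people's daily work, study, and life, and search engines have developed alongside it as the main technology for obtaining information. A search engine is a system that provides a query service to users: it collects information from the network with dedicated programs according to a given strategy, organizes and processes that information, and returns the processed results to the user. With Web resources expanding rapidly and the pace of life accelerating, search engines are expected to find the information users need quickly and accurately. However, current search engine query technology relies mainly on keyword matching, which cannot meet user requirements in either accuracy or efficiency. Ontology, a conceptual modeling tool that describes things at the knowledge level, offers a well-defined concept hierarchy and good support for logical reasoning, so ontologies and related technologies have been widely applied in knowledge retrieval. The main goal of this project is to apply ontology technology to the information retrieval module of a search engine in order to improve retrieval quality, that is, to remedy the low recall and precision of retrieval that depends on keywords alone. After analyzing existing retrieval methods, and based on the characteristics and definition of ontology, the thesis proposes a new retrieval model and studies its semantic analysis and processing, similarity calculation, and semantic expansion. The main research work is as follows: 1. The hierarchical relations in the ontology are used to expand the content of ontology concepts, which achieves semantic expansion and in turn gives the information retrieval system intelligent reasoning. 2. Based on the formal representation of the ontology and the relationships among its elements, a new method for calculating the semantic similarity of concepts is proposed, so that the system can compute the similarity between concepts and display query results sorted by similarity. 3. Ontology storage is implemented on Jena, and ontology concepts are queried with the RDQL language, covering both explicit and implicit relation queries. 4. The retrieval process of the information system is described, and semantic analysis of query conditions, expanded queries, and optimized ranking of query results are implemented. 5. A prototype search engine based on a medical ontology is designed; its concept query and expanded query functions improve the precision and recall of retrieval and realize intelligent retrieval of domain resources.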

【Abstract】 In recent years, as Web resources on the Internet have grown, the World Wide Web has developed at an unprecedented pace over the past twenty years. It has become an indispensable tool in daily work and study, and the search engine, as the main technology for accessing information, has developed along with it. A search engine uses dedicated computer programs to collect information on the Internet according to certain strategies, organizes and processes that information, and returns it to the user; it is a system that provides a query service. With Web resources expanding rapidly and the pace of life speeding up, a fast and accurate search engine is needed to help users find the information they want. However, current search engine query technology is based mainly on keyword matching, which cannot meet user requirements in completeness and accuracy. Ontology, a conceptual modeling tool that describes things at the semantic and knowledge levels, has a good concept hierarchy and supports logical reasoning. Ontology technology has therefore been widely applied in information retrieval, especially in knowledge-based semantic retrieval. The main purpose of this thesis is to apply ontology technology to the information retrieval module of a search engine in order to improve retrieval quality, making up for the low recall and precision of results that rely on keyword search alone. After analyzing current retrieval methods, and drawing on the characteristics and definition of ontology, the thesis proposes a new retrieval model and studies its semantic analysis and processing, similarity calculation, and semantic expansion.
The main research work is as follows. First, the hierarchical relations in the ontology are used to expand concept content, which achieves semantic expansion and in turn enables intelligent reasoning in the information retrieval system. Second, based on the formal representation of the ontology and the relationships among its elements, the thesis puts forward a new method for calculating the semantic similarity of concepts, so that the system can compute concept similarity and sort query results by it. Third, ontology storage is implemented on Jena, and the RDQL language is used to query ontology concepts, covering both explicit and implicit relation queries. Fourth, the retrieval process of the information system is described, and semantic analysis of query conditions, expanded queries, and optimized ranking of query results are implemented. Fifth, the Ontology-based Medical Search Engine Prototype System (OCHRIMPSEPS) is designed; its concept query and expanded query functions improve the precision and recall of retrieval and realize intelligent retrieval of domain resources.
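
The first two contributions (hierarchy-based semantic expansion and similarity-ranked results) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `PARENT` hierarchy, the function names, and the similarity measure, which is a standard Wu-Palmer style depth ratio standing in for the thesis's own formula (the abstract does not give it). The thesis itself stores and queries the ontology with Jena and RDQL; this sketch uses a plain Python dictionary instead.

```python
# Hypothetical toy medical hierarchy (child -> parent); not taken from the thesis.
PARENT = {
    "disease": None,
    "respiratory disease": "disease",
    "cardiovascular disease": "disease",
    "pneumonia": "respiratory disease",
    "bronchitis": "respiratory disease",
    "viral pneumonia": "pneumonia",
    "hypertension": "cardiovascular disease",
}


def ancestors(concept):
    """Return the path from the concept up to the root, concept first."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def depth(concept):
    """Depth in the hierarchy; the root has depth 1."""
    return len(ancestors(concept))


def similarity(a, b):
    """Wu-Palmer style score (an assumed stand-in, not the thesis's method):
    2 * depth(lowest common ancestor) / (depth(a) + depth(b))."""
    ancestors_of_b = set(ancestors(b))
    # First ancestor of a that is also an ancestor of b is the lowest common one.
    common = next(c for c in ancestors(a) if c in ancestors_of_b)
    return 2 * depth(common) / (depth(a) + depth(b))


def expand(concept):
    """Semantic expansion: the concept plus its parent, children, and siblings."""
    parent = PARENT[concept]
    related = {concept}
    if parent is not None:
        related.add(parent)
    for other, other_parent in PARENT.items():
        is_child = other_parent == concept
        is_sibling = parent is not None and other_parent == parent
        if is_child or is_sibling:
            related.add(other)
    return related


def expand_and_rank(concept):
    """Expanded concepts sorted by descending similarity, ties broken by name."""
    scored = [(c, similarity(concept, c)) for c in expand(concept)]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for name, score in expand_and_rank("pneumonia"):
        print(f"{score:.3f}  {name}")
```

With this toy hierarchy, a query for "pneumonia" expands to its child, parent, and sibling, ranked in that order after the query concept itself, while unrelated branches such as "hypertension" are left out. A real system would read the hierarchy from the ontology store rather than from a hard-coded dictionary.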

【Keywords (Chinese)】 本体 (ontology); 搜索引擎 (search engine); 语义检索 (semantic retrieval); Jena
【Key words】 ontology; search engine; semantic retrieval; Jena

