

Research and Implementation of Diabetes Health Education System Based on Semantic Analysis

【Author】 Fan Chunlei (樊春雷)

【Advisor】 Wang Xuewu (王学武)

【Author information】 East China University of Science and Technology, Control Science and Engineering, 2011, Master's thesis

【Abstract】 The Semantic Web is an extension of the current World Wide Web that enables people and computers to cooperate more effectively. A diabetes knowledge retrieval system based on semantic analysis applies Semantic Web technology to organize and manage diabetes knowledge and to help users find the diabetes knowledge they are looking for. Compared with traditional keyword-based search, a Semantic Web-based retrieval mechanism is more intelligent and can effectively improve the precision and recall of resource retrieval. Building a diabetes knowledge ontology is the key to realizing semantic retrieval. Diabetes currently affects the lives of many people worldwide, and publicizing and popularizing knowledge about lifestyle, prevention, care, and treatment can greatly improve patients' quality of life. Building on a completed diabetes ontology, this thesis designs a user-friendly diabetes health education system. Jena is used to operate on the ontology and to compute the semantic similarity between ontology concept classes. Semantic similarity, however, is determined by several factors, each of which influences it to a different degree. Because manually defined weights for these factors strongly affect the final result, this thesis uses a BP (back-propagation) neural network to implement the similarity algorithm and uses it to semantically expand the keywords entered by the user. This approach mitigates the drawbacks of plain keyword queries and reduces the influence of human judgment on the similarity algorithm, so that the search engine can achieve good recall and precision. For retrieval, Lucene provides full-text search, indexing and searching the contents of the diabetes resource description library. To overcome the limited amount of resources available through manual semantic annotation, the thesis also considers using the Heritrix crawler to collect knowledge from web pages and thereby expand the diabetes knowledge base.
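
The idea of letting a BP neural network, rather than hand-set weights, decide how much each factor contributes to semantic similarity can be illustrated with a minimal sketch. It is written in plain Python for brevity and is not the thesis's implementation: the three input factors (a path-distance score, a depth score, and a shared-property overlap), the network size, the learning rate, and the training pairs in `SAMPLES` are all hypothetical values chosen for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyBPNet:
    """One hidden layer with sigmoid units, trained by plain backpropagation."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sigmoid(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return hidden, out

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr):
        hidden, out = self.forward(x)
        # Error signal at the output for the loss 0.5 * (out - target)^2.
        delta_out = (out - target) * out * (1.0 - out)
        for j, h in enumerate(hidden):
            # Propagate the error back through w2[j] before updating it.
            delta_h = delta_out * self.w2[j] * h * (1.0 - h)
            self.w2[j] -= lr * delta_out * h
            self.b1[j] -= lr * delta_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
        self.b2 -= lr * delta_out


def mse(net, samples):
    return sum((net.predict(x) - t) ** 2 for x, t in samples) / len(samples)


# Hypothetical training pairs: ([path score, depth score, property overlap],
# similarity judged by a domain expert). Not data from the thesis.
SAMPLES = [
    ([1.0, 0.9, 0.8], 0.9),
    ([0.5, 0.6, 0.4], 0.5),
    ([0.2, 0.3, 0.1], 0.2),
    ([0.1, 0.1, 0.0], 0.1),
]


def train(samples, epochs=2000, lr=0.5, seed=0):
    net = TinyBPNet(n_in=3, n_hidden=4, seed=seed)
    for _ in range(epochs):
        for x, t in samples:
            net.train_step(x, t, lr)
    return net


if __name__ == "__main__":
    net = train(SAMPLES)
    for x, t in SAMPLES:
        print(x, "target", t, "predicted", round(net.predict(x), 3))
```

In a system like the one described, the trained network's output for a pair of ontology classes could then serve as the similarity score that decides which related concepts are added to the user's query; how the factor values are extracted from the ontology with Jena is left out of this sketch.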

【Key words】 Semantic Web; Diabetes; Ontology; Jena; Lucene