Research on Automatic Acquisition of Teaching Resources Based on an Ontology Knowledge Base

【Author】 田俊华

【Advisor】 李艺

【Author information】 Nanjing Normal University, Educational Technology, 2011, PhD

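【Summary】 Web information resources are now abundant. Using technical means to automatically collect educational resources from the Web, build teaching resource repositories, and provide information resource services for teaching would greatly advance the informatization of education. Given the ever-growing scale of the Web and increasingly complex page structures, however, how to collect teaching resources from the Internet efficiently under limited network resources and a limited acquisition scale is a question of academic significance and practical value. This thesis systematically studies techniques for the automatic acquisition of topic-specific information resources, covering topical crawling, automatic text classification, automatic text extraction, and ontologies and ontology-based knowledge reasoning, and discusses in depth how these techniques apply to the automatic acquisition of Web teaching resources. The thesis analyzes the distribution of topical resources on the Web from an ecological perspective, proposes the Network Ecological Chain theory, and designs a Network Ecological Chain algorithm based on it. It proposes a top-down method for acquiring topical information resources that combines judging the topical character of a website with predicting individual link targets: the Network Ecological Chain algorithm, supported by automatic text classification, automatic text extraction, and ontology knowledge reasoning, first discovers groups of on-topic websites on the Web; a topical crawling algorithm then selectively collects specific link targets, drawing on the topical features of the site, the page, and the text block adjacent to each link. This effectively addresses the loss of direction in topical crawling and raises the harvest rate of topical resource acquisition. To improve the prediction of link targets during topical crawling, the thesis focuses on ontology technology and its application to the automatic acquisition of Web teaching resources. It discusses ontology languages, ontology construction methods, and ontology development techniques; builds an experimental educational ontology knowledge base; develops an educational ontology reasoning engine; and explores concrete applications of that engine. Because ontologies are open and standardized, an educational ontology knowledge base can be built through joint construction and sharing, enabling knowledge reuse. Finally, a prototype system for the automatic acquisition of Web teaching resources is designed and implemented, and the techniques are validated using the automatic acquisition of moral education teaching resources as a case. The main contributions are: a systematic study of techniques for automatically acquiring topical information resources; the Network Ecological Chain theory and algorithm, validated with experimental data; and the application of ontology technology to building an educational knowledge base, including an experimental educational ontology reasoning engine and an exploration of its use in the automatic acquisition of Web teaching resources. The research offers some theoretical guidance and technical support for the design and development of related systems.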

【Abstract】 Web information resources are extremely rich. Automatically collecting educational resources from the Web by technical means, in order to build teaching resource repositories and provide information resource services for teaching, would undoubtedly advance educational informatization. However, in the face of the growing scale of the Web and increasingly complicated page structures, how to collect educational resources from the Internet efficiently with limited network resources and a restricted acquisition scale is of both academic and practical value. This thesis systematically studies automatic acquisition technology for topic information resources. It discusses topic crawling, automatic text categorization, automatic text extraction, ontologies, and ontology knowledge inference, and examines in depth the application of these technologies to the automatic acquisition of Web teaching resources. The thesis analyzes the distribution of Web topic resources from an ecological perspective, proposes the theory of the Network Ecological Chain, and designs a Network Ecological Chain algorithm based on this theory. It proposes a top-down method of acquiring topic information resources that combines judging a site's topic characteristics with predicting specific link targets. That is, using the Network Ecological Chain algorithm, assisted by automatic text categorization, automatic text extraction, and ontology knowledge inference, topic website groups are first discovered on the Web; then, based on the topic characteristics of the websites, the pages, and the text adjacent to links, specific link targets are selectively collected with a topic crawling algorithm. In this way, the problem of losing direction in topic crawling can be addressed effectively, and the harvest rate of topic information resource acquisition is increased. To improve the ability to predict link targets in topic crawling, the thesis focuses on ontology technology and its application to the automatic acquisition of Web teaching resources. It discusses ontology languages, ontology construction methods, and ontology development techniques. An educational ontology knowledge repository is tentatively built, an educational ontology knowledge inference engine is developed, and specific applications of this engine are explored. Because ontologies are open and standardized, an educational ontology knowledge repository can be built through co-construction and sharing, which enables knowledge reuse. Finally, the thesis designs and develops a prototype system for the automatic acquisition of Web teaching resources and verifies the effectiveness of the technologies, using the automatic acquisition of moral education resources as an example. The main work and innovations of the thesis are as follows: it systematically studies automatic acquisition technology for topic information resources; it proposes the Network Ecological Chain theory, designs the Network Ecological Chain algorithm, and verifies its effectiveness with experimental data; and it applies ontology technology to the construction of an educational knowledge repository, tentatively develops an educational ontology knowledge inference engine, and explores its application to the automatic acquisition of Web teaching resources. The research can provide some theoretical guidance and technical support for the design and development of related systems.
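
As a rough illustration of the selective collection step and the harvest rate mentioned in the abstract, the following minimal Python sketch ranks candidate links by a weighted combination of site, page, and link-adjacent text relevance, and computes harvest rate as the fraction of fetched pages that are on topic. The weights, the function names (`link_score`, `harvest_rate`), and the `Frontier` class are assumptions made for this sketch; the abstract does not specify how the thesis combines these signals, and the Network Ecological Chain algorithm itself is not reproduced here.

```python
import heapq


def link_score(site_rel, page_rel, anchor_rel, weights=(0.3, 0.3, 0.4)):
    """Combine three topic-relevance signals (each in [0, 1]) into one score.

    The weights are illustrative assumptions, not values from the thesis.
    """
    w_site, w_page, w_anchor = weights
    return w_site * site_rel + w_page * page_rel + w_anchor * anchor_rel


def harvest_rate(relevant_pages, fetched_pages):
    """Fraction of fetched pages that turned out to be on topic."""
    if fetched_pages == 0:
        return 0.0
    return relevant_pages / fetched_pages


class Frontier:
    """Best-first crawl frontier: the highest-scoring unseen link is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, score):
        # Ignore URLs that were already queued once.
        if url in self._seen:
            return
        self._seen.add(url)
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-score, url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)
```

In such a sketch, a crawler would push every extracted link with its score, pop the best candidate, fetch it, and update the harvest rate after classifying the fetched page.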

  • 【Classification code】 TP391.1; TP274.2
  • 【Times cited】 6
  • 【Downloads】 772