

Research on Retrieval of Multimedia Network Teaching Resources in Elementary Education

【Author】 魏春燕

【Advisor】 孟祥增

【Author Information】 Shandong Normal University, Educational Technology, 2008, Master's thesis

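【Abstract (translated from the Chinese)】 The informatization of education has changed educational thinking and concepts and placed new demands on both teachers and students. Teachers need to be able to use the network to obtain teaching resources and organize teaching, and students need to be able to use the network for self-directed learning. The Internet holds a vast amount of information, but usable teaching resources are scattered and uneven in quality. Although existing Web search engines are increasingly capable, most rely on keyword-based methods and are of little help in retrieving the multimedia resources needed for teaching and learning. Primary and secondary school teachers and students, whose computer skills are often limited, especially need a convenient and fast system to help them find multimedia resources. For these reasons, taking primary and secondary school textbooks as the basis, we organized pre-search keywords for basic education teaching, searched for network resources, and built an index database of multimedia network teaching resources for basic education, with primary and secondary school teachers and students as its users. Using ASP technology and building on this index database, we then built a retrieval system for it. Organizing the pre-search keywords gives the pre-search system its search direction and is the preliminary work for building the index database. Based on the textbooks, and through manual collection and sorting, we built a thematic word system for basic education teaching along three dimensions: school stage, subject, and type. The stages are primary school, junior middle school, and senior middle school, with 5 subjects in primary school, 12 in junior middle school, and 14 in senior middle school; the thematic word types are image, animation, video, and audio. The thesis designs and builds a retrieval system based on this index database. It is a Web-oriented multimedia resource retrieval system that connects to the corresponding Web multimedia resource index database according to the user name. Each resource database contains four kinds of resources: image, animation, video, and audio. The system comprises a user login interface, a user input interface, and a search result output interface. Based on an analysis of the types, features, and storage characteristics of the media in the resource database, the retrieval system accepts queries in natural-language Chinese and uses similarity to measure the difference between the query's target media and the media in the database. Natural language is an effective tool for expressing ideas, and using it to express the semantics of multimedia resources is a concise and effective approach. The thesis introduces the general methods of natural-language word segmentation and uses an existing segmentation dictionary to build its own segmentation function, which segments the query text and tags parts of speech. Removing function words and preset default words from the query text and extracting the thematic keywords we need (nouns, verbs, adjectives, idioms, and so on) yields a description of the target media, called the theme content. Before similarity is calculated, the theme content is expanded using a synonym dictionary. The media resource index database contains four types of media (image, animation, video, and audio), and the thesis uses similarity to measure the difference between the query's target media and the media in the database. Media features include file attributes and content features; similarity calculation mainly targets the content features, and different content features use different similarity calculations. Theme-content similarity is computed from the number of words shared by the expanded theme content and the content description field in the database; the dominant-hue color word is converted to the HSI model and compared with the dominant-hue field, which is annotated numerically in the database, to compute hue similarity; the subject of an image and the subject's attributes are compared with the subject field in the database to compute a similarity. After the importance of each content feature is determined by its level, the total similarity is calculated. Database records whose total similarity exceeds a given threshold are returned to the user as search results in descending order of total similarity. Building on this work, the thesis reports a large number of experiments on the retrieval system and describes the experimental results in detail. The experiments show that the system segments query texts with relatively simple structure and little nesting fairly accurately, and that retrieval accuracy is high for records whose content features are annotated accurately and in detail, which demonstrates that retrieval based on content features is feasible. Its shortcoming is that, as the number of records in the index database grows, the system runs slowly when a query involves many search conditions. Finally, the thesis summarizes the work and proposes directions for further research.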

【Abstract】 The informatization of education has transformed educational thinking and concepts, placing new requirements on both teachers and students. Teachers should be able to use the network to obtain teaching resources and organize teaching, and students should be able to use the network for self-directed learning. Although the Internet holds abundant information resources, the usable teaching resources are scattered and vary widely in quality. Existing Web search engines improve day by day, but most rely on keyword-based methods, which are of little help in retrieving multimedia resources. A more convenient and efficient system is therefore needed to assist multimedia resource retrieval, especially for primary and secondary school teachers and students, whose computer skills are often limited.

For these reasons, we established a basic-education-oriented index database of multimedia resources by organizing pre-search keywords for basic education and searching network resources. The index database takes primary and secondary school teachers and students as its main users and is based on their textbooks. In addition, a retrieval system for this index database was built with ASP technology.

Organizing the pre-search keywords for basic education is the preparatory work for establishing the index database and is intended to give the pre-search system its search direction. Based on primary and secondary school textbooks, and through manual collection and sorting, we built a thematic word system for basic education along three dimensions: stage, subject, and type. The stages are primary school, junior middle school, and senior middle school, with 5, 12, and 14 subjects respectively. The thematic words are divided into four types: image, animation, video, and audio.

In this thesis, we design and build a retrieval system based on the basic-education-oriented multimedia resource index database. It is a Web-oriented retrieval system that connects to the corresponding Web multimedia resource index database according to the user name. Each resource database contains four types of resource: image, animation, video, and audio. The system comprises a user login interface, a user input interface, and a search result output interface. Based on an analysis of the types, features, and storage characteristics of the media, the retrieval system accepts Chinese natural-language queries and uses similarity to measure the difference between the target media and the media in the database.

Natural language is an effective tool for expressing ideas, and using it to describe the semantics of multimedia resources is a simple and effective method. This thesis introduces the general methods of natural-language word segmentation and builds its own segmentation function on an existing segmentation dictionary to segment the query text and tag parts of speech. Removing function words and preset default words from the query text, and keeping the thematic keywords such as nouns, verbs, adjectives, and idioms, yields a description of the target media, which we call the "theme content". The theme content is expanded with a synonym dictionary before the similarity is calculated.

The media resource index database contains four types of media (image, animation, video, and audio), and the thesis uses similarity to measure the difference between the target media and the media in the database. Media features include file attributes and content features; the similarity calculation mainly concerns the latter, and different content features use different similarity calculations. Theme-content similarity is obtained by counting the words shared by the expanded theme content and the content description field in the database. The color word for the dominant hue is converted to the HSI model and compared with the dominant-hue field, which is annotated numerically in the database, to compute the hue similarity. The subject of an image and the subject's attributes are compared with the subject field in the database to compute a similarity. Each content feature is assigned an importance according to its level, and the total similarity is then calculated. Records whose total similarity exceeds a given threshold are sorted in descending order of total similarity and returned to the user as the search result.

Based on the work described above, a large number of experiments were carried out on the retrieval system, and the experimental results are presented in detail. The experiments show that the system segments relatively simple, lightly nested query texts fairly accurately, and that search results are highly accurate for records whose content features are annotated accurately and in detail, which demonstrates that retrieval based on content features is feasible. The shortcoming is that, as the number of records in the multimedia resource index database grows, the system slows down when a query involves many search conditions. Finally, the thesis summarizes its work and proposes directions for further research.
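
The retrieval steps summarized in the abstract (filter the query words, expand them with synonyms, score each content feature, combine the scores, then apply a threshold and sort) can be illustrated with a minimal Python sketch. This is not the thesis's ASP implementation. It assumes the query has already been segmented into tokens, uses English stand-ins for the Chinese words, and covers only two content features (theme content and dominant hue; the subject field is left out). The dictionaries, hue angles, feature weights, normalization, and threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# All tables below are hypothetical stand-ins for the thesis's dictionaries.
STOP_WORDS = {"a", "an", "the", "on", "of", "in"}       # function/default words
SYNONYMS = {"puppy": {"dog"}, "grass": {"lawn"}}        # synonym dictionary
COLOR_HUES = {"red": 0.0, "yellow": 60.0, "green": 120.0, "blue": 240.0}  # degrees
WEIGHTS = {"theme": 0.6, "hue": 0.4}                    # assumed feature importance


@dataclass
class Record:
    """One index-database entry with two annotated content features."""
    name: str
    description: set   # words of the content description field
    hue: float         # dominant hue in degrees, 0 <= hue < 360


def theme_content(tokens):
    """Drop stop words, then return (kept words, synonym-expanded word set)."""
    kept = [t for t in tokens if t not in STOP_WORDS]
    expanded = set(kept)
    for word in kept:
        expanded |= SYNONYMS.get(word, set())
    return kept, expanded


def theme_similarity(kept, expanded, description):
    """Count shared words; dividing by the query length is an assumption."""
    if not kept:
        return 0.0
    shared = len(expanded & description)
    return min(1.0, shared / len(kept))


def hue_similarity(color_word, record_hue):
    """Map a color word to a hue angle and compare on the hue circle."""
    diff = abs(COLOR_HUES[color_word] - record_hue) % 360.0
    diff = min(diff, 360.0 - diff)      # shortest way around the circle
    return 1.0 - diff / 180.0


def retrieve(tokens, color_word, records, threshold=0.4):
    """Return (total similarity, name) pairs above the threshold, best first."""
    kept, expanded = theme_content(tokens)
    scored = []
    for rec in records:
        total = (WEIGHTS["theme"] * theme_similarity(kept, expanded, rec.description)
                 + WEIGHTS["hue"] * hue_similarity(color_word, rec.hue))
        if total > threshold:
            scored.append((total, rec.name))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `retrieve(["a", "puppy", "on", "the", "grass"], "green", records)` scores every record and returns only those above the threshold, ordered from most to least similar. A weighted sum is one simple way to combine features by importance; the abstract does not state the exact combination rule used in the thesis.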

  • 【Classification Code】TP391.3
  • 【Times Cited】4
  • 【Downloads】224