
Study on Data Reduction Based on Rough Set and Its Application in Modern Remote Education

【Author】 He Wei

【Advisor】 Li Hua

【Author information】 Chongqing University, Computer System Architecture, 2003, Master's

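【Summary】 With the rapid development of network and multimedia technology, modern remote education, as a new educational mode, is profoundly transforming traditional teaching. In this new mode, evaluation systems of various kinds are an important component of the modern remote education architecture. These systems define evaluation indexes, collect evaluation data, and derive evaluation decisions through data mining. In a network environment, however, they face three problems when processing data: (1) the data volume is large; (2) the data are incomplete; (3) the knowledge obtained should truly reflect the knowledge contained in the data itself, with as little outside influence as possible. Rough set theory, by its nature, is well suited to all three. Rough set theory, proposed by Professor Pawlak in 1982, is a mathematical tool for handling uncertain and incomplete knowledge; it covers theory and methods for data representation, learning, and induction. Unlike other mathematical tools, it needs no prior knowledge as guidance and is not subject to outside influence, but objectively reflects the information contained in the data set. It has therefore attracted growing attention from researchers over the past 20 years and has gradually become a powerful mathematical tool for KDD. Taking remote education as the application background, this thesis concentrates on data reduction algorithms for incomplete information systems. Because data are missing in incomplete systems, the traditional rough set model has limitations and must be extended. The thesis first introduces the relevant rough set theory and then, addressing the shortcomings of the previously proposed tolerance relation model, proposes an improved tolerance relation model that better matches reality and is more flexible. On the basis of this improved model, and without filling in missing data, it applies attribute significance and rough entropy theory to extend traditional attribute reduction algorithms into an attribute reduction algorithm that can handle incomplete information systems, describes the algorithm, and gives some performance analysis. The decision table still contains redundant information after attribute reduction, so value reduction is needed; the thesis improves the binary discernibility matrix into a multi-valued discernibility matrix, proposes a value reduction algorithm based on it, and finally extracts decision rules. Finally, taking a teacher evaluation system as an example, the thesis applies the proposed algorithms to that system and compares them with the traditional tolerance relation model.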
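
The tolerance relation that the thesis takes as its starting point can be illustrated with a minimal sketch. It assumes the standard definition for incomplete information systems (two objects are tolerant if, on every attribute, their values are equal or at least one is missing); the improved model proposed in the thesis is not specified in this record and is not reproduced here. The table contents, attribute names, and function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

# Assumption: a missing attribute value is stored as None.
Row = Dict[str, Optional[str]]


def tolerant(x: Row, y: Row, attrs: Sequence[str]) -> bool:
    """True if x and y agree on every attribute, treating a missing
    value as compatible with any value."""
    return all(
        x[a] is None or y[a] is None or x[a] == y[a]
        for a in attrs
    )


def tolerance_classes(table: List[Row], attrs: Sequence[str]) -> List[Set[int]]:
    """For each object i, the set of object indices tolerant with i."""
    return [
        {j for j, y in enumerate(table) if tolerant(x, y, attrs)}
        for x in table
    ]


def approximations(
    table: List[Row], attrs: Sequence[str], target: Set[int]
) -> Tuple[Set[int], Set[int]]:
    """Lower and upper approximation of `target` under the tolerance relation."""
    classes = tolerance_classes(table, attrs)
    lower = {i for i, c in enumerate(classes) if c <= target}  # class inside target
    upper = {i for i, c in enumerate(classes) if c & target}   # class meets target
    return lower, upper


# Hypothetical teacher-evaluation records; object 1 has a missing value.
table = [
    {"preparation": "good", "clarity": "good"},
    {"preparation": "good", "clarity": None},
    {"preparation": "poor", "clarity": "good"},
    {"preparation": "poor", "clarity": "poor"},
]
rated_excellent = {0, 2}  # hypothetical set of objects with a given decision
lower, upper = approximations(table, ["preparation", "clarity"], rated_excellent)
```

Here objects 0 and 1 cannot be told apart because of the missing value, so only object 2 is certainly in the target set, while objects 0, 1, and 2 are possibly in it.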

【Abstract】 With the development of the Internet and multimedia technology, modern remote education, as a new teaching mode, has been deeply affecting the traditional teaching mode. In this new mode, the various evaluation systems are an important component of the modern remote education architecture. These systems define evaluation indexes, collect evaluation data, and then obtain decision rules by data mining. In a network environment, however, data processing raises several problems: 1. the amount of data is large; 2. the data are incomplete; 3. the knowledge obtained should be a true reflection of the decision table, free of outside influence and prior knowledge. Rough set theory is a tool for solving these problems.

Rough set theory, proposed by Pawlak in 1982, is a mathematical tool for handling uncertain and incomplete knowledge. It involves methods of data representation, learning, and reduction. Unlike fuzzy sets and other mathematical tools, it does not need prior knowledge and is not affected by outside influence; it reflects the information in the data objectively. It has therefore drawn increasing attention over the past 20 years and has become an increasingly powerful tool in KDD.

In this thesis, remote education is taken as the background for research on data reduction in incomplete information systems. Because data are missing in such systems, the traditional rough set models are not suitable, so the traditional model must be extended. Rough set theory is first introduced, and then an improved rough set model is proposed to address the shortcomings of the previously proposed extended model. The new model fits reality better and is more flexible. A new attribute reduction algorithm is then proposed on the basis of the improved model, applying attribute significance and rough entropy theory. The data table still contains redundant data after attribute reduction: for each object, not all attribute values are necessary for the final decision rule, so a further step, called value reduction, is needed to remove the remaining redundancy. The author improves the binary discernibility matrix into a multi-valued discernibility matrix and uses it in a value reduction algorithm that constructs a multi-valued discernibility matrix for each object to obtain decision rules. In the last section, a Teaching Evaluation System is presented as an application of rough sets. All algorithms given in this thesis are applied to it and compared with the tolerance relation.
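
Attribute reduction on an incomplete decision table, without filling in missing values, can be sketched as a greedy forward selection. This is only an illustration under stated assumptions: it uses the standard tolerance relation and measures attribute significance as the gain in dependency degree (the share of objects whose tolerance class is decision-consistent). The thesis combines attribute significance with rough entropy on its own improved model, which this record does not specify, so the sketch below is a stand-in and not the author's algorithm. The table, attribute names, and function names are hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# Assumption: a missing attribute value is stored as None.
Row = Dict[str, Optional[str]]


def tolerant(x: Row, y: Row, attrs: Sequence[str]) -> bool:
    """Missing values are compatible with anything; known values must match."""
    return all(x[a] is None or y[a] is None or x[a] == y[a] for a in attrs)


def dependency(table: List[Row], attrs: Sequence[str], decision: str) -> float:
    """Fraction of objects whose whole tolerance class shares their decision."""
    consistent = sum(
        1
        for x in table
        if all(y[decision] == x[decision] for y in table if tolerant(x, y, attrs))
    )
    return consistent / len(table)


def greedy_reduct(table: List[Row], cond_attrs: Sequence[str], decision: str) -> List[str]:
    """Add the attribute with the largest dependency gain until the subset
    reaches the dependency of the full attribute set. The result preserves
    dependency but is not guaranteed to be a minimal reduct."""
    full = dependency(table, cond_attrs, decision)
    chosen: List[str] = []
    remaining = list(cond_attrs)
    while remaining and dependency(table, chosen, decision) < full:
        base = dependency(table, chosen, decision)
        best = max(
            remaining,
            key=lambda a: dependency(table, chosen + [a], decision) - base,
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical incomplete decision table with decision attribute "d".
table = [
    {"a": "1", "b": "0", "c": "0", "d": "yes"},
    {"a": "1", "b": None, "c": "1", "d": "yes"},
    {"a": "0", "b": "0", "c": "0", "d": "no"},
    {"a": "0", "b": "1", "c": None, "d": "no"},
]
reduct = greedy_reduct(table, ["a", "b", "c"], "d")
```

In this toy table attribute `a` alone separates the two decisions, so the selection stops after one step. Value reduction would then examine, per object, which of the remaining attribute values are needed for its decision rule.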

  • 【Online publication contributor】 Chongqing University
  • 【Online publication year/issue】 2004, Issue 01
  • 【Classification number】 TP311.13
  • 【Citations】 1
  • 【Downloads】 157