粗糙集中基于粒计算的动态知识更新方法研究

Research on Granular Computing-based Approaches for Dynamic Knowledge Updating in Rough Set Theory

【Author】 Chen Hongmei (陈红梅)

【Supervisor】 Li Tianrui (李天瑞)

【Author information】 Southwest Jiaotong University, Computer Application Technology, 2013, PhD

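【Abstract (Chinese version, translated)】 With the rapid development of modern information technology, data of many types are growing sharply, and discovering knowledge from this sea of data effectively and in a timely manner has become an urgent problem. More importantly, data in real applications evolve continuously: new data are added, old data are deleted, and erroneous records or values in the database must be corrected. Updating knowledge promptly and effectively as the data change is therefore an important research topic. Efficient knowledge discovery is indispensable in big data and interactive real-time applications in particular. Dynamic knowledge discovery aims to improve the efficiency of knowledge discovery so as to meet the needs of different applications. Granular computing provides a divide-and-conquer, multi-level processing framework and is one of the effective theories for processing big data efficiently. Based on rough set theory and the ideas of granular computing, this thesis studies several key problems in dynamic knowledge discovery. Its main work and contributions are summarized as follows:
1. The thesis characterizes how knowledge granularity and the granularity measure of partitions change in the variable precision rough set model when objects are added or deleted, and clarifies how knowledge granules change when the attribute value domain changes and when it stays the same. It reveals how the relative misclassification rate varies in cases such as the knowledge granule growing (shrinking) while the concept granule stays unchanged, and the knowledge granule growing (shrinking) while the concept granule grows. Principles and algorithms are given for dynamically updating the approximations of the variable precision rough set model when the universe changes; the algorithms are evaluated on public UCI data sets, which verifies their effectiveness. (Chapter 3)
2. A dominance-characteristic relation based rough set model is proposed that handles incomplete and ordered information at the same time. Upward (downward), multi-level (single-level) coarsening and refining of attribute values are defined for incomplete decision information systems, and the dynamic behavior of dominating and dominated classes is characterized under coarsening and refining at different levels and in different directions. Principles and algorithms are given for dynamically updating approximations in the dominance-characteristic relation based rough set when the attribute value domain changes. Experiments verify the effectiveness of the algorithms and show that their performance is related to the granularity of the dominating (dominated) classes. (Chapter 4)
3. The minimal discernibility attribute set is defined for the classical rough set model. The thesis characterizes how knowledge granularity and the related parameters change in complete decision systems when attribute values are coarsened or refined, reveals the dynamic behavior of the rule measures, and clarifies the relationship between the generalized decision of equivalence classes and the approximations. The concept of a decision feature matrix is proposed for the classical rough set model, and the changes in equivalence classes, their generalized decisions, attribute significance, and discernibility attributes during coarsening and refining are characterized. On this basis, principles and algorithms are given for updating rules by incrementally updating the decision feature matrix, the assignment discernibility matrix, and the minimal discernibility attribute set. Experiments verify the effectiveness of the algorithms. (Chapter 5)
4. The concepts of an equivalence class feature matrix and a characteristic value matrix are proposed for information systems, and the effects of adding attributes and of adding objects on granularity in the decision-theoretic rough set model are clarified. Because the two affect granularity differently, an information system in which objects and attributes change simultaneously is divided into three subspaces; for each subspace, the relationships among granules, the effect on the equivalence class feature matrix, and the principles of dynamic updating are given. Granule-based algorithms are designed for dynamically updating the approximations of the decision-theoretic rough set model when attributes and objects change simultaneously, and experiments verify their effectiveness. (Chapter 6)
The work extends the scope of research on rough set theory and its applications, and proposes principles and algorithms for granular computing-based incremental knowledge updating in dynamic data environments to improve the efficiency of knowledge discovery. It has theoretical and practical significance for research on big data processing.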

【Abstract】 With the rapid development of modern information technology, data of many types are increasing sharply. How to discover knowledge from big data in a timely and effective manner is an urgent problem. More importantly, data in different applications evolve dynamically: new data are inserted and old data are deleted continually, and erroneous data or values in the database need to be revised. How to update knowledge promptly and effectively while the data vary is therefore an important research topic, and it is particularly essential for big data and interactive real-time applications. Dynamic knowledge discovery aims to improve the efficiency of knowledge discovery and to satisfy the needs of different applications. Granular computing provides a divide-and-conquer, multi-level framework for data processing and is one of the effective theories for processing big data. This thesis studies several key problems of dynamic knowledge discovery based on rough set theory, using the methodology of granular computing. The main research work and innovations are as follows:
1. The dynamic properties of the measures of knowledge granularity and of partitions are described for the cases of inserting and deleting an object. The variation of knowledge granules is then presented for the cases in which the attribute domain stays unchanged and in which it changes. Furthermore, the variation of the relative degree of misclassification is investigated for the cases in which the knowledge granule increases (decreases) while the concept granule stays unchanged, and in which the knowledge granule increases (decreases) while the concept granule increases. Principles and algorithms are given for updating the approximations of the variable precision rough set model while the universe varies. Experiments on UCI data sets verify the effectiveness of the algorithms. (Chapter 3)
2. A dominance-characteristic relation based rough set is proposed, which can process incomplete and ordered information simultaneously. Upward (downward), multi-level (single-level) coarsening and refining of attribute values are defined in incomplete ordered decision systems. The dynamic properties of dominating classes and dominated classes are investigated under different cases of attribute value coarsening and refining. Principles and algorithms are then proposed for updating approximations under the dominance-characteristic relation based rough set model while the attribute value domain varies. Experimental results verify the effectiveness of the algorithms and show that their performance is related to the granularities of the dominating classes and dominated classes. (Chapter 4)
3. A minimal discernibility attribute set is defined in the classical rough set model. The properties of knowledge granules and the relevant parameters are described in complete decision systems under the coarsening or refining of attribute values, and the dynamic properties of the rule measures are presented. In addition, the relationship between the generalized decision of the equivalence classes and the approximations is investigated. A decision feature matrix is defined in the classical rough set model. The dynamic properties of equivalence classes, the generalized decision of the equivalence classes, the importance of attributes, and the discernibility attributes are investigated with regard to the coarsening or refining of attribute values. Furthermore, propositions and algorithms are given for updating rules by updating the decision feature matrix, the assignment discernibility attribute matrix, and the minimal discernibility attribute set. Experiments verify the effectiveness of the algorithms. (Chapter 5)
4. An equivalence feature matrix and a characteristic value matrix are defined in the information system. The variations of the granularities under the decision-theoretic rough set model are analyzed when attributes are added and when objects are added. The information system in which objects and attributes vary simultaneously is then decomposed into three subspaces according to the different effects of objects and attributes on the granularities. The relationships between granules, the effect on the equivalence feature matrix, and the properties of dynamically updating approximations are discussed for each subspace. Furthermore, granule-based algorithms are developed to update the approximations of the decision-theoretic rough set model dynamically while attributes and objects evolve simultaneously over time. Experimental results verify the effectiveness of the algorithms. (Chapter 6)
The work in this thesis extends the research field of rough set theory and its applications, and presents principles and algorithms based on granular computing for dynamic knowledge discovery, which improve the efficiency of knowledge discovery. The research achievements have theoretical and practical significance for the analysis of big data.
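
To make the idea behind item 1 concrete, the following is a minimal sketch in Python. It is not the thesis's algorithm. It assumes the standard variable precision rough set definitions: the relative misclassification rate of an equivalence class E with respect to a concept X is c(E, X) = 1 - |E ∩ X| / |E|; the β-lower approximation collects the classes with c ≤ β and the β-upper approximation those with c < 1 - β, for 0 ≤ β < 1/2. All function names and the toy data are hypothetical. The sketch shows why inserting one object only requires re-examining the single equivalence class that the object joins.

```python
from collections import defaultdict
from fractions import Fraction


def partition(table):
    """Group object ids into equivalence classes keyed by their attribute-value tuple."""
    classes = defaultdict(set)
    for obj, values in table.items():
        classes[values].add(obj)
    return classes


def misclassification(block, concept):
    """Relative misclassification rate c(E, X) = 1 - |E & X| / |E| (exact arithmetic)."""
    return 1 - Fraction(len(block & concept), len(block))


def vprs_approximations(classes, concept, beta):
    """Compute the beta-lower and beta-upper approximations from scratch."""
    lower, upper = set(), set()
    for block in classes.values():
        c = misclassification(block, concept)
        if c <= beta:
            lower |= block
        if c < 1 - beta:
            upper |= block
    return lower, upper


def insert_object(table, classes, concept, lower, upper, beta, obj, values, in_concept):
    """Insert one object and update everything in place.

    Only the class that `obj` joins can change its misclassification rate,
    so only that class is removed from and re-added to the approximations.
    """
    old_block = set(classes.get(values, set()))  # copy; empty if the class is new
    lower -= old_block
    upper -= old_block
    table[obj] = values
    classes[values].add(obj)
    if in_concept:
        concept.add(obj)
    block = classes[values]
    c = misclassification(block, concept)
    if c <= beta:
        lower |= block
    if c < 1 - beta:
        upper |= block


if __name__ == "__main__":
    # Hypothetical toy decision table: object id -> condition-attribute values.
    table = {1: ("a", 0), 2: ("a", 0), 3: ("a", 0), 4: ("b", 1), 5: ("b", 1), 6: ("c", 0)}
    concept = {1, 2, 4, 6}
    beta = Fraction(1, 4)
    classes = partition(table)
    lower, upper = vprs_approximations(classes, concept, beta)
    insert_object(table, classes, concept, lower, upper, beta, 7, ("a", 0), True)
    print(sorted(lower), sorted(upper))
```

Exact fractions are used instead of floats so that comparisons at the threshold β are not affected by rounding. Deleting an object could be handled symmetrically, again touching only the class the object leaves.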
