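Research on Soft Set Theory and Its Application in Knowledge Acquisition (软集理论及其在知识获取中的应用研究)

【Author】 Geng Shengling (耿生玲)

【Supervisor】 Li Yongming (李永明)

【Author information】 Shaanxi Normal University, Computer Software and Theory, 2013, PhD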

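【Abstract (translated from Chinese)】 With the rapid development of computer and network information technology and the rapid growth in the number and scale of databases, the massive data people face contain a great deal of uncertainty. In knowledge discovery, a number of mathematical theories and tools for handling uncertainty have emerged in turn, according to the different forms that uncertain information takes, such as the widely used probability theory, fuzzy set theory, and rough sets. Each of these theories has its own limitations, however, and it is difficult to handle every kind of uncertainty problem with any single one of them. Soft set theory, a newer mathematical tool for handling uncertainty, offers a unified mathematical model for several kinds of uncertainty (including randomness, fuzziness, incompleteness, and indiscernibility). Some studies indicate that soft sets are a common extension of fuzzy sets and rough sets and complement them well. Soft sets describe uncertainty from two sides, the universe of objects and the parameter space. Compared with fuzzy sets, rough sets, and other uncertainty theories, this allows richer information description and operations, which gives soft sets great potential in uncertain information theory, decision analysis, pattern recognition, data mining, and related fields. Research on handling uncertainty with soft sets is therefore of major theoretical and practical value. Building on an in-depth study of existing results in soft set theory, this thesis systematically analyzes the different meanings and methods of rough sets and soft sets in knowledge acquisition. Taking information systems and decision systems as the objects of study and starting from the key techniques of knowledge acquisition, it further explores soft set theory and related applications, aiming to enrich and develop the theory. The main work is as follows.

First, the thesis discusses two kinds of knowledge reduction for soft sets in knowledge acquisition: attribute reduction, which keeps the classification ability of the sets unchanged, and parameter reduction, which keeps the optimal decision object unchanged. Parameter reduction is used mainly for reduction problems in soft decision making, while attribute reduction is more widely applicable; soft set theory can realize both. With the practical usability of the algorithms as the starting point, the thesis discusses parameter reduction methods for soft sets and effective reduction algorithms. In addition, considering the general applicability of soft sets as a tool, it defines the attribute truth degree of the approximate function of a soft set. Using the initial approximate classification of the soft set and taking the attribute truth degree as heuristic information, the algorithm follows a top-down design and gradually adds important attributes until a reduct is obtained. The resulting soft-set attribute reduction algorithm not only yields the same attribute reducts as rough set theory but also achieves good reduction quality and high reduction efficiency with lower algorithmic complexity, making it more effective on practical problems.

Second, the thesis discusses an extended soft set model for knowledge dependency and a method for mining association rules. Recognizing the importance of association rule mining in applications of knowledge dependency, it gives a value soft set model better suited to mining value dependencies, introduces inclusion degree into association rule mining on soft set data, gives an inclusion degree measure for soft sets, and discusses the relationship between inclusion degree and confidence. On this basis it gives a method that uses inclusion degree to mine, from value soft sets of transaction data, soft association rules that satisfy given support and confidence thresholds, together with an algorithm for extracting maximal soft association rules. This demonstrates that extracting association rules by soft-set inclusion degree greatly reduces redundancy and improves efficiency, is better suited to mining meaningful value-dependency association rules, and handles multi-parameter, large-scale information tables more easily.

Finally, the thesis discusses an extended soft set model based on dominance relations for incomplete information systems, and methods for acquiring knowledge rules. For incomplete information systems it gives an extended decision soft set model for incomplete information and proposes the concept of a dominance decision description language. Dominance decision description formulas are used to obtain all certain decision rules in an incomplete decision soft set, and a soft discernibility matrix method is then used to extract and simplify credible rules, yielding optimal credible rules. The value of this approach is that soft set methods can obtain decision rules carrying richer information; the rules obtained are short and have high support, which helps ensure rules with strong generalization ability.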
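The idea of a parameter reduction that keeps the optimal decision object unchanged can be made concrete with a small sketch. This is not the thesis's algorithm, which the abstract does not spell out. It assumes the textbook view of a soft set as a mapping from parameters to subsets of the universe, scores each object by its choice value (the number of parameters it satisfies), and searches by brute force for the smallest parameter subsets that leave the top-scoring objects unchanged. The data and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy soft set (F, E) over universe U: each parameter in E maps
# to the subset of objects that satisfy it. The data is invented.
U = {"h1", "h2", "h3", "h4"}
F = {
    "cheap": {"h1", "h2"},
    "modern": {"h1", "h2", "h3"},
    "central": {"h1", "h3"},
    "garden": {"h1", "h4"},
}


def choice_values(soft_set, universe):
    """Count, for every object, how many parameters it satisfies."""
    return {x: sum(x in objs for objs in soft_set.values()) for x in universe}


def optimal_objects(soft_set, universe):
    """Return the objects with the maximum choice value."""
    scores = choice_values(soft_set, universe)
    best = max(scores.values())
    return {x for x, score in scores.items() if score == best}


def smallest_preserving_subsets(soft_set, universe):
    """Brute force: smallest parameter subsets with the same optimal objects.

    Exponential in the number of parameters, so only suitable for toy data.
    """
    target = optimal_objects(soft_set, universe)
    params = sorted(soft_set)
    for size in range(1, len(params) + 1):
        hits = [
            set(combo)
            for combo in combinations(params, size)
            if optimal_objects({p: soft_set[p] for p in combo}, universe) == target
        ]
        if hits:
            return hits
    return []


if __name__ == "__main__":
    print(optimal_objects(F, U))
    print(smallest_preserving_subsets(F, U))
```

On this toy data no single parameter singles out `h1`, but several pairs do, which shows why a reduct is generally not unique. A heuristic ordering of parameters, such as the attribute truth degree mentioned above, would replace the exhaustive search in any practical setting.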

【Abstract】 The number and scale of databases have been growing rapidly with the development of computer and network information technology. As a result, people are confronted with huge amounts of data containing abundant uncertainty. In knowledge discovery, several mathematical theories and tools deal with different forms of uncertain information, such as probability theory, fuzzy set theory, and rough set theory. However, because each of these theories has its own limitations, it is hard to use only one of them to deal with all uncertainty problems, especially when processing huge amounts of data. Soft set theory has been proposed as a unified framework for dealing with a variety of uncertainties (including randomness, fuzziness, incompleteness, and indiscernibility). As a common extension of fuzzy sets and rough sets, soft set theory complements them well. Soft set theory describes uncertainty by means of both the universe of objects and the parameter space, so it supports richer information description and operations than fuzzy set or rough set theory. Soft set theory therefore has considerable potential in uncertain information theory, decision analysis, pattern recognition, data mining, and other fields, and research on soft sets for uncertainty problems is significant both theoretically and practically. Based on a thorough investigation of existing results in soft set theory, this thesis gives a systematic analysis of the different meanings and methods of rough sets and soft sets in knowledge acquisition. Taking information and decision systems as the research objects and starting from the key technologies of knowledge acquisition, it further explores soft set theory and its applications, with the aim of enriching and developing the theory.

The main work of this thesis is as follows. Firstly, two kinds of knowledge reduction in knowledge acquisition are discussed: attribute reduction, which keeps the classification ability unchanged, and parameter reduction, which keeps the optimal decision object unchanged. Parameter reduction is mainly used to solve reduction problems in soft decision making, while attribute reduction is more widely used. The thesis discusses parameter reduction of soft sets and gives an effective reduction algorithm. The attribute truth degree of the approximate function of a soft set is defined. With the attribute truth degree as heuristic information, attributes are added gradually in a top-down way until the reduction result is obtained. The proposed attribute reduction algorithm for soft sets obtains the same result as rough set attribute reduction. It also has good reduction quality and high reduction efficiency, which makes it suitable for processing large data sets with redundant attributes.

Secondly, the thesis deals with association rule mining, an important task in practice, by extending the soft set model to a value soft set model. To make the soft set organization of tabular data more beneficial to data mining, the notions of inclusion degree, association rules, and maximal association rules between two sets of attributes are proposed. An effective approach that uses the inclusion degree of soft sets for association rule mining is presented, and its validity is verified by a comparative analysis on examples. The complexity of the algorithm is reduced. The method is better suited to mining meaningful value-dependency rules and deals more easily with information tables that have many parameters and large volumes of data.

Finally, for the acquisition of decision rules in incomplete information systems, optimal credible rules are obtained by means of an extended decision soft set model for incomplete information. The concept of a dominance decision description language is proposed, and dominance decision description formulas are used to obtain all certain decision rules in incomplete decision soft sets. A soft discernibility matrix method is then used to extract and simplify credible rules. The soft set method can obtain decision rules carrying richer information; these rules are short and have high support.
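As a rough illustration of how support, confidence, and inclusion degree relate in soft-set association rule mining, the sketch below treats transaction data as a soft set that maps each item (parameter) to the set of transactions containing it. The inclusion degree used here, |X ∩ Y| / |X|, is one common choice and is an assumption; the abstract does not state the thesis's own measure. Under this choice the confidence of a rule A ⇒ B is simply the inclusion degree of the transactions matching A in those matching B. All data and names are hypothetical.

```python
from functools import reduce

# Hypothetical transaction data organized as a soft set: each item (parameter)
# maps to the set of transaction ids that contain it.
TRANSACTIONS = {"t1", "t2", "t3", "t4", "t5"}
ITEMS = {
    "bread": {"t1", "t2", "t3", "t4"},
    "milk": {"t1", "t2", "t3"},
    "beer": {"t4", "t5"},
}


def matching(soft_set, params, universe):
    """Transactions that contain every item in params."""
    return reduce(set.intersection, (soft_set[p] for p in params), set(universe))


def inclusion_degree(x, y):
    """Assumed measure |x & y| / |x|; defined as 1.0 when x is empty."""
    return len(x & y) / len(x) if x else 1.0


def rule_metrics(soft_set, universe, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent => consequent."""
    a = matching(soft_set, antecedent, universe)
    b = matching(soft_set, consequent, universe)
    support = len(a & b) / len(universe)
    confidence = inclusion_degree(a, b)
    return support, confidence


if __name__ == "__main__":
    print(rule_metrics(ITEMS, TRANSACTIONS, {"milk"}, {"bread"}))
    print(rule_metrics(ITEMS, TRANSACTIONS, {"bread"}, {"milk"}))
```

A rule would be kept only if both values meet user-chosen thresholds. Because the same set intersections serve both metrics, the per-item transaction sets can be computed once and reused across candidate rules.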

  • 【Classification No.】TP18; TP311.13
  • 【Times cited】1
  • 【Downloads】384