Research on an Outlier Mining Method Based on Constrained Concept Lattices and Its Application
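【Author】 Jiang Yiyong (蒋义勇);
【Author information】 Taiyuan University of Science and Technology, Computer Software and Theory, 2007, Master's thesis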
【Abstract】 A concept lattice is an effective formal tool for data analysis and knowledge extraction, characterized by precision and completeness. A constrained concept lattice uses the user's interest in, and understanding of, the data set as background knowledge to guide lattice construction, so that the resulting structure is more targeted and more practical. This thesis studies the algebraic system of constrained concept lattices and outlier mining based on constrained concept lattices. The main work is as follows.
First, the algebraic system of constrained concept lattices. Using the supremum and infimum operations between nodes of a constrained concept lattice, an algebraic system is constructed and its algebraic properties are given. The completeness of knowledge representation in constrained concept lattices is proved, which lays a theoretical foundation for data mining and knowledge discovery, and in particular outlier mining, based on constrained concept lattices.
Second, an outlier mining algorithm based on constrained concept lattices is proposed. The intent reduction of each concept node is first treated as a subspace and its sparsity coefficient is computed. If the sparsity coefficient of a k-dimensional intent reduction is below the sparsity coefficient threshold set beforehand by the user, all of its (k-1)-dimensional proper subsets are examined to determine whether the subspaces they form are dense. Next, based on the sparsity coefficient and the density coefficient, the algorithm decides whether the objects in the extent of the concept node are outliers. Finally, experiments that take star spectra data from the LAMOST project as the formal context show that the algorithm, CLOM, is accurate, complete, and effective at mining deviating data in low-dimensional subspaces.
Third, building on the work above, a prototype outlier mining system for star spectra data is designed and implemented with VC++ 6.0 and Oracle 9i as development tools, and its module functions, software architecture, and key technologies are described in detail. The running results show that the system is feasible and valuable, and that it offers a new approach to outlier mining on star spectra data.
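The subspace test described in the second part of the abstract can be illustrated with a minimal Python sketch. It is not the thesis's CLOM implementation: the abstract does not define the sparsity or density coefficients, so the sketch assumes an Aggarwal-Yu style sparsity coefficient (observed extent size compared with the size expected if attributes were independent) and takes density to be the fraction of objects covered. The function names, the thresholds, and the toy formal context are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def extent(context, attrs):
    """Objects in the formal context that have every attribute in attrs."""
    return {obj for obj, has in context.items() if attrs <= has}


def sparsity_coefficient(context, attrs):
    """Assumed Aggarwal-Yu style coefficient, not the thesis's own definition.

    Negative values mean the attribute set covers fewer objects than
    expected if the attributes occurred independently.
    """
    n = len(context)
    p = 1.0  # expected fraction of objects under independence
    for a in attrs:
        p *= len(extent(context, {a})) / n
    if p <= 0.0 or p >= 1.0:
        return 0.0  # degenerate case: no variance to normalise by
    return (len(extent(context, attrs)) - n * p) / sqrt(n * p * (1 - p))


def density(context, attrs):
    """Assumed density measure: fraction of all objects in the extent."""
    return len(extent(context, attrs)) / len(context)


def outlier_candidates(context, attrs, sparse_threshold, dense_threshold):
    """Return the extent of attrs if attrs is sparse while every
    (k-1)-dimensional proper subset is dense; otherwise an empty set."""
    attrs = frozenset(attrs)
    if sparsity_coefficient(context, attrs) >= sparse_threshold:
        return set()  # the k-dimensional subspace is not sparse
    subsets = [frozenset(c) for c in combinations(attrs, len(attrs) - 1)]
    if all(density(context, s) >= dense_threshold for s in subsets):
        return extent(context, attrs)
    return set()


# Toy formal context (hypothetical): object -> set of attributes.
context = {f"o{i}": {"a"} for i in range(4)}
context["o4"] = {"a", "b"}
context.update({f"o{i}": {"b"} for i in range(5, 9)})
context["o9"] = set()

print(outlier_candidates(context, {"a", "b"}, -1.0, 0.4))
```

In this toy context the attributes `a` and `b` are each common, but they occur together only in `o4`, so the two-dimensional subspace is sparse while both one-dimensional subspaces are dense, and `o4` is reported as an outlier candidate.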
【Key words】 Constrained Concept Lattice; Algebra System; Outlier; Star Spectra; Sparsity Coefficient; Dense Coefficient;
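- 【Online publication contributor】 Taiyuan University of Science and Technology 【Online publication year and issue】 2007, Issue 04
- 【Classification code】 TP311.13
- 【Times cited】 1
- 【Downloads】 109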