基于目标函数的模糊聚类新算法及其应用研究

Research of New Fuzzy Clustering Algorithms Based on Objective Function and Its Applications

【Author】 Wang Qingmiao (汪庆淼)

【Advisor】 Ju Shiguang (鞠时光)

【Author information】 Jiangsu University, Computer Application Technology, 2014, PhD

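【Abstract, translated from Chinese】 Cluster analysis is an important branch of unsupervised classification in statistical pattern recognition. Driven by practical needs, it has developed rapidly over nearly three decades of research and application. Because fuzzy clustering describes the uncertain relationships between patterns more accurately, it has become a research focus within cluster analysis. Objective-function-based fuzzy clustering algorithms convert the clustering problem into a constrained mathematical optimization problem and determine the fuzzy partition of the data set and the clustering result by solving it. Such algorithms are intuitive, simple to design, effective, and easy to extend. They have been applied successfully in pattern recognition and classification, graphics and image processing, computer vision, and many other fields, and have become a research focus in data mining and machine learning. Fuzzy c-means (FCM) and possibilistic c-means (PCM) are two typical objective-function-based fuzzy clustering algorithms. This dissertation reviews the state of research on both and studies four aspects: fuzzy clustering of balanced and imbalanced data sets, generalization to multiple fuzzy indices, PSO-based generalization of the fuzzy index, and adaptive optimization of the fuzzy index. The main work is as follows:

(1) For classifying balanced or imbalanced data sets, the dissertation explains how cluster analysis differs from supervised classification with respect to the imbalanced-data problem, analyzes the basic properties that cluster analysis should satisfy on balanced or imbalanced data sets, and points out that unbalanced fuzzy clustering results arise from ignoring sample size. It proposes the concept, basic principle, and implementation of equalization for fuzzy clustering algorithms: an algorithm can be equalized by introducing the ignored sample-size information into its objective function. On this basis, FCM and PCM are equalized, giving the balanced FCM and balanced PCM algorithms. Because the resulting objective functions are complex, an iterative formula for the fuzzy memberships cannot be derived from gradient information, so particle swarm optimization (PSO), a swarm-intelligence algorithm, is introduced to estimate the memberships. This yields effective classification of balanced and imbalanced data sets in a unified form.

(2) Generalization to multiple fuzzy indices is studied. The dissertation analyzes the basic principle behind FCM convergence, explains how FCM is constructed to iterate through minimum points so that the objective function decreases monotonically, and reveals the relationship between multiple fuzzy indices and the original single fuzzy index, namely the relationship between non-steepest-descent and steepest-descent iteration paths. From this it proposes the concept of fuzzy-index generalization and a way to realize it. Applying the generalization to FCM and PCM gives generalized FCM and generalized PCM, of which the original algorithms are special cases. This extends the value range of the fuzzy index, admits multiple iteration paths, and enriches and improves the clustering results. The behavior of FCM in each range of m ≤ 1 is also analyzed, confirming from the opposite direction why FCM cannot take m ≤ 1.

(3) PSO-based generalization of the fuzzy index is studied. Building on the generalization above, the value range of the fuzzy index is analyzed. Because the FCM objective function requires the second-order Hessian matrix with respect to the fuzzy memberships to be positive definite, the fuzzy index m must be greater than 1. Theoretical analysis shows that when the memberships are estimated by PSO, the constraint can be relaxed to m > 0. This leads to the idea of PSO-based generalization of the fuzzy index, which is applied to FCM and PCM: PSO searches the membership solution space, removing the m > 1 requirement imposed by the gradient-derived membership update formula. This further expands the value space of the fuzzy index and improves the search path of the algorithms.

(4) On adaptive optimization of the fuzzy index, the dissertation summarizes and analyzes the categories, basic principles, and shortcomings of traditional methods for determining m, and discusses the interrelationship among the fuzzy index, the fuzzy memberships, and the cluster centers, and its significance for clustering algorithms. It argues that the value of the fuzzy index should be coupled with the iterative optimization of the memberships and cluster centers, and that it should be dynamic, adaptive, and such that the objective function has an extremum with respect to the fuzzy index. It proposes using PSO to optimize the fuzzy index adaptively based on the actual data. FCM and PCM are modified accordingly: their objective functions are reformulated so that an extremum with respect to the fuzzy index exists, and PSO estimates the fuzzy index and the memberships. This achieves dynamic, adaptive joint optimization of the fuzzy index, the memberships, and the cluster centers.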
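
For orientation, the m > 1 requirement mentioned in item (3) can be read off the standard textbook form of the FCM objective. The block below is a sketch in assumed notation (c clusters, n samples x_k, centers v_i, memberships u_ik), not the dissertation's own formulation, and it does not cover the balanced or generalized variants the dissertation proposes.

```latex
% Standard FCM objective (textbook form; all notation here is assumed)
J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} \, d_{ik}^{2},
\qquad d_{ik} = \lVert x_k - v_i \rVert,
\qquad \text{s.t. } \sum_{i=1}^{c} u_{ik} = 1, \quad u_{ik} \in [0, 1].

% Alternating updates obtained from the Lagrangian conditions, valid for m > 1
u_{ik} = \left[ \sum_{j=1}^{c} \left( \frac{d_{ik}}{d_{jk}} \right)^{2/(m-1)} \right]^{-1},
\qquad
v_i = \frac{\sum_{k=1}^{n} u_{ik}^{m} \, x_k}{\sum_{k=1}^{n} u_{ik}^{m}}.

% The Hessian of J_m with respect to the memberships is diagonal, with entries
\frac{\partial^{2} J_m}{\partial u_{ik}^{2}} = m (m - 1) \, u_{ik}^{m-2} \, d_{ik}^{2}.
```

For m > 0, u_ik > 0 and d_ik > 0, these diagonal entries are positive exactly when m > 1, which is why the gradient-derived membership update is restricted to that range.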

【Abstract】 Cluster analysis is an important branch of unsupervised classification in statistical pattern recognition, and it has grown rapidly over nearly three decades of research and application. Because fuzzy clustering describes the uncertain relationships between patterns more accurately, it has become a popular research area within cluster analysis. Objective-function-based fuzzy clustering algorithms formulate clustering as a constrained optimization problem and determine the fuzzy partition of the data set and the clustering result by solving it. Such algorithms have been widely applied in image processing, pattern recognition, and computer vision. Fuzzy c-means (FCM) clustering and possibilistic c-means (PCM) clustering are two typical objective-function-based fuzzy clustering algorithms. The dissertation reviews the research status of both algorithms and studies four areas: fuzzy clustering of balanced and imbalanced data sets, generalization to multiple fuzzy indices, PSO-based generalization of the fuzzy index, and adaptive optimization of the fuzzy index. The main results are as follows:

1. For the classification of balanced and imbalanced data sets, the dissertation analyzes how supervised and unsupervised classification differ with respect to imbalanced data, and sets out the basic properties that cluster analysis should satisfy on balanced or imbalanced data sets. It shows that unbalanced fuzzy clustering results are caused by ignoring sample size, and proposes the concept, basic principle, and method of equalization for fuzzy clustering algorithms: an algorithm can be equalized by introducing sample-size information into its objective function. Based on this principle, the dissertation equalizes the FCM and PCM algorithms and obtains the balanced FCM and balanced PCM algorithms. Because the objective functions are complex, gradient information cannot be used to derive an iterative formula for the fuzzy memberships. The dissertation therefore introduces particle swarm optimization (PSO) to estimate the fuzzy memberships, which lets the clustering algorithms classify both balanced and imbalanced data sets.

2. The dissertation studies the generalization of clustering algorithms to multiple fuzzy indices. It analyzes the basic principle behind the convergence of FCM and explains how the algorithm is constructed to iterate through minimum points so that the objective function decreases monotonically. The study reveals the relationship between multiple fuzzy indices and the original single fuzzy index, that is, the relationship between non-steepest-descent and steepest-descent iteration paths. On this basis, it proposes the concept of a generalized fuzzy index for clustering algorithms and applies it to FCM and PCM. The original algorithms become special cases of the generalized ones, the value range of the fuzzy index is extended, and multiple iteration paths become available, which improves both the clustering results and the iteration path.

3. The dissertation studies PSO-based generalization of the fuzzy index and discusses the value range of the fuzzy index. Because the FCM objective function requires the second-order Hessian matrix with respect to the fuzzy memberships to be positive definite, the fuzzy index m in FCM must be greater than 1. Theoretical analysis shows that the constraint can be relaxed to m > 0 if PSO is used to estimate the fuzzy memberships. Based on this finding, the dissertation proposes PSO-based generalization of the fuzzy index and applies it to the FCM and PCM algorithms. The method uses PSO to search for the optimal solution in the fuzzy membership space. It removes the gradient method's requirement that the fuzzy index be greater than 1 and further expands the value range of the fuzzy index in clustering algorithms.

4. The dissertation studies adaptive optimization of the fuzzy index. It first summarizes the traditional methods for determining the fuzzy index and analyzes their classification, basic principles, and shortcomings. It discusses the relationship among the fuzzy index, the fuzzy memberships, and the cluster centers, and shows that the value of the fuzzy index should be coupled with the iterative optimization of the memberships and cluster centers. The study points out that the fuzzy index should be dynamic and adaptive, and that the objective function should have an extremum with respect to it. The dissertation proposes using PSO to optimize the fuzzy index adaptively. The objective functions of FCM and PCM are modified so that such an extremum exists. The method uses PSO to estimate the fuzzy index and the fuzzy memberships, achieving adaptive optimization of the fuzzy index, the fuzzy memberships, and the cluster centers.
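
As a point of reference for the baseline the abstract builds on, the following is a minimal sketch of the standard FCM alternating iteration in plain Python. It is the textbook algorithm, not the balanced, generalized, or PSO-based variants proposed in the dissertation. The function name `fcm`, its parameters, and the deterministic initialization are all assumptions made for illustration.

```python
import math


def fcm(points, c, m=2.0, iters=100, tol=1e-6):
    """Textbook fuzzy c-means on a list of equal-length tuples (illustrative sketch).

    Returns (centers, u), where u[i][k] is the membership of point k in cluster i
    as computed in the last membership step.
    """
    if m <= 1.0:
        # The closed-form membership update below is only valid for m > 1.
        raise ValueError("the closed-form updates assume m > 1")
    n, dim = len(points), len(points[0])
    # Assumed deterministic initialization: pick c points spread over the input order.
    centers = [list(points[(i * (n - 1)) // max(c - 1, 1)]) for i in range(c)]
    u = [[0.0] * n for _ in range(c)]
    for _ in range(iters):
        # Membership step: u_ik = 1 / sum_j (d_ik / d_jk)^(2 / (m - 1)).
        for k, x in enumerate(points):
            d = [math.dist(x, v) for v in centers]
            zeros = [i for i, di in enumerate(d) if di == 0.0]
            for i in range(c):
                if zeros:
                    # A point sitting exactly on a center belongs fully to it
                    # (shared equally if several centers coincide).
                    u[i][k] = 1.0 / len(zeros) if i in zeros else 0.0
                else:
                    u[i][k] = 1.0 / sum((d[i] / dj) ** (2.0 / (m - 1.0)) for dj in d)
        # Center step: v_i = sum_k u_ik^m x_k / sum_k u_ik^m.
        shift = 0.0
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            new = [sum(w[k] * points[k][j] for k in range(n)) / total for j in range(dim)]
            shift = max(shift, math.dist(new, centers[i]))
            centers[i] = new
        if shift < tol:  # stop once no center moved noticeably
            break
    return centers, u


if __name__ == "__main__":
    data = [(0.0,), (0.2,), (0.1,), (10.0,), (10.2,), (10.1,)]
    centers, memberships = fcm(data, c=2)
    print(centers)
```

The `m <= 1` guard reflects the restriction discussed in item 3: the closed-form update has no meaning there, which is the gap the dissertation proposes to fill by estimating memberships with PSO instead. The sketch also does nothing to handle a cluster that ends up with zero total weight, which a production implementation would need to address.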

  • 【Online publication contributor】 Jiangsu University
  • 【Online publication year and issue】 2014, Issue 08