
基于免疫遗传算法和粒子群算法的聚类研究

Research on Clustering Based on Immune Genetic Algorithm and Particle Swarm Optimization

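【Author】 Sun Yang (孙洋)

【Advisor】 Luo Ke (罗可)

【Author information】 Changsha University of Science and Technology (长沙理工大学), Computer Application Technology, 2010, Master's thesis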

【摘要】 随着信息科学技术的发展,人们越来越倾向于选择用计算机来统计和管理数据,数据库的规模也随之不断地扩大。当人们积累了大量的商业数据以后,如何从汪洋大海般的数据中发现有价值的信息成为一个急需解决的重要问题。由此数据挖掘技术应运而生,它是目前数据库和信息决策领域最前沿的研究方向之一。聚类分析作为数据挖掘的一个重要分支,是通过分析数据的相似性把大型数据集合划分成组,使得同一个组里面的数据彼此最为相似,而不同组中的数据彼此相异。聚类是发现有用信息的一种有效手段。目前,聚类分析已经广泛地应用于模式识别,数据分析,图像处理以及市场研究等领域。目前在文献中存在大量的聚类算法。算法的选择取决于数据的类型、聚类的目的和应用。本文探讨了基于免疫遗传算法和基于粒子群算法的C—均值聚类方法。所做的主要工作如下:1.用免疫遗传算法完成聚类工作。首先,分析现有遗传算法的优缺点,将免疫机制引入遗传算法,用来克服了标准遗传算法的早熟现象;其次,将C—均值算法和免疫遗传算法有机结合,形成一种混合算法;最后,根据聚类问题的实际情况设计遗传选择、交叉和变异算子,使得混合算法更快、更有效地收敛到全局最优解。2.用改进后的粒子群算法实现聚类。首先,分析现有粒子群算法的优缺点;其次,将局部搜索能力强的C—均值算法和基于遗传算法的交叉、变异操作同时结合到粒子群算法中;最后,通过适当调节,发挥各自的优点。既提高了PSO算法的局部搜索能力,又因为增加了种群的多样性,防止了算法的早熟。3.将改进后的算法选择一些数据集用MATLAB编程做聚类实验,并与其他算法结果进行对比,分析试验结果。

【Abstract】 With the development of information science and technology, people increasingly rely on computers to collect and manage data, and databases have grown steadily as a result. Once large volumes of business data have accumulated, finding valuable information in that vast body of data becomes an urgent problem. Data mining emerged in response, and it is now one of the most active research directions in databases and information-based decision making. Cluster analysis, an important branch of data mining, partitions a large data set into groups by analyzing the similarity of the data, so that data within a group are as similar as possible while data in different groups are dissimilar. Clustering is an effective means of discovering useful information, and it is widely used in pattern recognition, data analysis, image processing, market research, and other fields. The literature contains a large number of clustering algorithms; the choice among them depends on the type of data and on the purpose and application of the clustering. This thesis studies C-means clustering based on the immune genetic algorithm and on particle swarm optimization (PSO). The main work is as follows:
1. Clustering with an immune genetic algorithm. First, the strengths and weaknesses of existing genetic algorithms are analyzed, and an immune mechanism is introduced into the genetic algorithm to overcome the premature convergence of the standard genetic algorithm. Second, the C-means algorithm is combined with the immune genetic algorithm to form a hybrid algorithm. Finally, selection, crossover, and mutation operators are designed to suit the clustering problem, so that the hybrid algorithm converges to the global optimum faster and more effectively.
2. Clustering with an improved particle swarm algorithm. First, the strengths and weaknesses of existing particle swarm algorithms are analyzed. Second, the C-means algorithm, which has strong local search ability, and genetic-algorithm-style crossover and mutation operations are both incorporated into the particle swarm algorithm. Finally, with appropriate tuning, each component contributes its strengths: the local search ability of PSO is improved, and the added population diversity prevents premature convergence.
3. Clustering experiments on selected data sets are programmed in MATLAB using the improved algorithms, and the results are compared with those of other algorithms and analyzed.
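The abstract does not give the update rules, the encoding, or the parameter settings, so the following is only a minimal sketch of the general idea in item 2, written in Python rather than the MATLAB used in the thesis. It assumes each particle encodes `c` candidate cluster centers, that fitness is the usual C-means objective (sum of squared distances to the nearest center), and that one C-means refinement step plus a simple mutation is applied after each PSO move. The function names, the parameter values (`w`, `c1`, `c2`, `p_mut`), and the mutation rule are illustrative assumptions, and the crossover operator mentioned in the abstract is omitted for brevity.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    """C-means objective: total squared distance to the nearest center."""
    return sum(min(sq_dist(p, ctr) for ctr in centers) for p in points)

def c_means_step(points, centers):
    """One C-means update: move each center to the mean of its assigned points."""
    labels = [min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
              for p in points]
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(old))  # an empty cluster keeps its center
    return new_centers

def hybrid_pso_cluster(points, c, n_particles=10, iters=30,
                       w=0.7, c1=1.5, c2=1.5, p_mut=0.1, seed=0):
    """PSO over sets of c centers, with mutation and a C-means local step.

    All parameter defaults are assumptions, not values from the thesis.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle starts as c centers drawn from the data points.
    swarm = [[list(rng.choice(points)) for _ in range(c)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(c)] for _ in range(n_particles)]
    pbest = [[list(ctr) for ctr in part] for part in swarm]
    pbest_fit = [sse(points, part) for part in swarm]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [list(ctr) for ctr in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            # Standard PSO velocity and position update, coordinate by coordinate.
            for k in range(c):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][k][d] = (w * vel[i][k][d]
                                    + c1 * r1 * (pbest[i][k][d] - swarm[i][k][d])
                                    + c2 * r2 * (gbest[k][d] - swarm[i][k][d]))
                    swarm[i][k][d] += vel[i][k][d]
            # GA-style mutation (assumed form): reset one center to a random data point.
            if rng.random() < p_mut:
                swarm[i][rng.randrange(c)] = list(rng.choice(points))
            # Local search: one C-means refinement step.
            swarm[i] = c_means_step(points, swarm[i])
            fit = sse(points, swarm[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [list(ctr) for ctr in swarm[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [list(ctr) for ctr in swarm[i]], fit
    return gbest, gbest_fit
```

A call such as `hybrid_pso_cluster(points, c=2)` returns the best set of centers found and its objective value. In this sketch the mutation step is what injects diversity and the C-means step is what sharpens each candidate locally, which mirrors the division of roles the abstract describes.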
