节点文献

图论在聚类分析中的应用

Some Applications of Graph Theoretic Approach in Clustering Analysis

【作者】 张春明

【导师】 高敬振;

【作者基本信息】 山东师范大学 , 应用数学, 2004, 硕士

【摘要】 本文讨论聚类分析中的系统聚类法、模糊聚类法和灰色聚类法,着重探讨其聚类的图论方法。 第一章 介绍聚类分析的基本知识。 第二章 讨论系统聚类分析及其中著名的最短距离法,指出了最短距离法即为图论中求赋权连通图最小支撑树Kruskal算法的变形,得到了最短距离法的优良性质: 定理2.1对最短距离法和k=n-1,n-2,…,1成立。它说明最短距离法得到的系统H=(Pn,Pn-1,…,P1)对每个m=n-1,n-2,…,2,Pm既是分离性最好的m-剖分,又是相似性最好的m-剖分。 第三章讨论模糊聚类分析及其常用的传递闭包法,指出这种聚类方法的复杂性,定义了模糊关系图G=(X,E,(?))及其中两点u,v间的连通强度S(u,v),证明了S是X上的模糊等价关系、G=(X,E,(?))的最大支撑树具有如下性质: 定理3.2 设T是模糊关系图G=(X,E,(?))的一棵支撑树,则以下陈述等价: (1)T是G的最大支撑树; (2)对于任意的e’∈E(T),从T中移出e’得T1,T2,则w(e’)=(?)w(e); (3)对任意的u,v∈X(u≠v),T中u,v之间的惟一路是G中最优(u,v)路。并建立了G=(X,E,(?))中任两点的连通强度和(?)的传递闭包t((?))之间的关系:定理“.“设X二!xl,花,…,xn}的传递闭包t(尽)=尽”=(‘(”,),、。图,则,尽=(动nx,是X上的模糊相似关系,尽G=(X,E,豹是与<X,尽>相对应的模糊关系t(B从,x’)二稍=s(x,.,xj.)·由此仿照图论中求赋权连通图最小支撑树的Dijkstra算法,给出了模糊聚类的最大支撑树算法如下:设模糊相似矩阵尽=(份)nx,,聚类水平为兄,令人表示树的结点集,记份二r(气,xj(l)任选一点vl:X,记鸿={vl},令l(vl)=o,P(vl)==。,l(v):=r(vl,v),P(v):=vl(丫,。X、鸿),k:=(2)算z(v)誊z(v·),令凡十,一人。{v·},对每个;。x\人*,,若 r(v’,v)>l(v),则l(v):=r(v’,v),P(v):=v’. (3)若k二n一1,据指针P(.)倒向追踪得最大支撑树T;T去掉w(e)<兄的 所有边e后的各连通分支即为X在之水平上的一个分类,停,否则,- 令k片k+1,回(2).而且其复杂性仅为口(矛),最后通过实例进行了验证. 第四章讨论灰色关联聚类分析及其传统的逐行检查法,分析用这种方法进行灰色关联聚类的复杂性为口(m4).仿照模糊聚类分析的最大支撑树方法中连通强度的定义和相关结论,给出了赋权图G=(犷,E,D)中两点u,v间的连通强度S(。,‘)及其最大支撑树的一个性质·定义了G=(V,E,D)的兄一截图叹和连通闭包G两个概念,得到了如下结论:定理4.2设赋权图G=(代E,D),则任取兄任[0,l],G;=认.由此分析说明了指标集X中两点u,v在兄水平上能够归为一类,等价于它们在关联图G=(X,E,A)的最大支撑树T的兄一截图瓜中连通·据此建立了灰色关联聚类的最大支撑树方法,其复杂性仅为口(mZ),有效降低了运算的复杂性,最后通过实例进行了验证. 关键词:系统聚类;模糊聚类;灰色关联聚类;最小支撑树;最大支撑树 中图分类号:O巧7.6

【Abstract】 In this paper, we discuss hierarchical clustering methods, fuzzy clustering methods and grey clustering methods in cluster analysis, especially graph theoretic approach.Chapter one introduces the basic knowledge of cluster analysis.In Chapter two we discuss the hierarchical clustering analysis and the famous minimum distance method, and suggest that the method is a transformation of the Kruskal algorithm for the minimum spanning tree of a weighted connected graph in graph theory, obtain a good property of minimum distance method:Theorem 2.1 for minimum distance method and k = n-1,n-2,...,1.It implies that any Pm for m=n~1, n-2,...,2 in the system H = (Pn,Pn-1,...,P1)obtained by the minimum distance method is not only a separation best m -partition, but also a similarity best m -partition.In Chapter three we discuss fuzzy clustering analysis and the well-known transitive closure method, indicate the complexity of the method, define theconnected intensity S(u,v) between vertices u,v in the fuzzy relation graph G = (X,E,R), prove that S is a fuzzy equivalence relation and the maximumspanning tree of G=(X,E, R) has the following properties:Theorem 3. 2 If Tis a spanning tree of fuzzy relation graph G = (X,E,R),then the following statements are equivalent:(1) T is a maximum spanning tree of G;(2) For any e’ ∈ E(T), let T1, T2 be the two components of T - e’, then(3) For any u,v∈(u≠v), the unique path L between u,v in T is an optimal (u,v) path of G.and develop the relation between the connected intensity in G = (X,E,R) and the transitive closure t(R) of R :Theorem 3.3 If X = {X1,X2,...,Xn} , R = (rij)n×n is a fuzzy resembling relation of X, the transitive closure of R is t(R) = Rn =(rij(n))n×n,G=(X,E,R) is the responding fuzzy relation graph of (X, R), thenFrom above we imitate the Dijkstra algorithm for the minimum spanning tree of a weighted connected graph, give an algorithm of the fuzzy clustering with maximum spanning tree:If is a fuzzy resembling matrix and X is the clustering level, letting Ak be a set of vertices in the tree and rij = r(xi ,xj), then(1) Choose a vertex v1∈X, let A1 = {v1}, l(v1) = 0, p(v1) = , l(v):=r(v1,v),(2) Calculate max for any ifr(v*,v)>l(v), then l(v):=r(v*,v) and p(v):=v*.(3) If k=n-1, use the point function p(·) to find out back tracking a maximum spanning tree T of G; each connected component of T with all edges e of w(e) < λ removed is a cluster of X on the level λ, stop.Otherwise, let k := k +1 and return to (2). The complexity of the algorithm is only O(n2). We check the algorithm by anexample finally.In Chapter four we discuss grey correlation clustering analysis and the traditional line-by-line check-up method, point out that the complexity of the methodis O(m4), by imitating the definitions of the connected intensity and related results in the maximum spanning tree algorithm of fuzzy clustering analysis, we give the definition of connected intensity in a weighted graph G = (V, E, D) and property of its maximum spanning tree. Furthermore we define the λ - cross graph Gλ and the connected closure G of G = (V,E, D), obtain a property of Gλand G:Theorem 4. 2 If G = (V,E,D) is a weighted graph, then for any λ∈[0,l],-By which we indicate that vertices u and v in X clustered together in level λ, if and only if they are connected in the λ- cross graph Tλ of a maximumspanning tree T of the correlation graph G = (X, E, A). From this we develop the maximum spanning tree algorithm of grey correlation clustering analysis with the complexity is only O(m), which reduce the amount of operation effectively. Finally we check the algorithm by an example.

  • 【分类号】O157.5
  • 【被引频次】3
  • 【下载频次】586
节点文献中: 

本文链接的文献网络图示:

本文的引文网络