可扩展的数据中心网络互联关键技术研究

Research on the Key Technology of Scalable Data Center Network Interconnection

【Author】 Huang Xin (黄鑫)

【Advisor】 Peng Yuxing (彭宇行)

【Author information】 Wuhan University, Communication and Information Systems, 2014, PhD

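【Abstract, translated from the Chinese】 Data centers are essential infrastructure for cloud computing and other emerging application services. As the core of a data center, the data center network interconnects tens of thousands to hundreds of thousands of servers and provides efficient network communication and data transfer for the computing services above it. The rapid growth of cloud computing and similar services places new demands on key interconnection technologies, including scalability, routing protocols, and fault tolerance, which has made new data center network designs an active research topic in recent years. This thesis studies data center networks from several angles: network topology and fault-tolerant routing, data placement, data query, and network connection complexity.
Current data center network topologies are designed with scalability as the primary goal. They offer high parallelism and strong fault tolerance, but they also suffer from high cost, complex wiring, and difficult maintenance. To address these problems, this thesis proposes MyHeawood, a scalable, low-cost, modular containerized data center network. MyHeawood uses low-end commodity two-port servers and small switches, and builds a large-scale data center network through a hierarchical, recursive definition. The basic idea is as follows. A MyHeawood0 consists of n two-port servers connected directly to one small switch. Fourteen MyHeawood0 units are connected according to the Heawood graph to form a MyHeawood1, which defines the interconnection of servers within a cabinet. Fourteen MyHeawood1 units are connected according to the Heawood graph to form a MyHeawood2, which defines the interconnection of servers across cabinets within a container. For interconnection between containers, two topologies are designed: one connects 14 MyHeawood2 units according to the Heawood graph to form a MyHeawood3; the other is switch-based and can interconnect an arbitrary number of MyHeawood2 units. A fault-tolerant routing algorithm is designed on top of the MyHeawood structure. Analysis and experiments show that MyHeawood has a low construction cost, a short average path length, and strong fault tolerance.
As infrastructure for storing and processing data, a data center must handle massive volumes of data, and this capability depends on an efficient and durable data placement strategy in the data center network. Targeting the MyHeawood topology and the characteristics and requirements of data placement in data centers, this thesis proposes a data placement method suited to the features of the MyHeawood topology. The method uses a three-replica strategy and places the replicas on servers that are close to one another but lie in different MyHeawood sub-layers. A family of hash functions maps the first replica r0 to a server in MyHeawood3. The second replica r1 is placed on a server that is directly connected to the server holding r0 and lies in a different MyHeawood0 within the same MyHeawood1. The third replica r2 is placed on a MyHeawood0 server that is directly connected to the server holding r1 and lies in a different MyHeawood2. Experiments show that this placement method achieves good load balance and query efficiency.
Servers in the data center both store data and participate in routing, so a server failure can cause a data query to fail. To address this problem, this thesis proposes an efficient, distributed fault-tolerant query algorithm for the case of target node failure. The basic idea is to first compute the addresses of all target nodes and then choose the target node to query according to distance. If that target node has failed, the query goes to the replica node closest to the failed target node; if both of these replicas have failed, the query goes to the node storing the third replica. Experiments show that the algorithm has good fault tolerance.
To address the problem of evaluating the performance of data center network interconnection, this thesis proposes the notion of connection complexity for large-scale data center network topologies, together with a method for measuring it. The measure covers four aspects: node naming, connection mode, recursiveness of the topology, and connection complexity. The complexity of node naming reflects how strongly different naming schemes affect complexity for the same topology. The complexity of the connection mode captures how the wiring rules affect the connection complexity of the topology. Recursiveness is defined as follows: a topology is recursive if adding nodes does not change the connections of the original topology; the better the recursiveness, the lower the complexity. Connection complexity is defined from the perspective of maintenance. A complexity analysis of typical structures, namely the ring, complete graph, mesh, 2-D torus, De Bruijn graph, hypercube, fat tree, butterfly network, DCell, BCube, and MyHeawood, shows that the method can effectively compute the connection complexity of a network topology. It also shows that connection complexity is an important metric that must be considered when building a data center.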
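
The recursive definition in the abstract can be made concrete with a small sketch. The Python below builds the 14-vertex Heawood graph from its standard LCF notation [5, -5]^7, checks two properties that make it a compact building block (every vertex has degree 3, and the diameter is 3), and counts servers under the pure Heawood recursion, where each level multiplies the number of MyHeawood0 units by 14. The function names are illustrative assumptions, and the sketch does not reproduce the thesis's actual server-to-server wiring, which the abstract does not specify.

```python
from collections import deque


def heawood_edges():
    """Return the 21 edges of the Heawood graph using LCF notation [5, -5]^7."""
    edges = set()
    for v in range(14):
        # Hamiltonian cycle 0-1-2-...-13-0.
        edges.add(frozenset((v, (v + 1) % 14)))
        # Chords: even vertices jump +5, odd vertices jump -5 (mod 14).
        chord = (v + 5) % 14 if v % 2 == 0 else (v - 5) % 14
        edges.add(frozenset((v, chord)))
    return edges


def adjacency(edges):
    """Build an adjacency map {vertex: set of neighbours} for 14 vertices."""
    adj = {v: set() for v in range(14)}
    for edge in edges:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)
    return adj


def diameter(adj):
    """Longest shortest path in the graph, found by BFS from every vertex."""
    best = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best


def server_count(n, level):
    """Servers in a MyHeawood of the given level under pure Heawood recursion.

    Level 0 holds n servers; each higher level joins 14 units of the level below.
    """
    return n * 14 ** level
```

For example, with n = 4 servers per MyHeawood0, a MyHeawood3 built purely by Heawood recursion would contain 4 × 14^3 = 10,976 servers.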

【Abstract】 The data center is an important infrastructure for new data-intensive computing services such as cloud computing. Its core component is the data center network, which interconnects tens to hundreds of thousands of servers and provides efficient network communication and data transmission for computing services. The rapid development of application services based on the cloud computing model has raised new requirements for key network technologies such as scalability, routing protocols, and fault tolerance. Hence, novel data center networks have become a research focus in recent years. This thesis studies network topology, fault-tolerant routing, data placement, fault-tolerant query, and network connection complexity for the data center.
Current topology designs for data center networks target scalability. They offer high parallelism and strong fault tolerance, but they also suffer from high cost, complex wiring, and difficult maintenance. To address these problems, the thesis proposes a scalable, low-cost, modular containerized data center network named MyHeawood. MyHeawood is based on low-end commodity two-port servers and small switches, and builds a large-scale data center network in a hierarchical, recursive way. The basic idea is that a MyHeawood0 is built from a small switch directly connected to n two-port servers. Fourteen MyHeawood0 units are connected according to the Heawood graph into a MyHeawood1, which is the interconnection between servers within a cabinet. Fourteen MyHeawood1 units are connected according to the Heawood graph into a MyHeawood2, which is the interconnection between cabinets within a container. For interconnection between containers, we design two network topologies. In one, 14 MyHeawood2 units are connected according to the Heawood graph into a MyHeawood3. The other is based on switches and can interconnect any number of MyHeawood2 units. Based on the MyHeawood structure, we design a fault-tolerant routing algorithm. Analysis and experimental results show that MyHeawood has low cost, a short average path length, and strong fault tolerance.
As an infrastructure, the data center must be able to store and process massive amounts of data, which depends on an efficient and persistent data placement strategy in the data center network. Oriented to the MyHeawood network topology, and based on the characteristics and requirements of data placement in the data center, the thesis proposes a data placement method for the MyHeawood network based on a three-replica strategy. The replicas are placed on three nodes, each holding one replica; the nodes are close to one another but lie in different sub-layers of MyHeawood. The first replica r0 is mapped directly to a server of MyHeawood3 by a constructed family of hash functions. The second replica r1 is placed on a server that is directly connected to the server holding r0 and lies in a different MyHeawood0 of the same MyHeawood1. The third replica r2 is placed on a MyHeawood0 server that is directly connected to the server holding r1 and lies in a different MyHeawood2. Experiments show that the data placement method achieves good load balance and query efficiency.
A server in the data center not only stores data but also participates in routing, so a server failure can cause a data query to fail. To address this problem, the thesis proposes an efficient, distributed fault-tolerant query algorithm for target node failure. The basic idea is first to calculate the addresses of all target nodes and then to choose the target node to query according to distance. If that target node has failed, the replica node closest to the failed node is queried instead. If both of these replicas have failed, the node storing the third replica is looked up. Experiments show that the algorithm has strong fault tolerance.
To address the problem of evaluating the performance of data center network interconnection, we propose the connection complexity of large-scale data center network topologies and a method for measuring it, covering node naming, connection mode, recursiveness of the topology, and connection complexity. The complexity of node naming reflects the influence of different naming schemes on complexity within the same topology. The complexity of the connection mode shows how the wiring rules affect the connection complexity of the topology. Recursiveness of the topology is defined as follows: when nodes are added to a topology without changing its original connections, the topology is recursive. The better the recursiveness, the lower the complexity. Connection complexity is defined from the perspective of topology maintenance. A complexity analysis of typical structures, including the ring, complete graph, mesh, 2-dimensional torus, De Bruijn graph, hypercube, fat tree, butterfly network, DCell, BCube, and MyHeawood, shows that the method can effectively calculate the connection complexity of a network topology. It also shows that the connection complexity of the network topology is an important evaluation parameter that must be considered in data center construction.
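
The fault-tolerant query idea described in the abstract (pick the nearest replica first; on failure, move to the replica nearest the failed node; then fall back to the last one) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the tuple addresses, the coordinate-difference `toy_distance`, and the `is_alive` predicate are all hypothetical stand-ins, since the abstract does not give the MyHeawood addressing scheme or distance metric.

```python
def toy_distance(a, b):
    """Hypothetical distance: number of address coordinates that differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def fault_tolerant_query(source, replicas, distance, is_alive):
    """Return the address of the first live replica to query, or None.

    source   -- address of the querying server
    replicas -- addresses of all replica holders (computed up front)
    distance -- function(addr, addr) -> number
    is_alive -- function(addr) -> bool, True if the server responds
    """
    remaining = list(replicas)
    if not remaining:
        return None
    # First attempt: the replica nearest to the querying server.
    current = min(remaining, key=lambda r: distance(source, r))
    while True:
        remaining.remove(current)
        if is_alive(current):
            return current
        if not remaining:
            return None  # every replica holder has failed
        # Next attempt: the replica nearest to the node that just failed.
        failed = current
        current = min(remaining, key=lambda r: distance(failed, r))


# Illustrative addresses only; they do not follow the thesis's naming scheme.
r0, r1, r2 = (3, 1, 4, 0), (3, 1, 5, 2), (3, 7, 2, 1)
source = (3, 1, 4, 9)
down = {r0}  # pretend the server holding r0 has failed
target = fault_tolerant_query(source, [r0, r1, r2], toy_distance,
                              lambda addr: addr not in down)
```

With the server holding r0 marked as failed, the query falls through to r1, the replica closest to the failed node under the toy distance. Choosing the next target relative to the failed node rather than the source keeps the redirected query in the same neighbourhood of the topology, which matches the abstract's description of placing replicas on nearby servers.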

  • 【Online publication contributor】 Wuhan University
  • 【Online publication issue】 2014, Issue 09