Research on Key Software Low-Power Techniques for Large-Scale Parallel Computing Systems

Research of Software Low-power Optimization in Large-scale Parallel Computing System

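[Author] Dong Yong (董勇)

[Advisor] Yang Xuejun (杨学军)

[Author information] National University of Defense Technology, Computer Science and Technology, 2012, PhD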

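[Abstract] Power consumption has become one of the major constraints on improving the performance of large-scale parallel computing systems. Excessive power and energy consumption harms system operation in several ways, including higher failure rates, lower reliability, and higher operating costs. Research on power optimization for large-scale parallel computing systems therefore has real practical significance.

Power optimization for large-scale parallel computing systems has been pursued at every level of system design and implementation, including circuit design, logic design, architecture, system software, and application software. Hardware techniques such as dynamic voltage scaling (DVS) and component shutdown provide the basis for implementing software power optimization. Software low-power optimization has several advantages: it does not depend on the hardware platform, it is more flexible, and it is highly portable. The computing system and the communication system are the core of a large-scale parallel computing system and are also the focus of its power optimization. Addressing software power optimization techniques for large-scale parallel computing systems, this thesis studies energy optimization of compute nodes based on loop scheduling and energy optimization of the interconnection network based on network topology partition.

Energy optimization of compute nodes is an important part of energy optimization for large-scale parallel computing systems. This thesis studies compute-node energy optimization based on OpenMP loop scheduling and, by combining DVS with scheduling algorithms, proposes two classes of power optimization algorithms: performance-constrained energy optimization and energy-constrained performance optimization. For performance-constrained energy optimization, the thesis improves the block-cyclic static scheduling algorithm and proposes the Energy Saving Optimal Static Scheduling (EOSS) algorithm. Taking into account the effect of cache misses on memory access latency, it further proposes the Improved Energy Optimal Static Scheduling (IEOSS) algorithm. For energy-constrained performance optimization, the thesis reduces loop execution time through loop scheduling under a limited energy supply and proposes the Energy Constrained Performance Optimal Static Scheduling (ECPOSS) algorithm. The thesis proves the optimality of EOSS and ECPOSS and validates the effectiveness of these algorithms experimentally.

Energy optimization of the interconnection network matters for the energy optimization of the whole system. Static energy consumption is the main component of interconnection network energy consumption in large-scale parallel computing systems, and shutting down network components is an important technique for reducing it. This thesis proposes interconnection network energy optimization based on network topology partition. It first analyzes router occupancy in the two dimensions of space and time and introduces the concept of network topology partition based on routing rules. It then proposes topology partition methods for Nd-mesh, Nd-torus, and fat-tree networks under deterministic routing, oblivious adaptive routing, and fully adaptive routing rules. On this basis it proposes the key techniques for implementing static energy management based on network topology partition, which shut down idle routers in both the space and time dimensions. Based on the TH-1A software framework, the thesis proposes and implements a static energy management scheme for the interconnection network based on network topology partition, builds a verification environment that combines virtual and real components, and validates the proposed methods with extensive experimental results.

The main contributions of this thesis are as follows:

1. It proposes parallel loop scheduling methods guided by energy optimization. From the two perspectives of performance-constrained energy optimization and energy-constrained performance optimization, it gives two classes of OpenMP loop scheduling energy optimization algorithms, represented by the Energy Saving Optimal Static Scheduling (EOSS) algorithm and the Energy Constrained Performance Optimal Static Scheduling (ECPOSS) algorithm respectively. It proves the optimality of EOSS and ECPOSS and validates the effectiveness of both classes of algorithms through experimental evaluation.

2. It proposes the idea of shutting down routers based on network topology partition. In the space dimension it analyzes the direct and indirect occupancy of routers by a job, and in the time dimension it analyzes the continuous occupancy of routers by a job. On this basis it analyzes the independent occupancy of routers by multiple jobs and thereby introduces the concept of network topology partition to guide router shutdown.

3. It proposes network topology partition methods for typical networks and typical routing rules. It gives partition theorems and algorithms for Nd-mesh, Nd-torus, and fat-tree networks governed by deterministic routing, oblivious adaptive routing, and fully adaptive routing rules.

4. It proposes the key techniques for implementing static energy management based on network topology partition, including the setting of non-shutdown regions, a topology-aware resource allocation policy, and spatial fragment management. Based on the TH-1A software framework, it proposes and implements an interconnection network static energy management scheme based on network topology partition and builds a verification environment that combines virtual and real components. The experimental results validate the effectiveness of the proposed methods.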
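To make the idea of performance-constrained energy optimization concrete, the following is a minimal sketch of slack reclamation with DVS on a statically scheduled loop. It is not the EOSS algorithm from the thesis, whose details the abstract does not give. The function names, the discrete frequency list, and the energy model (energy per cycle proportional to the square of the frequency, which assumes supply voltage scales linearly with frequency) are all assumptions made for illustration.

```python
def pick_frequencies(work_cycles, freqs):
    """Choose, for each thread, the lowest frequency that still meets the deadline.

    work_cycles: cycles assigned to each thread by a static schedule (hypothetical input).
    freqs: available DVS frequency levels (hypothetical, any consistent unit).
    The deadline is the finish time of the most loaded thread at the highest
    frequency, so the loop as a whole does not get slower.
    """
    f_max = max(freqs)
    deadline = max(work_cycles) / f_max
    chosen = []
    for w in work_cycles:
        # f_max is always feasible, so this list is never empty.
        feasible = [f for f in freqs if w / f <= deadline]
        chosen.append(min(feasible))
    return deadline, chosen


def dynamic_energy(work_cycles, chosen):
    """Relative dynamic energy, assuming energy per cycle is proportional to f**2."""
    return sum(w * f ** 2 for w, f in zip(work_cycles, chosen))


# Example: four threads with unbalanced work (gigacycles), three levels (GHz).
work = [4, 2, 1, 4]
levels = [1.0, 1.5, 2.0]
deadline, chosen = pick_frequencies(work, levels)
baseline = dynamic_energy(work, [max(levels)] * len(work))
scaled = dynamic_energy(work, chosen)
```

In this toy case the lightly loaded threads run at the lowest level while the loop still finishes at the same time, which is the kind of trade-off a performance-constrained energy optimization exploits. A real scheduler would also account for static power and for the cost of switching frequency, which this sketch ignores.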

【Abstract】 Power consumption has become one of the most important constraints on the performance enhancement of large-scale parallel computing systems. Excessive power and energy consumption has many negative effects on system operation, including a higher failure rate, lower reliability, and higher running cost. Studying power optimization for large-scale parallel computing systems therefore has important practical significance.

Power optimization for large-scale parallel computing systems has been extended to every level of system design and implementation, including circuit design, logic design, architecture, system software, and applications. Dynamic voltage scaling (DVS) and component shutdown in hardware devices provide the implementation basis for software power optimization. Software-level low-power optimization has many advantages, such as hardware-platform independence, greater flexibility, and good portability. The computing system and the communication system are the core components of a large-scale parallel computing system, so they are the focus of power optimization. From the viewpoint of low-power software techniques for large-scale parallel computing systems, this thesis studies energy optimization of compute nodes based on loop scheduling and energy optimization of the interconnection network based on network topology partition.

The energy optimization of compute nodes is an important part of the energy optimization of large-scale parallel computing systems. This thesis studies energy optimization of compute nodes based on OpenMP loop scheduling. By combining DVS with scheduling algorithms, the thesis proposes two kinds of power optimization algorithms: performance-constrained energy optimization and energy-constrained performance optimization. Performance-constrained energy optimization improves the block-cyclic static scheduling algorithm and yields the Energy Saving Optimal Static Scheduling (EOSS) algorithm. The Improved Energy Optimal Static Scheduling (IEOSS) algorithm is then proposed, which considers the impact of cache misses on memory access latency. Energy-constrained performance optimization is represented by the Energy Constrained Performance Optimal Static Scheduling (ECPOSS) algorithm, which reduces execution time through loop scheduling within a given energy constraint. The optimality of EOSS and ECPOSS is proved, and the experimental results validate the effectiveness of these algorithms.

The energy optimization of the interconnection network is important for the energy optimization of the whole system. Static energy consumption makes up the main part of interconnection network energy consumption, and shutting down network components is an important technique for reducing it. This thesis proposes energy optimization of the interconnection network based on network topology partition. First, the thesis analyzes the occupancy of routers in the space and time dimensions and introduces the concept of network topology partition based on routing rules. Then the thesis proposes network topology partition methods for Nd-mesh, Nd-torus, and fat-tree networks under three kinds of routing rules: deterministic routing, oblivious adaptive routing, and fully adaptive routing. Finally, the thesis describes the key techniques of static energy management for the interconnection network based on network topology partition, which shut down idle routers in the space and time dimensions. Within the software framework of TH-1A, the thesis proposes and implements a static energy management scheme for the interconnection network and constructs a verification environment that combines virtual and real components. Extensive experiments demonstrate the effectiveness of the proposed methods.

The main contributions of this thesis are as follows.

1. It proposes parallel loop scheduling methods guided by energy optimization. From the viewpoints of performance-constrained energy optimization and energy-constrained performance optimization, the thesis presents two kinds of OpenMP loop scheduling energy optimization algorithms, represented by Energy Saving Optimal Static Scheduling (EOSS) and Energy Constrained Performance Optimal Static Scheduling (ECPOSS). The optimality of these two algorithms is proved, and the experimental results validate the effectiveness of both kinds of algorithms.

2. It proposes the idea of shutting down routers based on interconnection network topology partition. The thesis analyzes the direct and indirect occupancy of routers by parallel jobs in the space dimension and the continuous occupancy of routers in the time dimension. It also analyzes the independent occupancy of routers by multiple jobs, which leads to the concept of network topology partition as a guide for shutting down routers.

3. It proposes network topology partition methods for typical networks with typical routing rules. Partition theorems and algorithms are presented for Nd-mesh, Nd-torus, and fat-tree networks under three kinds of routing rules: deterministic routing, oblivious adaptive routing, and fully adaptive routing.

4. It proposes the key techniques for static energy management based on interconnection network topology partition, including the setting of regions in which routers cannot be powered off, a topology-aware resource allocation policy, and spatial fragment management. Within the software framework of TH-1A, the thesis proposes and implements a static energy management scheme for the interconnection network based on network topology partition and constructs a verification environment that combines virtual and real components. The experimental results validate the effectiveness of the proposed energy optimization of the interconnection network.
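The notions of direct and indirect router occupancy can be illustrated with a small sketch. The code below assumes a 2D mesh with deterministic dimension-order (XY) routing and computes which routers any message between a job's nodes can traverse; every other router is a candidate for shutdown. This is only an illustration of the concept under those assumptions, not the partition theorems or algorithms of the thesis, and all function names are hypothetical.

```python
from itertools import permutations


def xy_route(src, dst):
    """Routers visited by XY routing: travel along x first, then along y."""
    x, y = src
    dx, dy = dst
    path = [(x, y)]
    while x != dx:
        x += 1 if dx > x else -1
        path.append((x, y))
    while y != dy:
        y += 1 if dy > y else -1
        path.append((x, y))
    return path


def occupied_routers(job_nodes):
    """Routers a job occupies directly (its own nodes) or indirectly (on a route)."""
    occupied = set(job_nodes)
    for src, dst in permutations(job_nodes, 2):
        occupied.update(xy_route(src, dst))
    return occupied


def idle_routers(width, height, jobs):
    """Routers in a width x height mesh that no running job can touch."""
    all_routers = {(x, y) for x in range(width) for y in range(height)}
    busy = set()
    for nodes in jobs:
        busy |= occupied_routers(nodes)
    return all_routers - busy


# Example: one job on two nodes of a 4x4 mesh.
job = [(0, 0), (2, 1)]
busy = occupied_routers(job)
idle = idle_routers(4, 4, [job])
```

Here the job occupies two routers directly and four more indirectly, because the XY routes in the two directions differ, leaving ten routers idle. Under adaptive routing the set of reachable routers is larger, which is why the partition depends on the routing rule.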
