节点文献

基于图着色的存储层次优化技术研究

Research on Graph Coloring Based Optimization Technologies for Memory Hierarchies

【作者】 邓宇

【导师】 杨学军;

【作者基本信息】 国防科学技术大学 , 计算机科学与技术, 2007, 博士

【摘要】 处理器与存储器的性能差距导致了“存储墙”问题的出现,使得存储系统成为计算机系统的瓶颈。从目前工艺水平和体系结构技术的发展趋势来看,这种差距还会继续增加,因此在未来可预测的范围内,对存储系统的优化将一直是提高计算机系统性能的关键技术之一。本文着重研究了如何将图着色理论应用于各级存储层次的优化问题,在cache、以流寄存器文件为代表的片上大容量寄存器文件和主存三方面提出了创新的编译时优化方法。本文取得的主要研究成果如下:(1)提出了一个基于图着色的cache优化算法—Cache Coloring。该算法根据访存行为将程序中的数据划分成若干数据对象,然后根据数据对象的大小将cache划分为一个带有别名的伪寄存器文件,每个伪寄存器由若干cache行组成,可以容纳一个数据对象;最后使用一个经过改进的图着色寄存器分配算法来决定这些对象在cache中的位置以及发生冲突时的替换关系。数据对象的划分将cache的管理分为两个层次,一个是编译时编译器对粗粒度的数据对象的管理,另一个是运行时硬件对细粒度的cache行的管理,这样编译器和硬件的优势都得到发挥。我们构造了比传统的生命周期相干图蕴涵更多相干信息的冲突矩阵,作为处理寄存器分配冲突时的指导原则。我们基于GCC进行了实现,并通过simplescalar构造了支持CacheColoring的硬件模拟平台。实验结果表明Cache Coloring能较好的开发程序的局部性,降低cache失效率。(2)提出了一个基于图着色的流寄存器文件分配算法—SRF Coloring。该算法通过寄存器划分将流寄存器文件转换为大小和位置固定的传统寄存器文件,从而将流寄存器文件的分配问题归结为一个可以采用图着色寄存器分配算法解决的问题。针对流寄存器文件的硬件机制和程序访问流寄存器文件的特点,我们对已有的图着色算法进行了扩充,使之能够更加有效地进行流寄存器文件的分配。为了解决因流太长而导致相干图不可着色的问题,我们提出了重用优先的双缓冲策略。我们在SF95Compiler编译框架中实现了SRF Coloring。SF95Compiler是一个为FT64及其编程语言SF95开发的编译器。实验结果表明,SRF Coloring能够有效地管理流寄存器文件。(3)提出了一个基于区间着色的主存分配算法—MM Coloring。我们以数组作为分配主存空间的候选,将主存分配问题归结为区间着色问题。由于一般的区间着色问题是NP完全问题,我们利用程序所具有的一个性质,即数组生命周期的包含性来降低区间着色的难度,将一般的区间着色问题简化为超完美图的区间着色问题,并据此提出了实现最优区间着色的判定条件以及实现算法。当不满足最优区间着色的条件时,可以通过分割数组的生命周期来使之得到满足。我们提出了两种分割数组生命周期的策略。一种是自底向上的积极分割策略,它首先将数组的生命周期分割到最小,然后在满足着色条件的前提下逐步合并生命周期;另一种是自顶向下的被动分割策略,它在一开始将数组的生命周期保持为最长,只有在不满足着色条件时才选择某些生命周期进行分割。我们基于GCC进行了实现。模拟实验结果表明,该算法是一种有效的管理主存的编译方法。

【Abstract】 The performance gap between processor and memory causes the problem known as "Memory Wall", which makes the memory system become the bottleneck of the computer system. The developing trends of current semiconductor and computer architecture techniques show that this gap will keep increasing. Consequently, the optimization to the memory system will be one of the key technologies to improve the overall performance of the computer system in the visible future.This thesis focuses on applying the graph coloring theory to the management optimizing problem of the memory hierarchies. Novel management approaches in compile-time have been presented in the aspects of cache, large on-chip register files represented by Stream Register File (SRF) and the main memory. Primary innovative works in this paper could be summarized as follows:(i) A graph coloring based management optimizing algorithm for cache, namely Cache Coloring, has been proposed. This algorithm first partitions the data into several data objects according to their memory accessing behaviors. Then it partitions the cache into a pseudo register file with alias according to the size of the data objects. Each pseudo register in this register file can hold one of the data objects. Finally, it uses an extended graph coloring register allocation algorithm to determine the position of each data object in the cache and their replacement relationship. The data object partitioning divides the management of cache into two levels, one for the coarse-granularity management of the data objects in the compile-time and the other for the fine-granularity management of the cache lines in the run-time. So the advantages of both compiler and hardware are exploited. The conflict matrix which contains more information about interference than the traditional live-range interference graph is constructed as a guide to deal with the conflicts of register allocation. Cache Coloring is implemented in GCC. A hardware simulation platform which supports Cache Coloring is built based on simplescalar. The primary experimental results show that Cache Coloring can exploit the locality well and reduce the cache miss rate.(ii) A graph coloring based SRF allocation algorithm, namely SRF Coloring, is proposed. This algorithm formulates the problem of SRF allocation into a register allocation problem which can be solved by a well understood graph coloring algorithm by partitioning the SRF into a traditional register file with fixed position and length. The existing graph coloring algorithm is extended and modified to address the unique aspect of the SRF allocation problems. A reusing-first double-buffering strategy is presented to address the uncoloralbe problem caused by very long streams. The SRF Coloring algorithm is implemented in the compiler framework of SF95Compiler, which is developed for FT64 stream processor and its programming language SF95. The experimental results demonstrate that SRF Coloring represents a promising compiler approach for the SRF management.(iii)An interval coloring based allocation algorithm for main memory, namely MM Coloring, is proposed. The main memory allocation problem is mapped to an interval coloring problem by taking arrays as candidates for memory allocation. As the general interval coloring problem is NP-complete, a property of the program, namely the containment of arrays’ live-ranges, is used to decrease the hardness of interval coloring and simplify it as an interval coloring problem for superperfect graph. Based on this, a judgment theorem for optimal interval coloring together with the implementation algorithm is proposed. Live-range splitting can be used to make the graph satisfy the optimal interval coloring condition. Two strategies for live-range splitting are proposed. One is a top-down aggressive strategy which first splits the live-ranges into minimum and then merges them gradually when the coloring condition is satisfied. The other one is a bottom-up passive strategy which keeps the live-ranges as long as possible and splits some of them only when the coloring condition is not satisfied. MM Coloring with both aggressive and passive strategies is implemented in GCC. Modeling tests show that it is an efficient approach for compile-time memory allocation.

节点文献中: