
Research on Storage Schemes for High-Performance Computers

【Author】 李恩有 (Li Enyou)

【Advisors】 夏培肃 (Xia Peisu); 张祥 (Zhang Xiang); 刘志勇 (Liu Zhiyong)

【Author information】 Graduate School of the Chinese Academy of Sciences (Institute of Computing Technology), Computer Architecture, 1997, PhD

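【Abstract (translated from the Chinese)】 With advances in semiconductor technology, the access speed of main memory devices can no longer keep up with the data access demands of processors. Parallel memory systems and hierarchical memory systems are therefore widely used in all kinds of computer systems to raise the average access speed of the memory system as a whole. In practice, however, traditional parallel and hierarchical memory systems do not always achieve what is expected of them, because processor accesses cause memory bank conflicts in parallel memory systems and cache line conflicts in cache systems. Further study shows that the address mapping methods used in parallel and hierarchical memory systems have a large effect on their performance. XOR skewing schemes are a very effective class of nonlinear skewed storage schemes. After studying many XOR schemes of practical value, the author proposes the LR-XOR skewing scheme. A parallel memory system using LR-XOR can access in parallel not only the contiguous-data access patterns that a traditional interleaved parallel memory system already handles in parallel, but also many patterns common in scientific and engineering applications: the rows, columns, main P×Q blocks, and scattered P×Q blocks of an N×N matrix, as well as main vectors with stride 2^i and shifted main vectors with stride 2^i. This can greatly raise the average access speed of the parallel memory system. Building on an in-depth analysis of cache system structure, the thesis applies XOR skewed mapping to the address mapping of data caches. Theoretical analysis shows that using the EE-XOR and LR-XOR mapping schemes in the cache mapping lets all data elements of many commonly used access patterns in scientific and engineering applications reside in the cache at the same time, so that more of the data reuse in an application is turned into reuse of data held in the cache. This greatly raises the average access speed of the hierarchical memory system and lets the processor's computing power be fully used. As an original contribution, the author implements the EE-XOR skewing scheme in the mapping mechanism of a cache system, so that the cache can make full use of the locality of memory accesses during program execution. In the Pentium 和平 experimental system designed by the author, the second-level cache mapping uses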

【Abstract】 With the rapid development of semiconductor technology, the gap in access cycle time between fast microprocessors and relatively slow main memory systems has become increasingly serious. Computer designers use parallel memory systems and hierarchical memory systems to reduce the average access time of the memory system. However, practical experience has shown that the traditional interleaved parallel memory architecture and hierarchical memory systems do not serve well the data access patterns most frequently used in a wide variety of application algorithms. This is because these patterns can produce many memory bank conflicts in a traditional parallel memory system, or cache line conflicts in a cache memory system. The address mapping methods used in parallel memory systems and cache memory systems have an important effect on the conflict rates. XOR schemes are a set of nonlinear skewed memory allocation schemes that can be used in parallel memory systems. After careful study of earlier schemes, we present an XOR scheme named the LR-XOR scheme. In a parallel memory system with N = 2^i memory banks, the processing units can access most of the data patterns frequently used in scientific and engineering programs in a single memory access cycle. These include the rows, columns, main P×Q blocks, and scattered P×Q blocks of an N×N matrix, as well as main vectors and shifted main vectors with stride 2^i. These parallel access properties reduce the memory system's average data access time. Comparing the behavior of parallel memory systems with that of cache memory systems, we found that memory schemes used in parallel memory systems should also have good properties when used as cache mapping schemes. Our theoretical analysis shows that if we use the EE-XOR or
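The general idea behind XOR skewing can be illustrated with a minimal sketch. It is not the LR-XOR or EE-XOR scheme of the thesis, whose definitions are not given in this record. It assumes the simplest textbook mapping, in which element (row, col) of an N×N matrix is stored in bank row XOR col, with N a power of two, and compares it with conventional low-order interleaving of a row-major matrix. All function names are hypothetical.

```python
def interleaved_bank(row, col, n):
    """Conventional low-order interleaving of a row-major n x n matrix."""
    return (row * n + col) % n


def xor_bank(row, col, n):
    """Basic XOR skewing (illustrative assumption, not LR-XOR): bank = row XOR col.

    For n a power of two and 0 <= row, col < n the result is already below n.
    """
    return row ^ col


def conflict_free(banks):
    """A parallel access is conflict-free if every element lies in a distinct bank."""
    return len(set(banks)) == len(banks)


def rows_and_columns_conflict_free(bank_of, n):
    """Return (rows_ok, cols_ok) for all rows and all columns of an n x n matrix."""
    rows_ok = all(
        conflict_free([bank_of(r, c, n) for c in range(n)]) for r in range(n)
    )
    cols_ok = all(
        conflict_free([bank_of(r, c, n) for r in range(n)]) for c in range(n)
    )
    return rows_ok, cols_ok


if __name__ == "__main__":
    n = 8  # number of banks and matrix dimension, a power of two
    print("interleaved:", rows_and_columns_conflict_free(interleaved_bank, n))
    print("xor skewed: ", rows_and_columns_conflict_free(xor_bank, n))
```

Under plain interleaving every element of a column falls in the same bank, so column accesses are serialized. Under the XOR mapping each row and each column touches all N banks exactly once. This basic mapping does not make P×Q block accesses conflict-free; that takes a more elaborate scheme of the kind the abstract describes.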
