Research on Key Mechanisms of Parallel Computing in the Triplet-Based Architecture

【Author】 Li Jiaxin (李嘉欣)

【Advisor】 Shi Feng (石峰)

【Author information】 Beijing Institute of Technology, Computer Software and Theory, 2010, Ph.D.

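【Summary】 Chip multi-processors (CMPs) have become one of the main ways to improve computer performance, and parallel computing on multi-core processors is a current focus of high-performance computing research. It also poses a number of difficulties, including how to make full use of multi-core processing resources and how to help programmers write parallel programs on top of a multi-core architecture. The Triplet-Based Architecture (TriBA) is an object-oriented multi-core processor architecture, and research on parallel computing in TriBA faces the same problems. This dissertation studies several key technologies involved in parallel computing in TriBA: aid tools for parallel program design on multi-core processors, and the scheduling of parallel data streams and data transfer in multi-core architectures based on a Network-on-Chip (NoC). The research content and main results are as follows:

1. The design of an aid tool for parallel program design on multi-core architectures, the Feedback Parallel Programming Framework (FPPF), is proposed and partially implemented. The main idea of FPPF is to help programmers reason at a lower level of program design, understand certain hardware features, and choose a suitable design, so that they write parallel programs suited to a specific multi-core architecture and improve program performance. FPPF is component-based: architectures and their corresponding algorithms exist as templates, and programmers can combine, modify, or create architecture templates to construct candidate solutions for a parallel program in advance. FPPF evaluates and compares these solutions so that the programmer can select a better one for further design, which reduces the burden of repeated modification, debugging, and verification. Existing tools can also be added to FPPF as components or modules.

2. The traversal properties of the TriBA on-chip network topology are proved, covering both Hamiltonian paths and minimum spanning trees in TriBA. The concept of a streaming model is defined, and several streaming models of TriBA are constructed. In FPPF, streaming models can be used to build the user interface component, helping programmers understand the topological features of the architecture. They can also be used for the top-level design of parallel programs and for scheduling parallel data streams.

3. A method for describing parallel architectures, the Hierarchy Parallel Computing Model (HPCM), is proposed. HPCM is a self-nesting, multi-level description method that can flexibly describe a parallel architecture and its mode of operation at different granularities. A method for evaluating the performance of parallel solutions based on HPCM at different levels of precision is also proposed. HPCM and its evaluation method can be used to build the architecture template library component and the static evaluation engine component of FPPF.

4. A design method called Graph State Select (GSS) is proposed for the Concurrent Multi-direction Data Switch Structure (CMDSS), the key component for parallel data transfer in the on-chip network of a multi-core processor. GSS uses the topological features of the on-chip network to extract the basic states of the multi-direction data switch structure. A control and scheduling algorithm, FG-NC, is proposed and implemented; it uses the states extracted by GSS to construct control codes for the switch structure, improving its parallelism under given hardware conditions. The InterUnit of TriBA is redesigned using GSS to provide efficient support for the parallel transfer of unicast, multicast, and broadcast data.

5. A set of methods that use the topological features of a multi-core processor's on-chip network to schedule data streams is proposed: Stream Schedule based on Topology Features (SSTF). SSTF mainly consists of a divide strategy and a select strategy. The divide strategy applies when the inherent load in the architecture is light; when the inherent load is heavier, the select strategy uses topology weights to assist the divide strategy in scheduling data stream tasks. Taking the application of SSTF to the triplet-based network as an example, the dissertation computes the topology weights of the triplet-based network and analyzes the divide strategies with and without transfer latency.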

【Abstract】 Chip multi-processors (CMPs) have become one of the most important ways to improve computer performance. CMP-based parallel computing is a hot research topic, and also a difficult one that programmers have to face. The difficulties concern how to make use of the large and diverse computing resources that CMPs provide, and how to help programmers design parallel applications for CMPs. The Triplet-Based Architecture (TriBA) is an object-oriented CMP architecture, and many of these problems also exist in TriBA. This dissertation studies several key technologies relevant to parallel computing in TriBA, namely aid tools for parallel program design on CMPs, and parallel data stream scheduling and data transfer in CMPs based on a Network-on-Chip (NoC). The research content and results of this dissertation are as follows:

1. An aid tool for parallel program design on CMPs is proposed, called FPPF (Feedback Parallel Programming Framework). The main idea of FPPF is to help programmers think at a lower level while programming, so that they can learn some hardware features, choose more suitable solutions, develop parallel programs that fit a specific CMP architecture, and eventually improve program performance. FPPF is composed of components. Several architectures and their corresponding algorithms are stored in FPPF as patterns. Programmers can combine, modify, or create patterns to configure their solutions, which FPPF evaluates and compares to find a better one for further design. This process reduces the burden of repeatedly revising, debugging, and verifying. Existing tools can also be added to FPPF as components or modules.

2. The traversal properties of the NoC topology in TriBA are proved, covering Hamiltonian paths and minimum spanning trees. The concept of a streaming model is defined, and TriBA's streaming models are constructed. In FPPF, a streaming model can be used to construct the user interface component. It helps programmers understand the topology features of a specific architecture, and it can also be used for top-level program design and for scheduling parallel data streams.

3. HPCM (Hierarchy Parallel Computing Model), a method for describing parallel architectures, is proposed. HPCM is a self-nesting description of hierarchical parallel architectures. It can describe parallel architectures and their running patterns at several granularities. A method for evaluating parallel solutions based on HPCM at different granularities is also introduced. HPCM and its evaluation method can be used to construct the architecture pattern library and the static evaluation engine in FPPF.

4. A method for designing the Concurrent Multi-direction Data Switch Structure (CMDSS), a key component for parallel data transfer, is proposed. The method is called GSS (Graph State Select). GSS uses the topology features of the NoC to extract the basic states of the CMDSS. A control and scheduling algorithm called FG-NC is also introduced and implemented. FG-NC transforms the states found by GSS into control codes, thus improving the parallelism of the CMDSS. The InterUnit in TriBA is redesigned using GSS, and the new InterUnit efficiently supports parallel unicast, multicast, and broadcast data transfers.

5. Methods that schedule data streams using features of the architecture topology, including weights computed for every edge, are proposed. These methods are called SSTF (Stream Schedule based on Topology Features). SSTF mainly includes Divide and Select methods. Divide methods are suitable for an architecture with little load. When the load is heavier, the Select method can be used to assist the Divide method in completing the data stream scheduling. The use of SSTF in Triplet-based Interconnection Networks (THIN) is taken as an example. The topology weights of THIN are computed, and the Divide methods are analysed both with and without transfer latency.
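The traversal properties named in item 2 can be illustrated with a small sketch. The record does not define the TriBA topology, so the graph below is an assumption: a level-1 network is a triangle of three cores, and a level-k network joins three level-(k-1) networks with three extra links between their corner nodes. The function names `build_triplet_graph`, `hamilton_path`, and `bfs_spanning_tree` are hypothetical, and the backtracking and breadth-first searches are generic textbook techniques, not the proofs given in the dissertation. With unit link weights, any spanning tree is a minimum spanning tree, which is why a plain BFS tree suffices here.

```python
from collections import deque
from itertools import product


def build_triplet_graph(level):
    """Build an ASSUMED triplet-based hierarchical topology.

    Nodes are digit tuples over {0, 1, 2} of length `level`. Two nodes are
    linked when they share a prefix, differ at position h, and each node's
    remaining digits all equal the other node's digit at position h.
    """
    nodes = list(product(range(3), repeat=level))
    adj = {n: set() for n in nodes}
    for s in nodes:
        for h in range(level):
            for d in range(3):
                if d == s[h]:
                    continue
                # s can cross toward sub-network d only from its d-corner.
                if all(x == d for x in s[h + 1:]):
                    t = s[:h] + (d,) + (s[h],) * (level - h - 1)
                    adj[s].add(t)
                    adj[t].add(s)
    return adj


def hamilton_path(adj, start):
    """Return a path visiting every node once, found by backtracking, or None."""
    path = [start]
    visited = {start}

    def extend():
        if len(path) == len(adj):
            return True
        for nxt in sorted(adj[path[-1]]):
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                if extend():
                    return True
                path.pop()
                visited.remove(nxt)
        return False

    return path if extend() else None


def bfs_spanning_tree(adj, root):
    """Return the edges of a BFS spanning tree rooted at `root`."""
    seen = {root}
    tree = []
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):
            if v not in seen:
                seen.add(v)
                tree.append((u, v))
                queue.append(v)
    return tree


graph = build_triplet_graph(2)  # 9 cores under the assumed topology
print(hamilton_path(graph, (0, 0)))
print(bfs_spanning_tree(graph, (0, 0)))
```

Backtracking is exponential in the worst case, so this sketch is only meant for small levels; it checks that such a path exists on a given instance and does not replace a constructive proof.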
