Research on Compilation and Optimization for OpenMP Programs (OpenMP编译与优化技术研究)

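【Author】 Chen Yongjian (陈永健)

【Advisors】 Wang Dingxing (王鼎兴); Shu Jiwu (舒继武)

【Author Information】 Tsinghua University, Computer Science and Technology, 2004, Ph.D.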

【Abstract】 This dissertation studies compilation and optimization techniques for OpenMP programs.

The first part addresses source-level optimization of OpenMP programs. Its main goal is to transform simple fork-join style OpenMP programs into SPMD style OpenMP programs, which express the program's parallelism more efficiently. The optimizations covered are schedule parameter optimization for parallel loops; a parallel region expansion and merging algorithm; and, building on that algorithm, the elimination of redundant directives (in particular redundant synchronization) and the optimization of the data attributes of variables in parallel regions. The main contributions of this part are:

  • A new schedule parameter optimization algorithm for parallel loops, which determines a near-optimal schedule scheme. It weighs the effect of the schedule parameters on the various kinds of overhead in an OpenMP program and, in particular, takes into account the requirements that back-end optimizations place on those parameters, so it guards more effectively against the performance degradation caused by unsuitable schedule parameters.
  • A new parallel region expansion and merging algorithm for forming SPMD regions. It differs from similar methods in two ways. First, it expands aggressively, resolving the variable data attribute conflicts that arise during merging by privatizing variables and computation. Second, it can hoist parallel regions across procedure boundaries. The result is larger parallel regions, which offer more optimization opportunities.
  • New algorithms for optimizing SPMD regions in OpenMP programs, covering synchronization and variable data attributes. The former reduces the overhead caused by redundant directives and synchronization operations; the latter merges private variables, which lowers space overhead and can further improve memory locality.

The second part addresses efficient compilation of OpenMP programs. The main contributions of this part are:

  • A translation and optimization framework for OpenMP programs, built on a global nesting type analysis of OpenMP directives. It translates and optimizes directives more effectively, eliminates part of the overhead, and improves the performance of the runtime library.
  • An OpenMP compilation and optimization system for IA64/Linux, implemented on top of this analysis and translation framework. It serves as a research platform for high performance computing and thread-level parallelism on related platforms, and as part of a larger OpenMP development environment. Tests show that the system is functionally fairly complete and performs well, and they confirm the effectiveness of the proposed optimization and translation algorithms.

  • 【Online Publication Contributor】 Tsinghua University
  • 【Online Publication Issue】 2005, Issue 03