Research on the Application of Loop Transformation to Auto-vectorization
(循环变换技术在自动向量化中的应用研究)

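【Author】 Huang Lei (黄磊)

【Advisor】 Yao Yuan (姚远)

【Author information】 PLA Information Engineering University, Computer Software and Theory, 2011, Master's thesis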

【Abstract】 In recent years, with the rapid development of multimedia applications, many high-performance microprocessors have adopted SIMD extensions. SIMD extensions lack a unified specification for describing their instructions, so programmers must understand the structure of the program in depth and also master the extension architecture and instruction set of the target platform. This makes writing SIMD programs by hand challenging, and auto-vectorization in the compiler is therefore an important way to address the problem. Auto-vectorization requires the compiler to analyze the source program thoroughly and to convert vectorizable loops or statement blocks into semantically equivalent SIMD instructions. Targeting the shortcomings of current mainstream compilers in auto-vectorization, and building on traditional loop transformation methods, this thesis studies loop transformation techniques that improve the effectiveness of auto-vectorization. First, based on data dependence analysis, it analyzes the legality of vectorizing individual statements, and designs and implements a statement vectorization identification algorithm together with a loop distribution algorithm built on that identification. Second, because the scalar and vector units of a SIMD architecture can work in parallel, it studies loop transformations for hybrid (scalar plus vector) parallelism, and designs and implements a simple, general loop selection and fusion algorithm. For cases where fusion fails or does not reach the expected performance, it proposes a corresponding segmented unrolling strategy, which improves the utilization of hardware resources. Third, to address the weaknesses of automatic vectorizing transformations on complex applications, it studies interactive transformation methods based on information identification, and designs and implements an interactive loop transformation tuning framework. Within this framework, the characteristics of the application are analyzed and static and dynamic compilation information is visualized, giving the user an efficient, incremental environment for interactive SIMD tuning.
The loop transformation methods studied in this thesis are implemented in the auto-vectorization project SW-VEC. Test results show that the automatic transformation methods improve performance by about 10% on average for several programs in the SPEC CPU2000 floating-point suite that vectorize well, which indicates that the proposed methods raise the vectorization recognition rate and achieve a higher speedup. The results also show that interactive loop transformation can compensate for complex cases that auto-vectorization handles poorly, further improving the vectorization capability of the compiler.
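
The loop distribution idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's algorithm or SW-VEC code; the function names `loop_original` and `loop_distributed` are hypothetical, and the example assumes the three arrays do not overlap (expressed with `restrict`). Statement S1 carries a dependence from one iteration to the next and cannot be vectorized as written, while S2 has no cross-iteration dependence. Because neither statement reads what the other writes, the loop can be split so that S2 sits in its own loop, which a vectorizing compiler may then turn into SIMD code.

```c
#include <assert.h>
#include <stddef.h>

/* Original loop: S1 and S2 share one loop body.
 * S1 reads a[i - 1] written by the previous iteration (a recurrence),
 * which blocks vectorization of the whole loop. */
void loop_original(float *restrict a, float *restrict b,
                   const float *restrict c, size_t n)
{
    assert(a && b && c);
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + c[i];   /* S1: loop-carried dependence */
        b[i] = c[i] * 2.0f;       /* S2: independent across iterations */
    }
}

/* After loop distribution: each statement gets its own loop.
 * This is legal here because there is no dependence between S1 and S2
 * (assuming the arrays do not alias). */
void loop_distributed(float *restrict a, float *restrict b,
                      const float *restrict c, size_t n)
{
    assert(a && b && c);

    /* S1 stays scalar: the recurrence on a[] forces sequential order. */
    for (size_t i = 1; i < n; i++) {
        a[i] = a[i - 1] + c[i];
    }

    /* S2 is now isolated and is a candidate for SIMD code generation. */
    for (size_t i = 1; i < n; i++) {
        b[i] = c[i] * 2.0f;
    }
}
```

Both versions compute the same values; only the loop structure differs. Loop fusion, also mentioned in the abstract, goes in the opposite direction and merges loops, for example to keep scalar and vector units busy at the same time.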
