3D Kirchhoff integral prestack migration based on GPUs
【Abstract】 Three strategies are proposed to exploit the parallelism of 3D Kirchhoff integral prestack migration on many-core GPUs (graphics processing units). First, a pipeline-parallel framework is built from two separate host threads, a data transfer thread and a GPU compute thread; within this framework, asynchronous input/output (I/O) is implemented directly to reduce the time spent transferring data between the GPUs and network-attached storage. Second, a full-load arrangement of GPU threads is used to maximize instruction throughput. Third, the texture cache and constant cache are used to reduce off-chip memory accesses, and fixed-function units are used to compute transcendental functions. Experimental results show that the optimized implementation on an nVidia Tesla C1060 GPU achieves a speedup of about 20 times over the serial version of the algorithm on an Intel Xeon E5430 CPU. The performance of the algorithm on three different GPU architectures is also compared, and the absolute floating-point error between the CPU and GPU results is shown to be only 0.3×10⁻⁵, within an error threshold of 0.5×10⁻⁴.
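The two-thread pipeline described in the abstract can be illustrated with a minimal host-side sketch. This is not the authors' implementation: Python threads stand in for the data transfer thread and the GPU compute thread, and the names `run_pipeline`, `trace_blocks`, `migrate_block`, and `depth` are hypothetical, introduced only for illustration. The sketch assumes the input can be split into independent blocks, so that loading the next block overlaps with computation on the current one.

```python
import queue
import threading


def run_pipeline(trace_blocks, migrate_block, depth=2):
    """Overlap loading of input blocks with computation on earlier blocks.

    trace_blocks  -- iterable of input blocks (hypothetical stand-in for
                     seismic traces read from network storage)
    migrate_block -- function applied to each block (hypothetical stand-in
                     for launching the GPU migration kernel)
    depth         -- capacity of the bounded buffer between the two stages
    """
    buf = queue.Queue(maxsize=depth)  # bounded buffer shared by both threads
    done = object()                   # sentinel marking the end of the input
    results = []

    def transfer():
        # Data transfer stage: fetch blocks and hand them to the compute stage.
        for block in trace_blocks:
            buf.put(block)            # waits while the buffer is full
        buf.put(done)

    def compute():
        # Compute stage: consume blocks in arrival order until the sentinel.
        while True:
            block = buf.get()
            if block is done:
                break
            results.append(migrate_block(block))

    threads = [threading.Thread(target=transfer),
               threading.Thread(target=compute)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline([[1, 2], [3, 4], [5]], sum)` processes three blocks in order while the transfer thread runs ahead by at most `depth` blocks. The bounded queue is the design choice that matters here: it lets I/O proceed asynchronously without letting unprocessed input accumulate in host memory.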
【Key words】 parallel computing; graphics processing unit; Kirchhoff integral prestack migration; pipeline parallel; asynchronous input/output; compute unified device architecture (CUDA)
- 【Source】 Journal of Huazhong University of Science and Technology (Natural Science Edition), 2011, Issue S1
- 【Classification No.】 TP391.41
- 【Online publication time】 2011-06-17 12:45
- 【Times cited】 2
- 【Downloads】 123