FDTD算法的网络并行研究及其电磁应用

Parallel FDTD Algorithm Based on Network and Applications in Electromagnetic Problems

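【Author】 Liu Yu (刘瑜)

【Advisor】 Liang Zheng (梁正)

【Author information】 University of Electronic Science and Technology of China (电子科技大学), Optics, 2008, Ph.D.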

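【Abstract, translated from the Chinese】 As a powerful numerical technique, the finite-difference time-domain (FDTD) method is widely used across electromagnetic research, for example in photonic crystal simulation and bioelectromagnetic dosimetry. However, the FDTD discretization must satisfy the Courant stability condition and a minimum accuracy requirement, so electrically large problems with complex structure require an enormous number of grid cells. The result is excessive computation time and insufficient memory.

With the steady improvement of personal computers and the rapid development of network technology, FDTD combined with network parallelism can divide a large computation into small tasks that are assigned to individual computers. This both meets the memory demand and shortens the computation time, and so offers an effective route to simulating electrically large, complex electromagnetic problems with FDTD. Network-parallel FDTD has become an active research topic in numerical electromagnetics. With the existing network-parallel FDTD algorithms, however, performance measures such as parallel efficiency and computational stability tend to degrade markedly as the number of compute nodes grows, which severely limits their practical use.

Compared with traditional network clusters, modern local area network (LAN) systems have a series of new hardware and software features, chiefly:
1) the rapid spread of hyper-threaded and multi-core processors (CPUs) with hardware support for multitasking;
2) the wide use of programmable graphics processors (GPUs);
3) the further maturing of interoperability protocols in LAN operating systems.

This dissertation studies how to incorporate these features into the FDTD algorithm and how to improve the design and implementation of parallel FDTD accordingly, in order to raise the performance of network-parallel computation, reduce the complexity of implementing network-parallel FDTD, and thereby promote the use of parallel techniques in numerical electromagnetics. The main contributions are as follows:

(1) The concept of two-level parallel capability of a LAN is introduced, and network-parallel FDTD is implemented on an ordinary LAN for the first time using a hybrid two-level (data and task) parallel technique. Numerical experiments show that the two-level strategy substantially improves the scalability and efficiency of parallel FDTD, makes network-parallel computation well suited to fine-grained domain-decomposition FDTD, and reduces the influence of the domain-decomposition topology on parallel performance.

(2) Subdomain interpolation in parallel FDTD is studied, together with an interpolation-error correction algorithm based on the superabsorbing boundary principle. After domain decomposition, each subdomain can be meshed independently in its own local coordinate system according to the geometry of the problem, and field values are exchanged between adjacent subdomains by interpolation. This greatly improves the convenience and flexibility of parallel FDTD modeling. Numerical results show that parallel interpolation preserves the accuracy and stability of the FDTD iteration.

(3) The concepts of temporal and spatial fault tolerance in FDTD computation are proposed, and techniques for simplifying and extending network-parallel FDTD systems are developed on the basis of memory-mapped files. As an application, two photonic crystal waveguide structures are simulated. For the coupled photonic crystal waveguide, the relationship between the radius ratio of the dielectric rods in the coupling region and the coupling length is computed. For the right-angle bend photonic crystal waveguide, the transmission efficiency is studied. Numerical experiments show that the new method reduces the design complexity of parallel FDTD algorithms and improves their interoperation with commonly used computing software, which further improves the efficiency of the FDTD system.

(4) The alternating-direction-implicit FDTD method (ADI-FDTD) is implemented with GPU-CPU collaborative computing for the first time. A framework for general-purpose numerical computation on the GPU is given, and details of the collaborative computation, such as solving linear systems on the GPU, are analyzed in depth. Performance tests show that, under the same conditions, the collaborative algorithm is clearly faster than an ordinary CPU implementation of ADI-FDTD.

(5) Interpolation-based parallel FDTD is used to model different human postures. On this basis, the effect of mobile-phone radio-frequency radiation on the electromagnetic energy distribution in the head is studied for two postures, "standard standing" and "arms raised". Numerical results show that the specific absorption rate (SAR) in the brain, a critical organ in the head, is very sensitive to posture, and that the brain SAR rises noticeably with the arms raised compared with standard standing.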
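The two ideas the abstract rests on, the Courant limit on the time step and the division of the grid into subdomains that exchange boundary field values, can be illustrated with a minimal one-dimensional sketch. It is not the dissertation's implementation: the normalized units, the function names (`courant_limit`, `run_single`, `run_split`), the pulse parameters, and the use of plain in-process copies in place of network messages are all assumptions made for illustration.

```python
import math


def courant_limit(spacings, c=299_792_458.0):
    """Largest stable time step of the standard Yee scheme on a uniform grid.

    spacings holds one cell size per spatial dimension (dx, dy, dz).
    """
    return 1.0 / (c * math.sqrt(sum(1.0 / d ** 2 for d in spacings)))


def source(n):
    """Gaussian pulse used as a soft source (illustrative parameters)."""
    return math.exp(-((n - 30.0) / 10.0) ** 2)


def run_single(n_cells, n_steps, s, src):
    """Reference 1D FDTD on one domain; s = c*dt/dx is the Courant number."""
    ez = [0.0] * n_cells          # Ez at integer nodes, PEC at both ends
    hy = [0.0] * (n_cells - 1)    # Hy at half-integer nodes
    for n in range(n_steps):
        for i in range(n_cells - 1):
            hy[i] += s * (ez[i + 1] - ez[i])
        for i in range(1, n_cells - 1):
            ez[i] += s * (hy[i] - hy[i - 1])
        ez[src] += source(n)
    return ez


def run_split(n_cells, n_steps, s, src, m):
    """Same scheme on two subdomains that own Ez[0..m-1] and Ez[m..N-1].

    Each subdomain keeps one ghost value copied from its neighbour. In a
    networked run these two copies would be messages between nodes.
    """
    ez_l = [0.0] * (m + 1)            # Ez[0..m-1] plus ghost copy of Ez[m]
    hy_l = [0.0] * m                  # Hy[0..m-1]
    ez_r = [0.0] * (n_cells - m)      # Ez[m..N-1]
    hy_r = [0.0] * (n_cells - m)      # ghost copy of Hy[m-1], then Hy[m..N-2]
    for n in range(n_steps):
        ez_l[m] = ez_r[0]             # exchange 1: right sends its first Ez
        for i in range(m):
            hy_l[i] += s * (ez_l[i + 1] - ez_l[i])
        for j in range(1, n_cells - m):
            hy_r[j] += s * (ez_r[j] - ez_r[j - 1])
        hy_r[0] = hy_l[m - 1]         # exchange 2: left sends its last Hy
        for i in range(1, m):
            ez_l[i] += s * (hy_l[i] - hy_l[i - 1])
        for j in range(n_cells - m - 1):
            ez_r[j] += s * (hy_r[j + 1] - hy_r[j])
        if src < m:
            ez_l[src] += source(n)
        else:
            ez_r[src - m] += source(n)
    return ez_l[:m] + ez_r            # drop the ghost cell when reassembling


if __name__ == "__main__":
    whole = run_single(100, 200, 0.5, 20)
    split = run_split(100, 200, 0.5, 20, 50)
    print(max(abs(a - b) for a, b in zip(whole, split)))
```

Because each subdomain performs the same arithmetic on the same values as the single-domain loop, the two runs should agree to rounding error. The cost of decomposition is the pair of boundary exchanges per time step, which is the communication overhead that grows in relative terms as subdomains become smaller.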

【Abstract】 As a powerful numerical technique, the finite-difference time-domain (FDTD) method has been widely applied to electromagnetic problems such as the simulation of photonic crystals and bioelectromagnetic dosimetry. However, to meet accuracy requirements and the Courant stability condition, the FDTD method must generate an enormous number of grid cells when simulating electrically large or complex objects. Two problems follow: huge memory consumption and long execution time.

With the rapid development of personal computers and network technology, FDTD combined with parallel techniques divides the whole computational space into subspaces and assigns each subspace to one node of a parallel system. The memory and CPU time required on each node drop sharply, which provides a feasible approach to simulating electrically large or complex objects. Network-based parallel FDTD computing has become an active topic in numerical electromagnetics. For the existing parallel FDTD algorithms, however, performance measures such as parallel efficiency and computational stability degrade markedly as the number of computers increases. This severely constrains the practical application of parallel FDTD.

Compared with the traditional cluster, the modern local area network (LAN) has a series of new hardware and software features:
1) CPUs with hyper-threading or multi-core technology, which support multiple tasks directly in hardware, have become widespread.
2) The programmable graphics processing unit (GPU) is widely adopted.
3) The interoperability protocols of LAN operating systems have matured.

To improve parallel performance and reduce the implementation complexity of parallel FDTD, this dissertation studies how to apply these new software and hardware characteristics to the FDTD algorithm and how to improve the design and implementation of parallel FDTD, so as to promote the application of parallel techniques in electromagnetic computation. The major achievements of this dissertation are as follows:

(1) The concept of "two-level parallelization on a PC cluster" is presented. Using two-level parallelization of data and tasks, a high-performance MPI-OpenMP hybrid FDTD algorithm is developed on a PC cluster for the first time. The numerical results show that the two-level strategy substantially improves the scalability and efficiency of parallel FDTD, is well suited to fine-grained parallel FDTD computing on a PC cluster, and reduces the influence of the virtual topology of the subspaces on parallel FDTD performance.

(2) A universal and efficient interpolation technique based on the superabsorbing boundary principle is studied. It improves the interpolation accuracy between subdomains and ensures the stability of the parallel FDTD iteration. The computational space can therefore be divided mainly according to the features of the original problem, with meshes created in local coordinates. During the parallel FDTD iteration, data are exchanged between adjacent subdomains by interpolation, which greatly enhances the convenience and flexibility of model building in parallel FDTD.

(3) The concepts of time and space fault tolerance are presented, and a technique for simplifying and extending the parallel FDTD system, based on memory-mapped files and the characteristics of the LAN, is studied. Several photonic crystal waveguide structures are investigated with this approach. For the coupled photonic crystal waveguide, the relationship between the radius ratio of the dielectric cylinders and the coupling length of the waveguide is discussed. For the 90° photonic crystal waveguide bend, the transmission efficiency is calculated. The numerical experiments confirm that the proposed approach effectively reduces the complexity of algorithm design for parallel FDTD systems and improves their ability to cooperate with common computing software, which further improves the efficiency of FDTD computing.

(4) For the first time, collaborative GPU-CPU computation is achieved for the alternating-direction-implicit finite-difference time-domain (ADI-FDTD) algorithm. An implementation framework for general-purpose computation on the GPU is given, and a detailed analysis of solving systems of linear equations on the GPU is presented. The performance tests show that the collaborative algorithm is more efficient than the ordinary CPU ADI-FDTD algorithm under the same conditions.

(5) Using the interpolation technique, human models in different postures are built for parallel FDTD computation. The specific absorption rate (SAR) in the head is calculated, and the effect of body posture on SAR is analyzed for two postures, standard standing and arms raised, when the body is exposed to electromagnetic radiation from a mobile phone. The results show that SAR values in the brain, a critical organ in the head, are sensitive to body posture and increase noticeably in the arms-raised posture compared with the standard standing posture.
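Item (4) mentions solving systems of linear equations as part of ADI-FDTD. In ADI-FDTD generally, each implicit half-step reduces to independent tridiagonal systems along grid lines. The abstract does not say which solver the dissertation uses on the GPU, so the sketch below shows only the standard sequential Thomas algorithm as a CPU reference point. The function name `solve_tridiagonal`, the array layout, and the sample coefficients are assumptions for illustration.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    All four lists have length n. lower[0] and upper[n-1] are ignored.
    No pivoting is done, so the matrix is assumed diagonally dominant,
    which holds for the illustrative ADI-like coefficients used below.
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n

    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom if i < n - 1 else 0.0
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


if __name__ == "__main__":
    n, a = 6, 0.25                      # a is a hypothetical coupling coefficient
    lower = [0.0] + [-a] * (n - 1)
    upper = [-a] * (n - 1) + [0.0]
    diag = [1.0 + 2.0 * a] * n
    rhs = [1.0] * n
    print(solve_tridiagonal(lower, diag, upper, rhs))
```

The forward sweep and back substitution are both sequential along a line, so a GPU version has to find its parallelism elsewhere, for example by solving the many independent grid lines at once. Whether the dissertation takes that route is not stated in the abstract.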
