Research and Implementation of a Scalable I/O Architecture for High-Performance Computing

Research on I/O Architecture and Implementing Technology for High Performance Computing

【Author】 Li Qiong

【Advisor】 Yang Xuejun

【Author information】 National University of Defense Technology, Computer Science and Technology, 2009, PhD

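【Summary】 Numerical simulation is one of the principal technical means of scientific research and exploration. It places enormous and ever-growing demands on the computing and data-processing capability of computers, and this drives the development of parallel computer systems. High-performance computing has entered the PetaFlops era and, at the same time, data storage has entered the Petabyte era, which poses severe challenges to I/O performance, scalability, reliability, availability, and manageability. The I/O efficiency of large-scale parallel computer systems has become a major bottleneck that keeps such systems from achieving high efficiency. This shows in two ways. First, constraints such as I/O device speed and I/O architecture leave system I/O performance severely mismatched with computing performance. Second, scaling up the system raises the failure rate of I/O devices and lengthens data recovery time, so the availability of the I/O system becomes an increasingly prominent problem.

To alleviate the I/O bottleneck, research can proceed on six fronts: applications, scalable algorithms, compilers and languages, run-time libraries, operating systems, and architecture. Among these, the I/O architecture is the key support for all the other technical approaches. Addressing the I/O requirements and challenges of high-performance computing, and in conjunction with the development of a high-efficiency parallel computer system, the thesis first studies the I/O architecture, so that the performance and scalability of parallel I/O are guaranteed at the architectural level. Second, at the level of implementation mechanisms, it studies an I/O-inclusive memory consistency model and its implementation, intelligent I/O control, electro-magnetic hybrid storage acceleration, and transactional storage management, with the aim of improving parallel I/O performance and system availability. The main work and contributions of the thesis are as follows.

1. An I/O-limited parallel speedup model. Addressing the scalability of parallel computer systems, the thesis studies how I/O load affects that scalability and proposes an I/O-limited parallel speedup performance model. On this basis it analyzes the scalability of three common classes of I/O architecture. Finally, it uses the performance model to guide I/O architecture design, designs a scalable parallel I/O system architecture for high-performance computing, and proposes several strategies for improving system scalability.

2. An I/O-inclusive generalized scope memory consistency model and its protocol implementation. Addressing the memory consistency problem of shared-memory systems that support global DMA operations, and starting from the idea of designing I/O and the memory architecture as one, the thesis defines the concept of a generalized program that includes I/O and studies generalized memory consistency. It establishes a generalized sequential consistency model, a generalized release consistency model, and a generalized scope consistency model. Based on the generalized scope consistency model, it designs and implements a Cache-Memory-I/O data consistency protocol and realizes, on a large-scale CC-NUMA system, a globally shared I/O system that supports global concurrent DMA. Measurements show that the system has strong I/O throughput and scalability: measured parallel I/O bandwidth reaches 20.2 GB/s and scales well with the number of processes.

3. RL-scheduler, an intelligent I/O scheduling algorithm based on reinforcement learning. Addressing the I/O service efficiency of disk arrays in real applications, the thesis brings reinforcement learning from the field of machine learning into the RAID controller and proposes RL-scheduler, an intelligent I/O scheduling algorithm that uses a Q-learning policy to realize autonomous scheduling for parallel applications. RL-scheduler jointly considers scheduling fairness, disk seek time, and the I/O access efficiency of MPI applications, and it proposes an interleaved organization of multiple Q-tables to make Q-table updates more efficient. Experiments show that RL-scheduler shortens the average I/O waiting time of parallel applications, raises the practical I/O bandwidth of large-scale parallel computer systems, and strengthens system scalability.

4. An electro-magnetic hybrid storage management algorithm that supports transaction semantics. Addressing the dual demand of high-performance computing for I/O performance and availability, the thesis combines transactional storage management with electro-magnetic hybrid storage acceleration at the storage-device level. It studies electro-magnetic hybrid storage that supports transaction semantics and proposes a token-based protocol for handling conflicts among parallel transactions and an adaptive dynamic logical-partition management algorithm. Simulation results show that a hybrid storage system with transaction semantics can exploit transaction access patterns to raise the hit rate of the solid-state disk cache, hide overheads such as version management and conflict detection, and improve both I/O performance and availability.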
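
As a rough illustration of why I/O load can limit scalability, the sketch below uses a simple Amdahl-style formula. It is not the thesis's I/O-limited speedup model, whose exact form is not given in this abstract. The function name, the parameters `io_fraction` and `io_channels`, and the assumption that I/O time shrinks only with the number of independent I/O channels are all hypothetical.

```python
def io_limited_speedup(processors, io_fraction, io_channels):
    """Amdahl-style speedup estimate (illustrative assumption, not the thesis model).

    processors  -- number of compute processes
    io_fraction -- share of single-process run time spent in I/O (0.0 to 1.0)
    io_channels -- number of independent I/O paths that can work in parallel
    """
    if processors < 1 or io_channels < 1:
        raise ValueError("processors and io_channels must be at least 1")
    if not 0.0 <= io_fraction <= 1.0:
        raise ValueError("io_fraction must lie in [0, 1]")
    # Compute time is assumed to scale perfectly with the processor count.
    compute_time = (1.0 - io_fraction) / processors
    # I/O time is assumed to scale only up to the number of I/O channels.
    io_time = io_fraction / min(processors, io_channels)
    return 1.0 / (compute_time + io_time)


if __name__ == "__main__":
    for channels in (1, 16, 1024):
        print(channels, round(io_limited_speedup(1024, 0.1, channels), 1))
```

Under these assumptions, a job that spends 10% of its serial run time in I/O and has a single I/O channel cannot reach a speedup of 10 on 1024 processors, while letting the I/O paths grow with the processor count restores near-linear scaling.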

【Abstract】 Numerical simulation is one of the main methods of scientific research and exploration. It places a tremendous and continuously growing demand on the computation and data processing capacity of high performance computers, and it has driven the development of scientific computing and parallel computer systems. High performance computing has now entered the Petaflops era, and storage systems have entered the Petabyte era. Petascale computing poses tremendous challenges for data storage capacity, I/O performance, scalability, reliability, availability, and manageability. The I/O bottleneck prevents large-scale parallel systems from achieving higher efficiency, and it shows in two ways. First, I/O performance is restricted by factors such as I/O device speed and I/O architecture, which leaves I/O speed and computing speed significantly mismatched. Second, scaling up the system makes disk drive failures more frequent and lengthens the time needed to reconstruct a failed drive, so the availability of the I/O system becomes a much more critical issue.

Effective solutions to the I/O bottleneck can be sought at six levels: applications, algorithms, languages and compilers, run-time libraries, operating systems, and I/O architecture. Among these levels, the I/O architecture is the most fundamental. To meet the I/O requirements and challenges, and in conjunction with our task of developing a high performance parallel computing system, this thesis first presents a theoretical study of I/O architectures, which makes high performance and scalability possible at the architecture level. It then focuses on I/O implementation mechanisms, including an I/O-included memory consistency model and its implementation, intelligent I/O control, hybrid storage, and transactional storage management, in order to improve I/O performance and availability. The main work and contributions of the thesis are as follows.

1. I/O-restricted parallel speedup model. Current parallel I/O performance analysis lacks sound theoretical models to support I/O architecture design. The thesis studies the impact of I/O workload on the scalability of parallel computing systems and proposes the I/O-restricted parallel speedup model. Based on this model, which can be used to guide I/O architecture design, a scalable parallel I/O architecture for HPC is presented. The thesis also analyzes several strategies for improving system scalability, which serve as the basis for further study.

2. I/O-included general memory consistency model and its implementation. For the consistency problem of shared memory systems with global DMA operations, the thesis defines the concept of an I/O-included general program. Based on this concept, it studies the general memory consistency model and builds a general sequential consistency model, a general release consistency model, and a general scope consistency model. Using the general scope consistency model, it designs and implements, at the hardware level, a CC-NUMA cache coherence protocol with global DMA and a globally shared parallel I/O architecture. Experimental results show that the I/O bandwidth and scalability of the system are good: the measured parallel I/O bandwidth reaches 20.2 GB/s and scales well with the number of processes.

3. Intelligent I/O scheduling algorithm based on reinforcement learning. To improve the I/O service efficiency of RAID and optimize the I/O performance of parallel applications, the thesis presents RL-scheduler, an intelligent I/O scheduling algorithm for RAID controllers based on reinforcement learning. RL-scheduler uses a Q-learning strategy to implement a self-controlling and self-optimizing scheduler. The algorithm takes into account scheduling fairness, disk seek time, and the I/O access efficiency of MPI applications. In addition, the proposed interleaved organization of multiple Q-tables improves the efficiency of Q-table updates. Experimental results show that, on a large-scale parallel system running multiple parallel applications, RL-scheduler considerably shortens the average I/O waiting time of parallel applications, which increases the practical I/O bandwidth and improves system scalability.

4. Hybrid storage management algorithm that supports transaction semantics. To address the requirements and challenges posed by HPC, the thesis combines transactional storage management with hybrid storage acceleration and introduces an electro-magnetic hybrid storage management algorithm that supports transaction semantics. A token-based protocol is designed to handle conflicts between I/O transactions, and an adaptive logical partition algorithm is proposed to manage Solid State Disk (SSD) storage. Simulation results show that the electro-magnetic hybrid storage system handles transactions with varied access patterns, effectively improves the SSD hit rate, and hides the overhead of version management and conflict detection. Both I/O performance and availability are significantly improved.
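
To make the Q-learning idea behind a learning-based I/O scheduler concrete, here is a toy tabular sketch. It is not RL-scheduler itself: the abstract does not give its state, action, or reward design, so everything below is an assumption for illustration. The class `QScheduler`, the disk "zones", and the reward (negative seek distance only, with no fairness or MPI-awareness term and a single Q-table) are hypothetical.

```python
import random
from collections import defaultdict


class QScheduler:
    """Toy tabular Q-learning picker for pending disk requests (illustrative only)."""

    def __init__(self, n_zones=8, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.n_zones = n_zones
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        # Q-table keyed by (current head zone, zone of the chosen request).
        self.q = defaultdict(float)
        self.rng = random.Random(seed)

    def choose(self, head_zone, pending_zones):
        """Epsilon-greedy choice among the zones of the pending requests."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(pending_zones)
        return max(pending_zones, key=lambda z: self.q[(head_zone, z)])

    def update(self, head_zone, target_zone, reward, next_pending):
        """Standard one-step Q-learning update."""
        best_next = max(
            (self.q[(target_zone, z)] for z in next_pending), default=0.0
        )
        key = (head_zone, target_zone)
        self.q[key] += self.alpha * (reward + self.gamma * best_next - self.q[key])


def train(scheduler, steps=5000, queue_len=4):
    """Drive the scheduler with uniformly random request arrivals (an assumption)."""
    rng = random.Random(1)
    head = 0
    pending = [rng.randrange(scheduler.n_zones) for _ in range(queue_len)]
    for _ in range(steps):
        target = scheduler.choose(head, pending)
        reward = -abs(target - head)  # penalize long seeks only
        pending.remove(target)
        pending.append(rng.randrange(scheduler.n_zones))
        scheduler.update(head, target, reward, pending)
        head = target
    return scheduler


if __name__ == "__main__":
    trained = train(QScheduler())
    print(len(trained.q), "state-action pairs learned")
```

A scheduler along the lines described in the abstract would need a richer reward that also reflects fairness and the access behavior of MPI applications, and the thesis additionally interleaves multiple Q-tables to speed up updates; neither is modeled here.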
