The Research of Optimization Technologies for the File System of High Performance Computing Cluster

【Author】 Zhang Yusen (张钰森)

【Supervisor】 Wu Qingbo (吴庆波)

【Author Information】 National University of Defense Technology, Computer Science and Technology, 2010, Master's thesis

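【Abstract (translated from the Chinese)】 With the rapid development of high-performance computing (HPC) technology, a growing number of fields use it to solve practical problems in production and research, including numerical weather simulation and forecasting, earthquake prediction, bioinformatics, environmental science, space science, and finance. The level of development of HPC technology has gradually become an important indicator of a country's comprehensive national strength and international competitiveness. In building an HPC system, storage system performance is a major factor affecting computational performance, so studying and optimizing HPC file systems is of considerable significance. This thesis studies in depth the storage principles, storage structure, and related storage technologies of HPC file systems, analyzes the shortcomings that appear in practical use, and optimizes the file system to address them in order to improve the I/O performance of the storage system. For storage resource allocation, the thesis introduces an economic model into the HPC file system: the file system is modeled using economic theory, and algorithms designed on that model allocate its storage resources. The optimized file system can dynamically adjust its storage resource allocation strategy to suit different application scenarios, which both simplifies file system tuning and raises system resource utilization. For data access control, the thesis proposes a state-aware data access control method. Its key idea is that clients can sense the load state of the whole system and dynamically adjust their request-sending strategy according to that load information. This method can avoid congestion to some extent and keeps the file system operating at its optimal load, so that its I/O performance is fully exploited. For metadata access control, a distributed metadata storage structure is an effective way to eliminate the single-metadata-server bottleneck. The thesis optimizes the design of this structure and, on that basis, optimizes the file system's metadata access strategy. To improve the response speed of the metadata servers, metadata operations are suitably relaxed. The optimized file system better meets the storage demands of high-performance computing. Finally, based on this work, the thesis designs the prototype system SA-Lustre and implements it on a Lustre simulator. Tests of the SA-Lustre prototype show that the optimized file system achieves large improvements in I/O performance, concurrent I/O bandwidth, and throughput.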

【Abstract】 With the rapid development of high-performance computing (HPC) technology, more and more fields use it to solve practical problems, including weather forecasting, earthquake prediction, bioinformatics, environmental science, space science, and finance. The level of a country's HPC development has gradually become an important indicator of its comprehensive national strength and international competitiveness. When building an HPC system, the performance of the storage system is one of the main factors affecting computational performance. It is therefore important to study HPC file systems and to optimize them. This thesis studies in depth the storage principles, storage architecture, and related storage technologies of HPC file systems. On this basis, it analyzes the shortcomings that arise in practical applications and optimizes the HPC file system to address them, with the aim of improving the I/O performance of the storage system. For storage resource allocation, an economic model is introduced into the HPC file system. The file system is modeled using economic theory, and algorithms for allocating storage resources are designed on top of the model. The optimized file system can dynamically adjust its storage resource allocation strategy according to the application scenario, which simplifies file system tuning and improves system resource utilization. For data access control, the thesis presents a data access control technique based on system state. The key point is that the client can sense the server's load state and adjust its request-sending strategy accordingly. This method helps the file system avoid congestion and operate at its optimal load, so that it can make full use of its I/O performance. For metadata access, the thesis introduces MDCache (Metadata Cache) into the HPC file system and optimizes the metadata access strategy on this basis. In addition, metadata operations are relaxed to reduce the response time of the MDS (Metadata Server). The optimized file system can better meet the growing demands of HPC. Finally, the prototype system SA-Lustre is designed based on the optimization techniques above and implemented with the help of the Lustre Simulator. A comparison of the test results for SA-Lustre and Lustre shows that I/O performance, concurrent I/O bandwidth, and throughput are greatly improved after the optimization.
