
Scalable Single-image File System

【Author】 Wang Jianyong (王建勇)

【Advisor】 Zhu Mingfa (祝明发)

【Author information】 Graduate School of the Chinese Academy of Sciences (Institute of Computing Technology), Computer Architecture, 1999, Ph.D.

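【Abstract (translated from the Chinese)】 Traditional distributed file systems cannot provide cluster systems with a strict single-system image, and because they have not kept pace with trends in computing technology, they cannot meet applications' demands on the I/O performance, scalability, and availability of cluster systems. The Dawning super-server is a typical cluster system, for which we developed the scalable single-image file system COSMOS; we call its prototype S2FS. This dissertation mainly describes the design, implementation, and evaluation of S2FS. First, S2FS is a global file system that guarantees a strict single-system image by implementing location transparency and strict UNIX file-sharing semantics. Without modifying the AIX operating system source code, we used a kernel extension at the Vnode/VFS layer to connect S2FS seamlessly to the underlying platform and to guarantee full binary compatibility with UNIX applications, which verifies that the virtual file system mechanism is an effective way to achieve this goal. Second, to improve the performance and scalability of S2FS, this dissertation studies and evaluates cooperative caching. Subject to avoiding system deadlock, we designed a directory-based invalidate protocol and proved that it guarantees cache coherence. To improve performance further, we proposed a dual-granularity cache coherence protocol and, on that basis, designed a heuristic cache management algorithm; model analysis shows that it improves on the commonly used N-Chance algorithm. Finally, to avoid the single-server bottleneck, S2FS separates data storage from metadata management and implements both in a distributed fashion. In addition to storing and maintaining system metadata (such as file inodes and superblocks), the metadata management servers record where data is cached and maintain the coherence of the cooperative cache. On the storage server side, we implemented network disk striping and a software RAID 1 model; the underlying storage is built on the reliable JFS and asynchronous I/O, which improves I/O bandwidth and storage availability. Although this dissertation studies scalability techniques suited to cluster file systems while preserving the single-system image and binary compatibility, applications' demand for I/O is never-ending, and both their I/O access characteristics and the trends in computing technology keep changing; all of this poses greater challenges for our future development of new distributed file systems.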
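The directory-based invalidate idea mentioned in the abstract can be illustrated with a minimal, single-process sketch. It is not the S2FS protocol itself: the class names `Manager` and `Client`, the write-through policy, and the dictionary standing in for the storage servers are all assumptions made for illustration, and the sketch ignores deadlock avoidance, dual granularity, and networking.

```python
from collections import defaultdict


class Manager:
    """Hypothetical metadata manager: tracks which clients cache each block."""

    def __init__(self):
        self.directory = defaultdict(set)  # block id -> ids of clients caching it
        self.clients = {}                  # client id -> Client object

    def register(self, client):
        self.clients[client.cid] = client

    def note_read(self, cid, block):
        # Record that this client now holds a cached copy of the block.
        self.directory[block].add(cid)

    def note_write(self, cid, block):
        # Invalidate every other cached copy before the write proceeds.
        for other in self.directory[block] - {cid}:
            self.clients[other].invalidate(block)
        self.directory[block] = {cid}


class Client:
    """Hypothetical client with a private block cache."""

    def __init__(self, cid, manager, storage):
        self.cid, self.manager, self.storage = cid, manager, storage
        self.cache = {}
        manager.register(self)

    def read(self, block):
        if block not in self.cache:
            # Cache miss: fetch from storage and tell the manager.
            self.cache[block] = self.storage.get(block)
            self.manager.note_read(self.cid, block)
        return self.cache[block]

    def write(self, block, data):
        self.manager.note_write(self.cid, block)
        self.storage[block] = data  # write-through (an assumption of this sketch)
        self.cache[block] = data

    def invalidate(self, block):
        self.cache.pop(block, None)
```

In this sketch a reader whose copy was invalidated simply misses on its next read and refetches the block, so it never observes stale data; a cooperative cache would instead try to serve that miss from another client's memory before going to disk.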

【Abstract】 Traditional distributed file systems cannot provide clusters with a strict single-system image, and because they have failed to keep up with trends in computing technology, they also cannot meet cluster applications' requirements for I/O performance, scalability, and availability. The Dawning super-server is a typical cluster system. We have developed the COSMOS file system for it and call its prototype S2FS, short for Scalable Single-image File System. This dissertation presents the design, implementation, and evaluation of S2FS. First, S2FS is a global file system. To maintain a strict single-system image, it provides location transparency and strong UNIX file-sharing semantics. Even without access to the AIX operating system's source code, we add S2FS to AIX seamlessly at the Vnode/VFS interface so that S2FS maintains ABI/API compliance with the UNIX file system, demonstrating that the virtual file system is an effective mechanism for achieving this objective. Further, this dissertation highlights the research and evaluation of cooperative caching, which is used to improve S2FS's performance and scalability. After a sufficient condition for deadlock-free design is given, the directory-based invalidate cache coherence protocol is introduced and its cache coherence is verified using belief. We then propose the dual-granularity cache coherence protocol as a way to further improve system performance, and devise a hint-based heuristic cooperative caching algorithm under the dual-granularity protocol. Analytical models are established for both the heuristic algorithm and the state-of-the-art N-Chance algorithm; the analytical results show that the heuristic algorithm effectively reduces I/O response time compared with N-Chance in almost every case. Finally, to eliminate the central file server bottleneck found in traditional file systems, S2FS splits the traditional server's functionality into two separate pieces, data storage and metadata management, and distributes each among cooperating networked machines. The metadata management server, which we call the manager, is responsible for storing and maintaining system metadata (including file inodes and the superblock), and it also records the location of data in the clients' caches so as to preserve cooperative cache coherence. The storage server implements network disk striping
