Design and Implementation of a Linux Distributed Storage System

The Design and Implementation of a Linux Distributed File System

【Author】 程炜烨

【Advisor】 谢长生

【Author Information】 Huazhong University of Science and Technology, Computer Architecture, 2007, Master's

【Abstract】 The rapid development of integrated circuit technology has made computing power grow exponentially, while the I/O speed of storage devices such as disks has improved only slowly; this gap between processor and I/O performance creates a serious bottleneck. At the same time, with the rapid growth of information technology, data storage requirements are reaching the terabyte and petabyte scale, and disks and tapes alone can hardly meet them. These problems call for a file storage system that offers large capacity, high reliability, high availability, high performance, dynamic scalability, and easy maintenance. After studying the two network storage architectures in wide use today, NAS and SAN, this thesis proposes a Linux-based distributed storage system built on software RAID techniques that combines the technical strengths of NAS and SAN while overcoming their shortcomings. Each storage node stores a portion of the data files together with the corresponding metadata. When a storage node starts up, it communicates with the other storage nodes on the same local area network and eventually obtains the global metadata. Because the storage nodes are independent of one another, the system gains parallelism and scalability. For hot files, the design borrows from RAID: a contiguous stream of data is divided into blocks of equal size, and the blocks are written to different storage nodes, so that the file can be accessed in parallel, which raises its access speed. To make the storage system easy to use, the LDFS (Linux Distribute File System) file system is designed and implemented on the application server; once mounted, it can be used like an ordinary file system. Tests show that, when data of the same size is accessed in the same network environment, the I/O performance of the Linux distributed storage system is consistently better than that of the CIFS and SMB (Samba) file systems.
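The RAID-style handling of hot files described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `stripe` and `reassemble`, the round-robin placement, and the tiny block size are all assumptions chosen for illustration, and in-memory lists stand in for storage nodes.

```python
from typing import List

# Hypothetical block size, kept tiny so the example is easy to follow.
BLOCK_SIZE = 4


def stripe(data: bytes, num_nodes: int, block_size: int = BLOCK_SIZE) -> List[List[bytes]]:
    """Split data into equal-size blocks and place them round-robin on nodes.

    Each inner list stands in for one storage node (an assumption; a real
    system would send the blocks over the network).
    """
    if num_nodes < 1 or block_size < 1:
        raise ValueError("num_nodes and block_size must be positive")
    nodes: List[List[bytes]] = [[] for _ in range(num_nodes)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        nodes[block_index % num_nodes].append(data[offset:offset + block_size])
    return nodes


def reassemble(nodes: List[List[bytes]]) -> bytes:
    """Interleave the per-node blocks back into the original byte order."""
    out = bytearray()
    depth = max((len(blocks) for blocks in nodes), default=0)
    for row in range(depth):
        for blocks in nodes:
            if row < len(blocks):
                out += blocks[row]
    return bytes(out)


if __name__ == "__main__":
    original = b"abcdefghijklmnopq"
    placed = stripe(original, num_nodes=2)
    assert reassemble(placed) == original
```

The sketch reads the nodes sequentially. The speedup the abstract refers to would come from fetching each node's blocks concurrently, which is omitted here.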
