节点文献

存储系统容错及阵列编码

Fault-tolerant of Storage Systems and Array Codes

【作者】 林胜

【导师】 刘璟; 王刚; 刘晓光;

【作者基本信息】 南开大学 , 计算机应用技术, 2010, 博士

【摘要】 磁盘的容错问题是大规模存储系统设计中不能回避的一个重要的问题。容错编码理论为提高存储系统数据的可靠性提供了有效的手段。针对存储系统的一些特点,一类性能良好的二进制阵列码兼顾了系统的容错能力、编码计算复杂度和更新复杂度,被公认是存储系统容错较好的解决方案。然而此类编码并不如通信编码理论那样具有坚实的理论基础和丰富的成果。目前,存储系统中使用比较广泛的是一些双容错的阵列码。这些编码存在着一些限制,例如:都需要将码长限制为素数才能达到其最优性能;向多容错的扩展也都比较困难。论文的重要工作体现在以下三个方面:首先,本文在对目前常用的双容错的阵列码进行总结的基础上,使用组合数学工具,给出了一种系统的阵列码定义及表示方法;进而分析了码的标准化表示及阵列码的一些基本特性,为进一步的深入研究打下坚实的理论基础。其次,为了根据特定的优化目标,构造出实用的编码,文本对下列两种编码结构进行了讨论:1、校验可分阵列码。为了说明这种结构的本质规律,论文研究了置换向量代数的相关特性,并利用此工具指出了校验可分阵列码的容错性能与校验支撑置换的圈分解的关系。根据这一结论,论文利用已知的组合构造方法一哈密尔顿拉丁方构造了LS (Latin Square)阵列编码。本文证明了双容错LS码是对已有的几种双容错水平编码的统一及扩展。进而,论文使用置换向量代数构造了多容错的LS码,为校验可分阵列建立了理论的框架。此外,论文还对固定编码周期下码长限制的问题进行了研究,利用LS码的层叠构造给出了一种解决方案。2、循环阵列码。借鉴线性编码理论的思想,论文研究了循环阵列码的基本理论,并给出了一种循环阵列码的基本构造。在此基础上,研究了最长最低密度阵列码的构造,给出了此种编码码长的上界。利用组合结构’’NRB" (Near Resolvable Balanced Incomplete Block Designs),本文给出了一种3容错最长最低密度阵列码的构造,并给出了编码清晰的代数描述,为进一步的深入研究打下基础。最后,文章从存储系统的整体可靠性角度,以FULL-2码为例,利用阵列修复模型研究了非MDS (Maximum Distance Separable)码的实际容错能力,为系统编码方案的选择提供了数据依据。本文的工作尝试使阵列编码这一领域的一些现有零散结论系统化,并为它们建立统一的理论基础。

【Abstract】 Fault-tolerant is an important issue to a large-scale storage systems. Coding theory provides an effective way to improve the reliability of data storage system.Accounting into the system’s fault tolerance, coding computational complexity, and update complexity of storage systems, a class of binary array codes is regard as a good solution. But comparing to communication coding theory, there are no such a solid theoretical foundation and rich results to array codes. At present, only some 2-erasure array codes are used in storage systems, and the code length of those array codes are usually limited to prime numbers in order to achieve those optimal performance. Moreover, they are usually hard to be expanded to the multiple fault-tolerant case.This thesis firstly summarizes the current commonly used 2-erasure array codes,then gives a systematic definitions of array codes with the language of combinatorics. Base on the definitions,we give a standardized representation of array codes,and then study some basic characteristics of array codes.In order to construct exactly codes for specific optimized goals,the thesis studies two basic classes array codes,they are:1.Parity-dividable array code:In order to illustrate the law of this structure, the thesis; develops a kind of permutation-vector algebra as a research tool. With this tool,we find out the relations between the parity map permutation’s circle structure and the code’s fault-tolerant ability. With the well-known combinatorial structure--latin square, we construct a class of 2-erasure codes which is proved to be the expansion of several known 2-erasure codes. Further-more,with permutation-vector algebra, we gives the multi-fault-tolerant construction of LS-code. In addition, to break the limit of the short code length,we construct a class of codes named Cascade-Latin code.2.Cyclic array codes.The thesis gives some basic structures of cyclic array codes.Base on this, we study the longest lowest-density array codes. First we gives the upper bound to the code length of those codes. With a well known combinatorics structure:"NRB" (Near Resolv-able Balanced Incomplete Block Designs), the article presents a 3-erasure longest lowest-density array codes named T-code.Finally, from the overall perspective, the thesis analyze the reliability of storage system base on FULL-2 code. The analysis point out that those non-MDS codes are also has good performance in some case.The article try to build a systematic theory frame to the array codes. This is very useful to the further research.

【关键词】 存储系统容错编码阵列码
【Key words】 Storage systemsfault-tolerant codingarray code
  • 【网络出版投稿人】 南开大学
  • 【网络出版年期】2011年 07期
节点文献中: 

本文链接的文献网络图示:

本文的引文网络