
Research on Key Technologies of Object-Based Active Storage (基于对象的主动存储关键技术研究)

【Author】 Qin Lingjun (覃灵军)

【Advisor】 Feng Dan (冯丹)

【Author information】 Huazhong University of Science and Technology, Computer Architecture, 2006, PhD

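【Abstract (translated from the Chinese)】 With the rapid development of computers and the Internet, network storage applications have taken on new characteristics: total data volume is growing explosively and ever faster, storage management and maintenance demand more automation and intelligence, and cross-platform interoperability and data sharing are increasingly important. Network storage is undergoing a revolutionary change, and object-based storage has emerged in response. Object-based storage fundamentally changes the storage interface by introducing the object interface, which overcomes the shortcomings of the block and file interfaces and lets an object storage system strike the best trade-off among security, data sharing, scalability and performance. The basic unit of access through the object interface is the object, which contains not only user data but also attributes that describe its characteristics. By passing object attribute information between users and devices, the object interface offers richer semantic expressiveness than other interfaces.

With advances in electronics, more and more processing capability is embedded in storage devices, and part of the functionality of upper-layer applications can be moved to the device (the concept of "active storage"). Today's widely used storage devices are "dumb devices" that can only respond passively to user requests. As device functionality grows more complex, the traditional management approach, which is transparent to the device, can hardly cope. There is an urgent need for a simpler and more flexible management approach in which devices take part, and active storage meets this need well. In addition, the mechanical nature of hard disks has not changed significantly, and network latency cannot be ignored in networked storage, especially when sharing spans a wide area network. Raising the data transfer rate between processing nodes and storage nodes has therefore become a key factor in improving overall system performance. Active storage systems can largely solve this problem, and they represent the trend of moving computation closer to data. Before the object interface appeared, however, devices used a simple, transparent interface, which hindered cooperation between device-side and user-side tasks. The object interface has accelerated the development of active storage, which can use the expressive object interface to pass more information between device and user. The combination of object storage and active storage will be one of the leading technologies in the storage field.

The Object-Based Active Storage System (OBASS) combines the advantages of the object interface and active storage: active storage is built on objects, and all information takes the form of objects. The foundation of OBASS is the object-based storage device (OSD). In software, the OSD is divided into three main layers: the object layer, the active service layer, and the storage quality-of-service (QoS) control layer. The object layer manages all objects uniformly and is responsible for organizing object data on disk (both the layout of object data within a disk and the placement of objects across disks). The active service layer implements active storage: function modules that upper-layer applications download to the OSD become special objects (method objects), which are stored and managed by the object layer and scheduled for execution by the active service layer. The storage QoS control layer influences the execution of modules at all layers so that the OSD can meet different users' QoS requirements for object reads and writes under different loads.

Object data organization within a disk is the task of the object file system. One of its important properties is performance persistence: after long-term use involving frequent object creation, deletion and writes, its performance remains at a high level. Performance persistence is achieved through flexible free-space management and a gradual space allocation policy with variable allocation granularity. Object placement across disks concerns how objects are distributed among multiple disks so that overall system performance is optimized and load is balanced. Different applications call for different placement policies: for streaming media, a policy based on a blocking probability model is suitable, whereas for transaction processing a policy based on a response time model is more appropriate. Placement policies should be usable online, but the characteristics of an online workload cannot be predicted in advance. Using the intelligence of the OSD and the attributes of objects, the placement policy learns the workload characteristics autonomously and uses what it learns to guide placement.

To realize object-based active storage, the existing T10 OSD standard is first modified: the object concept is extended with method objects so that the standard supports active storage. Method objects are executed in two ways: triggered by a request from an external user, or triggered by a policy when a condition is satisfied. The active service layer establishes a unified framework that combines the scheduling mechanisms for both, supporting the migration of computing tasks and management tasks to the OSD. The method execution mechanism is the core of active storage, and implementation mechanisms under Linux are proposed for two kinds of methods, filter-type and service-type. To evaluate the effectiveness of active storage, the two kinds of methods are applied to two applications: the OSDFS file system and load balancing in an object storage system. In OSDFS, two common operations, lookup and unlink, are offloaded to the OSD as filter-type methods; analysis shows that this greatly reduces network latency and memory copies. The load-balancing algorithm runs in the OSD as a service-type method; experiments show that the load-balancing algorithm with object replication and object migration enabled gives the largest reduction in average system response time.

Storage QoS control implements QoS at three levels. The upper-level scheduler schedules object requests and guarantees QoS at the object level. The middle-level scheduler performs object preprocessing in combination with the object file system and the cache: object prefetching for reads, and page preallocation, delayed space allocation and delayed writes for writes. The lower-level scheduler is combined with the block I/O scheduler in the Linux I/O subsystem and jointly schedules prioritized real-time loads and non-real-time loads, taking into account three factors: the deadline of the I/O request, the head positioning time, and the priority. The OSD's QoS scheduling module (OIS) was compared with other schedulers in Linux. The results show that for real-time reads the delay jitter caused by OIS is at least one order of magnitude smaller than under the other schedulers, and for writes at least two orders of magnitude smaller.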
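The two ways of triggering a method object described in the abstract (by client request, or by a policy when a condition holds) can be illustrated with a small sketch. It is only a toy under stated assumptions: `ToyOSD`, `StoredObject`, the `writes` attribute and the two example methods are hypothetical names invented here, and the thesis's own mechanism extends the T10 OSD command set under Linux rather than using Python callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class StoredObject:
    """A user object: payload plus descriptive attributes."""
    data: bytes
    attributes: Dict[str, int] = field(default_factory=dict)


class ToyOSD:
    """Hypothetical in-memory stand-in for an OSD with an active service layer."""

    def __init__(self) -> None:
        self.objects: Dict[int, StoredObject] = {}
        # Method objects: downloaded code kept alongside ordinary objects.
        self.methods: Dict[int, Callable[["ToyOSD", int], object]] = {}
        # Policies: (condition, method id) pairs checked after every write.
        self.policies: List[Tuple[Callable[[StoredObject], bool], int]] = []
        self.log: List[str] = []

    def write(self, oid: int, data: bytes) -> None:
        obj = self.objects.setdefault(oid, StoredObject(b""))
        obj.data = data
        obj.attributes["size"] = len(data)
        obj.attributes["writes"] = obj.attributes.get("writes", 0) + 1
        # Condition-driven path: run every method whose condition now holds.
        for condition, mid in self.policies:
            if condition(obj):
                self.methods[mid](self, oid)

    def execute(self, mid: int, oid: int) -> object:
        """Request-driven path: a client explicitly invokes a method object."""
        return self.methods[mid](self, oid)


def count_lines(osd: ToyOSD, oid: int) -> int:
    """Return a small result computed on the device instead of the raw data."""
    return osd.objects[oid].data.count(b"\n")


def flag_hot(osd: ToyOSD, oid: int) -> None:
    """A management action that runs on the device itself."""
    osd.log.append(f"object {oid} is hot")


osd = ToyOSD()
osd.methods[100] = count_lines
osd.methods[101] = flag_hot
# Assumed policy: fire flag_hot on the write that brings the count to three.
osd.policies.append((lambda o: o.attributes["writes"] == 3, 101))

for _ in range(3):
    osd.write(1, b"a\nb\nc\n")

line_count = osd.execute(100, 1)
```

Here `count_lines` is loosely in the spirit of a method that reduces data before it crosses the network, and `flag_hot` of a management task moved onto the device; neither is claimed to match the thesis's definitions of filter-type and service-type methods.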

【Abstract】 With the development of computers and the Internet, new trends have emerged in network storage: the amount of data is growing ever faster, storage management needs more intelligence, and cross-platform data sharing is becoming more important. Object-Based Storage (OBS) is an emerging standard designed to address these problems. By using an object-based interface, OBS overcomes the limitations of block-based and file-based interfaces and improves security, data sharing, scalability and performance. The unit of access in OBS is the object. An object contains not only user data but also attributes describing the characteristics of the data. The object-based interface is more expressive than other interfaces because it can carry attributes between clients and devices.

At the same time, more and more processing capability is being embedded in devices, and more functions are being migrated to them (active storage). However, today's devices are usually "dumb": they respond passively to outside requests and are managed externally without taking part in storage management. To make management more efficient, devices should become more intelligent, self-managed and application-aware, and active storage technology makes this possible. Moreover, disks are still slow devices, and delivering data across a WAN still incurs non-negligible delay. To avoid this, data should be processed inside the storage device using downloaded code. With the emergence of the object-based interface, active storage can take advantage of the expressive interface to achieve cooperation between applications and devices. The Object-Based Active Storage System (OBASS) combines OBS and active storage and will be the next wave in the network storage field.

OBASS is built on objects, and all information is represented as objects. Objects are stored in Object-Based Storage Devices (OSDs), which are the building blocks of OBASS. The software in an OSD is composed mainly of three layers: the object layer, the active service layer and the QoS layer. The object layer is responsible for organizing object data on disks. The active service layer implements active storage: applications can download code to devices, where it is saved as special objects (method objects), and the active service layer schedules these methods and puts them into execution. The QoS layer controls all software layers to satisfy the QoS demands of all kinds of users under various workloads.

In the object layer, data organization within one disk is implemented by the Long-lived Object-Based File System (LOBFS). One goal of LOBFS is persistent performance, that is, performance does not decline after a long period during which a large number of operations (object create, delete, write and read) are performed. Persistent performance is achieved through flexible free space management and gradual space allocation with variable allocation granularity.

The other data organization problem in the object layer is the assignment of objects among disks (object placement). Different applications adopt different placement policies. Two policies, based on a blocking probability model and a response time model, are presented for streaming media and transaction processing applications respectively. In OBASS, the object placement policy can be used in an online setting. To handle the online case without a priori knowledge of workload parameters, the policy employs an adaptive mechanism that estimates the characteristics of the workload using device intelligence and object attributes.

In the active service layer, the T10 OSD protocol is modified to support active storage, which includes defining new object types such as the method object. A method object can be executed in two ways: request-driven and condition-driven. Combining these two approaches, the active service layer establishes a uniform framework for offloading computing tasks and management tasks to OSDs. Method execution is the key to active storage, and an implementation in Linux is described for two kinds of methods: filter-type and service-type. To evaluate the effect of active storage, two applications are investigated: the OSD File System and load balancing. In the first application, two operations (lookup and unlink) are offloaded to the OSD as filter-type methods, and the analysis shows that network delay and memory copies can be greatly reduced. In the second application, the load balancing algorithm is executed as a service-type method in the OSD, and experiments show that the system response time can be sharply reduced with object replication and migration.

Storage QoS control is achieved through three layers. The upper layer (object level) schedules object requests. The middle layer (object-block level) implements QoS through object prefetching, page preallocation, delayed space allocation and delayed writing. The lower layer (block level) schedules block requests for both real-time and non-real-time workloads by considering three factors: deadline, disk head position and I/O priority. The tests show that jitter is at least one order of magnitude (for real-time reads) or two orders of magnitude (for real-time writes) lower than under other schedulers.
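To make the three scheduling factors named above (deadline, disk head position, I/O priority) concrete, here is a minimal sketch of one way a block-level scheduler could weigh them. It is an illustration under assumptions, not the OIS algorithm from the thesis: the `Request` fields, the `urgent_window` threshold, the tie-breaking order and the fixed `service_time` are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    sector: int
    priority: int = 0                 # larger means more important (assumed)
    deadline: Optional[float] = None  # None marks a non-real-time request


def pick_next(queue: List[Request], head: int, now: float,
              urgent_window: float = 5.0) -> Request:
    """Remove and return the next request to serve."""
    urgent = [r for r in queue
              if r.deadline is not None and r.deadline - now <= urgent_window]
    if urgent:
        # Highest priority first, then earliest deadline, then shortest seek.
        chosen = min(urgent, key=lambda r: (-r.priority, r.deadline,
                                            abs(r.sector - head)))
    else:
        # Nothing is close to its deadline, so minimise head movement.
        chosen = min(queue, key=lambda r: abs(r.sector - head))
    queue.remove(chosen)
    return chosen


def serve_all(queue: List[Request], head: int, now: float,
              service_time: float = 1.0) -> List[int]:
    """Serve every request and return the sectors in service order."""
    order = []
    while queue:
        request = pick_next(queue, head, now)
        order.append(request.sector)
        head = request.sector
        now += service_time  # assumed constant cost per request
    return order


order = serve_all(
    [Request(900, priority=1, deadline=3.0),
     Request(110),
     Request(500, priority=2, deadline=50.0)],
    head=100, now=0.0)
```

In this run the request at sector 900 is served first even though it is farthest from the head, because its deadline is about to expire. Once no deadline is pressing, the sketch falls back to shortest-seek order, which is the kind of trade-off between real-time and non-real-time requests that the abstract describes.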
