MapReduce计算任务调度的资源配置优化研究

Researches on Optimization of Resource Allocation for MapReduce Scheduling

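【Author】 Han Haiwen (韩海雯)

【Advisor】 Qi Deyu (齐德昱)

【Author Information】 South China University of Technology (华南理工大学), Computer Application Technology, 2013, Ph.D.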

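【Summary】 As job density and data volume on big data processing platforms keep growing, the platforms' resource scale expands with them. Given the intricate serial and parallel execution of big data computation jobs and their concurrent scheduling, the way platform resources are configured directly determines how much workload the platform can carry. Existing big data processing techniques, built around data-parallel programming models, concentrate on parallelizing the resources used during job scheduling and execution and on the associated scalability. Resource configuration optimization driven by the differing resource demands of different users and different computation jobs has not yet been studied in depth. Resource configuration optimization for big data processing platforms is an important research area that has emerged with the growth of big data applications, and related work is still at an early stage. Targeting this gap and focusing on the emerging MapReduce big data processing framework, this thesis analyzes the characteristics of big data processing and the scheduling and execution of MapReduce computation jobs comprehensively and in depth, and proposes a systematic solution for resource configuration optimization. The solution optimizes platform resource configuration at two levels, the vertical serial execution of a single computation job and the horizontal concurrent scheduling of multiple computation jobs, with the ultimate aim of improving platform resource utilization and strengthening the platform's workload capacity. The main research work and contributions are summarized as follows:

1. Starting from the pronounced dynamic nature of big data processing, and in order to build an adaptive framework for resource configuration optimization, the thesis proposes the concept of a computation job execution profile, which characterizes the workload of a big data computation job. On this basis, drawing on the working principles and mechanisms of the MapReduce programming model and its supporting system, the actual structure and constituent fields of the MapReduce job execution profile are designed and constructed in detail. Further, a non-intrusive dynamic probe program is developed with BTrace to observe the actual execution of MapReduce jobs at fine granularity in real time and to generate concrete profile values.

2. Based on the MapReduce job execution profile, and at the vertical level of a single MapReduce job's serial execution, the thesis proposes an adaptive, dynamic self-tuning method for resource configuration, the Profile-Predict-Optimize (PPO) method, and builds the corresponding MapReduce job performance prediction model and performance optimization model in turn. The prediction model combines white-box analysis, based on a known job execution profile and a hypothetical resource configuration plan, with black-box estimation based on decision tree learning, among other techniques, to predict and estimate job execution performance. The optimization model then uses subspace decomposition and recursive random search to search the large, high-dimensional space of resource configuration plans efficiently, compares candidates against the user's optimization objective and the corresponding constraints, and returns the best plan. Experimental evaluation shows that, with the extra overhead of running the probe program, the prediction model overestimates job execution time by 15.1% on average, yet it can still clearly and effectively identify the configuration parameter values that lead to good optimization results. Compared with the commonly used rule-of-thumb approach, the optimization model raises the average improvement in job execution time by 42% and the maximum improvement by 25.7% when multiple computation jobs run concurrently.

3. Based on the job execution profile and the job performance prediction model, and at the horizontal level of concurrent scheduling of multiple MapReduce jobs, the thesis proposes an adaptive, resource-aware dynamic concurrent scheduling method, the Resource-aware Dynamic Scheduler (RDS), and designs and develops an RDS scheduler prototype accordingly. The RDS scheduler takes into account, during concurrent scheduling, the different completion-quality requirements of jobs from multiple users. For multiple MapReduce jobs that arrive dynamically and at random, it senses the latest state of system resource usage through a resource placement matrix, builds a job utility evaluation model from the users' completion-quality requirements, takes maximization of total job utility as the scheduling objective, and continually updates the allocation of resources to jobs on each processing node. In this way it both meets the completion-quality requirements of the platform's users and improves overall platform resource utilization. Evaluation results show that the RDS scheduler can dynamically adjust how platform resources are divided among concurrently executing jobs. Under both relaxed and tight completion-time goals it outperforms the fair scheduler provided by Hadoop, reducing job execution time by 5-100% in comparison.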
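The optimization step of the PPO method is described as a search over a large configuration space guided by a performance prediction model. The sketch below illustrates the general idea with a simplified form of recursive random search: uniform sampling over the whole space, followed by resampling inside a box that shrinks around the best point found. It is not the thesis's implementation. The parameter names `sort_buffer_mb` and `reduce_tasks`, the function `toy_predicted_runtime`, and all numeric settings are hypothetical placeholders, and subspace decomposition is omitted.

```python
import random


def recursive_random_search(cost, bounds, n_explore=30, n_exploit=20,
                            shrink=0.5, rounds=4, seed=0):
    """Simplified recursive random search: sample globally, then resample
    inside a box that shrinks around the best configuration found so far."""
    rng = random.Random(seed)

    def sample(box):
        return {name: rng.uniform(lo, hi) for name, (lo, hi) in box.items()}

    # Exploration: uniform samples over the whole configuration space.
    best_cfg, best_cost = None, float("inf")
    for _ in range(n_explore):
        cfg = sample(bounds)
        c = cost(cfg)
        if c < best_cost:
            best_cfg, best_cost = cfg, c

    # Exploitation: shrink the sampling box around the incumbent and resample.
    box = dict(bounds)
    for _ in range(rounds):
        shrunk = {}
        for name, (lo, hi) in box.items():
            half = (hi - lo) * shrink / 2.0
            g_lo, g_hi = bounds[name]
            # Clip the shrunken box to the original bounds.
            shrunk[name] = (max(g_lo, best_cfg[name] - half),
                            min(g_hi, best_cfg[name] + half))
        box = shrunk
        for _ in range(n_exploit):
            cfg = sample(box)
            c = cost(cfg)
            if c < best_cost:
                best_cfg, best_cost = cfg, c
    return best_cfg, best_cost


def toy_predicted_runtime(cfg):
    """Hypothetical stand-in for a job performance prediction model (seconds)."""
    return ((cfg["sort_buffer_mb"] - 200.0) ** 2 / 100.0
            + (cfg["reduce_tasks"] - 16.0) ** 2
            + 60.0)


# Hypothetical configuration space; both parameters are treated as continuous.
BOUNDS = {"sort_buffer_mb": (50.0, 500.0), "reduce_tasks": (1.0, 64.0)}

best_cfg, best_cost = recursive_random_search(toy_predicted_runtime, BOUNDS)
print(best_cfg, best_cost)
```

In a real setting the cost function would be the predicted job execution time for a given configuration plan, and integer-valued parameters would need rounding before evaluation.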

【Abstract】 Job frequency and data density in big data processing platforms increase continuously, and platform resources grow with them. To give a big data processing platform good carrying capacity, it is important to allocate platform resources properly among big data computation jobs throughout their complicated execution and concurrent scheduling. Existing research on big data processing built on data-oriented parallel programming models pays more attention to implementing parallel job execution than to the differing resource demands of different users and of different job execution processes. This leaves a large opportunity to improve resource utilization and business carrying capacity by optimizing resource allocation among different computation jobs and different job execution processes.

Resource allocation optimization for big data processing platforms is a new research area driven by the development of big data applications, and related work is still scarce. Targeting this gap, the thesis proposes a complete resource allocation optimization model for the emerging MapReduce big data processing framework, based on an in-depth study of resource allocation during vertical MapReduce job execution and horizontal concurrent scheduling of multiple jobs. The model extends the existing MapReduce programming model and its supporting system by optimizing resource allocation at both levels, with the goal of improving resource utilization and strengthening the business carrying capacity of the platform.

Specifically, the main contributions of this study are as follows:

1. A new concept, the computation job execution profile, is proposed to give the system the ability to adapt to the dynamic nature of big data processing. Based on a comprehensive study of the detailed mechanisms of the MapReduce programming model and its supporting system, the structure and constituent fields of the profile are defined according to the fine-grained execution phases of a MapReduce job. A non-invasive dynamic probe program is then designed and developed with BTrace to trace a MapReduce job while it runs, collect detailed execution information at fine granularity in real time, and compute the concrete value of each profile field.

2. At the vertical level of single-job execution, a new adaptive dynamic auto-tuning method is proposed. It consists of three phases, job execution profiling, job performance prediction, and job performance optimization (Profile-Predict-Optimize, PPO), and comes with a corresponding MapReduce job performance prediction model and performance optimization model. The prediction model predicts the performance of a MapReduce job from a given job execution profile and a given resource allocation plan. Using the prediction model, the optimization model finds the best resource allocation plan by searching the space of allocation plans efficiently according to the user's optimization objective. The experimental results show that the prediction model can clearly and effectively identify the better configuration values, although it overestimates job execution time by 15.1% on average because of the probe overhead. On this basis, the optimization model raises the average improvement in job execution time by 42% and the maximum improvement by 25.7% over the commonly used rule-of-thumb methods when multiple computation jobs run concurrently.

3. A new adaptive resource-aware dynamic scheduler (Resource-aware Dynamic Scheduler, RDS) for the concurrent scheduling of multiple jobs is proposed and constructed. RDS senses the current resource usage through a resource placement matrix, evaluates jobs with a utility model based on user QoS requirements, and continually updates the resources assigned to jobs on each processor node so as to maximize total job utility. In this way it both satisfies users' differing completion requirements and improves resource utilization. The comprehensive evaluation results show that the RDS scheduler can dynamically adjust the allocation of platform resources among concurrently running jobs. Under both relaxed and tight completion-time goals it outperforms Hadoop's fair scheduler, reducing the completion time of the jobs by 5-100%.
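The scheduling idea behind RDS, assigning node resources to jobs so that total utility is maximized, can be illustrated with a small greedy sketch. This is only an assumption-laden toy and not the RDS algorithm itself: the linear speed-up model in `predicted_time`, the utility function, the job data, and the per-slot greedy rule are all hypothetical choices made for illustration.

```python
def predicted_time(work, slots):
    """Hypothetical prediction: completion time scales inversely with slots."""
    return float("inf") if slots == 0 else work / slots


def utility(goal, completion_time):
    """Hypothetical utility: 1.0 when the completion-time goal is met,
    decreasing towards 0.0 as the job runs further past its goal."""
    if completion_time == float("inf"):
        return 0.0
    return min(1.0, goal / completion_time)


def allocate(jobs, node_slots):
    """Greedy sketch: hand out one slot at a time to the job whose utility
    gains the most. Returns a placement matrix {job: [slots per node]}."""
    placement = {name: [0] * len(node_slots) for name in jobs}
    free = list(node_slots)
    for node in range(len(free)):
        while free[node] > 0:
            best_job, best_gain = None, 0.0
            for name, (work, goal) in jobs.items():
                held = sum(placement[name])
                gain = (utility(goal, predicted_time(work, held + 1))
                        - utility(goal, predicted_time(work, held)))
                if gain > best_gain:
                    best_job, best_gain = name, gain
            if best_job is None:
                break  # no job benefits from another slot
            placement[best_job][node] += 1
            free[node] -= 1
    return placement


# Hypothetical jobs: name -> (work units, completion-time goal in seconds).
JOBS = {"a": (100, 50), "b": (60, 20)}
NODE_SLOTS = [2, 2, 2]  # hypothetical free slots on three nodes

placement = allocate(JOBS, NODE_SLOTS)
print(placement)
```

With these toy numbers, job `a` needs two slots and job `b` needs three to meet their goals, so one slot is left unassigned. A scheduler of the kind described would rerun such an allocation whenever jobs arrive or finish.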
