网格服务可靠性建模及任务调度优化研究

Research on Grid Service Reliability Modeling and Task Scheduling Optimization

【Author】 郭夙昌 (Guo Suchang)

【Advisor】 黄洪钟 (Huang Hongzhong)

【Author Information】 University of Electronic Science and Technology of China, Mechatronic Engineering, 2010, PhD

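【Abstract (Chinese version, translated)】 As the functions and structures of manufactured products become more complex, product design requires ever more computing and storage capacity, and a single computer can no longer meet the needs of modern product design. The emergence of grid technology allows people to obtain more powerful computing and data storage capacity through the Internet and also enables multiple computers to work cooperatively. With grid technology, many problems in product design that previously seemed impossible to solve can be readily resolved. However, because of the complexity of grid systems, the grid still faces many reliability problems, so grid systems cannot yet truly penetrate the manufacturing field and ultimately bring about a major change in the mode of the whole industry. As an important attribute for measuring grid quality of service, grid service reliability reflects, from the user's perspective, the ability of a grid system to provide services, so how to analyze and improve grid service reliability has attracted great attention from many researchers at home and abroad. This dissertation combines grid fault-tolerance techniques with reliability analysis methods to study grid service reliability modeling under fault-tolerance mechanisms and task scheduling optimization that takes grid service reliability into account, as a foundational exploration toward fundamentally improving the reliability of grid systems. First, a node failure recovery mechanism is introduced, and grid service reliability modeling that considers failure recovery and local task arrivals is studied. Second, based on the model established, grid task scheduling optimization methods centered on grid service reliability are studied. Finally, for the resource pricing problem in grid task scheduling, the concept of grid resource compensation is proposed and a resource pricing approach under a market mechanism is formed, creating favorable conditions for the grid to enter social production and daily life. The main results of this dissertation are as follows:

(1) Grid service reliability modeling considering failure recovery. To improve grid service reliability, a local failure recovery mechanism is introduced and, with the effect of software failures also taken into account, a grid service reliability model that considers the failure recovery capability of nodes is given. To make the model more practical, resource owners are allowed to adjust the number of failure recoveries and the grid task lifetime according to the state of their resources; on this basis, grid service reliability modeling under failure recovery constraints is studied. The model offers an effective way to address the low reliability of "large grid services".

(2) Grid task reliability of manufacturing resources in the manufacturing grid. Manufacturing resources are autonomous, heterogeneous, and dynamic. Besides completing the tasks assigned by the manufacturing grid, they are also responsible for the work tasks of their local administrative domain. Especially under a local-task-priority policy, factors such as local task arrivals and failures can seriously affect whether a resource completes a grid task within the specified time. To address this, Petri nets are used to analyze the states of the grid task execution process. On this basis, the grid task reliability under the local-task-priority policy is obtained by Monte Carlo simulation, and the influence of factors such as the local task arrival rate and the local task execution rate on grid task reliability is analyzed, providing a basis for the grid resource management system to schedule grid tasks better.

(3) Grid task scheduling optimization under the failure recovery mechanism. Based on the grid service reliability model established, the multi-objective task scheduling optimization problem of maximizing grid service reliability and minimizing execution cost is studied, and an ant colony algorithm is used to solve the model. To improve grid service reliability, a redundant scheduling mode for grid tasks is adopted and a redundant scheduling optimization model under a cost constraint is established. A genetic algorithm is used to solve it, with dedicated correction factors designed for the resource constraint problem to ensure the algorithm runs properly. Simulation results verify the effectiveness of the algorithm.

(4) Grid resource compensation under market value. Grid resource management under a market mechanism is an important means of addressing the shortage of grid resources. An in-depth analysis of why grid resources are currently scarce leads to the conclusion that grid users must not only pay a certain resource cost but also compensate resources for the loss of local task execution incurred by executing grid tasks. The average resource revenue under two scheduling policies is analyzed and, based on the microeconomic concept of opportunity cost, an expression for the minimum resource compensation under a grid task time limit is given; on this basis, a resource pricing model under a market mechanism is proposed. Monte Carlo simulation is used to simulate the expected income of the two task scheduling policies, yielding concrete values of the minimum resource compensation, and the influence of grid task and resource characteristics on the minimum compensation is analyzed. Grid resource compensation gives resource owners an incentive to join the grid and thus attracts more resources to it.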

【Abstract】 As the functions and structures of manufactured products grow more complex, product design demands more computing and storage capacity than a single computer can provide. Grid technology lets people obtain more powerful computing and data storage capacity over the Internet and lets multiple computers work together. It can tackle large-scale, difficult problems that could not feasibly be solved with the computing resources of a single organization. However, because grid systems are complex, many problems remain unsolved, reliability among them, so the grid has not yet been widely adopted in the manufacturing industry or brought about major changes in the industry's mode of operation. As one of the important measures of quality of service, grid service reliability reflects, from the user's point of view, the capacity of a grid to provide reliable services. How to analyze and improve it has attracted considerable research attention. This dissertation combines grid service reliability analysis with fault tolerance to study grid service reliability modeling and the reliability-oriented optimization of grid task scheduling, as groundwork for fundamentally improving grid service reliability. First, a fault recovery mechanism in grid nodes is introduced, and grid service reliability modeling that considers fault recovery and local task arrivals is studied. Second, based on the proposed model, an optimization model of grid task scheduling is presented to maximize grid service reliability. Finally, for the crucial problem of resource pricing in grid task scheduling, a fair pricing model for a market-oriented environment is presented, which can help the grid move into wider use in society.

The contributions of this dissertation are summarized as follows:

(1) Grid service reliability model with fault recovery. To improve grid service reliability, a fault recovery mechanism in local grid nodes is introduced. A grid service reliability model is proposed that considers both fault recovery and software failure. To make fault recovery more practical, constraints on the lifetimes of subtasks and on the number of recoveries performed are introduced, and grid service reliability models under these constraints are developed. The proposed models offer an efficient way to address the low reliability of time-consuming tasks.

(2) Grid task reliability model in the manufacturing grid. In a manufacturing grid, manufacturing resources are characterized by autonomy, heterogeneity, and dynamism. They handle tasks not only from the manufacturing grid system but also from their local administrative domain. Especially under a local-task-priority strategy, local task arrivals and failures during grid task execution strongly affect whether a grid task is completed reliably within a specified time. To address this problem, a Petri net based state analysis of manufacturing resources is given to describe the complex process of grid task execution on those resources. Grid task reliability under the local-task-priority strategy is then obtained by Monte Carlo simulation. Furthermore, the influence of the local task arrival rate and the local task execution rate on grid task reliability is analyzed. The results can inform grid resource management and so improve grid task scheduling.

(3) Optimal redundant scheduling of grid tasks based on fault recovery. Based on the proposed grid reliability model, a multi-objective task scheduling optimization model that minimizes cost and maximizes reliability is presented, and an ant colony optimization algorithm is developed to solve it. Furthermore, to improve grid service reliability, a redundant scheduling strategy for grid tasks is used, and an optimization model with a cost constraint is presented to maximize grid service reliability. A genetic algorithm is developed to solve it, with repair operators designed to adjust infeasible solutions in the chromosomes so that the algorithm works properly. A numerical example is given to show the efficiency of the algorithm.

(4) Analysis of grid resource compensation in a market-oriented environment. Market-oriented grid resource management is an efficient way to cope with the scarcity of resources in a grid system. An in-depth analysis of this scarcity shows that grid users should pay resource owners not only for the resources consumed but also for the loss of local task execution. Based on an analysis of the expected incomes of two priority strategies for grid resources, the minimal compensation that grid users should pay to resource owners is determined using the concept of opportunity cost. On this basis, a variable price model is presented to ensure a fair market environment. An evaluation approach based on Monte Carlo simulation is given to calculate the minimal compensation. Furthermore, the influence of the attributes of grid tasks and grid resources on the minimal compensation is studied. The research can give resource owners an incentive and attract more resources on the Internet to participate in the grid.
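To make the Monte Carlo evaluation mentioned in contribution (2) more concrete, the following is a minimal sketch of how grid task reliability under a local-task-priority strategy might be estimated. It is not the dissertation's model: the Petri net state analysis is not reproduced, and the sketch assumes Poisson local task arrivals, exponentially distributed local service times, preemptive-resume priority for local tasks, a resource with no local task present at the start, and a failure during grid execution that loses the task with no recovery. All function and parameter names are hypothetical.

```python
import random


def simulate_once(rng, work, deadline, arrival_rate, service_rate, failure_rate):
    """Return True if the grid task finishes within the deadline in one run.

    Assumed model (illustrative only): local tasks arrive as a Poisson
    process, have exponential service times, and always preempt the grid
    task, which resumes where it stopped once the local queue is empty.
    """
    # Assumption: time to failure is exponential in grid execution time,
    # and a failure before the work is done loses the task (no recovery).
    if failure_rate > 0 and rng.expovariate(failure_rate) < work:
        return False

    clock = 0.0        # elapsed wall-clock time
    remaining = work   # grid work still to be executed
    while True:
        # No local task is present: the grid task runs until the next arrival.
        idle = rng.expovariate(arrival_rate) if arrival_rate > 0 else float("inf")
        if idle >= remaining:
            return clock + remaining <= deadline
        clock += idle
        remaining -= idle

        # Busy period: local tasks hold the resource until their queue empties.
        queue = 1
        total_rate = arrival_rate + service_rate
        while queue > 0:
            clock += rng.expovariate(total_rate)
            if rng.random() < arrival_rate / total_rate:
                queue += 1   # another local task arrived
            else:
                queue -= 1   # a local task finished
            if clock > deadline:
                return False


def estimate_reliability(runs, seed=0, **params):
    """Estimate P(grid task completes by the deadline) as a success fraction."""
    rng = random.Random(seed)
    successes = sum(simulate_once(rng, **params) for _ in range(runs))
    return successes / runs


if __name__ == "__main__":
    # Sweep the local task arrival rate to see its effect on reliability.
    for rate in (0.1, 0.5, 1.0, 2.0):
        r = estimate_reliability(
            5000, work=5.0, deadline=8.0,
            arrival_rate=rate, service_rate=2.0, failure_rate=0.01,
        )
        print(f"arrival_rate={rate}: estimated reliability {r:.3f}")
```

Sweeping `arrival_rate` or `service_rate` in this way mirrors the kind of sensitivity analysis the abstract describes, although the figures it produces depend entirely on the assumed distributions and parameter values.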
