高可靠实时多阶段系统可靠性分析

Reliability Analysis of Dependable Real-Time Multiple-Phase Systems

【Author】 Mo Yuchang (莫毓昌)

【Advisors】 Daniel P. Siewiorek; Yang Xiaozong (杨孝宗);

【Author information】 Harbin Institute of Technology, Computer Architecture, 2008, PhD

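【Abstract (translated from the Chinese)】 With the rapid development of microelectronics, information technology, and manufacturing technology, a class of complex systems has gradually emerged in critical application domains such as space launch, nuclear power control, weapons and equipment, space exploration, telecommunication switching, traffic control, and medical devices. These systems have stringent real-time and reliability requirements; their operation is clearly divided into phases; and they commonly rely on distributed redundancy, so their structure changes dynamically. This thesis refers to such systems collectively as DEpendable Real-time Multiple-phased Systems (DERMS). An important step in developing a DERMS is to analyze whether a given system design meets its specified reliability requirements, that is, DERMS reliability analysis. Researchers have proposed two approaches to DERMS reliability analysis: fault tree analysis for static DERMS and state-space analysis for dynamic DERMS. A review of the state of the art shows that the existing work is neither mature nor systematic, most notably in two respects: 1) existing BDD-based fault tree analysis methods generate the system BDD inefficiently; 2) existing state-space analysis methods based on stochastic processes consider only simple, small-scale dynamic DERMS and do not address large-scale dynamic DERMS with complex behavior. This thesis studies both problems in depth. For the BDD variable ordering problem of the combined system fault tree of a static DERMS, the thesis presents four classes of strategies (PDO ordering, combined ordering, general ordering, and improved general ordering), analyzes and compares their performance in detail, and finally gives a strategy selection method based on branch-and-bound search. Experimental data from 16 test samples show that the backward PDO ordering method commonly used in related work is the best of the 10 candidate strategies with probability only 7/16, and that no ordering method is the best of the 10 candidates with probability greater than 5/8, so the branch-and-bound strategy selection method given here is a practical one. For the BDD generation algorithm of the combined system fault tree of a static DERMS, the thesis starts from the PDO operation and the PDOCombine algorithm proposed by Zhang and successively improves the original PDOCombine algorithm in two respects, widening the range of usable variable orderings and improving run-time performance, to obtain the QuickPDOCombine algorithm, which admits a wider range of variable orderings and performs better. For example, analysis of a fairly complex static DERMS instance shows that, although the final system BDD has 155 nodes, the existing algorithm takes nearly 2 minutes and must be able to store an intermediate BDD with 358,575 nodes, whereas the algorithm given here takes only 1.2 seconds and requires storage for only 155 BDD nodes. For phase reliability analysis of dynamic DERMS, the thesis focuses on reliability analysis methods for shared-repair subsystems, independent-repair subsystems, and non-repairable redundant subsystems. Existing work analyzes shared-repair subsystems with a Markov regenerative process (MRGP); building on that work, this thesis simplifies the derivation and computation of the kernel matrices. Existing work on independent-repair subsystems uses Markov process analysis that assumes exponentially distributed repair activities; because real repair activities are usually deterministic or generally distributed, this thesis derives the corresponding combinatorial analysis formulas. The combinatorial nature of these formulas effectively alleviates the state explosion problem of state-space analysis. For system reliability analysis of dynamic DERMS, the thesis introduces the branching matrix as the glue between phases and, under the assumption that phase transitions are memoryless, derives a general formula for system mission reliability and proposes two solution methods: an analytical solution and a numerical convolution-integral solution. Case studies show that the mission reliability analysis method can be used to verify and evaluate designs during custom system development and to guide mission design and selection during general-purpose system development. In summary, this thesis addresses DERMS reliability analysis by systematically studying the shortcomings of existing BDD-based fault tree analysis methods and stochastic-process-based state-space analysis methods, and it proposes new efficient algorithms and a unified analysis framework so that both methods can be applied effectively to the reliability analysis of real, large-scale, complex DERMS.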
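The effect of variable ordering on BDD size, and the idea of abandoning a bad ordering early, can be illustrated with a small Python sketch. It is not the thesis's algorithm: the abstract does not specify the PDO heuristics or the exact branch-and-bound procedure, so the names `bdd_size`, `select_ordering`, `BudgetExceeded`, `pairs_or`, and `CANDIDATES` are hypothetical, and the toy function (an OR of three AND pairs) merely stands in for a fault tree. The builder uses brute-force Shannon expansion, which is exponential in the number of variables and only suitable for toy inputs.

```python
class BudgetExceeded(Exception):
    """Raised when a BDD under construction outgrows the current bound."""


def bdd_size(func, order, budget=None):
    """Return the number of internal nodes of the reduced ordered BDD of
    `func` under the variable order `order`.

    Brute-force Shannon expansion over all assignments (toy inputs only).
    If `budget` is given, construction aborts as soon as more than
    `budget` internal nodes would be needed.
    """
    unique = {}  # (var, low, high) -> node id; ids 0 and 1 are the terminals

    def build(level, assignment):
        if level == len(order):
            return int(bool(func(assignment)))
        var = order[level]
        low = build(level + 1, {**assignment, var: False})
        high = build(level + 1, {**assignment, var: True})
        if low == high:        # redundant test: no node needed
            return low
        key = (var, low, high)
        if key not in unique:  # share isomorphic subgraphs
            if budget is not None and len(unique) >= budget:
                raise BudgetExceeded
            unique[key] = len(unique) + 2
        return unique[key]

    build(0, {})
    return len(unique)


def select_ordering(func, candidates):
    """Pick the candidate ordering with the smallest BDD, using the best
    size found so far as a bound that cuts off worse candidates early."""
    best_name, best_size = None, None
    for name, order in candidates.items():
        try:
            size = bdd_size(func, order, budget=best_size)
        except BudgetExceeded:
            continue  # this ordering is already worse than the incumbent
        if best_size is None or size < best_size:
            best_name, best_size = name, size
    return best_name, best_size


def pairs_or(v):
    """Toy stand-in for a fault tree: OR of three two-input AND gates."""
    return (v["a1"] and v["b1"]) or (v["a2"] and v["b2"]) or (v["a3"] and v["b3"])


# Two hypothetical candidate orderings (not the thesis's heuristics).
CANDIDATES = {
    "interleaved": ["a1", "b1", "a2", "b2", "a3", "b3"],
    "separated": ["a1", "a2", "a3", "b1", "b2", "b3"],
}

if __name__ == "__main__":
    for name, order in CANDIDATES.items():
        print(name, bdd_size(pairs_or, order))
    print(select_ordering(pairs_or, CANDIDATES))
```

For this toy function the interleaved ordering needs 6 internal nodes and the separated ordering needs 14, so once the interleaved result is known the separated construction is cut off at its seventh node. A practical selector would apply the same bound to a real BDD package rather than to exhaustive enumeration.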

【Abstract】 A DEpendable Real-time Multiple-phased System (DERMS) is a system that operates through multiple consecutive, non-overlapping phases. During each phase it has to accomplish a specified task, so the system configuration, failure criteria, and failure behavior can change from phase to phase. Many DERMS instances are deployed in critical applications, which makes their reliability analysis an issue of primary relevance that has been widely investigated. Most existing work is based either on combinatorial models or on state-space-oriented models. Our research identifies two main problems in the analysis methods proposed in the literature: 1) for BDD-based fault tree analysis of static DERMS, efficient generation of the system BDD is the key issue, and the existing methods are too inefficient to be used for industrial DERMS instances; and 2) to obtain a cost-effective analytical solution, state-space-oriented methods introduce many simplifying modeling and analytical assumptions, which make them inapplicable to real dynamic DERMS instances. This thesis addresses both problems.

Variable ordering for the static DERMS fault tree is critical to the BDD method. The ordering schemes proposed in the literature have various deficiencies in applicability and performance. To address these weaknesses, this thesis builds an ordering heuristic library based on a classification of heuristics. It includes the PDO ordering heuristic, the combining ordering heuristic, the simple ordering heuristic, and the revised simple ordering heuristic. A heuristic selection method based on the branch-and-bound technique is also presented to avoid the intensive computation incurred by some extremely bad ordering heuristics.
The candidate set consists of 10 heuristics. On a test set of 16 DERMS instances, the widely used backward PDO ordering heuristic has only a 7/16 chance of being the best of the 10, and no ordering heuristic has more than a 5/8 chance of being the best, so the presented heuristic selection method is of practical value. For system BDD generation from the static DERMS fault tree, this thesis starts from the PDO operation and the PDOCombine algorithm presented by Zhang and improves the original algorithm to obtain the QuickPDOCombine algorithm, which is applicable to many more DERMS instances and performs better. The test data show that, for a fairly complex static DERMS whose final system BDD has 155 nodes, the existing algorithm needs nearly 2 minutes of computation time and storage for an intermediate BDD of 358,575 nodes, whereas our algorithm needs only 1.2 seconds and storage for 155 nodes.

For single-phase reliability analysis, this thesis starts from the typical structural characteristics of dynamic DERMS and presents a two-step analysis methodology, which first analyzes independent component quorums and then derives the phase reliability from the quorum reliability results. All component quorums can be divided into three groups: shared-repair quorums, self-repair quorums, and no-repair quorums. Building on the MRGP-based analysis method for shared-repair quorums proposed in related work, this thesis improves the derivation and computation of the kernel matrices.
The existing analysis methods for self-repair quorums are based on Markov processes, which assume exponentially distributed repair; this thesis instead considers generally distributed repair activities and derives an efficient combinatorial formulation for their reliability analysis.

For system mission reliability analysis, this thesis starts from the branching matrix, which binds the single-phase results together, and from the assumption that memory is lost at phase boundaries. It then presents a general analysis formulation and proposes two specific solutions, an analytical solution and a numerical solution, chosen according to the type of phase duration and of intraphase stochastic process. With these dynamic DERMS reliability analysis methods, the design of a custom-tailored system can be verified quickly, and application scenarios can be selected for general-purpose systems.

With the algorithms and analysis solutions presented in this thesis, the reliability of many large-scale industrial DERMS can be analyzed efficiently.
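The role of a branching matrix in chaining single-phase results can be sketched in Python for the simplest possible case. This is an illustration only, not the thesis's general formulation: it assumes fixed phase durations, exponentially distributed failures (a Markov process inside each phase), and no memory carried across phase boundaries. The functions `vec_mat`, `transient`, and `mission_reliability`, the two-component example, and all numeric values are hypothetical.

```python
import math


def vec_mat(v, m):
    """Row vector times matrix, using plain lists."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def transient(pi0, q, t, terms=60):
    """State probabilities pi0 * exp(q * t) via a truncated Taylor series.

    Adequate only when the entries of q * t are small, as in this example.
    """
    result = list(pi0)
    term = list(pi0)
    for k in range(1, terms):
        term = [x * t / k for x in vec_mat(term, q)]
        result = [r + x for r, x in zip(result, term)]
    return result


def mission_reliability(pi0, phases, up_states_last):
    """Chain the phases: evolve within a phase, then map the end-of-phase
    state probabilities into the next phase's state space with that phase's
    branching matrix (None for the last phase)."""
    pi = list(pi0)
    for generator, duration, branching in phases:
        pi = transient(pi, generator, duration)
        if branching is not None:
            pi = vec_mat(pi, branching)
    return sum(pi[s] for s in up_states_last)


# Hypothetical system: two identical components, failure rate LAM each.
LAM, T1, T2 = 1e-3, 100.0, 50.0

# Phase 1 (either component suffices): states both-up, one-up, failed.
Q1 = [[-2 * LAM, 2 * LAM, 0.0],
      [0.0, -LAM, LAM],
      [0.0, 0.0, 0.0]]
# Phase 2 (both components required): states both-up, failed.
Q2 = [[-2 * LAM, 2 * LAM],
      [0.0, 0.0]]
# Branching matrix from phase 1 to phase 2: only both-up stays operational;
# a single latent failure becomes a system failure at the phase boundary.
B12 = [[1.0, 0.0],
       [0.0, 1.0],
       [0.0, 1.0]]

RELIABILITY = mission_reliability(
    [1.0, 0.0, 0.0],
    [(Q1, T1, B12), (Q2, T2, None)],
    up_states_last=[0],
)

if __name__ == "__main__":
    # Closed form for this example: exp(-2 * LAM * (T1 + T2)).
    print(RELIABILITY, math.exp(-2 * LAM * (T1 + T2)))
```

In this example the mission survives only if both components survive both phases, so the result can be checked against the closed form exp(-2·LAM·(T1+T2)). For larger or stiffer generators a library matrix exponential such as `scipy.linalg.expm` would be a more robust choice than the truncated series, and generally distributed phase durations or repairs fall outside what this sketch covers.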
