Research on Multi-Module Redundancy and Network Reliability in Control Systems (控制系统中多模冗余与网络可靠性研究)
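【Author】 Zhang Benhong (张本宏);
【Supervisor】 Lu Yang (陆阳);
【Author information】 Hefei University of Technology (合肥工业大学), Computer Application Technology, 2010, PhD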
【Abstract】 Two kinds of techniques are mainly used to improve system reliability and safety: fault avoidance and fault tolerance. Fault avoidance keeps faults from being introduced into the system through quality control, environmental protection, and higher component integration, and thereby improves reliability and safety. However, even the best fault-avoidance measures cannot completely prevent faults. Fault tolerance is built on resource redundancy: through fault containment, fault masking, or reconfiguration, it allows the system to produce reliable and safe outputs even in the presence of faults.

In fault-tolerant computer control systems, the most widely used structures are dual-modular comparison redundancy (DMCR), dual-modular hot standby redundancy (DMHSR), and triple-modular voting redundancy (TMVR). In recent years, a non-voting quadruple-modular redundant structure, double 2-vote-2 redundancy (D2V2R), has come into use in some safety-critical systems. This thesis studies the working principle and control strategies of D2V2R and analyzes its performance in terms of reliability, availability, and safety. It also proposes a new non-voting quadruple-modular redundant structure, comparison of double dual-module redundancy (C-DDMR), and compares its performance with that of DMHSR, TMVR, and D2V2R.

Among computer control systems, all except direct digital control systems (that is, distributed control systems, fieldbus control systems, and networked control systems) connect their processing units through a network. The reliability of such systems depends not only on the reliability of the control units but also on the reliability of the links between them and on the network topology. The thesis focuses on methods for computing the reliability and availability of multi-hop control networks and on methods for analyzing component importance.

The main contributions of the thesis are summarized as follows:
(1) A new control strategy is proposed for the D2V2R structure. Existing control strategies all assume that a subsystem stops working as soon as one of its modules fails, which gives the system high safety. If instead a subsystem keeps running when one of its modules has a detectable fault, the system may continue to work normally as long as one module in any subsystem is healthy. When fault detection coverage is high, this strategy trades a small loss of safety for a large gain in reliability.
(2) A new redundancy structure model, C-DDMR, is proposed. Like D2V2R, it consists of two subsystems. The difference is that each subsystem is internally a dual-modular hot standby structure, while the two subsystems are related by comparison. Two control strategies are given for C-DDMR, and its performance under each is studied.
(3) Using Markov processes, the performance of C-DDMR and D2V2R is compared and the range of application of each is discussed. C-DDMR is the better choice when module fault detection coverage is high and the repair rate is low; D2V2R is the better choice when fault detection coverage is low and the repair rate is high; in other cases the two structures perform about equally.
(4) A method is proposed for computing the k-terminal reliability or availability at time t of a control network whose nodes are not perfectly reliable. The basic idea is first to define each edge of the graph as a link of the control network together with its end nodes, and then to decompose the adjacency matrix of the graph recursively to obtain the disjoint edge-variable k-terminal path sets at time t. The probabilities of these path sets are then evaluated using conditional probability, which yields the k-terminal reliability or availability of the network. The method handles networks with unreliable nodes and links as well as networks with multiple links between a pair of nodes. In addition, because transformations of the adjacency matrix operate on matrix elements rather than on edges, the number of matrix operations is greatly reduced.
(5) A method is proposed for analyzing the importance of control network components based on the complete set of path sets of a graph. The concept of the complete set of path sets is introduced, its properties and construction are studied, and the procedure for computing structural importance, probability importance, and critical importance from it is given. The method evaluates all three importance measures together and quickly.
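To make the quantity in contribution (4) concrete, the following is a minimal sketch of k-terminal reliability for a small network with unreliable nodes and links. It is not the recursive adjacency-matrix decomposition proposed in the thesis, whose details the abstract does not give; it simply enumerates every up/down state, which is exponential in network size and only useful as a reference on tiny examples. The function names, the example network, and the constant-failure-rate model used to obtain component reliabilities at time t are all assumptions made for illustration.

```python
from itertools import product
from math import exp


def terminals_connected(adjacency, terminals):
    """Return True if every terminal is reachable from the first one."""
    start = next(iter(terminals))
    seen, stack = {start}, [start]
    while stack:
        for nxt in adjacency[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return terminals <= seen


def k_terminal_reliability(node_p, links, terminals):
    """Probability that all terminal nodes work and are mutually connected.

    node_p:    dict mapping node name -> probability that the node works
    links:     list of (u, v, p) tuples; repeated (u, v) pairs model
               multiple links between the same two nodes
    terminals: non-empty collection of the k node names that must communicate
    """
    terminals = set(terminals)
    names = list(node_p)
    total = 0.0
    for node_state in product([True, False], repeat=len(names)):
        up = {n for n, ok in zip(names, node_state) if ok}
        if not terminals <= up:
            continue  # a failed terminal node means the system is down
        p_nodes = 1.0
        for n, ok in zip(names, node_state):
            p_nodes *= node_p[n] if ok else 1.0 - node_p[n]
        for link_state in product([True, False], repeat=len(links)):
            p = p_nodes
            adjacency = {n: set() for n in up}
            for (u, v, p_link), ok in zip(links, link_state):
                p *= p_link if ok else 1.0 - p_link
                # a link is usable only if it and both end nodes work
                if ok and u in up and v in up:
                    adjacency[u].add(v)
                    adjacency[v].add(u)
            if terminals_connected(adjacency, terminals):
                total += p
    return total


def survival(rate, t):
    """Assumed model: constant failure rate, so R(t) = exp(-rate * t)."""
    return exp(-rate * t)


if __name__ == "__main__":
    t = 1000.0  # hypothetical mission time in hours
    node_p = {n: survival(1e-5, t) for n in "ABC"}
    links = [
        ("A", "B", survival(2e-5, t)),
        ("B", "C", survival(2e-5, t)),
        ("A", "C", survival(2e-5, t)),
        ("A", "C", survival(2e-5, t)),  # second, parallel A-C link
    ]
    print(k_terminal_reliability(node_p, links, {"A", "C"}))
```

Replacing the survival probabilities with steady-state or time-dependent availabilities of repairable components gives the corresponding k-terminal availability under the same independence assumption.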
【Key words】 Control system; Multi-module redundancy; Control network; k-terminal reliability; Component importance;
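The three importance measures named in the abstract can be illustrated on a toy system described by its minimal path sets. This is a sketch under stated assumptions, not the thesis's procedure based on the complete set of path sets: it uses brute-force state enumeration, the common textbook definitions (Birnbaum probability importance, structural importance as Birnbaum importance with every component probability set to 0.5, and failure-oriented critical importance), and a hypothetical three-component system. The thesis may define or compute these measures differently.

```python
from itertools import product


def system_works(state, path_sets):
    """The system works if every component of at least one path set works."""
    return any(all(state[c] for c in path) for path in path_sets)


def reliability(p, path_sets, fixed=None):
    """System reliability by enumeration; `fixed` pins chosen component states."""
    fixed = fixed or {}
    free = [c for c in p if c not in fixed]
    total = 0.0
    for bits in product([True, False], repeat=len(free)):
        state = dict(fixed)
        prob = 1.0
        for c, ok in zip(free, bits):
            state[c] = ok
            prob *= p[c] if ok else 1.0 - p[c]
        if system_works(state, path_sets):
            total += prob
    return total


def importance(p, path_sets):
    """Return structural, probability (Birnbaum) and critical importance."""
    r_sys = reliability(p, path_sets)  # must be < 1 for critical importance
    half = {c: 0.5 for c in p}
    result = {}
    for c in p:
        birnbaum = (reliability(p, path_sets, {c: True})
                    - reliability(p, path_sets, {c: False}))
        structural = (reliability(half, path_sets, {c: True})
                      - reliability(half, path_sets, {c: False}))
        critical = birnbaum * (1.0 - p[c]) / (1.0 - r_sys)
        result[c] = {
            "structural": structural,
            "probability": birnbaum,
            "critical": critical,
        }
    return result


if __name__ == "__main__":
    # Hypothetical system: component a in series with b and c in parallel.
    p = {"a": 0.9, "b": 0.8, "c": 0.7}
    path_sets = [{"a", "b"}, {"a", "c"}]
    for name, measures in importance(p, path_sets).items():
        print(name, measures)
```

In this example the series component a dominates all three rankings, which matches the intuition that a single point of failure matters more than either member of a redundant pair.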