
基于模式识别的流程工业生产在线故障诊断若干问题研究

Study on On-Line Fault Diagnosis System and Key Technologies in Process Industry Production Based on Pattern Recognition

【Author】 Zhuang Jinfa (庄进发)

【Advisor】 Luo Jian (罗键)

【Author information】 Xiamen University, Control Theory and Control Engineering, 2009, PhD

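【Summary】 As process industry systems grow ever larger, more complex and more intelligent, the study and design of a fast and effective on-line intelligent fault diagnosis system is becoming an important research topic in the process industry and in intelligent systems science. Such a system should provide on-line fault detection, on-line fault diagnosis, on-line identification of the monitored variables that cause a fault, and on-line learning that updates the system's knowledge. With such a system, expert engineers can quickly detect and diagnose faults in a process industry system on line, which effectively safeguards production and ultimately improves production efficiency. With the development of the Computer Integrated Process System (CIPS), large amounts of process data are now collected and stored while a process industry system is running. How to fully exploit the deeper information in these data to further improve fault detection and diagnosis is becoming a focus of research on on-line intelligent fault diagnosis systems. This thesis takes two pattern recognition tools, Support Vector Data Description (SVDD) and Random Forests (RF), as its basis and studies several problems in on-line fault diagnosis for the process industry. The specific contents are as follows:

(1) For the problems of optimizing the SVDD kernel parameter σ and regularizing the SVDD decision boundary, the thesis proposes a kernel parameter optimization method based on the spherical distribution of samples in kernel space, and an SVDD decision boundary regularization method based on Kernel Principal Component Analysis (KPCA). The optimization method searches for a good kernel parameter by measuring the non-Gaussianity of the samples in kernel space. Once the kernel parameter is chosen, the samples in kernel space may still be unevenly distributed; to address this, KPCA is further used to adjust the decision boundary so that SVDD achieves better classification performance.

(2) For the high time complexity of SVDD on large sample sets, the thesis proposes a fast incremental SVDD algorithm based on random greedy absorption, Random Greed Incremental SVDD (RGInc-SVDD). The algorithm first uses the Sampling Lemma (SL) to split the training set into small training subsets, then builds a sub-classifier Inc-SVDD_i from one of the subsets, and finally uses an iterative greedy absorption procedure to merge samples into the sub-classifier Inc-SVDD_i and grow it until it yields the SVDD classifier for the whole training set. RGInc-SVDD reduces the time complexity of standard SVDD from O(n^3) to O(floor(n/k)^3), where n is the number of training samples and k is the average number of samples absorbed per iteration.

(3) For the problem that the SVDD decision boundary is overly strict, the thesis proposes a data description method called Kernel Minimum Volume Enclosing Ellipsoid (KMVEE). KMVEE follows an idea similar to SVDD: it seeks a minimum-volume hyper-ellipsoid in kernel space that contains as many kernel-mapped samples as possible and uses that ellipsoid as the boundary that describes the data. Under the same kernel parameter setting, KMVEE produces a more compact decision boundary than SVDD, which further improves its performance.

(4) For the diagnosis and classification of recognizable faults (fault samples X_inside that fall inside the hypersphere), the thesis proposes Rejected Transductive Inference Multi-SVDD (RTIM-SVDD). The method uses M+1 hyperspheres to handle an M-class problem and applies the principle of transductive inference to decide the class membership of ambiguous samples. RTIM-SVDD performs better than the distance-based M-SVDD.

(5) For the clustering of unrecognizable faults (samples X_outside that fall outside the hypersphere), the thesis proposes an improved Support Vector Clustering (SVC) method. The method uses steepest descent to find the local minima reached from the samples, which yields invariant sets of samples, and uses a Three Line Completed Graph (TLCG) to assign cluster labels to the invariant sets. The time complexity drops from O(n^2 m) to O(n_op^2 m), where m is the number of points sampled on each connecting line and n_op is the number of local optima.

(6) For locating unrecognizable faults, the thesis proposes one fault location method based on RTIM-SVDD and another based on an improved Random Forest. The RTIM-SVDD-based method locates the fault through the magnitude of the performance index PROC. The improved-RF method builds an improved random forest by modifying the Bagging sampling scheme, the decision tree classification and the sample variable importance rule, and locates the fault through the resulting variable importance.

All of the above methods are validated on three data sources: UCI benchmark data, Tennessee Eastman Process (TEP) fault simulation data, and DAMADICS (Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems) data, a fault simulation based on real-world faults. The experimental results show that the proposed methods are effective. Finally, on the basis of a summary of the whole thesis, topics for further research and priorities for future work are put forward.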
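Items (4) and (5) separate samples that fall inside the SVDD hypersphere from those that fall outside it. The sketch below illustrates that test using the standard SVDD kernel-space distance, d^2(z) = K(z,z) - 2 Σ_i α_i K(x_i,z) + Σ_i Σ_j α_i α_j K(x_i,x_j). It is not the thesis's implementation: the toy data, the uniform weights `alphas` (a real SVDD obtains them by solving a quadratic program), the value of `sigma`, and the rule used for the radius are all assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def kernel_distance_sq(z, samples, alphas, sigma):
    """Squared kernel-space distance from z to the centre sum_i alpha_i * phi(x_i)."""
    k_zz = gaussian_kernel(z, z, sigma)  # always 1 for the Gaussian kernel
    cross = sum(a * gaussian_kernel(x, z, sigma)
                for a, x in zip(alphas, samples))
    centre = sum(ai * aj * gaussian_kernel(xi, xj, sigma)
                 for ai, xi in zip(alphas, samples)
                 for aj, xj in zip(alphas, samples))
    return k_zz - 2.0 * cross + centre

# Hypothetical normal-operation samples (two monitored variables each).
normal = [(0.0, 0.0), (0.1, 0.2), (-0.2, 0.1), (0.2, -0.1), (-0.1, -0.2)]

# ASSUMPTION: uniform weights stand in for the solution of the SVDD dual problem.
alphas = [1.0 / len(normal)] * len(normal)
sigma = 0.5  # ASSUMPTION: hand-picked kernel width

# ASSUMPTION: the radius is the largest training distance, so every
# training sample lies inside or on the boundary.
radius_sq = max(kernel_distance_sq(x, normal, alphas, sigma) for x in normal)

def is_inside(z):
    """True if z falls inside the hypersphere (an X_inside sample)."""
    return kernel_distance_sq(z, normal, alphas, sigma) <= radius_sq

print(is_inside((0.05, 0.0)))  # a point close to the training data
print(is_inside((3.0, 3.0)))   # a point far from the training data
```

With a Gaussian kernel, a smaller `sigma` makes the boundary follow the training samples more closely, which is why the choice of σ in item (1) matters.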

【Abstract】 As the process industry develops towards larger scale, greater complexity and more intelligence, designing a fast and effective on-line intelligent fault diagnosis system is becoming an important objective in the process industry and in the field of intelligent science. Such a system should be able to detect faults on line, diagnose faults on line, recognize on line the variables that may cause the faults, and update its learning ability and knowledge on line during production. Assisted by this system, engineers can easily carry out on-line fault detection and diagnosis and locate the causes of faults, which makes process industry production much safer and in turn improves the efficiency of the enterprise. With the development of CIPS (Computer Integrated Process System) in the process industry, a great amount of process data can be sampled, collected and stored in databases. How to fully mine the deep-level information in these data to improve process monitoring performance is gradually becoming one of the focuses of process control and intelligent science. This thesis studies and designs on-line fault diagnosis methods for the process industry based on two pattern recognition tools, SVDD (Support Vector Data Description) and RF (Random Forests). The main contributions of this thesis are as follows:

(1) To address the selection of the kernel parameter σ and the regularization of the decision boundary in the SVDD algorithm, this thesis proposes a new kernel parameter optimization method based on the spherical distribution of samples in kernel space, and a decision boundary regularization method based on KPCA (Kernel Principal Component Analysis). The optimization method uses non-Gaussianity to measure how closely the kernel samples approximate a spherical region, so as to select a better kernel parameter. Even after the kernel parameter has been selected, the kernel samples may still be unevenly distributed, so the thesis further applies KPCA to regularize the decision boundary and achieve higher classification performance.

(2) To address the high time complexity of SVDD on large sample datasets, this thesis proposes a novel RGInc-SVDD (Random Greed Incremental SVDD) algorithm. First, the SL (Sampling Lemma) is used to divide the training dataset into several small sample subsets. Second, an Inc-SVDD_i model is created from one of the subsets. Then an iterative random greed rule is applied to grow Inc-SVDD_i until an SVDD carrying the information of the whole training set has been created. The RGInc-SVDD algorithm reduces the time complexity significantly, from O(n^3) to O(floor(n/k)^3), where n denotes the number of training samples and k the number of samples absorbed by the random greed rule in each iteration.

(3) To address the overly loose decision boundary of SVDD, this thesis proposes a new data description algorithm named KMVEE (Kernel Minimum Volume Enclosing Ellipsoid). Like the SVDD algorithm, KMVEE seeks a minimum-volume enclosing hyper-ellipsoid that contains as many training samples as possible. Under the same kernel parameter, KMVEE performs better than SVDD because its decision boundary is tighter.

(4) To address known fault diagnosis, that is, multi-class classification of samples falling inside the decision boundary (X_inside), this thesis proposes a new M-SVDD algorithm named RTIM-SVDD (Rejected Transductive Inference Multi-SVDD). Unlike the distance-based M-SVDD, RTIM-SVDD uses M+1 hyper-spheres as decision boundaries for M-class classification. To resolve the bottleneck of labeling fuzzy samples, it adopts transductive inference, which uses a confidence measure to label these samples, and it performs better than the traditional M-SVDD.

(5) To address unknown fault diagnosis, that is, clustering of samples falling outside the decision boundary (X_outside), this thesis proposes an improved SVC (Support Vector Clustering) algorithm. The method employs the steepest descent gradient method to find local optima and then uses the TLCG (Three Line Completed Graph) rule to assign labels to these points so as to cluster all samples. The time complexity of cluster labeling in SVC decreases significantly, from O(n^2 m) to O(n_op^2 m), where m denotes the number of points selected on the line between two training samples and n_op denotes the number of local optima.

(6) To address the location of unknown faults, this thesis proposes methods based on RTIM-SVDD and on RF (Random Forests). Fault location based on RTIM-SVDD uses the performance index PROC to locate the causes of a fault. Fault location based on RF improves a series of steps, namely Bagging sampling, decision tree classification and variable importance computation, so as to locate the causes of a fault.

The proposed methods are validated on the following datasets: the UCI datasets, the TEP (Tennessee Eastman Process) dataset and the DAMADICS (Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems) dataset. The results demonstrate the feasibility of the proposed methods. Finally, the thesis concludes with a summary and some directions for further research.
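Item (5) describes running steepest descent from the samples to reach local optima and then labeling clusters. The one-dimensional sketch below illustrates only the descent step, under loud assumptions: the toy `samples`, uniform kernel weights (so the SVDD distance reduces, up to a constant and a scale factor, to the negative mean kernel value), and the fixed `step` and `iters` are all hypothetical. Grouping samples by their rounded end point is a stand-in for the TLCG labeling rule, which is not reproduced here.

```python
import math

def descend(x, samples, sigma, step=0.1, iters=500):
    """Plain gradient descent on f(x) = -mean_i exp(-(x - x_i)^2 / (2 * sigma^2)).

    With uniform weights, minimizing f is equivalent to minimizing the
    kernel-space distance to the SVDD centre.
    """
    for _ in range(iters):
        grad = sum((x - s) / sigma ** 2
                   * math.exp(-(x - s) ** 2 / (2.0 * sigma ** 2))
                   for s in samples) / len(samples)
        x -= step * grad
    return x

# Hypothetical 1-D samples forming two well-separated groups.
samples = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4]
sigma = 0.5  # ASSUMPTION: hand-picked kernel width

# Each sample slides to the local minimum of its basin.
minima = [round(descend(x, samples, sigma), 1) for x in samples]

# ASSUMPTION: samples sharing a (rounded) minimum get the same cluster label;
# the thesis instead labels the minima with its TLCG rule.
label_of = {m: i for i, m in enumerate(sorted(set(minima)))}
clusters = [label_of[m] for m in minima]

print(sorted(set(minima)))
print(clusters)
```

Only the distinct minima need to be compared with one another afterwards, which is the intuition behind the cost depending on n_op rather than on n.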

  • 【Online publication contributor】 Xiamen University
  • 【Online publication issue】 2009, No. 12