
基于聚类的多模型建模及其在软测量中的应用

Clustering-Based Multiple Modeling Approach and Its Application in Soft Sensor

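【Author】 Chen Dingsan (陈定三)

【Advisor】 Yang Huizhong (杨慧中)

【Author information】 Jiangnan University, Control Theory and Control Engineering, 2011, Master's thesis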

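【Summary】 Multiple models can markedly improve estimation accuracy and generalization performance. Against the background of a practical engineering application, this thesis builds multi-model soft sensors for the crystallization tower unit of a bisphenol-A production process in order to monitor process variables online. Among the many multi-model approaches, clustering-based ones are widely favored. However, how to initialize the number of clusters and the cluster centers has long remained unresolved, which indirectly constrains the development of multiple models. Moreover, most clustering algorithms are not robust: their results degrade sharply when the sample set contains outliers. Clustering algorithms also have an inherent defect: they use only the input set of the sample data and ignore the strong influence of the output set on the final clustering result, which reduces the validity of the clusters. Finally, the sub-models are the most important part of a multi-model, and their quality directly determines the final accuracy. To address these problems, the thesis improves the clustering algorithm and the sub-model modeling method in the following respects to build an effective multi-model soft sensor system:

1. Because traditional clustering methods depend heavily on prior knowledge of the sample data and on initial parameters, an expanding search clustering algorithm that is tuned by a single parameter and suits sample distributions of arbitrary shape is proposed. Built on the nearest neighbor algorithm, the method defines an ε-neighborhood for each sample and uses an expanding search to group all connected ε-neighborhood samples into one class. Applying it to the sample data yields a multi-model modeling method based on expanding search clustering.

2. To suppress the influence of outliers on clustering, a multi-model soft sensor modeling method based on manifold clustering with local reconstruction and merging is proposed. The method splits the sample set into multiple mutually disjoint sub-clusters, which overcomes the influence of outliers on the clustering result; reconstructs a linear manifold surface from each sub-cluster and merges sub-clusters that belong to the same manifold surface and lie close together, giving several sub-classes; and uses a support vector machine to build a regression sub-model for each sub-class, giving the multi-model soft sensor.

3. To address the weakness of traditional clustering algorithms in handling incomplete information, a multi-model modeling method based on secondary data partition is proposed. The method applies an improved rough set classifier to the sub-clusters obtained by clustering and partitions the data a second time, which to some extent eliminates the effect that contradictory samples may have on model accuracy. A support vector machine is used to build a regression sub-model for each resulting sub-class, giving the multi-model soft sensor system. In addition, because an uneven sample distribution may cause an imbalanced classification problem, an improved weighted rough set classifier is adopted to refine the algorithm further, raising classifier accuracy and ensuring the validity of the multi-model.

4. Because sub-model quality directly affects the final multi-model accuracy, a local penalized weighted kernel partial least squares algorithm is proposed. The method uses a kernel mapping to project the original inputs into a high-dimensional feature space, so that the nonlinear problem can be treated linearly, and uses partial least squares to extract principal components and reduce the data dimension. On the new data set formed by the principal components, a local penalized weighted least squares regression model is built following the idea of local learning; it treats the contribution of each sample differently, suppresses the influence of outliers to some extent, and optimizes the model parameters. Because multiple models improve estimation accuracy and generalization, the expanding search clustering algorithm is used to cluster the sample set and a regression sub-model is built on each resulting sub-cluster with the above algorithm, giving the multi-model soft sensor system.

The effectiveness of these methods is verified in soft sensor modeling simulations of quality indices in the bisphenol-A production process.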
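
The expanding search idea in item 1 of the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's algorithm: the abstract gives no pseudocode, so the function name `expanding_search_clusters`, the Euclidean distance, the breadth-first expansion order, and the treatment of isolated samples as singleton clusters are all assumptions made here. The single tuning parameter is the neighborhood radius `eps`.

```python
from collections import deque
from math import dist  # Euclidean distance, Python 3.8+


def expanding_search_clusters(points, eps):
    """Return one cluster label per point (hypothetical sketch).

    Two points share a label when a chain of samples links them in which
    each consecutive pair lies within eps of each other.
    """
    n = len(points)
    labels = [-1] * n          # -1 marks "not yet assigned"
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        # Start a new cluster at the first unassigned sample.
        labels[seed] = current
        queue = deque([seed])
        while queue:
            i = queue.popleft()
            # Expand: absorb every unassigned sample in the eps-neighborhood of i.
            for j in range(n):
                if labels[j] == -1 and dist(points[i], points[j]) <= eps:
                    labels[j] = current
                    queue.append(j)
        current += 1
    return labels


samples = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
print(expanding_search_clusters(samples, eps=0.6))  # [0, 0, 0, 1, 1]
```

In this sketch the number of clusters is not supplied in advance; it falls out of the choice of `eps`, which matches the abstract's claim of single-parameter tuning. The pairwise scan is quadratic in the number of samples and is kept only for brevity.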

【Abstract】 Because multiple models can significantly improve a model's estimation accuracy and generalization performance, this thesis, motivated by a practical industrial application, builds multiple models of a crystallization tower to monitor process variables online. Among the various multiple modeling methods, clustering-based ones have attracted the most attention. However, in traditional clustering algorithms many problems, such as how to choose the number of clusters and the cluster centers, remain unsolved, which indirectly constrains the development of multiple models. At the same time, most clustering algorithms are sensitive to abnormal sample points, which greatly reduces the effectiveness of the clustering results. Moreover, traditional clustering algorithms have an inherent shortcoming: they use only the input set and ignore the considerable influence of the output set on the clustering result, which limits clustering quality. Finally, as the most important part of a multi-model, the sub-models directly affect the accuracy of the multiple models.

To solve these problems, the thesis improves the clustering algorithm and the modeling approach in the following four respects so as to establish effective multiple models.

Because traditional clustering algorithms rely heavily on prior knowledge and initial parameters, a single-parameter expanding search clustering method suitable for sample distributions of arbitrary shape is presented. The new method is based on the nearest neighbor algorithm. By defining the ε-neighborhood of each sample and applying an expanding search, the algorithm assigns all connected ε-neighborhood samples to one cluster, thereby clustering the sample set.
The proposed algorithm is used to cluster the sample set, yielding a multiple modeling technique based on expanding search clustering.

Outliers can severely affect clustering results. A multiple modeling method based on manifold clustering with local reconstruction and merging is therefore proposed. To restrain the impact of outliers on the clustering results, the data set is split into several small disjoint sub-clusters. A linear manifold surface is reconstructed from each sub-cluster, and sub-clusters that lie close together and belong to the same manifold surface are merged to form the sub-classes. A support vector machine is then used to build a regression model for each sub-class, giving the multiple models.

Traditional clustering algorithms do not handle incomplete information well. A multiple modeling approach based on secondary data partition is presented. The method applies an improved rough set classifier to the sub-clusters obtained by clustering the sample set and partitions them a second time, which to some extent eliminates the effect of contradictory samples on model accuracy. A support vector machine is used to build a regression sub-model on each sub-class, giving the multi-model soft sensor. In addition, because unevenly distributed samples may cause an imbalanced classification problem, an improved weighted rough set classifier is adopted to further improve the method, raising classification accuracy and ensuring the reliability of the multiple models.

The accuracy of the final multiple models depends directly on the quality of the sub-models. A novel local penalized weighted kernel partial least squares algorithm is presented.
The proposed method maps the original inputs into a high-dimensional feature space so that the nonlinear problem can be treated linearly, and the partial least squares algorithm is used to extract the principal components. Following the idea of local learning, a local penalized weighted least squares regression model is built on the new data set formed by the principal components, so that the contribution of each sample is treated differently, the model's sensitivity to abnormal data is reduced, and the model parameters are optimized. Because multiple models can improve estimation accuracy and generalization, the expanding search clustering algorithm is used to cluster the sample set, and local penalized weighted kernel partial least squares is used to build a regression sub-model on each resulting sub-cluster. A soft sensor system based on multiple models is thereby obtained.

The proposed algorithms are applied to soft sensor modeling of the bisphenol-A production process, and the simulation results show their effectiveness.
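
The local penalized weighted least squares step named in the abstract might look something like the sketch below. It is a simplified stand-in under explicit assumptions, not the thesis's formulation: the kernel mapping and the partial least squares extraction are omitted, the input is one-dimensional, the weights are Gaussian in the distance to the query point, and the penalty is a ridge term on the slope only. The function name `local_weighted_ridge_predict` and the parameters `bandwidth` and `penalty` are hypothetical.

```python
from math import exp


def local_weighted_ridge_predict(xs, ys, x0, bandwidth=1.0, penalty=0.0):
    """Predict y at x0 from a line fitted by penalized, locally weighted
    least squares (hypothetical 1-D sketch)."""
    # Gaussian weights: samples near the query x0 contribute more,
    # samples far from it contribute almost nothing.
    w = [exp(-((x - x0) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]

    # Weighted sums for the 2x2 normal equations of y ~ a + b * x.
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swy = sum(wi * y for wi, y in zip(w, ys))
    swxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))

    # Assumed ridge-style penalty on the slope b only:
    #   [ sw   swx            ] [a]   [ swy  ]
    #   [ swx  swxx + penalty ] [b] = [ swxy ]
    det = sw * (swxx + penalty) - swx * swx
    if det == 0.0:
        raise ValueError("normal equations are singular; add data or a penalty")
    a = ((swxx + penalty) * swy - swx * swxy) / det
    b = (sw * swxy - swx * swy) / det
    return a + b * x0


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]            # exactly y = 2x + 1
print(local_weighted_ridge_predict(xs, ys, x0=2.5))
```

With these weights, a sample far from the query point has very little effect on the local fit, which is one way to read the abstract's remark about treating sample contributions differently. The thesis may define its weights and penalty in another way.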

  • 【Online publication contributor】 Jiangnan University
  • 【Online publication issue】 2011, Issue 08