Research on Key Technologies of Computer-Aided Cigarette Formula Design (计算机辅助卷烟配方设计关键技术研究)

【Author】 Yang Ning (杨宁)

【Advisor】 Ding Xiangqian (丁香乾)

【Author Information】 Ocean University of China, Computer Application Technology, 2010, Ph.D.

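【Abstract (translated from the Chinese)】 Computer-aided cigarette formula design is an interdisciplinary research topic that combines information technology with blending technology, and it still lacks a unified concept, theoretical framework, and technical system. In recent years, the accelerating consolidation and restructuring of tobacco enterprises, rising brand concentration, and strict safety requirements have posed severe challenges to the integration and in-depth use of tobacco leaf raw materials, to the traditional model of formula development and quality control, and to the ability to keep products homogeneous across multiple production sites. Building on many years of earlier research, and grounded in the new technical difficulties and challenges met in engineering practice, this dissertation takes computer-aided cigarette formula design as its engineering background, examines some shortcomings of pattern recognition techniques and the inductive inference mechanism from the perspective of machine learning and pattern recognition engineering, and proposes new methods aimed at practical application. Starting from an analysis of the basic principles of aided formula design, it studies three key techniques in depth: raw material similarity measurement, cigarette quality evaluation, and prediction control for cigarette classifiers. The main research content is summarized as follows:

1. After reviewing domestic and international research on computer-aided cigarette formula design and analyzing the complexity of the domain, and given that the field lacks a unified concept and technical framework, the dissertation sets out the basic principles, basic characteristics, and technical goals of computer-aided cigarette formula design, together with a technical framework viewed from both the tobacco and the computer technology perspectives, and analyzes some of the key techniques. The aim is to help researchers and enterprises grasp the research content and expected outcomes more clearly. The dissertation also points out that computer-aided cigarette quality evaluation is a core task in aided formula design: the quality of the simulated evaluation directly affects how well candidate formulas are screened during combinatorial formula optimization.

2. For the "distance invalidation" problem in raw material similarity measurement, the dissertation analyzes the underlying cause from three angles: the mathematical analysis of the "curse of dimensionality", its geometric interpretation, and the consistency of distances between points in high-dimensional space. Building on a study of the locally linear embedding (LLE) algorithm from manifold learning, and taking into account that tobacco quality data are sparse, locally nonlinear, and non-smooth, and that similarity measurement requires a distance-preserving mapping into the low-dimensional space, the dissertation proposes an improved LLE algorithm based on the geodesic distance of a kernel transformation (KGLLE). It analyzes the design rationale and procedure of KGLLE in detail and uses KGLLE to solve the problems of raw material similarity measurement in high-dimensional space.

3. In engineering practice, expert experience and domain knowledge are hard to integrate into a classifier, and experts can only passively accept the classifier's predictions. The dissertation analyzes the shortcomings of "inductive inference" from the perspective of the inference mechanism of pattern recognition, and shows that current mainstream classifiers are poorly equipped to address these problems. On the basis of the transductive conformal predictor framework, it proposes a conformal predictor with a kernelized K-nearest-neighbor metric (CP-KKNN), which is validated experimentally on the Iris dataset and on cigarette tar data with good classification results. The analysis also indicates that conformal predictors have high practical value for engineering problems such as cigarette quality pattern recognition.

4. Current mainstream classifiers make blind, mechanical prediction errors. Guided by the "cognition" and "coverage" ideas of bionic pattern recognition, the dissertation integrates hypothesis testing, convex hull construction and interior point analysis, and sequence randomness testing to design a classifier prediction control algorithm (RC-PC) that supports rejection and credibility analysis. Experiments on an SVM classifier of cigarette aroma style verify that the algorithm reduces the prediction error rate. The study shows that focusing research on the prediction stage of pattern recognition improves practical engineering results more than improving the classifier algorithm does.

5. The dissertation briefly analyzes the main problems that multi-technology integrated aided formula design must solve at the software system level, and gives the design goals, guiding design principles, four-layer framework, and general principles for selecting algorithms in an intelligent system. It closes this part with two system screenshots, in the hope that more pattern recognition and machine learning researchers will take part in engineering practice and find sources of innovation there.

6. The dissertation summarizes its main innovations and conclusions, and outlines future research in four areas: dataset structure analysis, predictor construction, convex hull research, and aided cigarette formula design.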
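The "distance invalidation" (距离失效) effect named in item 2 of the abstract can be illustrated numerically. The sketch below is not taken from the dissertation: it assumes points drawn uniformly from the unit hypercube and plain Euclidean distance, and the function name `relative_contrast` is hypothetical. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance; under these assumptions the ratio shrinks as the dimension grows, so nearest-neighbor rankings become less informative.

```python
import math
import random


def relative_contrast(n_points: int, dim: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for distances from one random query
    point to n_points random points, all uniform in the unit hypercube.

    Illustrative assumption: uniform data and Euclidean distance.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # The contrast is expected to drop as the dimension increases.
    for dim in (2, 10, 100, 1000):
        print(dim, relative_contrast(200, dim))
```

Real tobacco quality data are not uniform, so this only shows the general tendency that motivates mapping the data to a low-dimensional space before measuring similarity.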

【Abstract】 Computer-aided cigarette formula design is an interdisciplinary research topic that combines computer technology with blending technology, and it still lacks a unified concept, theoretical framework, and technological system. In recent years, the accelerating reorganization of tobacco companies, rising brand concentration, and stricter safety requirements have posed serious challenges to the integration and more effective use of tobacco raw materials, to the traditional mode of formula design, and to the control of homogeneous product quality across multiple manufacturing sites. Building on earlier research in computer-aided cigarette formula design, this dissertation addresses the new technical difficulties and challenges encountered in engineering practice. From the perspective of machine learning and pattern recognition engineering, it analyzes the shortcomings of pattern recognition and the inductive inference mechanism and proposes new methods. The dissertation begins with an analysis of the basic principles of aided formula design and concentrates on three key technologies. The major research results are summarized as follows:

1. Based on a review of domestic and international research and an analysis of the complexity of the formula domain, the dissertation gives the basic principles, basic characteristics, technological goals, and technological framework of computer-aided cigarette formula design, and analyzes some key technologies in depth, so as to help researchers and companies grasp the research content and objectives. It also points out that computer-aided quality evaluation is the core of aided cigarette formula design: the performance of quality evaluation directly influences how well formulas are selected in formula optimization design.

2. For the "distance invalidation" issue in raw material similarity measurement, the dissertation analyzes the intrinsic cause of "distance invalidation" from three angles related to the "curse of dimensionality": mathematical analysis, geometric interpretation, and the consistency of distances between points in high-dimensional space. Based on a study of the Locally Linear Embedding (LLE) algorithm in manifold learning, it proposes an improved LLE algorithm (KGLLE), that is, LLE based on the geodesic distance of a kernel transformation. KGLLE is designed to suit the characteristics of tobacco quality data, including sparse samples and locally nonlinear, non-smooth features, and to satisfy the requirement of an isometric mapping into low-dimensional space for similarity measurement. The dissertation describes the design principles and procedure of KGLLE in detail and uses KGLLE to solve the problems of tobacco material similarity measurement in high-dimensional space.

3. In engineering practice, experts' experience and domain knowledge are difficult to integrate into a classifier, and experts can only passively accept the classifier's predictions. The dissertation points out the limitation of "inductive inference" from the perspective of the inference mechanism, and the difficulty current classifiers have in solving these problems. It then proposes a kernelized K-nearest-neighbor metric conformal predictor (CP-KKNN) based on the transductive conformal predictor, and verifies it experimentally on the Iris dataset and a tobacco tar dataset, obtaining good classification performance. The results show that conformal predictors have significant practical value for classification applications similar to tobacco quality prediction.

4. Most current mainstream classifiers predict mechanically and blindly. Guided by the ideas of "cognition" and "coverage" in bionic pattern recognition, the dissertation integrates several theories and methods, including hypothesis testing, convex hull and interior point analysis, and sequence randomness testing, to design a classifier prediction control algorithm (RC-PC) that supports rejection and credibility analysis. Experiments on an SVM classifier of tobacco aroma style verify that it reduces the prediction error rate. The study shows that, in engineering applications, emphasizing the prediction process improves results more than improving the classification algorithm.

5. After a brief analysis of the main issues facing a multi-technology integrated aided formula design software system, the dissertation gives the system design objectives, guiding design principles, and four-layer framework, proposes general principles for algorithm selection, and provides two software screenshots, in the hope that more pattern recognition and machine learning researchers will turn to engineering practice and find sources of innovation there.

6. The dissertation summarizes its major innovations and research findings, and indicates future research emphases in four directions: analysis of dataset structure, new predictors, convex hulls, and aided cigarette formula design.
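Item 3 of the abstract builds on the transductive conformal predictor framework. The following is a minimal sketch of a generic conformal predictor with a k-nearest-neighbor nonconformity measure computed on an RBF-kernel-induced distance. It is not the dissertation's CP-KKNN algorithm, whose details the abstract does not give; the kernel choice, `gamma`, `k`, the toy data, and all function names are assumptions made for illustration. For each candidate label the new example is tentatively added to the training set, and the p-value is the fraction of examples that are at least as nonconforming as the new one.

```python
import math


def rbf_kernel_distance(a, b, gamma=1.0):
    """Distance induced by an RBF kernel: sqrt(k(a,a) - 2k(a,b) + k(b,b)).
    For the RBF kernel k(a,a) = k(b,b) = 1. (Assumed kernel, for illustration.)"""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(max(0.0, 2.0 - 2.0 * math.exp(-gamma * sq)))


def nonconformity(i, xs, ys, k, dist):
    """k-NN nonconformity of example i: sum of its k nearest same-class
    distances divided by the sum of its k nearest other-class distances.
    Assumes at least two classes and no coincident points of different classes."""
    same = sorted(dist(xs[i], xs[j]) for j in range(len(xs))
                  if j != i and ys[j] == ys[i])
    other = sorted(dist(xs[i], xs[j]) for j in range(len(xs))
                   if ys[j] != ys[i])
    return sum(same[:k]) / sum(other[:k])


def conformal_p_values(train_x, train_y, x_new, k=1, dist=rbf_kernel_distance):
    """Return {label: p-value} for the new example under each candidate label."""
    p_values = {}
    for label in sorted(set(train_y)):
        xs = train_x + [x_new]   # tentatively extend the training set
        ys = train_y + [label]
        scores = [nonconformity(i, xs, ys, k, dist) for i in range(len(xs))]
        # Fraction of examples at least as nonconforming as the new one.
        p_values[label] = sum(s >= scores[-1] for s in scores) / len(xs)
    return p_values


def predict_with_confidence(train_x, train_y, x_new, k=1):
    """Return (label, confidence, credibility) from the conformal p-values."""
    p = conformal_p_values(train_x, train_y, x_new, k)
    ranked = sorted(p, key=p.get, reverse=True)
    credibility = p[ranked[0]]        # largest p-value
    confidence = 1.0 - p[ranked[1]]   # one minus second-largest p-value
    return ranked[0], confidence, credibility


if __name__ == "__main__":
    xs = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
    ys = ["A", "A", "A", "B", "B", "B"]
    print(predict_with_confidence(xs, ys, [0.05, 0.05]))
```

Because every prediction comes with a confidence and a credibility value, a domain expert can inspect or override low-credibility predictions instead of passively accepting a bare label, which is the practical motivation the abstract describes.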
