

Application Research of Data Mining Technologies on Membrane Separation Process of Chinese Medicine Water Extract

【Author】 Hong Hong (洪弘)

【Supervisor】 Li Lingjuan (李玲娟)

【Author information】 Nanjing University of Chinese Medicine, Social Medicine and Health Service Management, 2012, Master's thesis

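【Summary】 Membrane separation is an emerging separation technique with broad prospects in the Chinese medicine pharmaceutical industry. During membrane separation of Chinese medicine water extracts, however, particles, colloidal ions, and solute molecules in the feed liquid interact physically and chemically with the membrane and block its pores, which hinders further separation. A mechanistic model of membrane fouling is therefore urgently needed for membrane separation processes. Because the data from the complex system of a Chinese medicine water extract exhibit highly nonlinear, noisy, multi-factor relationships, building such a model requires data mining techniques. Data mining is the discipline of revealing relationships within data and is an extension of statistics. Medical data sets are heterogeneous, subjective, and large, so applying data mining in medicine calls for fast, robust, and reliable algorithms. After studying the data mining process and its common patterns, prediction was chosen as the main pattern. Before predictive modeling, the data must go through feature description, missing-value handling, distance-based outlier analysis, variable transformation, and attribute selection. This auxiliary work is tedious but important: its main purpose is to construct a clean, tidy data set and thereby improve the accuracy of the predictive model. Prediction is a frequently used data mining pattern that analyzes historical data to infer and estimate future trends or likely outcomes. This thesis studies the multiple linear regression model, the multivariate quadratic polynomial regression model, the error back-propagation (BP) neural network model, the radial basis function (RBF) neural network model, and the support vector machine (SVM) model, optimizes them to suit the specific problem, and compares their modeling and prediction performance. Building on the theoretical study, the algorithms were implemented in Matlab, and a corresponding interface was designed and implemented to make the system professional, friendly, and convenient to use.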

【Abstract】 Membrane separation is a new separation method with broad prospects in the Chinese pharmaceutical industry. During membrane separation of aqueous extracts of traditional Chinese medicine, however, particles, solute molecules, and colloidal ions in the filtered liquid react physically and chemically with the membrane, so the membrane pores become blocked and further separation is obstructed. Understanding the mechanism of membrane fouling is therefore an urgent problem in membrane separation processes. The data from the complex system of Chinese herbal extracts involve nonlinear, high-noise, multi-factor relationships, so establishing a model of the membrane fouling mechanism requires data mining (DM) technology. Data mining is a discipline that reveals the relationships within data sets, and it is an extension of statistics. Because medical data sets are heterogeneous, subjective, and large, data mining in the field of medicine requires fast, robust, and reliable algorithms. After a study of the data mining process and its common models, the prediction model was confirmed as the main model. Data feature description, missing-value handling, distance-based outlier analysis, variable transformation, and attribute selection are supporting tasks. Although complicated, they are important because they create a clean data set, so that the predictive model achieves higher accuracy. The prediction model is a frequently used data mining model that speculates on and estimates future trends or possible outcomes through the analysis of historical data.
The multiple linear regression model, the multivariate quadratic polynomial regression model, the BP artificial neural network model, the RBF artificial neural network model, and the SVM model are studied. Some optimizations are introduced, and the effects of the different models are compared. Following the theoretical study, Matlab was chosen to implement the algorithms, and an interface to the system was designed to be professional, friendly, and convenient.
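
The workflow described in the abstract, cleaning the data before comparing prediction models, can be illustrated with a minimal sketch. It is not the thesis implementation: the thesis used Matlab, whereas this sketch uses Python with NumPy, and the data are synthetic. The helper names (`knn_distance_outliers`, `design`, `rmse`), the k-nearest-neighbour distance rule, and the median-plus-MAD cut-off are all assumptions chosen for illustration. Only two of the five models are shown, and reading the second regression model as a second-order polynomial regression is likewise an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (hypothetical): 3 predictors, nonlinear response.
n = 120
X = rng.uniform(0.0, 1.0, size=(n, 3))
y = (1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1] * X[:, 2]
     + 1.5 * X[:, 0] ** 2 + rng.normal(0.0, 0.05, size=n))
X[0] = [8.0, 8.0, 8.0]  # inject one gross outlier


def knn_distance_outliers(X, k=3, threshold=3.0):
    """Flag rows whose mean distance to their k nearest neighbours is large.

    Assumed rule: cut-off = median + threshold * 1.4826 * MAD of the scores.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardise columns
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                       # ignore self-distance
    score = np.sort(d, axis=1)[:, :k].mean(axis=1)
    med = np.median(score)
    mad = np.median(np.abs(score - med))
    return score > med + threshold * 1.4826 * mad


def design(X, degree):
    """Design matrix: intercept + linear terms (+ all second-order terms)."""
    p = X.shape[1]
    cols = [np.ones(len(X))] + [X[:, j] for j in range(p)]
    if degree == 2:
        cols += [X[:, i] * X[:, j] for i in range(p) for j in range(i, p)]
    return np.column_stack(cols)


def rmse(X_train, y_train, X_test, y_test, degree):
    """Fit by ordinary least squares and return the test-set RMSE."""
    coef, *_ = np.linalg.lstsq(design(X_train, degree), y_train, rcond=None)
    resid = y_test - design(X_test, degree) @ coef
    return float(np.sqrt(np.mean(resid ** 2)))


# Step 1: distance-based outlier analysis, then drop the flagged rows.
mask = knn_distance_outliers(X)
X_clean, y_clean = X[~mask], y[~mask]

# Step 2: hold out 30% of the cleaned data and compare the two models.
idx = rng.permutation(len(X_clean))
cut = int(0.7 * len(idx))
train, test = idx[:cut], idx[cut:]
rmse_linear = rmse(X_clean[train], y_clean[train],
                   X_clean[test], y_clean[test], degree=1)
rmse_quadratic = rmse(X_clean[train], y_clean[train],
                      X_clean[test], y_clean[test], degree=2)
print(f"linear RMSE:    {rmse_linear:.3f}")
print(f"quadratic RMSE: {rmse_quadratic:.3f}")
```

Scoring on held-out data rather than on the training fit mirrors the abstract's distinction between modeling effect and prediction effect. The neural network and SVM models would slot into the same comparison with a suitable library.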
