分类方法在中医辨证诊断应用中的比较研究
Comparative Study of Classification Method in Traditional Chinese Medicine Differentiation
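【Author】 Chen Shuhui (陈淑慧);
【Supervisor】 Liang Weixiong (梁伟雄);
【Author information】 Guangzhou University of Chinese Medicine, Internal Medicine of Traditional Chinese Medicine, 2008, doctoral dissertation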
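【Abstract (Chinese original, translated)】 Background: Syndrome differentiation is the core of traditional Chinese medicine (TCM) and a precondition for therapeutic efficacy. To study the rules of TCM syndrome classification, researchers have introduced epidemiological methods, multivariate statistics, machine learning, neural networks and other approaches, and many schools of method now compete. Different methods, however, yield different classifiers, and the quality of a classifier directly affects the efficiency and accuracy of data mining. Most current studies that apply data analysis or data mining to TCM syndrome diagnosis are confined to the method under study and offer no comprehensive, in-depth horizontal comparison of the typical methods. Model evaluation methods are also used inconsistently and without standardization, so one-sided assessments are hard to avoid. Correctly evaluating the value of each classification method in TCM syndrome research, together with its strengths and weaknesses, so as to guide the choice of method, is a prerequisite for the sound use of methodology in multidisciplinary research on TCM modernization and a research direction with broad prospects. The syndrome and treatment patterns of primary insomnia are a current focus of clinical research, and here too many methods are used side by side with no consensus. This study takes primary insomnia as its entry point and builds a data platform on it. After statistical preprocessing and attribute reduction based on correlation analysis, principal component analysis and rough sets, we applied typical representatives of the statistical, machine learning and neural network families of classification methods: Logistic regression, the Bayesian classifier, a rule-based classification method (PARI), the C4.5 decision tree, and BP and RBF neural networks, and additionally introduced the probabilistic neural network and the support vector machine. These were used to classify TCM syndromes in clinical data on primary insomnia, in order to compare the methods horizontally, assess their value for TCM syndrome classification research, and propose data reduction, classification and model evaluation methods suited to the characteristics of TCM data.

Objectives:
1. To build TCM syndrome classification models for primary insomnia with the support vector machine and the probabilistic neural network, assess their value for TCM syndrome classification research, compare them with several other commonly used classification methods, and analyse the characteristics, strengths and weaknesses of each algorithm.
2. To compare the value of three attribute reduction methods (based on correlation analysis, principal component analysis and rough sets) in processing TCM syndrome data.

Methods: This was a cross-sectional survey. Drawing on domestic and international reports on primary insomnia and on TCM theory, we compiled an "Insomnia Clinical Observation Form" comprising Western medicine scales and a TCM syndrome questionnaire, and surveyed patients with primary insomnia attending the neurology and internal medicine outpatient clinics or the sleep and psychology specialty clinic at the Dade Road headquarters and the Fangcun branch of the Second Affiliated Hospital of Guangzhou University of Chinese Medicine. A database was built and the data entered with Epidata4.1a according to the form. After preprocessing (missing value imputation, discretization and normalization), the data were reduced in dimension by three methods: correlation analysis in SPSS13.0 (Spearman correlation coefficients, deleting variables whose coefficient had a P value above 0.05), principal component analysis in SPSS13.0 (retaining syndrome information with eigenvalue > 1 and communality > 0.4), and rough set attribute reduction in the Rosetta software (reduction based on the discernibility matrix). Using an improved sample division method, the database was split at a ratio of 5:1 (450 cases / 92 cases): the first 92 cases in random number order formed the validation set and the remaining 450 the training set. Models were then built on the training data from each of the three reductions: Logistic regression (Forward LR and Backward LR models) in SPSS13.0; the Bayesian classifier, the rule-based classifier (PARI) and the C4.5 decision tree in WEKA3.5.7; the BP, RBF and probabilistic neural networks with the Neural Network Toolbox of MATLAB7.0; and the support vector machine (polynomial, radial basis function and Sigmoid kernel models) in LIBSVM2.85. On the training set, goodness of fit and classification performance were evaluated by resubstitution and by 5-fold cross-validation. The main indices were sensitivity, specificity, accuracy, missed diagnosis rate, misdiagnosis rate, the Youden index, positive and negative predictive values, positive and negative likelihood ratios, agreement (Kappa value) and the ROC curve. The validation data were then used for a prospective evaluation of predictive performance, with accuracy, Kappa value, mean absolute error and root mean squared error as indices. The three reduction methods were compared mainly on attribute evaporation rate, the computational cost and complexity of the resulting models, and the classification and prediction performance of those models. These indices were used to judge the three reduction methods and the binary classifiers against one another.

Results: A total of 414 patients with primary insomnia were enrolled; 128 were observed at two time points and 286 at one. Treating each time point as a cross-section gave 542 syndrome records, with overlapping syndrome types among them. The syndrome of pathogenic fire derived from stagnation of liver-qi (肝郁化火证) was the most frequent, with 183 cases, and was used as the example for building classifiers.
1. There were 95 original independent variables (PSQI items, symptoms and signs, excluding light red tongue and thin white coating). Correlation reduction yielded a subset of 55 attributes, principal component reduction a subset of 33, and rough set reduction the smallest subset, with only 19. The attribute evaporation rates were 42.105%, 40.000% and 65.455% respectively, highest for rough set reduction. The models built on the rough set subset were all better than those built on the principal component subset and better than or similar to those built on the correlation subset.
2. For every model, resubstitution accuracy exceeded cross-validation accuracy, in some models by nearly 20%. When models with high resubstitution accuracy were then used to predict the validation set, accuracy fell markedly.
3. Logistic regression: The fitted Backward LR models were better than or similar to the Forward LR models on every index. For both forward and backward models built on the three reductions, the areas under the 5-fold cross-validation ROC curves did not differ significantly. For the backward models, mean 5-fold cross-validation accuracy was 86.222% and mean area under the ROC curve was 0.904, with no significant difference among the three; mean prediction accuracy was 89.855%.
4. Bayesian classifier: 5-fold cross-validation accuracy ranged from 79.111% to 87.556% (mean 84.148%), and the mean area under the ROC curve was 0.895. The models built on the correlation and rough set reductions differed significantly from the model built on the principal component reduction. Prediction accuracy ranged from 83.696% to 92.391% (mean 89.130%).
5. Rule-based classifier: The models built on the three reductions contained 5, 4 and 5 rules. The rules covered few training cases, and resubstitution and 5-fold cross-validation results differed widely. 5-fold cross-validation accuracy fluctuated between 77.778% and 87.556% (mean 83.037%), and the mean area under the ROC curve was 0.829. The models built on the correlation and rough set reductions differed significantly from the model built on the principal component reduction. Prediction accuracy ranged from 79.348% to 91.304% (mean 85.507%).
6. C4.5 decision tree: The models built on the three reductions were trees with 15, 12 and 10 nodes, and training was fast. All three trees, however, covered only attributes of the form "if the condition holds, the positive result holds", and overall classification ability was moderate: accuracy fluctuated around 85%, and the mean area under the 5-fold cross-validation ROC curve was about 0.834. The rough set model was significantly better than the other two. Prediction accuracy ranged from 83.696% to 89.130% (mean 86.957%).
7. Support vector machine: Of the three kernels, the radial basis function kernel classified best and exceeded the other two on every index. Its 5-fold cross-validation area under the ROC curve differed significantly from that of the Sigmoid kernel, and it used fewer support vectors. Accuracy rose markedly after parameter optimization. With the correlation reduction, classification accuracy reached 100%; with the other two reductions it was 88.222% and 92.222%. The 5-fold cross-validation area under the ROC curve was above 0.94, and the rough set model differed significantly from the principal component model. Prediction accuracy was above 92%.
8. BP network: The models built on the three reductions had 4, 3 and 5 hidden nodes. Parameter setting was time-consuming. Classification accuracy ranged from 81.778% to 89.111% (mean 85.185%). The mean area under the ROC curve was 0.889, and the correlation model was significantly better than the other two. Prediction accuracy fluctuated widely, from 73.913% to 95.652% (mean 86.594%), with large prediction error.
9. RBF neural network: The models built on the three reductions each had 3 hidden nodes. Learning was faster than with the BP network and parameter setting was simpler. Mean 5-fold cross-validation accuracy was 88.741%. The 5-fold cross-validation area under the ROC curve was above 0.89, and all pairwise differences among the three models were significant. Mean prediction accuracy was 90.217%.
10. PNN neural network: The network has few parameters and runs fast. 5-fold cross-validation accuracy was above 86% in every case and approached 95%, with a mean of 91.111%. The 5-fold cross-validation area under the ROC curve was above 0.93 (mean 0.967), and the principal component model was significantly worse than the other two. Prediction accuracy was above 90% in every case (mean 93.840%).
11. Based on the 5-fold cross-validation area under the ROC curve (AUC) and the hypothesis tests, the 8 models were graded by classification performance. Correlation reduction: SVM > PNN > Logistic, RBF > PARI, BP, C4.5; Bayes did not differ significantly from either of the last two groups and so falls between groups 3 and 4. Principal component reduction: SVM, PNN > RBF, Bayes > C4.5, PARI; Logistic and BP did not differ significantly from RBF, Bayes or C4.5 and so fall between groups 2 and 3. Rough set reduction: PNN > SVM > Bayes, Logistic, BP, C4.5 > PARI; RBF did not differ significantly from PNN or SVM and so falls between groups 1 and 2.

Conclusions:
1. Rough set attribute reduction removes as much unnecessary knowledge as possible from the information system (decision table) while preserving high classification ability, and yields a small attribute set that classifies syndrome types well. It is a reduction method worth promoting in TCM syndrome data processing.
2. Resubstitution tends to overestimate classification performance, so it has little practical value and is unsuitable for objective model evaluation. The results of 5-fold cross-validation are more stable and reflect the true classification ability of the model; in particular, in the presence of interference it avoids large fluctuations in the classification results. Cross-validation is recommended for the objective evaluation of classification performance in future studies.
3. Compared with traditional evaluation indices, the ROC curve is highly credible, objective and precise, and in particular is unaffected by the data environment. The areas under the curves of two diagnostic tests can also be compared by hypothesis test, which makes the results more intuitive and objective.
4. Overall, all 8 models have some diagnostic value. SVM, PNN and RBF are the best, Logistic regression, the Bayesian classifier and BP come next, and C4.5 and PARI are moderate.
5. Logistic regression has a well-developed system for model evaluation, revision and diagnostics, and shows clearly the size and direction of each independent variable's contribution. It is, however, susceptible to collinearity and strongly influential points in TCM syndrome data, and its prediction accuracy and error ranked in the middle of the 8 models. Models built by Backward LR were slightly better than those built by Forward LR. Because Backward LR selection tends to bring in variables with strong joint effects, it is recommended for TCM syndrome data, in which correlation is widespread.
6. The Bayesian classifier is susceptible to frequencies and prior probabilities, and its classification performance is similar to that of Logistic regression.
7. The rule-based classifier produces easily understood rules together with the strength of each rule, but its classification ability, prediction ability and robustness are all poor. It is suited to extracting rules that help in understanding the meaning of TCM syndromes, not to classification and prediction research.
8. The C4.5 decision tree produces a visual tree diagram that helps in understanding intuitively how much each attribute contributes to syndrome discrimination, and it is robust to strongly influential points. However, its sensitivity, misdiagnosis rate, negative predictive value and negative likelihood ratio are relatively low, while its missed diagnosis rate, specificity, positive predictive value and positive likelihood ratio are relatively high; its classification ability is moderate and its prediction error large. We consider it suited to forming decision trees that help in understanding the meaning of TCM syndromes, not to classification and prediction research.
9. Among support vector machines, the radial basis function kernel model is well suited to TCM syndrome data. Its classification performance and prediction accuracy exceed those of the polynomial and Sigmoid kernels, it uses fewer support vectors, and it generalizes well. The RBF kernel is therefore a good choice when SVM is used for TCM syndrome classification. SVM can construct an optimal hyperplane for TCM syndrome data, so that data that are not linearly separable are divided with high accuracy in the feature space. Its classification performance is better than that of the other classifiers, and the model is robust and generalizes well. Introducing SVM into TCM syndrome research is feasible and effective.
10. For TCM syndrome diagnosis, the BP network learns slowly, generalizes poorly and is prone to local minima. The feature vectors of TCM syndromes are also hard to obtain and diagnostic accuracy is not high, so its practical usefulness is limited and it is difficult to promote.
11. The RBF neural network learns faster than the BP network and its parameters are simpler to set. It recognizes, classifies and predicts TCM syndrome data well and is fairly robust, and it is a method suited to TCM syndrome research.
12. The PNN neural network has few parameters, runs fast and is fairly robust. Its classification performance and prediction accuracy are both high, second only to SVM, and it generalizes well. It recognizes the classification information in TCM syndrome data well and performs syndrome classification and prediction satisfactorily, and it is a technique worth promoting in TCM syndrome classification research.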
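The three attribute evaporation rates reported in the results can be checked by hand. The following is a minimal sketch, assuming the rate is defined as (attributes before reduction − attributes after) / attributes before; the abstract does not give the definition, and the function name is hypothetical. Under that assumption, the reported 40.000% and 65.455% are reproduced only if the principal component and rough set reductions are measured against the 55-attribute correlation subset rather than the 95 original variables. That reading is an inference from the arithmetic, not something the abstract states.

```python
def evaporation_rate(before: int, after: int) -> float:
    """Percentage of attributes removed by a reduction step (assumed definition)."""
    return round(100 * (before - after) / before, 3)


# Attribute counts taken from the abstract.
ORIGINAL, CORRELATION, PRINCIPAL_COMPONENT, ROUGH_SET = 95, 55, 33, 19

rates = {
    "correlation (relative to 95 original variables)":
        evaporation_rate(ORIGINAL, CORRELATION),
    "principal component (relative to the 55-attribute subset)":
        evaporation_rate(CORRELATION, PRINCIPAL_COMPONENT),
    "rough set (relative to the 55-attribute subset)":
        evaporation_rate(CORRELATION, ROUGH_SET),
}

for name, rate in rates.items():
    print(f"{name}: {rate:.3f}%")
```

Measured against all 95 original variables instead, the same function gives about 65.3% for the principal component subset and 80.0% for the rough set subset, so the ordering of the three methods does not depend on which baseline is intended.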
【Abstract】 Background: Syndrome differentiation is the core of Traditional Chinese Medicine (TCM) and the precondition for therapeutic efficacy. To study the rules of TCM syndrome classification, epidemiological methods, multivariate statistics, machine learning, neural networks and many other methods have been introduced, and many schools of method now contend. However, different methods produce different classifiers, and the quality of a classifier directly affects the efficiency and accuracy of data mining. At present, most research applying data analysis or data mining methods to TCM syndrome differentiation is limited to the method being studied; a comprehensive horizontal comparison among the typical methods has not yet been made. Furthermore, model evaluation methods are used inconsistently and without standardization, so one-sided assessments are difficult to avoid. Correctly evaluating the value of each classification method in TCM syndrome differentiation research, together with its advantages and disadvantages, so as to guide the choice of method, is a prerequisite for the sound application of methods in multidisciplinary research on TCM modernization and a direction with broad prospects. Syndrome differentiation and treatment rules in primary insomnia are a focus of current clinical research, and the methods applied there are equally varied and without consensus. This study takes primary insomnia as its object and collects the related clinical data. On this data platform, we first carry out attribute reduction based on statistical preprocessing, correlation analysis, principal component analysis and rough set methods. 
Then we apply typical classification methods from the statistical, machine learning and neural network families: Logistic regression, the Bayesian classifier, a rule-based classification method (PARI), the C4.5 decision tree, and BP and RBF neural networks, together with the probabilistic neural network and the support vector machine, to classify TCM syndromes in the clinical data on primary insomnia. We compare these methods horizontally and assess their value for TCM syndrome classification. On this basis we propose data reduction, classification and model evaluation methods that suit the characteristics of TCM data.
Objective:
1. To establish classification models for the syndrome of pathogenic fire derived from stagnation of liver-qi in primary insomnia with the support vector machine and the probabilistic neural network, assess their value for TCM syndrome classification, and compare them with several other commonly used classification methods to evaluate the characteristics of each.
2. To compare three attribute reduction methods (based on correlation analysis, principal component analysis and rough sets) and assess their value in processing TCM syndrome data.
Method: This study is a cross-sectional survey. Based on domestic and foreign research reports and TCM theory on primary insomnia, we compiled an "Insomnia Clinical Observation Questionnaire", comprising Western medicine scales and a TCM syndrome questionnaire, and used it to survey outpatients with primary insomnia at Guangdong Province Hospital of TCM. A database was established in Epidata4.1a according to the content of the questionnaire. 
After preprocessing (filling missing values, discretization and normalization), attribute reduction (dimension reduction) was performed by three methods: bivariate correlation analysis in SPSS 13.0 (Spearman correlation coefficients, removing attributes whose P value was above 0.05), principal component analysis in SPSS 13.0 (retaining attributes with eigenvalue above 1 and communality above 0.4), and rough set reduction in the ROSETTA software. The database was split into two parts by the improved sample division method at a ratio of 5:1 (450 cases / 92 cases): the first 92 cases in random number order formed the test set and the other 450 the training set. Models were then built on the training data from each of the three reductions by the following methods: Logistic regression (Forward LR and Backward LR models) in SPSS 13.0; the Bayesian classifier, rule-based classification (PARI) and the C4.5 decision tree in WEKA3.5.7; BP, RBF and probabilistic neural networks with the neural network toolbox of MATLAB7.0; and the support vector machine (polynomial, radial basis function and Sigmoid kernel models) in LIBSVM 2.85. On the training set, resubstitution (testing each model on its own training data) and five-fold cross-validation were used to evaluate the goodness of fit and classification performance of the models. 
The main assessment indices were sensitivity, specificity, accuracy, the missed diagnosis rate, the misdiagnosis rate, the Youden index, positive predictive value, negative predictive value, positive likelihood ratio, negative likelihood ratio, agreement (Kappa value) and the ROC curve. The models were then used to predict the test set as a prospective evaluation, with accuracy, Kappa value, mean absolute error and root mean squared error as indices. The three attribute reduction methods were assessed on attribute evaporation rate, computational cost and model complexity, and the classification and prediction performance of the resulting models. With these indices we weighed the pros and cons of the three reduction methods and of the two-class classification models.
Result: 414 patients with primary insomnia were enrolled, of whom 128 completed two observations and 286 completed one. Taking each observation as a cross-section, 542 syndrome records were collected, with overlapping syndromes. The most frequent syndrome was pathogenic fire derived from stagnation of liver-qi, with 183 cases, and it was used as the example for building the classifiers.
1. There were 95 original variables (PSQI items, symptoms and signs, excluding light red tongue and thin whitish fur). Reduction by bivariate correlation analysis gave a subset of 55 attributes and principal component reduction a subset of 33; the subset from rough set reduction was the smallest, with only 19 attributes and the highest attribute evaporation rate (65.455%). Models built on it performed better than the principal component reduction models and better than or similar to the correlation reduction models.
2. For every kind of model, resubstitution accuracy was higher than cross-validation accuracy; in some models the difference reached nearly 20%. 
However, when models with high resubstitution accuracy were then used to predict the test set, accuracy was markedly lower.
3. Logistic regression model: The Backward LR model was better than or similar to the Forward LR model on all indices. For both forward and backward models, the areas under the ROC curve (AUC) in 5-fold cross-validation did not differ significantly among the models built on the three reductions. For the backward models, the average classification accuracy in 5-fold cross-validation was 86.222% and the average AUC was 0.904, with no significant difference among the three, and the average prediction accuracy was 89.855%.
4. Bayesian classifier: The 5-fold cross-validation accuracy of the Bayesian classifiers built on the three reductions ranged from 79.111% to 87.556%, averaging 84.148%. The average AUC in 5-fold cross-validation was 0.895, and the models from the correlation and rough set reductions differed significantly from the model from the principal component reduction. Prediction accuracy ranged from 83.696% to 92.391%, averaging 89.130%.
5. Rule-based classifier: The models built on the three reductions contained 5, 4 and 5 rules. The rules covered relatively few training cases, and there was a large gap between resubstitution accuracy and 5-fold cross-validation accuracy. The 5-fold cross-validation accuracy of the three models fluctuated between 77.778% and 87.556%, averaging 83.037%. The AUC in 5-fold cross-validation averaged 0.829, and prediction accuracy ranged from 79.348% to 91.304%, averaging 85.507%.
6. C4.5 decision tree: The trees built on the three reductions had 15, 12 and 10 nodes. Training was fast. However, all three models covered only attributes of the form "if the condition holds, the positive result holds", so overall classification capability was mediocre. Accuracy was about 85%. 
The average area under the ROC curve in 5-fold cross-validation was approximately 0.834, and that of the rough set reduction model was significantly larger than those of the other two models. Prediction accuracy ranged from 83.696% to 89.130%, averaging 86.957%.
7. SVM: Among the three kernel models, the radial basis function (RBF) kernel model classified best, surpassing the other two on all indices. Its AUC in 5-fold cross-validation differed significantly from that of the Sigmoid kernel model, and it used fewer support vectors. After parameter optimization, accuracy increased markedly. The classification accuracy of the model built on the correlation reduction reached 100%; those of the other two models were 88.222% and 92.222%. The AUC in 5-fold cross-validation was above 0.94, and that of the rough set reduction model was significantly better than that of the principal component reduction model. Prediction accuracy was above 92%.
8. BP network: BP networks with 4, 3 and 5 hidden nodes were built on the three reductions. Parameter setting was time-consuming, and classification and prediction accuracy were volatile, with high prediction error. Classification accuracy ranged from 81.778% to 89.111%, averaging 85.185%. The average AUC was 0.889, and that of the correlation reduction model was significantly better than those of the other two reduction models. Prediction accuracy fluctuated widely, between 73.913% and 95.652%, averaging 86.594%.
9. RBF neural network: The models built on the three reductions were RBF networks with 3 hidden nodes each. Learning was faster than with the BP network, and parameter setting was simpler. 
The average classification accuracy in 5-fold cross-validation was 88.741%. The AUC in 5-fold cross-validation was above 0.89, and all pairwise differences among the three reduction models were significant. The average prediction accuracy was 90.217%.
10. PNN neural network: The models had few parameters and ran fast. Classification accuracy in 5-fold cross-validation was above 86% in every case, reaching approximately 95%, and averaged 91.111%. The AUC in 5-fold cross-validation was above 0.93, averaging 0.967, and that of the principal component reduction model was significantly lower than those of the other two reduction models. Prediction accuracy was above 90% in every case, averaging 93.840%.
11. According to the 5-fold cross-validation AUC and the hypothesis test results, the eight models were graded by classification performance.
Correlation reduction models: SVM > PNN > Logistic, RBF > PARI, BP, C4.5. The Bayesian classifier did not differ significantly from the models in either of the last two grades, so it falls between grades 3 and 4.
Principal component reduction models: SVM, PNN > RBF, Bayes > C4.5, PARI. Logistic and BP did not differ significantly from RBF, Bayes or C4.5, so they fall between grades 2 and 3.
Rough set reduction models: PNN > SVM > Bayes, Logistic, BP, C4.5 > PARI. RBF did not differ significantly from PNN or SVM, so it falls between grades 1 and 2.
Conclusion:
1. Attribute reduction based on rough sets maintains high classification capability while eliminating unnecessary knowledge from the information system (decision table) as far as possible, yielding a small attribute subset that classifies syndromes well. 
It is therefore a reduction method worth promoting in TCM syndrome data processing.
2. Resubstitution tends to overestimate classifier performance, so it has little practical value and is not suitable for the objective evaluation of models. The results of 5-fold cross-validation are more stable and reflect the true classification capacity of the models; in particular, with noisy data it avoids large fluctuations in the classification results. We recommend that future studies use cross-validation wherever possible to evaluate classifiers objectively.
3. Compared with traditional evaluation indices, the ROC curve has the advantages of high reliability and accurate, objective description, and in particular it is unaffected by the data environment. The AUCs of two diagnostic tests can also be compared by hypothesis test, so its results are more intuitive and objective.
4. Overall, the eight models applied in this study all have some diagnostic value. SVM, PNN and RBF are the best, followed by Logistic regression and the Bayesian classifier; BP, C4.5 and PARI are mediocre.
5. The Logistic regression model has a well-developed system for evaluation, revision and diagnostics, and clearly shows the magnitude and direction of the contribution of each attribute in the model. But it is easily affected by collinearity and strongly influential points, and its prediction accuracy and error rank in the middle of the eight models. The Backward LR model is slightly superior to the Forward LR model. Because Backward LR selection favours variables with strong joint effects, the Backward LR model is recommended for TCM syndrome data, in which correlation among variables is common.
6. The Bayesian classifier is easily affected by frequencies and prior probabilities. 
Its performance is similar to that of the Logistic regression model.
7. The rule-based classifier generates easy-to-understand rules and shows the strength of each rule. But its classification and prediction capabilities are poor and it lacks robustness, so the model is suitable for extracting rules that help in understanding the meaning of TCM syndromes but unfit for classification and prediction research.
8. The C4.5 decision tree generates a visual tree diagram that helps in understanding intuitively the contribution of each attribute to syndrome discrimination, and it is robust to strongly influential points. But its sensitivity, misdiagnosis rate, negative predictive value and negative likelihood ratio are relatively low, while its missed diagnosis rate, specificity, positive predictive value and positive likelihood ratio are relatively high. Its classification capability is mediocre and its prediction error high. We suggest that the model is suitable for forming decision trees that help in understanding the meaning of TCM syndromes, but not for classification and prediction research.
9. The radial basis function (RBF) kernel model of the support vector machine is well suited to the analysis of TCM syndrome research data. Its classification and prediction accuracy are superior to those of the polynomial and Sigmoid kernel models, it uses fewer support vectors, and it generalizes well. The RBF kernel is therefore a good choice when SVM is used for TCM syndrome classification. SVM can construct an optimal hyperplane for TCM syndrome data, so that data that are not linearly separable are divided with relatively high accuracy in the feature space. Its classification capability is better than that of the other classifiers, with better robustness and generalization. For these reasons, introducing SVM into TCM syndrome research is feasible and effective.
10. The BP network learns slowly when used for TCM syndrome diagnosis. 
Its generalization ability is poor, and it is prone to falling into local minima. The feature vectors of TCM syndromes are also difficult to obtain, and its syndrome diagnostic accuracy is not high. Its practical usefulness is therefore relatively poor and it is difficult to promote.
11. The RBF neural network learns faster than the BP neural network and its parameters are simpler to set. It classifies and predicts TCM syndrome data well, is fairly robust, and is applicable to TCM syndrome research.
12. The PNN neural network has few parameters, runs fast and is fairly robust. Its classification and prediction accuracy are high, second only to SVM. It generalizes well, recognizes the classification information in TCM syndrome data, and performs syndrome classification and prediction satisfactorily. It is therefore worth promoting in TCM syndrome classification research.
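The evaluation indices listed in the Method section all derive from one 2×2 table of reference diagnosis against model output. The following is a minimal sketch of those definitions, assuming the standard epidemiological formulas (missed diagnosis rate = 1 − sensitivity, misdiagnosis rate = 1 − specificity, Cohen's kappa for agreement); the dissertation's own formulas are not given in the abstract. The class name, function name and the counts in the usage lines are hypothetical and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 table for one syndrome: reference diagnosis against model output."""
    tp: int  # syndrome present, model says present
    fn: int  # syndrome present, model says absent (missed diagnosis)
    fp: int  # syndrome absent, model says present (misdiagnosis)
    tn: int  # syndrome absent, model says absent


def diagnostic_indices(c: Confusion) -> dict:
    """Compute the indices named in the abstract; assumes all four cells are nonzero."""
    n = c.tp + c.fn + c.fp + c.tn
    sens = c.tp / (c.tp + c.fn)
    spec = c.tn / (c.tn + c.fp)
    observed = (c.tp + c.tn) / n
    # Chance agreement for Cohen's kappa, from the row and column totals.
    expected = ((c.tp + c.fn) * (c.tp + c.fp)
                + (c.fp + c.tn) * (c.fn + c.tn)) / n**2
    return {
        "sensitivity": sens,
        "specificity": spec,
        "accuracy": observed,
        "missed_diagnosis_rate": 1 - sens,
        "misdiagnosis_rate": 1 - spec,
        "youden_index": sens + spec - 1,
        "positive_predictive_value": c.tp / (c.tp + c.fp),
        "negative_predictive_value": c.tn / (c.tn + c.fn),
        "positive_likelihood_ratio": sens / (1 - spec),
        "negative_likelihood_ratio": (1 - sens) / spec,
        "kappa": (observed - expected) / (1 - expected),
    }


# Hypothetical counts for illustration only.
example = Confusion(tp=40, fn=10, fp=5, tn=45)
for name, value in diagnostic_indices(example).items():
    print(f"{name}: {value:.3f}")
```

Because sensitivity and specificity move in opposite directions as the decision threshold changes, a model can score high on one group of these indices and low on the other, which is the pattern conclusion 8 reports for the C4.5 tree.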
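- 【Online publication contributor】 Guangzhou University of Chinese Medicine 【Online publication year and issue】 2008, Issue 09
- 【Classification number】 R241
- 【Times cited】 12
- 【Downloads】 1113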