

Design of Individual Physical Examination Package Based on Data Mining

【Author】 Zhan Yin (詹引)

【Advisor】 Jin Xinzheng (金新政)

【Author information】 Huazhong University of Science and Technology, Information Science, 2010, Master's thesis

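【Abstract, translated from the Chinese】 Background: Regular health examination is one of the effective means of disease prevention and control and has been called the fifth cornerstone of health. The health examination industry in China currently suffers from unclear positioning of examination institutions, monotonous service content, and uneven service quality. In particular, the examination packages that the public most demands commonly show low-level duplication, chaotic pricing, and unreasonable item selection.

Objective: This study applies data mining techniques and a questionnaire survey to explore the needs of different types of customers for health examinations and examination packages, and to construct individualized examination packages suited to different groups.

Methods: A random sample of 288 people in Luohu District, Shenzhen, was surveyed on satisfaction with examination packages, and the results were analyzed statistically with Microsoft Office Excel. Health examination data covering 34,224 person-visits in 2009 were collected from the examination center of a Grade III, Class A hospital in Shenzhen. The data mining software Clementine 11.1 was used to build a K-Means clustering model, a C5.0 decision tree model, and an Apriori association rule model on these data.

Results: In the survey, 42.7% of respondents were dissatisfied with existing examination packages, 52.8% were willing to accept a package priced at 300 to 500 yuan, and 38.3% and 36.6% considered 10 to 20 items and 20 to 30 items, respectively, the number that best met their needs. The K-Means clustering model divided the 34,224 person-visits into 6 clusters according to the number of examination items and the cost. The C5.0 decision tree model summarized the gender and age characteristics of the customers in each of the 6 clusters. The Apriori association rule model examined association rules among the examination items chosen by all customers. Based on the data mining and survey results, 9 health examination packages were designed for groups of different age and gender.

Conclusion: This study is the first to propose applying data mining techniques to the design of health examination packages. It makes a reasonable attempt at innovation in package design methods, offers a feasible approach for developing new ones, and promotes wider use of health examination information systems and data mining in the health examination industry. The health examination data analysis models and the package design data stream built with Clementine 11.1 can serve as a reference for related research. The novelty of the study lies in being the first to propose data-mining-based package design and in building, with Clementine 11.1, a package design data stream and 9 packages for different types of customers; no related research has been reported in China or abroad. The limitations of the study are a single data source and an insufficient volume of data, which lower the reliability of the data mining results and narrow the applicability of the conclusions.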
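The clustering step in the Methods can be illustrated with a minimal sketch. The study ran K-Means in Clementine 11.1 on the number of examination items and the cost; the code below is not that stream. It is a plain-Python version of Lloyd's algorithm on invented visit records, with 3 clusters instead of 6 for brevity. The min-max scaling and the choice of the first k points as seeds are assumptions made for this example.

```python
import math

def min_max_scale(rows):
    """Scale each column to [0, 1] so cost does not dominate item count."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def k_means(points, k, max_iter=100):
    """Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = list(points[:k])
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical visits: (number of examination items, cost in yuan).
visits = [
    (8, 180), (22, 480), (40, 1500),
    (10, 220), (25, 520), (38, 1400),
    (9, 200), (24, 500), (42, 1600),
]

labels, centroids = k_means(min_max_scale(visits), k=3)
for visit, label in zip(visits, labels):
    print(visit, "-> cluster", label)
```

On this toy data the low, mid, and high item-count visits fall into three separate clusters. Real examination data would need a more careful choice of k and of the initial centroids.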

【Abstract】 Background: Periodic health examination is an effective means of disease prevention and control and is known as the fifth cornerstone of health. The physical examination industry currently faces several problems: the positioning of examination institutions is unclear, the services offered are monotonous, and quality is uneven. In addition, the examination packages in greatest public demand commonly suffer from low-level duplication, chaotic pricing, and poorly chosen items.

Objective: This study used data mining and a questionnaire survey to investigate the demands of different kinds of customers for physical examinations and examination packages, and to design individualized physical examination packages for them.

Methods: A random sample of 228 participants in Luohu District, Shenzhen, was surveyed on satisfaction with physical examination packages, and the questionnaire results were analyzed with Microsoft Office Excel. Data on 34,224 person-visits in 2009 were collected from the physical examination center of a hospital in Shenzhen. A K-Means clustering model, a C5.0 decision tree model, and an Apriori association rule model were built on these data with the data mining software Clementine 11.1.

Results: Of the respondents, 42.7% were not satisfied with existing physical examination packages, 52.8% were willing to take a package priced at CNY 300 to 500, and 38.3% and 36.6% considered packages of 10 to 20 items and 20 to 30 items, respectively, best suited to their needs. The K-Means clustering model divided the 34,224 person-visits into 6 clusters based on the number of examination items and the cost. The C5.0 decision tree model summarized the gender and age characteristics of the customers in each of the 6 clusters. The Apriori association rule model examined the association rules among the examination items. Finally, 9 physical examination packages were designed for people of different ages and genders on the basis of the data mining and questionnaire results.

Conclusion: This study is the first to propose designing physical examination packages with data mining. It makes a reasonable attempt at innovation in package design, offers a viable approach to it, and promotes wider use of physical examination information systems and data mining in the health examination industry. The physical examination data analysis models and the package design stream built with Clementine 11.1 can serve as a reference for related work. The innovations of the study are the proposal of data-mining-based package design and the use of Clementine 11.1 to build a package design stream and 9 physical examination packages for different kinds of people, for which no related research has been reported in China or abroad. The shortcomings of the study are the limited amount of data and the single data source, which reduce the reliability and the scope of application of the data mining results. The next step is to verify the packages against medical theory and in clinical practice.
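The association rule step named in the abstract can be sketched as follows. This is not the Clementine 11.1 model used in the study. It is a small level-wise Apriori in plain Python, run on invented transactions. The item names (`blood_routine`, `liver_function`, `chest_xray`, `ecg`) and the thresholds are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)

    def support(itemset):
        return sum(itemset <= t for t in tx) / n

    # Level 1: frequent single items.
    current = {}
    for item in sorted({i for t in tx for i in t}):
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) for one-item consequents."""
    found = []
    for itemset, sup in frequent.items():
        if len(itemset) < 2:
            continue
        for item in itemset:
            antecedent = itemset - {item}
            # Subsets of a frequent itemset are frequent, so this lookup succeeds.
            confidence = sup / frequent[antecedent]
            if confidence >= min_confidence:
                found.append((antecedent, frozenset([item]), sup, confidence))
    return found

# Hypothetical examination records: the set of items chosen in each visit.
records = [
    {"blood_routine", "liver_function", "chest_xray"},
    {"blood_routine", "liver_function"},
    {"blood_routine", "liver_function", "ecg"},
    {"blood_routine", "chest_xray"},
    {"ecg", "chest_xray"},
]

frequent = apriori(records, min_support=0.4)
for antecedent, consequent, sup, conf in rules(frequent, min_confidence=0.9):
    print(sorted(antecedent), "->", sorted(consequent), f"support={sup:.2f} confidence={conf:.2f}")
```

A rule such as "customers who choose `liver_function` also choose `blood_routine`" is the kind of pattern that could suggest bundling two items in one package. Whether a given bundle is medically sensible is a separate question that the mining step does not answer.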


