A Study on a Primary Osteoporosis Screening Tool for Women Aged 40-65 Based on the Generalized Partial Linear Model

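【Author】 Tian Feng (田峰)

【Supervisor】 Xie Yanming (谢雁鸣)

【Author information】 China Academy of Chinese Medical Sciences, Clinical Medicine of Integrated Traditional Chinese and Western Medicine, 2011, doctoral dissertation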

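【Abstract (translated from the Chinese)】
1 Objective
1.1 To identify the important risk factors and traditional Chinese medicine (TCM) syndrome elements associated with the onset of primary osteoporosis (POP), and so clarify the main factors influencing its onset.
1.2 To build a POP discriminant model based on the generalized partial linear model (GPLM) that includes both risk factors and TCM syndrome elements, providing a mathematical basis for a POP screening tool.
1.3 To develop a preliminary POP screening tool suited to the community populations of Beijing and Shanghai, providing a scientific basis for screening people at high risk of POP.
2 Methods
2.1 Design of the POP screening questionnaire
The starting point was the Primary Osteoporosis TCM Syndrome Questionnaire (《原发性骨质疏松症中医证候调查问卷》) designed earlier by the supervisor's research group. Drawing on scale-development and clinical epidemiology methods, the clinical experience of osteoporosis experts, and the TCM syndrome differentiation content of the Guidelines for Diagnosis and Treatment of Common Internal Diseases in Chinese Medicine: Diseases of Modern Medicine, we added somatic symptom items and new domains covering lifestyle and disease-related factors. The result was the Questionnaire on Osteoporosis Risk Factors and Syndromes in Community Women Aged 40-65 (《社区40岁~65岁妇女骨质疏松危险因素及证候调查问卷》). The questionnaire covers five domains (general information, lifestyle, disease-related factors, physical condition, and clinical signs) with 65 closed-ended items. Before the survey, the independent ethics committee of the Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, reviewed the questionnaire and judged that it met medical ethics requirements.
2.2 Screening criteria for the survey population
2.2.1 Inclusion criteria
(1) Female; (2) aged 40-65 years; (3) conscious, able to express herself verbally, able to read, and able to communicate with investigators without difficulty; (4) after the investigator explained the purpose of the study, willing to take the screening questionnaire and the bone mineral density (BMD) test, and signing consent on the questionnaire cover page.
2.2.2 Exclusion criteria
(1) Secondary osteoporosis caused by drugs or other diseases (such as diabetes, suppurative myelitis, nephritis, or hyperthyroidism); (2) malignant tumor, gout, rheumatoid arthritis, or other diseases that interfere with TCM syndrome assessment; (3) mental or cognitive disorders.
2.3 Diagnostic criteria for POP
Diagnosis followed the BMD T-value criteria in the Guidelines for Diagnosis and Treatment of Common Internal Diseases in Chinese Medicine: Diseases of Modern Medicine, issued by the China Association of Chinese Medicine in 2008. The minimum T value among three sites on the BMD report (lumbar spine L1-L4, femoral neck, and total femur) was used. A T value above M-1SD was classed as normal bone mass, between M-1SD and M-2.0SD as osteopenia, and below M-2.0SD as osteoporosis.
2.4 Data source
From March to August 2009, screening for people at high risk of POP was carried out at three community health service centers in Xuhui District, Shanghai (Lingyun, Huajing Town, and Changqiao) and five in Dongcheng District, Beijing (Jiaodaokou, Jingshan, Chaoyang, Donghuamen, and Beixinqiao). People who met the inclusion criteria completed the questionnaire on site and underwent BMD testing. In Shanghai, 1101 questionnaires were distributed and 1027 returned; 26 were excluded as unqualified after checking, leaving 1001 qualified questionnaires (90.92% of those distributed). In Beijing, 800 questionnaires were distributed and 763 returned; 24 were excluded, leaving 739 qualified questionnaires (92.38% of those distributed). Qualified questionnaires were entered into the web-based data collection platform "Osteoporosis Health Management System" (http://210.76.97.192:8080/gzss), developed by the research group together with the University of Science and Technology Beijing, using independent double entry and double checking followed by a consistency test. In total, 1740 qualified questionnaires with BMD data were obtained.
2.5 Statistical analysis
2.5.1 Software
SPSS 18.0 for Windows, SAS 9.2 for Windows, and the SPSS Clementine 12.0 data mining package were used for questionnaire data analysis and statistical modeling. Lantern 1.5 was used for latent tree model analysis.
2.5.2 Reliability and validity of the screening questionnaire
Reliability was assessed with Cronbach's α coefficient and the Guttman split-half coefficient. The structural validity of the items in the "somatic symptoms" domain was assessed by factor analysis.
2.5.3 Analysis of POP risk factors
Multinomial logistic regression was used to quantify the relationship between POP and each risk factor, and a multinomial logit model was built as a first screen for risk factors influencing POP.
2.5.4 Analysis of TCM syndrome elements of POP
A latent tree model was used to infer unobservable "syndrome element latent variables" from observable "symptom manifest variables", to build the latent tree structure linking those latent variables, and to analyze the basic TCM syndrome elements of POP and their relationships.
2.5.5 Construction of the POP discriminant model
The support vector machine (SVM) data mining method was used to select the important risk factors and TCM symptoms associated with osteoporosis as independent variables. With the BMD qualitative diagnosis as the dependent variable, a GPLM-based POP discriminant model was built.
2.5.6 Evaluation of the POP screening tool
The receiver operating characteristic (ROC) curve was used to assess the discriminative accuracy of the screening tool, and the area under the ROC curve (AUC) was used to assess its diagnostic value.
3 Results
3.1 Reliability and validity of the screening questionnaire
3.1.1 Reliability
For the items in the "somatic symptoms" domain, Cronbach's α for the four dimensions (kidney yang deficiency syndrome, liver-kidney yin deficiency syndrome, spleen-kidney yang deficiency syndrome, and blood stasis syndrome) was 0.803, 0.871, 0.811, and 0.707 respectively, and 0.913 for the whole domain. The Guttman split-half reliability of the four dimensions was 0.789, 0.831, 0.743, and 0.699 respectively, and 0.867 for the whole domain.
3.1.2 Validity
Factor analysis was performed with principal component extraction of common factors and average orthogonal rotation, with 25 iterations. The KMO statistic was 0.935 (> 0.5), and Bartlett's test gave an approximate χ² = 18058.066, df = 741, P < 0.01. Ten factors with eigenvalues > 1 were extracted, with a cumulative variance contribution of 53.789%.
3.2 Screening of POP risk factors
3.2.1 Description of POP-related factors
Analysis of variance showed that mean age, body mass index (BMI), and years since menopause differed significantly among the normal bone mass, osteopenia, and osteoporosis groups (P < 0.05). Cross-tabulation showed that the distributions of meat intake, fish intake, coffee drinking, daily exercise time, height loss, menopausal status, number of pregnancies, number of deliveries, and number of fractures differed significantly among the three groups (P < 0.05).
3.2.2 Multinomial logit model of POP risk factors
The nine factors above (meat intake, fish intake, coffee drinking, daily exercise time, height loss, menopausal status, number of pregnancies, number of deliveries, and number of fractures) were entered as independent variables, with years since menopause and BMI as covariates. The BMD qualitative diagnosis was the dependent variable, with the normal bone mass group as the reference category. The model was built with the forward stepwise method of the Multinomial Logistic procedure in SPSS 18.0. At α = 0.05, the variables retained in the final model were years since menopause, BMI, fish intake, height loss, menopausal status, and number of deliveries.
3.3 Analysis of TCM syndrome elements of POP
3.3.1 Description of somatic symptoms
In the combined Beijing and Shanghai population, the somatic symptoms with a frequency above 15% were: forgetfulness, pain worsening with cold, soreness and weakness of the lower back and knees, aversion to heat, fatigue, aversion to cold, low back pain, irritability, hair loss, poor appetite, blurred vision, dizziness, lower limb bone pain, heaviness of the lower limbs, insomnia, nocturia frequency, clear and copious urine, weak legs, back pain, loose teeth, dry eyes, shortness of breath, constipation, dream-disturbed sleep with easy startling, lower limb cramps, body pain, tinnitus, night sweats, frequent urination, bitter taste in the mouth, and feverish palms and soles. By analysis of variance and the two-independent-sample t-test at α = 0.05, nine symptoms differed significantly in incidence between the normal bone mass and osteoporosis groups (P < 0.05): aversion to cold, dry eyes, loose teeth, poor appetite, abdominal distension, fullness in the chest and hypochondrium, nocturia frequency, lower limb cramps, and lower limb bone pain.
3.3.2 Latent tree model of TCM syndrome elements of POP
The heuristic single hill-climbing algorithm was used to learn the latent tree model with the highest BIC score, which was -15671. The latent variables revealed by the manifest symptom variables matched the basic syndrome elements commonly seen in POP: kidney deficiency, liver deficiency, yang deficiency, yin deficiency, kidney essence insufficiency, and blood stasis. The disease was located mainly in the liver and kidney, and its nature was mainly deficiency.
3.4 Construction and evaluation of the GPLM-based POP discriminant model
3.4.1 Variable selection for the GPLM
Variable selection with an SVM using the RBF kernel identified 12 variables with important discriminative value for osteoporosis: menopausal status, years since menopause, fish intake, dry eyes, hunchback, BMI, number of deliveries, lower limb cramps, lower limb bone pain, abdominal distension, fullness in the chest and hypochondrium, and coffee drinking.
3.4.2 Parameter estimation for the GPLM
The GPLM was built with the risk factors selected by the multinomial logit model and the TCM symptom variables selected by the SVM as independent variables, and the BMD qualitative diagnosis as the dependent variable. In the linear part, the coefficients of menopausal status, BMI, lower limb cramps, lower limb bone pain, and years since menopause (linear effect) were 1.14182, -0.15805, 0.36149, 0.32267, and 0.12956 respectively, all statistically significant (P < 0.05). In the nonlinear part, the test for years since menopause (nonlinear effect) gave χ² = 13.5948, P = 0.0012, which was statistically significant (P < 0.05).
3.4.3 Evaluation of the GPLM
A linear logistic regression model with four risk factors and TCM symptoms (menopausal status, lower limb cramps, lower limb bone pain, and BMI) as independent variables and the BMD qualitative diagnosis as the dependent variable had an AUC of 0.7536. The GPLM, which added the nonlinear effect of years since menopause, had an AUC of 0.7971 and so discriminated osteoporosis more accurately. The difference was statistically significant (χ² = 21.9162, P < 0.001), indicating that the GPLM with a nonlinear effect outperforms the linear logistic regression model.
3.5 Construction and evaluation of the POP screening tool
3.5.1 Construction of the POP screening tool
Based on the GPLM, three Western medicine risk factors (menopausal status, years since menopause, and BMI) and two TCM symptoms (lower limb cramps and lower limb bone pain) were taken as the main items of the screening tool. The parameter estimates of the GPLM variables were exponentiated and scaled by a factor of 10, giving the scoring formula: Score = 31.3 × menopausal status + 11.4 × years since menopause - 8.5 × BMI + 14.4 × lower limb cramps + 13.8 × lower limb bone pain.
3.5.2 Evaluation of the POP screening tool
The AUC of the screening tool was 0.789 (95% CI: 0.766 to 0.812). Compared with AUC = 0.5, the test gave Z = 21.482, P < 0.0001, which was statistically significant. For screening osteoporosis, the sensitivity was 55.67% (95% CI: 50.6% to 60.6%), the specificity 84.62% (95% CI: 82.0% to 87.0%), the positive predictive value 63.0% (95% CI: 57.7% to 68.0%), the negative predictive value 80.2% (95% CI: 77.5% to 82.8%), and the Youden index 0.403. With a cut-off of -80, 63.0% of the high-risk group (Score ≥ -80) had osteoporosis and 37.0% had normal bone mass, whereas only 19.8% of the low-risk group (Score < -80) had osteoporosis and 80.2% had normal bone mass. The tool therefore performs well both in detecting osteoporosis and in ruling it out in people with normal bone mass.
4 Conclusions
4.1 Menopause is the main risk factor for POP, and low BMI is also a risk factor. Fish intake is negatively correlated with POP and is a protective factor.
4.2 Lower limb cramps and lower limb bone pain are important TCM symptoms of POP. Kidney deficiency, liver deficiency, yin deficiency, yang deficiency, and blood stasis are the basic TCM syndrome elements of POP. The disease is located mainly in the liver and kidney, and its nature is mainly deficiency. Latent tree analysis can compensate for the inherent shortcomings of cluster analysis in TCM syndrome research.
4.3 The fit test of the nonparametric part of the GPLM showed a nonlinear effect between years since menopause and osteoporosis. A GPLM-based POP discriminant model that treats Western medicine risk factors and TCM symptoms as linear variables and years since menopause as a nonlinear variable reflects the combination of disease and syndrome, and discriminates more accurately than the linear logistic regression model.
4.4 Incorporating TCM syndrome content into the POP screening tool gives good sensitivity and specificity, improves the tool's accuracy in identifying people at high risk of osteoporosis, and meets the practical needs of TCM clinical work on osteoporosis.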
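The diagnostic rule in section 2.3 can be sketched in Python as follows. This is a minimal illustration rather than the study's own code. The function name `classify_bmd` is hypothetical, T values are assumed to be expressed in SD units relative to M, and values falling exactly on -1.0 or -2.0 are assigned to osteopenia, which is an assumption because the abstract does not say how boundary values were handled.

```python
def classify_bmd(t_lumbar, t_femoral_neck, t_total_femur):
    """Classify bone status from the minimum T value across three sites.

    Thresholds follow section 2.3: above -1 SD is normal bone mass,
    between -1 SD and -2.0 SD is osteopenia, below -2.0 SD is osteoporosis.
    """
    t_min = min(t_lumbar, t_femoral_neck, t_total_femur)
    if t_min > -1.0:
        return "normal bone mass"
    if t_min >= -2.0:  # assumption: boundary values count as osteopenia
        return "osteopenia"
    return "osteoporosis"
```

Taking the minimum across sites means a single low site is enough to place a participant in the lower category.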

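The scoring formula in section 3.5.1 can be reproduced from the linear-part coefficients in section 3.4.2. The sketch below is an illustration under stated assumptions, not the author's implementation: the names `weight`, `pop_score`, and `is_high_risk` are hypothetical, the 0/1 coding of the yes/no items is assumed, and the rule "10 × exp(β), carrying the sign of β" is inferred from the published weights, since the abstract only says the estimates were exponentiated and scaled by 10.

```python
import math

# Linear-part GPLM coefficients as reported in section 3.4.2.
COEFFICIENTS = {
    "menopause": 1.14182,              # menopausal status (assumed 0 = no, 1 = yes)
    "years_since_menopause": 0.12956,  # linear effect
    "bmi": -0.15805,
    "lower_limb_cramps": 0.36149,      # assumed 0 = absent, 1 = present
    "lower_limb_pain": 0.32267,        # assumed 0 = absent, 1 = present
}

def weight(beta):
    # Assumption: 10 * exp(beta) with the sign of beta, rounded to one decimal.
    return round(math.copysign(10 * math.exp(beta), beta), 1)

WEIGHTS = {name: weight(beta) for name, beta in COEFFICIENTS.items()}

def pop_score(menopause, years_since_menopause, bmi,
              lower_limb_cramps, lower_limb_pain):
    """Weighted sum of the five screening items."""
    values = {
        "menopause": menopause,
        "years_since_menopause": years_since_menopause,
        "bmi": bmi,
        "lower_limb_cramps": lower_limb_cramps,
        "lower_limb_pain": lower_limb_pain,
    }
    return sum(WEIGHTS[name] * value for name, value in values.items())

def is_high_risk(score, cutoff=-80):
    # Section 3.5.2 treats Score >= -80 as high risk.
    return score >= cutoff
```

Under this rule the computed weights are 31.3, 11.4, -8.5, 14.4, and 13.8, which match the published formula.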
【Abstract】
1 Objective
1.1 To identify the important risk factors and TCM syndrome essential factors of primary osteoporosis (POP).
1.2 To establish a discriminant model for POP that combines risk factors and TCM syndrome essential factors, based on the generalized partial linear model (GPLM), as the mathematical basis for a POP screening tool.
1.3 To create a POP screening tool for the community population in Beijing and Shanghai, providing scientific evidence for detecting an individual's POP risk.
2 Methods
2.1 Design of the POP risk factor screening questionnaire
A primary osteoporosis TCM syndrome questionnaire had been designed and used in a previous study. On that basis, we developed a questionnaire on osteoporosis risk factors and TCM syndromes for community women aged 40-65, drawing on scale-development methods, clinical epidemiological methods, the clinical experience of osteoporosis specialists, and a national guideline on TCM syndrome differentiation. The screening questionnaire is closed-ended and comprises 65 items in 5 domains: general background information, lifestyle, TCM symptoms and signs, physical examination, and other related personal pathogenic factors. Before the survey, the questionnaire was reviewed and approved by the clinical medical ethics committee of the Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences.
2.2 Study population
2.2.1 Inclusion criteria
Women were included if they: (1) were 40-65 years old; (2) were fully conscious, able to read, and able to communicate well with investigators; (3) were willing to take the questionnaire survey and the bone mineral density (BMD) test.
2.2.2 Exclusion criteria
Women were excluded if they had: (1) secondary osteoporosis caused by drugs or other diseases, such as diabetes, suppurative myelitis, nephritis, or hyperthyroidism; (2) malignant tumor, gout, or rheumatoid arthritis, which may affect TCM syndrome evaluation; (3) mental disorders or cognitive disabilities.
2.3 Diagnostic criteria of POP
Following the Guidelines for Diagnosis and Treatment of Common Internal Diseases in Chinese Medicine: Diseases of Modern Medicine (2008, published by the China Association of Chinese Medicine), the minimum T-score from the BMD tests of the lumbar spine (L1-L4), femoral neck, and total femur was used as the diagnostic index. A T-score > M-1SD was classed as normal bone mass, M-1SD to M-2.0SD as osteopenia, and < M-2.0SD as osteoporosis.
2.4 Data source
From March to August 2009, screening for women at high risk of POP was carried out at three community health service centers in Xuhui District (Shanghai) and five community health service centers in Dongcheng District (Beijing). Eligible women aged 40-65 completed the screening questionnaire and underwent BMD examination. In Shanghai, 1101 questionnaires were distributed and 1027 were returned; after 26 unqualified questionnaires were excluded, 1001 qualified questionnaires remained (90.92% of those distributed). In Beijing, 800 questionnaires were distributed and 763 were returned; after 24 unqualified questionnaires were excluded, 739 qualified questionnaires remained (92.38% of those distributed). All data were entered independently by two researchers into a web-based osteoporosis health management system (http://210.76.97.192:8080/gzss), and the two sets of entries were checked for consistency. In the end, electronic data from 1740 qualified questionnaires with BMD test results were obtained.
2.5 Statistical analysis
2.5.1 Statistical analysis software
SPSS 18.0 for Windows, SAS 9.2 for Windows, and SPSS Clementine 12.0 were used for questionnaire data analysis and modeling. The Lantern 1.5 package was used for latent tree model analysis.
2.5.2 Reliability and validity analysis
The questionnaire's reliability was evaluated with Cronbach's α coefficient and the Guttman split-half coefficient, and the structural validity of the TCM syndrome items was evaluated with factor analysis.
2.5.3 POP risk factor analysis
The quantitative relation between risk factors and POP was analyzed with multinomial logistic regression, in order to establish a multinomial logit model and screen for risk factors associated with POP.
2.5.4 Analysis of basic syndrome factors of POP
The basic POP syndrome factors and their relationships were analyzed with the latent tree model method.
2.5.5 POP discriminant model
The support vector machine (SVM) data mining method was used to select the POP risk factors and TCM symptoms, which were combined with the qualitative diagnosis from the BMD test to establish a POP discriminant model based on the GPLM.
2.5.6 Evaluating the POP screening tool
The receiver operating characteristic (ROC) curve was used to evaluate the discriminative accuracy of the POP screening tool, and the area under the ROC curve (AUC) was used to assess its diagnostic value.
3 Results
3.1 Reliability and validity evaluation of the screening questionnaire
3.1.1 Reliability evaluation
Cronbach's α for the four dimensions (kidney yang deficiency syndrome, liver and kidney yin deficiency syndrome, spleen and kidney yang deficiency syndrome, and blood stasis syndrome) was 0.803, 0.871, 0.811, and 0.707 respectively. The overall α for the four TCM syndromes was 0.913.
The Guttman split-half reliability for the four dimensions was 0.789, 0.831, 0.743, and 0.699 respectively, and the overall Guttman split-half reliability for the four TCM syndromes was 0.867.
3.1.2 Validity evaluation
Factor analysis was conducted to evaluate validity. Common factors were extracted by the principal component method with average orthogonal rotation and 25 iterations. The KMO value was 0.935 (> 0.5), and Bartlett's test gave χ² = 18058.066, df = 741, P < 0.01. Ten factors with eigenvalues > 1 were extracted, with a cumulative variance contribution of 53.789%.
3.2 POP risk factor screening
3.2.1 Descriptive statistics on POP-related factors
Analysis of variance showed that the mean values of age, body mass index (BMI), and duration of menopause differed significantly among the normal bone mass, osteopenia, and osteoporosis groups (P < 0.05). Crosstab analysis showed that the distributions of meat diet, fish diet, coffee drinking, daily exercise time, height loss, menopausal status, number of pregnancies, number of deliveries, and number of fractures differed significantly among the three groups (P < 0.05).
3.2.2 Multinomial logit model of POP risk factors
At α = 0.05, six variables entered the final multinomial logit model: duration of menopause, body mass index, fish diet, height loss, menopausal status, and number of deliveries.
3.3 TCM syndrome essential factors of POP
3.3.1 Descriptive statistics on TCM symptoms
In the combined Beijing and Shanghai study population, the TCM symptoms with a frequency above 15% were: forgetfulness, pain worsening in cold weather, limp and aching lumbus and knees, intolerance to heat, fatigue, intolerance to cold, low back pain, irascibility, hair loss, anorexia, blurred vision, dizziness, lower limb pain, heavy legs, insomnia, nocturia frequency, clear and abundant urine, weak legs, back pain, loose teeth, dry eyes, shortness of breath, constipation, frequent dreams with easy startling, lower limb cramps, body pain, tinnitus, night sweats, frequent urination, bitter taste, and feverish palms and soles. By analysis of variance and the two-independent-sample t-test, the incidence of 9 TCM symptoms differed significantly (P < 0.05) between the normal bone mass group and the osteoporosis group: intolerance to cold, dry eyes, loose teeth, anorexia, abdominal distension, fullness in the chest and rib-side, nocturia frequency, lower limb cramps, and lower limb pain.
3.3.2 Latent tree model of TCM syndrome essential factors of POP
The latent tree model with the highest BIC score (-15671) was learned with the heuristic single hill-climbing (HSHC) algorithm. The latent variables in the model were consistent with the basic TCM syndrome essential factors of POP, including kidney deficiency, liver deficiency, yang deficiency, yin deficiency, kidney essence insufficiency, and blood stasis.
3.4 Establishment and evaluation of a GPLM-based discriminant model for POP
3.4.1 Variable screening for the GPLM
Twelve variables important for discriminating POP were selected with an SVM using the RBF kernel function: menopausal status, duration of menopause, fish diet, dry eyes, hunchback, body mass index, number of deliveries, lower limb cramps, lower limb pain, abdominal distension, fullness in the chest and rib-side, and coffee drinking.
3.4.2 Parameter estimation of the GPLM
In the linear part of the GPLM, the coefficients of menopausal status, body mass index, lower limb cramps, lower limb pain, and duration of menopause (linear effect) were 1.14182, -0.15805, 0.36149, 0.32267, and 0.12956 respectively, all statistically significant (P < 0.05).
In the nonlinear part, the test for duration of menopause (nonlinear effect) gave χ² = 13.5948, P = 0.0012, which was statistically significant (P < 0.05).
3.4.3 Evaluating the GPLM
A linear logistic regression model was fitted with four risk factors and TCM symptoms (menopausal status, lower limb cramps, lower limb pain, and body mass index) as independent variables and the BMD qualitative diagnosis as the dependent variable; its AUC was 0.7536. The GPLM, which added the nonlinear effect of duration of menopause, had an AUC of 0.7971. The difference was statistically significant (χ² = 21.9162, P < 0.001), indicating that the GPLM with a nonlinear effect performed better than the linear logistic regression model.
3.5 Establishment and evaluation of the POP screening tool
3.5.1 Establishing the POP screening tool
Five factors (menopausal status, duration of menopause, body mass index, lower limb cramps, and lower limb pain) were taken as the main items of the POP screening tool. From the GPLM parameters, the scoring formula of the tool was derived: Score = 31.3 × menopausal status + 11.4 × duration of menopause - 8.5 × body mass index + 14.4 × lower limb cramps + 13.8 × lower limb pain.
3.5.2 Evaluating the POP screening tool
The AUC of the POP screening tool was 0.789 (95% CI: 0.766 to 0.812). Compared with AUC = 0.5, the Z value was 21.482, P < 0.0001, which was statistically significant. With a cut-off value of -80, the sensitivity of the tool was 55.67% (95% CI: 50.6% to 60.6%), the specificity was 84.62% (95% CI: 82.0% to 87.0%), the positive predictive value was 63.0% (95% CI: 57.7% to 68.0%), the negative predictive value was 80.2% (95% CI: 77.5% to 82.8%), and the Youden index was 0.403. At this cut-off, the osteoporosis group accounted for 63.0% of the high-risk population (Score ≥ -80) and the normal bone mass group for 37.0%.
In the low-risk population (Score < -80), 80.2% had normal bone mass and only 19.8% had osteoporosis.
4 Conclusions
4.1 Menopause is the main risk factor for POP, and low body mass index is also a risk factor. Fish diet is negatively correlated with POP.
4.2 Lower limb cramps and lower limb pain are important TCM symptoms of POP. Kidney deficiency, liver deficiency, yang deficiency, yin deficiency, kidney essence insufficiency, and blood stasis are the basic TCM syndrome essential factors of POP. The disease is located mainly in the liver and kidney, and its nature is mainly deficiency. The latent tree analysis method can compensate for the shortcomings of cluster analysis in TCM syndrome research.
4.3 The fit test of the nonparametric part of the GPLM revealed a nonlinear effect between duration of menopause and POP. Western medicine risk factors and TCM symptoms were treated as linear variables, and duration of menopause as a nonlinear variable. The GPLM discriminated more accurately than the linear logistic regression model.
4.4 Incorporating TCM syndrome essential factors into the POP risk screening tool gave good sensitivity and specificity and improved its discriminative accuracy in identifying populations at high risk of POP, meeting the needs of TCM clinical application.
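The evaluation measures reported in section 3.5.2 follow standard definitions, which can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, and the abstract does not report the underlying counts of true and false positives and negatives, so the general function takes them as inputs and no counts are assumed here.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1

def screening_metrics(tp, fp, fn, tn):
    """Standard screening measures from a 2x2 table of counts."""
    sensitivity = tp / (tp + fn)  # share of diseased participants flagged
    specificity = tn / (tn + fp)  # share of healthy participants cleared
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),    # positive predictive value
        "npv": tn / (tn + fn),    # negative predictive value
        "youden": youden_index(sensitivity, specificity),
    }

# Consistency check against the reported sensitivity and specificity.
print(round(youden_index(0.5567, 0.8462), 3))  # 0.403
```

The reported Youden index of 0.403 agrees with the reported sensitivity (55.67%) and specificity (84.62%).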
