
Application of Three Models in Forecasting the Incidence of the Main Communicable Diseases

【Author】 Wang Ping

【Supervisor】 Shen Yi

【Author information】 Zhejiang University, Epidemiology and Health Statistics, 2010, Master's degree

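【Abstract, translated from the Chinese】

Background
Infectious diseases are communicable diseases caused by pathogenic microorganisms and parasites infecting the human body, and they remain among the diseases that most seriously endanger human health. The emergence of new infectious diseases in recent years, such as severe acute respiratory syndrome and human infection with highly pathogenic avian influenza, has raised the challenge of preventing and controlling infectious diseases in the 21st century. Since the 1980s, the theory and application of infectious disease forecasting in China have developed rapidly and matured, gradually becoming a focus of disease surveillance. Forecasting can reveal the trend of a disease early, lay the groundwork for early warning, and provide a theoretical basis for prevention and control strategies and measures. It is therefore necessary to study the epidemic patterns of the main infectious diseases, forecast their incidence with different models, select by comparison the model best suited to them, forecast incidence trends, and evaluate the effect of prevention and control measures.

Materials and Methods
This study used epidemic report data for 17 notifiable infectious diseases in Jiashan County from 1951 to 2009 to do the following:
① Apply descriptive epidemiological methods to analyze the trend in the total incidence of the 17 notifiable infectious diseases, and the trends and seasonal distribution of the three main infectious diseases (viral hepatitis, dysentery, and measles).
② Fit and forecast the incidence of the three main infectious diseases with the exponential curve model, the grey GM(1,1) model, and the ARIMA model. Model fit was evaluated and compared using the mean error rate (MER) and the coefficient of determination (R²); point forecasts were compared for accuracy using residuals. The model with the best fit and forecast performance was then used to forecast the future incidence of the main infectious diseases.

Results
1 Overview of the incidence of 17 notifiable infectious diseases in Jiashan County, 1951 to 2009
The annual total incidence of the 17 notifiable infectious diseases was high in the second half of the 1950s and the first half of the 1960s, was relatively stable in the 1970s and 1980s, and fell markedly from the early 1990s, remaining at a low level thereafter. The overall average incidence fell from a peak of 4938.73 per 100,000 (1960s) to 90.27 per 100,000 (2000s). The five leading diseases changed across decades, from measles, dysentery, pertussis, malaria, and epidemic meningitis in the 1950s to viral hepatitis, dysentery, measles, typhoid, and malaria in the 2000s. The incidences of typhoid and malaria, however, were only 4.36 and 0.35 per 100,000. Viral hepatitis, dysentery, and measles, which still had relatively high incidence and ranked in the top three, were therefore classed as the current main infectious diseases of Jiashan County.

2 Application of the three models to forecasting viral hepatitis incidence
Hepatitis A incidence declined continuously, from a peak of 214.13 per 100,000 in 1990 to 0.15 per 100,000 in 2009. Hepatitis B incidence fluctuated from 1990 (47.74 per 100,000) to 2003 (56.97 per 100,000) and then declined steadily, reaching its lowest value in 2009 (16.56 per 100,000). Hepatitis A showed clear seasonal fluctuation, with markedly higher incidence in winter and spring than in other months. Hepatitis B showed no clear seasonal fluctuation, with incidence slightly higher in January than in other months. The GM(1,1) model could not be used to forecast hepatitis B incidence; the exponential curve model and the ARIMA model could. The exponential curve model and the ARIMA(0,1,1)×(0,1,1)₄ model fitted hepatitis B incidence with MER of 16.40% and 10.10% and R² of 0.21 and 0.71, respectively. The two models forecast a 2009 incidence of 28.13 and 20.16 per 100,000; the observed 2009 incidence was 16.56 per 100,000, giving point-forecast residuals of 11.57 and 3.60 per 100,000. The best model, ARIMA(0,1,1)×(0,1,1)₄, forecast hepatitis B incidence of 15.30 and 13.34 per 100,000 for 2010 and 2011.

3 Application of the three models to forecasting dysentery incidence
The average incidence of dysentery was higher in the 1950s (224.40 per 100,000) than in the 1960s (110.55 per 100,000), peaked in the 1970s and 1980s (above 500 per 100,000), declined gradually from the late 1980s, and reached its lowest level in the 2000s (20.56 per 100,000). Dysentery showed some seasonality in every decade, with incidence peaking in summer and autumn. The exponential curve model, the GM(1,1) model, and the ARIMA(0,1,1)×(0,1,1)₄ model could all be used to fit and forecast dysentery incidence. The three models fitted with MER of 44.21%, 28.00%, and 19.87% and R² of 0.76, 0.94, and 0.93, respectively. They forecast a 2009 incidence of 8.09, 1.45, and 5.93 per 100,000; the observed 2009 incidence was 3.92 per 100,000, giving point-forecast residuals of 4.17, 2.47, and 2.01 per 100,000. The best model, ARIMA(0,1,1)×(0,1,1)₄, forecast dysentery incidence of 1.67 and 0.98 per 100,000 for 2010 and 2011.

4 Application of the three models to forecasting measles incidence
The average incidence of measles in the pre-vaccine period (1951 to 1965), the period of small-scale vaccine use (1966 to 1969), the annual vaccination period (1970 to 1983), and the mid-month vaccination period (1984 to 2009) was 871.10, 264.76, 80.54, and 8.82 per 100,000, respectively. Measles showed some seasonality in all four periods. The exponential curve model and the GM(1,1) model could not be used to forecast measles incidence; the ARIMA(1,1,0) model could in theory. The ARIMA(1,1,0) model fitted measles incidence with an MER of 75.03% and an R² of 0.09. It forecast a 2009 incidence of 13.61 per 100,000 against an observed 4.21 per 100,000, a point-forecast residual of 9.40 per 100,000 and a relative error of 223.28%. With poor fit and poor forecast accuracy, the ARIMA(1,1,0) model cannot be used to forecast measles incidence.

Conclusions
1 The exponential curve model forecast well for dysentery, whose incidence declined almost continuously and roughly exponentially. It performed unsatisfactorily for hepatitis B, whose incidence fluctuated slightly before declining, and it could not forecast measles, whose incidence fluctuated widely.
2 The GM(1,1) model forecast well for dysentery, whose incidence declined almost continuously and roughly exponentially. It could not forecast hepatitis B, whose incidence fluctuated slightly before declining, or measles, whose incidence fluctuated widely.
3 The ARIMA model forecast well for hepatitis B, whose incidence fluctuated slightly before declining, and for dysentery, whose incidence declined almost continuously. It could not forecast measles, whose incidence fluctuated widely and for which the data were insufficient. Among the common models for forecasting infectious disease incidence from a univariate time series, the ARIMA model fitted best. It forecast that hepatitis B incidence will continue to decline slowly and that dysentery incidence will continue to decline from an already low level.
4 Forecasting models can predict trends in infectious disease incidence, but their application has limits. The exponential curve model and the GM(1,1) model are not suited to diseases with large fluctuations, and the ARIMA model is not suited to diseases with large fluctuations and insufficient data.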
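The point-forecast residuals and the measles relative error quoted in the abstract follow directly from the 2009 forecasts and observed rates. The minimal sketch below recomputes them. The dictionary keys and function names are illustrative and not taken from the thesis, and treating the residual as an absolute difference is an assumption that happens to reproduce the reported values.

```python
# 2009 figures as reported in the abstract (incidence per 100,000).
observed_2009 = {"hepatitis_b": 16.56, "dysentery": 3.92, "measles": 4.21}

# (disease, model) -> forecast for 2009. Labels are illustrative only.
forecast_2009 = {
    ("hepatitis_b", "exponential"): 28.13,
    ("hepatitis_b", "arima"): 20.16,
    ("dysentery", "exponential"): 8.09,
    ("dysentery", "gm11"): 1.45,
    ("dysentery", "arima"): 5.93,
    ("measles", "arima"): 13.61,
}


def point_residual(forecast, observed):
    """Absolute difference between forecast and observed rate (assumed definition)."""
    return round(abs(forecast - observed), 2)


def relative_error_pct(forecast, observed):
    """Absolute difference as a percentage of the observed rate."""
    return round(abs(forecast - observed) / observed * 100, 2)


residuals = {
    key: point_residual(value, observed_2009[key[0]])
    for key, value in forecast_2009.items()
}
measles_rel = relative_error_pct(
    forecast_2009[("measles", "arima")], observed_2009["measles"]
)

for (disease, model), value in residuals.items():
    print(f"{disease:12s} {model:12s} residual = {value:.2f} per 100,000")
print(f"measles ARIMA relative error = {measles_rel:.2f}%")
```

For each disease, the ARIMA residual is the smallest of the models compared, which is consistent with the abstract naming ARIMA as the best model for hepatitis B and dysentery.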

【Abstract】

Background
Communicable diseases are transmissible diseases caused by pathogenic microorganisms and parasites, and they remain among the diseases that seriously impair human health. In recent years the emergence of new infectious diseases, such as severe acute respiratory syndrome and human infection with highly pathogenic avian influenza, has posed a severe challenge for the prevention and control of infectious diseases in the 21st century. Since the 1980s, the theory and application of infectious disease prediction in China have developed rapidly, and prediction has become a focus of disease surveillance. Prediction can reveal the trend of a disease early, which lays the basis for early warning and provides a theoretical basis for prevention strategies and measures. It is therefore necessary to explore the epidemiology of the main communicable diseases, to compare different models for forecasting their incidence, and to select the appropriate model to predict trends and evaluate the results of prevention and control measures.

Materials and Methods
The study was based on the incidence of 17 notifiable infectious diseases in Jiashan County from 1951 to 2009 and included:
① analyzing, with descriptive epidemiological methods, the epidemic trend of the 17 infectious diseases and the epidemic trend and seasonal characteristics of the three main infectious diseases (viral hepatitis, dysentery, and measles) in Jiashan County;
② comparing the exponential curve, grey GM(1,1), and ARIMA models used to forecast the three main infectious diseases by mean error rate (MER), R², and the residuals of point prediction, and selecting the appropriate model to predict trends in the main infectious diseases.

Results
1 General situation of the incidence of 17 notifiable infectious diseases in Jiashan County, 1951 to 2009
The total incidence of the 17 notifiable infectious diseases was high from the mid-1950s to the first half of the 1960s, remained stable in the 1970s and 1980s, and decreased dramatically from the early 1990s. After that it stayed at a low level. The average incidence dropped from 4938.73 per 100,000 (1960s) to 90.27 per 100,000 (2000s). The spectrum of the top five infectious diseases changed over the decades. In the 1950s they were measles, dysentery, whooping cough, malaria, and epidemic meningitis; in the 2000s they were viral hepatitis, dysentery, measles, typhoid, and malaria, although the incidence of typhoid and malaria was only 4.36 and 0.35 per 100,000. Viral hepatitis, dysentery, and measles, which ranked in the top three by incidence, were therefore the main communicable diseases in Jiashan County.

2 Application of the three models in forecasting the incidence of viral hepatitis
The incidence of hepatitis A declined continuously, from 214.13 per 100,000 in 1990 to 0.15 per 100,000 in 2009. The incidence of hepatitis B fluctuated between 1990 (47.74 per 100,000) and 2003 (56.97 per 100,000) and then decreased to its lowest level in 2009 (16.56 per 100,000). Hepatitis A showed seasonal variation, with higher incidence in winter and spring than in other seasons. Hepatitis B showed no distinct seasonal variation, but its incidence was slightly higher in January than in other months. Three models were used to forecast the incidence of hepatitis B. The GM(1,1) model could not be used to predict it; the exponential curve model and the ARIMA model could. The MER of the exponential curve model and the ARIMA(0,1,1)×(0,1,1)₄ model was 16.40% and 10.10%, and the R² was 0.21 and 0.71, respectively. The incidence of hepatitis B in 2009 predicted by the two models was 28.13 and 20.16 per 100,000, respectively. Against the actual 2009 incidence (16.56 per 100,000), the point prediction residuals were 11.57 and 3.60 per 100,000, respectively. The best model, ARIMA(0,1,1)×(0,1,1)₄, predicted a hepatitis B incidence of 15.30 and 13.34 per 100,000 in 2010 and 2011, respectively.

3 Application of the three models in forecasting the incidence of dysentery
The average incidence of dysentery was higher in the 1950s (224.40 per 100,000) than in the 1960s (110.55 per 100,000) and was highest in the 1970s and 1980s (more than 500 per 100,000). From the late 1980s it declined continuously, reaching its lowest level in the 2000s (20.56 per 100,000). Dysentery showed seasonal variation in every decade, with higher incidence in summer and autumn than in other months. Three models were used to forecast the incidence of dysentery. The exponential curve, GM(1,1), and ARIMA(0,1,1)×(0,1,1)₄ models could all be used to predict it. The MER of the three models was 44.21%, 28.00%, and 19.87%, and the R² was 0.76, 0.94, and 0.93, respectively. The incidence of dysentery in 2009 predicted by the three models was 8.09, 1.45, and 5.93 per 100,000, respectively. Against the actual 2009 incidence (3.92 per 100,000), the point prediction residuals were 4.17, 2.47, and 2.01 per 100,000, respectively. The best model, ARIMA(0,1,1)×(0,1,1)₄, predicted a dysentery incidence of 1.67 and 0.98 per 100,000 in 2010 and 2011, respectively.

4 Application of the three models in forecasting the incidence of measles
In the periods when measles vaccine was not used (1951 to 1965), was used on a small scale (1966 to 1969), was given by annual vaccination (1970 to 1983), and was given by mid-month vaccination (1984 to 2009), the average incidence of measles was 871.10, 264.76, 80.54, and 8.82 per 100,000, respectively. Measles showed some seasonality in all four periods. Three models were used to forecast the incidence of measles. The exponential curve and GM(1,1) models could not be used to predict it; the ARIMA(1,1,0) model could be used in theory. The MER of the ARIMA(1,1,0) model was 75.03% and the R² was 0.09. The predicted incidence of measles in 2009 was 13.61 per 100,000. Against the actual 2009 incidence (4.21 per 100,000), the point prediction residual was 9.40 per 100,000 and the relative error was 223.28%. The ARIMA(1,1,0) model, with poor fit and poor accuracy, could not be used to predict the incidence of measles.

Conclusions
1 The exponential curve model predicted well the incidence of dysentery, which showed a continuous, roughly exponential decline. Its predictions were unsatisfactory for hepatitis B, which declined after a slight fluctuation. It could not be used to predict the incidence of measles, which fluctuated widely.
2 The GM(1,1) model predicted well the incidence of dysentery, which showed a continuous, roughly exponential decline. It could not be used to predict the incidence of hepatitis B, which declined after a slight fluctuation, or of measles, which fluctuated widely.
3 The ARIMA model predicted well the incidence of hepatitis B, which declined after a slight fluctuation, and of dysentery, which showed a continuous, roughly exponential decline. It could not be used to predict the incidence of measles, which fluctuated widely and had too few data. Among the common models for one-dimensional time series of infectious disease incidence, the ARIMA model had the best fit. It predicted a slow decline in the incidence of hepatitis B and a further decline, from an already low level, in the incidence of dysentery.
4 Although prediction models can forecast trends in infectious diseases, their application has limitations. The exponential curve model and the GM(1,1) model are not suitable for forecasting infectious diseases with large fluctuations. The ARIMA model is not suitable for predicting infectious diseases with large fluctuations and a small amount of data.
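The abstract does not state which GM(1,1) formulation the thesis used. As a hedged illustration only, the sketch below implements the textbook grey GM(1,1) model: accumulate the series, regress each observation on the mean of adjacent accumulated values to estimate the development coefficient `a` and grey input `b`, then difference the fitted accumulated curve. The function name `gm11_forecast` and the sample series are hypothetical and are not the Jiashan County data.

```python
import math


def gm11_forecast(series, steps=1):
    """Textbook GM(1,1) sketch: returns (fitted_values, forecasts).

    Assumes a positive series that is not constant, since a constant
    series gives a = 0 and the closed-form solution below divides by a.
    """
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is normally fitted to at least 4 points")

    # First-order accumulated generating operation (1-AGO).
    accumulated = []
    running_total = 0.0
    for value in series:
        running_total += value
        accumulated.append(running_total)

    # Background values: means of adjacent accumulated terms.
    z = [0.5 * (accumulated[k] + accumulated[k - 1]) for k in range(1, n)]
    y = list(series[1:])

    # Ordinary least squares for y = -a * z + b.
    z_mean = sum(z) / len(z)
    y_mean = sum(y) / len(y)
    sxx = sum((zi - z_mean) ** 2 for zi in z)
    sxy = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
    slope = sxy / sxx
    a = -slope
    b = y_mean - slope * z_mean

    def accumulated_hat(k):
        # Time response of the whitened equation; k is a 0-based index.
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Difference the accumulated curve to recover the original scale.
    values = [series[0]] + [
        accumulated_hat(k) - accumulated_hat(k - 1) for k in range(1, n + steps)
    ]
    return values[:n], values[n:]


if __name__ == "__main__":
    # Hypothetical, steadily declining series (not data from the thesis).
    demo = [100.0, 80.0, 64.0, 51.2, 40.96]
    fitted, forecasts = gm11_forecast(demo, steps=2)
    print("fitted:   ", [round(v, 2) for v in fitted])
    print("forecasts:", [round(v, 2) for v in forecasts])
```

Because the fitted curve is a single exponential, this form tracks a steadily declining series closely but cannot follow large swings, which is consistent with the abstract's finding that GM(1,1) suited dysentery but not hepatitis B or measles.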

  • 【Online publication contributor】 Zhejiang University
  • 【Online publication year and issue】 2012, Issue 03