

Data Analysis for Air Pollution Incidents

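【Author】 Wang Na (王娜)

【Advisor】 Gao Wei (高巍)

【Author information】 Northeast Normal University, Probability Theory and Mathematical Statistics, 2007, Master's thesis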

【Abstract】 Air is essential to human survival, and air contaminated by harmful substances affects human health directly or indirectly. Air pollution has arisen with the development of modern industry, the concentration of urban populations, and the rapid growth in the use of coal and petroleum fuels. It is a major global problem, and the steadily deteriorating living environment poses a serious threat to human survival. Recent studies indicate that in some places air pollution levels and their adverse effects are still rising year by year, and that in some regions the pollution is becoming more severe. The incidence and mortality of diseases related to air pollution are also rising year by year. Classical linear regression is the most widely used method of statistical analysis; it is a mathematical-statistics method for handling dependence among variables. Such dependence is abundant in practical problems, and regression analysis is an effective mathematical tool for studying it. Generalized linear models are a direct generalization of the familiar normal linear model. They apply to both continuous and discrete data, and especially to the latter, such as categorical (attribute) data and count data, which makes them important in practice, particularly in the statistical analysis of biological, medical, economic, and social data. Taking generalized linear models as its theoretical basis, this thesis studies the relationship between the number of air pollution incidents in 31 regions across China in 2005 and various indicators of industrial air pollutant emissions. Because industry produces air pollutants through many channels, variables with obvious collinearity are first eliminated by means of significance tests on the regression equation; likelihood ratio tests are then applied to the regression coefficients of the generalized linear model to select the variables that have a significant effect on the number of incidents; finally, a Poisson distribution model is used to establish the regression relationship between them.
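The last two steps described in the abstract (a likelihood ratio test on a regression coefficient, then a Poisson regression for incident counts) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's own computation: the data are made up, the predictor names `so2` and `dust` are hypothetical, and the model is assumed to use the usual log link, log(mu) = b0 + b1*x1 + b2*x2, fitted by Newton-Raphson. The collinearity screening step is not shown.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def predict(X, beta):
    """Fitted Poisson means mu_i = exp(x_i . beta) under the log link."""
    return [math.exp(sum(b * v for b, v in zip(beta, row))) for row in X]

def fit_poisson(X, y, iters=50, tol=1e-10):
    """Fit a Poisson GLM by Newton-Raphson; return (beta, log-likelihood).

    Assumes column 0 of X is the all-ones intercept column and mean(y) > 0.
    """
    p = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)  # start at the null model
    for _ in range(iters):
        mu = predict(X, beta)
        # Score vector X'(y - mu) and Fisher information X' diag(mu) X.
        score = [sum(r[j] * (yi - mi) for r, yi, mi in zip(X, y, mu)) for j in range(p)]
        info = [[sum(r[j] * r[k] * mi for r, mi in zip(X, mu)) for k in range(p)]
                for j in range(p)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    mu = predict(X, beta)
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu))
    return beta, loglik

# Made-up illustrative data (NOT from the thesis): two hypothetical emission
# indicators per region and the number of pollution incidents in that region.
so2 = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
dust = [1.2, 0.4, 2.0, 0.9, 1.7, 0.3, 1.1, 1.9]
y = [1, 2, 2, 4, 5, 7, 9, 14]

X_full = [[1.0, a, b] for a, b in zip(so2, dust)]
X_reduced = [[1.0, a] for a in so2]  # same model with `dust` dropped

beta_full, ll_full = fit_poisson(X_full, y)
beta_reduced, ll_reduced = fit_poisson(X_reduced, y)

# Likelihood ratio test of H0: coefficient of `dust` is zero (1 degree of freedom).
lr_stat = max(2.0 * (ll_full - ll_reduced), 0.0)
p_value = math.erfc(math.sqrt(lr_stat / 2.0))  # chi-square(1) upper tail probability

print("full-model coefficients:", beta_full)
print("LR statistic:", lr_stat, "p-value:", p_value)
```

A small p-value would suggest keeping the tested variable in the model; a large one would suggest dropping it. In practice a statistics package would normally replace the hand-written fitting routine, which is kept here only so the sketch has no dependencies.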

  • 【Classification number】X51
  • 【Times cited】1
  • 【Downloads】209

