Basic Theory and Application Research on Geographically Weighted Regression (地理加权回归基本理论与应用研究)

【Author】 覃文忠

【Supervisor】 刘妙龙

【Author information】 Tongji University, Geodesy and Surveying Engineering, 2007, PhD

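【Abstract (translated from the Chinese)】 Geographically weighted regression (GWR) is a spatial analysis method proposed in recent years that embeds spatial structure into the linear regression model in order to detect non-stationarity in spatial relationships. Because the method is simple to apply, yields estimates with explicit analytical expressions, and produces parameter estimates that can be tested statistically, it has attracted a growing body of research and applications. This dissertation takes the basic theory of GWR and its application as its subject, focusing on statistical inference for GWR, mixed GWR, and the influence of changes in spatial scale on GWR analysis. It uses the average sale price of housing in Shanghai as a case study to verify that GWR is effective at detecting non-stationarity in spatial relationships.

The dissertation first sets out the basic principle of the GWR model and the locally weighted least squares estimation method, then describes in detail the Gaussian and bi-square weighting functions commonly used in GWR estimation, together with the bandwidth optimization methods for both. Because sampling points in real spatial data are often unevenly distributed, adaptive weighting functions with variable bandwidth are discussed in depth, with various constraints used to improve the accuracy of GWR parameter estimates.

Based on the distribution theory of quadratic forms in normal variables, and under certain assumptions, test statistics are constructed for the significance of spatial non-stationarity of the regression model and of the individual regression parameters. An exact method for computing the p-values of these statistics is derived, along with two approximations: the three-moment χ² approximation and the F-distribution approximation. Simulation experiments confirm that the proposed statistics and both approximations are effective for testing the significance of spatial non-stationarity. On the basis of the same assumptions and inference, confidence intervals for the regression parameters and the predicted values of the GWR model are derived. Beyond classical statistical inference, the dissertation also discusses how effective the AIC criterion is for testing spatial non-stationarity of the regression model; the results agree with those of classical inference.

Because in practice some regression parameters are often constant while others vary, the dissertation studies the mixed GWR model and its estimation in depth. Building on estimation methods for linear semiparametric regression models and additive models, it derives a two-step estimation method and a back-fitting estimation method for the constant parameters of the mixed GWR model. Simulation experiments show that both methods estimate the constant parameters well, but the two-step method is slightly better than back-fitting in both accuracy and stability. On the question of deciding which terms of a mixed GWR model are constant, extensive experiments show that the optimal bandwidth chosen by generalized cross-validation (GCV) is not suitable for significance testing of the spatial non-stationarity of regression parameters; to improve test accuracy, a large bandwidth should be used for statistical inference.

Scale effects are pervasive in spatial data analysis. The dissertation first analyzes the meaning of scale, scale effects, and the scaling methods commonly applied to socio-economic data, then studies in depth how changes in spatial scale affect GWR analysis from two aspects: extent and grain. The experimental results show that GWR analysis is very sensitive to changes in extent (the bandwidth of the weighting function), but that bandwidth optimization can largely overcome this sensitivity. GWR is fairly stable under changes in grain, which suggests that GWR analysis helps in addressing the modifiable areal unit problem.

Finally, taking the average sale price of housing in Shanghai as an example, (mixed) GWR is applied for spatial analysis. The results show that (mixed) GWR detects spatial non-stationarity in the relationships among spatial data well, and that the analysis agrees well with the actual situation.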
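As a rough illustration of the locally weighted least squares estimation and the two weighting functions named in the abstract, the sketch below fits a one-predictor model y = b0 + b1*x separately at each target location. It is not the dissertation's implementation: the function names (gaussian_weight, bisquare_weight, local_fit), the restriction to a single predictor, the particular kernel parameterizations (commonly used forms, assumed here), and the synthetic grid data are all assumptions made for the example.

```python
import math

def gaussian_weight(d, bandwidth):
    """Gaussian kernel: weight decays smoothly with distance d."""
    return math.exp(-0.5 * (d / bandwidth) ** 2)

def bisquare_weight(d, bandwidth):
    """Bi-square kernel: weight is exactly zero once d >= bandwidth."""
    if d >= bandwidth:
        return 0.0
    return (1.0 - (d / bandwidth) ** 2) ** 2

def local_fit(coords, x, y, target, bandwidth, kernel=gaussian_weight):
    """Weighted least squares for y = b0 + b1*x, weighted around `target`.

    Returns the local (intercept, slope) at the target location.
    """
    w = [kernel(math.dist(c, target), bandwidth) for c in coords]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    if abs(det) < 1e-12:
        raise ValueError("too few effective neighbours; enlarge the bandwidth")
    slope = (sw * swxy - swx * swy) / det
    intercept = (swy - slope * swx) / sw
    return intercept, slope

# Hypothetical synthetic data on a 10 x 10 grid: the true slope grows
# from west (u = 0) to east (u = 9), so the relationship is non-stationary.
coords = [(u, v) for u in range(10) for v in range(10)]
x = [((3 * u + 7 * v) % 5) + 1.0 for u, v in coords]
y = [2.0 + (0.5 + 0.1 * u) * xi for (u, v), xi in zip(coords, x)]

west = local_fit(coords, x, y, target=(0, 5), bandwidth=2.0)
east = local_fit(coords, x, y, target=(9, 5), bandwidth=2.0)
print("west (intercept, slope):", west)
print("east (intercept, slope):", east)
```

Because the synthetic slope increases eastward, the local slope estimated at the western target should come out smaller than the one at the eastern target, whereas a single global regression would report only one averaged slope.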

【Abstract】 Geographically weighted regression (GWR) is a recently proposed analysis method for exploring spatially varying relationships; it extends ordinary linear regression by embedding spatial data structure into the regression model. The method has been studied and applied increasingly because a model is easy to construct, the estimates have explicit analytic expressions, and a set of statistical inference approaches is available for significance testing and confidence intervals. This dissertation discusses the basic theory and application of GWR, with emphasis on statistical inference for GWR, mixed GWR, and the effect of spatial scale on GWR analysis. A case study of the average prices of houses sold in Shanghai is used to demonstrate that GWR is valid for exploring non-stationarity in spatial relationships.

First, the basic principle of GWR and the locally weighted least squares approach are presented; then the spatial weighting functions, including the Gaussian function and the bi-square function, and their bandwidth optimization are discussed in detail. Because samples are in practice often unevenly distributed, dense in some areas and sparse in others, self-adaptive weighting functions with varying bandwidth, subject to certain constraints, are introduced to improve the accuracy of GWR parameter estimates.

Test statistics for the regression model and for the model parameters are constructed to test the significance of spatial non-stationarity, based on the distribution of quadratic forms of normal variables and under certain assumptions. An exact method and two simple approximations, namely the three-moment χ² approximation and the F-distribution approximation, are derived for computing the p-values of the test statistics. Simulation experiments confirm the validity of the proposed test statistics and computational algorithms. Furthermore, confidence intervals for the model parameters and for the predicted values of the dependent variable are derived from the same assumptions and approaches. Besides classical statistical inference techniques, the Akaike Information Criterion (AIC) is also used to test the significance of spatial non-stationarity of the regression model, and the experiments indicate that the two methods give consistent results.

In practical applications a regression model normally contains a mixture of constant and varying parameters, so the mixed GWR model and its estimation are studied in depth. Based on estimation methods for the linear semiparametric regression model and the additive model, a two-step approach and a back-fitting approach for estimating the constant parameters are derived. Simulation experiments show that both methods can estimate the constant parameters, and that the accuracy and robustness of the two-step approach are slightly better than those of the back-fitting approach. On how to select a suitable bandwidth of the weighting function so as to decide correctly which parameters are constant, extensive experiments suggest that a larger bandwidth, rather than the optimal bandwidth selected by the generalized cross-validation (GCV) criterion, is more appropriate for obtaining correct test results.

Scale effects are pervasive in spatial analysis. The concepts of scale and scale effects, and the scaling methods used for socio-economic data, are described in detail. The effect on GWR analysis of changes in spatial extent and spatial grain is then investigated. Simulation experiments show that GWR analysis is sensitive to changes in spatial extent (the bandwidth of the weighting function), but that this impact can be greatly reduced by optimizing the bandwidth. By contrast, GWR analysis is relatively robust to changes in spatial grain, which means that GWR helps to some degree in addressing the modifiable areal unit problem.

Finally, the average prices of houses sold in Shanghai are used to validate the theory proposed above. The case study demonstrates that (mixed) GWR can explore spatially varying relationships among spatial data well, and that the analysis results agree well with the actual situation.
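Bandwidth optimization, as discussed in the abstract, is often carried out by minimizing a cross-validation score over a set of candidate bandwidths. The sketch below uses plain leave-one-out cross-validation with a fixed Gaussian kernel and a one-predictor local model. This is a simpler stand-in chosen for illustration, not the generalized cross-validation (GCV) criterion the abstract names, and not the dissertation's procedure. The function names (loo_cv_score, select_bandwidth), the candidate list, and the synthetic data are hypothetical.

```python
import math

def loo_cv_score(coords, x, y, bandwidth):
    """Sum of squared leave-one-out prediction errors for one bandwidth."""
    score = 0.0
    for i, target in enumerate(coords):
        sw = swx = swy = swxx = swxy = 0.0
        for j, c in enumerate(coords):
            if j == i:
                continue  # observation i is excluded from its own local fit
            w = math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2)
            sw += w
            swx += w * x[j]
            swy += w * y[j]
            swxx += w * x[j] * x[j]
            swxy += w * x[j] * y[j]
        det = sw * swxx - swx ** 2
        if abs(det) < 1e-12:
            return math.inf  # bandwidth too small to identify a local line
        slope = (sw * swxy - swx * swy) / det
        intercept = (swy - slope * swx) / sw
        score += (y[i] - (intercept + slope * x[i])) ** 2
    return score

def select_bandwidth(coords, x, y, candidates):
    """Return the candidate bandwidth with the smallest LOO-CV score."""
    return min(candidates, key=lambda h: loo_cv_score(coords, x, y, h))

# Hypothetical synthetic data: slope increases from west to east on a grid.
coords = [(u, v) for u in range(10) for v in range(10)]
x = [((3 * u + 7 * v) % 5) + 1.0 for u, v in coords]
y = [2.0 + (0.5 + 0.1 * u) * xi for (u, v), xi in zip(coords, x)]

candidates = [1.0, 1.5, 2.0, 3.0, 5.0, 10.0, 50.0]
best = select_bandwidth(coords, x, y, candidates)
print("selected bandwidth:", best)
```

A very large bandwidth makes every local fit nearly identical to a global regression, so on data with a spatially varying slope it scores poorly and a smaller candidate is selected. The abstract's caution still applies: a bandwidth chosen this way for prediction is not necessarily the right one for significance testing.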

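  • 【Online publication contributor】 Tongji University
  • 【Online publication year and issue】 2010, Issue 06
  • 【Classification number】 P208
  • 【Times cited】 46
  • 【Downloads】 3008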