
非线性混合效应模型及其在林业上应用

Nonlinear Mixed Effects Model and Its Application in Forestry

【Author】 Fu Liyong (符利勇)

【Advisor】 Tang Shouzheng (唐守正)

【Author information】 Chinese Academy of Forestry (中国林业科学研究院), Forest Management, 2012, PhD

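【Abstract, translated from the Chinese】 Nonlinear mixed effects models (NLMEMs) are regression models in which the regression function depends nonlinearly on fixed effects parameters and random effects parameters. They are a modern statistical method for analyzing longitudinal data, multilevel data, and repeated survey data, and they capture both the mean trend of the population and the differences among individuals. In recent years NLMEMs have attracted growing attention and have been adopted in many disciplines, including medicine, engineering, agriculture, and forestry. Over more than 30 years of development, researchers have proposed a variety of parameter estimation methods for single-level and nested multi-level NLMEMs; typical software includes SAS and S-Plus. In practice, however, two problems arise. First, the mainstream packages SAS and S-Plus often fail to converge, most noticeably when the model has many parameters to estimate. Second, existing mixed models do not cover all combinations of random effect types (for example interactions, which are common in forestry), and this limits their application.

The objectives of this thesis are to (1) propose a computational method with good convergence for single-level and nested multi-level NLMEMs; (2) propose a unified expression for NLMEMs that covers all random effect types, together with a parameter estimation method; (3) write the program code for both algorithms and implement it in ForStat; and (4) apply these programs to a practical forestry problem that existing programs cannot solve. The study achieved all four objectives.

The specific contents are as follows. 1) Based on the theory of the first-order conditional expectation linearization-expectation maximization method (FOCE-EM), the thesis derives the formulas for estimating single-level and nested multi-level NLMEMs and designs the computational procedure. 2) The study proposes a standard expression for normal NLMEMs that covers all random effect types of normal NLMEMs, and gives a corresponding parameter estimation method, the linear approximation-sequential quadratic programming algorithm. 3) The study finds, and shows by example, that the MIXED procedure in SAS cannot guarantee that the variance matrix of the random effects parameters is non-negative definite. The thesis therefore adopts the linear approximation-sequential quadratic programming algorithm and gives the conditions under which ten types of variance structure are positive definite or positive semidefinite, so that the algorithm cannot produce errors of the kind seen in SAS. 4) The standard expression can handle classification of fixed effects and random effects parameters (that is, the quantification problem). The thesis points out a defect in the classification difference method used by the nlme function in S-Plus, which the linear approximation-sequential quadratic programming algorithm overcomes. 5) For the first time, a two-factor NLMEM with interaction (stand density and site index class) is used to analyze the height-diameter model for larch. On this basis, the relationship between the random effects and altitude is analyzed further.

The main conclusions are as follows. 1) The proposed standard expression for normal NLMEMs contains many types of nonlinear mixed effects models (with normally distributed random effects parameters), such as single-level NLMEMs, nested multi-level NLMEMs, multi-factor NLMEMs with main effects only, NLMEMs with main effects and interaction effects, and general NLMEMs that combine several of these types. Fixed effects and random effects parameters in the model can be classified (quantified). The standard expression is also extended to NLMEMs in which the parameter variances depend on certain factors, called group variables (NLMEMs with group variables). The model is therefore more general than the traditional NLMEM expression and has wider use. 2) For single-level or nested multi-level NLMEMs, the FOCE-EM algorithm is very close in numerical accuracy to the Lindstrom and Bates (LB) algorithm provided by SAS and S-Plus; numerical examples agree to at least four decimal places. The FOCE-EM algorithm, however, guarantees in theory that the linear step converges, so its convergence is clearly better than that of the LB algorithm. 3) The study gives the linear approximation-sequential quadratic programming algorithm for general types of NLMEMs. Sequential quadratic programming is necessary to keep the variance non-negative definite. The algorithm can also estimate NLMEMs with group variables. 4) In terms of computing speed, the FOCE-EM algorithm is faster than the linear approximation-sequential quadratic programming algorithm. FOCE-EM is therefore recommended for single-level and nested multi-level NLMEMs, and the linear approximation-sequential quadratic programming algorithm for the other types. 5) The two-factor NLMEM analysis of the larch height-diameter model shows that accounting for the interaction between stand density and site class clearly improves prediction accuracy, and that using altitude as a group variable improves it further.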

【Abstract】 Nonlinear mixed effects models (NLMEMs) are regression models in which the regression function depends nonlinearly on fixed effects parameters and random effects parameters. They are modern statistical methods for analyzing longitudinal data, multilevel data, and repeated survey data, and they are powerful tools that reflect the overall trend of the population while also describing individual differences. In recent years, NLMEMs have received much attention from scholars in various disciplines, such as medicine, engineering, agriculture, and forestry. After 30 years of development, scholars have proposed estimation methods for single-level and multi-level NLMEMs, with SAS and S-Plus as the typical statistical software. In practice, we found two main problems in existing NLMEMs and estimation methods. One is that the mainstream software, SAS and S-Plus, often fails to converge, especially when the model contains many parameters to be estimated. The other is that current NLMEMs do not cover all combinations of random effect types (such as interactions, which are commonly used in forestry), which limits the application of NLMEMs. The objectives of this thesis are therefore to (1) propose an algorithm with good convergence for estimating single-level and multi-level NLMEMs, (2) put forward a unified expression that includes all random effect types of NLMEMs, with a corresponding parameter estimation method, (3) code the two algorithms and implement them in the ForStat software, and (4) use our programs to solve a practical problem in forestry that existing programs cannot solve. The study achieved all four objectives.

The specific contents of the thesis are as follows: 1) Based on the first-order conditional expectation linearization-expectation maximization method (FOCE-EM), we derive the formulas and the computational procedure for single-level and multi-level NLMEMs. 2) We propose a standard expression for normal NLMEMs that contains all random effect types, and put forward the corresponding parameter estimation method, named the linearization approximation-sequential quadratic programming algorithm. 3) We found, and verified with a specific case, that the MIXED procedure in SAS may not guarantee that the variance of the random effects parameters is non-negative definite. We therefore use the linearization approximation-sequential quadratic programming algorithm to estimate the parameters of the NLMEM standard expression, and we present constraint conditions for ten common variance types that guarantee each variance is positive definite or positive semidefinite. 4) The standard expression can deal with classification (that is, the quantification problem) of both fixed effects and random effects parameters. We also point out a defect in the classification differential method of the nlme function in S-Plus, which the linearization approximation-sequential quadratic programming algorithm overcomes. 5) For the first time, we use a two-factor NLMEM (stand density and site index class) with interactions to analyze the height-diameter model for Larix olgensis. We further analyze the relationship between the random effects and altitude based on the two-factor NLMEM.

From the analysis and results of the study, the following conclusions can be drawn: 1) The proposed standard expression for normal NLMEMs contains various types of NLMEMs (with normally distributed random effects), such as single-level NLMEMs, nested multi-level NLMEMs, multi-factor NLMEMs with main effects, multi-factor NLMEMs with main effects and interaction effects, and general NLMEMs that combine several types of random effects. We also extend the standard expression to NLMEMs in which the variances of the random effects are related to certain factors (called group variables). Compared with the traditional models, the proposed models are more general and more widely applicable. 2) For single-level or nested multi-level NLMEMs, the FOCE-EM algorithm has computational accuracy similar to that of the Lindstrom and Bates (LB) algorithm in SAS and S-Plus. However, the FOCE-EM algorithm ensures in theory that the linear step converges, so its convergence is significantly better than that of the LB algorithm. 3) The study proposes the linearization approximation-sequential quadratic programming algorithm to estimate general types of NLMEMs. Sequential quadratic programming is necessary to guarantee that the random effects variance is non-negative definite. 4) The FOCE-EM algorithm is faster than the linearization approximation-sequential quadratic programming algorithm. It is therefore better to use the FOCE-EM algorithm for single-level and multi-level NLMEMs, and the linearization approximation-sequential quadratic programming algorithm for the other types of NLMEMs. 5) The two-factor NLMEM analysis of the height-diameter model for Larix olgensis shows that prediction precision is clearly increased by considering the interaction of stand density and site index class, and that it is further improved when altitude is treated as a group variable in the mixed model.
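
For orientation, the model classes named in the abstract can be written out in a common textbook notation. This is only a sketch under stated assumptions: the symbols below follow the usual Lindstrom and Bates style of writing a single-level NLMEM and a plain crossed-effects form for two factors with interaction. They are not taken from the thesis, whose own standard expression is not reproduced in this record.

```latex
% Assumed notation (not from the thesis): y = response, x = covariates,
% f = nonlinear regression function, \beta = fixed effects, b_i = random effects,
% A, B = design matrices, D = random effects covariance (must be non-negative definite).
\begin{align}
  % Single-level NLMEM: observation j in group i
  y_{ij}    &= f(\phi_{ij}, x_{ij}) + \varepsilon_{ij},
              \qquad i = 1,\dots,M,\quad j = 1,\dots,n_i, \\
  \phi_{ij} &= A_{ij}\,\beta + B_{ij}\,b_i,
              \qquad b_i \sim N(0, D),\quad \varepsilon_{ij} \sim N(0, \sigma^2), \\
  % Two crossed factors (e.g. stand density level i, site class j) with interaction w_{ij}
  \phi_{ijk} &= A_{ijk}\,\beta + u_i + v_j + w_{ij},
              \qquad u_i \sim N(0, D_1),\quad v_j \sim N(0, D_2),\quad w_{ij} \sim N(0, D_{12}).
\end{align}
```

In this notation, the non-negative definiteness constraint discussed in the abstract applies to the covariance matrices D, D_1, D_2, and D_{12}. A nested multi-level model would instead index the second random effect within the first.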
