删失数据下若干半参数模型的经验似然与惩罚经验似然推断

Empirical Likelihood and Penalized Empirical Likelihood Inferences for Some Semiparametric Regression Models with Censored Data

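【Author】 Hou Wen (侯文)

【Supervisor】 Song Lixin (宋立新)

【Author information】 Dalian University of Technology, Probability Theory and Mathematical Statistics, 2013, PhD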

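【Abstract (translated from the Chinese)】 Regression analysis is an important area of statistical research. This thesis applies the empirical likelihood method to several commonly used regression models whose response variable is randomly censored, and carries out statistical inference on the model parameters. Censored data are an important type of statistical data that arise frequently in scientific research and practical problems in medicine, reliability engineering, finance and insurance, environmental science, and other fields. For regression models with a randomly censored response, standard regression methods such as least squares cannot be applied directly, so how to analyse regression models under censoring requires in-depth study, and research on such models is therefore of real significance.

Empirical likelihood, proposed by Owen (1988), is a nonparametric statistical method. Compared with constructing confidence regions by the traditional asymptotic normality approach, empirical likelihood does not require estimating the asymptotic variance of the parameter estimator, which is one of its advantages. This matters especially for randomly censored regression models, where the asymptotic variance of the parameter estimators is complicated to compute. This thesis studies parameter estimation in regression models with randomly censored data, gives empirical likelihood ratio statistics, and proves that their asymptotic distribution is χ², which avoids estimating the asymptotic variance when constructing confidence regions and improves the accuracy of estimation.

On the other hand, variable selection is currently one of the most active topics in regression analysis. Effective variable selection keeps the significant variables, removes redundant ones, and improves the predictive accuracy of the model. Tibshirani (1996) proposed the LASSO penalty, a coefficient shrinkage approach that is computationally cheaper and more stable than traditional subset selection. Shrinkage methods have since attracted great attention in the statistics community, and statisticians have proposed a variety of penalty-based variable selection methods and proved that they possess the Oracle property. This thesis uses the penalized empirical likelihood method, which combines shrinkage-based variable selection with empirical likelihood, to study variable selection and parameter estimation in the Cox proportional hazards model.

The main contents of the thesis are as follows.

Chapter 2 studies parameter estimation in a nonlinear semiparametric regression model when the response variable is randomly right censored. It gives the empirical log-likelihood ratio statistic and the adjusted empirical log-likelihood ratio statistic for the unknown parameter, proves under certain conditions that these statistics are asymptotically χ² distributed, and uses this result to construct confidence regions for the unknown parameter. It also gives the least squares estimator of the unknown parameter and proves its asymptotic properties. Simulation results show that empirical likelihood outperforms least squares in the coverage probability and precision of the confidence regions.

Chapter 3 studies parameter estimation in a nonlinear semiparametric regression model when the response variable is randomly right censored and the nonparametric covariate is measured with error. It gives the empirical log-likelihood ratio statistic and the adjusted empirical log-likelihood ratio statistic for the unknown parameter, proves under certain conditions that these statistics are asymptotically χ² distributed, and uses this result to construct confidence regions for the unknown parameter. It also gives the least squares estimator of the unknown parameter and proves its asymptotic properties. Simulation results show that empirical likelihood outperforms least squares in the coverage probability and precision of the confidence regions.

Chapter 4 studies estimation of the parametric part of a semiparametric varying-coefficient partially linear errors-in-variables (EV) model with a randomly right-censored response variable. It constructs the empirical log-likelihood ratio statistic for the unknown parameter and proves that it is asymptotically χ² distributed, a result that can be used to construct confidence regions for the unknown parameter. Simulations compare, in finite samples, the confidence intervals constructed by empirical likelihood and by the normal approximation in terms of interval length and coverage probability.

Chapter 5 uses the penalized empirical likelihood method to study variable selection in the Cox proportional hazards model. With the Bridge penalty function, it discusses the Oracle property of penalized empirical likelihood under certain conditions, defines a penalized empirical likelihood ratio statistic for the regression coefficients, and proves that it asymptotically follows a chi-square distribution. Simulation studies show that the Bridge penalized empirical likelihood method performs well.

Chapter 6 studies the computation of a class of compound distributions in actuarial science, in which the claim-number variable belongs to a fairly broad family of distributions and the claim-size variable has a mixed-type distribution. It first gives the recursive equation satisfied by the compound distribution, then applies it to excess-of-loss reinsurance to obtain the corresponding recursive equation, and finally gives some concrete examples and numerical results.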
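
As a rough illustration of the empirical likelihood idea credited to Owen (1988) in the abstract, the sketch below computes the empirical log-likelihood ratio statistic for the mean of a complete (uncensored) sample. It is not the thesis's method: the censored semiparametric models studied there need additional machinery that the abstract does not spell out. The function name `el_log_ratio_mean`, the bisection solver, and the sample data are all assumptions made for this example.

```python
import math


def el_log_ratio_mean(x, mu, tol=1e-12, max_iter=200):
    """Return -2 log R(mu), the empirical log-likelihood ratio for a mean.

    Illustrative only: complete data, scalar mean, no censoring.
    The weights are p_i = 1 / (n * (1 + lam * (x_i - mu))), where lam
    solves sum((x_i - mu) / (1 + lam * (x_i - mu))) = 0.
    """
    z = [xi - mu for xi in x]
    z_min, z_max = min(z), max(z)
    if not (z_min < 0.0 < z_max):
        # mu lies outside the range of the data: the ratio is zero.
        return math.inf

    # All weights are positive only for lam in (-1/z_max, -1/z_min).
    lo, hi = -1.0 / z_max, -1.0 / z_min
    eps = 1e-10 * (hi - lo)
    lo, hi = lo + eps, hi - eps

    def score(lam):
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # score is strictly decreasing on the interval, so bisection applies.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)


# Hypothetical sample; a value mu is kept in an approximate 95% confidence
# set when the statistic is below 3.841, the 0.95 quantile of chi-square(1).
sample = [1.2, 0.7, 2.3, 1.9, 0.4, 1.5]
in_region = el_log_ratio_mean(sample, 1.0) <= 3.841
```

No variance estimate appears anywhere in this calculation, which is the advantage the abstract attributes to empirical likelihood: the confidence set is obtained by comparing the statistic directly with a χ² quantile.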

【Abstract】 Regression analysis is an important area of statistical research. This thesis applies the empirical likelihood method to several commonly used regression models in which the response variable is randomly censored, and makes statistical inference on the parameters of those models. Censored data are an important type of statistical data in research and practice in fields such as medicine, reliability engineering, finance and insurance, and environmental science. For regression models with a randomly censored response variable, standard regression methods such as least squares cannot be applied directly, so the statistical analysis of regression models under censoring needs to be studied in depth, and such study is of great significance.

Empirical likelihood, proposed by Owen (1988), is a nonparametric statistical method. Compared with the traditional asymptotic normality approach to constructing confidence regions for parameters, empirical likelihood does not require estimating the asymptotic variance of the parameter estimators, which is one of its advantages. Moreover, the expression for the asymptotic variance of the parameter estimators in randomly censored regression models is complicated, which makes empirical likelihood all the more useful.

This thesis studies parameter estimation in regression models with a randomly censored response variable, gives the empirical likelihood ratio statistic, and proves that its asymptotic distribution is χ². This avoids estimating the asymptotic variance when constructing empirical likelihood confidence regions for the parameters and improves the accuracy of the estimation.

On the other hand, variable selection is currently one of the most active topics in regression analysis.
Effective variable selection methods can select the significant variables and eliminate redundant ones, improving the prediction accuracy of the model. Tibshirani (1996) proposed the LASSO penalty method, a coefficient shrinkage method that, compared with traditional subset selection, requires little computation and is stable. Coefficient shrinkage methods have since attracted great attention from statisticians, and several penalty-based variable selection methods have been proposed and shown to have the Oracle property.

This thesis studies variable selection and parameter estimation in the Cox proportional hazards model using the penalized empirical likelihood method, which combines coefficient shrinkage with empirical likelihood.

The main contents of the thesis are as follows.

The second chapter investigates parameter estimation in a nonlinear semiparametric regression model when the response variable is randomly right censored. It constructs the empirical log-likelihood ratio statistic and the adjusted empirical log-likelihood ratio statistic for the unknown parameters, proves that, under certain conditions, the constructed statistics asymptotically follow a χ² distribution, and uses this result to construct confidence regions for the unknown parameters. In addition, the chapter constructs the least squares estimator of the unknown parameters and proves its asymptotic properties. Simulation results show that the empirical likelihood method outperforms the least squares method in the coverage probability and accuracy of the confidence regions.

The third chapter investigates parameter estimation in a nonlinear semiparametric regression model when the response variable is randomly right censored and the nonparametric covariate is measured with error.
An empirical log-likelihood ratio statistic for the unknown parametric components is proposed and proved to asymptotically follow a χ² distribution under the null hypothesis; this result can be used to construct confidence regions for the unknown parameters. In addition, the least squares estimator of the unknown parameters is constructed and its asymptotic properties are proved. Simulation results show that the empirical likelihood method outperforms the least squares method in both the coverage probability and the precision of the confidence regions.

The fourth chapter investigates estimation of the parametric part of semiparametric varying-coefficient partially linear errors-in-variables models with a randomly right-censored response variable. An empirical log-likelihood ratio statistic for the unknown parametric components is proposed and proved to asymptotically follow a χ² distribution under the null hypothesis; this result can be used to construct confidence regions for the unknown parameters. By simulation, the confidence regions constructed by the empirical likelihood method and by the normal approximation method are compared in terms of interval length and coverage probability in finite samples.

In the fifth chapter, variable selection in the Cox proportional hazards model is studied using the penalized empirical likelihood method with the Bridge penalty function. The Oracle property of penalized empirical likelihood is discussed under certain conditions: asymptotically, the method selects the non-zero coefficients with probability 1, and the estimators of the non-zero coefficients are asymptotically normal. A penalized empirical likelihood ratio for the regression coefficients is defined and proved to asymptotically follow a χ² distribution.
Simulations and a real data example show that the proposed Bridge penalized empirical likelihood method performs satisfactorily.

The sixth chapter investigates the computation of a class of compound distributions in actuarial science. The claim-number variable belongs to a broad family of distributions, and the claim amount follows a mixed-type distribution. First, the recursive equation satisfied by the compound distribution is presented. Second, it is applied to an excess-of-loss reinsurance treaty to obtain the corresponding recursive equation. Finally, some concrete examples and numerical results are given.
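
To give a feel for what a recursive equation for a compound distribution looks like, the sketch below implements the classical Panjer recursion for a compound Poisson sum with an integer-valued claim size. This is an assumed special case chosen for illustration: the abstract does not state which family of claim-number distributions or which mixed-type claim-size distribution the thesis treats, and the reinsurance step is not modelled here. The function name `compound_poisson_pmf` and its parameters are hypothetical.

```python
import math


def compound_poisson_pmf(lam, severity_pmf, s_max):
    """Panjer recursion for S = X_1 + ... + X_N with N ~ Poisson(lam).

    Illustrative special case only. severity_pmf[j] = P(X = j) for
    j = 0, 1, 2, ... on the integer lattice.
    Returns the list [P(S = 0), ..., P(S = s_max)].
    """
    f = list(severity_pmf)
    g = [0.0] * (s_max + 1)
    # P(S = 0) is the Poisson probability generating function at f[0].
    g[0] = math.exp(-lam * (1.0 - f[0]))
    for s in range(1, s_max + 1):
        total = 0.0
        # Only claim sizes j with positive support and j <= s contribute.
        for j in range(1, min(s, len(f) - 1) + 1):
            total += j * f[j] * g[s - j]
        g[s] = lam * total / s
    return g


# Hypothetical example: claims of size 1 or 2 with equal probability,
# on average one claim per period.
pmf = compound_poisson_pmf(1.0, [0.0, 0.5, 0.5], 10)
```

The recursion costs on the order of `s_max` times the support size of the claim-size distribution, rather than requiring repeated convolutions, which is why recursive equations of this kind are attractive for numerical work.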
