

Multivariate Goodness of Fit Tests and Statistical Analysis of Recurrent Event Data

【Author】 戴家佳

【Advisor】 杨振海

【Author information】 Beijing University of Technology, Probability Theory and Mathematical Statistics, 2009, PhD

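【Abstract (translated from the Chinese)】 Goodness-of-fit testing and parameter estimation are enduring topics in statistical inference, and they take on rich content in the context of specific models. This thesis studies three problems. First, Pearson's χ² test is improved to overcome its lack of robustness and the non-uniqueness of its grouping. Second, applications of vertical density representation (VDR) are considered: a statistical model of center-similar multivariate distributions is proposed and discussed, and VDR is applied to multivariate goodness-of-fit testing. Finally, several classes of regression models are proposed for recurrent event data, and the estimation of the model parameters and the statistical properties of the estimators are discussed.
Chapter 2 improves Pearson's χ² test. On the one hand, a maximized χ² test is proposed and the construction of the maximized χ² statistic is given. The test is applied to testing whether directional data come from the uniform distribution on the sphere, and its simulated power is compared with that of existing uniformity tests, including the Rayleigh, Ajne, Bingham, and Giné tests. Extensive simulation results show that, against different alternative hypotheses, the maximized χ² test has relatively high power and is robust. On the other hand, the grouping principle of Pearson's χ² test is improved: a principle of grouping by the value of the probability density function (the ordinate) is proposed, and explicit formulas for the grouping points are given. This overcomes the non-uniqueness of the traditional grouping by abscissa. Extensive simulation studies show that grouping by density value yields higher simulated power than the traditional grouping principle.
Chapter 3, building on the theory of vertical density representation, proposes a statistical model of center-similar multivariate distributions, of which the multivariate normal distribution is a special case. The model offers a feasible way to construct densities of multivariate distributions. First, general expressions for the method-of-moments estimators of the unknown parameters are given, the asymptotic properties of these estimators are proved, and some examples illustrate the application of the model. Second, maximum likelihood estimation of the unknown parameters is considered, and the general system of estimating equations is given. Finally, applications of VDR to multivariate goodness-of-fit testing are discussed, including goodness-of-fit tests for spherically symmetric distributions, VDR-based grouping for the χ² test, and goodness-of-fit tests for center-similar distributions.
Chapter 4 proposes several classes of regression models for recurrent event data. First, for single-type recurrent event data, an additive-multiplicative rates regression model is proposed. It covers a broad class of rates regression models, with the additive rates model and the multiplicative rates model as special cases. Using estimating equations, a method is given for estimating the unknown parameters and the nonparametric function in the model; the asymptotic properties of the resulting estimators are proved by modern empirical process theory, and the model is applied to the analysis of the CGD data. Second, an additive-multiplicative rates regression model is likewise proposed for multiple-type recurrent event data; a method for estimating its unknown parameters is discussed, and the consistency and asymptotic normality of the resulting estimators are proved under some regularity conditions. Finally, a varying-coefficient additive-multiplicative rates regression model is proposed for multiple-type recurrent event data, and the estimation of its unknown parameters and their large-sample properties are discussed.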
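The idea of grouping by density value in the Pearson test can be illustrated with a minimal sketch. The abstract does not give the thesis's formula for the grouping points, so the construction below is an assumption: the null distribution is taken to be N(0, 1), whose density decreases in |x|, so each band of density values corresponds to a band of |x|, and the bands are chosen to have equal probability 1/k. The function names `density_level_edges` and `pearson_chi2` are hypothetical.

```python
import random
from statistics import NormalDist


def density_level_edges(k):
    """Radii a_1 < ... < a_{k-1} such that the k classes
    {a_{i-1} <= |x| < a_i} each have probability 1/k under N(0, 1).

    Assumption for illustration: since the N(0, 1) density decreases
    in |x|, each class is a band of density values.
    """
    std = NormalDist()
    # P(|X| < a) = 2 * Phi(a) - 1 = i / k  =>  a = Phi^{-1}(0.5 + i / (2k))
    return [std.inv_cdf(0.5 + i / (2 * k)) for i in range(1, k)]


def pearson_chi2(sample, k):
    """Pearson chi-squared statistic using the equal-probability density bands."""
    edges = density_level_edges(k)
    counts = [0] * k
    for x in sample:
        # Class index = number of edges that |x| has reached or passed.
        idx = sum(1 for a in edges if abs(x) >= a)
        counts[idx] += 1
    expected = len(sample) / k  # every class has probability 1/k under the null
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, counts


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(0.0, 1.0) for _ in range(500)]
    stat, counts = pearson_chi2(sample, k=5)
    print("class counts:", counts)
    print("chi-squared statistic:", round(stat, 3))
```

Under a fully specified null, the statistic would be compared with a χ² distribution with k − 1 degrees of freedom. Unlike a partition of the x-axis, the classes here are determined by the null density and k alone, which is the uniqueness property the abstract emphasizes.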

【Abstract】 Goodness-of-fit testing and parameter estimation are perennial topics in statistical inference, and they take different forms in different models. This thesis studies three problems. First, two approaches are proposed to modify Pearson's chi-squared test. The modified tests address two weaknesses of Pearson's test: it is not stable, and the partition of the sample space is not unique. Second, some applications of vertical density representation to goodness-of-fit tests for multivariate distributions are considered. Finally, several regression models for recurrent event data are proposed, the unknown parameters in these models are estimated, and the resulting estimators are proven to be consistent and asymptotically normal.
In Chapter 2, two methods are proposed to modify Pearson's chi-squared test. On the one hand, a maximized chi-squared test is proposed and a construction of its test statistic is obtained. The maximized chi-squared test is applied to test whether directional data come from the uniform distribution on the hypersphere. The empirical power of the maximized chi-squared test is compared with that of the Rayleigh, Ajne, Giné, and Bingham tests against von Mises-Fisher and Watson alternatives in some cases. The simulation results show that the maximized chi-squared test is stable across different alternatives. On the other hand, a new principle for partitioning the classes, based on the value of the probability density function, is proposed, and a formula for calculating the division points is given. The new principle removes the weakness that the traditional partition of classes is not unique for the same sample. The simulation studies demonstrate that Pearson's chi-squared test based on the new principle is more powerful than the test based on the traditional partition along the abscissa.
In Chapter 3, based on results on vertical density representation and center-similar distributions, a statistical model of center-similar multivariate distributions is proposed. The proposed model includes the multivariate normal distribution as a special case. Firstly, the unknown parameters of the proposed model are estimated by the method of moments, the asymptotic properties of the resulting estimators are established, and some examples are presented to illustrate the application of the model. Secondly, maximum likelihood estimators of the unknown parameters are considered, and the system of estimating equations is presented. Finally, results on vertical density representation are applied to goodness-of-fit tests for multivariate distributions, including goodness-of-fit tests for spherically symmetric distributions and center-similar distributions, and the partition of classes for the χ² test through vertical density representation.
In Chapter 4, several regression models are proposed for recurrent event data. Firstly, a class of general additive-multiplicative rates models for single-type recurrent events is proposed. The proposed models include the additive rates and multiplicative rates models as special cases. For inference on the model parameters, estimating equation approaches are developed, and asymptotic properties of the proposed estimators are established through modern empirical process theory. In addition, the proposed models are applied to multiple-infection data from a clinical study on chronic granulomatous disease (CGD). Secondly, general additive-multiplicative rates models for multiple-type recurrent event data are also considered. Estimating equations are formulated for the unknown parameters and the nonparametric function of the proposed models, and the consistency and asymptotic normality of the resulting estimators are shown under some regularity conditions. Finally, a flexible additive-multiplicative rates model for multiple-type recurrent event data is presented. Procedures for making inference about the model parameters are provided, and asymptotic properties of the proposed estimators are established.
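The abstract does not write out the rates models. As a hedged illustration only, the forms below are those commonly used in the recurrent-event literature, and the thesis's exact formulation may differ. All symbols are assumptions introduced here: N*(t) is the number of events up to time t, W(t) and X(t) are covariate vectors, μ0 is an unspecified baseline mean function, g and h are known link functions, and β and γ are regression parameters.

```latex
% Multiplicative rates model
E\{dN^*(t) \mid X(t)\} = \exp\{\gamma^\top X(t)\}\, d\mu_0(t)

% Additive rates model
E\{dN^*(t) \mid W(t)\} = d\mu_0(t) + \beta^\top W(t)\, dt

% Additive-multiplicative rates model
E\{dN^*(t) \mid W(t), X(t)\} = g\{\beta^\top W(t)\}\, dt + h\{\gamma^\top X(t)\}\, d\mu_0(t)
```

In this formulation, taking g ≡ 0 and h = exp recovers the multiplicative rates model, while taking g(x) = x and h ≡ 1 recovers the additive rates model, which matches the abstract's statement that both are special cases.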


