STATISTICAL ESTIMATION WITH POSSIBLY INCORRECT MODEL ASSUMPTIONS
We combine a consistent (base) estimator of a population parameter with one or several other, possibly inconsistent, estimators. Some or all of the assumptions used for calculating the latter estimators may be incorrect. The approach suggested in the manuscript is not restricted to parametric families and can easily be used to improve the efficiency of estimators built under nonparametric or semiparametric models. The combined estimator minimizes the mean squared error (MSE) in a family of linear combinations of the considered estimators when all variances and covariances used in its structure are known. In real-life problems these variances and covariances are estimated, which generates an empirical version of the combined estimator. The combined estimator and its empirical version are both consistent. The asymptotic properties of these estimators are presented. The combined estimator is applicable when analysts can use several different procedures for estimating the same population parameter. Different assumptions are associated with the use of each non-base estimator. Our estimator is consistent in the presence of wrong assumptions for the non-base estimating procedures. In addition to the theoretical results of this manuscript, simulation studies describe properties of the estimator that combines the Kaplan-Meier estimator with the censored-data exponential estimator of a survival curve. Another set of simulation examples combines semiparametric Cox regression with exponential regression on right-censored data.
Sergey S. Tarima, Yuriy G. Dmitriev

In many applied problems researchers are challenged with statistical estimation of a population parameter $\theta$. In some cases $\theta$ is expressed as a functional $\theta = \int g(y)\,dF(y)$, where the real-valued and possibly multidimensional function $g(y)$ is known but the distribution $F(y)$ may be either completely unknown (nonparametric case) or unknown with some restrictions (for example, symmetric or belonging to a parametric family). Different degrees of uncertainty about $F(y)$ are expressed by different sets of assumptions and lead to different estimating procedures. The common part is that all of these procedures attempt to estimate the same $\theta$. The quality of estimation depends heavily on how well the assumptions are used in an estimating procedure and on whether these assumptions are correct.

There are situations when $\theta$ is not easily expressed via $\int g(y)\,dF(y)$, for example, when $\theta$ is a regression coefficient or a distribution parameter. Then a model-dependent interpretation should be applied to $\theta$. For example, $\theta$ may be defined as a hazard ratio between two groups in a proportional hazards regression model. Different assumptions on the baseline hazard lead to different estimating procedures: the Cox model deals with a nonparametric baseline hazard, while Weibull and exponential baseline hazards lead to parametric regression models. We emphasize that the interpretation of $\theta$ stays the same. Thus different estimating procedures can compete for being used for estimation of $\theta$.

Often researchers choose a single estimating procedure and proceed as if the underlying assumptions are correct. These procedures start with choosing a functional form of a model and then proceed with variable selection. A detailed review of model selection procedures can be found in [1].
The major focus of recent statistical research is on variable selection methods; see Fraiman [5], Radchenko [8] and Fan [4].

Multimodel inference avoids reliance on a single model by combining several models. Bayesian model averaging is discussed by Hoeting et al. [6]. The frequentist counterpart is presented by Hjort and Claeskens [7]. Hjort and Claeskens performed averaging over a set of parametric models with the same parametric form but different numbers of variables.

In our work we attempt to improve the properties of our base estimator by guessing at additional restrictions, thus creating grounds for using other, possibly more efficient, estimators of $\theta$. Our approach can deal with misspecification of the functional form of a model as well as with misspecification of the set of variables.

Section 2 derives the estimator. Its asymptotic properties are considered in Section 3. Section 4 illustrates the performance of the combined estimator for various scenarios of survival function estimation.

1. Estimator

Let $Y_1,\dots,Y_n$ be an independent sample from an unknown distribution. If there is no additional information of any kind, we can estimate $\theta$ via a base estimator $\hat\theta_n^{(0)}$. Hereafter, $n$ in a subscript highlights dependence on a sample of size $n$. We assume that the base estimator is an asymptotically unbiased estimator of $\theta$. Further, we assume that there exist $S$ sets of possibly incorrect assumptions. Each of these $S$ sets of assumptions can be used for building another estimator of $\theta$, $\hat\theta_n^{(s)}$, $s = 1,\dots,S$. If a set of assumptions, say the $(s')$th, is correct, we may reasonably expect that $\hat\theta_n^{(s')}$ is a more efficient estimator of $\theta$ than $\hat\theta_n^{(0)}$. However, it is not known which of the $S$ sets of assumptions are correct and which are not.
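It is also possible that all sets are false.

In order to avoid dealing directly with different sets of regularity conditions, we assume: (1) $\forall s$, $E\hat\theta_n^{(s)} = \theta_n^{(s)} \to \theta^{(s)}$ as $n \to \infty$, where $\theta^{(0)} = \theta$; (2) $E[\hat\theta_n^{(s)}] < \infty$, $\forall s$, $\forall n$; (3) under a correctly chosen model, $a_{ns}(\hat\theta_n^{(s)} - \theta_n^{(s)})$ has a finite variance for every $n$, including its limiting case at $n = +\infty$, where $a_{ns}$ is a sequence of positive numbers diverging to $+\infty$ as $n \to \infty$.

Consider a family of estimators of $\theta$

$$\hat\theta_n(\Lambda_n) = \hat\theta_n^{(0)} + \sum_{s=1}^{S} \lambda_{ns}\left(\hat\theta_n^{(s)} - \hat\theta_n^{(0)}\right),$$

where $\Lambda_n = (\lambda_{n1},\dots,\lambda_{nS})$. The mean squared error of $\hat\theta_n(\Lambda_n)$ is

$$E\left(\hat\theta_n^{(0)} - \theta_n^{(0)} + \sum_{s=1}^{S} \lambda_{ns}\left(\hat\theta_n^{(s)} - \hat\theta_n^{(0)}\right)\right)^2.$$

For every $\lambda_{ni}$

$$\frac{\partial}{\partial\lambda_{ni}}\mathrm{MSE}\left(\hat\theta_n(\Lambda_n)\right) = 2E\left[\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)\left(\hat\theta_n^{(i)} - \hat\theta_n^{(0)}\right)\right] + 2\sum_{s=1}^{S} \lambda_{ns} E\left[\left(\hat\theta_n^{(s)} - \hat\theta_n^{(0)}\right)\left(\hat\theta_n^{(i)} - \hat\theta_n^{(0)}\right)\right],$$

or, in matrix form,

$$\frac{\partial}{\partial\Lambda_n}\mathrm{MSE}\left(\hat\theta_n(\Lambda_n)\right) = 2E\left[\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)\hat\Delta_n^T\right] + 2\Lambda_n E\left[\hat\Delta_n\hat\Delta_n^T\right],$$

where

$$\hat\Delta_n = \left(\hat\Delta_n^{(1)},\dots,\hat\Delta_n^{(S)}\right)^T = \left(\hat\theta_n^{(1)} - \hat\theta_n^{(0)},\dots,\hat\theta_n^{(S)} - \hat\theta_n^{(0)}\right)^T.$$

From $\frac{\partial}{\partial\Lambda_n}\mathrm{MSE}\left(\hat\theta_n(\Lambda_n)\right) \equiv 0$, we find

$$\Lambda_n^0 = -E\left[\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)\hat\Delta_n^T\right] E^{-1}\left[\hat\Delta_n\hat\Delta_n^T\right].$$

Since $\det\left\{E\left[(\hat\Delta_n - a)(\hat\Delta_n - a)^T\right]\right\}$ is minimized at $a \equiv E\hat\Delta_n$ and $\det\left\{\mathrm{cov}\left(\hat\Delta_n, \hat\Delta_n^T\right)\right\} \ge 0$, the matrix of second derivatives

$$\frac{\partial^2}{\partial\Lambda_n\,\partial\Lambda_n^T}\mathrm{MSE}\left(\hat\theta_n(\Lambda_n)\right) = 2E\left[\hat\Delta_n\hat\Delta_n^T\right]$$

is nonnegative definite, which assures that $\Lambda_n^0$ defines the smallest MSE among $\hat\theta_n(\Lambda_n)$. The case when the determinant is equal to zero corresponds to multiple solutions for $\Lambda_n^0$, but the MSE stays at its minimum for each of them. The Moore-Penrose generalized inverse can be used for selecting one of these solutions.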
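The weight formula derived above can be checked numerically. The following is a minimal sketch, not the authors' simulation: it assumes Laplace-distributed data, takes the sample mean as the base estimator and the sample median as the single non-base estimator ($S = 1$), and replaces the exact expectations with Monte Carlo averages over replicates. All variable names (`cross`, `gram`, `lam0`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

theta = 1.0           # true location; known here only so the MSE can be evaluated
n, reps = 50, 20000   # sample size and number of Monte Carlo replicates (assumed values)

# Laplace data: the sample mean (base) and the sample median (non-base) both target theta.
samples = rng.laplace(loc=theta, scale=1.0, size=(reps, n))
base = samples.mean(axis=1)           # base estimator, unbiased, so theta_n^(0) = theta
other = np.median(samples, axis=1)    # non-base estimator

delta = (other - base)[:, None]       # Delta_hat for each replicate, shape (reps, S), S = 1
err0 = base - theta                   # base estimator minus theta_n^(0)

# Monte Carlo stand-ins for E[(theta_hat^(0) - theta_n^(0)) Delta^T] and E[Delta Delta^T]
cross = err0 @ delta / reps           # shape (S,)
gram = delta.T @ delta / reps         # shape (S, S)

# Lambda^0 = -E[...] E^{-1}[...]; the Moore-Penrose inverse also covers a singular matrix
gram_pinv = np.linalg.pinv(gram)
lam0 = -cross @ gram_pinv

combined = base + delta @ lam0        # combined estimator for each replicate

mse_base = np.mean((base - theta) ** 2)
mse_combined = np.mean((combined - theta) ** 2)
mse_formula = mse_base - cross @ gram_pinv @ cross   # base MSE minus the quadratic form

print("lambda0:", lam0)
print("MSE base:", mse_base, "MSE combined:", mse_combined, "formula:", mse_formula)
```

Because the weights are computed from the same replicates on which the MSE is evaluated, the combined MSE in this sketch cannot exceed the base MSE and matches the base MSE minus the quadratic form up to rounding. With weights estimated from a single sample, as in practice, this guarantee holds only approximately.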
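Then,

$$\hat\theta_n\left(\Lambda_n^0\right) = \hat\theta_n^{(0)} - E\left[\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)\hat\Delta_n^T\right] E^{-1}\left[\hat\Delta_n\hat\Delta_n^T\right]\hat\Delta_n \tag{1}$$

provides the smallest MSE among all $\hat\theta_n(\Lambda_n)$, which is

$$E\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)^2 - E\left[\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)\hat\Delta_n^T\right] E^{-1}\left[\hat\Delta_n\hat\Delta_n^T\right] E^T\left[\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)\hat\Delta_n^T\right]. \tag{2}$$

Due to the quadratic form on the right-hand side, the mean squared error of $\hat\theta_n(\Lambda_n^0)$ is never higher than $\mathrm{MSE}(\hat\theta_n^{(0)})$. Formulas (1) and (2) cannot be used directly because the expectations $E[\cdot]$ in their expressions are not known. Applying $\hat E[\cdot]$ instead of $E[\cdot]$ leads to

$$\hat\theta_n\left(\hat\Lambda_n^0\right) = \hat\theta_n^{(0)} - \hat E\left[\left(\hat\theta_n^{(0)} - \theta_n^{(0)}\right)\hat\Delta_n^T\right] \hat E^{-1}\left[\hat\Delta_n\hat\Delta_n^T\right]\hat\Delta_n \tag{3}$$

and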
Authors
Name | Affiliation | Degree and position | E-mail |
TARIMA Sergey Sergeevich | Medical College of Wisconsin | Candidate of Technical Sciences (Ph.D.), Assistant Professor, Medical College | starima@hpi.mcw.edu |
DMITRIEV Yuriy Glebovich | Tomsk State University | Doctor of Physical and Mathematical Sciences, Head of the Department of Theoretical Cybernetics, Faculty of Applied Mathematics and Cybernetics | dmit@mail.tsu.ru |
References
Fan J., Li R. Variable selection via penalized likelihood // J. Amer. Statist. Ass. 2001. V. 96. P. 1348 - 1360.
Fraiman R., Justel A., Svarc M. Selection of variables for cluster analysis and classification rules // J. Amer. Statist. Ass. 2008. V. 103. P. 1294 - 1303.
Davison A.C., Hinkley D.V. Bootstrap methods and their application. Cambridge: Cambridge University Press, 1997.
Efron B. Censored data and the bootstrap // J. Amer. Statist. Ass. 1981. V. 76. P. 312 - 319.
Burnham K.P., Anderson D.R. Model selection and multimodel inference. Springer, 2002.
Hoeting J.A., Madigan D., Raftery A.E., Volinsky C.T. Bayesian model averaging: a tutorial // Statistical Science. 1999. V. 14. P. 382 - 417.
Hjort N.L., Claeskens G. Frequentist Model Average Estimators // J. Amer. Statist. Ass. 2003. V. 98. P. 879 - 899.
Radchenko P., James G.M. Variable inclusion and shrinkage algorithms // J. Amer. Statist. Ass. 2008. V. 103. P. 1304 - 1315.
Kaplan E.L., Meier P. Nonparametric estimation from incomplete observations // J. Amer. Statist. Ass. 1958. V. 53. P. 457 - 481.
Klein J.P., Logan B., Harhoff M., Andersen P.K. Analyzing survival curves at a fixed point in time // Statistics in Medicine. 2007. V. 26. P. 4505 - 4519.
Klein J.P., Moeschberger M.L. Survival Analysis. Springer, 2003.
Sergey S. Tarima Division of Biostatistics Department of Population Health Medical College of Wisconsin Milwaukee, WI, 53226, USA E-mail: starima@hpi.mcw.edu