Model selection criteria for survival data based on Kullback’s divergence: a systematic and critical review. (English. French summary) Zbl 1538.62333

Summary: We conducted a literature review to summarize trends in the model selection criteria derived from Kullback’s divergence in survival analysis, and we discuss these criteria in depth to improve users’ understanding of them. A keyword search of PubMed, Web of Science, and Scopus identified 4628 original papers on model selection criteria in survival analysis. After excluding those that did not use criteria based on Kullback’s divergence for model selection, 304 studies were fully analyzed. The most commonly reported model selection criteria were the AIC and the AIC\(_c\). Surprisingly, none of the selected papers discussed the KIC family of model selection criteria.
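
For orientation, the three criteria named in the summary can be computed from a fitted model's maximized log-likelihood. The sketch below is not taken from the reviewed paper: it uses the standard textbook forms AIC = -2 log L + 2k, AIC\(_c\) = AIC + 2k(k+1)/(n-k-1), and KIC = -2 log L + 3k. The function names, the candidate model names, and the numeric values are hypothetical, and the choice of n (total sample size versus number of observed events) is an assumption that varies in survival applications.

```python
# Minimal sketch, assuming the standard textbook forms of the criteria.
# loglik: maximized log-likelihood; k: number of estimated parameters;
# n: effective sample size (assumption: what counts as n for censored
# data, e.g. subjects versus events, is left to the analyst).

def aic(loglik: float, k: int) -> float:
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * k


def aicc(loglik: float, k: int, n: int) -> float:
    """Small-sample corrected AIC: AIC + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def kic(loglik: float, k: int) -> float:
    """Criterion based on Kullback's symmetric divergence: -2 log L + 3k."""
    return -2.0 * loglik + 3.0 * k


if __name__ == "__main__":
    # Hypothetical candidates: (log-likelihood, number of parameters).
    candidates = {"exponential": (-104.2, 1), "weibull": (-101.5, 2)}
    n = 50  # hypothetical sample size
    for name, (ll, k) in candidates.items():
        print(name, aic(ll, k), aicc(ll, k, n), kic(ll, k))
```

For each criterion, the candidate with the smallest value is preferred. KIC penalizes each parameter more heavily than AIC (3k against 2k), so it tends to favor smaller models.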

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
62B10 Statistical aspects of information-theoretic topics
62N01 Censored data models

Software:

WebDISCO
