Predicting times to event based on vine copula models. (English) Zbl 1543.62360

Summary: In statistics, time-to-event analysis methods traditionally focus on the estimation of hazards. In recent years, machine learning methods have been proposed to directly predict the event times. A method based on vine copula models is proposed to make point and interval predictions for a right-censored response variable given mixed discrete-continuous explanatory variables. Extensive experiments on simulated and real datasets show that the proposed vine copula approach provides a decent approximation to other time-to-event analysis models, including proportional hazards and Weibull accelerated failure time models. When the proportional hazards or Weibull accelerated failure time assumptions do not hold, predictions based on vine copulas can significantly outperform other models, depending on the shape of the conditional quantile functions. This shows the flexibility of the proposed vine copula approach for general time-to-event datasets.
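Illustration (not from the paper): the idea of reading point and interval predictions off conditional quantiles can be sketched in a deliberately simplified setting. The sketch below assumes a single continuous covariate with a standard normal margin, one bivariate Gaussian copula in place of a vine, a Weibull margin for the event time, and no censoring. All function names and parameter values are hypothetical and chosen only for this example; it is not the authors' estimation procedure.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the Gaussian copula


def weibull_quantile(u, shape, scale):
    """Inverse CDF of a Weibull(shape, scale) margin (assumed event-time margin)."""
    return scale * (-math.log1p(-u)) ** (1.0 / shape)


def conditional_quantile(alpha, v, rho):
    """alpha-quantile of U given V = v under a bivariate Gaussian copula.

    Inverts C(u | v) = Phi((Phi^-1(u) - rho * Phi^-1(v)) / sqrt(1 - rho^2)).
    """
    z = rho * _STD_NORMAL.inv_cdf(v) + math.sqrt(1.0 - rho**2) * _STD_NORMAL.inv_cdf(alpha)
    return _STD_NORMAL.cdf(z)


def predict_event_time(x, rho, shape, scale, level=0.8):
    """Return (median prediction, (lower, upper)) for the event time given x.

    Assumes the covariate x has a standard normal margin (an illustrative choice).
    """
    v = _STD_NORMAL.cdf(x)          # covariate mapped to the copula scale
    lower_alpha = (1.0 - level) / 2.0
    upper_alpha = 1.0 - lower_alpha

    def q(alpha):
        # Conditional quantile on the copula scale, mapped back to the time scale.
        return weibull_quantile(conditional_quantile(alpha, v, rho), shape, scale)

    return q(0.5), (q(lower_alpha), q(upper_alpha))


if __name__ == "__main__":
    point, (lo, hi) = predict_event_time(x=0.0, rho=0.6, shape=1.5, scale=10.0)
    print(f"median: {point:.3f}, 80% interval: ({lo:.3f}, {hi:.3f})")
```

The conditional median serves as the point prediction and a pair of conditional quantiles as the prediction interval. Under the summary's description, a vine copula would replace the single Gaussian pair with a sequence of pair copulas over several explanatory variables, and right censoring would have to be handled at the fitting stage.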

MSC:

62H05 Characterization and structure theory for multivariate probability distributions; copulas
62N05 Reliability and life testing
62P10 Applications of statistics to biology and medical sciences; meta analysis
