A Bayesian-Frequentists approach for detecting outliers in a one-way variance components model. (English) Zbl 07532221

Summary: The most common Bayesian approach for detecting outliers is to assume that outliers are observations generated by contaminating models. An alternative idea was applied by A. Zellner [J. Am. Stat. Assoc. 70, 138–144 (1975; Zbl 0313.62050)] and K. Chaloner [in: Aspects of uncertainty: a tribute to D. V. Lindley. Chichester: Wiley. 149–157 (1994; Zbl 0875.62120)]. They studied the properties of realized regression error terms. Posterior distributions for individual realized errors, and for linear and quadratic combinations of them, were derived. In this note, the theory and results derived by K. Chaloner [loc. cit.] are extended. Since it is not clear to us what the frequentist properties of the Bayesian procedures of Chaloner and Zellner are (i.e., what the size of the Type I error or the power of their tests is), a Bayesian-Frequentist approach is used for detecting outliers in a one-way random effects model. For illustration, the contaminated data of L. D. Sharples [Biometrika 77, No. 3, 445–453 (1990; doi:10.1093/biomet/77.3.445)] are used as the first example. It is concluded that the Bayesian-Frequentist approach seems to be more conservative than Chaloner’s method. In the second example, the Bayesian-Frequentist method is applied to the artificial dataset of A. O’Hagan [HSS model criticism (with discussion), Highly structured stochastic systems, 423–453 (2003)] and compared with the partial posterior predictive measures derived by M. J. Bayarri and M. E. Castellanos [Stat. Sci. 22, No. 3, 363–367 (2007; Zbl 1246.62030)].

MSC:

62-XX Statistics

References:

[1] Bayarri, M. J.; Berger, J., The interplay between Bayesian and Frequentist analysis, Statistical Science, 19, 1, 58-80 (2004) · Zbl 1062.62001 · doi:10.1214/088342304000000116
[2] Bayarri, M. J.; Morales, J., Bayesian measures of surprise for outlier detection, Journal of Statistical Planning and Inference, 111, 1-2, 3-22 (2003) · Zbl 1027.62014 · doi:10.1016/S0378-3758(02)00282-3
[3] Bayarri, M. J.; Berger, J. O., Robust Bayesian bounds for outlier detection, in: Vilaplana, J. P.; Puri, M. L. (eds.), Recent advances in statistics and probability, 175-90 (1994), Zeist: VSP · Zbl 0839.62023
[4] Bayarri, M. J.; Berger, J. O., p-Values for composite null models (with discussion), Journal of the American Statistical Association, 95, 452, 70 (2000) · Zbl 1004.62022 · doi:10.2307/2669749
[5] Bayarri, M. J.; Castellanos, M. E., Bayesian checking of the second levels of hierarchical models, Statistical Science, 22, 3, 222-43 (2007) · Zbl 1246.62030 · doi:10.1214/07-STS235REJ
[6] Berger, J. O.; Bernardo, J. M.; Goel, P., Reference priors in a variance components problem, 323-40 (1992), New York, NY: Springer
[7] Berger, J. O.; Bernardo, J. M., On the development of reference priors, in: Bernardo, J. M.; Berger, J. O.; Dawid, A. P.; Smith, A. F. M. (eds.), Bayesian statistics, 4, 35-60 (1992), Oxford: Oxford University Press
[8] Box, G. E. P.; Tiao, G. C., A Bayesian approach to some outlier problems, Biometrika, 55, 1, 119-29 (1968) · Zbl 0159.47901 · doi:10.1093/biomet/55.1.119
[9] Box, G. E. P.; Tiao, G. C., Bayesian inference in statistical analysis (1973), Reading, MA: Addison-Wesley · Zbl 0271.62044
[10] Bradlow, E. T.; Zaslavsky, A. M., Case influence analysis in Bayesian inference, Journal of Computational and Graphical Statistics, 6, 3, 314-31 (1997) · doi:10.2307/1390736
[11] Chaloner, K., Residual analysis and outliers in Bayesian hierarchical models, in: Smith, A.; Freeman, P. (eds.), Aspects of uncertainty: a tribute to D. V. Lindley, 149-57 (1994), Chichester, UK: Wiley · Zbl 0875.62120
[12] Chaloner, K.; Brant, R., A Bayesian approach to outlier detection and residual analysis, Biometrika, 75, 4, 651-9 (1988) · Zbl 0659.62037 · doi:10.1093/biomet/75.4.651
[13] Datta, G. S.; Ghosh, J. K., On priors providing frequentist validity of Bayesian inference, Biometrika, 82, 1, 37-45 (1995) · Zbl 0823.62004 · doi:10.2307/2337625
[14] Dawid, A. P., Posterior expectations for large observations, Biometrika, 60, 3, 664-7 (1973) · Zbl 0268.62014 · doi:10.1093/biomet/60.3.664
[15] Dey, D. K.; Gelfand, A. E.; Swartz, T. B.; Vlachos, A. K., A simulation-intensive approach for checking hierarchical models, Test, 7, 2, 325-46 (1998) · Zbl 0935.62082
[16] Freeman, P. R., On the number of outliers in data from a linear model, Trabajos de Estadística y de Investigación Operativa, 31, 349-65 (1980) · doi:10.1007/BF02888359
[17] Geisser, S., Discussion on Sampling and Bayes’ inference in scientific modeling and robustness (by GEP Box), Journal of the Royal Statistical Society A, 143, 416-7 (1980)
[18] Geisser, S., Predictive approaches to discordancy testing (1987), University of Minnesota
[19] Geisser, S., Predictive discordancy tests for exponential observations, The Canadian Journal of Statistics/La Revue Canadienne de Statistique, 17, 1, 19-26 (1989) · Zbl 0672.62043 · doi:10.2307/3314759
[20] Girón, F. J.; Martínez, M. L.; Morcillo, C., A Bayesian justification for the analysis of residuals and influence measures, Bayesian Statistics, 4, 651-60 (1992)
[21] Guttman, I.; Peña, D., Outliers and influence: Evaluation by posteriors of parameters in the linear model, Bayesian Statistics, 3, 631-40 (1988) · Zbl 0704.62024
[22] Harvey, J., Bayesian inference for the lognormal distribution (2012), Bloemfontein, South Africa
[23] Hodges, J. S., Statistical practice as argumentation: A sketch of a theory of applied statistics, in: Modelling and prediction honoring Seymour Geisser, 19-45 (1996), New York, NY: Springer · Zbl 0895.62106
[24] Hodges, J. S., Some algebra and geometry for hierarchical models, applied to diagnostics, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 60, 3, 497-536 (1998) · Zbl 0909.62072 · doi:10.1111/1467-9868.00137
[25] Hoeting, J.; Raftery, A. E.; Madigan, D., A method for simultaneous variable selection and outlier identification in linear regression, Computational Statistics & Data Analysis, 22, 3, 251-70 (1996) · Zbl 0900.62352 · doi:10.1016/0167-9473(95)00053-4
[26] Lange, K. L.; Little, R. J.; Taylor, J. M., Robust statistical modeling using the t distribution, Journal of the American Statistical Association, 84, 408, 881-96 (1989) · doi:10.2307/2290063
[27] Lange, N.; Carlin, B. P.; Gelfand, A. E., Hierarchical Bayes models for the progression of HIV infection using longitudinal CD4 T-cell numbers, Journal of the American Statistical Association, 87, 419, 615-26 (1992) · Zbl 0850.62838 · doi:10.1080/01621459.1992.10475258
[28] Lange, N.; Carlin, B. P.; Gelfand, A. E., Rejoinder, Journal of the American Statistical Association, 87, 419, 631-2 (1992) · doi:10.1080/01621459.1992.10475261
[29] Marshall, E. C.; Spiegelhalter, D. J., Approximate cross-validatory predictive checks in disease mapping models, Statistics in Medicine, 22, 10, 1649-60 (2003) · doi:10.1002/sim.1403
[30] O’Hagan, A., On outlier rejection phenomena in Bayes inference, Journal of the Royal Statistical Society: Series B (Methodological), 41, 3, 358-67 (1979) · Zbl 0422.62027
[31] O’Hagan, A., Modelling with heavy tails, in: Bernardo, J. M.; DeGroot, M. H.; Lindley, D. V.; Smith, A. F. M. (eds.), Bayesian statistics 3 (1988) · Zbl 0713.62037
[32] O’Hagan, A., Outliers and credence for location parameter inference, Journal of the American Statistical Association, 85, 409, 172-6 (1990) · Zbl 0706.62030
[33] O’Hagan, A., HSS model criticism (with discussion), Highly structured stochastic systems, 423-53 (2003)
[34] Page, G. L.; Dunson, D. B., Bayesian local contamination models for multivariate outliers, Technometrics, 53, 2, 152-62 (2011) · doi:10.1198/TECH.2011.10041
[35] Peña, D.; Prieto, F. J., Multivariate outlier detection and robust covariance matrix estimation, Technometrics, 43, 3, 286-310 (2001) · doi:10.1198/004017001316975899
[36] Peña, D.; Tiao, G. C., Bayesian robustness functions for linear models, Bayesian Statistics, 4, 365-88 (1992)
[37] Peña, D.; Yohai, V., A fast procedure for outlier diagnostics in large regression problems, Journal of the American Statistical Association, 94, 446, 434-45 (1999) · Zbl 1072.62618 · doi:10.2307/2670164
[38] Penny, K. I.; Jolliffe, I. T., A comparison of multivariate outlier detection methods for clinical laboratory safety data, Journal of the Royal Statistical Society: Series D (the Statistician), 50, 3, 295-307 (2001) · doi:10.1111/1467-9884.00279
[39] Pettit, L. I.; Smith, A. F. M., Outliers and influential observations in linear models, Bayesian Statistics, 2, 1, 473-94 (1985) · Zbl 0671.62031
[40] Reid, N.; Cox, D. R., On some principles of statistical inference, International Statistical Review, 83, 2, 293-308 (2015) · Zbl 07763439 · doi:10.1111/insr.12067
[41] Seltzer, M. H., Sensitivity analysis for fixed effects in the hierarchical model: A Gibbs sampling approach, Journal of Educational Statistics, 18, 3, 207-35 (1993) · doi:10.3102/10769986018003207
[42] Sharples, L. D., Identification and accommodation of outliers in general hierarchical models, Biometrika, 77, 3, 445-53 (1990) · doi:10.1093/biomet/77.3.445
[43] Van der Merwe, A. J.; Pretorius, A. L.; Meyer, J. H., Bayesian tolerance intervals for the unbalanced one-way random effects model, Journal of Quality Technology, 38, 3, 280-93 (2006) · doi:10.1080/00224065.2006.11918615
[44] Van der Merwe, A. J.; Bekker, K. N., Bayesian analysis of insurance losses using the Buhlmann-Straub credibility model, Journal of Actuarial Practice, 13, 33-60 (2006) · Zbl 1193.91073
[45] Verdinelli, I.; Wasserman, L., Bayesian analysis of outlier problems using the Gibbs sampler, Statistics and Computing, 1, 2, 105-17 (1991) · doi:10.1007/BF01889985
[46] Wakefield, J. C.; Smith, A. F. M.; Racine‐Poon, A.; Gelfand, A. E., Bayesian analysis of linear and non‐linear population models by using the Gibbs sampler, Applied Statistics, 43, 1, 201-21 (1994) · Zbl 0825.62410 · doi:10.2307/2986121
[47] Weiss, R., An approach to Bayesian sensitivity analysis, Journal of the Royal Statistical Society: Series B (Methodological), 58, 4, 739-50 (1996) · Zbl 0860.62031 · doi:10.1111/j.2517-6161.1996.tb02112.x
[48] Weiss, R. E., Residuals and outliers in Bayesian random effects models (1994), University of California at Los Angeles, Department of Biostatistics
[49] West, M., Outlier models and prior distributions in Bayesian linear regression, Journal of the Royal Statistical Society: Series B (Methodological), 46, 3, 431-9 (1984) · Zbl 0567.62022 · doi:10.1111/j.2517-6161.1984.tb01317.x
[50] West, M., Generalized linear models: Scale parameters, outlier accommodation and prior distributions, Bayesian Statistics, 2, 531-58 (1985) · Zbl 0671.62032
[51] Zellner, A., Bayesian analysis of regression error terms, Journal of the American Statistical Association, 70, 349, 138-44 (1975) · Zbl 0313.62050 · doi:10.1080/01621459.1975.10480274
[52] Zellner, A.; Moulton, B. R., Bayesian regression diagnostics with applications to international consumption and income data, Journal of Econometrics, 29, 1-2, 187-211 (1985) · Zbl 0585.62192 · doi:10.1016/0304-4076(85)90039-9