On the computation of entropy prior complexity and marginal prior distribution for the Bernoulli model. (English) Zbl 1425.62104

Summary: As the size and complexity of models grow, choosing the best model becomes a difficult task. Once the best model is specified, its goodness of fit needs to be examined first. A highly complex model may provide a good fit, but ignoring model complexity can lead to incorrect parameter estimates and predictions. To improve the model selection process, model complexity therefore needs to be defined clearly. This article studies different aspects of model complexity and discusses the extent to which they can be measured. The attribute most often omitted from complexity measures is the parameter prior, which is an inherent part of the model and can affect its complexity significantly. The concept of the parameter prior and its connection to model complexity are therefore discussed here, and some relationships to elements of the entropy measure are also addressed.
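The two quantities named in the title can be illustrated concretely for the Bernoulli model: the Shannon entropy of a Bernoulli(p) distribution, and the marginal (prior-predictive) likelihood obtained by integrating the Bernoulli likelihood against a parameter prior. The sketch below is not taken from the paper; the function names and the choice of a conjugate Beta(a, b) prior are illustrative assumptions.

```python
from math import exp, lgamma, log

def bernoulli_entropy(p):
    """Shannon entropy (in nats) of a Bernoulli(p) distribution."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log(p) + (1.0 - p) * log(1.0 - p))

def log_beta(x, y):
    """log of the Beta function B(x, y) = Gamma(x)Gamma(y)/Gamma(x+y)."""
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def beta_bernoulli_marginal(k, n, a=1.0, b=1.0):
    """Marginal likelihood of one specific sequence of n Bernoulli
    trials with k successes, integrating p over a Beta(a, b) prior:
    m(x) = B(a + k, b + n - k) / B(a, b)."""
    return exp(log_beta(a + k, b + n - k) - log_beta(a, b))
```

Under the uniform prior (a = b = 1) the marginal of a single trial is 1/2 regardless of its outcome, and a sequence of n trials with k successes has marginal k!(n-k)!/(n+1)!, reflecting how the prior spreads probability over the model's distributions.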

MSC:

62K15 Factorial statistical designs
62J12 Generalized linear models (logistic models)
62B10 Statistical aspects of information-theoretic topics
Full Text: DOI

References:

[1] Balakrishnan, N.; Koukouvinos, C.; Parpoula, C., Analysis of a supersaturated design using entropy prior complexity for binary responses via generalized linear models, Stat. Methodol., 9, 478-485 (2012) · Zbl 1365.62293 · doi:10.1016/j.stamet.2011.10.005
[2] Balasubramanian, V., Statistical inference, Occam’s Razor, and statistical mechanics on the space of probability distributions, Neural Comput., 9, 349-368 (1997) · Zbl 0870.62006 · doi:10.1162/neco.1997.9.2.349
[3] Bennett, C. H., On the nature and origin of complexity in discrete, homogeneous locally-interacting systems, Found. Phys., 16, 585-592 (1986) · doi:10.1007/BF01886523
[4] Berger, A. L.; Della Pietra, S. A.; Della Pietra, V. J., A maximum-entropy approach to natural language processing, Comput. Linguistics, 22, 39-71 (1996)
[5] Bialek, W.; Nemenman, I.; Tishby, N., Predictability, complexity, and learning, Neural Comput., 13, 2409-2463 (2001) · Zbl 0993.68045 · doi:10.1162/089976601753195969
[6] Brooks, R. J.; Tobias, A. M., Choosing the best model: Level of detail, complexity and model performance, Math. Comput. Model., 24, 1-14 (1996) · Zbl 0885.68150 · doi:10.1016/0895-7177(96)00103-3
[7] Brookshear, J. G. 1989. Theory of computation: Formal languages, automata, and complexity. Redwood City, CA: Benjamin-Cummings Publishing Company. · Zbl 0678.68001
[8] Bueso, M. C.; Qian, G.; Angulo, J. M., Stochastic complexity and model selection from incomplete data, J. Stat. Plan. Inference, 76, 273-284 (1999) · Zbl 0924.62008 · doi:10.1016/S0378-3758(98)00112-8
[9] Catalan, R. G., J. Garay, and R. López-Ruiz. 2002. Features of the extension of a statistical measure of complexity for continuous systems. Phys. Rev. E, 66, 011102(6).
[10] Caticha, A.; Knuth, K. (ed.); et al., Information and entropy, vol. 954, 11 (2007), New York, NY
[11] Bos, C. S.; Härdle, W. (ed.); Rönz, B. (ed.), A comparison of marginal likelihood computation methods, 111-117 (2002), Berlin, Heidelberg
[12] Crutchfield, J. P.; Young, K., Inferring statistical complexity, Phys. Rev. Lett., 63, 105-108 (1989) · doi:10.1103/PhysRevLett.63.105
[13] Della Pietra, S.; Della Pietra, V. J.; Lafferty, J. D., Inducing features of random fields, IEEE Trans. Pattern Anal. Machine Intelligence, 19, 380-393 (1997) · doi:10.1109/34.588021
[14] Dunn, J., Model complexity: The fit to random data reconsidered, Psychol. Res., 63, 174-182 (2000) · doi:10.1007/PL00008176
[15] Feldman, D. P.; Crutchfield, J. P., Measures of statistical complexity, Phys. Lett. A, 238, 244-252 (1998) · Zbl 1026.82505 · doi:10.1016/S0375-9601(97)00855-4
[16] Grünwald, P. D.; Grünwald, P. D. (ed.); Myung, I. J. (ed.); Pitt, M. A. (ed.), MDL tutorial, 16-17 (2005), Cambridge, MA · doi:10.7551/mitpress/1114.001.0001
[17] Grünwald, P. D. 2007. The minimum description length principle. Cambridge, MA: MIT Press. · doi:10.7551/mitpress/4643.001.0001
[18] Hall, P.; Hannan, J., On stochastic complexity and nonparametric density estimation, Biometrika, 75, 705-714 (1988) · Zbl 0661.62025 · doi:10.1093/biomet/75.4.705
[19] Hansen, M. H.; Yu, B., Model selection and the principle of minimum description length, J. Am. Stat. Assoc., 96, 746-774 (2001) · Zbl 1017.62004 · doi:10.1198/016214501753168398
[20] Hopcroft, J. E., R. Motwani, and J. D. Ullman. 2000. Introduction to automata theory, languages, and computation, 3rd ed. Reading, MA: Addison-Wesley. · Zbl 0980.68066
[21] Jaynes, E. T. 2003. Probability theory—The logic of science. Cambridge, UK: Cambridge University Press. · Zbl 1045.62001 · doi:10.1017/CBO9780511790423
[22] Kass, R. E.; Raftery, A. E., Bayes factors, J. Am. Stat. Assoc., 90, 773-795 (1995) · Zbl 0846.62028 · doi:10.1080/01621459.1995.10476572
[23] Lee, M. D., Generating additive clustering models with minimal stochastic complexity, J. Classification, 19, 69-85 (2002) · Zbl 1040.91085 · doi:10.1007/s00357-001-0033-y
[24] Li, M., and P. M. B. Vitanyi. 1993. An introduction to Kolmogorov complexity and its applications. New York, NY: Springer-Verlag. · Zbl 0805.68063 · doi:10.1007/978-1-4757-3860-5
[25] López-Ruiz, R.; Mancini, H. L.; Calbet, X., A statistical measure of complexity, Phys. Lett. A, 209, 321-326 (1995) · doi:10.1016/0375-9601(95)00867-5
[26] Myung, I. J.; Pitt, M. A., Applying Occam’s razor in modeling cognition: A Bayesian approach, Psychonomic Bull. Rev., 4, 79-95 (1997) · doi:10.3758/BF03210778
[27] Myung, I. J., The importance of complexity in model selection, J. Math. Psychol., 44, 190-204 (2000) · Zbl 0946.62094 · doi:10.1006/jmps.1999.1283
[28] Myung, I. J.; Balasubramanian, V.; Pitt, M. A., Counting probability distributions: Differential geometry and model selection, Proc. Nat. Acad. Sci. USA, 97, 11170-11175 (2000) · Zbl 0997.62099 · doi:10.1073/pnas.170283897
[29] Rissanen, J., Stochastic complexity and modeling, Ann. Statistics, 14, 1080-1100 (1986) · Zbl 0602.62008 · doi:10.1214/aos/1176350051
[30] Rissanen, J., Stochastic complexity (with discussion), J. R. Stat. Soc. Ser. B, 49, 223-265 (1987) · Zbl 0654.62008
[31] Rissanen, J. 1989. Stochastic complexity in statistical inquiry. Singapore: World Scientific Publishing Company. · Zbl 0800.68508
[32] Rissanen, J., Fisher information and stochastic complexity, IEEE Trans. Information Theory, 42, 40-47 (1996) · Zbl 0856.94006 · doi:10.1109/18.481776
[33] Rissanen, J.; Velupillai, K. (ed.), Complexity and information in modeling. Chapter IV (2005), Oxford, UK
[34] Rissanen, J. 2007. Information and complexity in statistical modeling. New York, NY: Springer-Verlag. · Zbl 1156.62005 · doi:10.1007/978-0-387-68812-1
[35] Rissanen, J. 2012. Optimal estimation of parameters. Cambridge, UK: Cambridge University Press. · Zbl 1292.62016 · doi:10.1017/CBO9780511791635
[36] Shannon, C. E., A mathematical theory of communication, Bell System Tech. J., 27, 379-423 (1948) · Zbl 1154.94303 · doi:10.1002/j.1538-7305.1948.tb01338.x
[37] Spiegelhalter, D. J.; Best, N. G.; Carlin, B. P.; van der Linde, A., Bayesian measures of model complexity and fit, J. R. Stat. Soc. Ser. B, 64, 583-639 (2002) · Zbl 1067.62010 · doi:10.1111/1467-9868.00353
[38] van der Linde, A., A Bayesian view of model complexity, Stat. Neerland., 66, 253-271 (2012) · doi:10.1111/j.1467-9574.2011.00518.x
[39] Vanpaemel, W.; Bengio, Y. (ed.); Schuurmans, D. (ed.); Lafferty, J. (ed.); Williams, C. K. I. (ed.); Culotta, A. (ed.), Measuring model complexity with the prior predictive, 1919-1927 (2009), Red Hook, NY
[40] Wallis, K. F. 2006. A note on the calculation of entropy from histograms. Unpublished paper, University of Warwick, Coventry, UK.