A circular-linear dependence measure under Johnson-Wehrly distributions and its application in Bayesian networks. (English) Zbl 1454.62161

Summary: Circular data jointly observed with linear data are common in various disciplines. Since circular data require different techniques than linear data, it is often misleading to use usual dependence measures for joint data of circular and linear observations. Moreover, although a mutual information measure between circular variables exists, the measure has drawbacks in that it is defined only for a bivariate extension of the wrapped Cauchy distribution and has to be approximated using numerical methods. In this paper, we introduce two measures of dependence, namely, (i) circular-linear mutual information as a measure of dependence between circular and linear variables and (ii) circular-circular mutual information as a measure of dependence between two circular variables. It is shown that the expression for the proposed circular-linear mutual information can be greatly simplified for a subfamily of Johnson-Wehrly distributions. We apply these two dependence measures to learn a circular-linear tree-structured Bayesian network that combines circular and linear variables. To illustrate and evaluate our proposal, we perform experiments with simulated data. We also use a real meteorological data set from different European stations to create a circular-linear tree-structured Bayesian network model.
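
As a rough illustration of how pairwise dependence measures can drive tree-structure learning, the sketch below builds a maximum-weight spanning tree over mutual information scores, in the spirit of the Chow-Liu procedure. The summary does not spell out the learning algorithm, so this choice is an assumption. The function name `max_spanning_tree`, the variable names, and the numeric scores are all hypothetical, and the circular-linear and circular-circular mutual information values themselves are taken as given rather than computed.

```python
from typing import Dict, List, Tuple


def max_spanning_tree(names: List[str],
                      mi: Dict[Tuple[str, str], float]) -> List[Tuple[str, str]]:
    """Kruskal's algorithm on pairwise dependence scores, largest first."""
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Follow parent links to the representative of n's component.
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    edges = []
    for (a, b), _ in sorted(mi.items(), key=lambda kv: kv[1], reverse=True):
        ra, rb = find(a), find(b)
        if ra != rb:  # the edge joins two components, so no cycle is created
            parent[ra] = rb
            edges.append((a, b))
    return edges


# Hypothetical pairwise mutual information values (made up for illustration).
example_mi = {
    ("wind_dir", "wind_speed"): 0.40,
    ("wind_dir", "temperature"): 0.10,
    ("wind_speed", "temperature"): 0.25,
    ("wind_dir", "humidity"): 0.05,
    ("temperature", "humidity"): 0.30,
    ("wind_speed", "humidity"): 0.15,
}
tree = max_spanning_tree(
    ["wind_dir", "wind_speed", "temperature", "humidity"], example_mi)
```

The returned edges are undirected; turning them into a Bayesian network would additionally require picking a root and orienting the edges away from it, a step this sketch leaves out.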

MSC:

62H11 Directional data; spatial statistics
62M30 Inference from spatial processes
94A17 Measures of information, entropy
