On a multivariate copula-based dependence measure and its estimation. (English) Zbl 07524973

Summary: Working with so-called linkages makes it possible to define a copula-based, \([0,1]\)-valued multivariate dependence measure \(\zeta^1(X, Y)\) that quantifies the scale-invariant extent of dependence of a random variable \(Y\) on a \(d\)-dimensional random vector \(X=(X_1, \dots, X_d)\) and exhibits various good and natural properties. In particular, \(\zeta^1 (X,Y)=0\) if and only if \(X\) and \(Y\) are independent, \(\zeta^1 (X,Y)\) is maximal exclusively if \(Y\) is a function of \(X\), and ignoring one or several coordinates of \(X\) cannot increase the resulting dependence value. After introducing and analyzing the metric \(D_1\) underlying the construction of the dependence measure, and deriving examples showing how much information can be lost by considering only the pairwise dependence values \(\zeta^1 (X_1, Y), \dots, \zeta^1 (X_d, Y)\), we derive a so-called checkerboard estimator for \(\zeta^1 (X,Y)\) and show that it is strongly consistent in full generality, i.e., without any smoothness restrictions on the underlying copula. Some simulations illustrating the small-sample performance of the estimator complement the established theoretical results.
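To make the idea of a checkerboard estimator concrete, here is a minimal Python sketch for the simplest case \(d=1\) only. It is not the authors' implementation and does not cover the linkage-based multivariate construction. It assumes the bivariate normalization \(\zeta^1 = 3\,D_1(A,\Pi)\), with \(D_1(A,\Pi)=\int_0^1\int_0^1 |K_A(x,[0,y]) - y|\,dy\,dx\), as in the bivariate construction of Trutschnig (2011). The function names, the tie-free sample, and the caller-supplied resolution `N` are illustrative assumptions.

```python
import numpy as np


def checkerboard_masses(x, y, N):
    """N x N mass matrix of the empirical checkerboard copula (assumes no ties)."""
    n = len(x)
    r = np.argsort(np.argsort(x)) + 1  # ranks 1..n of x
    s = np.argsort(np.argsort(y)) + 1  # ranks 1..n of y
    edges = np.arange(N + 1) / N

    def overlap(ranks):
        # Fraction of the interval [(rank-1)/n, rank/n] lying in each of the N cells.
        lo = (ranks - 1)[:, None] / n
        hi = ranks[:, None] / n
        length = np.minimum(hi, edges[1:]) - np.maximum(lo, edges[:-1])
        return np.clip(length, 0.0, None) * n

    # Each observation spreads mass 1/n uniformly over its own 1/n x 1/n square.
    return overlap(r).T @ overlap(s) / n


def zeta1_checkerboard(x, y, N):
    """Sketch of a checkerboard estimate of zeta^1 for d = 1 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    M = checkerboard_masses(x, y, N)
    knots = np.arange(N + 1) / N
    # Conditional cdf of the checkerboard copula at the knots, one row per x-strip.
    F = np.concatenate([np.zeros((N, 1)), np.cumsum(N * M, axis=1)], axis=1)
    g = F - knots  # difference to the independence kernel, linear between knots
    a, b = np.abs(g[:, :-1]), np.abs(g[:, 1:])
    same_sign = g[:, :-1] * g[:, 1:] >= 0
    denom = np.where(a + b > 0, a + b, 1.0)
    # Exact integral of |linear function| over each cell of width 1/N.
    cell = np.where(same_sign, (a + b) / 2, (a**2 + b**2) / (2 * denom)) / N
    d1 = cell.sum() / N  # average over the N strips in x
    return 3.0 * d1
```

For a strictly monotone relationship and a resolution `N` dividing the sample size, this sketch returns \(1 - 1/(2N)\), which illustrates why the resolution has to grow with the sample size for the estimate to approach the maximal value.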

MSC:

62-XX Statistics

Software:

Linkages; FOCI
