
\(\ell^2\) inference for change points in high-dimensional time series via a two-way MOSUM. (English) Zbl 1539.62273

Summary: We propose an inference method for detecting multiple change points in high-dimensional time series, targeting dense or spatially clustered signals. Our method aggregates moving sum (MOSUM) statistics cross-sectionally by an \(\ell^2\)-norm and maximizes them over time. We further introduce a novel Two-Way MOSUM, which utilizes spatial-temporal moving regions to search for breaks, with the added advantage of enhancing testing power when breaks occur in only a few groups. The limiting distribution of an \(\ell^2\)-aggregated statistic is established for testing break existence by extending a high-dimensional Gaussian approximation theorem to spatial-temporal nonstationary processes. Simulation studies exhibit promising performance of our test in detecting nonsparse weak signals. Two applications on equity returns and COVID-19 cases in the United States show the real-world relevance of our algorithms. The R package “L2hdchange” is available on CRAN.
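The core aggregation step described in the summary (per-series MOSUM statistics, combined cross-sectionally by an \(\ell^2\)-norm and maximized over time) can be illustrated with a minimal sketch. This is not the authors' statistic and not the API of the "L2hdchange" package: the function name `l2_mosum`, the window parameter `b`, and the unnormalized difference of window means are assumptions made for illustration. The sketch omits the centering, the long-run variance normalization, the spatial windows of the Two-Way MOSUM, and the Gaussian-approximation critical values that a valid test would need.

```python
import numpy as np


def l2_mosum(X, b):
    """Illustrative l2-aggregated MOSUM scan (hypothetical helper).

    X : array of shape (p, n), p series observed at n time points.
    b : window width on each side of a candidate break time.

    Returns (max statistic, time index attaining it, statistic per time).
    """
    X = np.asarray(X, dtype=float)
    p, n = X.shape
    if not 1 <= b <= n // 2:
        raise ValueError("b must satisfy 1 <= b <= n // 2")

    # Cumulative sums with a leading zero column, so that
    # S[:, j] - S[:, i] equals the sum of X[:, i:j].
    S = np.concatenate([np.zeros((p, 1)), np.cumsum(X, axis=1)], axis=1)

    # Candidate break times t: left window X[:, t-b:t], right window X[:, t:t+b].
    t = np.arange(b, n - b + 1)
    left = S[:, t] - S[:, t - b]
    right = S[:, t + b] - S[:, t]

    # MOSUM per series: difference of the two window means.
    diff = (right - left) / b

    # Cross-sectional aggregation by the squared l2-norm, then maximum over time.
    stat = np.sum(diff ** 2, axis=0)
    k = int(np.argmax(stat))
    return float(stat[k]), int(t[k]), stat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 400))
    X[:, 200:] += 0.5  # dense but weak mean shift in every series at time 200
    max_stat, t_hat, _ = l2_mosum(X, b=40)
    print(max_stat, t_hat)
```

Summing squared differences across all series is what lets many small, simultaneous shifts accumulate into a detectable signal, which matches the dense-signal setting the summary targets; a maximum over series would instead favor a few large shifts.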

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62H15 Hypothesis testing in multivariate analysis
62E20 Asymptotic distribution theory in statistics
62G20 Asymptotic properties of nonparametric inference
