A two-stage causality method for time series prediction based on feature selection and momentary conditional independence. (English) Zbl 07511820

Summary: Real-world time series contain many variables with complex interrelations, so conventional causality methods struggle to identify cause and effect accurately. To address this problem, a two-stage causal network learning method, consisting of a feature selection stage and a conditional independence test stage, is proposed to reveal the causalities between variables and construct an accurate prediction model. The first stage has two steps. First, a feature selection method reduces data dimensionality by removing irrelevant and redundant variables. Such variables not only increase computational complexity but also mask part of the useful information, which may reduce the accuracy of the constructed model. Then, a global redundancy minimization (GRM) scheme further refines the result of the previous step from a global perspective. In the second stage, a momentary conditional independence (MCI) test is performed to test the causalities between variables, which can accurately detect the causal network structure. Finally, an accurate causal network and a subsequent prediction model are established from the output of the two-stage model. In the simulations, two benchmark datasets, a coupled Lorenz system, and two real-world datasets are used to verify the effectiveness of the proposed method. The results show that the proposed method can effectively analyze the causalities between variables and construct an accurate prediction model.
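The two-stage idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation, and it swaps in simpler techniques throughout: absolute Pearson correlation stands in for mutual information, a greedy relevance-minus-redundancy ranking stands in for the feature selection and GRM steps, and a linear partial-correlation filter (conditioning only on the other selected candidates) stands in for the MCI test. All function names, the toy data, and the threshold of 0.1 are assumptions made for illustration.

```python
import numpy as np

def lagged_matrix(data, max_lag):
    """Stack lagged copies of each column; names[i] = (variable, lag)."""
    T, n = data.shape
    cols, names = [], []
    for j in range(n):
        for lag in range(1, max_lag + 1):
            cols.append(data[max_lag - lag:T - lag, j])
            names.append((j, lag))
    return np.column_stack(cols), names

def select_features(X, y, k):
    """Stage 1 (assumed stand-in): greedy relevance-minus-redundancy ranking."""
    relevance = np.abs([np.corrcoef(X[:, i], y)[0, 1] for i in range(X.shape[1])])
    redundancy = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        rest = [i for i in range(X.shape[1]) if i not in selected]
        scores = [relevance[i] - redundancy[i, selected].mean() for i in rest]
        selected.append(rest[int(np.argmax(scores))])
    return selected

def residual(v, Z):
    """Residual of v after least-squares regression on Z plus an intercept."""
    A = np.column_stack([np.ones(len(v)), Z]) if Z.size else np.ones((len(v), 1))
    beta, *_ = np.linalg.lstsq(A, v, rcond=None)
    return v - A @ beta

def partial_corr(x, y, Z):
    rx, ry = residual(x, Z), residual(y, Z)
    return float(rx @ ry / np.sqrt((rx @ rx) * (ry @ ry)))

def ci_filter(X, y, selected, names, threshold=0.1):
    """Stage 2 (assumed stand-in): keep candidates that stay dependent on y
    after conditioning on all other selected candidates."""
    links = []
    for i in selected:
        others = [s for s in selected if s != i]
        if abs(partial_corr(X[:, i], y, X[:, others])) > threshold:
            links.append(names[i])
    return links

# Toy data (hypothetical): X0 drives X1 at lag 1, X2 is pure noise.
rng = np.random.default_rng(0)
T = 2000
noise = rng.standard_normal((T, 3))
data = np.zeros((T, 3))
for t in range(1, T):
    data[t, 0] = 0.7 * data[t - 1, 0] + noise[t, 0]
    data[t, 1] = 0.5 * data[t - 1, 1] + 0.8 * data[t - 1, 0] + noise[t, 1]
    data[t, 2] = noise[t, 2]

max_lag, target = 2, 1
X, names = lagged_matrix(data, max_lag)
y = data[max_lag:, target]
selected = select_features(X, y, k=4)
links = ci_filter(X, y, selected, names)
print("Detected (variable, lag) parents of X1:", links)
```

In this toy setting the retained links should include the true lag-1 parents of the target. A faithful MCI test would additionally condition on the parents of each lagged source variable and would use a statistical significance level in place of a fixed threshold.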

MSC:

82-XX Statistical mechanics, structure of matter

Software:

SSA; TETRAD
Full Text: DOI

References:

[1] Wang, Q., Multifractal characterization of air polluted time series in China, Phys. A, 514, 167-180 (2019)
[2] Runge, J., Inferring causation from time series in Earth system sciences, Nature Commun., 10, 1, 2553 (2019)
[3] Zhang, Q., A short-term traffic forecasting model based on echo state network optimized by improved fruit fly optimization algorithm, Neurocomputing, 416, 117-124 (2020)
[4] Safari, N.; Chung, C. Y.; Price, G. C.D., Novel multi-step short-term wind power prediction framework based on chaotic time series analysis and singular spectrum analysis, IEEE Trans. Power Syst., 33, 1, 590-601 (2018)
[5] Sugihara, G.; May, R. M.; Ye, H.; Hsieh, C.-H.; Deyle, E. R.; Fogarty, M. J.; Munch, S. B., Detecting causality in complex ecosystems, Science, 338, 6106, 496-500 (2012) · Zbl 1355.92144
[6] Pearl, J.; Glymour, M.; Jewell, N. P., Causal inference in statistics: a primer, J. Chem. Inf. Model., 53, 1689-1699 (2016) · Zbl 1332.62001
[7] Zenil, H.; Kiani, N. A.; Zea, A. A.; Tegnér, J., Causal deconvolution by algorithmic generative models, Nat. Mach. Intell., 1, 1, 58-66 (2019)
[8] Hyvärinen, A.; Zhang, K.; Shimizu, S.; Hoyer, P. O., Estimation of a structural vector autoregression model using non-Gaussianity, J. Mach. Learn. Res., 11, 56, 1709-1731 (2010) · Zbl 1242.62097
[9] Ren, W.; Li, B.; Han, M., A novel Granger causality method based on HSIC-lasso for revealing nonlinear relationship between multivariate time series, Phys. A, 541, Article 123245 pp. (2020) · Zbl 07527013
[10] Spirtes, P.; Glymour, C. N.; Scheines, R., Causation, Prediction, and Search (1993), MIT Press · Zbl 0806.62001
[11] Guo, R.; Cheng, L.; Li, J.; Hahn, P. R.; Liu, H., A survey of learning causality with data: Problems and methods, ACM Comput. Surv., 53, 4, Article 3397269 pp. (2020)
[12] Geweke, J., Measurement of linear dependence and feedback between multiple time series, J. Am. Stat. Assoc., 77, 378, 304-313 (1982) · Zbl 0492.62078
[13] Ancona, N.; Marinazzo, D.; Stramaglia, S., Radial basis function approach to nonlinear Granger causality of time series, Phys. Rev. E, 70, Article 056221 pp. (2004)
[14] Shojaie, A.; Michailidis, G., Discovering graphical Granger causality using the truncating lasso penalty, Bioinformatics, 26, 18, i517-i523 (2010)
[15] Marinazzo, D.; Pellicoro, M.; Stramaglia, S., Kernel method for nonlinear Granger causality, Phys. Rev. Lett., 100, Article 144103 pp. (2008)
[16] Geweke, J., Measurement of linear dependence and feedback between multiple time series, J. Am. Stat. Assoc., 77, 378, 304-313 (1982) · Zbl 0492.62078
[17] Azqueta-Gavaldón, A., Causal inference between cryptocurrency narratives and prices: Evidence from a complex dynamic ecosystem, Phys. A, 537, Article 122574 pp. (2020)
[18] Runge, J.; Nowack, P.; Kretschmer, M.; Flaxman, S.; Sejdinovic, D., Detecting and quantifying causal associations in large nonlinear time series datasets, Sci. Adv., 5, 11, eaau4996 (2019)
[19] Spirtes, P.; Glymour, C. N., An algorithm for fast recovery of sparse causal graphs, Soc. Sci. Comput. Rev., 9, 1, 62-72 (1991)
[20] Li, J.; Cheng, K.; Wang, S.; Morstatter, F.; Trevino, R. P.; Tang, J.; Liu, H., Feature selection: A data perspective, ACM Comput. Surv., 50, 6, 94 (2017)
[21] Li, Y.; Li, T.; Liu, H., Recent advances in feature selection and its applications, Knowl. Inf. Syst., 53, 3, 551-577 (2017)
[22] Wang, D.; Nie, F.; Huang, H., Feature selection via global redundancy minimization, IEEE Trans. Knowl. Data Eng., 27, 10, 2743-2755 (2015)
[23] Peng, H.; Long, F.; Ding, C., Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy, IEEE Trans. Pattern Anal. Mach. Intell., 27, 8, 1226-1238 (2005)
[24] Nie, F.; Yang, S.; Zhang, R.; Li, X., A general framework for auto-weighted feature selection via global redundancy minimization, IEEE Trans. Image Process., 28, 5, 2428-2438 (2019) · Zbl 1411.94009
[25] Han, M.; Ren, W., Global mutual information-based feature selection approach using single-objective and multi-objective optimization, Neurocomputing, 168, 47-54 (2015)
[26] Miura, A.; Tomoeda, A.; Nishinari, K., Formularization of entropy and anticipation of metastable states using mutual information in one-dimensional traffic flow, Phys. A, 560, 3, Article 125152 pp. (2020)
[27] Zhao, Z.; Morstatter, F.; Sharma, S.; Alelyani, S.; Anand, A.; Liu, H., Advancing feature selection research, 1-28 (2010), ASU Feature Selection Repository Arizona State University
[28] Zhong, K.; Ma, D.; Han, M., Distributed dynamic process monitoring based on dynamic slow feature analysis with minimal redundancy maximal relevance, Control Eng. Pract., 104, Article 104627 pp. (2020)
[29] Ding, C. H.Q.; Peng, H., Minimum redundancy feature selection from microarray gene expression data, J. Bioinform. Comput. Biol., 3, 2, 185-205 (2005)
[30] Zhang, Q.; Wang, A., Decoupling control in statistical sense: minimised mutual information algorithm, Int. J. Adv. Mechatron. Syst., 7, 2, 61 (2016)
[31] Mirjalili, S.; Gandomi, A. H.; Mirjalili, S. Z.; Saremi, S.; Faris, H.; Mirjalili, S. M., Salp swarm algorithm: A bio-inspired optimizer for engineering design problems, Adv. Eng. Softw., 114, 163-191 (2017)
[32] Runge, J., Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets, (Conference on Uncertainty in Artificial Intelligence (2020), PMLR), 1388-1397
[33] Mirjalili, S.; Mirjalili, S. M.; Hatamlou, A., Multi-verse optimizer: a nature-inspired algorithm for global optimization, Neural Comput. Appl., 27, 2, 495-513 (2016)
[34] Anishchenko, V. S.; Silchenko, A. N.; Khovanov, I. A., Synchronization of switching processes in coupled Lorenz systems, Phys. Rev. E, 57, 316-322 (1998)
[35] Santoso, A.; England, M. H.; Cai, W., Impact of Indo-Pacific feedback interactions on ENSO dynamics diagnosed using ensemble climate simulations, J. Clim., 25, 21, 7743-7763 (2012)
[36] Walker, G., Correlation in seasonal variations of weather, VIII. A preliminary study of world weather, Mem. India Meteorol. Dep., 24, 4, 75-131 (1923)
[37] Li, D.; Han, M.; Wang, J., Chaotic time series prediction based on a novel robust echo state network, IEEE Trans. Neural Netw. Learn. Syst., 23, 5, 787-799 (2012)
[38] Chen, Z.; Cai, J.; Gao, B.; Xu, B.; Dai, S.; He, B.; Xie, X., Detecting the causality influence of individual meteorological factors on local PM2.5 concentration in the Jing-Jin-Ji region, Sci. Rep., 7, 40735 (2017)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.