A novel approach for missing data prediction in coevolving time series. (English) Zbl 1465.62153

Summary: Although various innovative sensing technologies are widely employed, missing data occur frequently in collections of time series and pose a major threat to precise data analysis. Many existing missing data prediction approaches are either infeasible or inefficient when applied to multiple time series. To solve this problem, we propose a novel approach, based on compressive sensing theory and sparse Bayesian learning theory, for missing data prediction in coevolving time series. First, we model the problem by designing the corresponding sparse representation basis and measurement matrix. Then, the missing data prediction problem is formulated as a multiple sparse vector recovery problem. Many simultaneous sparse estimation approaches focus on the joint estimation of multiple sparse vectors with a common support from given linear observations, a requirement that is too strict in some real applications. In this paper, by exploiting the interior patterns of coevolving time series, we design a tuning-parameter-free algorithm based on sparse Bayesian learning, which can solve multiple sparse estimation tasks simultaneously without requiring auxiliary information. Simulation results demonstrate that our approach can recover the entire time series efficiently using only the data that are not missing, even when a high proportion of the collected data is missing.
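
The formulation in the summary can be illustrated with a minimal sketch. It is not the authors' algorithm: it recovers a single series rather than several coevolving ones, it assumes a DCT basis as the sparse representation basis, it takes the measurement matrix to be the rows of the identity at the observed time steps, and it uses a basic EM form of sparse Bayesian learning with a fixed noise variance (so it is not tuning-parameter-free). The names `dct_basis`, `sbl_recover` and `predict_missing`, as well as the synthetic test signal, are hypothetical and chosen only for illustration.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT basis (assumed sparsifying basis); column k is the k-th cosine atom."""
    t = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (t + 0.5) * k / n)
    psi[:, 0] = np.sqrt(1.0 / n)
    return psi

def sbl_recover(A, y, noise_var=1e-6, n_iter=100):
    """Basic EM sparse Bayesian learning for y = A @ theta.

    Each coefficient theta_i has a zero-mean Gaussian prior with variance gamma_i.
    The noise variance is fixed here, which is a simplifying assumption.
    """
    m, n = A.shape
    gamma = np.ones(n)
    mu = np.zeros(n)
    for _ in range(n_iter):
        # Marginal covariance of y: noise_var * I + A diag(gamma) A^T
        sigma_y = noise_var * np.eye(m) + (A * gamma) @ A.T
        inv_y = np.linalg.solve(sigma_y, y)
        inv_A = np.linalg.solve(sigma_y, A)
        mu = gamma * (A.T @ inv_y)                      # posterior mean of theta
        q = np.sum(A * inv_A, axis=0)                   # diag(A^T sigma_y^{-1} A)
        post_var = gamma - gamma**2 * q                 # posterior variances
        gamma = np.maximum(mu**2 + post_var, 1e-12)     # EM update of the prior variances
    return mu

def predict_missing(series, observed):
    """Estimate the full series from the entries where the boolean mask `observed` is True."""
    psi = dct_basis(series.shape[0])
    A = psi[observed]            # measurement matrix (selected identity rows) times basis
    theta = sbl_recover(A, series[observed])
    return psi @ theta

# Synthetic example: a series that is 3-sparse in the DCT basis, with about 40% of samples missing.
rng = np.random.default_rng(0)
n = 128
theta_true = np.zeros(n)
theta_true[[2, 7, 15]] = [5.0, -3.0, 2.0]
truth = dct_basis(n) @ theta_true
observed = rng.random(n) > 0.4
estimate = predict_missing(np.where(observed, truth, np.nan), observed)
rel_err = np.linalg.norm(estimate - truth) / np.linalg.norm(truth)
print(f"relative error: {rel_err:.2e}")
```

Applying `predict_missing` to each series independently ignores the shared structure between coevolving series that the summary describes; exploiting that structure without forcing a common support is the part this sketch leaves out.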

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62M20 Inference from stochastic processes and prediction
62D10 Missing data

Software:

PRMLT

References:

[1] Wu X, Zhu X, Wu GQ, Ding W (2014) Data mining with big data. IEEE Trans Knowl Data Eng 26(1):97-107 · doi:10.1109/TKDE.2013.109
[2] Vlahogianni EI, Golias JC (2004) Short-term traffic forecasting: overview of objectives and methods. Transp Rev 24(5):533-557 · doi:10.1080/0144164042000195072
[3] Ruby-Figueroa R, Saavedra J, Bahamonde N, Cassano A (2017) Permeate flux prediction in the ultrafiltration of fruit juices by ARIMA models. J Membr Sci 524:108-116 · doi:10.1016/j.memsci.2016.11.034
[4] Lippi M, Bertini M, Frasconi P (2013) Short-term traffic flow forecasting: an experimental comparison of time-series analysis and supervised learning. IEEE Trans Intell Transp Syst 14(2):871-882 · doi:10.1109/TITS.2013.2247040
[5] Strauman AS, Bianchi FM, Mikalsen KØ (2018) Classification of postoperative surgical site infections from blood measurements with missing data using recurrent neural networks. In: IEEE EMBS international conference on biomedical & health informatics (BHI), pp 307-310. https://doi.org/10.1109/BHI.2018.8333430
[6] Zhong M, Sharma S, Lingras P (2004) Genetically designed models for accurate imputations of missing traffic counts. Transp Res Rec 1879:71-79 · doi:10.3141/1879-09
[7] Kumar L, Kumar M, Rath SK (2016) Maintainability prediction of web service using support vector machine with various kernel methods. Int J Syst Assur Eng Manag 2:1-18
[8] Baharaeen S, Masud AS (1986) A computer program for time series forecasting using single and double exponential smoothing techniques. Comput Ind Eng 11:151-155 · doi:10.1016/0360-8352(86)90068-9
[9] Holt CC (2004) Forecasting seasonals and trends by exponentially weighted moving averages. Int J Forecast 20:5-10 · doi:10.1016/j.ijforecast.2003.09.015
[10] Chen C, Kwon J, Rice J, Skabardonis A, Varaiya P (2003) Detecting errors and imputing missing data for single-loop surveillance systems. Transp Res Rec J Board 1855:160-167 · doi:10.3141/1855-20
[11] Al Deek HM, Chandra CVSR (2004) New algorithms for filtering and imputation of real-time and archived dual-loop detector data in I-4 data warehouse. Transp Res Rec J Transp Res Board 1867:116-126 · doi:10.3141/1867-14
[12] Boyles S (2011) A comparison of interpolation methods for missing traffic volume data. In: Proceedings of the 90th annual meeting of the transportation research board, pp 23-27
[13] Qu L, Li L, Zhang Y, Hu J (2009) PPCA-based missing data imputation for traffic flow volume: a systematical approach. IEEE Trans Intell Transp Syst 10(3):512-522 · doi:10.1109/TITS.2009.2026312
[14] Li Y, Li Z, Li L, Zhang Y (2013) Comparison on PPCA, KPPCA and MPPCA based missing data imputing for traffic flow. In: Proceedings of IEEE conference on intelligent transportation system, pp 1535-1540
[15] Shi W, Zhu Y, Yu PS (2017) Temporal dynamic matrix factorization for missing data prediction in large scale coevolving time series. IEEE Access 4(99):6719-6732
[16] Cai Y, Tong H, Fan W, Ji P (2015) Fast mining of a network of coevolving time series. In: Proceedings of SIAM international conference data mining, pp 298-306
[17] Si Z, Yu H, Ma Z (2016) Learning deep features for DNA methylation data analysis. IEEE Access 4:2732-2737 · doi:10.1109/ACCESS.2016.2576598
[18] Candès E, Romberg J, Tao T (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans Inf Theory 52(2):489-509 · Zbl 1231.94017 · doi:10.1109/TIT.2005.862083
[19] Tipping ME (2001) Sparse Bayesian learning and the relevance vector machine. JMLR.org · Zbl 0997.68109
[20] Babacan SD, Molina R, Katsaggelos AK (2010) Bayesian compressive sensing using Laplace priors. IEEE Trans Image Process 19(1):53-63 · Zbl 1371.94480 · doi:10.1109/TIP.2009.2032894
[21] Andrews DF, Mallows CL (1974) Scale mixtures of normal distributions. J R Stat Soc Ser B (Methodol) 36:99-102 · Zbl 0282.62017
[22] Wipf D, Rao B (2007) An empirical Bayesian strategy for solving the simultaneous sparse approximation problem. IEEE Trans Signal Process 55(7):3704-3716 · Zbl 1391.62010 · doi:10.1109/TSP.2007.894265
[23] Tropp JA, Gilbert AC, Strauss MJ (2006) Algorithms for simultaneous sparse approximation. Part I: greedy pursuit. Signal Process 86:572-588 · Zbl 1163.94396 · doi:10.1016/j.sigpro.2005.05.030
[24] Cotter SF, Rao BD, Engan K, Kreutz-Delgado K (2005) Sparse solutions to linear inverse problems with multiple measurement vectors. IEEE Trans Signal Process 53:2477-2488 · Zbl 1372.65123 · doi:10.1109/TSP.2005.849172
[25] Tropp JA, Gilbert AC, Strauss MJ (2006) Algorithms for simultaneous sparse approximation. Part II: convex relaxation. Signal Process 86:589-602 · Zbl 1163.94395 · doi:10.1016/j.sigpro.2005.05.031
[26] Wipf DP, Rao BD (2007) An empirical Bayesian strategy for solving the simultaneous sparse approximation problem. IEEE Trans Signal Process 55:3704-3716 · Zbl 1391.62010 · doi:10.1109/TSP.2007.894265
[27] Zhang Z, Rao BD (2011) Sparse signal recovery with temporally correlated source vectors using sparse Bayesian learning. IEEE J Sel Top Signal Process 5:912-926 · doi:10.1109/JSTSP.2011.2159773
[28] Zhang Z, Rao BD (2010) Sparse signal recovery in the presence of correlated multiple measurement vectors. In: Proceedings of ICASSP, Dallas, TX, USA, pp 3986-3989
[29] Prasad R, Murphy CR, Rao BD (2014) Joint approximately sparse channel estimation and data detection in OFDM systems using sparse Bayesian learning. IEEE Trans Signal Process 62(14):3591-3603 · Zbl 1394.94464 · doi:10.1109/TSP.2014.2329272
[30] Chen W (2017) Simultaneous sparse Bayesian learning with partially shared support. IEEE Signal Process Lett 24(10):1641-1645 · doi:10.1109/LSP.2017.2753770
[31] Tipping ME (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1:211-244 · Zbl 0997.68109
[32] Rhee I, Shin M, Hong S (2009) Mobility traces. http://carwdad.org/ncsu/mobilitymodels/
[33] Samuel M. Intel lab data. http://db.csail.mit.edu
[34] Fonollosa J, Sheik S, Huerta R, Marco S (2015) Reservoir computing compensates slow response of chemosensor arrays exposed to fast varying gas concentrations in continuous monitoring. Sens Actuators B Chem 215:618-629 · doi:10.1016/j.snb.2015.03.028
[35] Wu X, Liu M (2012) In-situ soil moisture sensing: Measurement scheduling and estimation using compressive sensing. In: Proceedings of the 11th ACM international conference on information processing in sensor networks, pp 1-12
[36] Bishop CM (2006) Pattern recognition and machine learning. Springer, Berlin · Zbl 1107.68072
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.