×

LoCoMotif: discovering time-warped motifs in time series. (English) Zbl 07917022

Summary: Time series motif discovery (TSMD) refers to the task of identifying patterns that occur multiple times (possibly with minor variations) in a time series. All existing methods for TSMD have one or more of the following limitations: they only look for the two most similar occurrences of a pattern; they only look for patterns of a pre-specified, fixed length; they cannot handle variability along the time axis; and they only handle univariate time series. In this paper, we present a new method, LoCoMotif, that has none of these limitations. The method is motivated by a concrete use case from physiotherapy. We demonstrate the value of the proposed method on this use case. We also introduce a new quantitative evaluation metric for motif discovery, and benchmark data for comparing TSMD methods. LoCoMotif substantially outperforms the existing methods, on top of being more broadly applicable.

MSC:

62M10 Time series, auto-correlation, regression, etc. in statistics (GARCH)
62G10 Nonparametric hypothesis testing

References:

[1] Alaee S, Kamgar K, Keogh E (2020) Matrix profile XXII: exact discovery of time series motifs under DTW. In: 2020 IEEE international conference on data mining (ICDM), pp 900-905. doi:10.1109/ICDM50108.2020.00099
[2] Bagnall A, Hills J, Lines J (2014) Finding motif sets in time series. arXiv preprint arXiv:1407.3685. doi:10.48550/arXiv.1407.3685
[3] Benavoli, A.; Corani, G.; Mangili, F., Should we really use post-hoc tests based on mean-ranks?, J Mach Learn Res, 17, 5, 1-10, 2016 · Zbl 1360.62208
[4] Dau HA, Keogh E, Kamgar K, Yeh C-CM, Zhu Y, Gharghabi S, Ratanamahatana CA, Yanping Hu B, Begum N, Bagnall A, Mueen A, Batista G (2018) The UCR time series classification archive. https://www.cs.ucr.edu/ eamonn/time_series_data_2018/
[5] Dau HA, Keogh E (2017) Matrix profile V: a generic technique to incorporate domain knowledge into motif discovery. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining, pp 125-134. doi:10.1145/3097983.3097993
[6] Gao Y, Lin J (2019) Discovering subdimensional motifs of different lengths in large-scale multivariate time series. In: 2019 IEEE international conference on data mining (ICDM), pp 220-229. doi:10.1109/ICDM.2019.00032
[7] Grabocka J, Schilling N, Schmidt-Thieme L (2016) Latent time-series motifs. ACM Trans Knowl Discov Data. doi:10.1145/2940329
[8] Grandini M, Bagli E, Visani G (2020) Metrics for multi-class classification: an overview. arXiv preprint arXiv:2008.05756. doi:10.48550/arXiv.2008.05756
[9] Hao Y, Shokoohi-Yekta M, Papageorgiou G, Keogh E (2013) Parameter-free audio motif discovery in large data archives. In: 2013 IEEE 13th international conference on data mining, pp 261-270. doi:10.1109/ICDM.2013.30
[10] Javed, A.; Lee, BS; Rizzo, DM, A benchmark study on time series clustering, Mach Learn Appl, 2020 · doi:10.1016/j.mlwa.2020.100001
[11] Lin, JF-S; Karg, M.; Kulic, D., Movement primitive segmentation for human motion modeling: a framework for analysis, IEEE Trans Hum Mach Syst, 46, 3, 325-339, 2016 · doi:10.1109/thms.2015.2493536
[12] Linardi M, Zhu Y, Palpanas T, Keogh E (2018) Matrix Profile X: VALMOD - scalable discovery of variable-length motifs in data series. In: Proceedings of the 2018 international conference on management of data, pp 1053-1066. doi:10.1145/3183713.3183744
[13] Martello S, Toth P (1987) Linear assignment problems. In: Surveys in combinatorial optimization, volume 132 of North-Holland mathematics studies, pp 259-282. doi:10.1016/S0304-0208(08)73238-9 · Zbl 0611.90073
[14] Moody GB, Mark RG (1992) MIT-BIH arrhythmia database. https://physionet.org/content/mitdb/
[15] Mueen, A.; Chavoshi, N., Enumeration of time series motifs of all lengths, Knowl Inf Syst, 45, 1, 105-132, 2015 · doi:10.1109/ICDM.2013.27
[16] Müller, M., Fundamentals of music processing, Springer Int Publ, 2015 · doi:10.1007/978-3-319-21945-5
[17] Müller, M.; Jiang, N.; Grosche, P., A robust fitness measure for capturing repetitions in music recordings with applications to audio thumbnailing, IEEE Trans Audio Speech Lang Process, 21, 3, 531-543, 2013 · doi:10.1109/TASL.2012.2227732
[18] Schäfer, P.; Leser, U., Motiflets simple and accurate detection of motifs in time series, Proc VLDB Endowm, 16, 725-737, 2023 · doi:10.14778/3574245.3574257
[19] Soenen J, Van Wolputte E, Perini L, Vercruyssen V, Meert W, Davis J, Blockeel H (2021) The effect of hyperparameter tuning on the comparative evaluation of unsupervised anomaly detection methods. In: Proceedings of the KDD’21 workshop on outlier detection and description, pp 1-9
[20] Wang, L.; Chng, ES; Li, H., A tree-construction search approach for multivariate time series motifs discovery, Pattern Recogn Lett, 31, 9, 869-875, 2010 · doi:10.1016/j.patrec.2010.01.005
[21] Wankhedkar R, Jain SK (2021) Motif discovery and anomaly detection in an ECG using matrix profile. In: Progress in advanced computing and intelligent engineering, pp 88-95. doi:10.1007/978-981-15-6584-7_9
[22] Yeh C-CM, Kavantzas N, Keogh E (2017) Matrix profile VI: meaningful multidimensional motif discovery. In: 2017 IEEE international conference on data mining (ICDM), pp 565-574. doi:10.1109/ICDM.2017.66
[23] Yingchareonthawornchai S, Sivaraks H, Rakthanmanon T, Ratanamahatana CA (2013) Efficient proper length time series motif discovery. In: 2013 IEEE 13th international conference on data mining, pp 1265-1270. doi:10.1109/ICDM.2013.111
[24] Yurtman, A.; Barshan, B., Automated evaluation of physical therapy exercises using multi-template dynamic time warping on wearable sensor signals, Comput Methods Programs Biomed, 117, 2, 189-207, 2014 · doi:10.1016/j.cmpb.2014.07.003
[25] Yurtman A, Barshan B (2022) Physical therapy exercises dataset. doi:10.13140/RG.2.2.20101.01768
[26] Zimmerman Z, Kamgar K, Senobari NS, Crites B, Funning G, Brisk P, Keogh E (2019) Matrix profile XIV: scaling time series motif discovery with GPUs to break a quintillion pairwise comparisons a day and beyond. In: Proceedings of the ACM symposium on cloud computing, pp 74-86. doi:10.1145/3357223.3362721
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.