×

TIRPCLo: efficient and complete mining of time intervals-related patterns. (English) Zbl 07741281

Summary: Mining frequent Time Intervals-Related Patterns (TIRPs) from series of symbolic time intervals offers a comprehensive framework for heterogeneous, multivariate temporal data analysis in various application domains. While gaining a growing interest in recent decades, the efficient mining of frequent TIRPs is still a high computational challenge which has also not yet been investigated in its full complexity. The majority of previous methods discover only the first instances of the TIRPs within each series of symbolic time intervals, whereas their re-occurring instances are ignored. This eventually results in an incomplete discovery of frequent TIRPs, a problem that lies also in the challenge of mining only the frequent closed TIRPs, which was only recently investigated for the first time. In this paper, we introduce TIRPClo – an efficient algorithm for the complete mining of either the entire set of frequent TIRPs, or only the frequent closed TIRPs. The algorithm proposes a non-ambiguous sequential representation of symbolic time intervals series through the intervals’ end-points, as well as a memory-efficient index and a novel method for data projection, due to which it is the first algorithm to guarantee a complete discovery of frequent closed TIRPs. The experimental evaluation conducted on eleven real-world and four synthetic datasets demonstrates that TIRPClo is up to 10 times faster when mining the entire set of frequent TIRPs, and up to more than 100 times faster when mining only the frequent closed TIRPs compared to four state-of-the-art methods, while also reporting lower memory measurements.

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T10 Pattern recognition, speech recognition
Full Text: DOI

References:

[1] Allen, JF, Maintaining knowledge about temporal intervals, Commun ACM, 26, 11, 832-843 (1983) · Zbl 0519.68079 · doi:10.1145/182.358434
[2] Ayres J, Flannick J, Gehrke J, et al (2002) Sequential pattern mining using a bitmap representation. In: Proceedings of the Eighth ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery, New York, NY, USA, KDD ’02, pp 429-435, doi:10.1145/775047.775109
[3] Batal I, Sacchi L, Bellazzi R, et al (2009) A temporal abstraction framework for classifying clinical temporal data. In: AMIA Annual Symposium Proceedings, vol 2009. American Medical Informatics Association, Rockville, MD, p 29
[4] Benavoli, A.; Corani, G.; Mangili, F., Should we really use post-hoc tests based on mean-ranks?, J Mach Learn Res, 17, 1, 152-161 (2016) · Zbl 1360.62208
[5] Chang L, Wang T, Yang D, et al (2008) Seqstream: mining closed sequential patterns over stream sliding windows. In: 2008 Eighth IEEE International Conference on Data Mining. IEEE Computer Society, Washington, DC, USA, ICDM ’08, pp 83-92, doi:10.1109/ICDM.2008.36
[6] Chen, YC; Peng, WC; Lee, SY, Mining temporal patterns in time interval-based data, IEEE Trans Knowl Data Eng, 27, 12, 3318-3331 (2015) · doi:10.1109/TKDE.2015.2454515
[7] Chen, YC; Weng, JTY; Hui, L., A novel algorithm for mining closed temporal patterns from interval-based data, Knowl Inf Syst, 46, 1, 151-183 (2016) · doi:10.1007/s10115-014-0815-2
[8] Demšar, J., Statistical comparisons of classifiers over multiple data sets, J Mach Learn Res, 7, 1-30 (2006) · Zbl 1222.68184
[9] Ezeife CI, Lu Y, Liu Y (2005) Plwap sequential mining: open source code. In: Proceedings of the 1st international workshop on open source data mining: frequent pattern mining implementations. Association for Computing Machinery, New York, NY, USA, OSDM ’05, pp 26-35, doi:10.1145/1133905.1133910
[10] Fournier-Viger, P.; Lin, JCW; Kiran, RU, A survey of sequential pattern mining, Data Sci Pattern Recogn, 1, 1, 54-77 (2017)
[11] Fumarola, F.; Lanotte, PF; Ceci, M., Clofast: closed sequential pattern mining using sparse and vertical id-lists, Knowl Inf Syst, 48, 2, 429-463 (2016) · doi:10.1007/s10115-015-0884-x
[12] Garcia, S.; Herrera, F., An extension on“ statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons, J Mach Learn Res, 9, 12, 2677 (2008) · Zbl 1225.68178
[13] Gomariz A, Campos M, Marin R, et al (2013) Clasp: An efficient algorithm for mining frequent closed sequences. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Berlin, pp 50-61, doi:10.1007/978-3-642-37453-1_5
[14] Han J, Pei J, Mortazavi-Asl B, et al (2000) Freespan: Frequent pattern-projected sequential pattern mining. In: Proceedings of the Sixth ACM SIGKDD international conference on knowledge discovery and data mining. Association for Computing Machinery, New York, NY, USA, KDD ’00, pp 355-359, doi:10.1145/347090.347167
[15] Han J, Pei J, Mortazavi-Asl B, et al (2001) Prefixspan: Mining sequential patterns efficiently by prefix-projected pattern growth. In: proceedings of the 17th international conference on data engineering. IEEE Computer Society, Washington, DC, USA, pp 215-224
[16] Harel OD, Moskovitch R (2021) Complete closed time intervals-related patterns mining. In: proceedings of the 35th AAAI conference on artificial intelligence. AAAI Press, Palo Alto, CA
[17] Höppner F (2001) Learning temporal rules from state sequences. In: IJCAI Workshop on Learning from Temporal and Spatial Data, Citeseer · Zbl 1009.68798
[18] Höppner F (2002) Time series abstraction methods: a survey. Informatik bewegt: Informatik 2002-32 Jahrestagung der Gesellschaft für Informatik ev (GI) · Zbl 1024.68559
[19] Huang, JW; Jaysawal, BP; Chen, KY, Mining frequent and top-k high utility time interval-based events with duration patterns, Knowl Inf Syst, 61, 3, 1331-1359 (2019) · doi:10.1007/s10115-019-01333-6
[20] Huang KY, Chang CH, Tung JH, et al (2006) Cobra: Closed sequential pattern mining using bi-phase reduction approach. In: International Conference on data warehousing and knowledge discovery. Springer, Berlin, pp 280-291, doi:10.1007/11823728_27
[21] Hui, L.; Chen, YC; Weng, JTY, Incremental mining of temporal patterns in interval-based database, Knowl Inf Syst, 46, 2, 423-448 (2016) · doi:10.1007/s10115-015-0828-5
[22] Itzhak, N.; Jaroszewicz, S.; Moskovitch, R., Continuously predicting a time intervals based pattern completion towards event prediction (2023), Osaka, Japan: PAKDD, Osaka, Japan
[23] Jakkula, VR; Cook, DJ, Detecting anomalous sensor events in smart home data for enhancing the living experience, Artif Intell Smart Living, 11, 201, 1 (2011)
[24] Kam PS, Fu AWC (2000) Discovering temporal patterns for interval-based events. In: International conference on data warehousing and knowledge discovery. Springer, Berlin, pp 317-326, doi:10.1007/3-540-44466-1_32
[25] Kostakis O, Gionis A (2017) On mining temporal patterns in dynamic graphs, and other unrelated problems. In: International conference on complex networks and their applications. Springer, Berlin, pp 516-527, doi:10.1007/978-3-319-72150-7_42
[26] Kostakis O, Papapetrou P, Hollmén J (2011) Artemis: Assessing the similarity of event-interval sequences. In: Joint European conference on machine learning and knowledge discovery in databases, Springer, pp 229-244, doi:10.1007/978-3-642-23783-6_15
[27] Kotsifakos A, Papapetrou P, Athitsos V (2013) ibsm: interval-based sequence matching. in: proceedings of the 2013 siam International Conference on Data Mining, SIAM, pp 596-604, doi:10.1137/1.9781611972832.66
[28] Lavrac, N.; Keravnou, E.; Zupan, B., Intelligent data analysis in medicine, Encycl Comput Sci Technol, 42, 9, 113-157 (2000) · Zbl 0882.92014
[29] Lee Z, Lindgren T, Papapetrou P (2020) Z-miner: An efficient method for mining frequent arrangements of event intervals. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining. Association for Computing Machinery, New York, NY, USA, KDD ’20, pp 524-534, doi:10.1145/3394486.3403095
[30] Lin J, Keogh E, Lonardi S, et al (2003) A symbolic representation of time series, with implications for streaming algorithms. In: Proceedings of the 8th ACM SIGMOD workshop on research issues in data mining and knowledge discovery. Association for Computing Machinery, New York, NY, USA, DMKD ’03, pp 2-11, doi:10.1145/882082.882086
[31] Lin MY, Lee SY (2002) Fast discovery of sequential patterns by memory indexing. In: International conference on data warehousing and knowledge discovery. Springer, Berlin, pp 150-160, doi:10.1007/3-540-46145-0_15 · Zbl 1016.68564
[32] Mabroukeh, NR; Ezeife, CI, A taxonomy of sequential pattern mining algorithms, ACM Comput Surv, 43, 1, 1-41 (2010) · doi:10.1145/1824795.1824798
[33] Mirbagheri SM, Hamilton HJ (2020a) High-utility interval-based sequences. In: International Conference on big data analytics and knowledge discovery, Springer, pp 107-121, doi:10.1007/978-3-030-59065-9_9
[34] Mirbagheri SM, Hamilton HJ (2020b) Similarity matching of temporal event-interval sequences. In: Canadian conference on artificial intelligence, Springer, pp 420-425, doi:10.1007/978-3-030-47358-7_43
[35] Mirbagheri, SM; Hamilton, HJ, Mining high utility patterns in interval-based event sequences, Data Knowl Eng, 135, 101, 924 (2021) · doi:10.1016/j.datak.2021.101924
[36] Mörchen F, Fradkin D (2010) Robust mining of time intervals with semi-interval partial order patterns. In: Proceedings of the 2010 SIAM international conference on data mining. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, pp 315-326, doi:10.1137/1.9781611972801.28
[37] Mörchen F, Ultsch A (2005) Optimizing time series discretization for knowledge discovery. In: Proceedings of the Eleventh ACM SIGKDD international conference on knowledge discovery in data mining. Association for Computing Machinery, New York, NY, USA, KDD ’05, pp 660-665, doi:10.1145/1081870.1081953
[38] Mordvanyuk, N.; López, B.; Bifet, A., verttirp: Robust and efficient vertical frequent time interval-related pattern mining, Expert Syst Appl, 168, 114, 276 (2021) · doi:10.1016/j.eswa.2020.114276
[39] Mordvanyuk, N.; López, B.; Bifet, A., Ta4l: efficient temporal abstraction of multivariate time series, Knowl-Based Syst, 244, 108, 554 (2022) · doi:10.1016/j.knosys.2022.108554
[40] Moskovitch R (2022) Multivariate time series mining. Wiley’s Data Mining and Knowledge Discovery
[41] Moskovitch, R.; Shahar, Y., Classification-driven temporal discretization of multivariate time series, Data Min Knowl Disc, 29, 4, 871-913 (2015) · doi:10.1007/s10618-014-0380-z
[42] Moskovitch, R.; Shahar, Y., Classification of multivariate time series via temporal abstraction and time intervals mining, Knowl Inf Syst, 45, 1, 35-74 (2015) · doi:10.1007/s10115-014-0784-5
[43] Moskovitch, R.; Shahar, Y., Fast time intervals mining using the transitivity of temporal relations, Knowl Inf Syst, 42, 1, 21-48 (2015) · doi:10.1007/s10115-013-0707-x
[44] Moskovitch, R.; Peek, N.; Shahar, Y., Classification of ICU patients via temporal abstraction and temporal patterns mining, Notes of the intelligent data analysis in medicine and pharmacology (IDAMAP 2009) Workshop, 35-40 (2009), Verona, Italy: American Medical Informatics Association, Verona, Italy
[45] Moskovitch R, Walsh C, Wang F, et al (2015) Outcomes prediction via time intervals related patterns. In: 2015 IEEE international conference on data mining. IEEE Computer Society, Washington, DC, USA, pp 919-924, doi:10.1109/ICDM.2015.143
[46] Novitski P, Cohen CM, Karasik A, et al (2020) All-cause mortality prediction in t2d patients. In: International conference on artificial intelligence in medicine. Springer, Berlin, pp 3-13, doi:10.1007/978-3-030-59137-3_1
[47] Papapetrou, P.; Kollios, G.; Sclaroff, S., Mining frequent arrangements of temporal intervals, Knowl Inf Syst, 21, 2, 133 (2009) · doi:10.1007/s10115-009-0196-0
[48] Patel D, Hsu W, Lee ML (2008) Mining relationships among interval-based events for classification. In: Proceedings of the 2008 ACM SIGMOD international conference on management of data. Association for Computing Machinery, New York, NY, USA, SIGMOD ’08, pp 393-404, doi:10.1145/1376616.1376658
[49] Pei J, Han J, Mortazavi-Asl B, et al (2000) Mining access patterns efficiently from web logs. In: Pacific-Asia conference on knowledge discovery and data mining. Springer, Berlin, pp 396-407, doi:10.1007/3-540-45571-X_47
[50] Rebane, J.; Karlsson, I.; Bornemann, L., Smile: a feature-based temporal abstraction framework for event-interval sequence classification, Data Min Knowl Disc, 35, 1, 372-399 (2021) · doi:10.1007/s10618-020-00719-3
[51] Sacchi, L.; Larizza, C.; Combi, C., Data mining with temporal abstractions: Learning rules from time series, Data Min Knowl Disc, 15, 2, 217-247 (2007) · doi:10.1007/s10618-007-0077-7
[52] Shahar, Y., A framework for knowledge-based temporal abstraction, Artif Intell, 90, 1-2, 79-133 (1997) · Zbl 1017.03517 · doi:10.1016/S0004-3702(96)00025-2
[53] Sharma AK, Patel D (2018) Stipa: A memory efficient technique for interval pattern discovery. In: 2018 IEEE International conference on big data (Big Data). IEEE Computer Society, Washington, DC, USA, pp 1767-1776, doi:10.1109/BigData.2018.8622421
[54] Shknevsky A, Shahar Y, Moskovitch R (2021) The semantic adjacency criterion in time intervals mining. arXiv preprint arXiv:2101.03842
[55] Srikant R, Agrawal R (1996) Mining sequential patterns: Generalizations and performance improvements. In: International conference on extending database technology. Springer, Berlin, pp 1-17, doi:10.1007/BFb0014140
[56] Tzvetkov, P.; Yan, X.; Han, J., Tsp: mining top-k closed sequential patterns, Knowl Inf Syst, 7, 4, 438-457 (2005) · doi:10.1007/s10115-004-0175-4
[57] Villafane, R.; Hua, KA; Tran, D., Knowledge discovery from series of interval events, J Intell Inf Syst, 15, 1, 71-89 (2000) · doi:10.1023/A:1008781812242
[58] Wang J, Han J (2004) Bide: Efficient mining of frequent closed sequences. In: Proceedings. 20th international conference on data engineering. IEEE Computer Society, Washington, DC, USA, pp 79-90, doi:10.1109/ICDE.2004.1319986
[59] Winarko, E.; Roddick, JF, Armada: an algorithm for discovering richer relative temporal association rules from interval-based data, Data Knowl Eng, 63, 1, 76-90 (2007) · doi:10.1016/j.datak.2006.10.009
[60] Wu, SY; Chen, YL, Mining nonambiguous temporal patterns for interval-based events, IEEE Trans Knowl Data Eng, 19, 6, 742-758 (2007) · doi:10.1109/TKDE.2007.190613
[61] Yan X, Han J, Afshar R (2003) Clospan: mining: closed sequential patterns in large datasets. In: Proceedings of the 2003 SIAM international conference on data mining. Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, pp 166-177, doi:10.1137/1.9781611972733.15
[62] Yang CW, Jaysawal BP, Huang JW (2017) Subsequence search considering duration and relations of events in time interval-based events sequences. In: 2017 IEEE International conference on data science and advanced analytics (DSAA), IEEE, pp 293-302, doi:10.1109/DSAA.2017.47
[63] Yang Z, Wang Y, Kitsuregawa M (2007) Lapin: effective sequential pattern mining algorithms by last position induction for dense databases. In: International conference on database systems for advanced applications. Springer, Berlin, pp 1020-1023, doi:10.1007/978-3-540-71703-4_95
[64] Zaki, MJ, Spade: an efficient algorithm for mining frequent sequences, Mach Learn, 42, 1-2, 31-60 (2001) · Zbl 0970.68052 · doi:10.1023/A:1007652502315
[65] Zhang, J.; Wang, Y.; Yang, D., Ccspan: mining closed contiguous sequential patterns, Knowl-Based Syst, 89, 1-13 (2015) · doi:10.1016/j.knosys.2015.06.014
[66] Zhao Q, Bhowmick SS (2003) Sequential pattern mining: a survey. ITechnical Report CAIS Nayang Technological University Singapore 1(26):135
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.