A survey of outlier detection in high dimensional data streams. (English) Zbl 1507.68256

MSC:

68T05 Learning and adaptive systems in artificial intelligence
68T09 Computational aspects of data analysis and big data
68-02 Research exposition (monographs, survey articles) pertaining to computer science

References:

[1] Aggarwal, C. C., Outlier Analysis (2013), Springer-Verlag: New York · Zbl 1291.68004
[2] Aggarwal, C. C., Data Mining: The Textbook (2015), Springer · Zbl 1311.68001
[3] Mokni, M.; Yassa, S.; Hajlaoui, J. E.; Chelouah, R.; Omri, M. N., Cooperative agents-based approach for workflow scheduling on fog-cloud computing, J. Ambient Intell. Humaniz. Comput., 1-20 (2021)
[4] L. Tran, L. Fan, C. Shahabi, Outlier detection in non-stationary data streams, in: Proceedings of the 31st International Conference on Scientific and Statistical Database Management, 2019, pp. 25-36.
[5] Sadik, S.; Gruenwald, L., Research issues in outlier detection for data streams, ACM SIGKDD Explor. Newsl., 15, 1, 33-40 (2014)
[6] Hemalatha, C. S.; Vaidehi, V.; Lakshmi, R., Minimal infrequent pattern based approach for mining outliers in data streams, Expert Syst. Appl., 42, 4, 1998-2012 (2015)
[7] Cai, S.; Li, S.; Yuan, G.; Hao, S.; Sun, R., MiFI-Outlier: Minimal infrequent itemset-based outlier detection approach on uncertain data stream, Knowl.-Based Syst., 191, Article 105268 pp. (2020)
[8] Wen, J.; Xie, G.; Zhang, D.; Qin, Z.; Xie, K.; Cao, J.; Li, X.; Wang, X., On-line anomaly detection with high accuracy, IEEE/ACM Trans. Netw., 26, 3, 1222-1235 (2018)
[9] Lee, Y. J.; Yeh, Y. R.; Wang, Y. C.F., Anomaly detection via online oversampling principal component analysis, IEEE Trans. Knowl. Data Eng., 25, 7, 1460-1470 (2013)
[10] Dong, Y.; Japkowicz, N., Threaded ensembles of autoencoders for stream learning, Comput. Intell., 34, 1, 261-281 (2018)
[11] K. Doshi, Y. Yilmaz, Continual learning for anomaly detection in surveillance videos, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020, pp. 254-255.
[12] Nawaratne, R.; Alahakoon, D.; De Silva, D.; Yu, X., Spatiotemporal anomaly detection using deep learning for real-time video surveillance, IEEE Trans. Ind. Inf., 16, 1, 393-402 (2019)
[13] Zhang, L.; Lin, J.; Karim, R., Sliding window-based fault detection from high-dimensional data streams, IEEE Trans. Syst. Man Cybern.: Syst., 47, 2, 289-303 (2017)
[14] Sharan, V.; Gopalan, P.; Wieder, U., Efficient anomaly detection via matrix sketching (Nips) (2018), arXiv:1804.03065
[15] Sadik, M. S.; Gruenwald, L., DBOD-DS: Distance based outlier detection for data streams, (International Conference on Database and Expert Systems Applications (2010), Springer), 122-136
[16] Tran, L.; Fan, L.; Shahabi, C., Distance-based outlier detection in data streams, Proc. VLDB Endow., 9, 12, 1089-1100 (2016)
[17] Cao, L.; Wang, J.; Rundensteiner, E. A., Sharing-aware outlier analytics over high-volume data streams, (Proceedings of the 2016 International Conference on Management of Data (2016), ACM), 527-540
[18] Pokrajac, D.; Lazarevic, A.; Latecki, L. J., Incremental local outlier detection for data streams, (2007 IEEE Symposium on Computational Intelligence and Data Mining (2007), IEEE), 504-515
[19] Salehi, M.; Leckie, C.; Bezdek, J. C.; Vaithianathan, T.; Zhang, X., Fast memory efficient local outlier detection in data streams, IEEE Trans. Knowl. Data Eng., 28, 12, 3246-3260 (2016)
[20] G.S. Na, D. Kim, H. Yu, DILOF: Effective and memory efficient local outlier detection in data streams, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018, pp. 1993-2002.
[21] Schubert, E., Generalized and Efficient Outlier Detection for Spatial, Temporal, and High-Dimensional Data Mining (2013), LMU, (Ph.D. thesis)
[22] Cao, L.; Yang, D.; Wang, Q.; Yu, Y.; Wang, J.; Rundensteiner, E. A., Scalable distance-based outlier detection over high-volume data streams, (2014 IEEE 30th International Conference on Data Engineering (2014), IEEE), 76-87
[23] H. Ye, H. Kitagawa, J. Xiao, Continuous angle-based outlier detection on high-dimensional data streams, in: Proceedings of the 19th International Database Engineering & Applications Symposium, 2015, pp. 162-167.
[24] Yang, D.; Wang, Y.; Li, Y.; Ma, X., A variable Markovian based outlier detection method for multi-dimensional sequence over data stream, (2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (2016), IEEE), 183-188
[25] Bao, H.; Wang, Y., A C-SVM based anomaly detection method for multi-dimensional sequence over data stream, (2016 IEEE 22nd International Conference on Parallel and Distributed Systems (2016), IEEE), 948-955
[26] Y. Liu, L. Zhang, Y. Guan, Sketch-based streaming PCA algorithm for network-wide traffic anomaly detection, in: Proceedings - International Conference on Distributed Computing Systems, 2010, pp. 807-816.
[27] Zhang, J.; Gao, Q.; Wang, H., SPOT: A system for detecting projected outliers from high-dimensional data streams, (International Conference on Database and Expert Systems Applications (2008)), 1628-1631
[28] Zhao, X.; Zhang, J.; Qin, X., LOMA: A local outlier mining algorithm based on attribute relevance analysis, Expert Syst. Appl., 84, 272-280 (2017)
[29] Hodge, V.; Austin, J., A survey of outlier detection methodologies, Artif. Intell. Rev., 22, 2, 85-126 (2004) · Zbl 1101.68023
[30] Zhang, Y.; Meratnia, N.; Havinga, P., A Taxonomy Framework for Unsupervised Outlier Detection Techniques for Multi-Type Data Sets (2007), Tech. Rep., Centre for Telematics and Information Technology, University of Twente
[31] Chandola, V.; Banerjee, A.; Kumar, V., Outlier detection: A survey, ACM Comput. Surv., 41, 3, 241 (2009)
[32] Aggarwal, C. C., Outlier Analysis, Second Edition (2017), Springer: Cham · Zbl 1353.68004
[33] Zhang, J., Advancements of outlier detection: A survey, ICST Trans. Scalable Inf. Syst., 13, 1, Article e2 pp. (2013)
[34] Zimek, A.; Filzmoser, P., There and back again: Outlier detection between statistical reasoning and data mining algorithms, Wiley Interdiscip. Rev.: Data Min. Knowl. Discov., 8, 6, 1-26 (2018)
[35] Domingues, R.; Filippone, M.; Michiardi, P.; Zouaoui, J., A comparative evaluation of outlier detection algorithms: Experiments and analyses, Pattern Recognit., 74, 406-421 (2018)
[36] Xu, X.; Liu, H.; Yao, M., Recent progress of anomaly detection, Complexity, 2019 (2019)
[37] Wang, H.; Bah, M. J.; Hammad, M., Progress in outlier detection techniques: A survey, IEEE Access, 7, 107964-108000 (2019)
[38] Smiti, A., A critical overview of outlier detection methods, Comp. Sci. Rev., 38, Article 100306 pp. (2020) · Zbl 1484.68197
[39] Zimek, A.; Campello, R. J.; Sander, J., Ensembles for unsupervised outlier detection: challenges and research questions (a position paper), ACM SIGKDD Explor. Newsl., 15, 1, 11-22 (2014)
[40] Ranshous, S.; Shen, S.; Koutra, D.; Harenberg, S.; Faloutsos, C.; Samatova, N. F., Anomaly detection in dynamic networks: a survey, Wiley Interdiscip. Rev. Comput. Stat., 7, 3, 223-247 (2015) · Zbl 07912769
[41] Chandola, V.; Banerjee, A.; Kumar, V., Anomaly detection for discrete sequences: A survey, IEEE Trans. Knowl. Data Eng., 24, 5, 823-839 (2010)
[42] Akoglu, L.; Tong, H.; Koutra, D., Graph based anomaly detection and description: a survey, Data Min. Knowl. Discov., 29, 3, 626-688 (2015)
[43] Gupta, M.; Gao, J.; Aggarwal, C. C.; Han, J., Outlier detection for temporal data: A survey, IEEE Trans. Knowl. Data Eng., 26, 9, 2250-2267 (2013)
[44] Zimek, A.; Schubert, E.; Kriegel, H.-P., A survey on unsupervised outlier detection in high-dimensional numerical data, Stat. Anal. Data Min.: ASA Data Sci. J., 5, 5, 363-387 (2012) · Zbl 07260336
[45] Aggarwal, C. C., High-dimensional outlier detection: the subspace method, (Outlier Analysis (2017), Springer), 149-184
[46] Xu, X.; Liu, H.; Li, L.; Yao, M., A comparison of outlier detection techniques for high-dimensional data, Int. J. Comput. Intell. Syst., 11, 1, 652-662 (2018)
[47] Thakkar, P.; Vala, J.; Vishal, P., Survey on outlier detection in data stream, Int. J. Comput. Appl., 136, 2, 975-8887 (2016)
[48] Chen, L.; Gao, S.; Cao, X., Research on real-time outlier detection over big data streams, Int. J. Comput. Appl., 7074, November, 1-9 (2017)
[49] Salehi, M.; Rashidi, L., A survey on anomaly detection in evolving data, ACM SIGKDD Explor. Newsl., 20, 1, 13-23 (2018)
[50] Sun, R.; Zhang, S.; Yin, C.; Wang, J.; Min, S., Strategies for data stream mining method applied in anomaly detection, Cluster Comput., 22, 2, 399-408 (2019)
[51] Mishra, S.; Chawla, M., A comparative study of local outlier factor algorithms for outliers detection in data streams, (Emerging Technologies in Data Mining and Information Security (2019), Springer), 347-356
[52] Alghushairy, O.; Alsini, R.; Soule, T.; Ma, X., A review of local outlier factor algorithms for outlier detection in big data streams, Big Data Cogn. Comput., 5, 1, 1 (2021)
[53] Hawkins, D. M., Identification of Outliers, Vol. 11 (1980), Springer · Zbl 0438.62022
[54] Grubbs, F. E., Procedures for detecting outlying observations in samples, Technometrics (1969)
[55] Barnett, V.; Lewis, T., Outliers in Statistical Data (1974), Wiley
[56] Krawczyk, B.; Minku, L. L.; Gama, J.; Stefanowski, J.; Woźniak, M., Ensemble learning for data stream analysis: A survey, Inf. Fusion, 37, 132-156 (2017)
[57] Aggarwal, C. C., Data Streams: Models and Algorithms (2007), Springer Science & Business Media · Zbl 1126.68033
[58] Nguyen, H.-L.; Woon, Y.-K.; Ng, W.-K., A survey on data stream clustering and classification, Knowl. Inf. Syst., 45, 3, 535-569 (2015)
[59] Gama, J.; Žliobaitė, I.; Bifet, A.; Pechenizkiy, M.; Bouchachia, A., A survey on concept drift adaptation, ACM Comput. Surv., 46, 4, 1-37 (2014) · Zbl 1305.68141
[60] Webb, G. I.; Hyde, R.; Cao, H.; Nguyen, H. L.; Petitjean, F., Characterizing concept drift, Data Min. Knowl. Discov., 30, 4, 964-994 (2016) · Zbl 1411.68127
[61] Barddal, J. P.; Gomes, H. M.; de Souza Britto, A.; Enembreck, F., A benchmark of classifiers on feature drifting data streams, (2016 23rd International Conference on Pattern Recognition (ICPR) (2016), IEEE), 2180-2185
[62] Barddal, J. P.; Gomes, H. M.; Enembreck, F.; Pfahringer, B., A survey on feature drift adaptation: Definition, benchmark, challenges and future directions, J. Syst. Softw., 127, 278-294 (2017)
[63] Zhang, J., Towards Outlier Detection for High-Dimensional Data Streams using Projected Outlier Analysis Strategy (2008), Dalhousie University: Canada, (Ph.D. thesis)
[64] Souiden, I., Outlier Detection in the Context of Data Stream Mining: Survey of Approaches and Cloud Computing Case Study (2017), ISIGK, (Ph.D. thesis)
[65] Beyer, K.; Goldstein, J.; Ramakrishnan, R.; Shaft, U., When is “nearest neighbor” meaningful?, (International Conference on Database Theory (1999), Springer), 217-235
[66] Angiulli, F., On the behavior of intrinsically high-dimensional spaces: Distances, direct and reverse nearest neighbors, and hubness, J. Mach. Learn. Res., 18, 1, 1-60 (2017) · Zbl 1471.62368
[67] Houle, M. E.; Kriegel, H. P.; Kröger, P.; Schubert, E.; Zimek, A., Can shared-neighbor distances defeat the curse of dimensionality?, (International Conference on Scientific and Statistical Database Management (2010), Springer), 482-500
[68] Andoni, A.; Indyk, P.; Razenshteyn, I., Approximate nearest neighbor search in high dimensions (2018), arXiv preprint arXiv:1806.09823 · Zbl 1490.68082
[69] Ditzler, G.; Roveri, M.; Alippi, C.; Polikar, R., Learning in nonstationary environments: A survey, IEEE Comput. Intell. Mag., 10, 4, 12-25 (2015)
[70] Liu, F. T.; Ting, K. M.; Zhou, Z.-H., Isolation-based anomaly detection, ACM Trans. Knowl. Discov. Data, 6, 1, 1-39 (2012)
[71] Wu, S.; Wang, S., Information-theoretic outlier detection for large-scale categorical data, IEEE Trans. Knowl. Data Eng., 25, 3, 589-602 (2013)
[72] Ting, K. M.; Zhou, G.-T.; Liu, F. T.; Tan, J. S.C., Mass estimation and its applications, (Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2010), ACM), 989-998
[73] Pevný, T., Loda: Lightweight on-line detector of anomalies, Mach. Learn., 102, 2, 275-304 (2016) · Zbl 1338.68236
[74] Aggarwal, C. C., Outlier ensembles: position paper, ACM SIGKDD Explor. Newsl., 14, 2, 49-58 (2013)
[75] Shou, Z.; Tian, H.; Li, S.; Zou, F., Outlier detection with enhanced angle-based outlier factor in high-dimensional data stream, Int. J. Innov. Comput. Inf. Control, 14, 5, 1633-1651 (2018)
[76] Lin, F.; Le, W.; Bo, J., Research on maximal frequent pattern outlier factor for online high-dimensional time-series outlier detection, J. Converg. Inf. Technol., 5, 66-71 (2010)
[77] Zhang, J.; Gao, Q.; Wang, H.; Liu, Q.; Xu, K., Detecting projected outliers in high-dimensional data streams, (International Conference on Database and Expert Systems Applications (2009), Springer), 629-644
[78] Zhang, J.; Li, H.; Gao, Q.; Wang, H.; Luo, Y., Detecting anomalies from big network traffic data using an adaptive detection approach, Inform. Sci., 318, August, 91-110 (2015)
[79] S.D. Bay, M. Schwabacher, Mining distance-based outliers in near linear time with randomization and a simple pruning rule, in: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2003, pp. 29-38.
[80] Ghoting, A.; Parthasarathy, S.; Otey, M. E., Fast mining of distance-based outliers in high-dimensional datasets, Data Min. Knowl. Discov., 16, 3, 349-364 (2008)
[81] H.-P. Kriegel, M. Schubert, A. Zimek, Angle-based outlier detection in high-dimensional data, in: Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2008, pp. 444-452.
[82] K. Bhaduri, B.L. Matthews, C.R. Giannella, Algorithms for speeding up distance-based outlier detection, in: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2011, pp. 859-867.
[83] Sugiyama, M.; Borgwardt, K., Rapid distance-based outlier detection via sampling, (Advances in Neural Information Processing Systems (2013)), 467-475
[84] Wu, Y.; Hoi, S. C.; Mei, T.; Yu, N., Large-scale online feature selection for ultra-high dimensional sparse data, ACM Trans. Knowl. Discov. Data, 11, 4, 48 (2017)
[85] Guyon, I.; Elisseeff, A., An introduction to variable and feature selection, J. Mach. Learn. Res., 3, 1157-1182 (2003) · Zbl 1102.68556
[86] Pang, G.; Xu, H.; Cao, L.; Zhao, W., Selective value coupling learning for detecting outliers in high-dimensional categorical data, (Proceedings of the 2017 ACM on Conference on Information and Knowledge Management (2017), ACM), 807-816
[87] Pang, G.; Cao, L.; Chen, L.; Lian, D.; Liu, H., Sparse modeling-based sequential ensemble learning for effective outlier detection in high-dimensional numeric data, (32nd AAAI Conference on Artificial Intelligence (2018))
[88] Moradi Koupaie, H.; Ibrahim, S.; Hosseinkhan, L., Outlier detection in stream data by machine learning and feature selection methods, Int. J. Adv. Comput. Sci. Inf. Technol., 2, 3, 17-24 (2013)
[89] Almusallam, N. Y.; Tari, Z.; Bertok, P.; Zomaya, A. Y., Dimensionality reduction for intrusion detection systems in multi-data streams—A review and proposal of unsupervised feature selection scheme, (Emergent Computation (2017), Springer), 467-487
[90] Jolliffe, I., Choosing a subset of principal components or variables, (Principal Component Analysis (1986), Springer), 92-114
[91] Li, H.; Jiang, T.; Zhang, K., Efficient and robust feature extraction by maximum margin criterion, (Advances in Neural Information Processing Systems (2004)), 97-104
[92] Martínez, A. M.; Kak, A. C., PCA versus LDA, IEEE Trans. Pattern Anal. Mach. Intell., 23, 2, 228-233 (2001)
[93] Roweis, S. T.; Saul, L. K., Nonlinear dimensionality reduction by locally linear embedding, Science, 290, 5500, 2323-2326 (2000)
[94] Tenenbaum, J. B.; De Silva, V.; Langford, J. C., A global geometric framework for nonlinear dimensionality reduction, Science, 290, 5500, 2319-2323 (2000)
[95] Belkin, M.; Niyogi, P., Laplacian eigenmaps and spectral techniques for embedding and clustering, (Advances in Neural Information Processing Systems (2002)), 585-591
[96] Hinton, G. E.; Salakhutdinov, R. R., Reducing the dimensionality of data with neural networks, Science, 313, 5786, 504-507 (2006) · Zbl 1226.68083
[97] Müller, E.; Schiffer, M.; Seidl, T., Statistical selection of relevant subspace projections for outlier ranking, (2011 IEEE 27th International Conference on Data Engineering (2011), IEEE), 434-445
[98] Keller, F.; Müller, E.; Böhm, K., HiCS: high contrast subspaces for density-based outlier ranking, (2012 IEEE 28th International Conference on Data Engineering (2012), IEEE), 1037-1048
[99] Zhang, J.; Yu, X.; Li, Y.; Zhang, S.; Xun, Y.; Qin, X., A relevant subspace based contextual outlier mining algorithm, Knowl.-Based Syst., 99, 1-9 (2016)
[100] A. Vanea, E. Müller, F. Keller, K. Böhm, Instant selection of high contrast projections in multi-dimensional data streams, in: Proceedings of the Workshop on Instant Interactive Data Mining (IID 2012) in Conjunction with ECML PKDD, 2012.
[101] Zhang, J.; Zhang, S.; Chang, K. H.; Qin, X., An outlier mining algorithm based on constrained concept lattice, Internat. J. Systems Sci., 45, 5, 1170-1179 (2014) · Zbl 1284.68544
[102] Lazarevic, A.; Kumar, V., Feature bagging for outlier detection, (Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (2005), ACM), 157-166
[103] T. Pevny, Anomaly detection by bagging, in: Proceedings of the 2013 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2013.
[104] Tan, S. C.; Ting, K. M.; Liu, T. F., Fast anomaly detection for streaming data, (The 22nd International Joint Conference on Artificial Intelligence (2011))
[105] E. Manzoor, H. Lamba, L. Akoglu, xStream: Outlier Dete‘x’ion in feature-evolving data streams, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018.
[106] Sathe, S.; Aggarwal, C. C., Subspace outlier detection in linear time with randomized hashing, (2016 IEEE 16th International Conference on Data Mining (2016), IEEE), 459-468
[107] Nguyen, H. V.; Müller, E.; Böhm, K., 4S: Scalable subspace search scheme overcoming traditional apriori processing, (2013 IEEE International Conference on Big Data (2013), IEEE), 359-367
[108] Nguyen, H. V.; Müller, E.; Vreeken, J.; Keller, F.; Böhm, K., CMI: An information-theoretic contrast measure for enhancing subspace cluster and outlier detection, (Proceedings of the 2013 SIAM International Conference on Data Mining (2013), SIAM), 198-206
[109] Aggarwal, C. C.; Sathe, S., Theoretical foundations and algorithms for outlier ensembles, ACM SIGKDD Explor. Newsl., 17, 1, 24-47 (2015)
[110] Kriegel, H. P.; Kröger, P.; Schubert, E.; Zimek, A., Outlier detection in arbitrarily oriented subspaces, (2012 IEEE 12th International Conference on Data Mining (2012), IEEE), 379-388
[111] Tran, L.; Mun, M. Y.; Shahabi, C., Real-time distance-based outlier detection in data streams, Proc. VLDB Endow., 14, 2, 141-153 (2020)
[112] Chen, L.; Wang, W.; Yang, Y., CELOF: Effective and fast memory efficient local outlier detection in high-dimensional data streams, Appl. Soft Comput., 102, Article 107079 pp. (2021)
[113] Khalique, V.; Kitagawa, H., VOA*: Fast angle-based outlier detection over high-dimensional data streams, (Pacific-Asia Conference on Knowledge Discovery and Data Mining (2021), Springer), 40-52
[114] HewaNadungodage, C.; Xia, Y.; Lee, J. J., GPU-accelerated outlier detection for continuous data streams, (2016 IEEE International Parallel and Distributed Processing Symposium (2016), IEEE), 1133-1142
[115] Yu, K.; Shi, W.; Santoro, N.; Ma, X., Real-time outlier detection over streaming data, (2019 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (2019), IEEE), 125-132
[116] Qin, X.; Cao, L.; Rundensteiner, E. A.; Madden, S., Scalable kernel density estimation-based local outlier detection over large data streams, (EDBT (2019)), 421-432
[117] S. Yoon, J.-G. Lee, B.S. Lee, Ultrafast local outlier detection from a data stream with stationary region skipping, in: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020, pp. 1181-1191.
[118] Zhou, X.; Li, S.; Chang, C.; Wu, J.; Liu, K., Information-value-based feature selection algorithm for anomaly detection over data streams, Teh. Vjesn./Tech. Gaz., 21, 2 (2014)
[119] Li, B.; Wang, Y.-j.; Yang, D.-s.; Li, Y.-m.; Ma, X.-k., FAAD: an unsupervised fast and accurate anomaly detection method for a multi-dimensional sequence over data stream, Front. Inf. Technol. Electron. Eng., 20, 3, 388-404 (2019)
[120] Benjelloun, F.-Z.; Oussous, A.; Bennani, A.; Belfkih, S.; Lahcen, A. A., Improving outliers detection in data streams using LiCS and voting, J. King Saud Univ.-Comput. Inf. Sci. (2019)
[121] Su, S.; Sun, Y.; Gao, X.; Qiu, J.; Tian, Z., A correlation-change based feature selection method for IoT equipment anomaly detection, Appl. Sci., 9, 3, 437 (2019)
[122] Xue, L.; Chen, Y.; Luo, M.; Peng, Z.; Liu, J., An anomaly detection framework for time-evolving attributed networks, Neurocomputing, 407, 39-49 (2020)
[123] Huang, L.; Nguyen, X.; Garofalakis, M.; Jordan, M. I.; Joseph, A.; Taft, N., In-network PCA and anomaly detection, (Advances in Neural Information Processing Systems (2007)), 617-624
[124] Jiang, R.; Fei, H.; Huan, J., A family of joint sparse PCA algorithms for anomaly localization in network data streams, IEEE Trans. Knowl. Data Eng., 25, 11, 2421-2433 (2013)
[125] Bhushan, A.; Sharker, M. H.; Karimi, H. A., Incremental principal component analysis based outlier detection methods for spatiotemporal data streams, ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci., 2, 4W2, 67-71 (2015)
[126] Hong, D.; Zhao, D.; Zhang, Y., The entropy and PCA based anomaly prediction in data streams, Procedia Comput. Sci., 96, 139-146 (2016)
[127] Kurt, M. N.; Yilmaz, Y.; Wang, X., Real-time nonparametric anomaly detection in high-dimensional settings, IEEE Trans. Pattern Anal. Mach. Intell. (2018)
[128] Pham, D. S.; Venkatesh, S.; Lazarescu, M.; Budhaditya, S., Anomaly detection in large-scale data stream networks, Data Min. Knowl. Discov., 28, 1, 145-189 (2014) · Zbl 1281.68099
[129] Huang, H.; Kasiviswanathan, S. P., Streaming anomaly detection using randomized matrix sketching, Proc. VLDB Endow., 9, 3, 192-203 (2016)
[130] Kathareios, G.; Anghel, A.; Mate, A.; Clauberg, R.; Gusat, M., Catch it if you can: Real-time network anomaly detection with low false alarm rates, (Proceedings of the 16th IEEE International Conference on Machine Learning and Applications (2017), IEEE), 924-929
[131] W. Yu, W. Cheng, C.C. Aggarwal, K. Zhang, H. Chen, W. Wang, NetWalk: A flexible deep embedding approach for anomaly detection in dynamic networks, in: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018, pp. 2672-2681.
[132] S. Bhatia, A. Jain, P. Li, R. Kumar, B. Hooi, MStream: Fast anomaly detection in multi-aspect streams, in: Proceedings of the Web Conference 2021, 2021, pp. 3371-3382.
[133] Bhatia, S.; Jain, A.; Srivastava, S.; Kawaguchi, K.; Hooi, B., MemStream: Memory-based anomaly detection in multi-aspect streams with concept drift (2021), arXiv preprint arXiv:2106.03837
[134] Francis, D. P.; Raimond, K., A random fourier features based streaming algorithm for anomaly detection in large datasets, (Advances in Big Data and Cloud Computing (2018), Springer), 209-217
[135] Francis, D. P.; Raimond, K., A fast and accurate explicit kernel map, Appl. Intell., 50, 3, 647-662 (2020)
[136] Fouché, E.; Kalinke, F.; Böhm, K., Efficient subspace search in data streams, Inf. Syst., 97, Article 101705 pp. (2021)
[137] M.M. Breunig, H.-P. Kriegel, R.T. Ng, J. Sander, LOF: identifying density-based local outliers, in: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 2000, pp. 93-104.
[138] D. Cai, C. Zhang, X. He, Unsupervised feature selection for multi-cluster data, in: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2010, pp. 333-342.
[139] Vempala, S. S., The Random Projection Method, Vol. 65 (2005), American Mathematical Soc.
[140] Johnson, W. B.; Lindenstrauss, J., Extensions of Lipschitz mappings into a Hilbert space, Contemp. Math., 26, 189-206 (1984) · Zbl 0539.46017
[141] Chalapathy, R.; Chawla, S., Deep learning for anomaly detection: A survey (2019), arXiv:1901.03407
[142] Dong, Y.; Japkowicz, N., Threaded ensembles of supervised and unsupervised neural networks for stream learning, (Canadian Conference on Artificial Intelligence (2016), Springer), 304-315
[143] J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 779-788.
[144] E. Ilg, N. Mayer, T. Saikia, M. Keuper, A. Dosovitskiy, T. Brox, FlowNet 2.0: Evolution of optical flow estimation with deep networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2462-2470.
[145] Tishby, N.; Pereira, F. C.; Bialek, W., The information bottleneck method (2000), arXiv preprint physics/0004057
[146] Yang, S.; Zhou, W., Anomaly detection on collective moving patterns: Manifold learning based analysis of traffic streams, (2011 IEEE Third International Conference on Privacy, Security, Risk and Trust and 2011 IEEE Third International Conference on Social Computing (2011), IEEE), 704-707
[147] E. Fouché, J. Komiyama, K. Böhm, Scaling multi-armed bandit algorithms, in: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019, pp. 1449-1459.
[148] Zhang, J.; Gao, Q.; Wang, H., Anomaly detection in high-dimensional network data streams: A case study, (2008 IEEE International Conference on Intelligence and Security Informatics (2008), IEEE), 251-253
[149] Sathe, S.; Aggarwal, C. C., Subspace histograms for outlier detection in linear time, Knowl. Inf. Syst., 1-25 (2018)
[150] Fawcett, T., An introduction to ROC analysis, Pattern Recognit. Lett., 27, 8, 861-874 (2006)
[151] Boukhari, K.; Omri, M. N., Approximate matching-based unsupervised document indexing approach: application to biomedical domain, Scientometrics, 1-22 (2020)
[152] García, S.; Ramírez-Gallego, S.; Luengo, J.; Benítez, J. M.; Herrera, F., Big data preprocessing: methods and prospects, Big Data Anal., 1, 1, 9 (2016)
[153] Schneider, M.; Ertel, W.; Ramos, F., Expected similarity estimation for large-scale batch and streaming anomaly detection, Mach. Learn., 105, 3, 305-333 (2016) · Zbl 1432.94044
[154] Uzilov, A. V.; Keegan, J. M.; Mathews, D. H., Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change, BMC Bioinformatics, 7, 1, 173 (2006)
[155] Caruana, R.; Joachims, T.; Backstrom, L., KDD-Cup 2004: results and analysis, ACM SIGKDD Explor. Newsl., 6, 2, 95-108 (2004)
[156] W. Kim, A. Roopakalu, K.Y. Li, V.S. Pai, Understanding and characterizing PlanetLab resource usage for federated network testbeds, in: Proceedings of the 2011 ACM SIGCOMM Conference on Internet Measurement Conference, 2011, pp. 515-532.
[157] Lin, Y.; Lee, B. S.; Lustgarten, D., Continuous detection of abnormal heartbeats from ECG using online outlier detection, (Annual International Symposium on Information Management and Big Data (2018), Springer), 349-366
[158] Sharafaldin, I.; Lashkari, A. H.; Ghorbani, A. A., Toward generating a new intrusion detection dataset and intrusion traffic characterization, ICISSP, 1, 108-116 (2018)
[159] Moustafa, N.; Slay, J., UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set), (2015 Military Communications and Information Systems Conference (MilCIS) (2015), IEEE), 1-6
[160] W. Luo, W. Liu, S. Gao, A revisit of sparse coding based anomaly detection in stacked RNN framework, in: Proceedings of the IEEE International Conference on Computer Vision, 2017, pp. 341-349.
[161] Angiulli, F., CFOF: A concentration free measure for anomaly detection, ACM Trans. Knowl. Discov. Data, 14, 1, 1-53 (2020)
[162] Aggarwal, C. C.; Sathe, S., Outlier Ensembles (2017), Springer
[163] Ruff, L.; Görnitz, N.; Deecke, L.; Siddiqui, S. A.; Vandermeulen, R.; Binder, A.; Müller, E.; Kloft, M., Deep one-class classification, (International Conference on Machine Learning (2018)), 4390-4399
[164] Zhang, Q.; Yang, L. T.; Chen, Z.; Li, P., A survey on deep learning for big data, Inf. Fusion, 42, 146-157 (2018)
[165] Settles, B., Active Learning Literature Survey, Tech. rep. (2009), University of Wisconsin-Madison Department of Computer Sciences
[166] Das, S.; Islam, M. R.; Jayakodi, N. K.; Doppa, J. R., Active anomaly detection via ensembles: Insights, algorithms, and interpretability (2019), arXiv:1901.08930
[167] Liu, N.; Shin, D.; Hu, X., Contextual outlier interpretation (2017), arXiv preprint arXiv:1711.10589
[168] Jiang, Y.; Zeng, C.; Xu, J.; Li, T., Real time contextual collective anomaly detection over multiple data streams, (Workshop on Outlier Detection & Description under Data Diversity (2014))
[169] Hayes, M. A.; Capretz, M. A., Contextual anomaly detection framework for big sensor data, J. Big Data, 2, 1, 1-22 (2015)
[170] Ahmad, H.; Dowaji, S., A novel framework for context-aware outlier detection in big data streams, J. Digit. Inf. Manage., 16, 5, 213-222 (2018)
[171] Liang, J.; Parthasarathy, S., Robust contextual outlier detection: Where context meets sparsity, (Proceedings of the 25th ACM International on Conference on Information and Knowledge Management (2016), ACM), 2167-2172
[172] Aleroud, A.; Karabatis, G., Contextual information fusion for intrusion detection: a survey and taxonomy, Knowl. Inf. Syst., 52, 3, 563-619 (2017)
[173] Dietterich, T. G.; Zemicheal, T., Anomaly detection in the presence of missing values (2018), arXiv preprint arXiv:1809.01605
[174] Wei, Y.; Tang, Y.; Nicholas, P. D., Flexible high-dimensional unsupervised learning with missing data, IEEE Trans. Pattern Anal. Mach. Intell. (2018)
[175] de Vries, T.; Chawla, S.; Houle, M. E., Density-preserving projections for large-scale local anomaly detection, Knowl. Inf. Syst., 32, 1, 25-52 (2012)
[176] Kirner, E.; Schubert, E.; Zimek, A., Good and bad neighborhood approximations for outlier detection ensembles, (Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), LNCS, vol. 10609 (2017)), 173-187, arXiv:1011.1669v3
[177] Law, Y. N.; Zaniolo, C., An adaptive nearest neighbor classification algorithm for data streams, (European Conference on Principles of Data Mining and Knowledge Discovery (2005), Springer), 108-120
[178] Ramírez-Gallego, S.; Krawczyk, B.; García, S.; Woźniak, M.; Benítez, J. M.; Herrera, F., Nearest neighbor classification for high-speed big data streams using spark, IEEE Trans. Syst. Man Cybern.: Syst., 47, 10, 2727-2739 (2017)
[179] Sundaram, N.; Turmukhametova, A.; Satish, N.; Mostak, T.; Indyk, P.; Madden, S.; Dubey, P., Streaming similarity search over one billion tweets using parallel locality-sensitive hashing, Proc. VLDB Endow., 6, 14, 1930-1941 (2013)
[180] Suri, N. M.R.; Athithan, G., Outlier Detection: Techniques and Applications (2019), Springer
[181] Kennedy, J., Swarm intelligence, (Handbook of Nature-Inspired and Innovative Computing (2006), Springer), 187-219
[182] Pang, G.; Cao, L.; Chen, L.; Liu, H., Learning representations of ultrahigh-dimensional data for random distance-based outlier detection, (Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (2018), ACM), 2041-2050
[183] Ali, A. M.; Angelov, P.; Gu, X., Detecting anomalous behaviour using heterogeneous data, (Advances in Computational Intelligence Systems (2017), Springer), 253-273
[184] Kriegel, H. P.; Schubert, E.; Zimek, A., The (black) art of runtime evaluation: Are we comparing algorithms or implementations?, Knowl. Inf. Syst., 52, 2, 341-378 (2017)
[185] Campos, G. O.; Zimek, A.; Sander, J.; Campello, R. J.; Micenková, B.; Schubert, E.; Assent, I.; Houle, M. E., On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study, Data Min. Knowl. Discov., 30, 4, 891-927 (2016)
[186] Marques, H. O.; Campello, R. J.; Zimek, A.; Sander, J., On the internal evaluation of unsupervised outlier detection, (Proceedings of the 27th International Conference on Scientific and Statistical Database Management (2015), ACM), 1-12
[187] Macha, M.; Akoglu, L., Explaining anomalies in groups with characterizing subspace rules, Data Min. Knowl. Discov., 32, 5, 1444-1480 (2018) · Zbl 1416.62350
[188] Bin, X.; Zhao, Y.; Shen, B., Abnormal subspace sparse PCA for anomaly detection and interpretation (2016), arXiv preprint arXiv:1605.04644
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.