Decomposition of data mining algorithms into unified functional blocks. (English) Zbl 1400.68174

Summary: The present paper describes a method of building data mining algorithms from unified functional blocks. The method splits algorithms into independently functioning blocks, which must have unified interfaces and implement pure functions. It allows new data mining algorithms to be created from existing blocks and existing algorithms to be improved by optimizing single blocks or the whole structure of an algorithm. This is possible because of a number of important properties inherent in pure functions and hence in functional blocks.
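
The idea in the summary can be illustrated with a minimal sketch. It is not the paper's implementation: the interface `(dataset, model) -> model` and the names `count_items`, `keep_frequent`, and `compose` are all assumptions chosen for illustration. Each block is a pure function with the same signature, so blocks can be chained, swapped, or optimized independently.

```python
from functools import reduce
from typing import Callable, Tuple

# Hypothetical unified interface: every block maps (dataset, model) -> new model.
Dataset = Tuple[Tuple[str, ...], ...]   # immutable tuple of transactions
Model = Tuple[Tuple[str, int], ...]     # immutable (item, count) pairs
Block = Callable[[Dataset, Model], Model]


def count_items(data: Dataset, model: Model) -> Model:
    """Pure block: count how many transactions contain each item."""
    counts = dict(model)                # local copy; the input model is not mutated
    for transaction in data:
        for item in set(transaction):
            counts[item] = counts.get(item, 0) + 1
    return tuple(sorted(counts.items()))


def keep_frequent(min_count: int) -> Block:
    """Build a pure block that drops items below a count threshold."""
    def block(data: Dataset, model: Model) -> Model:
        return tuple((item, n) for item, n in model if n >= min_count)
    return block


def compose(*blocks: Block) -> Block:
    """Chain blocks left to right into one block with the same interface."""
    def algorithm(data: Dataset, model: Model = ()) -> Model:
        return reduce(lambda m, b: b(data, m), blocks, model)
    return algorithm


transactions: Dataset = (("a", "b"), ("a", "c"), ("a", "b", "c"), ("b",))
frequent_items = compose(count_items, keep_frequent(3))
result = frequent_items(transactions)
print(result)
```

Because no block touches shared state, calling `frequent_items` twice on the same input gives the same result, and replacing `keep_frequent(3)` with another block of the same signature yields a different algorithm without changing the rest of the chain.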

MSC:

68T05 Learning and adaptive systems in artificial intelligence

Software:

WEKA
Full Text: DOI

References:

[1] Agrawal, R.; Srikant, R., Fast algorithms for mining association rules in large databases, Proceedings of the 20th International Conference on Very Large Data Bases (VLDB ’94)
[2] Park, J. S.; Chen, M.-S.; Yu, P. S., Using a hash-based method with transaction trimming for mining association rules, IEEE Transactions on Knowledge and Data Engineering, 9, 5, 813-825, (1997) · doi:10.1109/69.634757
[3] Savasere, A.; Omiecinski, E.; Navathe, S., An efficient algorithm for mining association rules in large databases, Proceedings of the 21st International Conference on Very Large Data Bases (VLDB ’95)
[4] Knuth, D., The Art of Computer Programming, 1-4A, (2011), Addison-Wesley Professional · Zbl 1354.68001
[5] Cormen, T. H.; Leiserson, C. E.; Rivest, R.; Stein, C., Introduction to Algorithms, (2009), Cambridge, Mass, USA: MIT Press · Zbl 1187.68679
[6] Amdahl, G. M., Validity of the single processor approach to achieving large scale computing capabilities, Proceedings of the Spring Joint Computer Conference (AFIPS ’67), Thompson Books · doi:10.1145/1465482.1465560
[7] Peterson, J. L., Petri net theory and the modeling of systems, (1981), Englewood Cliffs, NJ, USA: Prentice-Hall · Zbl 0461.68059
[8] Murata, T., Petri nets: properties, analysis and applications, Proceedings of the IEEE, 77, 4, 541-580, (1989) · doi:10.1109/5.24143
[9] Voevodin, Vl. V.; Voevodin, V. V., Analytical methods and software tools for enhancing scalability of parallel applications, Proceedings of the Intelligent Conference on HiPer
[11] Common Lisp Language
[12] Kerdprasop, N.; Kerdprasop, K., Mining frequent patterns with functional programming, International Journal of Computer, Information, Systems and Control Engineering, 1, 1, 120-125, (2007)
[13] Allison, L., Models for machine learning and data mining in functional programming, Journal of Functional Programming, 15, 1, 15-32, (2005) · Zbl 1063.68082 · doi:10.1017/S0956796804005301
[14] Maimon, O.; Rokach, L., Decomposition Methodology for Knowledge Discovery and Data Mining: Theory and Applications, Machine Perception and Artificial Intelligence, (2005), World Scientific · Zbl 1122.68042
[15] Dietterich, T. G., Ensemble methods in machine learning, Proceedings of the 1st International Workshop on Multiple Classifier Systems, Springer
[16] Todorovski, L.; Džeroski, S., Combining multiple models with meta decision trees, Principles of Data Mining and Knowledge Discovery, Lecture Notes in Computer Science, 1910, 54-64, (2000), Berlin, Germany: Springer · doi:10.1007/3-540-45372-5_6
[18] Gonçalves, P. M.; Barros, R. S. M.; Vieira, D. C. L., On the use of data mining tools for data preparation in classification problems, Proceedings of the IEEE/ACIS 11th International Conference on Computer and Information Science (ICIS ’12), IEEE · doi:10.1109/icis.2012.79
[19] Waikato Environment for Knowledge Analysis (Weka)
[20] Witten, I. H.; Frank, E.; Trigg, L.; Hall, M.; Holmes, G.; Cunningham, S. J., Weka: practical machine learning tools and techniques with Java implementations, Proceedings of the Workshop on Emerging Knowledge Engineering and Connectionist-Based Information Systems (ICONIP/ANZIIS/ANNES ’99)
[22] Tippmann, S., Programming tools: adventures with R, Nature, 517, 7532, 109-110, (2014) · doi:10.1038/517109a
[23] Morandat, F.; Hill, B.; Osvald, L.; Vitek, J.; Noble, J., Evaluating the design of the R language: objects and functions for data analysis, Proceedings of the 26th European Conference on Object-Oriented Programming (ECOOP ’12), Springer
[24] Ghoting, A.; Kambadur, P.; Pednault, E.; Kannan, R., NIMBLE: a toolkit for the implementation of parallel data mining and machine learning algorithms on MapReduce, Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’11) · doi:10.1145/2020408.2020464
[25] Barendregt, H. P., The Lambda Calculus: Its Syntax and Semantics, Studies in Logic and the Foundations of Mathematics, 103, (1985), North-Holland · Zbl 0597.03009
[26] Church, A.; Rosser, J. B., Some properties of conversion, Transactions of the American Mathematical Society, 39, 3, 472-482, (1936) · Zbl 0014.38504 · doi:10.2307/1989762
[27] Joachimski, F., Confluence of the coinductive λ-calculus, Theoretical Computer Science, 311, 1–3, 105-119, (2004) · Zbl 1088.03014 · doi:10.1016/s0304-3975(03)00324-4