×

Time and space complexity of deterministic and nondeterministic decision trees. (English) Zbl 07644142

Summary: In this paper, we study arbitrary infinite binary information systems each of which consists of an infinite set called universe and an infinite set of two-valued functions (attributes) defined on the universe. We consider the notion of a problem over information system, which is described by a finite number of attributes and a mapping associating a decision to each tuple of attribute values. As algorithms for problem solving, we use deterministic and nondeterministic decision trees. As time and space complexity, we study the depth and the number of nodes in the decision trees. In the worst case, with the growth of the number of attributes in the problem description, (i) the minimum depth of deterministic decision trees grows either almost as logarithm or linearly, (ii) the minimum depth of nondeterministic decision trees either is bounded from above by a constant or grows linearly, (iii) the minimum number of nodes in deterministic decision trees has either polynomial or exponential growth, and (iv) the minimum number of nodes in nondeterministic decision trees has either polynomial or exponential growth. Based on these results, we divide the set of all infinite binary information systems into five complexity classes, and study for each class issues related to time-space trade-off for decision trees.

MSC:

68Txx Artificial intelligence

References:

[1] AbouEisha, H., Amin, T., Chikalov, I., Hussain, S., Moshkov, M.: Extensions of Dynamic Programming for Combinatorial Optimization and Data Mining. Intelligent Systems Reference Library, vol. 146. Springer (2019)
[2] Alsolami, F., Azad, M., Chikalov, I., Moshkov, M.: Decision and Inhibitory Trees and Rules for Decision Tables with Many-valued Decisions. Intelligent Systems Reference Library, vol. 156. Springer (2020) · Zbl 1415.68002
[3] Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and Regression Trees. Wadsworth (1984) · Zbl 0541.62042
[4] Moshkov, M.: Time complexity of decision trees. In: Trans. Rough Sets III. Lecture Notes in Computer Science, vol. 3400, pp. 244-459. Springer (2005) · Zbl 1117.68071
[5] Moshkov, M., Zielosko, B.: Combinatorial Machine Learning - A Rough Set Approach. Studies in Computational Intelligence, vol. 360. Springer (2011) · Zbl 1246.68010
[6] Rokach, L., Maimon, O.: Data Mining with Decision Trees - Theory and Applications. Series in Machine Perception and Artificial Intelligence, vol. 69. WorldScientific (2007) · Zbl 1154.68098
[7] Boros, E.; Hammer, PL; Ibaraki, T.; Kogan, A., Logical analysis of numerical data, Math. Program., 79, 163-190 (1997) · Zbl 0887.90179 · doi:10.1007/BF02614316
[8] Boros, E.; Hammer, PL; Ibaraki, T.; Kogan, A.; Mayoraz, E.; Muchnik, IB, An implementation of logical analysis of data, IEEE Trans. Knowl. Data Eng., 12, 2, 292-306 (2000) · doi:10.1109/69.842268
[9] Chikalov, I., Lozin, V.V., Lozina, I., Moshkov, M., Nguyen, H.S., Skowron, A., Zielosko, B.: Three Approaches to Data Analysis - Test Theory, Rough Sets and Logical Analysis of Data. Intelligent Systems Reference Library, vol. 41. Springer (2013) · Zbl 1254.68005
[10] Fürnkranz, J., Gamberger, D., Lavrac, N.: Foundations of Rule Learning. Cognitive Technologies. Springer (2012) · Zbl 1263.68002
[11] Lejeune, MA; Lozin, VV; Lozina, I.; Ragab, A.; Yacout, S., Recent advances in the theory and practice of logical analysis of data, Eur. J. Oper. Res., 275, 1, 1-15 (2019) · Zbl 1430.90495 · doi:10.1016/j.ejor.2018.06.011
[12] Moshkov, M., Piliszczuk, M., Zielosko, B.: Partial Covers, Reducts and Decision Rules in Rough Sets - Theory and Applications. Studies in Computational Intelligence, vol. 145. Springer (2008) · Zbl 1158.68045
[13] Pawlak, Z.: Rough Sets - Theoretical Aspects of Reasoning About Data. Theory and Decision Library: Series D, vol. 9. Kluwer (1991) · Zbl 0758.68054
[14] Pawlak, Z.; Skowron, A., Rudiments of rough sets, Inf. Sci., 177, 1, 3-27 (2007) · Zbl 1142.68549 · doi:10.1016/j.ins.2006.06.003
[15] Abdelhalim, A., Traoré, I., Sayed, B.: RBDT-1: A new rule-based decision tree generation technique. In: Governatori, G, Hall, J., Paschke, A. (eds.) Rule Interchange and Applications, International Symposium, RuleML 2009, Las Vegas, Nevada, USA, November 5-7, 2009. Lecture Notes in Computer Science, vol. 5858, pp. 108-121. Springer (2009)
[16] Abdelhalim, A.; Traoré, I.; Nakkabi, Y., Creating decision trees from rules using RBDT-1, Comput. Intell., 32, 2, 216-239 (2016) · doi:10.1111/coin.12049
[17] Aglin, G., Nijssen, S., Schaus, P.: Learning optimal decision trees using caching branch-and-bound search. In: 34th AAAI Conference on Artificial Intelligence, AAAI 2020, pp. 3146-3153 (2020)
[18] Avellaneda, F.: Efficient inference of optimal decision trees. In: 34th AAAI Conference on Artificial Intelligence, AAAI 2020, pp. 3195-3202 (2020)
[19] Bertsimas, D.; Dunn, J., Optimal classification trees, Mach. Learn., 106, 7, 1039-1082 (2017) · Zbl 1455.68159 · doi:10.1007/s10994-017-5633-9
[20] Demirovic, E.; Lukina, A.; Hebrard, E.; Chan, J.; Bailey, J.; Leckie, C.; Ramamohanarao, K.; Stuckey, PJ, Murtree: Optimal decision trees via dynamic programming and search, J. Mach. Learn. Res., 23, 26-12647 (2022) · Zbl 07625179
[21] Przybyla-Kasperek, M., Aning, S.: Bagging and single decision tree approaches to dispersed data. In: Paszynski, M., Kranzlmüller, D., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds.) Computational Science - ICCS 2021 - 21st International Conference, Krakow, Poland, June 16-18, 2021, Proceedings, Part III. Lecture Notes in Computer Science, vol. 12744, pp. 420-427. Springer (2021)
[22] Kaufman, K.A., Michalski, R.S., Pietrzykowski, J., Wojtusiak, J.: An integrated multi-task inductive database VINLEN: initial implementation and early results. In: Dzeroski, S., Struyf, J. (eds.) Knowledge Discovery in Inductive Databases, 5th International Workshop, KDID 2006, Berlin, Germany, September 18, 2006, Revised Selected and Invited Papers. Lecture Notes in Computer Science, vol. 4747, pp. 116-133. Springer (2006)
[23] Narodytska, N., Ignatiev, A., Pereira, F., Marques-Silva, J.: Learning optimal decision trees with SAT. In: 27th International Joint Conference on Artificial Intelligence, IJCAI 2018, pp. 1362-1368 (2018) · Zbl 1511.68249
[24] Verwer, S., Zhang, Y.: Learning optimal classification trees using a binary linear program formulation. In: 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, pp. 1625-1632 (2019)
[25] Zabinski, K.; Zielosko, B., Decision rules construction: Algorithm based on EAV model, Entropy, 23, 1, 14 (2021) · doi:10.3390/e23010014
[26] Zielosko, B., Zabinski, K.: Selected approaches for decision rules construction-comparative study. In: Knowledge-Based and Intelligent Information & Engineering Systems: 25th International Conference KES-2021, Virtual Event / Szczecin, Poland, 8-10 September 2021. Procedia Computer Science, vol. 192, pp. 3667-3676. Elsevier (2021)
[27] Molnar, C.: Interpretable Machine Learning. A Guide for Making Black Box Models Explainable, 2nd edn. christophm.github.io/interpretable-ml-book/ (2022)
[28] Wegener, I., Time-space trade-offs for branching programs, J. Comput. Syst. Sci., 32, 1, 91-96 (1986) · Zbl 0593.68038 · doi:10.1016/0022-0000(86)90004-8
[29] Beame, P.; Jayram, TS; Saks, ME, Time-space tradeoffs for branching programs, J. Comput. Syst. Sci., 63, 4, 542-572 (2001) · Zbl 1052.68049 · doi:10.1006/jcss.2001.1778
[30] Chikalov, I.; Hussain, S.; Moshkov, M., Totally optimal decision trees for Boolean functions, Discret. Appl. Math., 215, 1-13 (2016) · Zbl 1403.94133 · doi:10.1016/j.dam.2016.07.009
[31] Dobkin, DP; Lipton, RJ, A lower bound of the (1/2)n2 on linear search programs for the knapsack problem, J. Comput. Syst. Sci., 16, 3, 413-417 (1978) · Zbl 0397.68045 · doi:10.1016/0022-0000(78)90026-0
[32] Dobkin, DP; Lipton, RJ, On the complexity of computations under varying sets of primitives, J. Comput. Syst. Sci., 18, 1, 86-91 (1979) · Zbl 0409.68023 · doi:10.1016/0022-0000(79)90054-0
[33] Morávek, J., A localization problem in geometry and complexity of discrete programming, Kybernetika, 8, 6, 498-516 (1972) · Zbl 0248.90044
[34] Ben-Or, M.: Lower bounds for algebraic computation trees (preliminary report). In: 15th Annual ACM Symposium on Theory of Computing, STOC 1983, pp. 80-86 (1983)
[35] Gabrielov, A.; Vorobjov, N., On topological lower bounds for algebraic computation trees, Found. Comput. Math., 17, 1, 61-72 (2017) · Zbl 1365.14080 · doi:10.1007/s10208-015-9283-7
[36] Grigoriev, D., Karpinski, M., Vorobjov, N.: Improved lower bound on testing membership to a polyhedron by algebraic decision trees. In: 36th Annual Symposium on Foundations of Computer Science, FOCS 1995, pp. 258-265 (1995) · Zbl 0938.68868
[37] Grigoriev, D.; Karpinski, M.; Yao, AC, An exponential lower bound on the size of algebraic decision trees for Max, Comput. Complex., 7, 3, 193-203 (1998) · Zbl 0918.68032 · doi:10.1007/s000370050010
[38] Steele, JM; Yao, AC, Lower bounds for algebraic decision trees, J. Algorithms, 3, 1, 1-8 (1982) · Zbl 0477.68065 · doi:10.1016/0196-6774(82)90002-5
[39] Yao, A.C.: Algebraic decision trees and Euler characteristics. In: 33rd Annual Symposium on Foundations of Computer Science, FOCS 1992, pp. 268-277 (1992) · Zbl 0977.68556
[40] Yao, A.C.: Decision tree complexity and Betti numbers. In: 26th Annual ACM Symposium on Theory of Computing, STOC 1994, pp. 615-624 (1994) · Zbl 1345.68297
[41] Moshkov, M., Optimization problems for decision trees, Fundam. Inform., 21, 4, 391-401 (1994) · Zbl 0824.68090 · doi:10.3233/FI-1994-2147
[42] Moshkov, M.: Decision Trees. Theory and Applications (in Russian). Nizhny Novgorod University Publishers (1994) · Zbl 0939.68601
[43] Moshkov, M.: Two approaches to investigation of deterministic and nondeterministic decision trees complexity. In: 2nd World Conference on the fundamentals of artificial intelligence, WOCFAI 1995, pp. 275-280 (1995)
[44] Moshkov, M., Comparative analysis of deterministic and nondeterministic decision tree complexity, Global approach Fundam. Inform., 25, 2, 201-214 (1996) · Zbl 0840.68090
[45] Pawlak, Z., Information systems theoretical foundations, Inf. Syst., 6, 3, 205-218 (1981) · Zbl 0462.68078 · doi:10.1016/0306-4379(81)90023-5
[46] Blum, M., Impagliazzo, R.: Generic oracles and oracle classes (extended abstract). In: 28th Annual Symposium on Foundations of Computer Science, Los Angeles, California, USA, 27-29 October 1987, pp. 118-126 (1987)
[47] Hartmanis, J., Hemachandra, L.A.: One-way functions, robustness, and the non-isomorphism of NP-complete sets. In: 2nd Annual Conference on Structure in Complexity Theory, Cornell University, Ithaca, New York, USA, June 16-19, 1987 (1987) · Zbl 0718.03031
[48] Moshkov, M., About the depth of decision trees computing Boolean functions, Fundam. Inform., 22, 3, 203-215 (1995) · Zbl 0813.06012 · doi:10.3233/FI-1995-2231
[49] Tardos, G., Query complexity, or why is it difficult to separate NPA ∩ coNPA from PA by random oracles A?, Comb., 9, 4, 385-392 (1989) · Zbl 0698.68051
[50] Buhrman, H.; de Wolf, R., Complexity measures and decision tree complexity: A survey, Theor. Comput. Sci., 288, 1, 21-43 (2002) · Zbl 1061.68058 · doi:10.1016/S0304-3975(01)00144-X
[51] Dobkin, DP; Lipton, RJ, Multidimensional searching problems, SIAM J. Comput., 5, 2, 181-186 (1976) · Zbl 0333.68031 · doi:10.1137/0205015
[52] Moshkov, M., On conditional tests, Sov. Phys. Dokl., 27, 528-530 (1982) · Zbl 0505.90055
[53] Meyer auf der Heide, F.: A polynomial linear search algorithm for the n-dimensional knapsack problem. In: 15th Annual ACM Symposium on Theory of Computing, STOC 1983, pp. 70-79 (1983)
[54] Moshkov, M., Classification of infinite information systems depending on complexity of decision trees and decision rule systems, Fundam. Inform., 54, 4, 345-368 (2003) · Zbl 1111.68705
[55] Moshkov, M.: Comparative analysis of deterministic and nondeterministic decision tree complexity. Local approach. In: Trans. Rough Sets IV. Lecture Notes in Computer Science, vol. 3700, pp. 125-143. Springer (2005) · Zbl 1136.68534
[56] Moshkov, M.: Comparative Analysis of Deterministic and Nondeterministic Decision Trees. Intelligent Systems Reference Library, vol. 179. Springer (2020)
[57] Moshkov, M.: On time and space complexity of deterministic and nondeterministic decision trees. In: 8th International Conference Information Processing and Management of Uncertainty in Knowledge-based Systems, IPMU 2000, vol. 3, pp. 1932-1936 (2000)
[58] Naiman, DQ; Wynn, HP, Independence number and the complexity of families of sets, Discr. Math., 154, 203-216 (1996) · Zbl 0852.05076 · doi:10.1016/0012-365X(94)00318-D
[59] Sauer, N., On the density of families of sets, J. Comb. Theory (A), 13, 145-147 (1972) · Zbl 0248.05005 · doi:10.1016/0097-3165(72)90019-2
[60] Shelah, S., A combinatorial problem; stability and order for models and theories in infinitary languages, Pacific J. Math., 41, 241-261 (1972) · Zbl 0239.02024 · doi:10.2140/pjm.1972.41.247
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.