×

Computational complexity of queries based on itemsets. (English) Zbl 1187.68255

Summary: We investigate determining the exact bounds of the frequencies of conjunctions based on frequent sets. Our scenario is an important special case of some general probabilistic logic problems that are known to be intractable. We show that despite the limitations our problems are also intractable, namely, we show that checking whether the maximal consistent frequency of a query is larger than a given threshold is NP-complete and that evaluating the maximum entropy estimate of a query is PP-hard. We also prove that checking consistency is NP-complete.

MSC:

68Q17 Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.)
68P15 Database theory

References:

[1] R. Agrawal, T. Imielinski, A.N. Swami, Mining association rules between sets of items in large databases, in: P. Buneman, S. Jajodia (Eds.), Proc. 1993 ACM SIGMOD Internat. Conf. on Management of Data, Washington, DC, 26-28 May, 1993, pp. 207-216; R. Agrawal, T. Imielinski, A.N. Swami, Mining association rules between sets of items in large databases, in: P. Buneman, S. Jajodia (Eds.), Proc. 1993 ACM SIGMOD Internat. Conf. on Management of Data, Washington, DC, 26-28 May, 1993, pp. 207-216
[2] Agrawal, R.; Mannila, H.; Srikant, R.; Toivonen, H.; Verkamo, A. I., Fast discovery of association rules, (Fayyad, U. M.; Piatetsky-Shapiro, G.; Smyth, P.; Uthurusamy, R., Advances in Knowledge Discovery and Data Mining (1996), AAAI Press/The MIT Press: AAAI Press/The MIT Press Cambridge, MA), 307-328
[3] D.D. Bailey, V. Dalmau, P.G. Kolaitis, Phase transitions of PP-complete satisfiability problems, in: IJCAI, 2001, pp. 183-192; D.D. Bailey, V. Dalmau, P.G. Kolaitis, Phase transitions of PP-complete satisfiability problems, in: IJCAI, 2001, pp. 183-192 · Zbl 0990.90547
[4] Bykowski, A.; Seppänen, J. K.; Hollmén, J., Model-independent bounding of the supports of Boolean formulae in binary data, (Lanzi, P. L.; Meo, R., Database Technologies for Data Mining (2003), Springer-Verlag: Springer-Verlag Berlin) · Zbl 1099.68608
[5] T. Calders, Axiomatization and deduction rules for the frequency of itemsets, PhD thesis, University of Antwerp, Belgium, 2003; T. Calders, Axiomatization and deduction rules for the frequency of itemsets, PhD thesis, University of Antwerp, Belgium, 2003 · Zbl 1019.68034
[6] T. Calders, Computational complexity of itemset frequency satisfiability, in: Proc. 23nd ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database System, 2004; T. Calders, Computational complexity of itemset frequency satisfiability, in: Proc. 23nd ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database System, 2004 · Zbl 1136.68021
[7] Cooper, G., The computational complexity of probabilistic inference using Bayesian belief networks, Artificial Intelligence, 42, 2-3, 393-405 (1990) · Zbl 0717.68080
[8] Csiszár, I., I-divergence geometry of probability distributions and minimization problems, Ann. Probab., 3, 1, 146-158 (1975) · Zbl 0318.60013
[9] Georgakopoulos, G.; Kavvadias, D.; Papadimitriou, C. H., Probabilistic satisfiability, J. Complexity, 4, 1, 1-11 (1988) · Zbl 0647.68049
[10] Hailperin, T., Best possible inequalities for the probability of a logical function of events, Amer. Math. Monthly, 72, 4, 343-359 (1965) · Zbl 0132.13706
[11] Kullback, S., Information Theory and Statistics (1968), Dover Publications, Inc.: Dover Publications, Inc. New York · Zbl 0149.37901
[12] Lukasiewicz, T., Probabilistic logic programming with conditional constraints, ACM Trans. Comput. Logic (TOCL), 2, 3, 289-339 (2001) · Zbl 1171.68762
[13] Papadimitriou, C., Computational Complexity (1995), Addison-Wesley: Addison-Wesley Reading, MA · Zbl 0557.68033
[14] Papadimitriou, C.; Steiglitz, K., Combinatorial Optimization Algorithms and Complexity (1998), Dover: Dover New York · Zbl 0944.90066
[15] Pavlov, D.; Mannila, H.; Smyth, P., Beyond independence: Probabilistic models for query approximation on binary transaction data, IEEE Trans. Knowledge Data Engrg., 15, 6, 1409-1421 (2003)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.