×

Local and global symmetry breaking in itemset mining. (English) Zbl 1409.68267

Summary: The concept of symmetry has been extensively studied in the field of constraint programming and in the propositional satisfiability. Several methods for detection and removal of these symmetries have been developed, and their use in known solvers of these domains improved dramatically their effectiveness on a big variety of problems considered difficult to solve. The concept of symmetry may be exported to other areas where some structures can be exploited effectively. Particularly, in the area of data mining where some tasks can be expressed as constraints or logical formulas. We are interested here, by the detection and elimination of local and global symmetries in the item-set mining problem. Recent works have provided effective encodings as Boolean constraints for these data mining tasks and some idea on symmetry elimination in this area begin to appear, but still few and the techniques presented are often on global symmetry that is detected and eliminated statically in a preprocessing phase. In this work we study the notion of local symmetry and compare it to global symmetry for the itemset mining problem. We show how local symmetries of the boolean encoding can be detected dynamically and give some properties that allow to eliminate theses symmetries in SAT-based itemset mining solvers in order to enhance their efficiency.

MSC:

68T20 Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)
68T05 Learning and adaptive systems in artificial intelligence

Software:

Shatter; MiningZinc; LCM
Full Text: DOI

References:

[1] Agrawal, R., Imieliński, T., Swami, A.: Mining association rules between sets of items in large databases. In: Proceedings of the 1993 ACM SIGMOD International Conference n Management of Data, SIGMOD ’93, pp. 207-216. ACM, New York (1993)
[2] Agrawal, R., Srikant, R.: Fast algorithms for mining association rules in large databases. In: Proceedings of the 20th International Conference on Very Large Data Bases, VLDB ’94, pp. 487-499. Morgan Kaufmann Publishers Inc., San Francisco (1994)
[3] Aloul, F.A., Ramani, A., Markov, I.L., Sakallah, K.A.: Solving difficult SAT instances in the presence of symmetry. In: Proceedings of the 39th Design Automation Conference (DAC 2002), pp. 731-736. ACM Press (2002)
[4] Aloul, F.A., Markov, I.L., Sakallah, K.A.: Shatter: efficient symmetry-breaking for boolean satisfiability. In: DAC, pp. 836-839. ACM (2003)
[5] Aloul, F.A., Ramani, A., Markov, I.L., Sakallak, K.A.: Solving difficult sat instances in the presence of symmetry. In: IEEE Transaction on CAD, vol. 22(9), pp. 1117-1137 (2003)
[6] Aloul, F.A., Ramani, A., Markov, I.L., Sakallah, K.A.: Solving difficult instances of boolean satisfiability in the presence of symmetry. IEEE Trans. CAD Integr. Circ. Syst. 22(9), 1117-1137 (2003) · doi:10.1109/TCAD.2003.816218
[7] Aloul, F.A., Ramani, A., Markov, I.L., Sakallak, K.A.: Symmetry breaking for pseudo-boolean satisfiabilty. In: ASPDAC’04, pp. 884-887 (2004)
[8] Audemard, G., Benhamou, B., Siegel, P.: Aval: an enumerative method for sat. In: Proceedings of the International Conference on Compitational Logic, CL’2000, pp. 373-383. Springer, London (2000) · Zbl 0983.68530
[9] Benhamou, B.: Study of symmetry in constraint satisfaction problems. In: PPCP’94, pp. 246-254 (1994)
[10] Benhamou, B., Saïdi, M.R.: Local symmetry breaking during search in csps. In: Springer (ed.) The 13th International Conference on Principles and Practice of Constraint Programming (CP 2007), LNCS, vol. 4741, pp. 195-209. Providence (2007) · Zbl 1145.68505
[11] Benhamou, B., Sais, L.: Theoretical study of symmetries in propositional calculus and application. In: CADE’11, pp. 281-294 (1992) · Zbl 0925.03077
[12] Benhamou, B., Sais, L.: Tractability through symmetries in propositional calculus. J. Autom. Reasoning 12(1), 89-102 (1994) · Zbl 0804.68134 · doi:10.1007/BF00881844
[13] Benhamou, B., Jabbour, S., Sais, L., Salhi, Y.: Symmetry breaking in itemset mining. In: Proceedings of the International Conference on Knowledge Discovery and Information Retrieval, pp. 86-96 (2014), doi:10.5220/0005078200860096
[14] Benhamou, B., Nabhani, T., Ostrowski, R., Saïdi, M.R.: Dynamic symmetry breaking in the satisfiability problem. In: Proceedings of the 16th International Conference on Logic for Programming, Artificial Intelligence, and Reasoning, LPAR-16. Dakar (2010) · Zbl 0804.68134
[15] Besson, J., Boulicaut, J.F., Guns, T., Nijssen, S.: Generalizing itemset mining in a constraint programming setting. In: Inductive Databases and Constraint-Based Data Mining, pp. 107-126. Springer (2010) · Zbl 1211.68142
[16] Bonchi, F., Lucchese, C.: Extending the state-of-the-art of constraint-based pattern discovery. Data Knowl. Eng. 60(2), 377-399 (2007) · doi:10.1016/j.datak.2006.02.006
[17] Bucila, C., Gehrke, J., Kifer, D., White, W.: Dualminer: a dual-pruning algorithm for itemsets with constraints. Data Min. Knowl. Disc. 7(3), 241-272 (2003) · doi:10.1023/A:1024076020895
[18] Burdick, D., Calimlim, M., Gehrke, J.: Mafia: a maximal frequent itemset algorithm for transactional databases. In: ICDE, pp. 443-452 (2001)
[19] Crawford, J., Ginsberg, M., Luks, E., Roy, A.: Symmetry-breaking predicates for search problems. In: Knowledge Representation (KR), pp. 148-159. Morgan Kaufmann (1996)
[20] Desrosiers, C., Galinier, P., Hansen, P., Hertz, A.: Improving frequent subgraph mining in the presence of symmetry. In: MLG (2007)
[21] Freuder, E.: Eliminating interchangeable values in constraints satisfaction problems. In: AAAI-91, pp. 227-233 (1991) · Zbl 0804.68134
[22] Gély, A., Medina, R., Nourine, L., Renaud, Y.: Uncovering and reducing hidden combinatorics in Guigues-Duquenne bases. In: Ganter, B., Godin, R. (eds.) ICFCA, Lecture Notes in Computer Science, pp. 235-248. Springer (2005) · Zbl 1078.68144
[23] Grahne, G., Zhu, J.: Fast algorithms for frequent itemset mining using fp-trees. IEEE Trans. Knowl. Data Eng. 17(10), 1347-1362 (2005) · doi:10.1109/TKDE.2005.166
[24] Guns, T., Nijssen, S., De Raedt, L.: Itemset mining: a constraint programming perspective. Artif. Intell. 175(12-13), 1951-1983 (2011) · Zbl 1353.68233 · doi:10.1016/j.artint.2011.05.002
[25] Guns, T., Dries, A., Tack, G., Nijssen, S., Raedt, L.D.: Miningzinc: a modeling language for constraint-based mining. In: International Joint Conference on Artificial Intelligence, pp. 1365-1372, Beijing (2013) · Zbl 1404.68106
[26] Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, SIGMOD ’00, pp. 1-12. ACM, New York (2000)
[27] Henriques, R., Lynce, I., Manquinho, V.M.: On when and how to use sat to mine frequent itemsets. CoRR arXiv:1207.6253 (2012)
[28] Jabbour, S., Sais, L., Salhi, Y., Tabia, K.: Symmetries in itemset mining. In: 20th European Conference on Artificial Intelligence (ECAI ’12), pp. 432-437. IOS Press (2012) · Zbl 1327.68194
[29] Jabbour, S., Khiari, M., Sais, L., Salhi, Y., Tabia, K.: Symmetry-based pruning in itemset mining. In: 25th International Conference on Tools with Artificial Intelligence (ICTAI’13). IEEE Computer Society, Washington DC (2013) · Zbl 1327.68194
[30] Jabbour, S., Sais, L., Salhi, Y.: Boolean satisfiability for sequence mining. In: CIKM, pp. 649-658 (2013)
[31] Jabbour, S., Sais, L., Salhi, Y.: Top-k frequent closed itemset mining using top-k sat problem. In: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD’13), vol. 146, pp. 131-140. Springer (2013) · Zbl 0552.03009
[32] Khiari, M., Boizumault, P., Crémilleux, B.: Constraint programming for mining n-ary patterns. In: Cohen, D. (ed.) CP, Lecture Notes in Computer Science, vol. 6308, pp. 552-567. Springer (2010)
[33] Krishnamurthy, B.: Short proofs for tricky formulas. Acta Inf. 22(3), 253-275 (1985) · Zbl 0552.03009 · doi:10.1007/BF00265682
[34] Métivier, J. P., Boizumault, P., Crémilleux, B., Khiari, M., Loudni, S.: A constraint language for declarative pattern discovery. In: Proceedings of the 27th Annual ACM Symposium on Applied Computing, SAC ’12, pp. 119-125. ACM, New York (2012)
[35] Minato, S.I.: Symmetric item set mining based on zero-suppressed Bdds. In: Todorovski, L., Lavrac, N., Jantke, K.P. (eds.) Discovery Science, Lecture Notes in Computer Science, vol. 4265, pp. 321-326. Springer (2006) · Zbl 1140.68310
[36] Minato, S.I., Uno, T., Arimura, H.: Fast generation of very large-scale frequent itemsets using a compact graph-based representation (2007) · Zbl 0552.03009
[37] Murtagh, F., Contreras, P.: Hierarchical clustering for finding symmetries and other patterns in massive, high dimensional datasets (2010). CoRR arXiv:1005.2638 · Zbl 1231.68214
[38] Pei, J., Han, J., Lakshmanan, L.V.S.: Pushing convertible constraints in frequent itemset mining. Data Min. Knowl. Disc. 8(3), 227-252 (2004) · doi:10.1023/B:DAMI.0000023674.74932.4c
[39] Puget, J.F.: On the satisfiability of symmetrical constrained satisfaction problems. In: Kamorowski, J., Ras, Z.W. (eds.) Proceedings of ISMIS’93, LNAI 689, pp. 350-361 (1993)
[40] Raedt, L.D., Guns, T., Nijssen, S.: Constraintprogrammingfor itemsetmining. In: KDD, pp. 204-212 (2008)
[41] Raedt, L.D., Guns, T., Nijssen, S.: Constraint programming for data mining and machine learning. In: AAAI (2010) · Zbl 1353.68233
[42] Tiwari, A., Gupta, R., Agrawal, D.: A survey on frequent pattern mining: current status and challenging issues. Inform. Technol. J. 9, 1278-1293 (2010) · doi:10.3923/itj.2010.1278.1293
[43] Tseitin, G.S.: On the complexity of derivation in propositional calculus. In: Structures in the Constructive Mathematics and Mathematical Logic, pp. 115-125. H.A.O Shsenko (1968) · Zbl 0197.00102
[44] Uno, T., Asai, T., Uchida, Y., Arimura, H.: Lcm: an efficient algorithm for enumerating frequent closed item sets. In: Proceedings of Workshop on Frequent Itemset Mining Implementations (FIMI 03) (2003)
[45] Uno, T., Kiyomi, M., Arimura, H.: Lcm Ver. 2: efficient mining algorithms for frequent/closed/maximal itemsets. In: FIMI (2004)
[46] Vanetik, N.: Mining graphs with constraints on symmetry and diameter. In: Shen, H.T., Pei, J., Zsu, M.T., Zou, L., Lu, J., Ling, T.W., Yu, G., Zhuang, Y., Shao, J. (eds.) WAIM Workshops, Lecture Notes in Computer Science, vol. 6185, pp. 1-12. Springer (2010)
[47] Zaki, M.J., Hsiao, C.J.: Efficient algorithms for mining closed itemsets and their lattice structure. IEEE Trans. Knowl. Data Eng. 17(4), 462-478 (2005) · doi:10.1109/TKDE.2005.60
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.