Abstract
The variety of a concept’s specializations can serve as an index of how significant the concept is. From this viewpoint, given two incidence relations as datasets, we consider formal concepts that have many frequent subconcepts in one dataset but few frequent subconcepts in the other. Instead of counting the frequent subconcepts directly, we introduce a structural index that approximates the depth complexity of the join semilattice of frequent concepts, and impose an anti-monotonic constraint on one dataset and a monotonic constraint on the other. Based on these two constraints, we develop a procedure that searches for “emerging concepts” with respect to the structural index. Although computing the structural index is hard in general, the index we choose is known to be efficiently computable for large sparse data. The experimental results show the effectiveness of the proposed method, including some interesting output concepts that contrast the two datasets.
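To make the notion of "frequent subconcepts" concrete, the following minimal sketch counts them directly by brute force on two toy datasets. It is only an illustration of the quantity being contrasted, not the method of this paper, which avoids the direct count in favour of a structural index that the abstract does not define. The sketch assumes that both datasets share one attribute set and that a concept is identified by its intent; the function names, the toy data, and the threshold `minsup` are all hypothetical.

```python
from itertools import combinations


def extent(context, attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(g for g, items in context.items() if attrs <= items)


def intent(context, objs, all_attrs):
    """Attributes shared by every object in objs."""
    result = frozenset(all_attrs)
    for g in objs:
        result &= context[g]
    return result


def frequent_intents(context, all_attrs, minsup):
    """Brute-force set of intents whose extent has at least minsup objects.

    Exponential in the number of attributes; usable on toy data only.
    """
    found = set()
    attrs = sorted(all_attrs)
    for r in range(len(attrs) + 1):
        for subset in combinations(attrs, r):
            ext = extent(context, frozenset(subset))
            if len(ext) >= minsup:
                found.add(intent(context, ext, all_attrs))
    return found


def count_frequent_subconcepts(context, all_attrs, base, minsup):
    """Number of frequent concepts whose intent strictly contains base."""
    return sum(1 for b in frequent_intents(context, all_attrs, minsup) if base < b)


# Hypothetical toy data: object id -> set of attributes.
ATTRS = {"a", "b", "c", "d"}
target = {1: {"a", "b"}, 2: {"a", "c"}, 3: {"a", "d"}, 4: {"a", "b", "c"}}
other = {1: {"a"}, 2: {"a"}, 3: {"b", "c"}, 4: {"d"}}

base = frozenset({"a"})
in_target = count_frequent_subconcepts(target, ATTRS, base, minsup=1)
in_other = count_frequent_subconcepts(other, ATTRS, base, minsup=1)
print(in_target, in_other)
```

In this toy setting the concept with intent {a} is specialized in several ways in the first dataset and in none in the second, which is the kind of contrast the abstract describes. The enumeration is exponential in the number of attributes, which illustrates why a direct count is unattractive on real data.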
Notes
1. We have collected them from a CD-ROM edition of the newspapers.
2. “Hokkaido” is the northernmost prefecture in Japan.
3. The lattices have been drawn by Graphviz (http://www.graphviz.org) via FcaStone (http://fcastone.sourceforge.net).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Okubo, Y., Haraguchi, M. (2017). Mining Frequent Closed Set Distinguishing One Dataset from Another from a Viewpoint of Structural Index. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2017. Lecture Notes in Computer Science, vol 10358. Springer, Cham. https://doi.org/10.1007/978-3-319-62416-7_30
DOI: https://doi.org/10.1007/978-3-319-62416-7_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-62415-0
Online ISBN: 978-3-319-62416-7
eBook Packages: Computer Science, Computer Science (R0)