An Algorithm for Extracting Rare Concepts with Concise Intents

Yoshiaki Okubo & Makoto Haraguchi

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5986)

Included in the following conference series:

International Conference on Formal Concept Analysis


Abstract

This paper presents an algorithm for finding concepts (closures) with small supports. As suggested by studies of emerging patterns, contrast sets, and crossover concepts, we focus on less frequent, rare concepts.

However, several difficulties arise when we try to find concepts among those rare ones. First, there is a large number of concepts close to individual ones. Second, intents become longer, involving many attributes at various levels of generality. Consequently, it becomes harder to understand what the concepts mean or represent.

To solve these problems, we restrict the formation process of concepts, where formation is a flow of adding attributes to present concepts that have already been formed. A present concept works as a condition on the candidate attributes that may be added to it. Given such a present concept, we prohibit adding attributes that are strongly correlated with it. In other words, we add an attribute only when it contributes to decreasing the support of the concept to some extent. As a result, the detected concepts have lower supports and consist only of attributes that lead toward more specific concepts through the formation process.
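The restriction on adding attributes can be illustrated with a minimal sketch. The abstract does not state the actual correlation criterion, so everything here is an assumption: the context is a plain mapping from objects to attribute sets, and the hypothetical function `admissible` with its threshold `delta` accepts a candidate attribute only if it shrinks the support of the present concept by at least a fraction `delta` while leaving a non-empty extent.

```python
from typing import Dict, FrozenSet, Set

# Assumed representation of a formal context: object name -> its attributes.
Context = Dict[str, Set[str]]


def extent(context: Context, attrs: FrozenSet[str]) -> FrozenSet[str]:
    """Return the objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def admissible(context: Context, present: FrozenSet[str],
               candidate: str, delta: float) -> bool:
    """Hypothetical test: the candidate must reduce the support enough.

    An attribute strongly correlated with the present concept leaves the
    support almost unchanged, so it fails the test. The threshold delta
    and the non-empty-extent requirement are illustrative assumptions.
    """
    before = len(extent(context, present))
    after = len(extent(context, present | {candidate}))
    return 0 < after <= (1 - delta) * before


context = {
    "d1": {"a", "b", "c"},
    "d2": {"a", "b"},
    "d3": {"a", "b", "d"},
    "d4": {"a", "c"},
}
present = frozenset({"a"})
print(admissible(context, present, "b", 0.5))  # support goes from 4 to 3
print(admissible(context, present, "c", 0.5))  # support goes from 4 to 2
```

With `delta = 0.5`, attribute `b` is rejected because it barely narrows the concept, while `c` is accepted because it halves the support.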

The algorithm is designed as a top-N closure enumerator with branch-and-bound pruning rules, so that it can reach concepts with lower supports while avoiding useless combinations of correlated attributes in a huge space of concepts. We experimentally show the effectiveness of the algorithm and the conceptual clarity of the detected concepts, which have shorter intents despite their lower supports.
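As a rough illustration of the search, the following sketch performs a depth-first top-N enumeration over restricted attribute additions. It is not the paper's algorithm: the abstract does not detail the branch-and-bound rules, so the only pruning used here is a minimum-support cutoff, which is safe because supports can only shrink as attributes are added. The names `top_n_rare`, `min_sup`, and `delta` are hypothetical, and the sketch reports the accumulated attribute sets rather than closed intents.

```python
import heapq
from typing import Dict, FrozenSet, List, Set, Tuple

Context = Dict[str, Set[str]]  # assumed: object name -> its attributes


def extent(context: Context, attrs: FrozenSet[str]) -> FrozenSet[str]:
    """Return the objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if attrs <= has)


def top_n_rare(context: Context, n: int, min_sup: int,
               delta: float) -> List[Tuple[int, List[str]]]:
    """Return up to n (support, attributes) pairs with the lowest supports."""
    attributes = sorted(set().union(*context.values()))
    best: Dict[FrozenSet[str], FrozenSet[str]] = {}  # extent -> shortest attribute set
    expanded: Set[FrozenSet[str]] = set()

    def expand(present: FrozenSet[str]) -> None:
        sup = len(extent(context, present))
        for m in attributes:
            grown = present | {m}
            if m in present or grown in expanded:
                continue
            objs = extent(context, grown)
            if len(objs) < min_sup:
                continue  # prune: no superset can regain support
            if len(objs) > (1 - delta) * sup:
                continue  # assumed restriction: m is too correlated
            expanded.add(grown)
            if objs not in best or len(grown) < len(best[objs]):
                best[objs] = grown
            expand(grown)

    expand(frozenset())
    return heapq.nsmallest(n, ((len(o), sorted(g)) for o, g in best.items()))


context = {
    "d1": {"a", "b", "c"},
    "d2": {"a", "b"},
    "d3": {"a", "b", "d"},
    "d4": {"a", "c"},
}
print(top_n_rare(context, n=2, min_sup=1, delta=0.5))
```

The sketch explores every order of attribute addition and memoizes expanded attribute sets, which is exponential in the worst case; it is meant only to show how the support cutoff and the addition restriction interact on a small context.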



Author information

Authors and Affiliations

Division of Computer Science, Graduate School of Information Science and Technology, Hokkaido University, N-14 W-9, Sapporo, 060-0814, Japan
Yoshiaki Okubo & Makoto Haraguchi

Editor information

Editors and Affiliations

School of Engineering, Zurich University of Applied Sciences, Technikumstraße 9, 8401, Winterthur, Switzerland
Léonard Kwuida
SAP Research Center, Chemnitzer Straße 48, 01187, Dresden, Germany
Barış Sertkaya

Cite this paper

Okubo, Y., Haraguchi, M. (2010). An Algorithm for Extracting Rare Concepts with Concise Intents. In: Kwuida, L., Sertkaya, B. (eds) Formal Concept Analysis. ICFCA 2010. Lecture Notes in Computer Science, vol 5986. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11928-6_11


DOI: https://doi.org/10.1007/978-3-642-11928-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11927-9
Online ISBN: 978-3-642-11928-6
eBook Packages: Computer Science, Computer Science (R0)
