
The minimum description length principle for pattern mining: a survey. (English) Zbl 1509.68240

Summary: Mining patterns is a core task in data analysis and, beyond issues of efficient enumeration, the selection of patterns constitutes a major challenge. The Minimum Description Length (MDL) principle, a model selection method grounded in information theory, has been applied to pattern mining with the aim of obtaining compact, high-quality sets of patterns. After outlining relevant concepts from information theory and coding, we review MDL-based methods for mining different kinds of patterns from various types of data. Finally, we open a discussion on some issues regarding these methods.


68T09 Computational aspects of data analysis and big data
68P30 Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science)
68T10 Pattern recognition, speech recognition
68-02 Research exposition (monographs, survey articles) pertaining to computer science


