
Exploratory data mining and data cleaning. (English) Zbl 1027.62002

Wiley Series in Probability and Statistics. Hoboken, NJ: Wiley. xii, 203 p. (2003).
This book provides a uniquely integrated approach to data exploration and data cleaning. The authors (1) focus on developing and evolving a modeling strategy through an iterative data exploration loop that incorporates domain knowledge, (2) address methods for detecting, quantifying (through metrics), and correcting data quality issues, and (3) highlight new approaches and methodologies such as DataSphere space partitioning and summary-based analysis techniques. Case studies illustrate applications in real-life scenarios.
This book is intended for serious data analysts everywhere who need to analyze large amounts of unfamiliar, potentially noisy data, and for managers of operations databases. It can also serve as a text on data quality to supplement an advanced undergraduate or graduate-level course in large-scale data analysis and data mining.


62-07 Data analysis (statistics) (MSC2010)
62-02 Research exposition (monographs, survey articles) pertaining to statistics
68P10 Searching and sorting
