Core concepts in data analysis. Summarization, correlation and visualization. (English) Zbl 1219.68007

Undergraduate Topics in Computer Science. London: Springer (ISBN 978-0-85729-286-5/pbk; 978-0-85729-287-2/ebook). xx, 390 p. (2011).
As is well known, data analysis is the process of exploring data in order to highlight useful information and, ultimately, to support decision making. The term covers many approaches, including data mining, statistical data analysis, and machine learning, with a wide range of applications.
This textbook takes an unconventional approach to presenting the main aspects of data analysis. It begins by introducing the notion of the data analysis “core”, together with some simple illustrative examples. Basically, two ways of studying data analysis are considered:
(a) summarization – for developing and augmenting concepts, and
(b) correlation – for enhancing and establishing relations.
The author thus combines elements of statistical data analysis, data mining, and computational intelligence to demonstrate that data analysis should help in enhancing and augmenting knowledge about the data under study.
The reader is then led in a friendly way through different areas of data analysis, such as summarization and visualization, correlation, multivariate correlation, linear regression, linear discrimination, decision trees, the naïve Bayes model, principal component analysis, and clustering techniques, among others. The final appendix summarizes, among other things, the basics of linear algebra, optimization, and Matlab. Many concrete examples illustrate the theory presented in the book.
Overall, this is an exciting text covering the main topics of data analysis. It can be used successfully as a textbook for BS and MS students in computer science, as well as by researchers in data mining and related fields.

MSC:

68-01 Introductory exposition (textbooks, tutorial papers, etc.) pertaining to computer science
62-07 Data analysis (statistics) (MSC2010)
68T05 Learning and adaptive systems in artificial intelligence

Software:

Matlab