
Selecting relevant projections onto subsets of coordinates: A minimax dependence-based approach. (English) Zbl 0713.62006

Given a set of entities and a random vector whose components are defined on that set, the sum of the entropies of the components minus their joint entropy (the entropy of the random vector) is used as a measure of interdependence to reduce the number of components. The reduction is accomplished by selecting a pair (or triple) of components that are significant for classifying the entities into distinct and meaningful clusters. The selection is based on both a low degree of interdependence among the selected components and a high degree of dependence on the omitted ones, which amounts to a kind of minimax criterion.
When the components are Gaussian, the measure of interdependence admits a simple expression that can be easily computed through MINITAB or SAS procedures. The results are illustrated by means of a practical example.
Reviewer: M.A.Gil
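
Illustrative note (not part of the review or of the paper): for a Gaussian vector with correlation matrix \(R\), the sum of the marginal entropies minus the joint entropy reduces to \(-\tfrac{1}{2}\ln\det R\), and the dependence between a selected block and the omitted block is the corresponding Gaussian mutual information. The sketch below computes both quantities for every pair of components in plain Python. The function names and the idea of tabulating both scores side by side are assumptions made here for illustration; the review does not state the paper's exact selection rule, so no rule for combining the two scores is implemented.

```python
import math
from itertools import combinations


def log_det_spd(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
                log_det += 2.0 * math.log(lower[i][j])
            else:
                lower[i][j] = s / lower[j][j]
    return log_det


def submatrix(matrix, idx):
    """Rows and columns of `matrix` restricted to the indices in `idx`."""
    return [[matrix[i][j] for j in idx] for i in idx]


def interdependence(corr, idx):
    """Sum of marginal entropies minus joint entropy (nats), Gaussian case."""
    return -0.5 * log_det_spd(submatrix(corr, idx))


def dependence_on_omitted(corr, selected):
    """Gaussian mutual information between selected and omitted components."""
    everything = list(range(len(corr)))
    omitted = [i for i in everything if i not in selected]
    return (interdependence(corr, everything)
            - interdependence(corr, selected)
            - interdependence(corr, omitted))


def score_pairs(corr):
    """For each pair: (pair, interdependence within pair, dependence on the rest)."""
    return [(pair, interdependence(corr, pair), dependence_on_omitted(corr, pair))
            for pair in combinations(range(len(corr)), 2)]


if __name__ == "__main__":
    # Hypothetical correlation matrix, invented for this example only.
    corr = [[1.0, 0.2, 0.7],
            [0.2, 1.0, 0.6],
            [0.7, 0.6, 1.0]]
    for pair, within, with_rest in score_pairs(corr):
        print(pair, round(within, 4), round(with_rest, 4))
```

Under the criterion described in the review, a pair would be attractive when the first score is small and the second is large. In practice the correlation matrix would come from a statistical package such as those named above.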

MSC:

62B10 Statistical aspects of information-theoretic topics
62H30 Classification and discrimination; cluster analysis (statistical aspects)
94A17 Measures of information, entropy

Software:

MINITAB; SAS

References:

[1] Andrews, D. F.; Herzberg, A. M., Data: A Collection of Problems from Many Fields for the Student and Research Worker (1985), Springer-Verlag: New York · Zbl 0567.62002
[2] Diaconis, P.; Freedman, D., Asymptotics of graphical projection pursuit, Ann. Statist., 12, 793-815 (1984) · Zbl 0559.62002
[3] Diaconis, P.; Friedman, J. H., \(M\) and \(N\) plots, (Rizvi, M. H.; Rustagi, J.; Siegmund, D., Recent Advances in Statistics (1983), Academic Press: New York), 425-447 · Zbl 0598.62001
[4] Friedman, J. H.; Tukey, J. W., A projection pursuit algorithm for exploratory data analysis, IEEE Trans. Comput., C-23, 881-889 (1974) · Zbl 0284.68079
[5] Guiasu, S., Information Theory with Applications (1977), McGraw-Hill: New York · Zbl 0379.94027
[6] Guiasu, S.; Leblanc, R., Sur la dépendence entre les variables aléatoires normales, Estadistica, 36, 101-107 (1984)
[7] Haberman, S. J., Association, measures of, (Kotz, S.; Johnson, N. L., Encyclopedia of Statistical Sciences (1982), Wiley: New York), 130-137 · Zbl 0552.62001
[8] Huber, P. J., Projection pursuit, Ann. Statist., 13, 435-475 (1985) · Zbl 0595.62059
[9] Jogdeo, K., Dependence, concepts of, (Kotz, S.; Johnson, N. L., Encyclopedia of Statistical Sciences (1982), Wiley: New York), 324-334 · Zbl 0552.62001
[10] Rao, C. R., Linear Statistical Inference and Its Applications (1965), Wiley: New York · Zbl 0137.36203
[11] Reaven, G. M.; Miller, R. G., An attempt to define the nature of chemical diabetes using a multidimensional analysis, Diabetologia, 16, 17-24 (1979)
[12] Shannon, C. E., A mathematical theory of communication, Bell System Tech. J., 27, 623-656 (1948) · Zbl 1154.94303
[13] Symons, M. J., Clustering criteria and multivariate normal mixtures, Biometrics, 37, 35-43 (1981) · Zbl 0473.62048
[14] Watanabe, S., Knowing and Guessing (1969), Wiley: New York · Zbl 0206.20901
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.