The MVL (Missing Values Linkage) Approach for Hierarchical Classification when Data are Incomplete

M. Schader and W. Gaul

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)


Abstract

We describe how the well-known linkage techniques for hierarchical clustering can be modified to handle missing values in dissimilarity data. The resulting MVL (Missing Values Linkage) approach is presented and compared with a least-squares-based penalty algorithm. In an example, a distance table of selected European cities is used to demonstrate features of the MVL approach: randomly chosen distances are treated as missing, random error is superimposed on the non-missing distances, and different subsets of cities are considered.
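The abstract does not spell out the MVL update rules, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that missing dissimilarities are coded as NaN and that the dissimilarity between two clusters is computed from the observed object pairs only (minimum, maximum, or mean for single, complete, or average linkage). The function name `linkage_with_missing` and the toy matrix are hypothetical.

```python
import numpy as np

def linkage_with_missing(d, method="average"):
    """Agglomerative clustering on a symmetric dissimilarity matrix.

    Missing entries are NaN. ASSUMPTION (not taken from the chapter):
    the dissimilarity between two clusters uses observed pairs only.
    Returns a list of merges (members_a, members_b, height).
    """
    d = np.asarray(d, dtype=float)
    reduce = {"single": np.min, "complete": np.max, "average": np.mean}[method]
    clusters = {i: [i] for i in range(d.shape[0])}
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                # All object-to-object dissimilarities between the two clusters.
                block = d[np.ix_(clusters[a], clusters[b])]
                known = block[~np.isnan(block)]
                if known.size == 0:
                    continue  # no observed pair links these clusters
                height = float(reduce(known))
                if best is None or height < best[0]:
                    best = (height, a, b)
        if best is None:
            break  # remaining clusters share no observed dissimilarity
        height, a, b = best
        merges.append((tuple(clusters[a]), tuple(clusters[b]), height))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return merges

# Toy data: four objects, two dissimilarities missing (made-up numbers).
nan = float("nan")
toy = [
    [0.0, 1.0, nan, 9.0],
    [1.0, 0.0, 8.0, nan],
    [nan, 8.0, 0.0, 2.0],
    [9.0, nan, 2.0, 0.0],
]
for members_a, members_b, height in linkage_with_missing(toy, "average"):
    print(members_a, members_b, height)
```

In this sketch a pair of clusters with no observed dissimilarity between them is never merged directly, and merge heights may not be monotone when many entries are missing. How the MVL approach itself treats such cases is described in the chapter, not here.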



Author information

Authors and Affiliations

Institut für Informatik, Universität der Bundeswehr Hamburg, Holstenhofweg 85, 2000, Hamburg 70, Germany
M. Schader
Institut für Entscheidungstheorie und Unternehmensforschung, Universität Karlsruhe Kollegium am Schloß, 7500, Karlsruhe, Germany
W. Gaul


Editor information

Editors and Affiliations

Lehrstuhl für Wirtschaftsinformatik III, Universität Mannheim, Schloß, D-6800, Mannheim, Germany
Martin Schader


Cite this paper

Schader, M., Gaul, W. (1992). The MVL (Missing Values Linkage) Approach for Hierarchical Classification when Data are Incomplete. In: Schader, M. (ed.) Analyzing and Modeling Data and Knowledge. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-46757-8_12


DOI: https://doi.org/10.1007/978-3-642-46757-8_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-54708-2
Online ISBN: 978-3-642-46757-8
eBook Packages: Springer Book Archive
