Indexed dendrograms on random dissimilarities. (English) Zbl 0902.62072

Summary: This paper studies the random indexed dendrograms produced by agglomerative hierarchical algorithms under the non-classifiability hypothesis of independent identically distributed (i.i.d.) dissimilarities. New tests for classifiability are deduced. The corresponding test statistics are random variables attached to the indexed dendrograms, such as the indices, the survival time of singletons, the value of the ultrametric between two given points, or the size of classes in the different levels of the dendrogram. For an indexed dendrogram produced by the Single Link method on i.i.d. dissimilarities, the distribution of these random variables is computed, thus leading to explicit tests. For the case of the Average and Complete Link methods, some asymptotic results are presented. The proofs rely essentially on the theory of random graphs.
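The objects named in the summary can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method: it draws i.i.d. dissimilarities, builds the Single Link indexed dendrogram by merging classes in increasing order of dissimilarity, and records the indices, the survival time of each singleton, and the ultrametric between each pair of points. The function name `single_link_dendrogram`, the Uniform(0, 1) law, and the reading of "survival time" as the level at which a point first joins another class are assumptions made here for illustration.

```python
import random
from itertools import combinations


def single_link_dendrogram(n, seed=0):
    """Single Link on i.i.d. dissimilarities (illustrative sketch).

    Assumption: dissimilarities are Uniform(0, 1); the abstract only
    requires them to be i.i.d.
    """
    rng = random.Random(seed)
    # One independent dissimilarity per unordered pair of points.
    d = {pair: rng.random() for pair in combinations(range(n), 2)}

    parent = list(range(n))            # union-find forest over the points
    members = {i: [i] for i in range(n)}  # class root -> points in the class

    def find(x):
        # Find the class representative, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    indices = []           # levels at which two classes merge
    survival = [None] * n  # level at which each singleton is first absorbed
    ultrametric = {}       # (a, b) with a < b -> level at which a and b join

    # Single Link merges classes along pairs taken in increasing order.
    for (i, j), h in sorted(d.items(), key=lambda kv: kv[1]):
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # both points already lie in the same class
        indices.append(h)
        for a in members[ri]:
            for b in members[rj]:
                ultrametric[(min(a, b), max(a, b))] = h
        for r in (ri, rj):
            if len(members[r]) == 1:
                survival[members[r][0]] = h
        parent[rj] = ri
        members[ri].extend(members.pop(rj))
        if len(indices) == n - 1:
            break  # all points are now in one class
    return d, indices, survival, ultrametric
```

Repeating the simulation over many seeds gives an empirical null distribution for any of these statistics, against which a value observed on real data could be compared. Because Single Link depends only on the ordering of the dissimilarities, the merge order does not depend on which continuous law is chosen, although the numerical levels do.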

MSC:

62H30 Classification and discrimination; cluster analysis (statistical aspects)