A T-theoretical approach to phylogenetic analysis and cluster analysis. (English) Zbl 0943.92028

Bielefeld: Univ. Bielefeld, Fakultät Mathematik, 113 p. (1997).
From the introduction: It is generally accepted that the vast number and variety of contemporary species has evolved from the same primitive cell (or Progenote). The following question naturally arises: Which phylogeny best describes their evolution? In the graph-theoretical sense: What is the shape of the (dated) rooted tree whose leaves represent the contemporary species and whose internal nodes represent the ancestors that appeared during the course of evolution? A major difficulty in this context arises from extinction and the incompleteness of the fossil record: many of the organisms forming part of the real historical genealogy are unknown. Coping with this problem has resulted in numerous and varied approaches to modelling evolution and an even greater number of methods for reconstructing phylogenetic trees. Although these models differ considerably, their main goal is to create a structure that explains and visualizes the phylogenetic relationships of the organisms, under the constraint that generally accepted evolutionary findings – usually based on morphological studies – for some well-studied species are represented correctly.
We now formalize one of the main concepts on which the reconstruction of phylogenetic trees is based: Consider a finite set \(X\) and data that interconnect the elements of \(X\). For example, \(X\) may be a subset of all the contemporary species, given together with a perhaps only partially known (dis)similarity matrix based on, e.g., sequence data. One aims at finding tree structures with leaf set \(X\) that are supported by the data, such that tree structures resulting from subsets of \(X\) for which certified information is available are represented correctly. In other words, the problem of how to reconstruct a phylogenetic tree boils down to the question of extracting globally relevant features from locally distributed information, which is a classical problem in cluster analysis.
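As a small illustration of what it can mean for a tree structure to be "supported by" a dissimilarity matrix, the following Python sketch tests the classical four-point condition: for every four elements of \(X\), the two largest of the three pairwise distance sums must coincide. This is a standard criterion for tree-likeness and is not claimed to be the method developed in the thesis. The function name `is_tree_like`, the tolerance `tol`, and both example matrices are assumptions made for illustration; the sketch also assumes a fully known matrix that is already a metric (symmetric, zero diagonal, triangle inequality).

```python
from itertools import combinations


def is_tree_like(d, tol=1e-9):
    """Check the four-point condition on quadruples of distinct elements.

    `d` is a square, symmetric dissimilarity matrix (list of lists) that is
    assumed to be a metric already. Hypothetical helper for illustration.
    """
    n = len(d)
    for i, j, k, l in combinations(range(n), 4):
        # The three ways to split {i, j, k, l} into two pairs.
        sums = sorted([
            d[i][j] + d[k][l],
            d[i][k] + d[j][l],
            d[i][l] + d[j][k],
        ])
        # Tree-like data: the two largest sums agree (up to tolerance).
        if sums[2] - sums[1] > tol:
            return False
    return True


# Distances read off a tree ab|cd with all edge lengths equal to 1.
tree_d = [
    [0, 2, 3, 3],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [3, 3, 2, 0],
]

# A metric in which the three pair sums (4, 5, 6) are all different.
non_tree_d = [
    [0, 2, 3, 2],
    [2, 0, 3, 3],
    [3, 3, 0, 2],
    [2, 3, 2, 0],
]

print(is_tree_like(tree_d))      # True
print(is_tree_like(non_tree_d))  # False
```

The check inspects only four elements at a time, so it is one concrete instance of testing a global property (tree-likeness of the whole matrix) through local information.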
This thesis is divided into two parts. In Part I, we look at problems that are best dealt with in the affine model, while in Part II the projective clustering model provides the best theoretical framework.

MSC:

92D15 Problems related to evolution
62H30 Classification and discrimination; cluster analysis (statistical aspects)
91C20 Clustering in the social and behavioral sciences