Bayesian alignment using hierarchical models, with applications in protein bioinformatics. (English) Zbl 1153.62020

Summary: An important problem in shape analysis is to match configurations of points in space after filtering out some geometrical transformation. We introduce hierarchical models for such tasks, in which the points in the configurations are either unlabelled or have at most a partial labelling constraining the matching, and in which some points may only appear in one of the configurations. We derive procedures for simultaneous inference about the matching and the transformation, using a Bayesian approach. Our hierarchical model is based on a Poisson process for hidden true point locations; this leads to considerable mathematical simplification and efficiency of implementation of EM and Markov chain Monte Carlo algorithms. We find a novel use for classical distributions from directional statistics in a conditionally conjugate specification for the case where the geometrical transformation includes an unknown rotation. Throughout, we focus on the case of affine or rigid motion transformations.
Under a broad parametric family of loss functions, an optimal Bayesian point estimate of the matching matrix can be constructed that depends only on a single parameter of the family. Our methods are illustrated by two applications from bioinformatics. The first problem is of matching protein gels in two dimensions, and the second consists of aligning active sites of proteins in three dimensions. In the latter case, we also use information related to the grouping of the amino acids, as an example of a more general capability of our methodology to include partial labelling information. We discuss some open problems and suggest directions for future work.
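
As a rough illustration of the geometric sub-problem the summary describes, the following sketch estimates a rigid motion (rotation plus translation) between two point configurations when the matching is already known, with some points left unmatched. It uses the classical least-squares (Kabsch/Procrustes) solution, which is a much simpler stand-in and not the hierarchical Bayesian procedure of the paper. The function name `rigid_align`, the `pairs` representation of the matching, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def rigid_align(x, y, pairs):
    """Least-squares rigid motion with x[j] ~ R @ y[k] + t for each (j, k) in pairs.

    Hypothetical helper for illustration: classical Kabsch/Procrustes fit,
    not the Bayesian method summarised above.
    """
    j, k = np.array(pairs).T
    xs, ys = x[j], y[k]                      # matched points only
    xc, yc = xs.mean(axis=0), ys.mean(axis=0)
    # Cross-covariance of the centred matched points.
    H = (ys - yc).T @ (xs - xc)
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), excluding reflections.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0] * (x.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = xc - R @ yc
    return R, t

# Synthetic 2-D example (assumed data): x[1..4] match y[0..3];
# x[0] and y[4] appear in only one configuration.
rng = np.random.default_rng(0)
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
t_true = np.array([1.0, -2.0])
y = rng.normal(size=(5, 2))
x = np.vstack([rng.normal(size=(1, 2)), y[:4] @ R_true.T + t_true])
pairs = [(1, 0), (2, 1), (3, 2), (4, 3)]

R_hat, t_hat = rigid_align(x, y, pairs)
```

In the setting of the paper the matching is unknown and is inferred jointly with the transformation; a fit of this kind addresses only the transformation for a fixed matching.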

MSC:

62F15 Bayesian inference
65C40 Numerical analysis or methods applied to Markov chains
62M99 Inference from stochastic processes
62P10 Applications of statistics to biology and medical sciences; meta analysis
92C40 Biochemistry, molecular biology
05C90 Applications of graph theory
60J22 Computational methods in Markov chains
62H12 Estimation in multivariate analysis