Flexibly regularized mixture models and application to image segmentation. (English) Zbl 07750026

Summary: Probabilistic finite mixture models are widely used for unsupervised clustering. These models can often be improved by adapting them to the topology of the data. For instance, in order to classify spatially adjacent data points similarly, it is common to introduce a Laplacian constraint on the posterior probability that each data point belongs to a class. Alternatively, the mixing probabilities can be treated as free parameters, while assuming Gauss-Markov or more complex priors to regularize those mixing probabilities. However, these approaches are constrained by the shape of the prior and often lead to complicated or intractable inference. Here, we propose a new parametrization of the Dirichlet distribution to flexibly regularize the mixing probabilities of over-parametrized mixture distributions. Using the Expectation-Maximization algorithm, we show that our approach allows us to define any linear update rule for the mixing probabilities, including spatial smoothing regularization as a special case. We then show that this flexible design can be extended to share class information between multiple mixture models. We apply our algorithm to artificial and natural image segmentation tasks, and we provide quantitative and qualitative comparison of the performance of Gaussian and Student-t mixtures on the Berkeley Segmentation Dataset. We also demonstrate how to propagate class information across the layers of deep convolutional neural networks in a probabilistically optimal way, suggesting a new interpretation for feedback signals in biological visual systems. Our flexible approach can be easily generalized to adapt probabilistic mixture models to arbitrary data topologies.
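As an illustration of what a linear update rule for the mixing probabilities can look like, the following is a minimal sketch and not the authors' implementation. It runs EM on a one-dimensional Gaussian mixture with per-point mixing probabilities and, in the M-step, sets those probabilities to a local average of the posterior responsibilities. The names `smoothing_matrix` and `em_smoothed_mixture`, the box-shaped averaging window, and the toy data are all assumptions made for this example; the Dirichlet parametrization from which the paper derives its update is not reproduced here.

```python
import numpy as np


def smoothing_matrix(n, radius=2):
    """Row-stochastic matrix averaging each position with its neighbours.

    The box window is an assumed choice; any row-stochastic matrix
    defines a valid linear update in this sketch.
    """
    idx = np.arange(n)
    w = (np.abs(idx[:, None] - idx[None, :]) <= radius).astype(float)
    return w / w.sum(axis=1, keepdims=True)


def em_smoothed_mixture(x, k=2, radius=2, n_iter=50):
    """EM for a 1-D Gaussian mixture with per-point mixing probabilities.

    Hypothetical helper: the mixing probabilities follow the linear
    rule pi = W @ r, with W a local averaging matrix.
    """
    n = x.shape[0]
    w = smoothing_matrix(n, radius)
    # Initialise means on spread-out quantiles, shared variance, uniform pi.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var() + 1e-6)
    pi = np.full((n, k), 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities proportional to pi[n, k] * N(x_n | mu_k, var_k),
        # computed in the log domain for numerical stability.
        log_p = (np.log(pi + 1e-12)
                 - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: standard weighted updates for the component parameters.
        nk = r.sum(axis=0) + 1e-12
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        # Linear update of the mixing probabilities (spatial smoothing).
        pi = w @ r
    return r.argmax(axis=1), mu, pi


# Toy signal: two adjacent segments with different means plus noise.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 50)
x = np.where(truth == 0, 0.0, 3.0) + rng.normal(0.0, 1.0, size=100)
labels, mu, pi = em_smoothed_mixture(x)
```

Because each row of `W` sums to one and each row of responsibilities sums to one, every row of `pi` remains a valid probability vector without an explicit renormalization step. Swapping `smoothing_matrix` for another row-stochastic matrix changes the topology over which class information is shared.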

MSC:

68Txx Artificial intelligence
62Hxx Multivariate analysis

Software:

U-Net; PMTK; SegNet; DeepLab

References:

[1] Arbelaez, P.; Maire, M.; Fowlkes, C.; Malik, J., Contour detection and hierarchical image segmentation, IEEE transactions on pattern analysis and machine intelligence, 33, 5, 898-916 (2011)
[2] Badrinarayanan, V.; Kendall, A.; Cipolla, R., Segnet: a deep convolutional encoder-decoder architecture for image segmentation, IEEE transactions on pattern analysis and machine intelligence, 39, 12, 2481-2495 (2017)
[3] Blei, D. M.; Frazier, P. I., Distance dependent chinese restaurant processes, Journal of Machine Learning Research, 12, 8 (2011) · Zbl 1280.68157
[4] Bouveyron, C.; Brunet-Saumard, C., Model-based clustering of high-dimensional data: a review., Computational Statistics & Data Analysis, 71, 52-78 (2014) · Zbl 1471.62032
[5] Boyd, S.; Vandenberghe, L., Convex optimization (2004), Cambridge University Press · Zbl 1058.90049
[6] Boykov, Y.; Veksler, O.; Zabih, R., Fast approximate energy minimization via graph cuts, IEEE Transactions on pattern analysis and machine intelligence, 23, 11, 1222-1239 (2001)
[7] Boyles, R. A., On the convergence of the EM algorithm, Journal of the Royal Statistical Society: Series B (Methodological), 45, 1, 47-50 (1983) · Zbl 0508.62030
[8] Chen, L.-C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A. L., Deeplab: semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs, IEEE transactions on pattern analysis and machine intelligence, 40, 4, 834-848 (2018)
[9] Coen-Cagli, R.; Dayan, P.; Schwartz, O., Statistical models of linear and nonlinear contextual interactions in early visual processing, (Bengio, Y.; Schuurmans, D.; Lafferty, J.; Williams, C.; Culotta, A., Advances in neural information processing systems, vol. 22 (2009), Curran Associates, Inc.)
[10] Coen-Cagli, R.; Dayan, P.; Schwartz, O., Cortical surround interactions and perceptual salience via natural scene statistics, PLOS Computational Biology, 8, 3, 1-18 (2012)
[11] Comaniciu, D.; Meer, P., Mean shift: a robust approach toward feature space analysis, IEEE Transactions on pattern analysis and machine intelligence, 24, 5, 603-619 (2002)
[12] Dauwels, J.; Eckford, A.; Korl, S.; Loeliger, H.-A., Expectation maximization as message passing-part I: principles and gaussian messages (2009), arXiv preprint arXiv:0910.2832
[13] Dauwels, J., Korl, S., & Loeliger, H. (2005). Expectation maximization as message passing. In Proceedings. international symposium on information theory, 2005 (pp. 583-586).
[14] Dempster, A. P.; Laird, N. M.; Rubin, D. B., Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society: Series B (Methodological), 39, 1, 1-22 (1977) · Zbl 0364.62022
[15] Dhanachandra, N.; Manglem, K.; Chanu, Y. J., Image segmentation using k-means clustering algorithm and subtractive clustering algorithm, Procedia Computer Science, 54, 764-771 (2015)
[16] Eckford, A. W. (2004). Channel estimation in block fading channels using the factor graph EM algorithm. In 22nd biennial symposium on communications (pp. 1-3).
[17] Eckford, A.; Pasupathy, S., Iterative multiuser detection with graphical modeling, 2000 IEEE international conference on personal wireless communications. conference proceedings (Cat. No. 00TH8488), 454-458 (2000), IEEE
[18] Elder, J. H.; Goldberg, R. M., Ecological statistics of gestalt laws for the perceptual organization of contours, Journal of Vision, 2, 4, 5 (2002)
[19] Fowlkes, C. C.; Martin, D. R.; Malik, J., Local figure-ground cues are valid for natural images, Journal of Vision, 7, 8, 2 (2007)
[20] Frey, B. J., Extending factor graphs so as to unify directed and undirected graphical models, (Proceedings of the nineteenth conference on uncertainty in artificial intelligence, UAI’03 (2003), Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA), 257-264
[21] Gan, H.; Sang, N.; Huang, R., Manifold regularized semi-supervised gaussian mixture model, JOSA A, 32, 4, 566-575 (2015)
[22] Ghosh, S., Ungureanu, A. B., Sudderth, E. B., & Blei, D. M. (2011). Spatial distance dependent Chinese restaurant processes for image segmentation. In Advances in neural information processing systems (pp. 1476-1484).
[23] He, X.; Cai, D.; Shao, Y.; Bao, H.; Han, J., Laplacian regularized gaussian mixture model for data clustering, IEEE Transactions on Knowledge and Data Engineering, 23, 9, 1406-1418 (2010)
[24] Hubert, L.; Arabie, P., Comparing partitions, Journal of classification, 2, 1, 193-218 (1985)
[25] Hyvärinen, A.; Hurri, J.; Hoyer, P. O., Natural image statistics: a probabilistic approach to early computational vision (2009), Springer · Zbl 1178.68622
[26] Jing, Y.; Yang, Y.; Feng, Z.; Ye, J.; Yu, Y.; Song, M., Neural style transfer: a review, IEEE transactions on visualization and computer graphics, 26, 11, 3365-3385 (2019)
[27] Kass, M.; Witkin, A.; Terzopoulos, D., Snakes: active contour models, International journal of computer vision, 1, 4, 321-331 (1988)
[28] Kim, J.; Linsley, D.; Thakkar, K.; Serre, T., Disentangling neural mechanisms for perceptual grouping (2019), arXiv preprint arXiv:1906.01558
[29] Koller, D.; Friedman, N., Probabilistic graphical models: principles and techniques (2009), MIT press · Zbl 1183.68483
[30] Kreiman, G.; Serre, T., Beyond the feedforward sweep: feedback computations in the visual cortex, Annals of the New York Academy of Sciences, 1464, 1, 222-241 (2020)
[31] Kschischang, F.; Frey, B.; Loeliger, H.-A., Factor graphs and the sum-product algorithm, IEEE Transactions on Information Theory, 47, 2, 498-519 (2001) · Zbl 0998.68234
[32] Lee, T. S.; Mumford, D., Hierarchical bayesian inference in the visual cortex, JOSA A, 20, 7, 1434-1448 (2003)
[33] Linsley, D., Kim, J., Ashok, A., & Serre, T. (2019). Recurrent neural circuits for contour detection. In International conference on learning representations.
[34] Linsley, D., Kim, J., Veerabadran, V., Windolf, C., & Serre, T. (2018). Learning long-range spatial dependencies with horizontal gated recurrent units. In Advances in neural information processing systems (pp. 152-164).
[35] Liu, J., Cai, D., & He, X. (2010). Gaussian mixture model with local consistency. In Twenty-fourth AAAI conference on artificial intelligence.
[36] Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431-3440).
[37] Maninis, K.-K.; Pont-Tuset, J.; Arbeláez, P.; Van Gool, L., Convolutional oriented boundaries: from image segmentation to high-level tasks, IEEE transactions on pattern analysis and machine intelligence, 40, 4, 819-833 (2018)
[38] Martin, D. R.; Fowlkes, C. C.; Malik, J., Learning to detect natural image boundaries using local brightness, color, and texture cues, IEEE Transactions on Pattern Analysis & Machine Intelligence, 26, 5, 530-549 (2004)
[39] McLachlan, G. J.; Lee, S. X.; Rathnayake, S. I., Finite mixture models, Annual review of statistics and its application, 6, 355-378 (2019)
[40] Minaee, S.; Boykov, Y. Y.; Porikli, F.; Plaza, A. J.; Kehtarnavaz, N.; Terzopoulos, D., Image segmentation using deep learning: a survey, IEEE Transactions on Pattern Analysis and Machine Intelligence (2021)
[41] Murphy, K. P., Machine learning: A probabilistic perspective (2012), MIT Press · Zbl 1295.68003
[42] Neri, P., Object segmentation controls image reconstruction from natural scenes, PLoS biology, 15, 8, e1002611 (2017)
[43] Nikou, C.; Galatsanos, N. P.; Likas, A. C., A class-adaptive spatially variant mixture model for image segmentation, IEEE Transactions on Image Processing, 16, 4, 1121-1130 (2007)
[44] Nikou, C.; Likas, A. C.; Galatsanos, N. P., A bayesian framework for image segmentation with spatially varying mixtures, IEEE Transactions on Image Processing, 19, 9, 2278-2289 (2010) · Zbl 1371.94278
[45] Olshausen, B. A.; Field, D. J., Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Nature, 381, 6583, 607 (1996)
[46] Peterson, M. A.; Gibson, B. S., Object recognition contributions to figure-ground organization: operations on outlines and subjective contours, Perception & Psychophysics, 56, 5, 551-564 (1994)
[47] Pont-Tuset, J., & Marques, F. (2013). Measures and meta-measures for the supervised evaluation of image segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2131-2138).
[48] Ronneberger, O.; Fischer, P.; Brox, T., U-net: convolutional networks for biomedical image segmentation, (International conference on medical image computing and computer-assisted intervention (2015), Springer), 234-241
[49] Saarela, T. P.; Landy, M. S., Combination of texture and color cues in visual segmentation, Vision research, 58, 59-67 (2012)
[50] Sanchez-Giraldo, L. G.; Laskar, M. N.U.; Schwartz, O., Normalization and pooling in hierarchical models of natural images, Current opinion in neurobiology, 55, 65-72 (2019)
[51] Sfikas, G.; Nikou, C.; Galatsanos, N., Robust image segmentation with mixtures of student’s t-distributions, (2007 IEEE international conference on image processing, vol. 1 (2007), IEEE), I-273
[52] Sfikas, G.; Nikou, C.; Galatsanos, N., Edge preserving spatially varying mixtures for image segmentation, (2008 IEEE conference on computer vision and pattern recognition (2008), IEEE), 1-7
[53] Shi, T.; Horvath, S., Unsupervised learning with random forest predictors, Journal of Computational and Graphical Statistics, 15, 1, 118-138 (2006)
[54] Sigman, M.; Cecchi, G. A.; Gilbert, C. D.; Magnasco, M. O., On a common circle: natural scenes and gestalt rules, Proceedings of the National Academy of Sciences, 98, 4, 1935-1940 (2001)
[55] Simonyan, K.; Zisserman, A., Very deep convolutional networks for large-scale image recognition (2014), arXiv preprint arXiv:1409.1556
[56] Steinhaus, H., Sur la division des corps matériels en parties, Bulletin de l’Académie Polonaise des Sciences, Cl. III — Vol. IV, 12, 801-804 (1956) · Zbl 0079.16403
[57] Sun, S.; Paisley, J.; Liu, Q., Location dependent dirichlet processes, (International conference on intelligent science and big data engineering (2017), Springer), 64-76
[58] Thiagarajan, J. J.; Ramamurthy, K. N.; Spanias, A., Multiple kernel sparse representations for supervised and unsupervised learning, IEEE transactions on Image Processing, 23, 7, 2905-2915 (2014) · Zbl 1374.68426
[59] Vacher, J., Davila, A., Kohn, A., & Coen-Cagli, R. (2020). Texture Interpolation for Probing Visual Perception. In Advances in neural information processing systems, vol. 33.
[60] Vacher, J.; Mamassian, P.; Coen-Cagli, R., Probabilistic model of visual segmentation (2018)
[61] Wagemans, J.; Elder, J. H.; Kubovy, M.; Palmer, S. E.; Peterson, M. A.; Singh, M., A century of gestalt psychology in visual perception: I. perceptual grouping and figure-ground organization, Psychological bulletin, 138, 6, 1172 (2012)
[62] Wainwright, M. J., & Simoncelli, E. P. (2000). Scale mixtures of Gaussians and the statistics of natural images. In Advances in neural information processing systems (pp. 855-861).
[63] Wu, C. F.J., On the convergence properties of the EM algorithm, The Annals of Statistics, 11, 1, 95-103 (1983) · Zbl 0517.62035
[64] Yang, Y.; Zhang, F.; Zheng, C.; Lin, P., Unsupervised image segmentation using penalized fuzzy clustering algorithm, (International conference on intelligent data engineering and automated learning (2005), Springer), 71-77
[65] Ye, X.; Zhao, J.; Chen, Y., A nonparametric model for multi-manifold clustering with mixture of gaussians and graph consistency, Entropy, 20, 11, 830 (2018)
[66] Zhang, T.; Ramakrishnan, R.; Livny, M., BIRCH: an efficient data clustering method for very large databases, ACM sigmod record, 25, 2, 103-114 (1996)
[67] Zhao, S., Dong, Y., Chang, E. I., Xu, Y., et al. (2019). Recursive cascaded networks for unsupervised medical image registration. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 10600-10610).
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.