Abstract
Methods for out-of-distribution (OOD) detection that scale to 3D data are crucial components of any real-world clinical deep learning system. Classic denoising diffusion probabilistic models (DDPMs) have recently been proposed as a robust way to perform reconstruction-based OOD detection on 2D datasets, but do not trivially scale to 3D data. In this work, we propose to use Latent Diffusion Models (LDMs), which enable the scaling of DDPMs to high-resolution 3D medical data. We validate the proposed approach on near- and far-OOD datasets and compare it to a recently proposed, 3D-enabled approach using Latent Transformer Models (LTMs). Not only does the proposed LDM-based approach achieve statistically significantly better performance, it also shows less sensitivity to the underlying latent representation, scales more favourably in memory, and produces better spatial anomaly maps. Code is available at https://github.com/marksgraham/ddpm-ood.
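As a rough illustration of the general idea behind reconstruction-based OOD detection mentioned in the abstract, the following minimal sketch scores each volume by its reconstruction error and flags volumes whose error exceeds a threshold set on in-distribution data. It is not the authors' implementation: the `toy_reconstruct` function is a hypothetical stand-in for a trained generative model (such as an LDM), and the mean-squared-error score, the 95% quantile threshold, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np


def ood_scores(volumes, reconstruct):
    """Mean squared reconstruction error per volume; higher suggests OOD."""
    scores = []
    for vol in volumes:
        recon = reconstruct(vol)
        scores.append(float(np.mean((vol - recon) ** 2)))
    return np.array(scores)


def spatial_anomaly_map(volume, reconstruct):
    """Voxel-wise absolute residual between a volume and its reconstruction."""
    return np.abs(volume - reconstruct(volume))


def toy_reconstruct(vol):
    # Hypothetical stand-in for a trained generative model: it always
    # reconstructs to the in-distribution mean intensity (0 here).
    return np.zeros_like(vol)


rng = np.random.default_rng(0)
# Synthetic 3D volumes (assumed data): in-distribution centred at 0,
# far-OOD centred at 3.
in_dist = rng.normal(0.0, 0.1, size=(8, 4, 4, 4))
far_ood = rng.normal(3.0, 0.1, size=(8, 4, 4, 4))

in_scores = ood_scores(in_dist, toy_reconstruct)
far_scores = ood_scores(far_ood, toy_reconstruct)

# Threshold chosen from in-distribution scores (assumed 95% quantile).
threshold = np.quantile(in_scores, 0.95)
flagged = far_scores > threshold
```

In this toy setting the far-OOD volumes reconstruct poorly and are flagged, while the residual returned by `spatial_anomaly_map` gives a voxel-wise view of where the reconstruction disagrees with the input.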
Acknowledgements
MSG, WHLP, RG, PW, PN, SO, and MJC are supported by the Wellcome Trust (WT213038/Z/18/Z). MJC and SO are also supported by the Wellcome/EPSRC Centre for Medical Engineering (WT203148/Z/16/Z), and the InnovateUK-funded London AI centre for Value-based Healthcare. PTD is supported by the EPSRC (EP/R513064/1). YM is supported by an MRC Clinical Academic Research Partnership grant (MR/T005351/1). PN is also supported by the UCLH NIHR Biomedical Research Centre. Datasets CROMIS and KCH were used with ethics 20/ES/0005.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Graham, M.S. et al. (2023). Unsupervised 3D Out-of-Distribution Detection with Latent Diffusion Models. In: Greenspan, H., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2023. MICCAI 2023. Lecture Notes in Computer Science, vol 14220. Springer, Cham. https://doi.org/10.1007/978-3-031-43907-0_43
DOI: https://doi.org/10.1007/978-3-031-43907-0_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43906-3
Online ISBN: 978-3-031-43907-0
eBook Packages: Computer Science (R0)