Equivariance-based self-supervised learning for audio signal recovery from clipped measurements

Authors: Victor Sechaud, Laurent Jacques, Patrice Abry, Julián Tachella

Abstract: In numerous inverse problems, state-of-the-art solving strategies involve training neural networks from ground truth and associated measurement datasets that, however, may be expensive or impossible to collect. Recently, self-supervised learning techniques have emerged, with the major advantage of no longer requiring ground truth data. Most theoretical and experimental results on self-supervised learning focus on linear inverse problems. The present work aims to study self-supervised learning for the non-linear inverse problem of recovering audio signals from clipped measurements. An equivariance-based self-supervised loss is proposed and studied. Performance is assessed on simulated clipped measurements with controlled and varied levels of clipping, and further reported on standard real music signals. We show that the performance of the proposed equivariance-based self-supervised declipping strategy compares favorably to fully supervised learning while requiring only clipped measurements for training.

Submitted 3 September, 2024; originally announced September 2024.

Journal ref: EUSIPCO, Aug 2024, Lyon, France

arXiv:2409.05734 [pdf, other]

Structured Random Model for Fast and Robust Phase Retrieval

Authors: Zhiyuan Hu, Julián Tachella, Michael Unser, Jonathan Dong

Abstract: Phase retrieval, a nonlinear problem prevalent in imaging applications, has been extensively studied using random models, some of which have i.i.d. sensing matrix components. While these models offer robust reconstruction guarantees, they are computationally expensive and impractical for real-world scenarios. In contrast, Fourier-based models, common in applications such as ptychography and coded diffraction imaging, are computationally more efficient but lack the theoretical guarantees of random models. Here, we introduce structured random models for phase retrieval that combine the efficiency of fast Fourier transforms with the versatility of random diagonal matrices. These models emulate i.i.d. random matrices at a fraction of the computational cost. Our approach demonstrates robust reconstructions comparable to fully random models using gradient descent and spectral methods. Furthermore, we establish that a minimum of two structured layers is necessary to achieve these structured-random properties. The proposed method is suitable for optical implementation and offers an efficient and robust alternative for phase retrieval in practical imaging applications.

Submitted 9 September, 2024; originally announced September 2024.

arXiv:2409.01985 [pdf, other]

UNSURE: Unknown Noise level Stein's Unbiased Risk Estimator

Authors: Julián Tachella, Mike Davies, Laurent Jacques

Abstract: Recently, many self-supervised learning methods for image reconstruction have been proposed that can learn from noisy data alone, bypassing the need for ground-truth references. Most existing methods cluster around two classes: i) Noise2Self and similar cross-validation methods that require very mild knowledge about the noise distribution, and ii) Stein's Unbiased Risk Estimator (SURE) and similar approaches that assume full knowledge of the distribution. The first class of methods is often suboptimal compared to supervised learning, and the second class tends to be impractical, as the noise level is often unknown in real-world applications. In this paper, we provide a theoretical framework that characterizes this expressivity-robustness trade-off and propose a new approach based on SURE that, unlike the standard SURE, does not require knowledge of the noise level. Through a series of experiments, we show that the proposed estimator outperforms other existing self-supervised methods on various imaging inverse problems.

Submitted 30 September, 2024; v1 submitted 3 September, 2024; originally announced September 2024.

MSC Class: 68U10 ACM Class: I.4.5; I.2.10; G.3

arXiv:2312.11232 [pdf, other]

Self-Supervised Learning for Image Super-Resolution and Deblurring

Authors: Jérémy Scanvic, Mike Davies, Patrice Abry, Julián Tachella

Abstract: Self-supervised methods have recently proved to be nearly as effective as supervised methods in various imaging inverse problems, paving the way for learning-based methods in scientific and medical imaging applications where ground truth data is hard or expensive to obtain. This is the case in magnetic resonance imaging and computed tomography. These methods critically rely on invariance to translations and/or rotations of the image distribution to learn from incomplete measurement data alone. However, existing approaches fail to obtain competitive performances in the problems of image super-resolution and deblurring, which play a key role in most imaging systems. In this work, we show that invariance to translations and rotations is insufficient to learn from measurements that only contain low-frequency information. Instead, we propose a new self-supervised approach that leverages the fact that many image distributions are approximately scale-invariant, and that enables recovering high-frequency information lost in the measurement process. We demonstrate through a series of experiments on real datasets that the proposed method outperforms other self-supervised approaches, and obtains performances on par with fully supervised learning.

Submitted 19 March, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.01831 [pdf, other]

Equivariant plug-and-play image reconstruction

Authors: Matthieu Terris, Thomas Moreau, Nelly Pustelnik, Julian Tachella

Abstract: Plug-and-play algorithms constitute a popular framework for solving inverse imaging problems that rely on the implicit definition of an image prior via a denoiser. These algorithms can leverage powerful pre-trained denoisers to solve a wide range of imaging tasks, circumventing the necessity to train models on a per-task basis. Unfortunately, plug-and-play methods often show unstable behaviors, hampering their promise of versatility and leading to suboptimal quality of reconstructed images. In this work, we show that enforcing equivariance to certain groups of transformations (rotations, reflections, and/or translations) on the denoiser strongly improves the stability of the algorithm as well as its reconstruction quality. We provide a theoretical analysis that illustrates the role of equivariance on better performance and stability. We present a simple algorithm that enforces equivariance on any existing denoiser by simply applying a random transformation to the input of the denoiser and the inverse transformation to the output at each iteration of the algorithm. Experiments on multiple imaging modalities and denoising networks show that the equivariant plug-and-play algorithm improves both the reconstruction performance and the stability compared to their non-equivariant counterparts.

Submitted 23 May, 2024; v1 submitted 4 December, 2023; originally announced December 2023.
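
The randomization step this abstract describes (transform the denoiser input, then apply the inverse transform to its output) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the helper names `rot90`, `flip` and `equivariant_denoise`, the list-of-lists image representation, and the restriction to 90-degree rotations and horizontal flips are all hypothetical choices made for the example.

```python
import random

def rot90(img, k):
    """Rotate a 2D list-of-lists image counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        img = [list(row) for row in zip(*img)][::-1]
    return img

def flip(img):
    """Mirror the image horizontally (this operation is its own inverse)."""
    return [row[::-1] for row in img]

def equivariant_denoise(denoiser, img, rng=random):
    """Apply `denoiser` between a random transform g and its inverse.

    `denoiser` is any callable mapping an image to an image of the same shape
    (hypothetical interface, assumed for this sketch).
    """
    k = rng.randrange(4)           # random rotation index
    do_flip = rng.random() < 0.5   # random reflection

    # Forward transform g: rotate, then optionally flip.
    transformed = rot90(img, k)
    if do_flip:
        transformed = flip(transformed)

    out = denoiser(transformed)

    # Inverse transform g^{-1}: undo the flip first, then the rotation.
    if do_flip:
        out = flip(out)
    return rot90(out, -k)
```

In a plug-and-play loop, one would call `equivariant_denoise(denoiser, x)` wherever the plain `denoiser(x)` call appeared, drawing a fresh transformation at every iteration. A denoiser that is already equivariant to these transformations is left unchanged by the wrapper.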

arXiv:2310.11838 [pdf, other]

Equivariant Bootstrapping for Uncertainty Quantification in Imaging Inverse Problems

Authors: Julian Tachella, Marcelo Pereyra

Abstract: Scientific imaging problems are often severely ill-posed, and hence have significant intrinsic uncertainty. Accurately quantifying the uncertainty in the solutions to such problems is therefore critical for the rigorous interpretation of experimental results as well as for reliably using the reconstructed images as scientific evidence. Unfortunately, existing imaging methods are unable to quantify the uncertainty in the reconstructed images in a manner that is robust to experiment replications. This paper presents a new uncertainty quantification methodology based on an equivariant formulation of the parametric bootstrap algorithm that leverages symmetries and invariance properties commonly encountered in imaging problems. Additionally, the proposed methodology is general and can be easily applied with any image reconstruction technique, including unsupervised training strategies that can be trained from observed data alone, thus enabling uncertainty quantification in situations where there is no ground truth data available. We demonstrate the proposed approach with a series of numerical experiments and through comparisons with alternative uncertainty quantification strategies from the state-of-the-art, such as Bayesian strategies involving score-based diffusion models and Langevin samplers. In all our experiments, the proposed method delivers remarkably accurate high-dimensional confidence regions and outperforms the competing approaches in terms of estimation accuracy, uncertainty quantification accuracy, and computing time.

Submitted 20 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

MSC Class: 68T07 ACM Class: G.3

Journal ref: AISTATS 2024 (Oral presentation)

arXiv:2303.08691 [pdf, other]

Learning to Reconstruct Signals From Binary Measurements

Authors: Julián Tachella, Laurent Jacques

Abstract: Recent advances in unsupervised learning have highlighted the possibility of learning to reconstruct signals from noisy and incomplete linear measurements alone. These methods play a key role in medical and scientific imaging and sensing, where ground truth data is often scarce or difficult to obtain. However, in practice, measurements are not only noisy and incomplete but also quantized. Here we explore the extreme case of learning from binary observations and provide necessary and sufficient conditions on the number of measurements required for identifying a set of signals from incomplete binary data. Our results are complementary to existing bounds on signal recovery from binary measurements. Furthermore, we introduce a novel self-supervised learning approach, which we name SSBM, that only requires binary data for training. We demonstrate in a series of experiments with real datasets that SSBM performs on par with supervised learning and outperforms sparse reconstruction methods with a fixed wavelet basis by a large margin.

Submitted 16 November, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: https://openreview.net/forum?id=ioFIAQOBOS

MSC Class: 68U10 ACM Class: I.4.5; I.2.10; G.3

Journal ref: TMLR 2023 (Featured paper)

arXiv:2210.07314 [pdf, ps, other]

Spline Sketches: An Efficient Approach for Photon Counting Lidar

Authors: Michael Patrick Sheehan, Julian Tachella, Mike E. Davies

Abstract: Photon counting lidar has become an invaluable tool for 3D depth imaging due to the fine precision it can achieve over long ranges. However, high frame rate, high resolution lidar devices produce an enormous amount of time-of-flight (ToF) data which can cause a severe data processing bottleneck hindering the deployment of real-time systems. In this paper, an efficient photon counting approach is proposed that exploits the simplicity of piecewise polynomial splines to form a hardware-friendly compressed statistic, or a so-called spline sketch, of the ToF data without sacrificing the quality of the recovered image. As each piecewise polynomial spline is a simple function with limited support over the timing depth window, the spline sketch can be computed efficiently on-chip with minimal computational overhead. We show that a piecewise linear or quadratic spline sketch, requiring minimal on-chip arithmetic computation per photon detection, can reconstruct real-world depth images with negligible loss of resolution whilst achieving $95\%$ compression compared to the full ToF data, as well as offering multi-peak detection performance. These contrast with previously proposed coarse binning histograms that suffer from a highly nonuniform accuracy across depth and can fail catastrophically when associated with bright reflectors. Further, by building range-walk correction into the proposed estimation algorithms, it is demonstrated that the spline sketches can be made robust to photon pile-up effects. The computational complexity of both the reconstruction and range-walk correction algorithms scales only with the size of the spline sketch, which is independent of both the photon count and temporal resolution of the lidar device.

Submitted 29 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

Comments: 13 pages, 13 figures

arXiv:2209.01725 [pdf, other]

Imaging with Equivariant Deep Learning

Authors: Dongdong Chen, Mike Davies, Matthias J. Ehrhardt, Carola-Bibiane Schönlieb, Ferdia Sherry, Julián Tachella

Abstract: From early image processing to modern computational imaging, successful models and algorithms have relied on a fundamental property of natural signals: symmetry. Here symmetry refers to the invariance property of signal sets to transformations such as translation, rotation or scaling. Symmetry can also be incorporated into deep neural networks in the form of equivariance, allowing for more data-efficient learning. While there have been important advances in the design of end-to-end equivariant networks for image classification in recent years, computational imaging introduces unique challenges for equivariant network solutions since we typically only observe the image through some noisy ill-conditioned forward operator that itself may not be equivariant. We review the emerging field of equivariant imaging and show how it can provide improved generalization and new imaging opportunities. Along the way we show the interplay between the acquisition physics and group actions and links to iterative reconstruction, blind compressed sensing and self-supervised learning.

Submitted 4 September, 2022; originally announced September 2022.

Comments: To appear in IEEE Signal Processing Magazine

arXiv:2203.12513 [pdf, other]

Sensing Theorems for Unsupervised Learning in Linear Inverse Problems

Authors: Julián Tachella, Dongdong Chen, Mike Davies

Abstract: Solving an ill-posed linear inverse problem requires knowledge about the underlying signal model. In many applications, this model is a priori unknown and has to be learned from data. However, it is impossible to learn the model using observations obtained via a single incomplete measurement operator, as there is no information about the signal model in the nullspace of the operator, resulting in a chicken-and-egg problem: to learn the model we need reconstructed signals, but to reconstruct the signals we need to know the model. Two ways to overcome this limitation are using multiple measurement operators or assuming that the signal model is invariant to a certain group action. In this paper, we present necessary and sufficient sensing conditions for learning the signal model from measurement data alone which only depend on the dimension of the model and the number of operators or properties of the group action that the model is invariant to. As our results are agnostic of the learning algorithm, they shed light on the fundamental limitations of learning from incomplete data and have implications for a wide range of practical algorithms, such as dictionary learning, matrix completion and deep neural networks.

Submitted 11 October, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2201.12151

MSC Class: 68U10 ACM Class: I.4.5; I.2.10; G.3

Journal ref: JMLR 2023

arXiv:2203.00952 [pdf, other]

Sketched RT3D: How to reconstruct billions of photons per second

Authors: Julián Tachella, Michael P. Sheehan, Mike E. Davies

Abstract: Single-photon light detection and ranging (lidar) captures depth and intensity information of a 3D scene. Reconstructing a scene from observed photons is a challenging task due to spurious detections associated with background illumination sources. To tackle this problem, there is a plethora of 3D reconstruction algorithms which exploit spatial regularity of natural scenes to provide stable reconstructions. However, most existing algorithms have computational and memory complexity proportional to the number of recorded photons. This complexity hinders their real-time deployment on modern lidar arrays which acquire billions of photons per second. Leveraging a recent lidar sketching framework, we show that it is possible to modify existing reconstruction algorithms such that they only require a small sketch of the photon information. In particular, we propose a sketched version of a recent state-of-the-art algorithm which uses point cloud denoisers to provide spatially regularized reconstructions. A series of experiments performed on real lidar datasets demonstrates a significant reduction of execution time and memory requirements, while achieving the same reconstruction performance as in the full data case.

Submitted 2 March, 2022; originally announced March 2022.

Comments: Accepted at ICASSP 2022

arXiv:2201.12151 [pdf, other]

Unsupervised Learning From Incomplete Measurements for Inverse Problems

Authors: Julián Tachella, Dongdong Chen, Mike Davies

Abstract: In many real-world inverse problems, only incomplete measurement data are available for training which can pose a problem for learning a reconstruction function. Indeed, unsupervised learning using a fixed incomplete measurement process is impossible in general, as there is no information in the nullspace of the measurement operator. This limitation can be overcome by using measurements from multiple operators. While this idea has been successfully applied in various applications, a precise characterization of the conditions for learning is still lacking. In this paper, we fill this gap by presenting necessary and sufficient conditions for learning the underlying signal model needed for reconstruction which indicate the interplay between the number of distinct measurement operators, the number of measurements per operator, the dimension of the model and the dimension of the signals. Furthermore, we propose a novel and conceptually simple unsupervised learning loss which only requires access to incomplete measurement data and achieves a performance on par with supervised learning when the sufficient condition is verified. We validate our theoretical bounds and demonstrate the advantages of the proposed unsupervised loss compared to previous methods via a series of experiments on various imaging inverse problems, such as accelerated magnetic resonance imaging, compressed sensing and image inpainting.

Submitted 28 September, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

MSC Class: 68U10 ACM Class: I.4.5; I.2.10; G.3

Journal ref: NeurIPS 2022

arXiv:2111.12855 [pdf, other]

Robust Equivariant Imaging: a fully unsupervised framework for learning to image from noisy and partial measurements

Authors: Dongdong Chen, Julián Tachella, Mike E. Davies

Abstract: Deep networks provide state-of-the-art performance in multiple imaging inverse problems ranging from medical imaging to computational photography. However, most existing networks are trained with clean signals which are often hard or impossible to obtain. Equivariant imaging (EI) is a recent self-supervised learning framework that exploits the group invariance present in signal distributions to learn a reconstruction function from partial measurement data alone. While EI results are impressive, its performance degrades with increasing noise. In this paper, we propose a Robust Equivariant Imaging (REI) framework which can learn to image from noisy partial measurements alone. The proposed method uses Stein's Unbiased Risk Estimator (SURE) to obtain a fully unsupervised training loss that is robust to noise. We show that REI leads to considerable performance gains on linear and nonlinear inverse problems, thereby paving the way for robust unsupervised imaging with deep networks. Code is available at: https://github.com/edongdongchen/REI.

Submitted 15 March, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

Comments: CVPR 2022. Code: https://github.com/edongdongchen/REI

arXiv:2105.06920 [pdf, ps, other]

Surface Detection for Sketched Single Photon Lidar

Authors: Michael P. Sheehan, Julián Tachella, Mike E. Davies

Abstract: Single-photon lidar devices are able to collect an ever-increasing amount of time-stamped photons in small time periods due to increasingly larger arrays, generating a memory and computational bottleneck on the data processing side. Recently, a sketching technique was introduced to overcome this bottleneck which compresses the amount of information to be stored and processed. The size of the sketch scales with the number of underlying parameters of the time delay distribution and not, fundamentally, with either the number of detected photons or the time-stamp resolution. In this paper, we propose a detection algorithm based solely on a small sketch that determines if there are surfaces or objects in the scene or not. If a surface is detected, the depth and intensity of a single object can be computed in closed-form directly from the sketch. The computational load of the proposed detection algorithm depends solely on the size of the sketch, in contrast to previous algorithms that depend at least linearly on the number of collected photons or histogram bins, paving the way for fast, accurate and memory efficient lidar estimation. Our experiments demonstrate the memory and statistical efficiency of the proposed algorithm both on synthetic and real lidar datasets.

Submitted 14 May, 2021; originally announced May 2021.

Comments: 5 pages, Accepted at EUSIPCO 2021

arXiv:2103.14756 [pdf, other]

Equivariant Imaging: Learning Beyond the Range Space

Authors: Dongdong Chen, Julián Tachella, Mike E. Davies

Abstract: In various imaging problems, we only have access to compressed measurements of the underlying signals, hindering most learning-based strategies which usually require pairs of signals and associated measurements for training. Learning only from compressed measurements is impossible in general, as the compressed observations do not contain information outside the range of the forward sensing operator. We propose a new end-to-end self-supervised framework that overcomes this limitation by exploiting the equivariances present in natural signals. Our proposed learning strategy performs as well as fully supervised methods. Experiments demonstrate the potential of this framework on inverse problems including sparse-view X-ray computed tomography on real clinical data and image inpainting on natural images. Code has been made available at: https://github.com/edongdongchen/EI.

Submitted 23 August, 2021; v1 submitted 26 March, 2021; originally announced March 2021.

Comments: ICCV 2021. Code: https://github.com/edongdongchen/EI

arXiv:2102.08732 [pdf, ps, other]

doi 10.1109/TCI.2021.3113495

A Sketching Framework for Reduced Data Transfer in Photon Counting Lidar

Authors: Michael P. Sheehan, Julián Tachella, Mike E. Davies

Abstract: Single-photon lidar has become a prominent tool for depth imaging in recent years. At the core of the technique, the depth of a target is measured by constructing a histogram of time delays between emitted light pulses and detected photon arrivals. A major data processing bottleneck arises on the device when either the number of photons per pixel is large or the resolution of the time stamp is fine, as both the space requirement and the complexity of the image reconstruction algorithms scale with these parameters. We solve this limiting bottleneck of existing lidar techniques by sampling the characteristic function of the time of flight (ToF) model to build a compressive statistic, a so-called sketch of the time delay distribution, which is sufficient to infer the spatial distance and intensity of the object. The size of the sketch scales with the degrees of freedom of the ToF model (number of objects) and not, fundamentally, with the number of photons or the time stamp resolution. Moreover, the sketch is highly amenable to on-chip online processing. We show theoretically that the loss of information for compression is controlled and the mean squared error of the inference quickly converges towards the optimal Cramér-Rao bound (i.e. no loss of information) for modest sketch sizes. The proposed compressed single-photon lidar framework is tested and evaluated on real-life datasets of complex scenes where it is shown that a compression rate of up to 150 is achievable in practice without sacrificing the overall resolution of the reconstructed image.

Submitted 5 January, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

Comments: 16 pages, 20 figures. Figure 8 Corrected. Accepted at IEEE TCI

Journal ref: IEEE Transactions on Computational Imaging, Volume 7, 2021, Pages 989 - 1004
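
To make the idea of "sampling the characteristic function" concrete, the following minimal sketch computes an empirical characteristic function of photon time stamps at a few frequencies and reads a single depth off the phase of the first sample. It is an illustration under assumptions that are not taken from the paper: the function names `sketch` and `depth_from_sketch`, the choice of frequencies as integer multiples of 2π divided by the acquisition window `period`, and the single-surface model with a symmetric pulse and no background are all hypothetical simplifications.

```python
import cmath
import math

def sketch(timestamps, period, m):
    """Empirical characteristic function at the first m harmonics of 2*pi/period.

    Returns m complex numbers regardless of how many time stamps are given,
    and can be accumulated one photon at a time.
    """
    n = len(timestamps)
    return [
        sum(cmath.exp(2j * math.pi * k * t / period) for t in timestamps) / n
        for k in range(1, m + 1)
    ]

def depth_from_sketch(z, period):
    """Single-surface time-delay estimate from the phase of the first harmonic.

    Assumes (for this sketch only) one surface and a symmetric pulse shape,
    so the pulse only rescales the first harmonic and leaves its phase intact.
    """
    phase = cmath.phase(z[0]) % (2 * math.pi)  # map the phase into [0, 2*pi)
    return phase * period / (2 * math.pi)
```

For example, `depth_from_sketch(sketch(timestamps, period, 3), period)` returns a time delay in the same units as `timestamps`. The stored statistic has `m` complex entries however many photons were detected, which is the scaling property the abstract emphasizes.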

arXiv:2006.02379 [pdf, other]

The Neural Tangent Link Between CNN Denoisers and Non-Local Filters

Authors: Julián Tachella, Junqi Tang, Mike Davies

Abstract: Convolutional Neural Networks (CNNs) are now a well-established tool for solving computational imaging problems. Modern CNN-based algorithms obtain state-of-the-art performance in diverse image restoration problems. Furthermore, it has been recently shown that, despite being highly overparameterized, networks trained with a single corrupted image can still perform as well as fully trained networks. We introduce a formal link between such networks through their neural tangent kernel (NTK), and well-known non-local filtering techniques, such as non-local means or BM3D. The filtering function associated with a given network architecture can be obtained in closed form without need to train the network, being fully characterized by the random initialization of the network weights. While the NTK theory accurately predicts the filter associated with networks trained using standard gradient descent, our analysis shows that it falls short of explaining the behaviour of networks trained using the popular Adam optimizer. The latter achieves a larger change of weights in hidden layers, adapting the non-local filtering function during training. We evaluate our findings via extensive image denoising experiments.

Submitted 16 November, 2020; v1 submitted 3 June, 2020; originally announced June 2020.

MSC Class: 68T07

Journal ref: CVPR 2021 (Oral presentation)

arXiv:2004.09211 [pdf, other]

doi 10.1109/TIP.2020.3046882

Robust 3D reconstruction of dynamic scenes from single-photon lidar using Beta-divergences

Authors: Quentin Legros, Julian Tachella, Rachael Tobin, Aongus McCarthy, Sylvain Meignen, Gerald S. Buller, Yoann Altmann, Stephen McLaughlin, Michael E. Davies

Abstract: In this paper, we present a new algorithm for fast, online 3D reconstruction of dynamic scenes using times of arrival of photons recorded by single-photon detector arrays. One of the main challenges in 3D imaging using single-photon lidar in practical applications is the presence of strong ambient illumination which corrupts the data and can jeopardize the detection of peaks/surface in the signals. This background noise not only complicates the observation model classically used for 3D reconstruction but also the estimation procedure which requires iterative methods. In this work, we consider a new similarity measure for robust depth estimation, which allows us to use a simple observation model and a non-iterative estimation procedure while being robust to mis-specification of the background illumination model. This choice leads to a computationally attractive depth estimation procedure without significant degradation of the reconstruction performance. This new depth estimation procedure is coupled with a spatio-temporal model to capture the natural correlation between neighboring pixels and successive frames for dynamic scene analysis. The resulting online inference process is scalable and well suited for parallel implementation. The benefits of the proposed method are demonstrated through a series of experiments conducted with simulated and real single-photon lidar videos, allowing the analysis of dynamic scenes at 325 m observed under extreme ambient illumination conditions.

Submitted 18 December, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: 12 pages

arXiv:2002.07118 [pdf, other]

doi 10.1038/s41467-020-19727-4

Seeing Around Corners with Edge-Resolved Transient Imaging

Authors: Joshua Rapp, Charles Saunders, Julián Tachella, John Murray-Bruce, Yoann Altmann, Jean-Yves Tourneret, Stephen McLaughlin, Robin M. A. Dawson, Franco N. C. Wong, Vivek K Goyal

Abstract: Non-line-of-sight (NLOS) imaging is a rapidly growing field seeking to form images of objects outside the field of view, with potential applications in search and rescue, reconnaissance, and even medical imaging. The critical challenge of NLOS imaging is that diffuse reflections scatter light in all directions, resulting in weak signals and a loss of directional information. To address this problem, we propose a method for seeing around corners that derives angular resolution from vertical edges and longitudinal resolution from the temporal response to a pulsed light source. We introduce an acquisition strategy, scene response model, and reconstruction algorithm that enable the formation of 2.5-dimensional representations -- a plan view plus heights -- and a 180$^{\circ}$ field of view (FOV) for large-scale scenes. Our experiments demonstrate accurate reconstructions of hidden rooms up to 3 meters in each dimension.

Submitted 17 February, 2020; originally announced February 2020.

Comments: Includes manuscript (14 pages) and supplement (24 pages)

arXiv:1905.06700 [pdf, other]

doi 10.1038/s41467-019-12943-7

Real-time 3D reconstruction from single-photon lidar data using plug-and-play point cloud denoisers

Authors: Julián Tachella, Yoann Altmann, Nicolas Mellado, Aongus McCarthy, Rachael Tobin, Gerald S. Buller, Jean-Yves Tourneret, Stephen McLaughlin

Abstract: Single-photon lidar has emerged as a prime candidate technology for depth imaging through challenging environments. Until now, a major limitation has been the significant amount of time required for the analysis of the recorded data. Here we show a new computational framework for real-time three-dimensional (3D) scene reconstruction from single-photon data. By combining statistical models with highly scalable computational tools from the computer graphics community, we demonstrate 3D reconstruction of complex outdoor scenes with processing times of the order of 20 ms, where the lidar data was acquired in broad daylight from distances up to 320 metres. The proposed method can handle an unknown number of surfaces in each pixel, allowing for target detection and imaging through cluttered scenes. This enables robust, real-time target reconstruction of complex moving scenes, paving the way for single-photon lidar at video rates for practical 3D imaging applications.

Submitted 4 October, 2019; v1 submitted 16 May, 2019; originally announced May 2019.

arXiv:1904.02583 [pdf, other]

Bayesian 3D Reconstruction of Subsampled Multispectral Single-photon Lidar Signals

Authors: Julián Tachella, Yoann Altmann, Miguel Márquez, Henry Arguello-Fuentes, Jean-Yves Tourneret, Stephen McLaughlin

Abstract: Light detection and ranging (Lidar) single-photon devices capture range and intensity information from a 3D scene. This modality enables long range 3D reconstruction with high range precision and low laser power. A multispectral single-photon Lidar system provides additional spectral diversity, allowing the discrimination of different materials. However, the main drawback of such systems can be the long acquisition time needed to collect enough photons in each spectral band. In this work, we tackle this problem in two ways: first, we propose a Bayesian 3D reconstruction algorithm that is able to find multiple surfaces per pixel, using few photons, i.e., shorter acquisitions. In contrast to previous algorithms, the novel method jointly processes all the spectral bands, obtaining better reconstructions using fewer photon detections. The proposed model promotes spatial correlation between neighbouring points within a given surface using spatial point processes. Secondly, we account for different spatial and spectral subsampling schemes, which reduce the total number of measurements, without significant degradation of the reconstruction performance. In this way, the total acquisition time, memory requirements and computational time can be significantly reduced. The experiments performed using both synthetic and real single-photon Lidar data demonstrate the advantages of tailored sampling schemes over random alternatives. Furthermore, the proposed algorithm yields better estimates than other existing methods for multi-surface reconstruction using multispectral Lidar data.

Submitted 19 September, 2019; v1 submitted 4 April, 2019; originally announced April 2019.

Comments: code: https://gitlab.com/Tachella/musapop

arXiv:1810.11633 [pdf, other]

doi 10.1137/18M1183972

Bayesian 3D Reconstruction of Complex Scenes from Single-Photon Lidar Data

Authors: Julián Tachella, Yoann Altmann, Ximing Ren, Aongus McCarthy, Gerald S. Buller, Jean-Yves Tourneret, Steve McLaughlin

Abstract: Light detection and ranging (Lidar) data can be used to capture the depth and intensity profile of a 3D scene. This modality relies on constructing, for each pixel, a histogram of time delays between emitted light pulses and detected photon arrivals. In a general setting, more than one surface can be observed in a single pixel. The problem of estimating the number of surfaces, their reflectivity and position becomes very challenging in the low-photon regime (which equates to short acquisition times) or relatively high background levels (i.e., strong ambient illumination). This paper presents a new approach to 3D reconstruction using single-photon, single-wavelength Lidar data, which is capable of identifying multiple surfaces in each pixel. Adopting a Bayesian approach, the 3D structure to be recovered is modelled as a marked point process and reversible jump Markov chain Monte Carlo (RJ-MCMC) moves are proposed to sample the posterior distribution of interest. In order to promote spatial correlation between points belonging to the same surface, we propose a prior that combines an area interaction process and a Strauss process. New RJ-MCMC dilation and erosion updates are presented to achieve an efficient exploration of the configuration space. To further reduce the computational load, we adopt a multiresolution approach, processing the data from a coarse to the finest scale. The experiments performed with synthetic and real data show that the algorithm obtains better reconstructions than other recently published optimization algorithms for lower execution times.

Submitted 27 October, 2018; originally announced October 2018.

Journal ref: SIAM Journal on Imaging Sciences 2019 12:1, 521-550
