subscribe to arXiv mailings

GStex: Per-Primitive Texturing of 2D Gaussian Splatting for Decoupled Appearance and Geometry Modeling

Authors: Victor Rong, Jingxiang Chen, Sherwin Bahmani, Kiriakos N. Kutulakos, David B. Lindell

Abstract: Gaussian splatting has demonstrated excellent performance for view synthesis and scene reconstruction. The representation achieves photorealistic quality by optimizing the position, scale, color, and opacity of thousands to millions of 2D or 3D Gaussian primitives within a scene. However, since each Gaussian primitive encodes both appearance and geometry, these attributes are strongly coupled--thu… ▽ More Gaussian splatting has demonstrated excellent performance for view synthesis and scene reconstruction. The representation achieves photorealistic quality by optimizing the position, scale, color, and opacity of thousands to millions of 2D or 3D Gaussian primitives within a scene. However, since each Gaussian primitive encodes both appearance and geometry, these attributes are strongly coupled--thus, high-fidelity appearance modeling requires a large number of Gaussian primitives, even when the scene geometry is simple (e.g., for a textured planar surface). We propose to texture each 2D Gaussian primitive so that even a single Gaussian can be used to capture appearance details. By employing per-primitive texturing, our appearance representation is agnostic to the topology and complexity of the scene's geometry. We show that our approach, GStex, yields improved visual quality over prior work in texturing Gaussian splats. Furthermore, we demonstrate that our decoupling enables improved novel view synthesis performance compared to 2D Gaussian splatting when reducing the number of Gaussian primitives, and that GStex can be used for scene appearance editing and re-texturing. △ Less

Submitted 19 September, 2024; originally announced September 2024.

Comments: Project page: https://lessvrong.com/cs/gstex

ACM Class: I.3; I.4

arXiv:2407.12781 [pdf, other]

VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control

Authors: Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin, Willi Menapace, Guocheng Qian, Michael Vasilkovsky, Hsin-Ying Lee, Chaoyang Wang, Jiaxu Zou, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov

Abstract: Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of complex videos from a text description. However, most existing models lack fine-grained control over camera movement, which is critical for downstream applications related to content creation, visual effects, and 3D vision. Recently, new methods demonstrate the ability to generate videos with controllable came… ▽ More Modern text-to-video synthesis models demonstrate coherent, photorealistic generation of complex videos from a text description. However, most existing models lack fine-grained control over camera movement, which is critical for downstream applications related to content creation, visual effects, and 3D vision. Recently, new methods demonstrate the ability to generate videos with controllable camera poses these techniques leverage pre-trained U-Net-based diffusion models that explicitly disentangle spatial and temporal generation. Still, no existing approach enables camera control for new, transformer-based video diffusion models that process spatial and temporal information jointly. Here, we propose to tame video transformers for 3D camera control using a ControlNet-like conditioning mechanism that incorporates spatiotemporal camera embeddings based on Plucker coordinates. The approach demonstrates state-of-the-art performance for controllable video generation after fine-tuning on the RealEstate10K dataset. To the best of our knowledge, our work is the first to enable camera control for transformer-based video diffusion models. △ Less

Submitted 20 July, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

Comments: Project Page: https://snap-research.github.io/vd3d/

arXiv:2403.17920 [pdf, other]

TC4D: Trajectory-Conditioned Text-to-4D Generation

Authors: Sherwin Bahmani, Xian Liu, Wang Yifan, Ivan Skorokhodov, Victor Rong, Ziwei Liu, Xihui Liu, Jeong Joon Park, Sergey Tulyakov, Gordon Wetzstein, Andrea Tagliasacchi, David B. Lindell

Abstract: Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using supervision from pre-trained text-to-video models. However, existing representations for motion, such as deformation models or time-dependent neural representations, are limited in the amount of motion they can generate-they cannot synthesize motion extending far beyond the bounding box used for volume rendering. The la… ▽ More Recent techniques for text-to-4D generation synthesize dynamic 3D scenes using supervision from pre-trained text-to-video models. However, existing representations for motion, such as deformation models or time-dependent neural representations, are limited in the amount of motion they can generate-they cannot synthesize motion extending far beyond the bounding box used for volume rendering. The lack of a more flexible motion model contributes to the gap in realism between 4D generation methods and recent, near-photorealistic video generation models. Here, we propose TC4D: trajectory-conditioned text-to-4D generation, which factors motion into global and local components. We represent the global motion of a scene's bounding box using rigid transformation along a trajectory parameterized by a spline. We learn local deformations that conform to the global trajectory using supervision from a text-to-video model. Our approach enables the synthesis of scenes animated along arbitrary trajectories, compositional scene generation, and significant improvements to the realism and amount of generated motion, which we evaluate qualitatively and through a user study. Video results can be viewed on our website: https://sherwinbahmani.github.io/tc4d. △ Less

Submitted 14 October, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

Comments: ECCV 2024; Project Page: https://sherwinbahmani.github.io/tc4d

arXiv:2311.17984 [pdf, other]

4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling

Authors: Sherwin Bahmani, Ivan Skorokhodov, Victor Rong, Gordon Wetzstein, Leonidas Guibas, Peter Wonka, Sergey Tulyakov, Jeong Joon Park, Andrea Tagliasacchi, David B. Lindell

Abstract: Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and text-to-video models to generate dynamic 3D scenes. However, current text-to-4D methods face a three-way tradeoff between the quality of scene appearance, 3D structure, and motion. For example, text-to-image models and their 3D-aware variants are trained on internet-scale image datasets and can be used to produce s… ▽ More Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and text-to-video models to generate dynamic 3D scenes. However, current text-to-4D methods face a three-way tradeoff between the quality of scene appearance, 3D structure, and motion. For example, text-to-image models and their 3D-aware variants are trained on internet-scale image datasets and can be used to produce scenes with realistic appearance and 3D structure -- but no motion. Text-to-video models are trained on relatively smaller video datasets and can produce scenes with motion, but poorer appearance and 3D structure. While these models have complementary strengths, they also have opposing weaknesses, making it difficult to combine them in a way that alleviates this three-way tradeoff. Here, we introduce hybrid score distillation sampling, an alternating optimization procedure that blends supervision signals from multiple pre-trained diffusion models and incorporates benefits of each for high-fidelity text-to-4D generation. Using hybrid SDS, we demonstrate synthesis of 4D scenes with compelling appearance, 3D structure, and motion. △ Less

Submitted 26 May, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

Comments: CVPR 2024; Project page: https://sherwinbahmani.github.io/4dfy

arXiv:2303.12074 [pdf, other]

CC3D: Layout-Conditioned Generation of Compositional 3D Scenes

Authors: Sherwin Bahmani, Jeong Joon Park, Despoina Paschalidou, Xingguang Yan, Gordon Wetzstein, Leonidas Guibas, Andrea Tagliasacchi

Abstract: In this work, we introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts, trained using single-view images. Different from most existing 3D GANs that limit their applicability to aligned single objects, we focus on generating complex scenes with multiple objects, by modeling the compositional nature of 3D scenes. By devising a 2D l… ▽ More In this work, we introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts, trained using single-view images. Different from most existing 3D GANs that limit their applicability to aligned single objects, we focus on generating complex scenes with multiple objects, by modeling the compositional nature of 3D scenes. By devising a 2D layout-based approach for 3D synthesis and implementing a new 3D field representation with a stronger geometric inductive bias, we have created a 3D GAN that is both efficient and of high quality, while allowing for a more controllable generation process. Our evaluations on synthetic 3D-FRONT and real-world KITTI-360 datasets demonstrate that our model generates scenes of improved visual and geometric quality in comparison to previous works. △ Less

Submitted 8 September, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

Comments: ICCV 2023; Webpage: https://sherwinbahmani.github.io/cc3d/

arXiv:2208.05788 [pdf, other]

Semantic Self-adaptation: Enhancing Generalization with a Single Sample

Authors: Sherwin Bahmani, Oliver Hahn, Eduard Zamfir, Nikita Araslanov, Daniel Cremers, Stefan Roth

Abstract: The lack of out-of-domain generalization is a critical weakness of deep networks for semantic segmentation. Previous studies relied on the assumption of a static model, i. e., once the training process is complete, model parameters remain fixed at test time. In this work, we challenge this premise with a self-adaptive approach for semantic segmentation that adjusts the inference process to each in… ▽ More The lack of out-of-domain generalization is a critical weakness of deep networks for semantic segmentation. Previous studies relied on the assumption of a static model, i. e., once the training process is complete, model parameters remain fixed at test time. In this work, we challenge this premise with a self-adaptive approach for semantic segmentation that adjusts the inference process to each input sample. Self-adaptation operates on two levels. First, it fine-tunes the parameters of convolutional layers to the input image using consistency regularization. Second, in Batch Normalization layers, self-adaptation interpolates between the training and the reference distribution derived from a single test sample. Despite both techniques being well known in the literature, their combination sets new state-of-the-art accuracy on synthetic-to-real generalization benchmarks. Our empirical study suggests that self-adaptation may complement the established practice of model regularization at training time for improving deep network generalization to out-of-domain data. Our code and pre-trained models are available at https://github.com/visinf/self-adaptive. △ Less

Submitted 13 December, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

Comments: Published in TMLR (July 2023) | OpenReview: https://openreview.net/forum?id=ILNqQhGbLx | Code: https://github.com/visinf/self-adaptive | Video: https://youtu.be/s4DG65ic0EA

Report number: 2835-8856

Journal ref: Transactions on Machine Learning Research (TMLR) 2023

arXiv:2206.14797 [pdf, other]

3D-Aware Video Generation

Authors: Sherwin Bahmani, Jeong Joon Park, Despoina Paschalidou, Hao Tang, Gordon Wetzstein, Leonidas Guibas, Luc Van Gool, Radu Timofte

Abstract: Generative models have emerged as an essential building block for many image synthesis and editing tasks. Recent advances in this field have also enabled high-quality 3D or video content to be generated that exhibits either multi-view or temporal consistency. With our work, we explore 4D generative adversarial networks (GANs) that learn unconditional generation of 3D-aware videos. By combining neu… ▽ More Generative models have emerged as an essential building block for many image synthesis and editing tasks. Recent advances in this field have also enabled high-quality 3D or video content to be generated that exhibits either multi-view or temporal consistency. With our work, we explore 4D generative adversarial networks (GANs) that learn unconditional generation of 3D-aware videos. By combining neural implicit representations with time-aware discriminator, we develop a GAN framework that synthesizes 3D video supervised only with monocular videos. We show that our method learns a rich embedding of decomposable 3D structures and motions that enables new visual effects of spatio-temporal renderings while producing imagery with quality comparable to that of existing 3D or video GANs. △ Less

Submitted 9 August, 2023; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: TMLR 2023; Project page: https://sherwinbahmani.github.io/3dvidgen

arXiv:2111.14820 [pdf, other]

Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective

Authors: Yuejiang Liu, Riccardo Cadei, Jonas Schweizer, Sherwin Bahmani, Alexandre Alahi

Abstract: Learning behavioral patterns from observational data has been a de-facto approach to motion forecasting. Yet, the current paradigm suffers from two shortcomings: brittle under distribution shifts and inefficient for knowledge transfer. In this work, we propose to address these challenges from a causal representation perspective. We first introduce a causal formalism of motion forecasting, which ca… ▽ More Learning behavioral patterns from observational data has been a de-facto approach to motion forecasting. Yet, the current paradigm suffers from two shortcomings: brittle under distribution shifts and inefficient for knowledge transfer. In this work, we propose to address these challenges from a causal representation perspective. We first introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables, namely invariant variables, style confounders, and spurious features. We then introduce a learning framework that treats each group separately: (i) unlike the common practice mixing datasets collected from different locations, we exploit their subtle distinctions by means of an invariance loss encouraging the model to suppress spurious correlations; (ii) we devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a sparse causal graph; (iii) we introduce a style contrastive loss that not only enforces the structure of style representations but also serves as a self-supervisory signal for test-time refinement on the fly. Experiments on synthetic and real datasets show that our proposed method improves the robustness and reusability of learned motion representations, significantly outperforming prior state-of-the-art motion forecasting models for out-of-distribution generalization and low-shot transfer. △ Less

Submitted 5 April, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: CVPR 2022. Code is available at https://github.com/vita-epfl/causalmotion. v4: fixed typo

arXiv:2110.15283 [pdf, other]

Decentralized Feature-Distributed Optimization for Generalized Linear Models

Authors: Brighton Ancelin, Sohail Bahmani, Justin Romberg

Abstract: We consider the "all-for-one" decentralized learning problem for generalized linear models. The features of each sample are partitioned among several collaborating agents in a connected network, but only one agent observes the response variables. To solve the regularized empirical risk minimization in this distributed setting, we apply the Chambolle--Pock primal--dual algorithm to an equivalent sa… ▽ More We consider the "all-for-one" decentralized learning problem for generalized linear models. The features of each sample are partitioned among several collaborating agents in a connected network, but only one agent observes the response variables. To solve the regularized empirical risk minimization in this distributed setting, we apply the Chambolle--Pock primal--dual algorithm to an equivalent saddle-point formulation of the problem. The primal and dual iterations are either in closed-form or reduce to coordinate-wise minimization of scalar convex functions. We establish convergence rates for the empirical risk minimization under two different assumptions on the loss function (Lipschitz and square root Lipschitz), and show how they depend on the characteristics of the design matrix and the Laplacian of the network. △ Less

Submitted 28 October, 2021; originally announced October 2021.

arXiv:2103.07020 [pdf, ps, other]

Max-Linear Regression by Convex Programming

Authors: Seonho Kim, Sohail Bahmani, Kiryung Lee

Abstract: We consider the multivariate max-linear regression problem where the model parameters $\boldsymbolβ_{1},\dotsc,\boldsymbolβ_{k}\in\mathbb{R}^{p}$ need to be estimated from $n$ independent samples of the (noisy) observations $y = \max_{1\leq j \leq k} \boldsymbolβ_{j}^{\mathsf{T}} \boldsymbol{x} + \mathrm{noise}$. The max-linear model vastly generalizes the conventional linear model, and it can app… ▽ More We consider the multivariate max-linear regression problem where the model parameters $\boldsymbolβ_{1},\dotsc,\boldsymbolβ_{k}\in\mathbb{R}^{p}$ need to be estimated from $n$ independent samples of the (noisy) observations $y = \max_{1\leq j \leq k} \boldsymbolβ_{j}^{\mathsf{T}} \boldsymbol{x} + \mathrm{noise}$. The max-linear model vastly generalizes the conventional linear model, and it can approximate any convex function to an arbitrary accuracy when the number of linear models $k$ is large enough. However, the inherent nonlinearity of the max-linear model renders the estimation of the regression parameters computationally challenging. Particularly, no estimator based on convex programming is known in the literature. We formulate and analyze a scalable convex program given by anchored regression (AR) as the estimator for the max-linear regression problem. Under the standard Gaussian observation setting, we present a non-asymptotic performance guarantee showing that the convex program recovers the parameters with high probability. When the $k$ linear components are equally likely to achieve the maximum, our result shows a sufficient number of noise-free observations for exact recovery scales as {$k^{4}p$} up to a logarithmic factor. { This sample complexity coincides with that by alternating minimization (Ghosh et al., {2021}). Moreover, the same sample complexity applies when the observations are corrupted with arbitrary deterministic noise. We provide empirical results that show that our method performs as our theoretical result predicts, and is competitive with the alternating minimization algorithm particularly in presence of multiplicative Bernoulli noise. Furthermore, we also show empirically that a recursive application of AR can significantly improve the estimation accuracy.} △ Less

Submitted 23 February, 2024; v1 submitted 11 March, 2021; originally announced March 2021.

arXiv:2004.02718 [pdf, other]

Low-Rank Matrix Estimation From Rank-One Projections by Unlifted Convex Optimization

Authors: Sohail Bahmani, Kiryung Lee

Abstract: We study an estimator with a convex formulation for recovery of low-rank matrices from rank-one projections. Using initial estimates of the factors of the target $d_1\times d_2$ matrix of rank-$r$, the estimator admits a practical subgradient method operating in a space of dimension $r(d_1+d_2)$. This property makes the estimator significantly more scalable than the convex estimators based on lift… ▽ More We study an estimator with a convex formulation for recovery of low-rank matrices from rank-one projections. Using initial estimates of the factors of the target $d_1\times d_2$ matrix of rank-$r$, the estimator admits a practical subgradient method operating in a space of dimension $r(d_1+d_2)$. This property makes the estimator significantly more scalable than the convex estimators based on lifting and semidefinite programming. Furthermore, we present a streamlined analysis for exact recovery under the real Gaussian measurement model, as well as the partially derandomized measurement model by using the spherical $t$-design. We show that under both models the estimator succeeds, with high probability, if the number of measurements exceeds $r^2 (d_1+d_2)$ up to some logarithmic factors. This sample complexity improves on the existing results for nonconvex iterative algorithms. △ Less

Submitted 10 January, 2021; v1 submitted 6 April, 2020; originally announced April 2020.

arXiv:1910.11477 [pdf, other]

Phase Retrieval of Low-Rank Matrices by Anchored Regression

Authors: Kiryung Lee, Sohail Bahmani, Yonina Eldar, Justin Romberg

Abstract: We study the low-rank phase retrieval problem, where we try to recover a $d_1\times d_2$ low-rank matrix from a series of phaseless linear measurements. This is a fourth-order inverse problem, as we are trying to recover factors of matrix that have been put through a quadratic nonlinearity after being multiplied together. We propose a solution to this problem using the recently introduced techni… ▽ More We study the low-rank phase retrieval problem, where we try to recover a $d_1\times d_2$ low-rank matrix from a series of phaseless linear measurements. This is a fourth-order inverse problem, as we are trying to recover factors of matrix that have been put through a quadratic nonlinearity after being multiplied together. We propose a solution to this problem using the recently introduced technique of anchored regression. This approach uses two different types of convex relaxations: we replace the quadratic equality constraints for the phaseless measurements by a search over a polytope, and enforce the rank constraint through nuclear norm regularization. The result is a convex program that works in the space of $d_1 \times d_2$ matrices. We analyze two specific scenarios. In the first, the target matrix is rank-$1$, and the observations are structured to correspond to a phaseless blind deconvolution. In the second, the target matrix has general rank, and we observe the magnitudes of the inner products against a series of independent Gaussian random matrices. In each of these problems, we show that the anchored regression returns an accurate estimate from a near-optimal number of measurements given that we have access to an anchor matrix of sufficient quality. We also show how to create such an anchor in the phaseless blind deconvolution problem, again from an optimal number of measurements, and present a partial result in this direction for the general rank problem. △ Less

Submitted 3 July, 2020; v1 submitted 24 October, 2019; originally announced October 2019.

arXiv:1908.09915 [pdf, other]

Convex Programming for Estimation in Nonlinear Recurrent Models

Authors: Sohail Bahmani, Justin Romberg

Abstract: We propose a formulation for nonlinear recurrent models that includes simple parametric models of recurrent neural networks as a special case. The proposed formulation leads to a natural estimator in the form of a convex program. We provide a sample complexity for this estimator in the case of stable dynamics, where the nonlinear recursion has a certain contraction property, and under certain regu… ▽ More We propose a formulation for nonlinear recurrent models that includes simple parametric models of recurrent neural networks as a special case. The proposed formulation leads to a natural estimator in the form of a convex program. We provide a sample complexity for this estimator in the case of stable dynamics, where the nonlinear recursion has a certain contraction property, and under certain regularity conditions on the input distribution. We evaluate the performance of the estimator by simulation on synthetic data. These numerical experiments also suggest the extent at which the imposed theoretical assumptions may be relaxed. △ Less

Submitted 26 August, 2019; originally announced August 2019.

Comments: 18 pages

MSC Class: 90C25; 93E35

arXiv:1806.07307 [pdf, other]

Estimation from Non-Linear Observations via Convex Programming with Application to Bilinear Regression

Authors: Sohail Bahmani

Abstract: We propose a computationally efficient estimator, formulated as a convex program, for a broad class of non-linear regression problems that involve difference of convex (DC) non-linearities. The proposed method can be viewed as a significant extension of the "anchored regression" method formulated and analyzed in [10] for regression with convex non-linearities. Our main assumption, in addition to o… ▽ More We propose a computationally efficient estimator, formulated as a convex program, for a broad class of non-linear regression problems that involve difference of convex (DC) non-linearities. The proposed method can be viewed as a significant extension of the "anchored regression" method formulated and analyzed in [10] for regression with convex non-linearities. Our main assumption, in addition to other mild statistical and computational assumptions, is availability of a certain approximation oracle for the average of the gradients of the observation functions at a ground truth. Under this assumption and using a PAC-Bayesian analysis we show that the proposed estimator produces an accurate estimate with high probability. As a concrete example, we study the proposed framework in the bilinear regression problem with Gaussian factors and quantify a sufficient sample complexity for exact recovery. Furthermore, we describe a computationally tractable scheme that provably produces the required approximation oracle in the considered bilinear regression problem. △ Less

Submitted 28 March, 2019; v1 submitted 19 June, 2018; originally announced June 2018.

Comments: Some elaboration on the algorithm and theoretical results are added. Minor errors and typos corrected

MSC Class: 62F10; 90C25; 62P30

arXiv:1702.05327 [pdf, other]

Solving Equations of Random Convex Functions via Anchored Regression

Authors: Sohail Bahmani, Justin Romberg

Abstract: We consider the question of estimating a solution to a system of equations that involve convex nonlinearities, a problem that is common in machine learning and signal processing. Because of these nonlinearities, conventional estimators based on empirical risk minimization generally involve solving a non-convex optimization program. We propose anchored regression, a new approach based on convex pro… ▽ More We consider the question of estimating a solution to a system of equations that involve convex nonlinearities, a problem that is common in machine learning and signal processing. Because of these nonlinearities, conventional estimators based on empirical risk minimization generally involve solving a non-convex optimization program. We propose anchored regression, a new approach based on convex programming that amounts to maximizing a linear functional (perhaps augmented by a regularizer) over a convex set. The proposed convex program is formulated in the natural space of the problem, and avoids the introduction of auxiliary variables, making it computationally favorable. Working in the native space also provides great flexibility as structural priors (e.g., sparsity) can be seamlessly incorporated. For our analysis, we model the equations as being drawn from a fixed set according to a probability law. Our main results provide guarantees on the accuracy of the estimator in terms of the number of equations we are solving, the amount of noise present, a measure of statistical complexity of the random equations, and the geometry of the regularizer at the true solution. We also provide recipes for constructing the anchor vector (that determines the linear functional to maximize) directly from the observed data. △ Less

Submitted 13 August, 2018; v1 submitted 17 February, 2017; originally announced February 2017.

arXiv:1612.05986 [pdf, other]

Algebraic Connectivity Under Site Percolation in Finite Weighted Graphs

Authors: Sohail Bahmani, Justin Romberg, Prasad Tetali

Abstract: We study the behavior of algebraic connectivity in a weighted graph that is subject to site percolation, random deletion of the vertices. Using a refined concentration inequality for random matrices we show in our main theorem that the (augmented) Laplacian of the percolated graph concentrates around its expectation. This concentration bound then provides a lower bound on the algebraic connectivit… ▽ More We study the behavior of algebraic connectivity in a weighted graph that is subject to site percolation, random deletion of the vertices. Using a refined concentration inequality for random matrices we show in our main theorem that the (augmented) Laplacian of the percolated graph concentrates around its expectation. This concentration bound then provides a lower bound on the algebraic connectivity of the percolated graph. As a special case for $(n,d,λ)$-graphs (i.e., $d$-regular graphs on $n$ vertices with non-trivial eigenvalues less than $λ$ in magnitude) our result shows that, with high probability, the graph remains connected under a homogeneous site percolation with survival probability $p\ge 1-C_{1}n^{-C_{2}/d}$ with $C_{1}$ and $C_{2}$ depending only on $λ/d$. △ Less

Submitted 2 January, 2017; v1 submitted 18 December, 2016; originally announced December 2016.

arXiv:1610.04210 [pdf, other]

Phase Retrieval Meets Statistical Learning Theory: A Flexible Convex Relaxation

Authors: Sohail Bahmani, Justin Romberg

Abstract: We propose a flexible convex relaxation for the phase retrieval problem that operates in the natural domain of the signal. Therefore, we avoid the prohibitive computational cost associated with "lifting" and semidefinite programming (SDP) in methods such as PhaseLift and compete with recently developed non-convex techniques for phase retrieval. We relax the quadratic equations for phaseless measur… ▽ More We propose a flexible convex relaxation for the phase retrieval problem that operates in the natural domain of the signal. Therefore, we avoid the prohibitive computational cost associated with "lifting" and semidefinite programming (SDP) in methods such as PhaseLift and compete with recently developed non-convex techniques for phase retrieval. We relax the quadratic equations for phaseless measurements to inequality constraints each of which representing a symmetric "slab". Through a simple convex program, our proposed estimator finds an extreme point of the intersection of these slabs that is best aligned with a given anchor vector. We characterize geometric conditions that certify success of the proposed estimator. Furthermore, using classic results in statistical learning theory, we show that for random measurements the geometric certificates hold with high probability at an optimal sample complexity. Phase transition of our estimator is evaluated through simulations. Our numerical experiments also suggest that the proposed method can solve phase retrieval problems with coded diffraction measurements as well. △ Less

Submitted 16 March, 2017; v1 submitted 13 October, 2016; originally announced October 2016.

Comments: Accepted in AISTATS 2017. Extended the discussion of related work and added a few more references. Clarified some of the statements and notations

arXiv:1510.01670 [pdf, other]

Sketching for Simultaneously Sparse and Low-Rank Covariance Matrices

Authors: Sohail Bahmani, Justin Romberg

Abstract: We introduce a technique for estimating a structured covariance matrix from observations of a random vector which have been sketched. Each observed random vector $\boldsymbol{x}_t$ is reduced to a single number by taking its inner product against one of a number of pre-selected vector $\boldsymbol{a}_\ell$. These observations are used to form estimates of linear observations of the covariance matr… ▽ More We introduce a technique for estimating a structured covariance matrix from observations of a random vector which have been sketched. Each observed random vector $\boldsymbol{x}_t$ is reduced to a single number by taking its inner product against one of a number of pre-selected vector $\boldsymbol{a}_\ell$. These observations are used to form estimates of linear observations of the covariance matrix $\boldsymbol{\varSigma}$, which is assumed to be simultaneously sparse and low-rank. We show that if the sketching vectors $\boldsymbol{a}_\ell$ have a special structure, then we can use straightforward two-stage algorithm that exploits this structure. We show that the estimate is accurate when the number of sketches is proportional to the maximum of the rank times the number of significant rows/columns of $\boldsymbol{\varSigma}$. Moreover, our algorithm takes direct advantage of the low-rank structure of $\boldsymbol{\varSigma}$ by only manipulating matrices that are far smaller than the original covariance matrix. △ Less

Submitted 8 October, 2015; v1 submitted 6 October, 2015; originally announced October 2015.

Comments: Accepted in 2015 IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP 2015)

arXiv:1507.08254 [pdf, other]

Efficient Compressive Phase Retrieval with Constrained Sensing Vectors

Authors: Sohail Bahmani, Justin Romberg

Abstract: We propose a robust and efficient approach to the problem of compressive phase retrieval in which the goal is to reconstruct a sparse vector from the magnitude of a number of its linear measurements. The proposed framework relies on constrained sensing vectors and a two-stage reconstruction method that consists of two standard convex programs that are solved sequentially. In recent years, variou… ▽ More We propose a robust and efficient approach to the problem of compressive phase retrieval in which the goal is to reconstruct a sparse vector from the magnitude of a number of its linear measurements. The proposed framework relies on constrained sensing vectors and a two-stage reconstruction method that consists of two standard convex programs that are solved sequentially. In recent years, various methods are proposed for compressive phase retrieval, but they have suboptimal sample complexity or lack robustness guarantees. The main obstacle has been that there is no straightforward convex relaxations for the type of structure in the target. Given a set of underdetermined measurements, there is a standard framework for recovering a sparse matrix, and a standard framework for recovering a low-rank matrix. However, a general, efficient method for recovering a jointly sparse and low-rank matrix has remained elusive. Deviating from the models with generic measurements, in this paper we show that if the sensing vectors are chosen at random from an incoherent subspace, then the low-rank and sparse structures of the target signal can be effectively decoupled. We show that a recovery algorithm that consists of a low-rank recovery stage followed by a sparse recovery stage will produce an accurate estimate of the target when the number of measurements is $\mathsf{O}(k\,\log\frac{d}{k})$, where $k$ and $d$ denote the sparsity level and the dimension of the input signal. We also evaluate the algorithm through numerical simulation. △ Less

Submitted 26 October, 2015; v1 submitted 29 July, 2015; originally announced July 2015.

Comments: Accepted for the 29th Annual Conference on Neural Information Processing Systems (NIPS), 2015

arXiv:1506.08159 [pdf, other]

Near-Optimal Estimation of Simultaneously Sparse and Low-Rank Matrices from Nested Linear Measurements

Authors: Sohail Bahmani, Justin Romberg

Abstract: In this paper we consider the problem of estimating simultaneously low-rank and row-wise sparse matrices from nested linear measurements where the linear operator consists of the product of a linear operator $\mathcal{W}$ and a matrix $\mathbf{\varPsi}$. Leveraging the nested structure of the measurement operator, we propose a computationally efficient two-stage algorithm for estimating the simult… ▽ More In this paper we consider the problem of estimating simultaneously low-rank and row-wise sparse matrices from nested linear measurements where the linear operator consists of the product of a linear operator $\mathcal{W}$ and a matrix $\mathbf{\varPsi}$. Leveraging the nested structure of the measurement operator, we propose a computationally efficient two-stage algorithm for estimating the simultaneously structured target matrix. Assuming that $\mathcal{W}$ is a restricted isometry for low-rank matrices and $\mathbf{\varPsi}$ is a restricted isometry for row-wise sparse matrices, we establish an accuracy guarantee that holds uniformly for all sufficiently low-rank and row-wise sparse matrices with high probability. Furthermore, using standard tools from information theory, we establish a minimax lower bound for estimation of simultaneously low-rank and row-wise sparse matrices from linear measurements that need not be nested. The accuracy bounds established for the algorithm, that also serve as a minimax upper bound, differ from the derived minimax lower bound merely by a polylogarithmic factor of the dimensions. Therefore, the proposed algorithm is nearly minimax optimal. We also discuss some applications of the proposed observation model and evaluate our algorithm through numerical simulation. △ Less

Submitted 21 March, 2016; v1 submitted 26 June, 2015; originally announced June 2015.

arXiv:1501.00046 [pdf, other]

doi 10.1137/141002165

Lifting for Blind Deconvolution in Random Mask Imaging: Identifiability and Convex Relaxation

Authors: Sohail Bahmani, Justin Romberg

Abstract: In this paper we analyze the blind deconvolution of an image and an unknown blur in a coded imaging system. The measurements consist of subsampled convolution of an unknown blurring kernel with multiple random binary modulations (coded masks) of the image. To perform the deconvolution, we consider a standard lifting of the image and the blurring kernel that transforms the measurements into a set o… ▽ More In this paper we analyze the blind deconvolution of an image and an unknown blur in a coded imaging system. The measurements consist of subsampled convolution of an unknown blurring kernel with multiple random binary modulations (coded masks) of the image. To perform the deconvolution, we consider a standard lifting of the image and the blurring kernel that transforms the measurements into a set of linear equations of the matrix formed by their outer product. Any rank-one solution to this system of equation provides a valid pair of an image and a blur. We first express the necessary and sufficient conditions for the uniqueness of a rank-one solution under some additional assumptions (uniform subsampling and no limit on the number of coded masks). These conditions are special case of a previously established result regarding identifiability in the matrix completion problem. We also characterize a low-dimensional subspace model for the blur kernel that is sufficient to guarantee identifiability, including the interesting instance of "bandpass"` blur kernels. Next, assuming the bandpass model for the blur kernel, we show that the image and the blur kernel can be found using nuclear norm minimization. Our main results show that recovery is achieved (with high probability) when the number of masks is on the order of $μ\log^{2}L\,\log\frac{Le}μ\,\log\log\left(N+1\right)$ where $μ$ is the \emph{coherence} of the blur, $L$ is the dimension of the image, and $N$ is the number of measured samples per mask. △ Less

Submitted 6 April, 2015; v1 submitted 30 December, 2014; originally announced January 2015.

Journal ref: SIAM J. Imaging Sci. 8(4):2203--2238, 2015

arXiv:1412.7890 [pdf, other]

doi 10.1109/TCI.2015.2485941

Compressive Deconvolution in Random Mask Imaging

Authors: Sohail Bahmani, Justin Romberg

Abstract: We investigate the problem of reconstructing signals from a subsampled convolution of their modulated versions and a known filter. The problem is studied as applies to specific imaging systems relying on spatial phase modulation by randomly coded "masks." The diversity induced by the random masks is deemed to improve the conditioning of the deconvolution problem while maintaining sampling efficien… ▽ More We investigate the problem of reconstructing signals from a subsampled convolution of their modulated versions and a known filter. The problem is studied as applies to specific imaging systems relying on spatial phase modulation by randomly coded "masks." The diversity induced by the random masks is deemed to improve the conditioning of the deconvolution problem while maintaining sampling efficiency. We analyze a linear model of the system, where the joint effect of the spatial modulation, blurring, and spatial subsampling is represented by a measurement matrix. We provide a bound on the conditioning of this measurement matrix in terms of the number of masks, the dimension of the image, and certain characteristics of the blurring kernel and subsampling operator. The derived bound shows that stable deconvolution is possible with high probability even if the total number of (scalar) measurements is within a logarithmic factor of the image size. Furthermore, beyond a critical number of masks determined by the extent of blurring and subsampling, every additional mask improves the conditioning of the measurement matrix. We also consider a more interesting scenario where the target image is sparse. We show that under mild conditions on the blurring kernel, with high probability the measurement matrix is a restricted isometry when the number of masks is within a logarithmic factor of the sparsity of the image. Therefore, the image can be reconstructed using many sparse recovery algorithms such as the basis pursuit. The bound on the required number of masks is linear in sparsity of the image but it is logarithmic in its dimension. The bound provides a quantitative view of the effect of the blurring and subsampling on the required number of masks, which is critical for designing efficient imaging systems. △ Less

Submitted 3 September, 2015; v1 submitted 25 December, 2014; originally announced December 2014.

Journal ref: IEEE Transactions on Computational Imaging 1(4):236--246, 2015

arXiv:1304.6627 [pdf, other]

Robust 1-bit Compressive Sensing via Gradient Support Pursuit

Authors: Sohail Bahmani, Petros T. Boufounos, Bhiksha Raj

Abstract: This paper studies a formulation of 1-bit Compressed Sensing (CS) problem based on the maximum likelihood estimation framework. In order to solve the problem we apply the recently proposed Gradient Support Pursuit algorithm, with a minor modification. Assuming the proposed objective function has a Stable Restricted Hessian, the algorithm is shown to accurately solve the 1-bit CS problem. Furthermo… ▽ More This paper studies a formulation of 1-bit Compressed Sensing (CS) problem based on the maximum likelihood estimation framework. In order to solve the problem we apply the recently proposed Gradient Support Pursuit algorithm, with a minor modification. Assuming the proposed objective function has a Stable Restricted Hessian, the algorithm is shown to accurately solve the 1-bit CS problem. Furthermore, the algorithm is compared to the state-of-the-art 1-bit CS algorithms through numerical simulations. The results suggest that the proposed method is robust to noise and at mid to low input SNR regime it achieves the best reconstruction SNR vs. execution time trade-off. △ Less

Submitted 24 April, 2013; originally announced April 2013.

Comments: 18 pages, 5 figures

arXiv:1209.1557 [pdf, ps, other]

doi 10.1109/TIT.2016.2515078

Learning Model-Based Sparsity via Projected Gradient Descent

Authors: Sohail Bahmani, Petros T. Boufounos, Bhiksha Raj

Abstract: Several convex formulation methods have been proposed previously for statistical estimation with structured sparsity as the prior. These methods often require a carefully tuned regularization parameter, often a cumbersome or heuristic exercise. Furthermore, the estimate that these methods produce might not belong to the desired sparsity model, albeit accurately approximating the true parameter. Th… ▽ More Several convex formulation methods have been proposed previously for statistical estimation with structured sparsity as the prior. These methods often require a carefully tuned regularization parameter, often a cumbersome or heuristic exercise. Furthermore, the estimate that these methods produce might not belong to the desired sparsity model, albeit accurately approximating the true parameter. Therefore, greedy-type algorithms could often be more desirable in estimating structured-sparse parameters. So far, these greedy methods have mostly focused on linear statistical models. In this paper we study the projected gradient descent with non-convex structured-sparse parameter model as the constraint set. Should the cost function have a Stable Model-Restricted Hessian the algorithm produces an approximation for the desired minimizer. As an example we elaborate on application of the main results to estimation in Generalized Linear Model. △ Less

Submitted 27 January, 2016; v1 submitted 7 September, 2012; originally announced September 2012.

MSC Class: 62FXX; 65KXX

Journal ref: IEEE Transactions on Information Theory 62(4):2092--2099, 2016

arXiv:1107.4623 [pdf, ps, other]

doi 10.1016/j.acha.2012.07.004

A Unifying Analysis of Projected Gradient Descent for $\ell_p$-constrained Least Squares

Authors: Sohail Bahmani, Bhiksha Raj

Abstract: In this paper we study the performance of the Projected Gradient Descent(PGD) algorithm for $\ell_{p}$-constrained least squares problems that arise in the framework of Compressed Sensing. Relying on the Restricted Isometry Property, we provide convergence guarantees for this algorithm for the entire range of $0\leq p\leq1$, that include and generalize the existing results for the Iterative Hard T… ▽ More In this paper we study the performance of the Projected Gradient Descent(PGD) algorithm for $\ell_{p}$-constrained least squares problems that arise in the framework of Compressed Sensing. Relying on the Restricted Isometry Property, we provide convergence guarantees for this algorithm for the entire range of $0\leq p\leq1$, that include and generalize the existing results for the Iterative Hard Thresholding algorithm and provide a new accuracy guarantee for the Iterative Soft Thresholding algorithm as special cases. Our results suggest that in this group of algorithms, as $p$ increases from zero to one, conditions required to guarantee accuracy become stricter and robustness to noise deteriorates. △ Less

Submitted 26 June, 2012; v1 submitted 22 July, 2011; originally announced July 2011.

Comments: 16 pages, 3 Figures

Journal ref: Applied and Computational Harmonic Analysis, 34(3):366-378, 2013

Showing 1–25 of 25 results for author: Bahmani, S