subscribe to arXiv mailings

PriViT: Vision Transformers for Fast Private Inference

Authors: Naren Dhyani, Jianqiao Mo, Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

Abstract: The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications. However, ViTs are ill-suited for private inference using secure multi-party computation (MPC) protocols, due to the large number of non-polynomial operations (self-attention, feed-forward rectifiers, layer normalization). We propose PriViT, a gradient b… ▽ More The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications. However, ViTs are ill-suited for private inference using secure multi-party computation (MPC) protocols, due to the large number of non-polynomial operations (self-attention, feed-forward rectifiers, layer normalization). We propose PriViT, a gradient based algorithm to selectively "Taylorize" nonlinearities in ViTs while maintaining their prediction accuracy. Our algorithm is conceptually simple, easy to implement, and achieves improved performance over existing approaches for designing MPC-friendly transformer architectures in terms of achieving the Pareto frontier in latency-accuracy. We confirm these improvements via experiments on several standard image classification tasks. Public code is available at https://github.com/NYU-DICE-Lab/privit. △ Less

Submitted 6 October, 2023; originally announced October 2023.

Comments: 18 pages, 14 figures

arXiv:2307.04077 [pdf, other]

doi 10.1145/3587135.3592169

Towards Fast and Scalable Private Inference

Authors: Jianqiao Mo, Karthik Garimella, Negar Neda, Austin Ebel, Brandon Reagen

Abstract: Privacy and security have rapidly emerged as first order design constraints. Users now demand more protection over who can see their data (confidentiality) as well as how it is used (control). Here, existing cryptographic techniques for security fall short: they secure data when stored or communicated but must decrypt it for computation. Fortunately, a new paradigm of computing exists, which we re… ▽ More Privacy and security have rapidly emerged as first order design constraints. Users now demand more protection over who can see their data (confidentiality) as well as how it is used (control). Here, existing cryptographic techniques for security fall short: they secure data when stored or communicated but must decrypt it for computation. Fortunately, a new paradigm of computing exists, which we refer to as privacy-preserving computation (PPC). Emerging PPC technologies can be leveraged for secure outsourced computation or to enable two parties to compute without revealing either users' secret data. Despite their phenomenal potential to revolutionize user protection in the digital age, the realization has been limited due to exorbitant computational, communication, and storage overheads. This paper reviews recent efforts on addressing various PPC overheads using private inference (PI) in neural network as a motivating application. First, the problem and various technologies, including homomorphic encryption (HE), secret sharing (SS), garbled circuits (GCs), and oblivious transfer (OT), are introduced. Next, a characterization of their overheads when used to implement PI is covered. The characterization motivates the need for both GCs and HE accelerators. Then two solutions are presented: HAAC for accelerating GCs and RPU for accelerating HE. To conclude, results and effects are shown with a discussion on what future work is needed to overcome the remaining overheads of PI. △ Less

Submitted 8 July, 2023; originally announced July 2023.

Comments: Appear in the 20th ACM International Conference on Computing Frontiers

arXiv:2303.05518 [pdf, ps, other]

doi 10.1609/aaai.v37i9.26274

Computably Continuous Reinforcement-Learning Objectives are PAC-learnable

Authors: Cambridge Yang, Michael Littman, Michael Carbin

Abstract: In reinforcement learning, the classic objectives of maximizing discounted and finite-horizon cumulative rewards are PAC-learnable: There are algorithms that learn a near-optimal policy with high probability using a finite amount of samples and computation. In recent years, researchers have introduced objectives and corresponding reinforcement-learning algorithms beyond the classic cumulative rewa… ▽ More In reinforcement learning, the classic objectives of maximizing discounted and finite-horizon cumulative rewards are PAC-learnable: There are algorithms that learn a near-optimal policy with high probability using a finite amount of samples and computation. In recent years, researchers have introduced objectives and corresponding reinforcement-learning algorithms beyond the classic cumulative rewards, such as objectives specified as linear temporal logic formulas. However, questions about the PAC-learnability of these new objectives have remained open. This work demonstrates the PAC-learnability of general reinforcement-learning objectives through sufficient conditions for PAC-learnability in two analysis settings. In particular, for the analysis that considers only sample complexity, we prove that if an objective given as an oracle is uniformly continuous, then it is PAC-learnable. Further, for the analysis that considers computational complexity, we prove that if an objective is computable, then it is PAC-learnable. In other words, if a procedure computes successive approximations of the objective's value, then the objective is PAC-learnable. We give three applications of our condition on objectives from the literature with previously unknown PAC-learnability and prove that these objectives are PAC-learnable. Overall, our result helps verify existing objectives' PAC-learnability. Also, as some studied objectives that are not uniformly continuous have been shown to be not PAC-learnable, our results could guide the design of new PAC-learnable objectives. △ Less

Submitted 19 March, 2023; v1 submitted 9 March, 2023; originally announced March 2023.

arXiv:2211.13324 [pdf, other]

doi 10.1145/3579371.3589045

HAAC: A Hardware-Software Co-Design to Accelerate Garbled Circuits

Authors: Jianqiao Mo, Jayanth Gopinath, Brandon Reagen

Abstract: Privacy and security have rapidly emerged as priorities in system design. One powerful solution for providing both is privacy-preserving computation, where functions are computed directly on encrypted data and control can be provided over how data is used. Garbled circuits (GCs) are a PPC technology that provide both confidential computing and control over how data is used. The challenge is that t… ▽ More Privacy and security have rapidly emerged as priorities in system design. One powerful solution for providing both is privacy-preserving computation, where functions are computed directly on encrypted data and control can be provided over how data is used. Garbled circuits (GCs) are a PPC technology that provide both confidential computing and control over how data is used. The challenge is that they incur significant performance overheads compared to plaintext. This paper proposes a novel garbled circuits accelerator and compiler, named HAAC, to mitigate performance overheads and make privacy-preserving computation more practical. HAAC is a hardware-software co-design. GCs are exemplars of co-design as programs are completely known at compile time, i.e., all dependence, memory accesses, and control flow are fixed. The design philosophy of HAAC is to keep hardware simple and efficient, maximizing area devoted to our proposed custom execution units and other circuits essential for high performance (e.g., on-chip storage). The compiler can leverage its program understanding to realize hardware's performance potential by generating effective instruction schedules, data layouts, and orchestrating off-chip events. In taking this approach we can achieve ASIC performance/efficiency without sacrificing generality. Insights of our approach include how co-design enables expressing arbitrary GCs programs as streams, which simplifies hardware and enables complete memory-compute decoupling, and the development of a scratchpad that captures data reuse by tracking program execution, eliminating the need for costly hardware managed caches and tagging logic. We evaluate HAAC with VIP-Bench and achieve an average speedup of 589$\times$ with DDR4 (2,627$\times$ with HBM2) in 4.3mm$^2$ of area. △ Less

Submitted 25 April, 2023; v1 submitted 23 November, 2022; originally announced November 2022.

Comments: Accepted to the 50th Annual International Symposium on Computer Architecture (ISCA)

arXiv:2111.12679 [pdf, other]

On the (In)Tractability of Reinforcement Learning for LTL Objectives

Authors: Cambridge Yang, Michael Littman, Michael Carbin

Abstract: In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives. Despite these advancements, there are fundamental limitations to how well this problem can be solved. Previous studies have alluded to this fact but have not examined it in depth. In this paper, we address the tract… ▽ More In recent years, researchers have made significant progress in devising reinforcement-learning algorithms for optimizing linear temporal logic (LTL) objectives and LTL-like objectives. Despite these advancements, there are fundamental limitations to how well this problem can be solved. Previous studies have alluded to this fact but have not examined it in depth. In this paper, we address the tractability of reinforcement learning for general LTL objectives from a theoretical perspective. We formalize the problem under the probably approximately correct learning in Markov decision processes (PAC-MDP) framework, a standard framework for measuring sample complexity in reinforcement learning. In this formalization, we prove that the optimal policy for any LTL formula is PAC-MDP-learnable if and only if the formula is in the most limited class in the LTL hierarchy, consisting of formulas that are decidable within a finite horizon. Practically, our result implies that it is impossible for a reinforcement-learning algorithm to obtain a PAC-MDP guarantee on the performance of its learned policy after finitely many interactions with an unconstrained environment for LTL objectives that are not decidable within a finite horizon. △ Less

Submitted 24 June, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

arXiv:2102.13455 [pdf, other]

Inverse deformation analysis: an experimental and numerical assessment using the FEniCS Project

Authors: Arnaud Mazier, Alexandre Bilger, Antonio E. Forte, Igor Peterlik, Jack S. Hale, Stéphane P. A. Bordas, ., Institute of Computational Engineering, Department of Engineering, University of Luxembourg, Esch-sur-Alzette, Luxembourg., Harvard University, Cambridge, USA., Department of Electronics, Information, Bioengineering, Politecnico di Milano, Milan, Italy., Institute of Computer Science, Masaryk University, Czech Republic., Institute of Research , et al. (3 additional authors not shown)

Abstract: In this paper, we develop a framework for solving inverse deformation problems using the FEniCS Project finite element software. We validate our approach with experimental imaging data acquired from a soft silicone beam under gravity. In contrast with inverse iterative algorithms that require multiple solutions of a standard elasticity problem, the proposed method can compute the undeformed config… ▽ More In this paper, we develop a framework for solving inverse deformation problems using the FEniCS Project finite element software. We validate our approach with experimental imaging data acquired from a soft silicone beam under gravity. In contrast with inverse iterative algorithms that require multiple solutions of a standard elasticity problem, the proposed method can compute the undeformed configuration by solving only one modified elasticity problem. This modified problem has a complexity comparable to the standard one. The framework is implemented within an open-source pipeline enabling the direct and inverse deformation simulation directly from imaging data. We use the high-level Unified Form Language (UFL) of the FEniCS Project to express the finite element model in variational form and to automatically derive the consistent Jacobian. Consequently, the design of the pipeline is flexible: for example, it allows the modification of the constitutive models by changing a single line of code. We include a complete working example showing the inverse deformation of a beam deformed by gravity as supplementary material. △ Less

Submitted 26 February, 2021; originally announced February 2021.

Comments: 33 pages, 12 figures, submitted to Elsevier

arXiv:2007.11203 [pdf, other]

doi 10.1145/3434301

Simplifying Dependent Reductions in the Polyhedral Model

Authors: Cambridge Yang, Eric Atkinson, Michael Carbin

Abstract: A Reduction -- an accumulation over a set of values, using an associative and commutative operator -- is a common computation in many numerical computations, including scientific computations, machine learning, computer vision, and financial analytics. Contemporary polyhedral-based compilation techniques make it possible to optimize reductions, such as prefix sums, in which each component of the… ▽ More A Reduction -- an accumulation over a set of values, using an associative and commutative operator -- is a common computation in many numerical computations, including scientific computations, machine learning, computer vision, and financial analytics. Contemporary polyhedral-based compilation techniques make it possible to optimize reductions, such as prefix sums, in which each component of the reduction's output potentially shares computation with another component in the reduction. Therefore an optimizing compiler can identify the computation shared between multiple components and generate code that computes the shared computation only once. These techniques, however, do not support reductions that -- when phrased in the language of the polyhedral model -- span multiple dependent statements. In such cases, existing approaches can generate incorrect code that violates the data dependences of the original, unoptimized program. In this work, we identify and formalize the optimization of dependent reductions as an integer bilinear program. We present a heuristic optimization algorithm that uses an affine sequential schedule of the program to determine how to simplfy reductions yet still preserve the program's dependences. We demonstrate that the algorithm provides optimal complexity for a set of benchmark programs from the literature on probabilistic inference algorithms, whose performance critically relies on simplifying these reductions. The complexities for 10 of the 11 programs improve siginifcantly by factors at least of the sizes of the input data, which are in the range of $10^4$ to $10^6$ for typical real application inputs. We also confirm the significance of the improvement by showing speedups in wall-clock time that range from $1.1\text{x}$ to over $10^6\text{x}$. △ Less

Submitted 9 February, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

Journal ref: Proc. ACM Program. Lang. 5, POPL, Article 20 (2021)

arXiv:2002.07692 [pdf, other]

doi 10.3847/1538-4365/ab77b1

Imaging and Modeling Data from the Hydrogen Epoch of Reionization Array

Authors: C. L. Carilli{1, 2}, N. Thyagarajan{1}, J. Kent{2}, B. Nikolic{2}, K. Gale-Sides{2}, N. S. Kern{3}, G. Bernardi{4, 5, 6}, A. Mesinger{7}, S. Matika{5}, the HERA TEAM {1}{National Radio Astronomy Observatory, P. O. Box 0, Socorro, NM 87801, USA, ccarilli@nrao. edu, ORCID, :, 0000-0001-6647-3861} {2}{Astrophysics Group, Cavendish Laboratory, JJ Thomson Avenue, Cambridge CB3 0HE, UK} {3}{Department of Astronomy , et al. (21 additional authors not shown)

Abstract: We analyze data from the Hydrogen Epoch of Reionization Array. This is the third in a series of papers on the closure phase delay-spectrum technique designed to detect the HI 21cm emission from cosmic reionization. We present the details of the data and models employed in the power spectral analysis, and discuss limitations to the process. We compare images and visibility spectra made with HERA da… ▽ More We analyze data from the Hydrogen Epoch of Reionization Array. This is the third in a series of papers on the closure phase delay-spectrum technique designed to detect the HI 21cm emission from cosmic reionization. We present the details of the data and models employed in the power spectral analysis, and discuss limitations to the process. We compare images and visibility spectra made with HERA data, to parallel quantities generated from sky models based on the GLEAM survey, incorporating the HERA telescope model. We find reasonable agreement between images made from HERA data, with those generated from the models, down to the confusion level. For the visibility spectra, there is broad agreement between model and data across the full band of $\sim 80$MHz. However, models with only GLEAM sources do not reproduce a roughly sinusoidal spectral structure at the tens of percent level seen in the observed visibility spectra on scales $\sim 10$ MHz on 29 m baselines. We find that this structure is likely due to diffuse Galactic emission, predominantly the Galactic plane, filling the far sidelobes of the antenna primary beam. We show that our current knowledge of the frequency dependence of the diffuse sky radio emission, and the primary beam at large zenith angles, is inadequate to provide an accurate reproduction of the diffuse structure in the models. We discuss implications due to this missing structure in the models, including calibration, and in the search for the HI 21cm signal, as well as possible mitigation techniques. △ Less

Submitted 18 February, 2020; originally announced February 2020.

Comments: 17 pages, 10 figures, accepted for the ApJ Supplement

arXiv:1805.01863 [pdf, other]

Verifying Handcoded Probabilistic Inference Procedures

Authors: Eric Atkinson, Cambridge Yang, Michael Carbin

Abstract: Researchers have recently proposed several systems that ease the process of performing Bayesian probabilistic inference. These include systems for automatic inference algorithm synthesis as well as stronger abstractions for manual algorithm development. However, existing systems whose performance relies on the developer manually constructing a part of the inference algorithm have limited support f… ▽ More Researchers have recently proposed several systems that ease the process of performing Bayesian probabilistic inference. These include systems for automatic inference algorithm synthesis as well as stronger abstractions for manual algorithm development. However, existing systems whose performance relies on the developer manually constructing a part of the inference algorithm have limited support for reasoning about the correctness of the resulting algorithm. In this paper, we present Shuffle, a programming language for manually developing inference procedures that 1) enforces the basic rules of probability theory, 2) enforces the statistical dependencies of the algorithm's corresponding probabilistic model, and 3) generates an optimized implementation. We have used Shuffle to develop inference algorithms for several standard probabilistic models. Our results demonstrate that Shuffle enables a developer to deliver correct and performant implementations of these algorithms. △ Less

Submitted 4 May, 2018; originally announced May 2018.

ACM Class: F.3.1

arXiv:astro-ph/0405026 [pdf, ps, other]

doi 10.1086/422248

The Chandra Deep Field-North Survey: XVII. Evolution of magnetic activity in old late-type stars

Authors: E. D. Feigelson, A. E. Hornschemeier, G. Micela, F. E. Bauer, D. M. Alexander, W. N. Brandt, F. Favata, S. Sciortino, G. P. Garmire, Penn State, Johns Hopkins, Palermo, Cambridge, ESTEC

Abstract: The extremely sensitive Chandra Deep Field-North (CDF-N) pencil-beam X-ray survey is used to identify and characterize the X-ray emission from old high-latitude main sequence Galactic stars. Our principal goal is to investigate the expected long-term decay of magnetic activity of late-type stars due to the gradual spindown of stellar rotation from a magnetized stellar wind. Eleven X-ray sources… ▽ More The extremely sensitive Chandra Deep Field-North (CDF-N) pencil-beam X-ray survey is used to identify and characterize the X-ray emission from old high-latitude main sequence Galactic stars. Our principal goal is to investigate the expected long-term decay of magnetic activity of late-type stars due to the gradual spindown of stellar rotation from a magnetized stellar wind. Eleven X-ray sources constitute a well-defined sample of 2 G, 2 K0-K4, and 7 M2-M5 stars with median distance around 300 pc. X-ray luminosities are typically log Lx ~ 27 erg/s and is dominated by flares rather than quiescent coronal emission. Models of the population indicates that the CDF-N stars are the most magnetically active old disk stars. A substantial decline in X-ray luminosities over the 1<t<11 Gyr age interval is required. This is the first demonstration that the coronal and flaring components of stellar magnetic activity -- and presumably the interior magnetic dynamos responsible for the reconnecting fields at the stellar surface -- exhibit long-term decay over the age of the Galactic disk. The model that best fits the magnitudes, spectral types and X-ray luminosities of the sample has Lx ~ 1/t^2 erg/s which is faster than the 1/t decay rate predicted from widely accepted rotational spindown rates and X-ray-activity relations. △ Less

Submitted 3 May, 2004; originally announced May 2004.

Comments: 22 pages, 5 figures, accepted for publication in the Astrophysical Journal

Journal ref: Astrophys.J.611:1107-1120,2004

arXiv:astro-ph/0301555 [pdf, ps, other]

doi 10.1086/374335

The resolved fraction of the Cosmic X-ray Background

Authors: A. Moretti, S. Campana, D. Lazzati, G. Tagliaferri, INAF-O. A. Brera ITALY, IoA Cambridge UK

Abstract: We present the X-ray source number counts in two energy bands (0.5-2 and 2-10 keV) from a very large source sample: we combine data of six different surveys, both shallow wide field and deep pencil beam, performed with three different satellites (ROSAT, Chandra and XMM-Newton). The sample covers with good statistics the largest possible flux range so far: [2.4*10^-17 - 10^-11] cgs in the soft ba… ▽ More We present the X-ray source number counts in two energy bands (0.5-2 and 2-10 keV) from a very large source sample: we combine data of six different surveys, both shallow wide field and deep pencil beam, performed with three different satellites (ROSAT, Chandra and XMM-Newton). The sample covers with good statistics the largest possible flux range so far: [2.4*10^-17 - 10^-11] cgs in the soft band and [2.1*10^-16 - 8*10^{-12}]cgs in the hard band. Integrating the flux distributions over this range and taking into account the (small) contribution of the brightest sources we derive the flux density generated by discrete sources in both bands. After a critical review of the literature values of the total Cosmic X--Ray Background (CXB) we conclude that, with the present data, the 94.3%, and 88.8% of the soft and hard CXB can be ascribed to discrete source emission. If we extrapolate the analytical form of the Log N--Log S distribution beyond the flux limit of our catalog in the soft band we find that the flux from discrete sources at ~3*10^-18 cgs is consistent with the entire CXB, whereas in the hard band it accounts for only 93% of the total CXB at most, hinting for a faint and obscured population to arise at even fainter fluxes. △ Less

Submitted 28 January, 2003; originally announced January 2003.

Comments: Accepted for publication in ApJ

arXiv:astro-ph/9812277 [pdf, ps, other]

Baryonic Dark Matter: Limits from HST and ISO

Authors: Gerard Gilmore, IoA Cambridge, UK

Abstract: Recent HST and ISO observations provide very severe limits on any compact baryonic contributions to galactic (dark) halos. When combined with Milky Way Galaxy microlensing results, almost the entire plausible range of massive compact baryonic objects is excluded by direct observation. Deep direct imaging at 7mu and 15mu with ISOCAM on the ISO spacecraft directly excludes hydrogen-burning stars o… ▽ More Recent HST and ISO observations provide very severe limits on any compact baryonic contributions to galactic (dark) halos. When combined with Milky Way Galaxy microlensing results, almost the entire plausible range of massive compact baryonic objects is excluded by direct observation. Deep direct imaging at 7mu and 15mu with ISOCAM on the ISO spacecraft directly excludes hydrogen-burning stars of any mass above the hydrogen-burning limit, and of any chemical abundance, from being the predominant explanation of the dark halos of external spiral galaxies. In the Milky Way Galaxy, HST has provided luminosity functions to the hydrogen-burning limit in several globular clusters. The resulting mass functions do not provide any support for dominance by very low-mass stars. This is consistent with field surveys for sub-stellar mass brown dwarfs, which show such objects to be relatively rare. These results are complemented by very deep HST luminosity functions in the Large Magellanic Cloud, providing strong support for the (near)-universality of the stellar mass function. Very recent HST results are available for the nearby dSph galaxy UMi. This galaxy, the most dark-matter dominated object known on kpc scales, has a normal stellar mass function at low masses. The prospects are bright for dark elementary particles. △ Less

Submitted 15 December, 1998; originally announced December 1998.

Comments: 12 pp, 6 figures, uses ws-p8-50x6-00.cls (included), Review at Dark Matter 98, ed N Spooner (World Scientific, in press)

Showing 1–12 of 12 results for author: Cambridge