Given many copies of an unknown quantum state $\rho$, we consider the task of learning a classical description of its principal eigenstate. Namely, assuming that $\rho$ has an eigenstate $|\phi\rangle$ with (unknown) eigenvalue $\lambda > 1/2$, the goal is to learn a (classical shadows style) classical description of $|\phi\rangle$ which can later be used to estimate expectation values $\langle \phi |O| \phi \rangle$ for any $O$ in some class of observables. We work in the sample-complexity setting in which generating a copy of $\rho$ is expensive, but joint measurements on many copies of the state are possible. We present a protocol for this task whose sample complexity scales with the principal eigenvalue $\lambda$, and we show that it is optimal within a space of natural approaches, e.g., applying quantum state purification followed by a single-copy classical shadows scheme. Furthermore, when $\lambda$ is sufficiently close to $1$, the performance of our algorithm is optimal: it matches the sample complexity for pure-state classical shadows.
We study the sample complexity of the classical shadows task: how few copies of an unknown state must be measured in order to predict expectation values with respect to some class of observables? Large joint measurements are likely required in order to minimize sample complexity, but previous joint measurement protocols only work when the unknown state is pure. We present the first joint measurement protocol for classical shadows whose sample complexity scales with the rank of the unknown state. In particular, we prove that $\mathcal O(\sqrt{rB}/\epsilon^2)$ samples suffice, where $r$ is the rank of the state, $B$ is a bound on the squared Frobenius norm of the observables, and $\epsilon$ is the target accuracy. In the low-rank regime, this is a nearly quadratic advantage over traditional approaches that use single-copy measurements. We present several intermediate results that may be of independent interest: a solution to a new formulation of classical shadows that captures functions of non-identical input states; a generalization of a ``nice'' Schur basis used for optimal qubit purification and quantum majority vote; and a measurement strategy that allows us to use local symmetries in the Schur basis to avoid intractable Weingarten calculations in the analysis.
BosonSampling is the leading candidate for demonstrating quantum computational advantage in photonic systems. While we have recently seen many impressive experimental demonstrations, there is still a formidable distance between the complexity-theoretic hardness arguments and current experiments. One of the largest gaps involves the ratio of photons to modes: all current hardness evidence assumes a "high-mode" regime in which the number of linear optical modes scales at least quadratically in the number of photons. By contrast, current experiments operate in a "low-mode" regime with a linear number of modes. In this paper we bridge this gap, bringing the hardness evidence for the low-mode experiments to the same level as had been previously established for the high-mode regime. This involves proving a new worst-to-average-case reduction for computing the Permanent that is robust to large numbers of row repetitions and also to distributions over matrices with correlated entries.
We consider the classical shadows task for pure states in the setting of both joint and independent measurements. The task is to measure as few copies as possible of an unknown pure state $\rho$ in order to learn a classical description which suffices to later estimate expectation values of observables. Specifically, the goal is to approximate $\mathrm{Tr}(O \rho)$ for any Hermitian observable $O$ to within additive error $\epsilon$, provided $\mathrm{Tr}(O^2)\leq B$ and $\lVert O \rVert = 1$. Our main result applies to the joint measurement setting, where we show that $\tilde{\Theta}(\sqrt{B}\epsilon^{-1} + \epsilon^{-2})$ samples of $\rho$ are necessary and sufficient to succeed with high probability. The upper bound is a quadratic improvement on the previous best sample complexity known for this problem. For the lower bound, we show that the bottleneck is not how fast we can learn the state but rather how much any classical description of $\rho$ can be compressed for observable estimation. In the independent measurement setting, we show that $\mathcal O(\sqrt{Bd} \epsilon^{-1} + \epsilon^{-2})$ samples suffice. Notably, this implies that the random Clifford measurements algorithm of Huang, Kueng, and Preskill, which is sample-optimal for mixed states, is not optimal for pure states. Interestingly, our result also uses the same random Clifford measurements but employs a different estimator.
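To make the single-copy baseline concrete, here is a minimal sketch (not taken from the paper) of the standard Huang, Kueng, and Preskill style classical-shadows estimator for one qubit, using uniformly random Pauli-basis measurements. The joint-measurement protocol and modified estimator described above are not reproduced here, and names such as `shadow_estimate` are purely illustrative.

```python
import numpy as np

# Minimal single-qubit sketch of the standard single-copy classical-shadows
# estimator: measure in a uniformly random Pauli basis and use the unbiased
# snapshot rho_hat = 3|b><b| - I.  Illustrative only; this is not the
# joint-measurement protocol or the modified estimator described above.
rng = np.random.default_rng(0)
I2 = np.eye(2, dtype=complex)
BASES = (                                                       # columns = basis vectors
    np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2),   # X basis
    np.array([[1, 1], [1j, -1j]]) / np.sqrt(2),                # Y basis
    np.eye(2, dtype=complex),                                   # Z basis
)

def shadow_estimate(rho, O, shots):
    """Estimate Tr(O rho) from `shots` single-copy random-Pauli measurements."""
    total = 0.0
    for _ in range(shots):
        V = BASES[rng.integers(3)]                       # pick a random basis
        probs = np.real([V[:, b].conj() @ rho @ V[:, b] for b in (0, 1)])
        probs = np.clip(probs, 0, None)                  # guard against float noise
        b = rng.choice(2, p=probs / probs.sum())         # simulate the outcome
        ket = V[:, b:b + 1]
        snapshot = 3 * (ket @ ket.conj().T) - I2         # E[snapshot] = rho
        total += np.real(np.trace(O @ snapshot))
    return total / shots

# Example: rho = |+><+| and O = X, so the exact value Tr(O rho) is 1.
plus = np.array([[1], [1]], dtype=complex) / np.sqrt(2)
rho = plus @ plus.conj().T
X = np.array([[0, 1], [1, 0]], dtype=complex)
print(shadow_estimate(rho, X, shots=20000))              # approaches 1.0
```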
Gaussian boson sampling is a model of photonic quantum computing that has attracted attention as a platform for building quantum devices capable of performing tasks that are out of reach for classical devices. There is therefore significant interest, from the perspective of computational complexity theory, in solidifying the mathematical foundation for the hardness of simulating these devices. We show that, under the standard Anti-Concentration and Permanent-of-Gaussians conjectures, there is no efficient classical algorithm to sample from ideal Gaussian boson sampling distributions (even approximately) unless the polynomial hierarchy collapses. The hardness proof holds in the regime where the number of modes scales quadratically with the number of photons, a setting in which hardness was widely believed to hold but that nevertheless had no definitive proof. Crucial to the proof is a new method for programming a Gaussian boson sampling device so that the output probabilities are proportional to the permanents of submatrices of an arbitrary matrix. This technique is a generalization of Scattershot BosonSampling that we call BipartiteGBS. We also make progress towards the goal of proving hardness in the regime where there are fewer than quadratically more modes than photons (i.e., the high-collision regime) by showing that the ability to approximate permanents of matrices with repeated rows/columns confers the ability to approximate permanents of matrices with no repetitions. The reduction suffices to prove that GBS is hard in the constant-collision regime.
We study the forrelation problem: given a pair of $n$-bit Boolean functions $f$ and $g$, estimate the correlation between $f$ and the Fourier transform of $g$. This problem is known to provide the largest possible quantum speedup in terms of its query complexity and achieves the landmark oracle separation between the complexity class BQP and the Polynomial Hierarchy. Our first result is a classical algorithm for the forrelation problem which has runtime $O(n2^{n/2})$. This is a nearly quadratic improvement over the best previously known algorithm. Secondly, we show that any quantum query algorithm making $t$ queries to an $n$-bit oracle can be simulated by a classical query algorithm making only $O(2^{n(1-1/(2t))})$ queries. This fixes a gap in the literature arising from a recently discovered critical error in a previous proof; it matches recently established lower bounds (up to $\mathrm{poly}(n,t)$ factors) and thus characterizes the maximal separation in query complexity between quantum and classical algorithms. Finally, we introduce a graph-based forrelation problem where $n$ binary variables live at the vertices of some fixed graph and the functions $f,g$ are products of terms describing interactions between nearest-neighbor variables. We show that the graph-based forrelation problem can be solved on a classical computer in time $O(n)$ for any bipartite graph, any planar graph, or, more generally, any graph which can be partitioned into two subgraphs of constant treewidth. The graph-based forrelation problem is simply related to the variational energy achieved by the Quantum Approximate Optimization Algorithm (QAOA) with two entangling layers and Ising-type cost functions. By exploiting the connection between QAOA and graph-based forrelation, we were able to simulate the recently proposed Recursive QAOA with two entangling layers and $225$ qubits on a laptop computer.
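For reference, under the standard normalization $\Phi_{f,g} = 2^{-3n/2}\sum_{x,y} f(x)(-1)^{x\cdot y} g(y)$, the forrelation quantity can be computed exactly by brute force with a fast Walsh-Hadamard transform in $O(n2^n)$ time. The sketch below is only this brute-force baseline, not the $O(n2^{n/2})$ algorithm of the paper; the function names are illustrative.

```python
import numpy as np

# Brute-force reference computation of the forrelation quantity
#     Phi(f, g) = 2^{-3n/2} * sum_{x, y} f(x) * (-1)^{x.y} * g(y)
# via a fast Walsh-Hadamard transform, O(n 2^n) time.  A sanity check of the
# definition, not the O(n 2^{n/2}) algorithm from the abstract.

def walsh_hadamard(v):
    """Unnormalized fast Walsh-Hadamard transform of a length-2^n vector."""
    v = np.asarray(v, dtype=float).copy()
    h = 1
    while h < len(v):
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i + h].copy(), v[i + h:i + 2 * h].copy()
            v[i:i + h], v[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return v

def forrelation(f, g):
    """f, g: +/-1 valued vectors of length 2^n, indexed by x in {0,1}^n."""
    n = int(np.log2(len(f)))
    return float(f @ walsh_hadamard(g)) / 2 ** (3 * n / 2)

# Example: for g identically +1, the Fourier transform of g is supported on
# x = 0 only, so Phi(f, g) collapses to f(0) / 2^{n/2}.
n = 10
rng = np.random.default_rng(1)
f = rng.choice([-1, 1], size=2 ** n)
g = np.ones(2 ** n)
print(forrelation(f, g), f[0] / 2 ** (n / 2))   # the two values agree
```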
Recent work by Bravyi et al. constructs a relation problem that a noisy constant-depth quantum circuit (QNC$^0$) can solve with near certainty (probability $1 - o(1)$), but on which any bounded fan-in constant-depth classical circuit (NC$^0$) fails with some constant probability. We show that this robustness to noise can be achieved in the other low-depth quantum/classical circuit separations in this area. In particular, we show a general strategy for adding noise tolerance to the interactive protocols of Grier and Schaeffer. As a consequence, we obtain an unconditional separation between noisy QNC$^0$ circuits and AC$^0[p]$ circuits for all primes $p \geq 2$, and a conditional separation between noisy QNC$^0$ circuits and log-space classical machines under a plausible complexity-theoretic conjecture. A key component of this reduction is showing average-case hardness for the classical simulation tasks; that is, showing that a classical simulation of the quantum interactive task is still powerful even if it is allowed to err with constant probability over a uniformly random input. We show that this is true even for quantum tasks which are $\oplus$L-hard to simulate. To do this, we borrow techniques from randomized encodings used in cryptography.
A general quantum circuit can be simulated classically in exponential time. If it has a planar layout, then a tensor-network contraction algorithm due to Markov and Shi has a runtime exponential in the square root of its size, or more generally exponential in the treewidth of the underlying graph. Separately, Gottesman and Knill showed that if all gates are restricted to be Clifford, then there is a polynomial time simulation. We combine these two ideas and show that treewidth and planarity can be exploited to improve Clifford circuit simulation. Our main result is a classical algorithm with runtime scaling asymptotically as $n^{\omega/2}<n^{1.19}$ which samples from the output distribution obtained by measuring all $n$ qubits of a planar graph state in given Pauli bases. Here $\omega$ is the matrix multiplication exponent. We also provide a classical algorithm with the same asymptotic runtime which samples from the output distribution of any constant-depth Clifford circuit in a planar geometry. This improves on previously known classical algorithms, which have cubic runtime. A key ingredient is a mapping which, given a tree decomposition of some graph $G$, produces a Clifford circuit with a structure that mirrors the tree decomposition and which emulates measurement of the corresponding graph state. We provide a classical simulation of this circuit with the runtime stated above for planar graphs and otherwise $nt^{\omega-1}$, where $t$ is the width of the tree decomposition. Our algorithm incorporates two subroutines which may be of independent interest. The first is a matrix-multiplication-time version of the Gottesman-Knill simulation of multi-qubit measurement on stabilizer states. The second is a new classical algorithm for solving symmetric linear systems over $\mathbb{F}_2$ in a planar geometry, extending previous works which only applied to non-singular linear systems in the analogous setting.
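For context on the second subroutine, the underlying problem is linear algebra over $\mathbb{F}_2$. The sketch below is only the generic dense Gaussian-elimination baseline for a possibly singular system $Ax=b$ mod 2 (cubic time); it is not the paper's planar algorithm, and the names are illustrative.

```python
# Generic dense Gaussian elimination over F_2: given A (n x n, possibly
# singular) and b, return one solution of A x = b mod 2, or None if the system
# is inconsistent.  This is the cubic-time baseline; the abstract's subroutine
# exploits planarity to do better and is not reproduced here.

def solve_gf2(A, b):
    n = len(A)
    M = [list(row) + [bi] for row, bi in zip(A, b)]   # augmented matrix [A | b]
    pivots = []                                       # (row, column) of each pivot
    r = 0
    for c in range(n):
        pivot = next((i for i in range(r, n) if M[i][c]), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(n):                            # eliminate column c everywhere else
            if i != r and M[i][c]:
                M[i] = [x ^ y for x, y in zip(M[i], M[r])]
        pivots.append((r, c))
        r += 1
    # Inconsistent iff some row reads 0 = 1 after reduction.
    if any(row[n] == 1 and not any(row[:n]) for row in M):
        return None
    x = [0] * n                                       # free variables set to 0
    for row, col in pivots:
        x[col] = M[row][n]
    return x

# Example: a singular but consistent symmetric 3x3 system.
A = [[1, 1, 0],
     [1, 1, 0],
     [0, 0, 1]]
b = [1, 1, 0]
print(solve_gf2(A, b))   # [1, 0, 0]
```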
We show that the quantum parity gate on $n > 3$ qubits cannot be cleanly simulated by a quantum circuit with two layers of arbitrary C-SIGN gates of any arity and arbitrary 1-qubit unitary gates, regardless of the number of allowed ancilla qubits. This is the best known and first nontrivial separation between the parity gate and circuits of this form. The same bounds also apply to the quantum fanout gate. Our results are incomparable with those of Fang et al. [3], which apply to any constant depth but require a sublinear number of ancilla qubits on the simulating circuit.
Recent work of Bravyi et al. and follow-up work by Bene Watts et al. demonstrates a quantum advantage for shallow circuits: constant-depth quantum circuits can perform a task which constant-depth classical (i.e., AC$^0$) circuits cannot. Their results have the advantage that the quantum circuit is fairly practical, and their proofs are free of hardness assumptions (e.g., that factoring is classically hard). Unfortunately, constant-depth classical circuits are too weak to yield a convincing real-world demonstration of quantum advantage. We attempt to hold on to the advantages of the above results, while increasing the power of the classical model. Our main result is a two-round interactive task which is solved by a constant-depth quantum circuit (using only Clifford gates, between neighboring qubits of a 2D grid, with Pauli measurements), but such that any classical solution would necessarily solve $\oplus$L-hard problems. This implies that a more powerful class of constant-depth classical circuits (e.g., AC$^0[p]$ for any prime $p$) unconditionally cannot perform the task. Furthermore, under standard complexity-theoretic conjectures, log-depth circuits and log-space Turing machines cannot perform the task either. Using the same techniques, we prove hardness results for weaker complexity classes under more restrictive circuit topologies. Specifically, we give QNC$^0$ interactive tasks on $2 \times n$ and $1 \times n$ grids which require classical simulations of power NC$^1$ and AC$^{0}[6]$, respectively. Moreover, these hardness results are robust to a small constant fraction of error in the classical simulation. We use ideas and techniques from the theory of branching programs, quantum contextuality, measurement-based quantum computation, and Kilian randomization.
We present a trichotomy theorem for the quantum query complexity of regular languages. Every regular language has quantum query complexity $\Theta(1)$, $\tilde{\Theta}(\sqrt{n})$, or $\Theta(n)$. The extreme uniformity of regular languages prevents them from taking any other asymptotic complexity. This is in contrast to even the context-free languages, which we show can have query complexity $\Theta(n^c)$ for all computable $c \in [1/2,1]$. Our result implies an equivalent trichotomy for the approximate degree of regular languages, and a dichotomy, either $\Theta(1)$ or $\Theta(n)$, for sensitivity, block sensitivity, certificate complexity, deterministic query complexity, and randomized query complexity. The heart of the classification theorem is an explicit quantum algorithm which decides membership in any star-free language in $\tilde{O}(\sqrt{n})$ time. This well-studied family of regular languages admits many interesting characterizations, for instance, as those languages expressible as sentences in first-order logic over the natural numbers with the less-than relation. Therefore, not only do the star-free languages capture functions such as OR, they can also express functions such as "there exists a pair of 2's such that everything between them is a 0." Thus, we view the algorithm for star-free languages as a nontrivial generalization of Grover's algorithm which extends the quantum quadratic speedup to a much wider range of string-processing problems than was previously known. We show a variety of applications, all obtained as immediate consequences of our classification theorem: new quantum algorithms for dynamic constant-depth Boolean formulas, balanced parentheses nested constantly many levels deep, binary addition, a restricted word break problem, and path-discovery in narrow grids.
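Since the star-free algorithm is described as a generalization of Grover's algorithm, here is a statevector simulation of textbook Grover search deciding OR (is any bit of the string a 1?), the baseline quadratic speedup being generalized. It is not the paper's algorithm, and the names are illustrative.

```python
import numpy as np

# Statevector simulation of textbook Grover search deciding OR on an N-bit
# string, using ~sqrt(N) iterations.  Only the baseline primitive that the
# star-free language algorithm generalizes; not that algorithm itself.

def grover_or(bits):
    bits = np.asarray(bits)
    N = len(bits)
    marked = np.flatnonzero(bits)
    state = np.ones(N) / np.sqrt(N)                  # uniform superposition over positions
    k = max(1, int(np.pi / 4 * np.sqrt(N / marked.size))) if marked.size else 0
    for _ in range(k):                               # ~sqrt(N/M) Grover iterations
        state[marked] *= -1                          # oracle: phase-flip marked positions
        state = 2 * state.mean() - state             # diffusion: reflect about the mean
    return float(np.sum(state[marked] ** 2))         # probability of seeing a marked index

bits = np.zeros(1024, dtype=int)
bits[137] = 1
print(grover_or(bits))                               # close to 1: the single 1 is found w.h.p.
print(grover_or(np.zeros(1024, dtype=int)))          # 0.0: nothing to find
```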
In 2011, Aaronson gave a striking proof, based on quantum linear optics, that computing the permanent of a matrix is #P-hard. Aaronson's proof led naturally to hardness of approximation results for the permanent, and it was arguably simpler than Valiant's seminal 1979 proof of the same fact. Nevertheless, it did not establish #P-hardness of the permanent for any class of matrices for which hardness was not already known. In this paper, we present a collection of new results about matrix permanents that are derived primarily via these linear optical techniques. First, we show that computing the permanent of a real orthogonal matrix is #P-hard. Much like Aaronson's original proof, ours shows that even a multiplicative approximation remains #P-hard to compute. The hardness result even translates to permanents over finite fields, where computing the permanent of an orthogonal matrix is Mod$_p$P-hard over the finite field $\mathbb{F}_{p^4}$ for all primes $p$ not equal to 2 or 3. Interestingly, this characterization is tight: in fields of characteristic 2, the permanent coincides with the determinant; and in fields of characteristic 3, one can efficiently compute the permanent of an orthogonal matrix by a nontrivial result of Kogan. Finally, we use more elementary arguments to prove #P-hardness for the permanent of a positive semidefinite matrix, which shows that certain probabilities of boson sampling experiments with thermal states are hard to compute exactly, despite the fact that they can be efficiently sampled by a classical computer.
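To make the object of study concrete, the permanent can always be computed exactly in exponential time, for instance with Ryser's inclusion-exclusion formula, as in the sketch below. The hardness results above indicate that, for exact (or multiplicative-approximation) worst-case computation, this kind of exponential scaling is not expected to be avoidable, even for the restricted matrix classes considered.

```python
from itertools import combinations
import numpy as np

# Exact permanent via Ryser's inclusion-exclusion formula:
#   perm(A) = (-1)^n * sum_{S subset of columns} (-1)^{|S|} * prod_i sum_{j in S} a_ij
# which runs in O(2^n poly(n)) time.  Illustration only.

def permanent(A):
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    total = 0.0
    for k in range(1, n + 1):                  # the empty column set contributes 0
        for cols in combinations(range(n), k):
            row_sums = A[:, cols].sum(axis=1)
            total += (-1) ** k * np.prod(row_sums)
    return (-1) ** n * total

# Sanity checks: permanent of the identity is 1; for a 2x2 matrix it is ad + bc.
print(permanent(np.eye(3)))          # 1.0
print(permanent([[1, 2], [3, 4]]))   # 1*4 + 2*3 = 10.0
```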
What is the minimum amount of information and time needed to solve 2SAT? When the instance is known, it can be solved in polynomial time, but is this also possible without knowing the instance? Bei, Chen and Zhang (STOC '13) considered a model where the input is accessed by proposing possible assignments to a special oracle. This oracle, on encountering some constraint unsatisfied by the proposal, returns only the constraint index. It turns out that, in this model, even 1SAT cannot be solved in polynomial time unless P=NP. Hence, we consider a model in which the input is accessed by proposing probability distributions over assignments to the variables. The oracle then returns the index of the constraint that is most likely to be violated by this distribution. We show that the information obtained this way is sufficient to solve 1SAT in polynomial time, even when the clauses can be repeated. For 2SAT, as long as there are no repeated clauses, we can even learn an equivalent formula for the hidden instance in polynomial time, and hence also solve it. Furthermore, we extend these results to the quantum regime. We show that in this setting 1QSAT can be solved in polynomial time up to constant precision, and 2QSAT can be learnt in polynomial time up to inverse polynomial precision.
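For comparison with the oracle model above, the following is the textbook polynomial-time algorithm for 2SAT when the instance is known: build the implication graph and compute its strongly connected components (here via Kosaraju's two-pass DFS), declaring the instance unsatisfiable exactly when some variable shares a component with its negation. This is only the standard offline algorithm, not the learning protocol of the paper; the names are illustrative.

```python
import sys
sys.setrecursionlimit(10000)

# Textbook offline 2SAT: implication graph + strongly connected components.
# Clause (a or b) contributes the implications (!a -> b) and (!b -> a).

def solve_2sat(n, clauses):
    """n variables (1..n); clauses are pairs of nonzero ints, negative = negated.
    Returns a satisfying assignment as a list of bools, or None if unsatisfiable."""
    def lit(x):                               # literal x -> node index
        return 2 * (abs(x) - 1) + (x < 0)

    graph = [[] for _ in range(2 * n)]
    rgraph = [[] for _ in range(2 * n)]
    for a, b in clauses:
        for u, v in ((lit(-a), lit(b)), (lit(-b), lit(a))):
            graph[u].append(v)
            rgraph[v].append(u)

    order, seen = [], [False] * (2 * n)
    def dfs1(u):                              # first pass: record finish order
        seen[u] = True
        for v in graph[u]:
            if not seen[v]:
                dfs1(v)
        order.append(u)
    for u in range(2 * n):
        if not seen[u]:
            dfs1(u)

    comp = [-1] * (2 * n)
    def dfs2(u, c):                           # second pass on the reverse graph
        comp[u] = c
        for v in rgraph[u]:
            if comp[v] == -1:
                dfs2(v, c)
    c = 0
    for u in reversed(order):                 # components emerge in topological order
        if comp[u] == -1:
            dfs2(u, c)
            c += 1

    if any(comp[2 * i] == comp[2 * i + 1] for i in range(n)):
        return None                           # x and !x in the same SCC: unsatisfiable
    return [comp[2 * i] > comp[2 * i + 1] for i in range(n)]

# (x1 or x2) and (!x1 or x2) and (!x2 or x3)
print(solve_2sat(3, [(1, 2), (-1, 2), (-2, 3)]))   # [True, True, True]
print(solve_2sat(1, [(1, 1), (-1, -1)]))           # None (forces both x1 and !x1)
```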
We examine the following problem: given a collection of Clifford gates, describe the set of unitaries generated by circuits composed of those gates. Specifically, we allow the standard circuit operations of composition and tensor product, as well as ancillary workspace qubits, as long as they start and end in states uncorrelated with the input; this rules out the common "magic state injection" techniques that make Clifford circuits universal. We show that there are exactly 57 classes of Clifford unitaries and present a full classification characterizing the gate sets which generate them. This is the first attempt at a quantum extension of the classification of reversible classical gates introduced by Aaronson et al., another part of an ambitious program to classify all quantum gate sets. The classification uses, at its center, a reinterpretation of the tableau representation of Clifford gates to give circuit decompositions, from which elementary generators can easily be extracted. The 57 different classes are generated in this way, 30 of which arise from the single-qubit subgroups of the Clifford group. At a high level, the remaining classes are arranged according to the bases they preserve. For instance, the CNOT gate preserves the X and Z bases because it maps X-basis elements to X-basis elements and Z-basis elements to Z-basis elements. The remaining classes are characterized by more subtle tableau invariants; for instance, the $T_4$ gate and the phase gate together generate a proper subclass of the Z-preserving gates.
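The "bases it preserves" statement for CNOT can be checked directly: a Clifford tableau records how a gate conjugates the Pauli generators, and CNOT sends X-type Paulis to X-type and Z-type to Z-type. The small sketch below verifies the four tableau rows numerically; it is an illustration only, not part of the classification.

```python
import numpy as np

# Sanity check of the tableau viewpoint: a Clifford gate is determined by how it
# conjugates the Pauli generators.  CNOT maps X-type Paulis to X-type and Z-type
# to Z-type, which is what "preserving the X and Z bases" refers to above.

I = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])

def conj(U, P):
    return U @ P @ U.conj().T

rows = {
    "X1 -> X1 X2": (np.kron(X, I), np.kron(X, X)),
    "X2 -> X2":    (np.kron(I, X), np.kron(I, X)),
    "Z1 -> Z1":    (np.kron(Z, I), np.kron(Z, I)),
    "Z2 -> Z1 Z2": (np.kron(I, Z), np.kron(Z, Z)),
}
for name, (P, expected) in rows.items():
    assert np.allclose(conj(CNOT, P), expected), name
print("CNOT tableau rows verified:", ", ".join(rows))
```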
We present a complete classification of all possible sets of classical reversible gates acting on bits, in terms of which reversible transformations they generate, assuming swaps and ancilla bits are available for free. Our classification can be seen as the reversible-computing analogue of Post's lattice, a central result in mathematical logic from the 1940s. It is a step toward the ambitious goal of classifying all possible quantum gate sets acting on qubits. Our theorem implies a linear-time algorithm (which we have implemented) that takes as input the truth tables of reversible gates G and H and decides whether G generates H. Previously, this problem was not even known to be decidable. The theorem also implies that any n-bit reversible circuit can be "compressed" to an equivalent circuit, over the same gates, that uses at most 2^n*poly(n) gates and O(1) ancilla bits; these are the first known upper bounds on these quantities, and they are close to optimal. Finally, the theorem implies that every non-degenerate reversible gate can implement either every reversible transformation, or every affine transformation, when restricted to an "encoded subspace." Briefly, the theorem says that every set of reversible gates generates either all reversible transformations on n-bit strings (as the Toffoli gate does); no transformations; all transformations that preserve Hamming weight (as the Fredkin gate does); all transformations that preserve Hamming weight mod k for some k; all affine transformations (as the Controlled-NOT gate does); all affine transformations that preserve Hamming weight mod 2 or mod 4, inner products mod 2, or a combination thereof; or one of the previous classes augmented by a NOT or NOTNOT gate. Ruling out the possibility of additional classes not on this list requires some arguments about polynomials, lattices, and Diophantine equations.
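Two of the invariants in the list can be checked mechanically, as in the small sketch below: the Fredkin (controlled-swap) gate preserves Hamming weight, while CNOT acts as an affine (indeed linear) map mod 2. This is only an illustration of the invariants, not part of the classification proof.

```python
from itertools import product

# Quick check of two invariants named in the classification: Fredkin preserves
# Hamming weight, and CNOT acts as a linear (hence affine) map over F_2.

def fredkin(c, a, b):
    """Controlled-swap: swap a and b exactly when the control c is 1."""
    return (c, b, a) if c else (c, a, b)

def cnot(a, b):
    """Controlled-NOT: flip the target b exactly when the control a is 1."""
    return (a, a ^ b)

# Fredkin: output Hamming weight equals input Hamming weight on all 8 inputs.
assert all(sum(fredkin(*x)) == sum(x) for x in product((0, 1), repeat=3))

# CNOT: each output bit is a linear function of the input bits mod 2.
assert all(cnot(a, b) == (a % 2, (a + b) % 2) for a, b in product((0, 1), repeat=2))

print("Fredkin preserves Hamming weight; CNOT is affine over F_2.")
```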