
Assessing Reusability of Deep Learning-Based Monotherapy Drug Response Prediction Models Trained with Omics Data

Authors: Jamie C. Overbeek, Alexander Partin, Thomas S. Brettin, Nicholas Chia, Oleksandr Narykov, Priyanka Vasanthakumari, Andreas Wilke, Yitan Zhu, Austin Clyde, Sara Jones, Rohan Gnanaolivu, Yuanhang Liu, Jun Jiang, Chen Wang, Carter Knutson, Andrew McNaughton, Neeraj Kumar, Gayara Demini Fernando, Souparno Ghosh, Cesar Sanchez-Villalobos, Ruibo Zhang, Ranadip Pal, M. Ryan Weil, Rick L. Stevens

Abstract: Cancer drug response prediction (DRP) models present a promising approach towards precision oncology, tailoring treatments to individual patient profiles. While deep learning (DL) methods have shown great potential in this area, models that can be successfully translated into clinical practice and shed light on the molecular mechanisms underlying treatment response will likely emerge from collaborative research efforts. This highlights the need for reusable and adaptable models that can be improved and tested by the wider scientific community. In this study, we present a scoring system for assessing the reusability of DRP models, and apply it to 17 peer-reviewed DL-based DRP models. As part of the IMPROVE (Innovative Methodologies and New Data for Predictive Oncology Model Evaluation) project, which aims to develop methods for systematic evaluation and comparison of DL models across scientific domains, we analyzed these 17 DRP models focusing on three key categories: software environment, code modularity, and data availability and preprocessing. While not the primary focus, we also attempted to reproduce key performance metrics to verify model behavior and adaptability. Our assessment of 17 DRP models reveals both strengths and shortcomings in model reusability. To promote rigorous practices and open-source sharing, we offer recommendations for developing and sharing prediction models. Following these recommendations can address many of the issues identified in this study, improving model reusability without adding significant burdens on researchers. This work offers the first comprehensive assessment of reusability and reproducibility across diverse DRP models, providing insights into current model sharing practices and promoting standards within the DRP and broader AI-enabled scientific research community.

Submitted 18 September, 2024; originally announced September 2024.

Comments: 12 pages, 2 figures

arXiv:2409.06317 [pdf, other]

The Black-Box Simulation Barrier Persists in a Fully Quantum World

Authors: Nai-Hui Chia, Kai-Min Chung, Xiao Liang, Jiahui Liu

Abstract: Zero-Knowledge (ZK) protocols have been intensely studied due to their fundamental importance and versatility. However, quantum information's inherent differences significantly alter the landscape, necessitating a re-examination of ZK designs. A crucial aspect is round complexity, linked to $\textit{simulation}$, which forms the foundation of ZK definition and security proofs. In the $\textit{post-quantum}$ setting, where honest parties and channels are classical but adversaries quantum, Chia et al. [FOCS'21] showed constant-round $\textit{black-box-simulatable}$ ZK arguments (BBZK) for $\mathbf{NP}$ are impossible unless $\mathbf{NP} \subseteq \mathbf{BQP}$. But this problem remains open when all parties and communication are quantum. Indeed, this problem interests the broader theory of quantum computing. Investigating how quantum power alters tasks like the $\textit{unconditional}$ security of QKD and incorporating OT in MiniQCrypt has been crucial. Moreover, quantum communication has enabled round compression for commitments and interactive arguments. Along this line, understanding if quantum computing could fundamentally change ZK protocols is vital. We resolved this problem by proving that only languages in $\mathbf{BQP}$ admit constant-round $\textit{fully-quantum}$ BBZK. This result holds significant implications. Firstly, it illuminates the nature of quantum zero-knowledge and provides valuable insights for designing future protocols in the quantum realm. Secondly, it relates ZK round complexity with the intriguing problem of $\mathbf{BQP}$ vs $\mathbf{QMA}$, which is out of the reach of previous analogue impossibility results in the classical or post-quantum setting. Lastly, it justifies the need for the $\textit{non-black-box}$ simulation techniques or the relaxed security notions employed in existing constant-round fully-quantum BBZK protocols.

Submitted 10 September, 2024; originally announced September 2024.

arXiv:2407.04486 [pdf, other]

Variational and Explanatory Neural Networks for Encoding Cancer Profiles and Predicting Drug Responses

Authors: Tianshu Feng, Rohan Gnanaolivu, Abolfazl Safikhani, Yuanhang Liu, Jun Jiang, Nicholas Chia, Alexander Partin, Priyanka Vasanthakumari, Yitan Zhu, Chen Wang

Abstract: Human cancers present a significant public health challenge and require the discovery of novel drugs through translational research. Transcriptomics profiling data that describe molecular activities in tumors and cancer cell lines are widely utilized for predicting anti-cancer drug responses. However, existing AI models face challenges due to noise in transcriptomics data and lack of biological interpretability. To overcome these limitations, we introduce VETE (Variational and Explanatory Transcriptomics Encoder), a novel neural network framework that incorporates a variational component to mitigate noise effects and integrates traceable gene ontology into the neural network architecture for encoding cancer transcriptomics data. Key innovations include a local interpretability-guided method for identifying ontology paths, a visualization tool to elucidate biological mechanisms of drug responses, and the application of centralized large-scale hyperparameter optimization. VETE demonstrated robust accuracy in cancer cell line classification and drug response prediction. Additionally, it provided traceable biological explanations for both tasks and offered insights into the mechanisms underlying its predictions. VETE bridges the gap between AI-driven predictions and biologically meaningful insights in cancer research, which represents a promising advancement in the field.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2406.13879 [pdf, other]

A Catalyst Framework for the Quantum Linear System Problem via the Proximal Point Algorithm

Authors: Junhyung Lyle Kim, Nai-Hui Chia, Anastasios Kyrillidis

Abstract: Solving systems of linear equations is a fundamental problem, but it can be computationally intensive for classical algorithms in high dimensions. Existing quantum algorithms can achieve exponential speedups for the quantum linear system problem (QLSP) in terms of the problem dimension, but even such a theoretical advantage is bottlenecked by the condition number of the coefficient matrix. In this work, we propose a new quantum algorithm for QLSP inspired by the classical proximal point algorithm (PPA). Our proposed method can be viewed as a meta-algorithm that allows inverting a modified matrix via an existing \texttt{QLSP\_solver}, thereby directly approximating the solution vector instead of approximating the inverse of the coefficient matrix. By carefully choosing the step size $η$, the proposed algorithm can effectively precondition the linear system to mitigate the dependence on condition numbers that hindered the applicability of previous approaches.

Submitted 19 June, 2024; originally announced June 2024.
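
Illustrative sketch (not from the paper): the classical proximal point iteration that the abstract above cites as inspiration can be written down for a symmetric positive definite system $Ax = b$. Applying PPA to $f(x) = \frac{1}{2}x^\top A x - b^\top x$ gives the update $(I + ηA)\,x_{t+1} = x_t + ηb$, so each step inverts the better-conditioned matrix $I + ηA$ rather than $A$ itself. The function names `solve` and `ppa_linear_solve`, the choice of $f$, and the use of Gaussian elimination as a stand-in for a `QLSP_solver` are assumptions made here for illustration; the quantum algorithm itself is not described in enough detail in the abstract to reproduce.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting.

    Hypothetical classical stand-in for the QLSP_solver subroutine.
    """
    n = len(v)
    aug = [list(map(float, M[i])) + [float(v[i])] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def ppa_linear_solve(A, b, eta=10.0, iters=50):
    """Classical proximal point iteration for A x = b (A assumed SPD).

    Each step solves (I + eta * A) x_next = x + eta * b.
    """
    n = len(b)
    # Modified matrix I + eta * A; for SPD A its condition number is
    # (1 + eta * lambda_max) / (1 + eta * lambda_min), no larger than A's.
    M = [[(1.0 if i == j else 0.0) + eta * A[i][j] for j in range(n)]
         for i in range(n)]
    x = [0.0] * n
    for _ in range(iters):
        rhs = [x[i] + eta * b[i] for i in range(n)]
        x = solve(M, rhs)
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = ppa_linear_solve(A, b)
print(x)
```

In this toy setting the error shrinks by a factor of at most $1/(1 + ηλ_{\min})$ per step, so a larger $η$ means fewer outer iterations but a modified matrix closer to $ηA$; that trade-off is the sense in which the step size acts as a preconditioning knob.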

arXiv:2405.10242 [pdf, ps, other]

Quantum State Learning Implies Circuit Lower Bounds

Authors: Nai-Hui Chia, Daniel Liang, Fang Song

Abstract: We establish connections between state tomography, pseudorandomness, quantum state synthesis, and circuit lower bounds. In particular, let $\mathfrak{C}$ be a family of non-uniform quantum circuits of polynomial size and suppose that there exists an algorithm that, given copies of $|ψ\rangle$, distinguishes whether $|ψ\rangle$ is produced by $\mathfrak{C}$ or is Haar random, promised one of these is the case. For arbitrary fixed constant $c$, we show that if the algorithm uses at most $O(2^{n^c})$ time and $2^{n^{0.99}}$ samples then $\mathsf{stateBQE} \not\subset \mathsf{state}\mathfrak{C}$. Here $\mathsf{stateBQE} := \mathsf{stateBQTIME}[2^{O(n)}]$ and $\mathsf{state}\mathfrak{C}$ are state synthesis complexity classes as introduced by Rosenthal and Yuen (ITCS 2022), which capture problems with classical inputs but quantum output. Note that efficient tomography implies a similarly efficient distinguishing algorithm against Haar random states, even for nearly exponential-time algorithms. Because every state produced by a polynomial-size circuit can be learned with $2^{O(n)}$ samples and time, or $O(n^{ω(1)})$ samples and $2^{O(n^{ω(1)})}$ time, we show that even slightly non-trivial quantum state tomography algorithms would lead to new statements about quantum state synthesis. Finally, a slight modification of our proof shows that distinguishing algorithms for quantum states can imply circuit lower bounds for decision problems as well. This helps shed light on why time-efficient tomography algorithms for non-uniform quantum circuit classes have seen only limited and partial progress. Our work parallels results by Arunachalam et al. (FOCS 2021) that revealed a similar connection between quantum learning of Boolean functions and circuit lower bounds for classical circuit classes, but modified for the purposes of state tomography and state synthesis.

Submitted 16 May, 2024; originally announced May 2024.

Comments: 53 pages

arXiv:2405.07137 [pdf, other]

Oracle Separation between Noisy Quantum Polynomial Time and the Polynomial Hierarchy

Authors: Nai-Hui Chia, Min-Hsiu Hsieh, Shih-Han Hung, En-Jui Kuo

Abstract: This work investigates the oracle separation between the physically motivated complexity class of noisy quantum circuits, inspired by definitions such as those presented by Chen, Cotler, Huang, and Li (2022). We establish that with a constant error rate, separation can be achieved in terms of NP. When the error rate is $Ω(\log n/n)$, we can extend this result to the separation of PH. Notably, our oracles, in all separations, do not necessitate error correction schemes or fault tolerance, as all quantum circuits are of constant depth. This indicates that even quantum computers with minor errors, without error correction, may surpass classical complexity classes under various scenarios and assumptions. We also explore various common noise settings and present new classical hardness results, generalizing those found in studies by Raz and Tal (2022) and Bassirian, Bouland, Fefferman, Gunn, and Tal (2021), which are of independent interest.

Submitted 14 May, 2024; v1 submitted 11 May, 2024; originally announced May 2024.

arXiv:2310.14464 [pdf, ps, other]

A Cryptographic Perspective on the Verifiability of Quantum Advantage

Authors: Nai-Hui Chia, Honghao Fu, Fang Song, Penghui Yao

Abstract: In recent years, achieving verifiable quantum advantage on a NISQ device has emerged as an important open problem in quantum information. The sampling-based quantum advantages are not known to have efficient verification methods. This paper investigates the verification of quantum advantage from a cryptographic perspective. We establish a strong connection between the verifiability of quantum advantage and cryptographic and complexity primitives, including efficiently samplable, statistically far but computationally indistinguishable pairs of (mixed) quantum states ($\mathsf{EFI}$), pseudorandom states ($\mathsf{PRS}$), and variants of minimum circuit size problems ($\mathsf{MCSP}$). Specifically, we prove that a) a sampling-based quantum advantage is either verifiable or can be used to build $\mathsf{EFI}$ and even $\mathsf{PRS}$ and b) polynomial-time algorithms for a variant of $\mathsf{MCSP}$ would imply efficient verification of quantum advantages. Our work shows that the quest for verifiable quantum advantages may lead to applications of quantum cryptography, and the construction of quantum primitives can provide new insights into the verifiability of quantum advantages.

Submitted 22 October, 2023; originally announced October 2023.

Comments: 21 pages, 2 figures

arXiv:2308.07014 [pdf, other]

doi 10.22331/q-2024-02-12-1250

Efficient learning of $t$-doped stabilizer states with single-copy measurements

Authors: Nai-Hui Chia, Ching-Yi Lai, Han-Hsuan Lin

Abstract: One of the primary objectives in the field of quantum state learning is to develop algorithms that are time-efficient for learning states generated from quantum circuits. Earlier investigations have demonstrated time-efficient algorithms for states generated from Clifford circuits with at most $\log(n)$ non-Clifford gates. However, these algorithms necessitate multi-copy measurements, posing implementation challenges in the near term due to the requisite quantum memory. On the contrary, using solely single-qubit measurements in the computational basis is insufficient in learning even the output distribution of a Clifford circuit with one additional $T$ gate under reasonable post-quantum cryptographic assumptions. In this work, we introduce an efficient quantum algorithm that employs only nonadaptive single-copy measurement to learn states produced by Clifford circuits with a maximum of $O(\log n)$ non-Clifford gates, filling a gap between the previous positive and negative results.

Submitted 5 February, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: 8 pages

Journal ref: Quantum 8, 1250 (2024)

arXiv:2305.12444 [pdf, other]

On the Impossibility of General Parallel Fast-forwarding of Hamiltonian Simulation

Authors: Nai-Hui Chia, Kai-Min Chung, Yao-Ching Hsieh, Han-Hsuan Lin, Yao-Ting Lin, Yu-Ching Shen

Abstract: Hamiltonian simulation is one of the most important problems in the field of quantum computing. There have been extended efforts on designing algorithms for faster simulation, and the evolution time $T$ for the simulation turns out to largely affect algorithm runtime. While there are some specific types of Hamiltonians that can be fast-forwarded, i.e., simulated within time $o(T)$, for large enough classes of Hamiltonians (e.g., all local/sparse Hamiltonians), existing simulation algorithms require running time at least linear in the evolution time $T$. On the other hand, while there exist lower bounds of $Ω(T)$ circuit size for some large classes of Hamiltonians, these lower bounds do not rule out the possibility of Hamiltonian simulation with large but "low-depth" circuits that run operations in parallel. Therefore, it is intriguing whether we can achieve fast Hamiltonian simulation with the power of parallelism. In this work, we give a negative result for the above open problem, showing that sparse Hamiltonians and (geometrically) local Hamiltonians cannot be parallelly fast-forwarded. In the oracle model, we prove that there are time-independent sparse Hamiltonians that cannot be simulated via an oracle circuit of depth $o(T)$. In the plain model, relying on the random oracle heuristic, we show that there exist time-independent local Hamiltonians and time-dependent geometrically local Hamiltonians that cannot be simulated via an oracle circuit of depth $o(T/n^c)$, where the Hamiltonians act on $n$ qubits, and $c$ is a constant.

Submitted 21 May, 2023; originally announced May 2023.

Comments: 44 pages, 7 figures

arXiv:2210.03198 [pdf, other]

Metabolic Model-based Ecological Modeling for Probiotic Design

Authors: James D. Brunner, Nicholas Chia

Abstract: The microbial community composition in the human gut has a profound effect on human health. This observation has led to extensive use of microbiome therapies, including over-the-counter "probiotic" treatments intended to alter the composition of the microbiome. Despite so much promise and commercial interest, the factors that contribute to the success or failure of microbiome-targeted treatments remain unclear. We investigate the biotic interactions that lead to successful engraftment of a novel bacterial strain introduced to the microbiome as in probiotic treatments. We use pairwise genome-scale metabolic modeling with a generalized resource allocation constraint to build a network of interactions between 818 species with well-developed models available in the AGORA database. We create induced sub-graphs using the taxa present in samples from three experimental engraftment studies and assess the likelihood of invader engraftment based on network structure. To do so, we use a set of dynamical models designed to connect network topology to growth dynamics. We show that a generalized Lotka-Volterra model has a strong ability to predict if a particular invader or probiotic will successfully engraft into an individual's microbiome. Furthermore, we show that the mechanistic nature of the model is useful for revealing which microbe-microbe interactions potentially drive engraftment.

Submitted 6 October, 2022; originally announced October 2022.

Comments: 18 pages, 6 figures

arXiv:2206.01345 [pdf, other]

QMLP: An Error-Tolerant Nonlinear Quantum MLP Architecture using Parameterized Two-Qubit Gates

Authors: Cheng Chu, Nai-Hui Chia, Lei Jiang, Fan Chen

Abstract: Despite potential quantum supremacy, state-of-the-art quantum neural networks (QNNs) suffer from low inference accuracy. First, the current Noisy Intermediate-Scale Quantum (NISQ) devices with high error rates of 0.001 to 0.01 significantly degrade the accuracy of a QNN. Second, although recently proposed Re-Uploading Units (RUUs) introduce some non-linearity into the QNN circuits, the theory behind them is not fully understood. Furthermore, previous RUUs that repeatedly upload original data can only provide marginal accuracy improvements. Third, the current QNN circuit ansatz uses fixed two-qubit gates to enforce maximum entanglement capability, making task-specific entanglement tuning impossible and resulting in poor overall performance. In this paper, we propose a Quantum Multilayer Perceptron (QMLP) architecture featuring error-tolerant input embedding, rich nonlinearity, and an enhanced variational circuit ansatz with parameterized two-qubit entangling gates. Compared to prior art, QMLP increases the inference accuracy on the 10-class MNIST dataset by 10% with 2 times fewer quantum gates and 3 times fewer parameters. Our source code is available and can be found in [1].

Submitted 2 June, 2022; originally announced June 2022.

arXiv:2205.04656 [pdf, ps, other]

Classical verification of quantum depth

Authors: Nai-Hui Chia, Shih-Han Hung

Abstract: We present two protocols for classical verification of quantum depth. Our protocols allow a purely classical verifier to distinguish devices with different quantum circuit depths even in the presence of classical computation. We show that a device with quantum circuit depth at most d will be rejected by the verifier even if the prover applies additional polynomial-time classical computation to cheat. On the other hand, the verifier accepts a device which has quantum circuit depth d' for some d'>d. In our first protocol, we introduce an additional untrusted quantum machine which shares entanglements with the target machine. Applying a robust self-test, our first protocol certifies the depth of the target machine with information theoretic security and nearly optimal separation. The protocol relies on the oracle separation problem for quantum depth by Chia, Chung and Lai [STOC 2020] and a transformation from an oracle separation problem to a two-player non-local game. Our second protocol certifies the quantum depth of a single device based on quantum hardness of learning with errors. The protocol relies on the noisy trapdoor claw-free function family and the idea of pointer chasing to force the prover to keep quantum coherence until all preceding message exchanges are completed. To our knowledge, we give the first constructions for distinguishing hybrid quantum-classical computers with different circuit depths in unrelativized models.

Submitted 9 May, 2022; originally announced May 2022.

Comments: 39 pages

arXiv:2111.08665 [pdf, ps, other]

Post-Quantum Simulatable Extraction with Minimal Assumptions: Black-Box and Constant-Round

Authors: Nai-Hui Chia, Kai-Min Chung, Xiao Liang, Takashi Yamakawa

Abstract: From the minimal assumption of post-quantum semi-honest oblivious transfers, we build the first $ε$-simulatable two-party computation (2PC) against quantum polynomial-time (QPT) adversaries that is both constant-round and black-box (for both the construction and security reduction). A recent work by Chia, Chung, Liu, and Yamakawa (FOCS'21) shows that post-quantum 2PC with standard simulation-based security is impossible in constant rounds, unless either $\mathbf{NP} \subseteq \mathbf{BQP}$ or relying on non-black-box simulation. The $ε$-simulatability we target is a relaxation of the standard simulation-based security that allows for an arbitrarily small noticeable simulation error $ε$. Moreover, when quantum communication is allowed, we can further weaken the assumption to post-quantum secure one-way functions (PQ-OWFs), while maintaining the constant-round and black-box property. Our techniques also yield the following set of constant-round and black-box two-party protocols secure against QPT adversaries, only assuming black-box access to PQ-OWFs: - extractable commitments for which the extractor is also an $ε$-simulator; - $ε$-zero-knowledge commit-and-prove whose commit stage is extractable with $ε$-simulation; - $ε$-simulatable coin-flipping; - $ε$-zero-knowledge arguments of knowledge for $\mathbf{NP}$ for which the knowledge extractor is also an $ε$-simulator; - $ε$-zero-knowledge arguments for $\mathbf{QMA}$. At the heart of the above results is a black-box extraction lemma showing how to efficiently extract secrets from QPT adversaries while disturbing their quantum state in a controllable manner, i.e., achieving $ε$-simulatability of the post-extraction state of the adversary.

Submitted 4 November, 2023; v1 submitted 16 November, 2021; originally announced November 2021.

arXiv:2111.07348 [pdf, other]

Invariant Risk Minimisation for Cross-Organism Inference: Substituting Mouse Data for Human Data in Human Risk Factor Discovery

Authors: Odhran O'Donoghue, Paul Duckworth, Giuseppe Ughi, Linus Scheibenreif, Kia Khezeli, Adrienne Hoarfrost, Samuel Budd, Patrick Foley, Nicholas Chia, John Kalantari, Graham Mackintosh, Frank Soboczenski, Lauren Sanders

Abstract: Human medical data can be challenging to obtain due to data privacy concerns, difficulties conducting certain types of experiments, or prohibitive associated costs. In many settings, data from animal models or in-vitro cell lines are available to help augment our understanding of human data. However, this data is known for having low etiological validity in comparison to human data. In this work, we augment small human medical datasets with in-vitro data and animal models. We use Invariant Risk Minimisation (IRM) to elucidate invariant features by considering cross-organism data as belonging to different data-generating environments. Our models identify genes of relevance to human cancer development. We observe a degree of consistency between varying the amounts of human and mouse data used; however, further work is required to obtain conclusive insights. As a secondary contribution, we enhance existing open source datasets and provide two uniformly processed, cross-organism, homologue gene-matched datasets to the community.

Submitted 13 February, 2022; v1 submitted 14 November, 2021; originally announced November 2021.

Comments: Machine Learning for Health (ML4H) - Extended Abstract

arXiv:2108.03171 [pdf, ps, other]

Quantum Meets the Minimum Circuit Size Problem

Authors: Nai-Hui Chia, Chi-Ning Chou, Jiayu Zhang, Ruizhe Zhang

Abstract: In this work, we initiate the study of the Minimum Circuit Size Problem (MCSP) in the quantum setting. MCSP is a problem to compute the circuit complexity of Boolean functions. It is a fascinating problem in complexity theory -- its hardness is mysterious, and a better understanding of its hardness can have surprising implications to many fields in computer science. We first define and investigate the basic complexity-theoretic properties of minimum quantum circuit size problems for three natural objects: Boolean functions, unitaries, and quantum states. We show that these problems are not trivially in NP but in QCMA (or have QCMA protocols). Next, we explore the relations between the three quantum MCSPs and their variants. We discover that some reductions that are not known for classical MCSP exist for quantum MCSPs for unitaries and states, e.g., search-to-decision reduction and self-reduction. Finally, we systematically generalize results known for classical MCSP to the quantum setting (including quantum cryptography, quantum learning theory, quantum circuit lower bounds, and quantum fine-grained complexity) and also find new connections to tomography and quantum gravity. Due to the fundamental differences between classical and quantum circuits, most of our results require extra care and reveal properties and phenomena unique to the quantum setting. Our findings could be of interest for future studies, and we pose several open problems for further exploration along this direction.

Submitted 14 September, 2021; v1 submitted 6 August, 2021; originally announced August 2021.

arXiv:2106.09777 [pdf, other]

On Invariance Penalties for Risk Minimization

Authors: Kia Khezeli, Arno Blaas, Frank Soboczenski, Nicholas Chia, John Kalantari

Abstract: The Invariant Risk Minimization (IRM) principle was first proposed by Arjovsky et al. [2019] to address the domain generalization problem by leveraging data heterogeneity from differing experimental conditions. Specifically, IRM seeks to find a data representation under which an optimal classifier remains invariant across all domains. Despite the conceptual appeal of IRM, the effectiveness of the originally proposed invariance penalty has recently been brought into question. In particular, there exist counterexamples for which that invariance penalty can be arbitrarily small for non-invariant data representations. We propose an alternative invariance penalty by revisiting the Gramian matrix of the data representation. We discuss the role of its eigenvalues in the relationship between the risk and the invariance penalty, and demonstrate that it is ill-conditioned for said counterexamples. The proposed approach is guaranteed to recover an invariant representation for linear settings under mild non-degeneracy conditions. Its effectiveness is substantiated by experiments on DomainBed and InvarianceUnitTest, two extensive test beds for domain generalization.

Submitted 17 June, 2021; originally announced June 2021.

arXiv:2103.11244 [pdf, ps, other]

On the Impossibility of Post-Quantum Black-Box Zero-Knowledge in Constant Rounds

Authors: Nai-Hui Chia, Kai-Min Chung, Qipeng Liu, Takashi Yamakawa

Abstract: We investigate the existence of constant-round post-quantum black-box zero-knowledge protocols for $\mathbf{NP}$. As a main result, we show that there is no constant-round post-quantum black-box zero-knowledge argument for $\mathbf{NP}$ unless $\mathbf{NP}\subseteq \mathbf{BQP}$. As constant-round black-box zero-knowledge arguments for $\mathbf{NP}$ exist in the classical setting, our main result points out a fundamental difference between post-quantum and classical zero-knowledge protocols. Combining previous results, we conclude that unless $\mathbf{NP}\subseteq \mathbf{BQP}$, constant-round post-quantum zero-knowledge protocols for $\mathbf{NP}$ exist if and only if we use non-black-box techniques or relax certain security requirements such as relaxing standard zero-knowledge to $ε$-zero-knowledge. Additionally, we also prove that three-round and public-coin constant-round post-quantum black-box $ε$-zero-knowledge arguments for $\mathbf{NP}$ do not exist unless $\mathbf{NP}\subseteq \mathbf{BQP}$.

Submitted 14 June, 2021; v1 submitted 20 March, 2021; originally announced March 2021.

Comments: 46 pages

arXiv:2011.02670 [pdf, ps, other]

A Black-Box Approach to Post-Quantum Zero-Knowledge in Constant Rounds

Authors: Nai-Hui Chia, Kai-Min Chung, Takashi Yamakawa

Abstract: In a recent seminal work, Bitansky and Shmueli (STOC '20) gave the first construction of a constant round zero-knowledge argument for NP secure against quantum attacks. However, their construction has several drawbacks compared to the classical counterparts. Specifically, their construction only achieves computational soundness, requires strong assumptions of quantum hardness of learning with errors (QLWE assumption) and the existence of quantum fully homomorphic encryption (QFHE), and relies on non-black-box simulation. In this paper, we resolve these issues at the cost of weakening the notion of zero-knowledge to what is called $ε$-zero-knowledge. Concretely, we construct the following protocols: - We construct a constant round interactive proof for NP that satisfies statistical soundness and black-box $ε$-zero-knowledge against quantum attacks assuming the existence of collapsing hash functions, which is a quantum counterpart of collision-resistant hash functions. Interestingly, this construction is just an adapted version of the classical protocol by Goldreich and Kahan (JoC '96) though the proof of $ε$-zero-knowledge property against quantum adversaries requires novel ideas. - We construct a constant round interactive argument for NP that satisfies computational soundness and black-box $ε$-zero-knowledge against quantum attacks only assuming the existence of post-quantum one-way functions.
At the heart of our results is a new quantum rewinding technique that enables a simulator to extract a committed message of a malicious verifier while simulating the verifier's internal state in an appropriate sense.

Submitted 30 October, 2023; v1 submitted 5 November, 2020; originally announced November 2020.

Comments: Fixed a minor technical issue (see Footnote 17 in page 21) and improved the proof of Claim 4.5. (10/30/2023)

Journal ref: CRYPTO 2021

arXiv:2006.02961 [pdf, other]

Confidence in the dynamic spread of epidemics under biased sampling conditions

Authors: James D. Brunner, Nicholas Chia

Abstract: The interpretation of sampling data plays a crucial role in policy response to the spread of a disease during an epidemic, such as the COVID-19 epidemic of 2020. However, this is a non-trivial endeavor due to the complexity of real world conditions and limits to the availability of diagnostic tests, which necessitate a bias in testing favoring symptomatic individuals. A thorough understanding of sampling confidence and bias is necessary in order to draw accurate conclusions. In this manuscript, we provide a stochastic model of sampling for assessing confidence in disease metrics such as trend detection, peak detection, and disease spread estimation. Our model simulates testing for a disease in an epidemic with known dynamics, allowing us to use Monte-Carlo sampling to assess metric confidence. This model can provide realistic simulated data which can be used in the design and calibration of data analysis and prediction methods. As an example, we use this method to show that trends in the disease may be identified using under $10000$ biased samples each day, and an estimate of disease spread can be made with an additional $1000-2000$ unbiased samples each day. We also demonstrate that the model can be used to assess more advanced metrics by finding the precision and recall of a strategy for finding peaks in the dynamics.

Submitted 28 July, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

Comments: 11 figures, 2 tables, 15 pages

MSC Class: 92-10; 62D05

arXiv:2003.03638 [pdf, other]

doi 10.1371/journal.pcbi.1007786

Minimizing the number of optimizations for efficient community dynamic flux balance analysis

Authors: James D. Brunner, Nicholas Chia

Abstract: Dynamic flux balance analysis uses a quasi-steady state assumption to calculate an organism's metabolic activity at each time-step of a dynamic simulation, using the well-known technique of flux balance analysis. For microbial communities, this calculation is especially costly and involves solving a linear constrained optimization problem for each member of the community at each time step. However, this is unnecessary and inefficient, as prior solutions can be used to inform future time steps. Here, we show that a basis for the space of internal fluxes can be chosen for each microbe in a community and this basis can be used to simulate forward by solving a relatively inexpensive system of linear equations at most time steps. We can use this solution as long as the resulting metabolic activity remains within the optimization problem's constraints (i.e. the solution to the linear system of equations remains feasible for the linear program). As the solution becomes infeasible, it first becomes a feasible but degenerate solution to the optimization problem, and we can solve a different but related optimization problem to choose an appropriate basis to continue forward simulation. We demonstrate the efficiency and robustness of our method by comparing with currently used methods on a four species community, and show that our method requires at least $91\%$ fewer optimizations to be solved. For reproducibility, we prototyped the method using Python. Source code is available at https://github.com/jdbrunner/surfin_fba.

Submitted 28 July, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

Comments: 9 figures

MSC Class: 92-08; 92D25

arXiv:1912.00990 [pdf, ps, other]

Classical Verification of Quantum Computations with Efficient Verifier

Authors: Nai-Hui Chia, Kai-Min Chung, Takashi Yamakawa

Abstract: In this paper, we extend the protocol of classical verification of quantum computations (CVQC) recently proposed by Mahadev to make the verification efficient. Our result is obtained in the following three steps: $\bullet$ We show that parallel repetition of Mahadev's protocol has negligible soundness error. This gives the first constant round CVQC protocol with negligible soundness error. In this part, we only assume the quantum hardness of the learning with errors (LWE) problem, as in Mahadev's work. $\bullet$ We construct a two-round CVQC protocol in the quantum random oracle model (QROM) where a cryptographic hash function is idealized to be a random function. This is obtained by applying the Fiat-Shamir transform to the parallel repetition version of Mahadev's protocol. $\bullet$ We construct a two-round CVQC protocol with an efficient verifier in the CRS+QRO model where both prover and verifier have access to a (classical) common reference string generated by a trusted third party in addition to quantum access to QRO. Specifically, the verifier can verify a $QTIME(T)$ computation in time $\mathrm{poly}(n,\log T)$ where $n$ is the security parameter. To prove soundness, we assume that a standard model instantiation of our two-round protocol with a concrete hash function (say, SHA-3) is sound, and we assume the existence of post-quantum indistinguishability obfuscation and post-quantum fully homomorphic encryption in addition to the quantum hardness of the LWE problem.

Submitted 12 March, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

arXiv:1911.01973 [pdf, other]

On the Quantum Complexity of Closest Pair and Related Problems

Authors: Scott Aaronson, Nai-Hui Chia, Han-Hsuan Lin, Chunhao Wang, Ruizhe Zhang

Abstract: The closest pair problem is a fundamental problem of computational geometry: given a set of $n$ points in a $d$-dimensional space, find a pair with the smallest distance. A classical algorithm taught in introductory courses solves this problem in $O(n\log n)$ time in constant dimensions (i.e., when $d=O(1)$). This paper asks and answers the question of the problem's quantum time complexity. Specifically, we give an $\tilde{O}(n^{2/3})$ algorithm in constant dimensions, which is optimal up to a polylogarithmic factor by the lower bound on the quantum query complexity of element distinctness. The key to our algorithm is an efficient history-independent data structure that supports quantum interference. In $\mathrm{polylog}(n)$ dimensions, no known quantum algorithms perform better than brute force search, with a quadratic speedup provided by Grover's algorithm. To give evidence that the quadratic speedup is nearly optimal, we initiate the study of quantum fine-grained complexity and introduce the Quantum Strong Exponential Time Hypothesis (QSETH), which is based on the assumption that Grover's algorithm is optimal for CNF-SAT when the clause width is large. We show that the naïve Grover approach to closest pair in higher dimensions is optimal up to an $n^{o(1)}$ factor unless QSETH is false. We also study the bichromatic closest pair problem and the orthogonal vectors problem, with broadly similar results.

Submitted 6 August, 2020; v1 submitted 5 November, 2019; originally announced November 2019.

Comments: 46 pages, 3 figures, presentation improved

Journal ref: Proceedings of the 35th Computational Complexity Conference (CCC), pages 16:1--16:43, 2020

arXiv:1910.06151 [pdf, other]

doi 10.1145/3357713.3384314

Sampling-based sublinear low-rank matrix arithmetic framework for dequantizing quantum machine learning

Authors: Nai-Hui Chia, András Gilyén, Tongyang Li, Han-Hsuan Lin, Ewin Tang, Chunhao Wang

Abstract: We present an algorithmic framework for quantum-inspired classical algorithms on close-to-low-rank matrices, generalizing the series of results started by Tang's breakthrough quantum-inspired algorithm for recommendation systems [STOC'19]. Motivated by quantum linear algebra algorithms and the quantum singular value transformation (SVT) framework of Gilyén, Su, Low, and Wiebe [STOC'19], we develop classical algorithms for SVT that run in time independent of input dimension, under suitable quantum-inspired sampling assumptions. Our results give compelling evidence that in the corresponding QRAM data structure input model, quantum SVT does not yield exponential quantum speedups. Since the quantum SVT framework generalizes essentially all known techniques for quantum linear algebra, our results, combined with sampling lemmas from previous work, suffice to generalize all recent results about dequantizing quantum machine learning algorithms. In particular, our classical SVT framework recovers and often improves the dequantization results on recommendation systems, principal component analysis, supervised clustering, support vector machines, low-rank regression, and semidefinite program solving. We also give additional dequantization results on low-rank Hamiltonian simulation and discriminant analysis. Our improvements come from identifying the key feature of the quantum-inspired input model that is at the core of all prior quantum-inspired results: $\ell^2$-norm sampling can approximate matrix products in time independent of their dimension.
We reduce all our main results to this fact, making our exposition concise, self-contained, and intuitive.

Submitted 10 July, 2023; v1 submitted 14 October, 2019; originally announced October 2019.

Comments: 77 pages, 2 figures. v2: revised to add more connection to QSVT, improve existing results. v3: revised structure, introduction rewritten for clarity. v4: minor correction to regression result

arXiv:1909.10303 [pdf, other]

On the Need for Large Quantum Depth

Authors: Nai-Hui Chia, Kai-Min Chung, Ching-Yi Lai

Abstract: Near-term quantum computers are likely to have small depths due to short coherence time and noisy gates, and thus a potential way to use these quantum devices is using a hybrid scheme that interleaves them with classical computers. For example, the quantum Fourier transform can be implemented by a hybrid of logarithmic-depth quantum circuits and a classical polynomial-time algorithm. Along this line, it seems possible that a general quantum computer may only be polynomially faster than a hybrid quantum-classical computer. Jozsa raised the question of whether $BQP = BPP^{BQNC}$ and conjectured that they are equal, where $BQNC$ means $polylog$-depth quantum circuits. Nevertheless, Aaronson conjectured an oracle separation for these two classes and gave a candidate. In this work, we prove Aaronson's conjecture for a different but related oracle problem. Our result also proves that Jozsa's conjecture fails relative to an oracle.

Submitted 12 September, 2020; v1 submitted 23 September, 2019; originally announced September 2019.

arXiv:1907.04436 [pdf, other]

doi 10.1098/rsif.2019.0423

Metabolite mediated modeling of microbial community dynamics captures emergent behavior more effectively than species-species modeling

Authors: James D. Brunner, Nicholas Chia

Abstract: Personalized models of the gut microbiome are valuable for disease prevention and treatment. For this, one requires a mathematical model that predicts microbial community composition and the emergent behavior of microbial communities. We seek a modeling strategy that can capture emergent behavior when built from sets of universal individual interactions. Our investigation reveals that species-metabolite interaction modeling is better able to capture emergent behavior in community composition dynamics than direct species-species modeling. Using publicly available data, we examine the ability of species-species models and species-metabolite models to predict trio growth experiments from the outcomes of pair growth experiments. We compare quadratic species-species interaction models and quadratic species-metabolite interaction models, and conclude that only species-metabolite models have the necessary complexity to explain a wide variety of interdependent growth outcomes. We also show that general species-species interaction models cannot match patterns observed in community growth dynamics, whereas species-metabolite models can. We conclude that species-metabolite modeling will be important in the development of accurate, clinically useful models of microbial communities.

Submitted 19 August, 2019; v1 submitted 9 July, 2019; originally announced July 2019.

Comments: 23 pages, 8 Figures

MSC Class: 92D25

arXiv:1901.03254 [pdf, other]

Quantum-inspired sublinear algorithm for solving low-rank semidefinite programming

Authors: Nai-Hui Chia, Tongyang Li, Han-Hsuan Lin, Chunhao Wang

Abstract: Semidefinite programming (SDP) is a central topic in mathematical optimization with extensive studies on its efficient solvers. In this paper, we present a proof-of-principle sublinear-time algorithm for solving SDPs with low-rank constraints; specifically, given an SDP with $m$ constraint matrices, each of dimension $n$ and rank $r$, our algorithm can compute any entry and efficient descriptions of the spectral decomposition of the solution matrix. The algorithm runs in time $O(m\cdot\mathrm{poly}(\log n,r,1/\varepsilon))$ given access to a sampling-based low-overhead data structure for the constraint matrices, where $\varepsilon$ is the precision of the solution. In addition, we apply our algorithm to a quantum state learning task. Technically, our approach aligns with 1) SDP solvers based on the matrix multiplicative weight (MMW) framework by Arora and Kale [TOC '12]; 2) the sampling-based dequantizing framework pioneered by Tang [STOC '19]. In order to compute the matrix exponential required in the MMW framework, we introduce two new techniques that may be of independent interest: $\bullet$ Weighted sampling: assuming sampling access to each individual constraint matrix $A_{1},\ldots,A_τ$, we propose a procedure that gives a good approximation of $A=A_{1}+\cdots+A_τ$. $\bullet$ Symmetric approximation: we propose a sampling procedure that gives the spectral decomposition of a low-rank Hermitian matrix $A$.
To the best of our knowledge, this is the first sampling-based algorithm for spectral decomposition, as previous works only give singular values and vectors.

Submitted 5 August, 2020; v1 submitted 10 January, 2019; originally announced January 2019.

Comments: 37 pages, 1 figure. To appear in the Proceedings of the 45th International Symposium on Mathematical Foundations of Computer Science (MFCS 2020)

arXiv:1811.04852 [pdf, other]

Quantum-inspired sublinear classical algorithms for solving low-rank linear systems

Authors: Nai-Hui Chia, Han-Hsuan Lin, Chunhao Wang

Abstract: We present classical sublinear-time algorithms for solving low-rank linear systems of equations. Our algorithms are inspired by the HHL quantum algorithm for solving linear systems and the recent breakthrough by Tang of dequantizing the quantum algorithm for recommendation systems. Let $A \in \mathbb{C}^{m \times n}$ be a rank-$k$ matrix, and $b \in \mathbb{C}^m$ be a vector. We present two algorithms: a "sampling" algorithm that provides a sample from $A^{-1}b$ and a "query" algorithm that outputs an estimate of an entry of $A^{-1}b$, where $A^{-1}$ denotes the Moore-Penrose pseudo-inverse. Both of our algorithms have query and time complexity $O(\mathrm{poly}(k, κ, \|A\|_F, 1/ε)\,\mathrm{polylog}(m, n))$, where $κ$ is the condition number of $A$ and $ε$ is the precision parameter. Note that the algorithms we consider are sublinear time, so they cannot write and read the whole matrix or vectors. In this paper, we assume that $A$ and $b$ come with well-known low-overhead data structures such that entries of $A$ and $b$ can be sampled according to some natural probability distributions. Alternatively, when $A$ is positive semidefinite, our algorithms can be adapted so that the sampling assumption on $b$ is not required.

Submitted 12 November, 2018; originally announced November 2018.

arXiv:1807.09400 [pdf, other]

doi 10.1103/PhysRevE.99.032413

Extreme value analysis of gut microbial alterations in colorectal cancer

Authors: Stephanie Danni Song, Patricio Jeraldo, Jun Chen, Nicholas Chia

Abstract: Gut microbes play a key role in colorectal carcinogenesis, yet reaching a consensus on microbial signatures remains a challenge. This is in part due to a reliance on mean value estimates. We present an extreme value analysis for overcoming these limitations. By characterizing a power law fit to the relative abundances of microbes, we capture the same microbial signatures as more complex meta-analyses. Importantly, we show that our method is robust to the variations inherent in microbial community profiling and point to future directions for developing sensitive, reliable analytical methods.

Submitted 13 February, 2019; v1 submitted 24 July, 2018; originally announced July 2018.

Journal ref: Phys. Rev. E 99, 032413 (2019)

arXiv:1804.10309 [pdf, other]

doi 10.22331/q-2020-08-27-312

On Basing One-way Permutations on NP-hard Problems under Quantum Reductions

Authors: Nai-Hui Chia, Sean Hallgren, Fang Song

Abstract: A fundamental pursuit in complexity theory concerns reducing worst-case problems to average-case problems. There exist complexity classes such as PSPACE that admit worst-case to average-case reductions. However, for many other classes such as NP, the evidence so far is typically negative, in the sense that the existence of such reductions would cause collapses of the polynomial hierarchy (PH). Basing cryptographic primitives, e.g., the average-case hardness of inverting one-way permutations, on NP-completeness is a particularly intriguing instance. As there is evidence showing that classical reductions from NP-hard problems to breaking these primitives result in PH collapses, it seems unlikely to base cryptographic primitives on NP-hard problems. Nevertheless, these results do not rule out the possibility of quantum reductions. In this work, we initiate a study of the quantum analogues of these questions. Aside from formalizing basic notions of quantum reductions and demonstrating the power of quantum reductions by examples of separations, our main result shows that if NP-complete problems reduce to inverting one-way permutations using certain types of quantum reductions, then coNP $\subseteq$ QIP(2).

Submitted 9 August, 2020; v1 submitted 26 April, 2018; originally announced April 2018.

Comments: 32 pages

Journal ref: Quantum 4, 312 (2020)

arXiv:1706.01787 [pdf]

doi 10.1038/ncomms15393

Global metabolic interaction network of the human gut microbiota for context-specific community-scale analysis

Authors: Jaeyun Sung, Seunghyeon Kim, Josephine Jill T. Cabatbat, Sungho Jang, Yong-Su Jin, Gyoo Yeol Jung, Nicholas Chia, Pan-Jun Kim

Abstract: A system-level framework of complex microbe-microbe and host-microbe chemical cross-talk would help elucidate the role of our gut microbiota in health and disease. Here we report a literature-curated interspecies network of the human gut microbiota, called NJS16. This is an extensive data resource composed of ~570 microbial species and 3 human cell types metabolically interacting through >4,400 small-molecule transport and macromolecule degradation events. Based on the contents of our network, we develop a mathematical approach to elucidate representative microbial and metabolic features of the gut microbial community in a given population, such as a disease cohort. Applying this strategy to microbiome data from type 2 diabetes patients reveals a context-specific infrastructure of the gut microbial ecosystem, core microbial entities with large metabolic influence, and frequently-produced metabolic compounds that might indicate relevant community metabolic processes. Our network presents a foundation towards integrative investigations of community-scale microbial activities within the human gut.

Submitted 6 June, 2017; originally announced June 2017.

Comments: Supplementary material is available at the journal website

Journal ref: Nat. Commun. 8, 15393 (2017)

arXiv:1704.07904 [pdf, other]

Prediction and Inference with Missing Data in Patient Alert Systems

Authors: Curtis B. Storlie, Terry M. Therneau, Rickey E. Carter, Nicholas Chia, John R. Bergquist, Jeanne M. Huddleston, Santiago Romero-Brufau

Abstract: We describe the Bedside Patient Rescue (BPR) project, the goal of which is risk prediction of adverse events for non-ICU patients using ~200 variables (vitals, lab results, assessments, ...). There are several missing predictor values for most patients, which in the health sciences is the norm, rather than the exception. A Bayesian approach is presented that addresses many of the shortcomings of standard approaches to missing predictors: (i) treatment of the uncertainty due to imputation is straightforward in the Bayesian paradigm, (ii) the predictor distribution is flexibly modeled as an infinite normal mixture with latent variables to explicitly account for discrete predictors (i.e., as in multivariate probit regression models), and (iii) certain missing not at random situations can be handled effectively by allowing the indicator of missingness into the predictor distribution only to inform the distribution of the missing variables. The proposed approach also has the benefit of providing a distribution for the prediction, including the uncertainty inherent in the imputation. Therefore, we can ask questions such as: is it possible this individual is at high risk but we are missing too much information to know for sure? How much would we reduce the uncertainty in our risk prediction by obtaining a particular missing value? This approach is applied to the BPR problem resulting in excellent predictive capability to identify deteriorating patients.

Submitted 25 April, 2017; originally announced April 2017.

arXiv:1608.02003 [pdf, ps, other]

doi 10.4230/LIPIcs.TQC.2016.89

How hard is deciding trivial versus nontrivial in the dihedral coset problem?

Authors: Nai-Hui Chia, Sean Hallgren

Abstract: We study the hardness of the dihedral hidden subgroup problem. It is known that lattice problems reduce to it, and that it reduces to random subset sum with density $> 1$ and also to quantum sampling subset sum solutions. We examine a decision version of the problem where the question asks whether the hidden subgroup is trivial or order two. The decision problem essentially asks if a given vector is in the span of all coset states. We approach this by first computing an explicit basis for the coset space and the perpendicular space. We then look at the consequences of having efficient unitaries that use this basis. We show that if a unitary maps the basis to the standard basis in any way, then that unitary can be used to solve random subset sum with constant density $>1$. We also show that if a unitary can exactly decide membership in the coset subspace, then the collision problem for subset sum can be solved for density $>1$ but approaching $1$ as the problem size increases. This strengthens the previous hardness result that implementing the optimal POVM in a specific way is as hard as quantum sampling subset sum solutions.

Submitted 5 August, 2016; originally announced August 2016.

Comments: 16 pages

ACM Class: F.2

arXiv:1012.2166 [pdf, other]

doi 10.1007/s10955-010-0112-8

Statistical Mechanics of Horizontal Gene Transfer in Evolutionary Ecology

Authors: Nicholas Chia, Nigel Goldenfeld

Abstract: The biological world, especially its majority microbial component, is strongly interacting and may be dominated by collective effects. In this review, we provide a brief introduction for statistical physicists of the way in which living cells communicate genetically through transferred genes, as well as the ways in which they can reorganize their genomes in response to environmental pressure. We discuss how genome evolution can be thought of as related to the physical phenomenon of annealing, and describe the sense in which genomes can be said to exhibit an analogue of information entropy. As a direct application of these ideas, we analyze the variation with ocean depth of transposons in marine microbial genomes, predicting trends that are consistent with recent observations using metagenomic surveys.

Submitted 9 December, 2010; originally announced December 2010.

Comments: Accepted by Journal of Statistical Physics

arXiv:1005.3349 [pdf, other]

doi 10.1103/PhysRevE.83.021906

The dynamics of gene duplication and transposons in microbial genomes following a sudden environmental change

Authors: Nicholas Chia, Nigel Goldenfeld

Abstract: A variety of genome transformations can occur as a microbial population adapts to a large environmental change. In particular, genomic surveys indicate that, following the transition to an obligate, host-dependent symbiont, the density of transposons first rises, then subsequently declines over evolutionary time. Here, we show that these observations can be accounted for by a class of generic stochastic models for the evolution of genomes in the presence of continuous selection and gene duplication. The models use a fitness function that allows for partial contributions from multiple gene copies, is an increasing but bounded function of copy number, and is optimal for one fully adapted gene copy. We use Monte Carlo simulation to show that the dynamics result in an initial rise in gene copy number followed by a subsequent fall due to adaptation to the new environmental parameters. These results are robust for reasonable gene duplication and mutation parameters when adapting to a novel target sequence. Our model provides a generic explanation for the dynamics of microbial transposon density following a large environmental change such as host restriction.

Submitted 19 January, 2011; v1 submitted 18 May, 2010; originally announced May 2010.

arXiv:0811.3407 [pdf, ps, other]

doi 10.1103/PhysRevE.80.030901

Lambda-prophage induction modeled as a cooperative failure mode of lytic repression

Authors: Nicholas Chia, Ido Golding, Nigel Goldenfeld

Abstract: We analyze a system-level model for lytic repression of lambda-phage in E. coli using reliability theory, showing that the repressor circuit comprises 4 redundant components whose failure mode is prophage induction. Our model reflects the specific biochemical mechanisms involved in regulation, including long-range cooperative binding, and its detailed predictions for prophage induction in E. coli under ultra-violet radiation are in good agreement with experimental data.

Submitted 3 December, 2008; v1 submitted 20 November, 2008; originally announced November 2008.

Comments: added reference

arXiv:0802.1669 [pdf, ps, other]

M-decomposability, elliptical unimodal densities, and applications to clustering and kernel density estimation

Authors: Nicholas Chia, Junji Nakano

Abstract: Chia and Nakano (2009) introduced the concept of M-decomposability of probability densities in one dimension. In this paper, we generalize M-decomposability to any dimension. We prove that all elliptical unimodal densities are M-undecomposable. We also derive an inequality to show that it is better to represent an M-decomposable density via a mixture of unimodal densities. Finally, we demonstrate the application of M-decomposability to clustering and kernel density estimation, using real and simulated data. Our results show that M-decomposability can be used as a non-parametric criterion to locate modes in probability densities.

Submitted 21 April, 2010; v1 submitted 12 February, 2008; originally announced February 2008.

Comments: 30 pages, 13 figures

arXiv:cond-mat/0509417 [pdf, ps, other]

doi 10.1103/PhysRevE.72.051102

Numerical Method for Accessing the Universal Scaling Function for a Multi-Particle Discrete Time Asymmetric Exclusion Process

Authors: Nicholas Chia, Ralf Bundschuh

Abstract: In the universality class of the one dimensional Kardar-Parisi-Zhang surface growth, Derrida and Lebowitz conjectured the universality of not only the scaling exponents, but of an entire scaling function. Since Derrida and Lebowitz's original publication [PRL 80 209 (1998)] this universality has been verified for a variety of continuous time, periodic boundary systems in the KPZ universality class. Here, we present a numerical method for directly examining the entire particle flux of the asymmetric exclusion process (ASEP), thus providing an alternative to more difficult cumulant ratios studies. Using this method, we find that the Derrida-Lebowitz scaling function (DLSF) properly characterizes the large system size limit ($N \to \infty$) of a single particle discrete time system, even in the case of very small system sizes ($N \le 22$). This fact allows us to not only verify that the DLSF properly characterizes multiple particle discrete-time asymmetric exclusion processes, but also provides a way to numerically solve for quantities of interest, such as the particle hopping flux. This method can thus serve to further increase the ease and accessibility of studies involving even more challenging dynamics, such as the open boundary ASEP.

Submitted 15 September, 2005; originally announced September 2005.

arXiv:q-bio/0406009 [pdf, ps, other]

Finite Width Model Sequence Comparison

Authors: Ralf Bundschuh, Nicholas Chia

Abstract: Sequence comparison is a widely used computational technique in modern molecular biology. In spite of the frequent use of sequence comparisons the important problem of assigning statistical significance to a given degree of similarity is still outstanding. Analytical approaches to filling this gap usually make use of an approximation that neglects certain correlations in the disorder underlying the sequence comparison algorithm. Here, we use the longest common subsequence problem, a prototype sequence comparison problem, to analytically establish that this approximation does make a difference to certain sequence comparison statistics. In the course of establishing this difference we develop a method that can systematically deal with these disorder correlations.

Submitted 3 June, 2004; originally announced June 2004.

Showing 1–38 of 38 results for author: Chia, N