End-to-End Quantum Vision Transformer: Towards Practical Quantum Speedup in Large-Scale Models

The field of quantum deep learning presents significant opportunities for advancing computational capabilities, yet it faces a major obstacle in the form of the "information loss problem" due to the inherent limitations of the necessary quantum tomography in scaling quantum deep neural networks. This paper introduces an end-to-end Quantum Vision Transformer (QViT), which incorporates an innovative quantum residual connection technique, to overcome these challenges and therefore optimize quantum computing processes in deep learning. Our thorough complexity analysis of the QViT reveals a theoretically exponential and empirically polynomial speedup, showcasing the model's efficiency and potential in quantum computing applications. We conducted extensive numerical tests on modern, large-scale transformers and datasets, establishing the QViT as a pioneering advancement in applying quantum deep neural networks in practical scenarios. Our work provides a comprehensive quantum deep learning paradigm, which not only demonstrates the versatility of current quantum linear algebra algorithms but also promises to enhance future research and development in quantum deep learning.
Submitted 29 Feb 2024 to Quantum Physics [quant-ph]
Published 01 Mar 2024
Updated 01 Mar 2024
Author comments: 24 pages, 10 figures
https://arxiv.org/abs/2402.18940
https://arxiv.org/pdf/2402.18940.pdf
https://arxiv-vanity.com/papers/2402.18940

