Quantum mathematics in artificial intelligence. (English) Zbl 1522.68443

Summary: In the decade since 2010, successes in artificial intelligence have been at the forefront of computer science and technology, and vector space models have solidified a position at the forefront of artificial intelligence. At the same time, quantum computers have become much more powerful, and announcements of major advances are frequently in the news.
The mathematical techniques underlying both these areas have more in common than is sometimes realized. Vector spaces took a position at the axiomatic heart of quantum mechanics in the 1930s, and this adoption was a key motivation for the derivation of logic and probability from the linear geometry of vector spaces. Quantum interactions between particles are modelled using the tensor product, which is also used to express objects and operations in artificial neural networks.
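As a rough illustration of the tensor product mentioned above, the sketch below builds the product of two small vectors in plain Python. The helper names `dot` and `tensor` and the example vectors are assumptions made for this illustration and are not taken from the paper. It shows two standard facts: the product of an m-dimensional and an n-dimensional vector has m·n components, and the scalar product of two product vectors factorizes into the scalar products of their parts.

```python
def dot(u, v):
    """Scalar (inner) product of two real vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def tensor(u, v):
    """Tensor (Kronecker) product of two vectors, flattened to a list.

    The result has len(u) * len(v) components: every component of u
    multiplied by every component of v.
    """
    return [x * y for x in u for y in v]


# Two basis vectors of a 2-dimensional space (hypothetical example data).
up = [1.0, 0.0]
down = [0.0, 1.0]

# The combined state lives in a 4-dimensional space.
pair = tensor(up, down)

# Scalar products factorize over the tensor product:
# <a (x) b, c (x) d> = <a, c> * <b, d>
a, b, c, d = [1, 2], [3, 4], [0, 1], [1, 1]
lhs = dot(tensor(a, b), tensor(c, d))
rhs = dot(a, c) * dot(b, d)
```

The same flattened outer product is one simple way to bind two vectors into a single composite representation, in the spirit of the tensor product representations cited for neural networks.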
This paper describes some of these common mathematical areas, including examples of how they are used in artificial intelligence (AI), particularly in automated reasoning and natural language processing (NLP). Techniques discussed include vector spaces, scalar products, subspaces and implication, orthogonal projection and negation, dual vectors, density matrices, positive operators, and tensor products. Application areas include information retrieval, categorization and implication, modelling word-senses and disambiguation, inference in knowledge bases, and semantic composition.
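To make "orthogonal projection and negation" concrete, here is a minimal sketch in plain Python, assuming real-valued vectors and a negation defined as projecting a query vector onto the subspace orthogonal to an unwanted vector. The function names `dot`, `negate`, and `cosine` and the toy vectors are hypothetical and chosen only for this illustration; the paper's own formulation may differ in detail.

```python
import math


def dot(u, v):
    """Scalar (inner) product of two real vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def negate(a, b):
    """Return "a NOT b": the component of a orthogonal to b.

    Computed as a - (<a, b> / <b, b>) * b, i.e. a minus its
    orthogonal projection onto the line spanned by b.
    """
    bb = dot(b, b)
    if bb == 0:
        # Nothing to remove if b is the zero vector.
        return list(a)
    scale = dot(a, b) / bb
    return [x - scale * y for x, y in zip(a, b)]


def cosine(u, v):
    """Cosine similarity; assumes neither vector is zero."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


# Toy query vector and unwanted direction (assumed example data).
a = [1.0, 2.0, 0.0]
b = [1.0, 0.0, 0.0]

a_not_b = negate(a, b)

# After negation the result has no overlap with the unwanted vector,
# so its scalar product (and hence similarity score) with b is zero.
overlap = dot(a_not_b, b)
```

In a retrieval setting, ranking documents by `cosine` against `a_not_b` rather than `a` would score anything parallel to `b` at zero, which is the intended effect of this kind of negation.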
Some of these approaches can potentially be implemented on quantum hardware. Many of the practical steps in this implementation are in early stages, and some are already realized. Explaining some of the common mathematical tools can help researchers in both AI and quantum computing further exploit these overlaps, recognizing and exploring new directions along the way.

MSC:

68T01 General topics in artificial intelligence
81P68 Quantum computation
