Pregroup grammars, their syntax and semantics. (English) Zbl 1491.68252
Casadio, Claudia (ed.) et al., Joachim Lambek: the interplay of mathematics, logic, and linguistics. Cham: Springer. Outst. Contrib. Log. 20, 347-376 (2021).
This paper is dedicated to Lambek’s pregroup grammars [J. Lambek, J. Logic Lang. Inf. 17, No. 2, 141–160 (2008; Zbl 1162.68721)], whose persistent weakness has been their semantics, or lack thereof. W. Buszkowski developed a cut-free sequent calculus for pregroups [Math. Log. Q. 49, No. 5, 467–474 (2003; Zbl 1036.03046)] and also showed that the expressive power of pregroup grammars, like that of the syntactic calculus [J. Lambek, Am. Math. Mon. 65, 154–170 (1958; Zbl 0080.00702)], is context-free [W. Buszkowski, Z. Math. Logik Grundlagen Math. 31, 369–384 (1985; Zbl 0559.68063)]. The set-theoretic semantics is, however, ambiguous: a pregroup term \(abc^{l}\) has two interpretations, namely \(A\times C^{B}\) and \(C^{A\times B}\).
This article studies the semantics of pregroup grammars, surveying recent advances in vector space modelling in natural language processing. Following a suggestion of Lambek, the author addresses finite-dimensional vector space semantics for pregroups, in which the adjoint types are to be interpreted as dual spaces. The author builds semantic vector representations for some exemplary words, phrases and sentences of language, showing how compositionality of vector semantics disambiguates meaning. Finally, the paper presents a vector semantics and analysis of questions, demonstrating how their representations relate to the sentences they are asked about.
For the entire collection see [Zbl 1470.03008].
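As a rough illustration of the approach summarised above, and not code from the paper, the following Python sketch checks that a transitive sentence typed \(n\,(n^{r} s n^{l})\,n\) reduces to \(s\), and then computes a sentence vector by contracting a verb tensor with subject and object vectors. The function names `reduces_to` and `sentence_vector`, the two-dimensional spaces and all tensor entries are hypothetical choices made for the example. The reduction is a greedy stack-based check, which is sound but not a complete parser for arbitrary pregroup grammars.

```python
from typing import List, Tuple

# A simple type is (base, z): z = 0 is the base type, z = -1 its left
# adjoint, z = +1 its right adjoint (assumed encoding for this sketch).
SimpleType = Tuple[str, int]


def reduces_to(types: List[SimpleType], target: SimpleType) -> bool:
    """Greedy check using the contraction a^(z) a^(z+1) -> 1."""
    stack: List[SimpleType] = []
    for base, z in types:
        if stack and stack[-1] == (base, z - 1):
            stack.pop()  # cancel the adjacent adjoint pair
        else:
            stack.append((base, z))
    return stack == [target]


def sentence_vector(subj, verb, obj):
    """Contract verb[i][k][j] with subj[i] and obj[j]; the result lives in S."""
    dim_s = len(verb[0])
    return [
        sum(
            subj[i] * verb[i][k][j] * obj[j]
            for i in range(len(subj))
            for j in range(len(obj))
        )
        for k in range(dim_s)
    ]


# Types: noun = n, transitive verb = n^r s n^l.
noun = [("n", 0)]
transitive_verb = [("n", 1), ("s", 0), ("n", -1)]
sentence_types = noun + transitive_verb + noun

# Hypothetical toy data: noun space and sentence space are both 2-dimensional.
subject = [1, 0]
obj = [0, 1]
verb = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # indexed verb[i][k][j]

grammatical = reduces_to(sentence_types, ("s", 0))
meaning = sentence_vector(subject, verb, obj)
```

Because the verb tensor lives in a single space \(N\otimes S\otimes N\), with adjoints read as dual spaces, the contraction yields the same sentence vector regardless of the order in which subject and object are applied.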
Reviewer: Hirokazu Nishimura (Tsukuba)
MSC:
68T50 Natural language processing
03B65 Logic of natural languages
68Q42 Grammars and rewriting systems
91F20 Linguistics