×

Efficient evaluation of arbitrary relational calculus queries. (English) Zbl 07789010

Summary: The relational calculus (RC) is a concise, declarative query language. However, existing RC query evaluation approaches are inefficient and often deviate from established algorithms based on finite tables used in database management systems. We devise a new translation of an arbitrary RC query into two safe-range queries, for which the finiteness of the query’s evaluation result is guaranteed. Assuming an infinite domain, the two queries have the following meaning: The first is closed and characterizes the original query’s relative safety, i.e., whether given a fixed database, the original query evaluates to a finite relation. The second safe-range query is equivalent to the original query, if the latter is relatively safe. We compose our translation with other, more standard ones to ultimately obtain two SQL queries. This allows us to use standard database management systems to evaluate arbitrary RC queries. We show that our translation improves the time complexity over existing approaches, which we also empirically confirm in both realistic and synthetic experiments.

MSC:

68-XX Computer science

References:

[1] Alfred K. Ailamazyan, Mikhail M. Gilula, Alexei P. Stolboushkin, and Grigorii F. Schwartz. Reduction of a relational model with infinite domains to the case of finite domains. Doklady Akademii Nauk SSSR, 286(2):308-311, 1986. URL: http://mi.mathnet.ru/dan47310.
[2] Arnon Avron and Yoram Hirshfeld. On first order database query languages. In LICS 1991, pages 226-231. IEEE Computer Society, 1991. doi:10.1109/LICS.1991.151647. · doi:10.1109/LICS.1991.151647
[3] Serge Abiteboul, Richard Hull, and Victor Vianu. Foundations of Databases. Addison-Wesley, 1995. URL: http://webdam.inria.fr/Alice/. · Zbl 0848.68031
[4] Achim Blumensath and Erich Grädel. Finite presentations of infinite structures: Automata and interpretations. Theory Comput. Syst., 37(6):641-674, 2004. doi:10.1007/s00224-004-1133-y. · Zbl 1061.03038 · doi:10.1007/s00224-004-1133-y
[5] David Basin, Felix Klaedtke, Samuel Müller, and Eugen Zȃlinescu. Monitoring metric first-order temporal properties. J. ACM, 62(2):15:1-15:45, 2015. doi:10.1145/2699444. · Zbl 1333.68177 · doi:10.1145/2699444
[6] Michael Benedikt and Leonid Libkin. Relational queries over interpreted structures. J. ACM, 47(4):644-680, 2000. doi:10.1145/347476.347477. · Zbl 1327.68089 · doi:10.1145/347476.347477
[7] Sagar Chaki, Arie Gurfinkel, and Ofer Strichman. Decision diagrams for linear arithmetic. In FMCAD 2009, pages 53-60. IEEE, 2009. doi:10.1109/FMCAD.2009.5351143. · doi:10.1109/FMCAD.2009.5351143
[8] Jens Claußen, Alfons Kemper, Guido Moerkotte, and Klaus Peithner. Optimizing queries with universal quantification in object-oriented and object-relational databases. In Matthias Jarke, Michael J. Carey, Klaus R. Dittrich, Frederick H. Lochovsky, Pericles Loucopoulos, and Manfred A. Jeusfeld, editors, VLDB 1997, pages 286-295. Morgan Kaufmann, 1997. URL: http://www.vldb.org/conf/1997/P286.PDF.
[9] E. F. Codd. Relational completeness of data base sublanguages. Research Report / RJ / IBM / San Jose, California, RJ987, 1972.
[10] Jan Chomicki and David Toman. Implementing temporal integrity constraints using an active DBMS. IEEE Trans. Knowl. Data Eng., 7(4):566-582, 1995. doi:10.1109/69.404030. · doi:10.1109/69.404030
[11] Martha Escobar-Molano, Richard Hull, and Dean Jacobs. Safety and translation of calculus queries with scalar functions. In Catriel Beeri, editor, PODS 1993, pages 253-264. ACM Press, 1993. doi:10.1145/153850.153909. · doi:10.1145/153850.153909
[12] Erling Ellingsen. Regex golf, 2013. URL: https://alf.nu/RegexGolf.
[13] Allen Van Gelder and Rodney W. Topor. Safety and correct translation of relational calculus formulas. In Moshe Y. Vardi, editor, PODS 1987, pages 313-327. ACM, 1987. doi:10.1145/ 28659.28693. · doi:10.1145/28659.28693
[14] Allen Van Gelder and Rodney W. Topor. Safety and translation of relational calculus queries. ACM Trans. Database Syst., 16(2):235-278, 1991. doi:10.1145/114325.103712. · doi:10.1145/114325.103712
[15] Richard Hull and Jianwen Su. Domain independence and the relational calculus. Acta Informatica, 31(6):513-524, 1994. doi:10.1007/BF01213204. · Zbl 0818.68067 · doi:10.1007/BF01213204
[16] Michael Kifer. On safety, domain independence, and capturability of database queries (preliminary report). In Catriel Beeri, Joachim W. Schmidt, and Umeshwar Dayal, editors, JCDKB 1988, pages 405-415. Morgan Kaufmann, 1988. doi:10.1016/b978-1-4832-1313-2.50037-8. · doi:10.1016/b978-1-4832-1313-2.50037-8
[17] Nils Klarlund and Anders Møller. MONA v1.4 User Manual. BRICS, Department of Computer Science, University of Aarhus, 2001. URL: http://www.brics.dk/mona/.
[18] Mahmoud Abo Khamis, Hung Q. Ngo, and Atri Rudra. FAQ: questions asked frequently. In Tova Milo and Wang-Chiew Tan, editors, PODS 2016, pages 13-28. ACM, 2016. doi: 10.1145/2902251.2902280. · doi:10.1145/2902251.2902280
[19] Leonid Libkin. Elements of Finite Model Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2004. doi:10.1007/978-3-662-07003-1. · Zbl 1060.03002 · doi:10.1007/978-3-662-07003-1
[20] Hong-Cheu Liu, Jeffrey Xu Yu, and Weifa Liang. Safety, domain independence and translation of complex value database queries. Inf. Sci., 178(12):2507-2533, 2008. doi:10.1016/j.ins.2008. 02.005. · Zbl 1183.68249 · doi:10.1016/j.ins.2008.02.005
[21] Jesper B. Møller, Jakob Lichtenberg, Henrik Reif Andersen, and Henrik Hulgaard. Difference decision diagrams. In Jörg Flum and Mario Rodríguez-Artalejo, editors, CSL 1999, volume 1683 of LNCS, pages 111-125. Springer, 1999. doi:10.1007/3-540-48168-0_9. · Zbl 0944.68040 · doi:10.1007/3-540-48168-0_9
[22] Jesper B. Møller. DDDLIB: A library for solving quantified difference inequalities. In Andrei Voronkov, editor, CADE 2002, volume 2392 of LNCS, pages 129-133. Springer, 2002. doi: 10.1007/3-540-45620-1_9. · Zbl 1072.68583 · doi:10.1007/3-540-45620-1_9
[23] Jianmo Ni, Jiacheng Li, and Julian J. McAuley. Justifying recommendations using distantly-labeled reviews and fine-grained aspects. In Kentaro Inui, Jing Jiang, Vincent Ng, and Xiaojun Wan, editors, EMNLP 2019, pages 188-197. Association for Computational Linguistics, 2019. doi:10.18653/v1/D19-1018. · doi:10.18653/v1/D19-1018
[24] Hung Q. Ngo, Christopher Ré, and Atri Rudra. Skew strikes back: new developments in the theory of join algorithms. SIGMOD Rec., 42(4):5-16, 2013. doi:10.1145/2590989.2590991. · doi:10.1145/2590989.2590991
[25] Dan Olteanu and Maximilian Schleich. Factorized databases. SIGMOD Rec., 45(2):5-16, 2016. doi:10.1145/3003665.3003667. · doi:10.1145/3003665.3003667
[26] Robert A. Di Paola. The recursive unsolvability of the decision problem for the class of definite formulas. J. ACM, 16(2):324-327, 1969. doi:10.1145/321510.321524. · Zbl 0182.33202 · doi:10.1145/321510.321524
[27] Martin Raszyk. First-order query evaluation. Archive of Formal Proofs, 2022. Formal proof development. URL: https://www.isa-afp.org/entries/Eval_FO.html.
[28] Martin Raszyk, David Basin, Srdan Krstić, and Dmitriy Traytel. Implementation and empirical evaluation associated with this article, 2022. URL: https://github.com/mraszyk/rc2sql/tree/ lmcs22.
[29] Martin Raszyk, David Basin, Srdan Krstić, and Dmitriy Traytel. Practical relational calculus query evaluation. In Dan Olteanu and Nils Vortmeier, editors, ICDT 2022, volume 220 of LIPIcs, pages 11:1-11:21. Schloss Dagstuhl -Leibniz-Zentrum für Informatik, 2022. doi:10. 4230/LIPIcs.ICDT.2022.11. · Zbl 07838433 · doi:10.4230/LIPIcs.ICDT.2022.11
[30] Peter Z. Revesz. Introduction to Constraint Databases. Texts in Computer Science. Springer, 2002. doi:10.1007/b97430. 38:36 · Zbl 0995.68035 · doi:10.1007/b97430
[31] M. Raszyk, D. Basin, S. Krstić, and D. Traytel Vol. 19:4
[32] Martin Raszyk and Dmitriy Traytel. Making arbitrary relational calculus queries safe-range. Archive of Formal Proofs, 2022. Formal proof development. URL: https://www.isa-afp.org/ entries/Safe_Range_RC.html.
[33] Boris A. Trakhtenbrot. Impossibility of an algorithm for the decision problem in finite classes. Doklady Akademii Nauk SSSR, 70(4):569-572, 1950. · Zbl 0038.15001
[34] Moshe Y. Vardi. The decision problem for database dependencies. Inf. Process. Lett., 12(5):251-254, 1981. doi:10.1016/0020-0190(81)90025-9. · Zbl 0482.68094 · doi:10.1016/0020-0190(81)90025-9
[35] Moshe Y. Vardi. The complexity of relational query languages (extended abstract). In Harry R. Lewis, Barbara B. Simons, Walter A. Burkhard, and Lawrence H. Landweber, editors, STOC 1982, pages 137-146. ACM, 1982. doi:10.1145/800070.802186. · doi:10.1145/800070.802186
[36] M. Raszyk, D. Basin, S. Krstić, and D. Traytel Vol. 19:4
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.