×

Deciding implication for functional dependencies in complex-value databases. (English) Zbl 1110.68034

Summary: Modern applications increasingly require the storage of data beyond relational structure. The challenge of providing well-founded data models that can handle complex objects such as lists, sets, multisets, unions and references has not been met yet in a completely satisfactory way. The success of such data models will greatly depend on the existence of automated database design techniques that generalise achievements from relational databases. In this paper, we study the implication problem of functional dependencies (FDs) in the presence of records, sets, multisets and lists. Database schemata are defined as nested attributes, database instances as nested relations and FDs are defined in terms of subattributes of the database schema. The expressiveness of FDs deviates fundamentally from previous approaches in different data models including the nested relational data model and XML.
The implication problem is to decide whether for an arbitrary database schema, and an arbitrary set \(\Sigma \cup \{\sigma\}\) of FDs defined on that schema, every database instance that satisfies all FDs in \(\Sigma\) also satisfies \(\sigma\). The difficulty in generalising the solution from the relational data model to the presence of sets and multisets is caused by the fact that the value on the join of subattributes is no longer determined by the values on the subattributes. Based on the notion of a unit, we propose to decompose the database schema in such a way that the closure of a set of nested attributes can be computed on the components of the schema. The implementation of the algorithm is based on a representation theorem for Brouwerian algebras. The main contribution is the proof that the algorithm works correctly and in polynomial-time in the size of the input. Defining the size of the input is not trivial since the measure should both generalise the one that is used for relational databases and do justice to the presence of sets and multisets. Our solution to the implication problem allows to solve other important problems that occur in database design. We present polynomial-time algorithms to determine non-redundant covers of sets of FDs, and to decide whether a given set of subattributes forms a superkey.

MSC:

68P15 Database theory
Full Text: DOI

References:

[1] Abiteboul, S.; Buneman, P.; Suciu, D., Data on the Web: From Relations to Semistructured Data and XML (2000), Morgan Kaufmann Publishers: Morgan Kaufmann Publishers Los Altos, CA
[2] Abiteboul, S.; Hull, R.; Vianu, V., Foundations of Databases (1995), Addison-Wesley: Addison-Wesley Reading, MA · Zbl 0848.68031
[3] Arenas, M.; Libkin, L., An information-theoretic approach to normal forms for relational and XML data, (Principles of Database Systems (PODS) (2003)), 15-26
[4] Arenas, M.; Libkin, L., A normal form for XML documents, Trans. Database Systems (TODS), 29, 1, 195-232 (2004)
[5] W.W. Armstrong, Dependency structures of database relationships, Inform. Process. (1974) 580-583.; W.W. Armstrong, Dependency structures of database relationships, Inform. Process. (1974) 580-583. · Zbl 0296.68038
[6] Armstrong, W. W.; Nakamura, Y.; Rudnicki, P., Armstrong’s axioms, J. Formalized Math., 14 (2002)
[7] Atkinson, M.; Bancilhon, F.; DeWitt, D.; Dittrich, K.; Maier, D.; Zdonik, S., The object-oriented database system manifesto, (Proc. Internat. Conf. on Deductive and Object-Oriented Databases (1989)), 40-57
[8] Banatre, J.; Le Metayer, D., Programming by multiset transformation, Comm. ACM, 36, 1, 98-111 (1993)
[9] Beeri, C., On the membership problem for functional and multivalued dependencies in relational databases, Trans. Database Systems (TODS), 5, 3, 241-259 (1980) · Zbl 0441.68118
[10] Beeri, C., A formal approach to object-oriented databases, Data Knowledge Eng., 5, 4, 353-382 (1990)
[11] C. Beeri, P.A. Bernstein, Computational problems related to the design of normal form relational schemata, Trans. Database Systems (TODS) (1979) 30-59.; C. Beeri, P.A. Bernstein, Computational problems related to the design of normal form relational schemata, Trans. Database Systems (TODS) (1979) 30-59.
[12] Beeri, C.; Bernstein, P. A.; Goodman, N., A sophisticate’s introduction to database normalization theory, (Proc. Internat. Conf. on Very Large Data Bases (VLDB) (1978)), 113-124
[13] Beeri, C.; Fagin, R.; Howard, J. H., A complete axiomatization for functional and multivalued dependencies in database relations, (International Conference on Management of Data (SIGMOD) (1977), ACM: ACM New York), 47-61
[14] Beeri, C.; Mendelzon, A. O.; Sagiv, Y.; Ullman, J. D., Equivalence of relational database schemes, SIAM J. Comput., 10, 2, 352-370 (1981) · Zbl 0472.68056
[15] P.A. Bernstein, Normalisation and functional dependencies in the relational data base model, Technical Report CSRG-60, University of Toronto, 1975.; P.A. Bernstein, Normalisation and functional dependencies in the relational data base model, Technical Report CSRG-60, University of Toronto, 1975.
[16] Bernstein, P. A., Synthesizing third normal form relations from functional dependencies, Trans. Database Systems (TODS), 1, 277-298 (1976)
[17] Bernstein, P. A.; Goodman, N., What does Boyce-Codd normal form do?, (Proc. Internat. Conf. on Very Large Data Bases (VLDB) (1980)), 245-259
[18] Berry, G.; Boudol, G., The chemical abstract machine, Theoret. Comput. Sci., 96, 217-248 (1992) · Zbl 0747.68013
[19] J. Biskup, Database schema design theory: achievements and challenges, in: Information Systems and Data Management, Lecture Notes in Computer Science, Vol. 1066, Springer, Berlin, 1995, pp. 14-44.; J. Biskup, Database schema design theory: achievements and challenges, in: Information Systems and Data Management, Lecture Notes in Computer Science, Vol. 1066, Springer, Berlin, 1995, pp. 14-44.
[20] Biskup, J., Achievements of relational database schema design theory revisited, (Semantics in Databases, Lecture Notes in Computer Science, Vol. 1358 (1998), Springer: Springer Berlin), 29-54 · Zbl 0885.68006
[21] Bommel, M. F.; Weddell, G. E., Reasoning about equations and functional dependencies on complex objects, Trans. Knowledge and Data Engineering, 6, 3, 455-469 (1994)
[22] T. Bray, J. Paoli, C.M. Sperberg-McQueen, E. Maler, F. Yergeau, Extensible Markup Language (XML) 1.0 (third edition) W3C recommendation 04 February 2004, \( \langle;\) http://www.w3.org/TR/2004/REC-xml-\(20040204/ \rangle;\); T. Bray, J. Paoli, C.M. Sperberg-McQueen, E. Maler, F. Yergeau, Extensible Markup Language (XML) 1.0 (third edition) W3C recommendation 04 February 2004, \( \langle;\) http://www.w3.org/TR/2004/REC-xml-\(20040204/ \rangle;\)
[23] Bry, F.; Kröger, P., A computational biology database digest: data, data analysis, and data management, Distributed Parallel Databases, 13, 1, 7-42 (2003) · Zbl 1009.68558
[24] Buneman, P.; Fan, W.; Siméon, J.; Weinstein, S., Constraints for semi-structured data and XML, SIGMOD Record, 30, 1, 47-54 (2001)
[25] C. Calude, G. Paun, G. Rozenberg, A. Salomaa (Eds.), Multiset Processing, Mathematical, Computer Science, and Molecular Computing Points of View, Workshop on Multiset Processing, WMP 2000, Curtea de Arges, Romania, August 21-25, 2000, Springer, Berlin, 2001.; C. Calude, G. Paun, G. Rozenberg, A. Salomaa (Eds.), Multiset Processing, Mathematical, Computer Science, and Molecular Computing Points of View, Workshop on Multiset Processing, WMP 2000, Curtea de Arges, Romania, August 21-25, 2000, Springer, Berlin, 2001.
[26] Chen, P. P., The entity-relationship model: towards a unified view of data, Trans. Database Systems (TODS), 1, 9-36 (1976)
[27] Codd, E. F., Further normalization of the database relational model, (Courant Computer Science Symposia 6: Data Base Systems (1972)), 33-64 · Zbl 0809.68056
[28] Codd, E. F., Recent investigations in relational database system, (Proc. IFIP Conference (1974)), 1017-1021 · Zbl 0296.68041
[29] Courcelle, B., Fundamental properties of infinite trees, Theoret. Comput. Sci., 25, 95-169 (1983) · Zbl 0521.68013
[30] Dovier, A.; Policriti, A.; Rossi, G., A uniform axiomatic view of lists, multisets, and sets, and the unification algorithm, Fund. Inform., 36, 201-234 (1998) · Zbl 0930.68042
[31] Dummett, M., Elements of Intuitionism (2000), Clarendon Press: Clarendon Press Oxford · Zbl 0949.03059
[32] Fagin, R., Multivalued dependencies and a new normal form for relational databases, Trans. Database Systems (TODS), 2, 3, 262-278 (1977)
[33] Fagin, R., A normal form for relational databases that is based on domains and keys, Trans. Database Systems (TODS), 6, 3, 387-415 (1981) · Zbl 0462.68088
[34] Fagin, R.; Vardi, M. Y., The theory of data dependencies: a survey, (Mathematics of Information Processing: Proceedings of Symposia in Applied Mathematics (1986)), 19-71 · Zbl 0617.68084
[35] Fan, W.; Libkin, L., On XML integrity constraints in the presence of DTDs, J. ACM, 49, 3, 368-406 (2002) · Zbl 1326.68120
[36] Fan, W.; Siméon, J., Integrity constraints for XML, J. Comput. System Sci., 66, 1, 254-291 (2003) · Zbl 1026.68038
[37] Fishman, D. H.; Beech, D.; Cate, H. P.; Chow, E. F.; Connors, T.; Davis, J. W.; Derret, N.; Hoch, C. G.; Kent, W.; Lyngbæk, P.; Mahbod, B.; Neimat, M.-A.; Ryan, T. A.; Shan, M.-C., IRIS: an object-oriented database management system, Trans. Office Inform. Systems, 5, 1, 48-69 (1987)
[38] Gardarin, G.; Cheiney, J.-P.; Kiernan, G.; Pastre, D.; Stora, H., Managing complex objects in an extensible relational DBMS, (Proc. Internat. Conf. on Very Large Data Bases (1989)), 55-65
[39] Hara, C. S.; Davidson, S. B., Reasoning about nested functional dependencies, (Principles of Database Systems (PODS) (1999)), 91-100
[40] Hartmann, S., Decomposing relationship types by pivoting and schema equivalence, Data Knowledge Eng., 39, 75-99 (2001) · Zbl 0984.68037
[41] Hartmann, S., On the implication problem for cardinality constraints and functional dependencies, Ann. Math. Artificial Intelligence, 33, 253-307 (2001) · Zbl 1314.68119
[42] S. Hartmann, S. Link, More functional dependencies for XML, in: Advances in Databases and Information Systems, Seventh East European Conference, ADBIS 2003, Dresden, Germany, September 3-6, Lecture Notes in Computer Science, Vol. 2798, Springer, Berlin, 2003, pp. 355-369.; S. Hartmann, S. Link, More functional dependencies for XML, in: Advances in Databases and Information Systems, Seventh East European Conference, ADBIS 2003, Dresden, Germany, September 3-6, Lecture Notes in Computer Science, Vol. 2798, Springer, Berlin, 2003, pp. 355-369. · Zbl 1024.00053
[43] Hartmann, S.; Link, S., On functional dependencies in advanced data models, Electronic Notes in Theoretical Computer Science (ENTCS), Vol. 84 (2003) · Zbl 1264.68073
[44] Hartmann, S.; Link, S., A membership algorithm for functional and multi-valued dependencies in the presence of lists, Electronic Notes in Theoretical Computer Science (ENTCS), Vol. 91, 171-194 (2004) · Zbl 1271.68103
[45] Hartmann, S.; Link, S., Multi-valued dependencies in the presence of lists, (Proc. 23rd Internat. Conf. on Principles of Database Systems (PODS) (2004), ACM: ACM New York), 330-341
[46] S. Hartmann, S. Link, Normalisation in the presence of lists, in: 15th Australasian Database Conference (ADC), Conferences in Research and Practice in Information Technology, Vol. 27, 2004, pp. 53-64.; S. Hartmann, S. Link, Normalisation in the presence of lists, in: 15th Australasian Database Conference (ADC), Conferences in Research and Practice in Information Technology, Vol. 27, 2004, pp. 53-64.
[47] S. Hartmann, S. Link, A topological view of data dependencies in complex-value databases, in: Proc. Third Chilean Workshop on Databases, 2004.; S. Hartmann, S. Link, A topological view of data dependencies in complex-value databases, in: Proc. Third Chilean Workshop on Databases, 2004.
[48] S. Hartmann, S. Link, The implication problem of functional dependencies in complex-value databases, Electronic Notes in Theoretical Computer Science (ENTCS) Vol. 123, 2005, pp. 125-137.; S. Hartmann, S. Link, The implication problem of functional dependencies in complex-value databases, Electronic Notes in Theoretical Computer Science (ENTCS) Vol. 123, 2005, pp. 125-137.
[49] Hartmann, S.; Link, S., Axiomatisations of functional dependencies in the presence of records, lists, sets and multisets, Theoret. Comput. Sci. (TCS), 355, 2, 167-196 (2006) · Zbl 1088.68046
[50] Hartmann, S.; Link, S., Functional and multivalued dependencies in nested databases generated by record and list constructor, Ann. Math. Artificial Intelligence (AMAI), 46, 114-164 (2006) · Zbl 1097.68553
[51] Hartmann, S.; Link, S., The nested list normal form for functional and multivalued dependencies, (Proc. Fourth Internat. Symposium on Foundations of Information and Knowledge Systems (FoIKS), Lecture Notes in Computer Science, Vol. 3861 (2006), Springer: Springer Berlin), 137-158 · Zbl 1177.68068
[52] S. Hartmann, S. Link, M. Kirchberg, A subgraph-based approach towards functional dependencies for XML, in: Proc. Seventh World-Multiconference on Systemics, Cybernetics and Informatics (SCI), Vol. IX, Computer Science and Engineering II, Orlando, Florida, USA, July 27-30, 2003, pp. 200-205.; S. Hartmann, S. Link, M. Kirchberg, A subgraph-based approach towards functional dependencies for XML, in: Proc. Seventh World-Multiconference on Systemics, Cybernetics and Informatics (SCI), Vol. IX, Computer Science and Engineering II, Orlando, Florida, USA, July 27-30, 2003, pp. 200-205.
[53] S. Hartmann, S. Link, K.-D. Schewe, Reasoning about functional and multi-valued dependencies in the presence of lists, in: Third International Symposium on Foundations of Information and Knowledge Systems (FoIKS), Lecture Notes in Computer Science, Vol. 2942, Springer, Berlin, 2004, pp. 134-154.; S. Hartmann, S. Link, K.-D. Schewe, Reasoning about functional and multi-valued dependencies in the presence of lists, in: Third International Symposium on Foundations of Information and Knowledge Systems (FoIKS), Lecture Notes in Computer Science, Vol. 2942, Springer, Berlin, 2004, pp. 134-154. · Zbl 1202.68142
[54] Hull, R.; King, R., Semantic database modeling: survey, applications and research issues, ACM Comput. Surveys, 19, 3 (1987)
[55] G. Lamperti, M. Melchiori, M. Zanella, On multisets in database systems, in: Workshop on Multiset Processing (WMP), Lecture Notes in Computer Science, Vol. 2235, Springer, Berlin, 2000, pp. 147-216.; G. Lamperti, M. Melchiori, M. Zanella, On multisets in database systems, in: Workshop on Multiset Processing (WMP), Lecture Notes in Computer Science, Vol. 2235, Springer, Berlin, 2000, pp. 147-216. · Zbl 1052.68027
[56] M. Lee, T. Ling, W. Low, Designing functional dependencies for XML, in: Advances in Database Technology—EDBT 2002, Eighth International Conference on Extending Database Technology, Prague, Czech Republic, March 25-27, Lecture Notes in Computer Science, Vol. 2287, Springer, Berlin, 2002, pp. 124-141.; M. Lee, T. Ling, W. Low, Designing functional dependencies for XML, in: Advances in Database Technology—EDBT 2002, Eighth International Conference on Extending Database Technology, Prague, Czech Republic, March 25-27, Lecture Notes in Computer Science, Vol. 2287, Springer, Berlin, 2002, pp. 124-141. · Zbl 1054.68794
[57] Levene, M., The Nested Universal Relation Database Model (1992), Springer: Springer Berlin · Zbl 0985.68510
[58] Levene, M.; Loizou, G., Semantics for null extended nested relations, Trans. Database Systems (TODS), 18, 3, 414-459 (1993)
[59] Li, J.; Ng, S.; Wong, L., Bioinformatics adventures in database research, (Proc. Internat. Conf. on Database Theory (ICDT), Lecture Notes in Computer Science, Vol. 2572 (2002), Springer: Springer Berlin), 31-46 · Zbl 1022.68519
[60] Maier, D., Minimum covers in relational database model, J. ACM, 27, 4, 664-674 (1980) · Zbl 0466.68085
[61] McKinsey, J. C.C.; Tarski, A., The algebra of topology, Ann. Math., 45, 141-191 (1944) · Zbl 0060.06206
[62] McKinsey, J. C.C.; Tarski, A., On closed elements in closure algebras, Ann. Math., 47, 122-146 (1946) · Zbl 0060.06207
[63] Mok, W. Y.; Ng, Y. K.; Embley, D. W., A normal form for precisely characterizing redundancy in nested relations, Trans. Database Systems (TODS), 21, 77-106 (1996)
[64] Naqvi, S.; Tsur, S., A Logical Language for Data and Knowledge Bases (1989), Computer Science Press: Computer Science Press Rockville, MD
[65] Özsoyoglu, Z. M.; Yuan, L. Y., A new normal form for nested relations, Trans. Database Systems (TODS), 12, 111-136 (1987)
[66] Paredaens, J.; De Bra, P.; Gyssens, M.; Van Gucht, D., The Structure of the Relational Database Model (1989), Springer: Springer Berlin · Zbl 0669.68060
[67] Richardson, J., Supporting lists in a datamodel, (Proc. Internat. Conf. on Very Large Data Bases (VLDB) (1992)), 127-192
[68] Schewe, K.-D.; Thalheim, B., Fundamental concepts of object oriented databases, Acta Cybernet., 11, 4, 49-85 (1993) · Zbl 0804.68034
[69] Scholl, M. H.; Schek, H.-J., A relational object model, (Proc. Internat. Conf. on Database Theory (ICDT) (1990)), 89-105 · Zbl 0589.68067
[70] Seshadri, P.; Livny, M.; Ramakrishnan, R., The design and implementation of sequence database system, (Proc. Internat. Conf. on Very Large Data Bases (VLDB) (1996)), 99-110
[71] Suciu, D., On database theory and XML, SIGMOD Record, 30, 3, 39-45 (2001)
[72] Tari, Z.; Stokes, J.; Spaccapietra, S., Object normal forms and dependency constraints for object-oriented schemata, Trans. Database Systems (TODS), 22, 513-569 (1997)
[73] Thalheim, B., Dependencies in Relational Databases (1991), Teubner: Teubner Stuttgart · Zbl 0743.68014
[74] Thalheim, B., Entity-Relationship Modeling: Foundations of Database Technology (2000), Springer: Springer Berlin · Zbl 0976.68046
[75] Vardi, M. Y., Fundamentals of dependency theory, (Börger, E., Trends in Theoretical Computer Science (1987)), 171-224
[76] Vianu, V., A web odyssey: from Codd to XML, (Principles of Database Systems (2001)), 1-15
[77] M.W. Vincent, J. Liu, Functional dependencies for XML, in: M.E. Orlowska, X. Zhou, Y. Zhang (Eds.), Web Technologies and Applications: Fifth Asia-Pacific Web Conference, APWeb 2003, Xian, China, April 23-25, 2003. Proceedings, Lecture Notes in Computer Science, Vol. 2642, Springer, Berlin, 2003, pp. 22-34.; M.W. Vincent, J. Liu, Functional dependencies for XML, in: M.E. Orlowska, X. Zhou, Y. Zhang (Eds.), Web Technologies and Applications: Fifth Asia-Pacific Web Conference, APWeb 2003, Xian, China, April 23-25, 2003. Proceedings, Lecture Notes in Computer Science, Vol. 2642, Springer, Berlin, 2003, pp. 22-34. · Zbl 1025.68668
[78] M.W. Vincent, J. Liu, Multivalued dependencies in XML, in: British National Conference on Databases, Lecture Notes in Computer Science, Vol. 2712, Springer, Berlin, 2003, pp. 4-18.; M.W. Vincent, J. Liu, Multivalued dependencies in XML, in: British National Conference on Databases, Lecture Notes in Computer Science, Vol. 2712, Springer, Berlin, 2003, pp. 4-18. · Zbl 1037.68958
[79] M.W. Vincent, J. Liu, C. Liu, A redundancy free 4NF for XML, in: Proc. of the XML Database Symposium, Lecture Notes in Computer Science, Vol. 2824, Springer, Berlin, 2003, pp. 254-266.; M.W. Vincent, J. Liu, C. Liu, A redundancy free 4NF for XML, in: Proc. of the XML Database Symposium, Lecture Notes in Computer Science, Vol. 2824, Springer, Berlin, 2003, pp. 254-266.
[80] Vincent, M. W.; Srinivasan, B., Redundancy and the justification of fourth normal form in relational databases, Internat. J. Foundations Comput. Sci., 4, 4, 355-365 (1993) · Zbl 0804.68038
[81] Vincent, M. W.; Srinivasan, B., Update anomalies and the justification of fourth normal form in relational databases, Inform. Sci., 81, 87-102 (1994) · Zbl 0837.68021
[82] Weddell, G. E., Reasoning about functional dependencies generalized for semantic data models, Trans. Database Systems (TODS), 17, 1, 32-64 (1992)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.