Abstract
This paper examines algorithmic aspects of searching for approximate functional dependencies in a database relation. The goal is to avoid exploration of large parts of the space of potential rules. This is accomplished by leveraging found rules to make finding other rules more efficient. The overall strategy is an attribute-at-a-time iteration which uses local breadth first searches on lattices that increase in width and height in each iteration. The resulting algorithm provides many opportunities to apply heuristics to tune the search for particular data-sets and/or search objectives. The search can be tuned at both the global iteration level and the local search level. A number of heuristics are developed and compared experimentally.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Bell, S., Brockhausen, P.: Discovery of constraints and data dependencies in databases (extended abstract). In: European Conference on Machine Learning, pp. 267–270 (1995)
Bohannon, P., Fan, W., Geerts, F., Jia, X., Kementsietsidis, A.: Conditional functional dependencies for data cleaning. In: ICDE, pp. 746–755. IEEE, Los Alamitos (2007)
Dalkilic, M.M., Robertson, E.L.: Information dependencies. In: PODS, pp. 245–253 (2000)
Doan, A.: Illinois semantic integration archive, http://pages.cs.wisc.edu/~anhai/wisc-si-archive/
Giannella, C., Dalkilic, M., Groth, D., Robertson, E.: Improving query evaluation with approximate functional dependency based decompositions. In: Eaglestone, B., North, S.C., Poulovassilis, A. (eds.) BNCOD 2002. LNCS, vol. 2405, pp. 26–41. Springer, Heidelberg (2002)
Giannella, C., Robertson, E.: On approximation measures for functional dependencies. Inf. Syst. 29(6), 483–507 (2004)
Hilderman, R., Hamilton, H.: Knowledge discovery and interestingness measures: A survey. Technical Report 99-04, University of Regina (1999)
Huhtala, Y., Kärkkäinen, J., Porkka, P., Toivonen, H.: TANE: An efficient algorithm for discovering functional and approximate dependencies. The Computer Journal 42(2), 100–111 (1999)
Ilyas, I.F., Markl, V., Haas, P., Brown, P., Aboulnaga, A.: Cords: automatic discovery of correlations and soft functional dependencies. In: SIGMOD Proceedings, pp. 647–658. ACM Press, New York (2004)
Kivinen, J., Mannila, H.: Approximate inference of functional dependencies from relations. In: ICDT, pp. 129–149. Elsevier Science Publishers, Amsterdam (1995)
Mannila, H., Räihä, K.-J.: On the complexity of inferring functional dependencies. Discrete Applied Mathematics 40(2), 237–243 (1992)
Matos, V., Grasser, B.: Sql-based discovery of exact and approximate functional dependencies. In: ITiCSE Working Group Reports, pp. 58–63. Association for Computing Machinery, New York (2004)
Ramakrishnan, R., Gehrke, J.: Database Management Systems. McGraw-Hill Higher Education, New York (2002)
Shannon, C.E.: A mathematical theory of communication. Bell System Tech. J. 27, 379–423, 623–656 (1948)
Wolf, G., Khatri, H., Chokshi, B., Fan, J., Chen, Y., Kambhampati, S.: Query processing over incomplete autonomous databases. In: VLDB Proceedings. VLDB Endowment, pp. 651–662 (2007)
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Engle, J.T., Robertson, E.L. (2008). HLS: Tunable Mining of Approximate Functional Dependencies. In: Gray, A., Jeffery, K., Shao, J. (eds) Sharing Data, Information and Knowledge. BNCOD 2008. Lecture Notes in Computer Science, vol 5071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70504-8_4
Download citation
DOI: https://doi.org/10.1007/978-3-540-70504-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70503-1
Online ISBN: 978-3-540-70504-8
eBook Packages: Computer ScienceComputer Science (R0)