Abstract
One of the main contributions of rough set theory to data mining is data reduction. There are three kinds of reduction: attribute (column) reduction, row reduction, and value reduction. Row reduction merges duplicate rows. Attribute reduction finds the important attributes. Value reduction shortens the decision rules to a logically equivalent form of minimal length. Most recent attention has focused on finding attribute reducts. Traditionally, value reducts have been searched for through attribute reducts. This paper observes that this approach may miss the best value reducts. It also revisits an old, rudimentary idea [11], namely a rough set theory on high frequency data: the notion of a high frequency value reduct is extracted in a bottom-up fashion without finding attribute reducts. Our method can discover concise and important decision rules in large databases; it is described and illustrated by an example.
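To make the bottom-up idea concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not spell out. It assumes that a "high frequency value reduct" can be read as a shortest set of attribute-value conditions that determines the decision consistently and is matched by at least `min_support` rows. The function name, the `min_support` parameter, and the toy table are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def high_frequency_value_reducts(rows, condition_attrs, decision_attr, min_support):
    """Return minimal, consistent rules matched by at least min_support rows.

    Each rule is (conditions, decision, support), where conditions is a
    tuple of (attribute, value) pairs. Assumed reading of the abstract,
    not the paper's definition.
    """
    rules = []
    # Bottom-up: try short condition sets first, then longer ones.
    for length in range(1, len(condition_attrs) + 1):
        for attrs in combinations(condition_attrs, length):
            # Group the decision values by the rows' values on these attributes.
            groups = defaultdict(list)
            for row in rows:
                key = tuple((a, row[a]) for a in attrs)
                groups[key].append(row[decision_attr])
            for key, decisions in groups.items():
                if len(decisions) < min_support:
                    continue  # not frequent enough
                if len(set(decisions)) != 1:
                    continue  # same conditions lead to different decisions
                # Skip rules that merely extend a shorter rule already found.
                if any(set(c) <= set(key) for c, _, _ in rules):
                    continue
                rules.append((key, decisions[0], len(decisions)))
    return rules


# Hypothetical toy decision table; "play" is the decision attribute.
table = [
    {"weather": "sunny", "wind": "weak", "play": "yes"},
    {"weather": "sunny", "wind": "strong", "play": "yes"},
    {"weather": "rainy", "wind": "weak", "play": "yes"},
    {"weather": "rainy", "wind": "strong", "play": "no"},
    {"weather": "rainy", "wind": "strong", "play": "no"},
]

rules = high_frequency_value_reducts(table, ["weather", "wind"], "play", min_support=2)
for conditions, decision, support in rules:
    print(conditions, "=>", decision, "support:", support)
```

On this table the sketch finds two one-condition rules (sunny weather, weak wind) and one two-condition rule (rainy weather with strong wind), without ever computing an attribute reduct. Enumerating all attribute combinations is exponential in the number of attributes, so a practical version on a large database would need frequency-based pruning of the candidates.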
References
Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: Proc. of the ACM SIGMOD Conference, pp. 207–216 (1993)
Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. of the 20th International Conference on Very Large Data Bases, pp. 487–499. Morgan Kaufmann, San Francisco (1994)
Chen, R., Lin, T.Y.: Supporting Rough Set Theory in Very Large Database Using ORACLE RDBMS. In: Soft Computing in Intelligent Systems and Information Processing, Proceedings of 1996 Asian Fuzzy Systems Symposium, Kenting, Taiwan, December 11-14, 1996, pp. 332–337 (1996)
Chen, R., Lin, T.Y.: Finding Reducts in Very Large Databases. In: Proceedings of Joint Conference of Information Science, Research Triangle Park, North Carolina, March 1-5, 1997, pp. 350–352 (1997)
Fernandez-Baizan, M., Ruiz, E., Wasilewska, A.: A Model of RSDM Implementation. In: Polkowski, L., Skowron, A. (eds.) RSCTC 1998. LNCS (LNAI), vol. 1424, pp. 186–193. Springer, Heidelberg (1998)
Garcia-Molina, H., Ullman, J., Widom, J.: Database Systems: The Complete Book. Prentice-Hall, Englewood Cliffs (2001)
Han, J., Hu, X., Lin, T.: A new computation model for rough set theory based on database systems. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds.) DaWaK 2003. LNCS, vol. 2737, pp. 381–390. Springer, Heidelberg (2003)
Houtsma, M., Swami, A.: Set-Oriented Mining for Association Rules in Relational Databases. In: Proc. of Int. Conf. on Data Engineering, pp. 25–33 (1995)
Hu, X., Lin, T., Han, J.: A new rough sets model based on database systems. J. of Fundamenta Informaticae 59(2-3), 135–152 (2004)
Lin, T.Y.: Neighborhood Systems and Approximation in Database and Knowledge Base Systems. In: Proceedings of the Fourth International Symposium on Methodologies of Intelligent Systems, Poster Session, October 12-15, 1989, pp. 75–86 (1989)
Lin, T.Y.: Rough Set Theory in Very Large Database Mining. In: Symposium on Modeling, Analysis and Simulation, CESA’96 IMACS Multi Conference (Computational Engineering in Systems Applications), vol. 2, Lille, France, July 9-12, 1996, pp. 936–994 (1996)
Pawlak, Z.: Rough Sets. International Journal of Information and Computer Science 11(5), 341–356 (1982)
Pawlak, Z.: Rough sets. Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht (1991)
Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems. In: Slowinski, R. (ed.) Decision Support by Experience - Application of the Rough Sets Theory, pp. 331–362. Kluwer Academic Publishers, Dordrecht (1992)
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lin, T.Y., Han, J. (2007). High Frequent Value Reduct in Very Large Databases. In: An, A., Stefanowski, J., Ramanna, S., Butz, C.J., Pedrycz, W., Wang, G. (eds) Rough Sets, Fuzzy Sets, Data Mining and Granular Computing. RSFDGrC 2007. Lecture Notes in Computer Science, vol 4482. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72530-5_41
DOI: https://doi.org/10.1007/978-3-540-72530-5_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72529-9
Online ISBN: 978-3-540-72530-5
eBook Packages: Computer Science, Computer Science (R0)