Abstract
Several well-developed approaches to inductive learning now exist, but each has specific limitations that are hard to overcome. Multi-strategy learning attempts to tackle this problem by combining multiple methods in one algorithm. This article describes a unification of two widely-used empirical approaches: rule induction and instance-based learning. In the new algorithm, instances are treated as maximally specific rules, and classification is performed using a best-match strategy. Rules are learned by gradually generalizing instances until no improvement in apparent accuracy is obtained. Theoretical analysis shows this approach to be efficient. It is implemented in the RISE 3.1 system. In an extensive empirical study, RISE consistently achieves higher accuracies than state-of-the-art representatives of both its parent approaches (PEBLS and CN2), as well as a decision tree learner (C4.5). Lesion studies show that each of RISE's components is essential to this performance. Most significantly, in 14 of the 30 domains studied, RISE is more accurate than the best of PEBLS and CN2, showing that a significant synergy can be obtained by combining multiple empirical methods.
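The procedure summarized in the abstract can be loosely sketched in Python. This is a toy illustration only, not the published RISE 3.1 algorithm: the four-example dataset, the mismatch-count distance, and the drop-one-condition generalization step are all simplifications (RISE actually generalizes each rule minimally toward its nearest same-class example and uses a more refined distance measure). What it preserves is the core idea: start with instances as maximally specific rules, classify by best match, and keep a generalization whenever apparent accuracy does not decrease.

```python
# Toy dataset of symbolic attribute vectors with class labels (hypothetical).
EXAMPLES = [
    (("red", "round"), "apple"),
    (("red", "long"), "pepper"),
    (("green", "round"), "apple"),
    (("yellow", "long"), "banana"),
]

WILDCARD = "*"  # a dropped (generalized) condition matches any value


def distance(cond, x):
    """Count attribute mismatches; wildcards match anything (simplified metric)."""
    return sum(c != WILDCARD and c != v for c, v in zip(cond, x))


def classify(rules, x):
    """Best-match strategy: the nearest rule wins (ties broken by list order)."""
    return min(rules, key=lambda r: distance(r[0], x))[1]


def apparent_accuracy(rules):
    """Fraction of training examples the rule set classifies correctly."""
    return sum(classify(rules, x) == y for x, y in EXAMPLES) / len(EXAMPLES)


def rise_sketch():
    # Start with every training instance as a maximally specific rule.
    rules = [(tuple(x), y) for x, y in EXAMPLES]
    improved = True
    while improved:
        improved = False
        for i in range(len(rules)):
            cond, cls = rules[i]
            # Crude generalization step: try dropping one condition at a time,
            # keeping the change if apparent accuracy does not decrease.
            for j in range(len(cond)):
                if cond[j] == WILDCARD:
                    continue
                trial = list(rules)
                trial[i] = (cond[:j] + (WILDCARD,) + cond[j + 1:], cls)
                if apparent_accuracy(trial) >= apparent_accuracy(rules):
                    rules = trial
                    cond = rules[i][0]
                    improved = True
    return rules
```

Accepting ties (`>=`) mirrors the article's bias toward simpler, more general rule sets; the loop terminates because each accepted step strictly increases the number of wildcards, which is bounded.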
References
Aha, D. W. (1990). A study of instance-based learning algorithms for supervised learning tasks: Mathematical, empirical, and psychological evaluations (Technical Report 90–42). Irvine, CA: University of California at Irvine, Department of Information and Computer Science.
Aha, D. W. (Ed.) (in press). Special issue on lazy learning. Artificial Intelligence Review.
Aha, D. W., & Bankert, R. L. (1994). Feature selection for case-based classification of cloud types: An empirical comparison. Proceedings of the 1994 AAAI Workshop on Case-Based Reasoning (pp. 106–112). Seattle, WA: AAAI.
Aha, D. W., & Goldstone, R. L. (1992). Concept learning and flexible weighting. Proceedings of the Fourteenth Annual Conference of the Cognitive Science Society (pp. 534–539). Bloomington, IN: Lawrence Erlbaum.
Aha, D. W., Kibler, D., & Albert, M. K. (1991). Instance-based learning algorithms. Machine Learning, 6, 37–66.
Atkeson, C. G., Moore, A. W., & Schaal, S. (in press). Locally weighted learning. Artificial Intelligence Review.
Belew, R. K., McInerney, J., & Schraudolph, N. N. (1992). Evolving networks: Using the genetic algorithm with connectionist learning. In C. G. Langton, J. Taylor, J. D. Farmer, & S. Rasmussen (Eds.), Artificial Life II. Redwood City, CA: Addison-Wesley.
Biberman, Y. (1994). A context similarity measure. Proceedings of the Ninth European Conference on Machine Learning (pp. 49–63). Catania, Italy: Springer-Verlag.
Booker, L. B., Goldberg, D. E., & Holland, J. H. (1989). Classifier systems and genetic algorithms. Artificial Intelligence, 40, 235–282.
Brodley, C. E. (1995). Recursive automatic bias selection for classifier construction. Machine Learning, 20, 63–94.
Buntine, W. (1989). Learning classification rules using Bayes. Proceedings of the Sixth International Workshop on Machine Learning (pp. 94–98). Ithaca, NY: Morgan Kaufmann.
Cameron-Jones, R. M. (1992). Minimum description length instance-based learning. Proceedings of the Fifth Australian Joint Conference on Artificial Intelligence (pp. 368–373). Hobart, Australia: World Scientific.
Catlett, J. (1991). Megainduction: A test flight. Proceedings of the Eighth International Conference on Machine Learning (pp. 589–604). Evanston, IL: Morgan Kaufmann.
Clark, P., & Boswell, R. (1991). Rule induction with CN2: Some recent improvements. Proceedings of the Sixth European Working Session on Learning (pp. 151–163). Porto, Portugal: Springer-Verlag.
Clark, P., & Niblett, T. (1989). The CN2 induction algorithm. Machine Learning, 3, 261–283.
Cohen, W. W. (1995). Fast effective rule induction. Proceedings of the Twelfth International Conference on Machine Learning (pp. 115–123). Tahoe City, CA: Morgan Kaufmann.
Cost, S., & Salzberg, S. (1993). A weighted nearest neighbor algorithm for learning with symbolic features. Machine Learning, 10, 57–78.
Cover, T. M., & Hart, P. E. (1967). Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13, 21–27.
DeGroot, M. H. (1986). Probability and statistics (Second Edition). Reading, MA: Addison-Wesley.
Domingos, P. (1994). The RISE system: Conquering without separating. Proceedings of the Sixth IEEE International Conference on Tools with Artificial Intelligence (pp. 704–707). New Orleans, LA: IEEE Computer Society Press.
Domingos, P. (1995a). Rule induction and instance-based learning: A unified approach. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (pp. 1226–1232). Montreal, Canada: Morgan Kaufmann.
Domingos, P. (1995b). The RISE 2.0 system: A case study in multistrategy learning (Technical Report 95–2). Irvine, CA: University of California at Irvine, Department of Information and Computer Science.
Domingos, P. (1995c). Two-way induction. Proceedings of the Seventh IEEE International Conference on Tools with Artificial Intelligence (pp. 182–189). Herndon, VA: IEEE Computer Society Press.
Domingos, P. (in press). Context-sensitive feature selection for lazy learners. Artificial Intelligence Review.
Duda, R. O., & Hart, P. E. (1973). Pattern classification and scene analysis. New York, NY: Wiley.
Golding, A. R., & Rosenbloom, P. S. (1991). Improving rule-based systems through case-based reasoning. Proceedings of the Ninth National Conference on Artificial Intelligence (pp. 22–27). Anaheim, CA: AAAI Press.
Holte, R. C. (1993). Very simple classification rules perform well on most commonly used datasets. Machine Learning, 11, 63–91.
Holte, R. C., Acker, L. E., & Porter, B. W. (1989). Concept learning and the problem of small disjuncts. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (pp. 813–818). Detroit, MI: Morgan Kaufmann.
Kelly, J. D., & Davis, L. (1991). A hybrid genetic algorithm for classification. Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (pp. 645–650). Sydney, Australia: Morgan Kaufmann.
Kohavi, R., & Li, C. (1995). Oblivious decision trees, graphs, and top-down pruning. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence (pp. 1071–1077). Montreal, Canada: Morgan Kaufmann.
Kolodner, J. (1993). Case-based reasoning. San Mateo, CA: Morgan Kaufmann.
Michalski, R. S. (1983). A theory and methodology of inductive learning. Artificial Intelligence, 20, 111–161.
Michalski, R. S., Mozetic, I., Hong, J., & Lavrac, N. (1986). The multi-purpose incremental learning system AQ15 and its testing application to three medical domains. Proceedings of the Fifth National Conference on Artificial Intelligence (pp. 1041–1045). Philadelphia, PA: AAAI Press.
Michalski, R. S., & Tecuci, G. (Eds.), (1993). Proceedings of the Second International Workshop on Multistrategy Learning. Harpers Ferry, VA: Office of Naval Research/George Mason University.
Michalski, R. S., & Tecuci, G. (Eds.) (1994). Machine learning: A multistrategy approach. San Mateo, CA: Morgan Kaufmann.
Mitchell, T. M. (1980). The need for biases in learning generalizations (Technical Report). New Brunswick, NJ: Rutgers University, Computer Science Department.
Mohri, T., & Tanaka, H. (1994). An optimal weighting criterion of case indexing for both numeric and symbolic attributes. Proceedings of the 1994 AAAI Workshop on Case-Based Reasoning (pp. 123–127). Seattle, WA: AAAI.
Murphy, P. M., & Aha, D. W. (1995). UCI repository of machine learning databases (Machine-readable data repository). Irvine, CA. University of California, Department of Information and Computer Science.
Niblett, T. (1987). Constructing decision trees in noisy domains. Proceedings of the Second European Working Session on Learning (pp. 67–78). Bled, Yugoslavia: Sigma.
Oliveira, A. L., & Sangiovanni-Vincentelli, A. (1995). Inferring reduced ordered decision graphs of minimum description length. Proceedings of the Twelfth International Conference on Machine Learning (pp. 421–429). Tahoe City, CA: Morgan Kaufmann.
Ourston, D., & Mooney, R. J. (1994). Theory refinement combining analytical and empirical methods. Artificial Intelligence, 66, 273–309.
Pagallo, G., & Haussler, D. (1990). Boolean feature discovery in empirical learning. Machine Learning, 3, 71–99.
Pazzani, M., & Kibler, D. (1992). The utility of knowledge in inductive learning. Machine Learning, 9, 57–94.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81–106.
Quinlan, J. R. (1987). Generating production rules from decision trees. Proceedings of the Tenth International Joint Conference on Artificial Intelligence (pp. 304–307). Milan, Italy: Morgan Kaufmann.
Quinlan, J. R. (1990). Learning logical definitions from relations. Machine Learning, 5, 239–266.
Quinlan, J. R. (1993a). C4.5: Programs for Machine Learning. San Mateo, CA: Morgan Kaufmann.
Quinlan, J. R. (1993b). Combining instance-based and model-based learning. Proceedings of the Tenth International Conference on Machine Learning (pp. 236–243). Amherst, MA: Morgan Kaufmann.
Rao, R. B., Gordon, D., & Spears, W. (1995). For every generalization action, is there really an equal and opposite reaction? Analysis of the conservation law for generalization performance. Proceedings of the Twelfth International Conference on Machine Learning (pp. 471–479). Tahoe City, CA: Morgan Kaufmann.
Rendell, L. (1986). A general framework for induction and a study of selective induction. Machine Learning, 1, 177–226.
Riesbeck, C. K., & Schank, R. C. (1989). Inside case-based reasoning. Hillsdale, NJ: Lawrence Erlbaum.
Rivest, R. L. (1987). Learning decision lists. Machine Learning, 2, 229–246.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel distributed processing: Explorations in the microstructure of cognition (Vol. 2). Cambridge, MA: MIT Press.
Salzberg, S. (1991). A nearest hyperrectangle learning method. Machine Learning, 6, 251–276.
Schaffer, C. (1994a). Cross-validation, stacking, and bi-level stacking: Meta-methods for classification learning. In P. Cheeseman & R. W. Oldford (Eds.), Selecting models from data: Artificial intelligence and statistics IV. New York, NY: Springer-Verlag.
Schaffer, C. (1994b). A conservation law for generalization performance. Proceedings of the Eleventh International Conference on Machine Learning (pp. 259–265). New Brunswick, NJ: Morgan Kaufmann.
Scott, P. D., & Sage, K. H. (1992). Why generalize? Hybrid representations and instance-based learning. Proceedings of the Tenth European Conference on Artificial Intelligence (pp. 484–486). Vienna, Austria: Wiley.
Smyth, P. R., Goodman, M., & Higgins, C. (1990). A hybrid rule-based/Bayesian classifier. Proceedings of the Seventh European Conference on Artificial Intelligence (pp. 610–615). Stockholm, Sweden: Springer-Verlag.
Smyth, P., Gray, A., & Fayyad, U. (1995). Retrofitting decision tree classifiers using kernel density estimation. Proceedings of the Twelfth International Conference on Machine Learning (pp. 506–514). Tahoe City, CA: Morgan Kaufmann.
Stanfill, C., & Waltz, D. (1986). Toward memory-based reasoning. Communications of the ACM, 29, 1213–1228.
Ting, K. M. (1994). Discretization of continuous-valued attributes and instance-based learning (Technical Report 491). Sydney, Australia: Basser Department of Computer Science, University of Sydney.
Towell, G. G., & Shavlik, J. W. (1994). Knowledge-based artificial neural networks. Artificial Intelligence, 70, 119–165.
Townsend-Weber, T., & Kibler, D. (1994). Instance-based prediction of continuous values. Proceedings of the 1994 AAAI Workshop on Case-Based Reasoning (pp. 30–35). Seattle, WA: AAAI.
Utgoff, P. E. (1989a). Incremental induction of decision trees. Machine Learning, 4, 161–186.
Utgoff, P. E. (1989b). Perceptron trees: A case study in hybrid concept representations. Connection Science, 1, 377–391.
Wettschereck, D. (1994). A hybrid nearest-neighbor and nearest-hyperrectangle algorithm. Proceedings of the Ninth European Conference on Machine Learning (pp. 323–335). Catania, Italy: Springer-Verlag.
Wettschereck, D., & Dietterich, T. (1995). An experimental comparison of the nearest-neighbor and nearest-hyperrectangle algorithms. Machine Learning, 19, 5–27.
Wnek, J., & Michalski, R. S. (1994). Hypothesis-driven constructive induction in AQ17-HCI: A method and experiments. Machine Learning, 14, 139–168.
Zhang, J. (1990). A method that combines inductive learning with exemplar-based learning. Proceedings of the Second IEEE International Conference on Tools for Artificial Intelligence (pp. 31–37). San Jose, CA: IEEE Computer Society Press.
Domingos, P. Unifying instance-based and rule-based induction. Mach Learn 24, 141–168 (1996). https://doi.org/10.1007/BF00058656