Abstract
We introduce the notion of a “partial Occam algorithm”. A partial Occam algorithm produces a succinct hypothesis that is partially consistent with the given examples, where the proportion of consistent examples is slightly more than half. Using this new notion, we propose one approach for obtaining a PAC learning algorithm: a partial Occam algorithm is equivalent to a weak PAC learning algorithm, so by applying boosting techniques we can obtain an ordinary PAC learning algorithm from it. We demonstrate with some examples that this approach yields improvements, in particular in the hypothesis size. First, we obtain a non-proper PAC learning algorithm for k-DNF, which has sample complexity similar to Littlestone's Winnow, but which produces a hypothesis of size polynomial in d and log n for a k-DNF target with n variables and d terms (cf. the hypothesis size of Winnow is O(n^k)). Next, we show that 1-decision lists of length d with n variables are non-properly PAC learnable using O((1/ε)(log(1/δ) + 16^d log n (d + log log n)^2)) examples, within time polynomial in n, 2^d, 1/ε, and log(1/δ). Again, while this sample complexity is similar to Winnow's, we improve the hypothesis size. We also point out that our algorithms are robust against random classification noise.
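The reduction sketched in the abstract hinges on boosting: a hypothesis correct on slightly more than half of the examples can be amplified into one correct on almost all of them. The snippet below is not the paper's construction (the paper invokes boosting techniques such as Schapire's and Freund's); it is a minimal AdaBoost-style illustration, assuming boolean examples and a toy "weak learner" that searches single-literal stumps and constant hypotheses for one that is partially consistent with the weighted sample:

```python
import math

def weak_learner(examples, weights):
    """Return the candidate hypothesis with the lowest weighted error.

    Candidates are the two constant hypotheses plus all single-literal
    tests x[i] == val -- a toy stand-in for a 'partial Occam' hypothesis
    that is consistent with a bit more than half of the examples.
    """
    n = len(examples[0][0])
    candidates = [lambda x: True, lambda x: False]
    for i in range(n):
        for val in (0, 1):
            # default args freeze i and val for each lambda
            candidates.append(lambda x, i=i, val=val: x[i] == val)
    best_h, best_err = None, None
    for h in candidates:
        err = sum(w for (x, y), w in zip(examples, weights) if h(x) != y)
        if best_err is None or err < best_err:
            best_h, best_err = h, err
    return best_h, best_err

def boost(examples, rounds=100):
    """Combine weak hypotheses into a weighted-majority strong hypothesis."""
    m = len(examples)
    weights = [1.0 / m] * m
    ensemble = []
    for _ in range(rounds):
        h, err = weak_learner(examples, weights)
        err = min(max(err, 1e-12), 1 - 1e-12)
        if err >= 0.5:          # no weak advantage left
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Reweight so that misclassified examples gain importance.
        weights = [w * math.exp(alpha if h(x) != y else -alpha)
                   for (x, y), w in zip(examples, weights)]
        z = sum(weights)
        weights = [w / z for w in weights]

    def strong(x):
        s = sum(a if h(x) else -a for a, h in ensemble)
        return s > 0
    return strong
```

For instance, the target x0 OR x2 over four boolean variables is not expressible by any single stump (the best stump is wrong on a quarter of the examples), yet the boosted weighted majority classifies all 16 examples correctly.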
References
H. Almuallim and T.G. Dietterich. Learning Boolean concepts in the presence of many irrelevant features. Artificial Intelligence, 69(1–2):279–306, 1994.
J.A. Aslam and S.E. Decatur. General bounds on statistical query learning and PAC learning with noise via hypothesis boosting. In Proceedings of the 34th IEEE Symposium on Foundations of Computer Science, 282–291, 1993.
D. Angluin and P.D. Laird. Learning from noisy examples. Machine Learning, 2(4):343–370, 1988.
A. Blumer, A. Ehrenfeucht, D. Haussler, and M.K. Warmuth. Occam's razor. Information Processing Letters, 24:377–380, 1987.
R. Board and L. Pitt. On the necessity of Occam algorithms. Theoretical Computer Science, 100:157–184, 1992.
A. Dhagat and L. Hellerstein. PAC learning with irrelevant attributes. In Proceedings of the 35th IEEE Symposium on Foundations of Computer Science, 64–74, 1994.
Y. Freund. Boosting a weak learning algorithm by majority. Information and Computation, 121(2):256–285, 1995.
J. Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. In Proceedings of the 35th IEEE Symposium on Foundations of Computer Science, 42–53, 1994.
T. Hancock, T. Jiang, M. Li, and J. Tromp. Lower bounds on learning decision lists and trees. In Proceedings of the 12th Annual Symposium on Theoretical Aspects of Computer Science, Lecture Notes in Computer Science, 527–538, 1995.
D. Haussler. Quantifying inductive bias: AI learning algorithms and Valiant's learning framework. Artificial Intelligence, 36:177–221, 1988.
D. Helmbold and M.K. Warmuth. On weak learning. Journal of Computer and System Sciences, 50(3):551–573, 1995.
M. Kearns. Efficient noise-tolerant learning from statistical queries. In Proceedings of the 25th Annual ACM Symposium on Theory of Computing, 392–401, 1993.
M.J. Kearns, M. Li and L.G. Valiant. Learning Boolean formulas. Journal of the ACM, 41(6):1298–1328, 1994.
M.J. Kearns and U.V. Vazirani. An Introduction to Computational Learning Theory. Cambridge University Press, 1994.
P.D. Laird. Learning from Good and Bad Data. Kluwer international series in engineering and computer science. Kluwer Academic Publishers, Boston, 1988.
N. Littlestone. Learning when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 2:285–318, 1988.
N. Littlestone. From on-line to batch learning. In Proceedings of the 2nd Workshop on Computational Learning Theory, 269–284, 1990.
R.L. Rivest. Learning decision lists. Machine Learning, 2(3):229–246, 1987.
R.E. Schapire. The strength of weak learnability. Machine Learning, 5(2):197–227, 1990.
L.G. Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134–1142, 1984.
L.G. Valiant. Learning disjunctions of conjunctions. In Proceedings of the 9th International Joint Conference on Artificial Intelligence, 560–566, 1985.
M.K. Warmuth. Posted in the COLT mailing list.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Domingo, C., Tsukiji, T., Watanabe, O. (1997). Partial occam's razor and its applications. In: Li, M., Maruoka, A. (eds) Algorithmic Learning Theory. ALT 1997. Lecture Notes in Computer Science, vol 1316. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63577-7_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63577-2
Online ISBN: 978-3-540-69602-5
eBook Packages: Springer Book Archive