×

Real-valued multiple-instance learning with queries. (English) Zbl 1101.68618

Summary: While there has been a significant amount of theoretical and empirical research on the multiple-instance learning model, most of this research is for concept learning. However, for the important application area of drug discovery, a real-valued classification is preferable. In this paper we initiate a theoretical study of real-valued multiple-instance learning. We prove that the problem of finding a target point consistent with a set of labeled multiple-instance examples (or bags) is NP-complete, and that the problem of learning from real-valued multiple-instance examples is as hard as learning DNF. Another contribution of our work is in defining and studying a Multiple-Instance Membership Query (MI-MQ). We give a positive result on exactly learning the target point for a multiple-instance problem in which the learner is provided with a MI-MQ oracle and a single adversarially selected bag.

MSC:

68Q32 Computational learning theory
Full Text: DOI

References:

[1] R.A. Amar, D.R. Dooly, S.A. Goldman, Q. Zhang, Multiple-instance learning of real-valued data, in: Proceedings of the Eighteenth International Conference on Machine Learning, vol. 18, San Francisco, CA, Morgan Kaufmann, Los Altos, CA, 2001, pp. 3-10.; R.A. Amar, D.R. Dooly, S.A. Goldman, Q. Zhang, Multiple-instance learning of real-valued data, in: Proceedings of the Eighteenth International Conference on Machine Learning, vol. 18, San Francisco, CA, Morgan Kaufmann, Los Altos, CA, 2001, pp. 3-10. · Zbl 1089.68589
[2] S. Andrews, I. Tsochantaridis, T. Hofmann, Support vector machines for multiple-instance learning, in: S. Becker, S. Thrun, K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15 [Neural Information Processing Systems, NIPS 2002, December 9-14, 2002, Vancouver, British Columbia, Canada], MIT Press, 2003.; S. Andrews, I. Tsochantaridis, T. Hofmann, Support vector machines for multiple-instance learning, in: S. Becker, S. Thrun, K. Obermayer (Eds.), Advances in Neural Information Processing Systems 15 [Neural Information Processing Systems, NIPS 2002, December 9-14, 2002, Vancouver, British Columbia, Canada], MIT Press, 2003. · Zbl 1014.68523
[3] P. Auer, P.M. Long, A. Srinivasan, Approximating hyper-rectangles: learning and pseudo-random sets, in: Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing, vol. 29, ACM, New York, 1997, pp. 314-323.; P. Auer, P.M. Long, A. Srinivasan, Approximating hyper-rectangles: learning and pseudo-random sets, in: Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing, vol. 29, ACM, New York, 1997, pp. 314-323. · Zbl 0962.68077
[4] Berry, R. S.; Rice, S. A.; Ross, J., Physical Chemistry (Intermolecular Forces) (1980), Wiley: Wiley New York, (Chapter 10)
[5] Blum, A.; Kalai, A., A note on learning from multiple-instance examples, Mach. Learning, 30, 23-29 (1998) · Zbl 0894.68125
[6] Dietterich, T. G.; Lathrop, R. H.; Lozano-Pérez, T., Solving the multiple-instance problem with axis-parallel rectangles, Artif. Intell., 89, 1-2, 31-71 (1997) · Zbl 1042.68650
[7] S.A. Goldman, S.D. Scott, Multiple-instance learning of real-valued geometric patterns, Ann. Math. Artif. Intell., to appear.; S.A. Goldman, S.D. Scott, Multiple-instance learning of real-valued geometric patterns, Ann. Math. Artif. Intell., to appear. · Zbl 1038.68055
[8] Jain, A. N.; Dietterich, T. G.; Lathrop, R. L.; Chapman, D.; Critchlow, R. E.; Bauer, B. E.; Webster, T. A.; Lozano-Perez, T., Compass: a shape-based machine learning tool for drug design, J. Comput. Aided Molecular Design, 8, 6, 635-652 (1994)
[9] M. Kearns, Efficient noise-tolerant learning from statistical queries, in: Proceedings of the Twenty-fifth Annual ACM Symposium on Theory of Computing, vol. 25, ACM Press, New York, NY, 1993, pp. 392-401.; M. Kearns, Efficient noise-tolerant learning from statistical queries, in: Proceedings of the Twenty-fifth Annual ACM Symposium on Theory of Computing, vol. 25, ACM Press, New York, NY, 1993, pp. 392-401. · Zbl 1310.68179
[10] Long, P. M.; Tan, L., PAC learning axis-aligned rectangles with respect to product distributions from multiple-instance examples, Mach. Learning, 30, 7-21 (1998), (Earlier version in COLT96) · Zbl 0892.68082
[11] O. Maron, Learning from ambiguity, AI technical report 1639, MIT, 1998.; O. Maron, Learning from ambiguity, AI technical report 1639, MIT, 1998.
[12] Maron, O.; Lozano-Pérez, T., A framework for multiple-instance learning, Neural Inform. Process. Systems, 10 (1998)
[13] O. Maron, A.L. Ratan, Multiple-instance learning for natural scene classification, in: Proceedings of the Fifteenth International Conference on Machine Learning, vol. 15, San Francisco, CA, Morgan Kaufmann, Los Altos, CA, 1998, pp. 341-349.; O. Maron, A.L. Ratan, Multiple-instance learning for natural scene classification, in: Proceedings of the Fifteenth International Conference on Machine Learning, vol. 15, San Francisco, CA, Morgan Kaufmann, Los Altos, CA, 1998, pp. 341-349.
[14] S. Ray, D. Page, Multiple instance regression, in: Proceedings of the Eighteenth International Conference on Machine Learning, vol. 18, San Francisco, CA, 2001, Morgan Kaufmann, Los Altos, CA, 2001, pp. 425-432.; S. Ray, D. Page, Multiple instance regression, in: Proceedings of the Eighteenth International Conference on Machine Learning, vol. 18, San Francisco, CA, 2001, Morgan Kaufmann, Los Altos, CA, 2001, pp. 425-432.
[15] G. Ruffo, Learning Single and Multiple Instance Decision Trees for Computer Security Applications, Ph.D. Thesis, Department of Computer Science, University of Turin, Torino, Italy, 2000.; G. Ruffo, Learning Single and Multiple Instance Decision Trees for Computer Security Applications, Ph.D. Thesis, Department of Computer Science, University of Turin, Torino, Italy, 2000.
[16] J. Wang, J.D. Zucker, Solving the multiple-instance problem: a lazy learning approach, in: Proceedings of the Seventeenth International Conference on Machine Learning, vol. 18, San Francisco, CA, 2000, Morgan Kaufmann, Los Altos, CA, pp. 1119-1125.; J. Wang, J.D. Zucker, Solving the multiple-instance problem: a lazy learning approach, in: Proceedings of the Seventeenth International Conference on Machine Learning, vol. 18, San Francisco, CA, 2000, Morgan Kaufmann, Los Altos, CA, pp. 1119-1125.
[17] Warmuth, M. K.; Liao, J.; Ratsch, G.; Mathieson, M.; Putta, S.; Lemmen, C., Support vector machines for active learning in the drug discovery process, J. Chem. Inform. Sci., 43, 2, 667-673 (2003)
[18] Qi Zhang, S.A. Goldman, EM-DD: an improved multiple-instance learning technique, in: NIPS 2001, 2001, pp. 1073-1080.; Qi Zhang, S.A. Goldman, EM-DD: an improved multiple-instance learning technique, in: NIPS 2001, 2001, pp. 1073-1080.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.