Accelerating Imitation Learning in Relational Domains via Transfer by Initialization

  • Conference paper
  • Conference: Inductive Logic Programming (ILP 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8812)

Abstract

The problem of learning to mimic a human expert or teacher from training trajectories is called imitation learning. To make teaching easier in this setting, we propose to employ transfer learning, where one learns on a source problem and transfers the knowledge to potentially more complex target problems. We consider multi-relational environments such as real-time strategy games and use functional-gradient boosting to capture and transfer the models learned in these environments. Our experiments demonstrate that our learner learns a very good initial model from the simple scenario and effectively transfers that knowledge to the more complex scenario, thus achieving a jump start, a steeper learning curve, and higher converged performance.
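As the abstract describes, the idea is to run functional-gradient boosting on the target task starting from the model already boosted on the source task, rather than from scratch. The paper works with relational regression trees over relational state descriptions; the sketch below is only a rough propositional analogue under my own assumptions (squared loss, scikit-learn regression trees, and the names `boost`/`predict` are illustrative, not from the paper):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def predict(trees, X, lr=0.5):
    """Sum of shrunken tree outputs: the boosted functional approximation."""
    out = np.zeros(len(X))
    for t in trees:
        out += lr * t.predict(X)
    return out

def boost(X, y, n_rounds, init_model=None, lr=0.5):
    """Functional-gradient boosting with squared loss.

    Each round fits a small regression tree to the residuals (the
    functional gradient). Passing `init_model` -- the tree ensemble
    learned on a *source* task -- makes the target learner start from
    the source model's predictions: transfer by initialization.
    """
    trees = list(init_model) if init_model else []
    for _ in range(n_rounds):
        residual = y - predict(trees, X, lr)   # pointwise functional gradient
        t = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, residual)
        trees.append(t)
    return trees

# Hypothetical usage: learn on a simple source task, then hand the
# ensemble to the target learner as its starting point.
# src_trees = boost(X_source, y_source, n_rounds=20)
# tgt_trees = boost(X_target, y_target, n_rounds=20, init_model=src_trees)
```

When source and target are related, the transferred ensemble already fits the target reasonably well at round zero, which is exactly the "jump start" the abstract reports.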


Notes

  1. http://beaversource.oregonstate.edu/projects/stratagusai


Acknowledgments

SN and PO thank Army Research Office grant number W911NF-13-1-0432 under the Young Investigator Program. SN and TK gratefully acknowledge the support of the DARPA DEFT Program under the Air Force Research Laboratory (AFRL) prime contract no. FA8750-13-2-0039. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of DARPA, AFRL, or the US government. SJ was supported by a Computing Innovations Postdoctoral Fellowship. KK was supported by the Fraunhofer ATTRACT fellowship STREAM and by the European Commission under contract number FP7-248258-First-MM. PT acknowledges the support of ONR grant N000141110106.

Author information

Corresponding author

Correspondence to Phillip Odom.


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Natarajan, S., Odom, P., Joshi, S., Khot, T., Kersting, K., Tadepalli, P. (2014). Accelerating Imitation Learning in Relational Domains via Transfer by Initialization. In: Zaverucha, G., Santos Costa, V., Paes, A. (eds) Inductive Logic Programming. ILP 2013. Lecture Notes in Computer Science, vol 8812. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44923-3_5

  • DOI: https://doi.org/10.1007/978-3-662-44923-3_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-44922-6

  • Online ISBN: 978-3-662-44923-3