Learning Multiple Timescales in Recurrent Neural Networks

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2016 (ICANN 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9886)


Abstract

Recurrent Neural Networks (RNNs) are powerful architectures for sequence learning. Recent advances in addressing the vanishing gradient problem have led to improved results and increased research interest. Among recent proposals are architectural innovations that allow multiple timescales to emerge during training. This paper explores a number of architectures on sequence generation and prediction tasks with long-term relationships. We compare the Simple Recurrent Network (SRN) and Long Short-Term Memory (LSTM) with the recently proposed Clockwork RNN (CWRNN), Structurally Constrained Recurrent Network (SCRN), and Recurrent Plausibility Network (RPN) with regard to their ability to learn multiple timescales. Our results show that partitioning hidden layers under distinct temporal constraints enables the learning of multiple timescales, which contributes to the understanding of the fundamental conditions that allow RNNs to self-organize towards accurate temporal abstractions.
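To illustrate what "partitioning hidden layers under distinct temporal constraints" means in practice, below is a minimal NumPy sketch of a Clockwork-RNN-style update in the spirit of Koutník et al. (2014). It is written for this summary, not taken from the paper: the function name cwrnn_step, the module sizes, the exponential clock periods, and the random weights are illustrative assumptions. Hidden units are grouped into modules with clock periods T_i; a module only recomputes its activations at timesteps t where t mod T_i = 0, and an (approximately) block upper-triangular recurrent matrix lets faster modules read from slower ones.

import numpy as np

def cwrnn_step(x_t, h_prev, W_in, W_h, b, periods, t):
    # Hypothetical helper (not from the paper): one Clockwork-RNN-style step.
    # Units whose module clock fires at step t (t mod T_i == 0) are recomputed;
    # all other units carry their previous activation forward unchanged.
    pre = W_h @ h_prev + W_in @ x_t + b
    h_new = np.tanh(pre)
    active = (t % periods) == 0
    return np.where(active, h_new, h_prev)

# Illustrative setup: 8 hidden units in 4 modules with exponential periods 1, 2, 4, 8.
rng = np.random.default_rng(0)
hidden_dim, input_dim = 8, 3
periods = np.repeat(np.array([1, 2, 4, 8]), 2)   # two units per module, fast to slow
W_in = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
# Upper-triangular recurrence: a unit only receives input from units with an
# equal or slower clock (larger period). A strict Clockwork RNN keeps the
# within-module blocks dense; plain triu is close enough for a sketch.
W_h = np.triu(rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)))
b = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)
for t, x_t in enumerate(rng.normal(size=(16, input_dim))):
    h = cwrnn_step(x_t, h, W_in, W_h, b, periods, t)

With periods 1, 2, 4, and 8, the slowest module changes its state only every eighth step, so its activations can preserve context over longer spans, while the fastest module tracks step-to-step detail; this is one concrete way multiple timescales can emerge from a partitioned hidden layer.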



Author information


Corresponding author

Correspondence to Tayfun Alpay.



Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Alpay, T., Heinrich, S., Wermter, S. (2016). Learning Multiple Timescales in Recurrent Neural Networks. In: Villa, A., Masulli, P., Pons Rivero, A. (eds.) Artificial Neural Networks and Machine Learning – ICANN 2016. Lecture Notes in Computer Science, vol. 9886. Springer, Cham. https://doi.org/10.1007/978-3-319-44778-0_16


  • DOI: https://doi.org/10.1007/978-3-319-44778-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-44777-3

  • Online ISBN: 978-3-319-44778-0

  • eBook Packages: Computer Science; Computer Science (R0)
