Abstract
Recurrent Neural Networks (RNNs) are powerful architectures for sequence learning. Recent advances in addressing the vanishing gradient problem have led to improved results and increased research interest. Among recent proposals are architectural innovations that allow multiple timescales to emerge during training. This paper explores a number of architectures on sequence generation and prediction tasks with long-term relationships. We compare the Simple Recurrent Network (SRN) and Long Short-Term Memory (LSTM) with the recently proposed Clockwork RNN (CWRNN), Structurally Constrained Recurrent Network (SCRN), and Recurrent Plausibility Network (RPN) with regard to their capability of learning multiple timescales. Our results show that partitioning hidden layers under distinct temporal constraints enables the learning of multiple timescales, a finding that contributes to the understanding of the fundamental conditions that allow RNNs to self-organize towards accurate temporal abstractions.
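To illustrate what "partitioning hidden layers under distinct temporal constraints" can look like in practice, the following is a minimal sketch (in NumPy) of a Clockwork-RNN-style forward pass. The module sizes, clock periods, and the use of a full recurrence matrix are assumptions made for illustration only and do not reproduce the exact setup evaluated in the paper.

import numpy as np

# Minimal sketch of a Clockwork-RNN-style hidden state update (illustrative
# assumption, not the authors' exact implementation). The hidden layer is
# partitioned into modules, each assigned an exponentially increasing clock
# period; at time step t only modules whose period divides t are updated,
# so slowly clocked modules retain information over long time spans.

rng = np.random.default_rng(0)

n_in, module_size, periods = 4, 8, [1, 2, 4, 8]   # assumed sizes and periods
n_hid = module_size * len(periods)

W_in = rng.standard_normal((n_hid, n_in)) * 0.1    # input weights
W_hh = rng.standard_normal((n_hid, n_hid)) * 0.1   # recurrent weights
h = np.zeros(n_hid)                                # initial hidden state

def step(x, h, t):
    h_new = h.copy()
    for i, period in enumerate(periods):
        if t % period == 0:                        # module i is active at step t
            sl = slice(i * module_size, (i + 1) * module_size)
            # In the original CWRNN, module i only receives recurrent input
            # from modules with equal or longer periods; the full recurrence
            # is kept here for brevity.
            h_new[sl] = np.tanh(W_in[sl] @ x + W_hh[sl] @ h)
    return h_new

for t in range(16):
    x_t = rng.standard_normal(n_in)
    h = step(x_t, h, t)

Because inactive modules simply carry their state forward unchanged, the partition enforces distinct effective timescales by construction rather than leaving them to emerge solely from training.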
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Alpay, T., Heinrich, S., Wermter, S. (2016). Learning Multiple Timescales in Recurrent Neural Networks. In: Villa, A., Masulli, P., Pons Rivero, A. (eds) Artificial Neural Networks and Machine Learning – ICANN 2016. Lecture Notes in Computer Science, vol 9886. Springer, Cham. https://doi.org/10.1007/978-3-319-44778-0_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-44777-3
Online ISBN: 978-3-319-44778-0