Realization theory of recurrent neural ODEs using polynomial system embeddings. (English) Zbl 1519.93049

Summary: In this paper we show that neural ODE analogs of recurrent (ODE-RNN) and long short-term memory (ODE-LSTM) networks can be algorithmically embedded into a class of polynomial systems. This embedding preserves input-output behavior and can suitably be extended to other neural differential equation (neural DE) architectures. We then use realization theory of polynomial systems to provide necessary conditions for an input-output map to be realizable by an ODE-LSTM and sufficient conditions for minimality of such systems. These results represent the first steps towards realization theory of recurrent neural ODE architectures, which is expected to be useful for model reduction and learning algorithm analysis of recurrent neural ODEs.
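The embedding idea can be illustrated with a minimal sketch, which is not the paper's algorithm. It rests on the assumptions that the activation is tanh, the input is held constant, and the ODE-RNN has the hypothetical form x' = tanh(A x + B u), y = C x. Since d/dt tanh(s) = (1 - tanh(s)^2) s', introducing the extra state xi = tanh(A x + B u) gives a system whose right-hand side is polynomial in (x, xi). All names below (A, B, C, u, x0, rk4, rnn_rhs, poly_rhs) are illustrative and do not come from the paper.

```python
import numpy as np

def rk4(f, z0, t_end, steps):
    """Integrate z' = f(z) from 0 to t_end with classical fixed-step RK4."""
    z = np.array(z0, dtype=float)
    h = t_end / steps
    for _ in range(steps):
        k1 = f(z)
        k2 = f(z + 0.5 * h * k1)
        k3 = f(z + 0.5 * h * k2)
        k4 = f(z + h * k3)
        z = z + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z

# Hypothetical weights, constant input and initial state (illustration only).
A = np.array([[-0.5, 0.3], [0.2, -0.4]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, -1.0]])
u = np.array([0.7])
x0 = np.array([0.1, -0.2])
n = x0.size

def rnn_rhs(x):
    # Assumed ODE-RNN form: x' = tanh(A x + B u).
    return np.tanh(A @ x + B @ u)

def poly_rhs(z):
    # Polynomial system in z = (x, xi), where xi stands for tanh(A x + B u):
    #   x'  = xi
    #   xi' = (1 - xi^2) * (A xi)   (elementwise product; u is constant, so u' = 0)
    x, xi = z[:n], z[n:]
    return np.concatenate([xi, (1.0 - xi**2) * (A @ xi)])

def simulate(t_end=2.0, steps=2000):
    # Output of the original system at time t_end.
    x_T = rk4(rnn_rhs, x0, t_end, steps)
    # The embedded system starts from xi(0) = tanh(A x0 + B u).
    z0 = np.concatenate([x0, np.tanh(A @ x0 + B @ u)])
    z_T = rk4(poly_rhs, z0, t_end, steps)
    return C @ x_T, C @ z_T[:n]

y_rnn, y_poly = simulate()
print(np.allclose(y_rnn, y_poly, atol=1e-6))
```

Under these assumptions the two outputs agree up to integration error, which is the sense in which such an embedding preserves input-output behavior. Time-varying inputs and LSTM gates need additional states and are not covered by this sketch.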

MSC:

93B15 Realizations from input-output data
93C15 Control/observation systems governed by ordinary differential equations
93B30 System identification
93B03 Attainable sets, reachability
93B07 Observability
