
On the adaptation of recurrent neural networks for system identification. (English) Zbl 1520.93100

Summary: This paper presents a transfer learning approach that enables fast and efficient adaptation of recurrent neural network (RNN) models of dynamical systems. A nominal RNN model is first identified from available measurements. The system dynamics are then assumed to change, leading to an unacceptable degradation of the nominal model's performance on the perturbed system. To cope with the mismatch, the model is augmented with an additive correction term trained on fresh data from the new dynamic regime. The correction term is learned through a Jacobian feature regression (JFR) method defined in terms of the features spanned by the model's Jacobian with respect to its nominal parameters. A non-parametric view of the approach is also proposed, which extends recent work on Gaussian process regression with the neural tangent kernel (NTK-GP) to the RNN case (RNTK-GP); this can be more efficient for very large networks or when only a few data points are available. Implementation aspects for fast and efficient computation of the correction term, as well as estimation of the RNN's initial state, are described. Numerical examples show the effectiveness of the proposed methodology in the presence of significant system variations.
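The core JFR idea can be sketched in a few lines: linearize the nominal model around its identified parameters, treat the parameter Jacobian as a feature map, and ridge-regress the residuals of the perturbed system on those features. The snippet below is an illustrative sketch only, not the paper's implementation: the scalar model `model(u, theta)` is a hypothetical stand-in for an RNN, and the perturbed data, regularization weight, and all names are assumptions made for the example.

```python
import numpy as np

# Hypothetical nominal model y = theta[0] * tanh(theta[1] * u),
# standing in for an RNN with fitted parameters theta0.
def model(u, theta):
    return theta[0] * np.tanh(theta[1] * u)

def jacobian(u, theta):
    # d model / d theta at the nominal parameters: one row of features per sample.
    t = np.tanh(theta[1] * u)
    return np.column_stack([t, theta[0] * u * (1.0 - t**2)])

theta0 = np.array([1.0, 0.5])                      # nominal (already identified) parameters
rng = np.random.default_rng(0)
u = rng.uniform(-2.0, 2.0, 200)

# Fresh data from the "perturbed" system, whose dynamics drifted from the nominal model.
y_new = 1.3 * np.tanh(0.7 * u) + 0.01 * rng.standard_normal(200)

# Jacobian feature regression: ridge-regress the residuals on the features J(u; theta0).
J = jacobian(u, theta0)
r = y_new - model(u, theta0)
lam = 1e-3                                         # ridge regularization weight (assumed)
dtheta = np.linalg.solve(J.T @ J + lam * np.eye(J.shape[1]), J.T @ r)

# Corrected prediction = nominal model + additive first-order correction term.
y_corr = model(u, theta0) + J @ dtheta
mse_nom = np.mean((y_new - model(u, theta0)) ** 2)
mse_corr = np.mean((y_new - y_corr) ** 2)
print(mse_corr < mse_nom)  # → True: the correction reduces the fitting error
```

The non-parametric view mentioned in the summary corresponds to replacing the explicit feature regression with kernel ridge regression under the (empirical) tangent kernel K = J Jᵀ, which trades an n_params × n_params solve for an n_data × n_data one and is therefore preferable when the network is very large or the fresh dataset is small.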

MSC:

93B30 System identification
68T07 Artificial neural networks and deep learning

Software:

DiffSharp; Adam; PyTorch
