Abstract
We focus on the control of unknown partial differential equations (PDEs). The system dynamics are unknown, but we assume that we can observe their evolution for a given control input, as is typical in a reinforcement learning framework. We propose an algorithm based on the idea of controlling and identifying the unknown system configuration on the fly. In this work, the control is based on the state-dependent Riccati equation (SDRE) approach, whereas the identification of the model relies on Bayesian linear regression. At each iteration, based on the observed data, we obtain an estimate of the a priori unknown parameter configuration of the PDE and then compute the control for the corresponding model. We provide numerical evidence of the convergence of the method for infinite-horizon control problems.
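The loop sketched in the abstract alternates identification and control: observe the evolution under the current input, update the parameter estimate by Bayesian linear regression, then recompute the Riccati feedback for the identified model. As an illustration only (not the authors' MATLAB implementation, available at the repository above), the following Python sketch applies this idea to a linear test case, a semi-discretized heat equation with an unknown scalar diffusion coefficient; since this test model is linear, the state-dependent Riccati equation reduces to a standard LQR Riccati feedback. All names and parameter values (`theta_true`, the noise and prior variances, the grid size) are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative setup: semi-discretized heat equation x' = theta * Lap @ x + B @ u,
# where theta (the diffusion coefficient) is the unknown parameter.
n = 8
h = 1.0 / (n + 1)
Lap = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
       + np.diag(np.ones(n - 1), -1)) / h**2
B = np.eye(n)
Q, R = np.eye(n), np.eye(n)
theta_true = 0.5                                # hidden from the controller

def blr_posterior_mean(phi, y, sigma2=1e-4, tau2=1.0):
    """Bayesian linear regression for y = theta * phi + noise,
    with scalar theta, Gaussian prior N(0, tau2), noise variance sigma2."""
    precision = phi @ phi / sigma2 + 1.0 / tau2
    return (phi @ y / sigma2) / precision

dt = 1e-3
x = np.sin(np.pi * h * np.arange(1, n + 1))     # initial state
u = np.zeros(n)
phis, ys = [], []
theta_hat = 1.0                                 # initial parameter guess
for _ in range(400):
    # "Observe" the true (unknown-to-us) evolution for the current input.
    x_new = x + dt * (theta_true * Lap @ x + B @ u)
    # Regression data: the residual x' - B u equals theta * (Lap x).
    xdot = (x_new - x) / dt
    phis.append(Lap @ x)
    ys.append(xdot - B @ u)
    theta_hat = blr_posterior_mean(np.concatenate(phis), np.concatenate(ys))
    # Riccati feedback for the currently identified model.
    P = solve_continuous_are(theta_hat * Lap, B, Q, R)
    u = -np.linalg.solve(R, B.T @ P) @ x_new
    x = x_new
```

In this noise-free linear setting the posterior mean recovers the diffusion coefficient after a few observations and the feedback drives the state toward zero; in the paper's setting, the same observe-identify-control loop is run with the SDRE for nonlinear PDEs.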
Code availability
The MATLAB source code for the implementations used to compute the presented results can be downloaded from https://github.com/alessandroalla/SDRE-RL upon request to the corresponding author.
Data availability
No data has been used in this paper.
Acknowledgements
The authors wish to express their deep gratitude to Maurizio Falcone, thanks to whom they met and began collaborating on this project.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by: Stefan Volkwein
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A. Alla and A. Pacifico are members of the INdAM-GNCS activity group. A. Alla is part of the INdAM-GNCS Project “Metodi numerici innovativi per equazioni di Hamilton-Jacobi” (CUP_E53C23001670001). The work of A.A. was carried out within the “Data-driven discovery and control of multi-scale interacting artificial agent systems” project, and received funding from the European Union NextGenerationEU - National Recovery and Resilience Plan (NRRP) - MISSION 4 COMPONENT 2, INVESTMENT 1.1 Fondo per il Programma Nazionale di Ricerca e Progetti di Rilevante Interesse Nazionale (PRIN) - Project Code P2022JC95T, CUP H53D23008920001. The work of M. Palladino is partially funded by the University of L’Aquila Starting Project Grant “Optimal Control and Applications” and by the INdAM-GNAMPA project, n. CUP_E53C22001930001.
Andrea Pesare is an independent researcher at the time this manuscript is being processed for publication.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Alla, A., Pacifico, A., Palladino, M. et al. Online identification and control of PDEs via reinforcement learning methods. Adv Comput Math 50, 85 (2024). https://doi.org/10.1007/s10444-024-10167-y
Keywords
- Reinforcement learning
- System identification
- Stabilization of PDEs
- State-dependent Riccati equations
- Bayesian linear regression
- Numerical approximation