
A framework for machine learning of model error in dynamical systems. (English) Zbl 1542.68172

Summary: The development of data-informed predictive models for dynamical systems is of widespread interest in many disciplines. We present a unifying framework for blending mechanistic and machine-learning approaches to identify dynamical systems from noisy and partially observed data. We compare purely data-driven learning with hybrid models that incorporate imperfect domain knowledge, referring to the discrepancy between an assumed truth model and the imperfect mechanistic model as model error. Our formulation is agnostic to the chosen machine-learning model, is presented in both continuous- and discrete-time settings, and is compatible both with model errors that exhibit substantial memory and with errors that are memoryless.
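As a minimal illustration of the hybrid setup (not code from the paper: the missing-term Lorenz '63 model, the quadratic feature dictionary, and the least-squares fit are all assumptions chosen for illustration), one can fit a parametric model error \(m_\theta\) so that an imperfect vector field \(f_0 + m_\theta\) matches the truth:

```python
import numpy as np

# True Lorenz '63 vector field and an imperfect mechanistic model f0
# (f0 omits the x*y coupling in the third equation; purely illustrative).
def f_true(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def f0(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, -beta * z])

# Classic fourth-order Runge-Kutta step to generate training states.
def rk4_step(f, u, dt):
    k1 = f(u); k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2); k4 = f(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

u = np.array([1.0, 1.0, 1.0])
states = []
for _ in range(2000):
    u = rk4_step(f_true, u, 0.01)
    states.append(u.copy())
X = np.array(states)

# Model error m(u) = f_true(u) - f0(u), regressed on quadratic features.
Y = np.array([f_true(u) - f0(u) for u in X])
def features(u):
    x, y, z = u
    return np.array([1.0, x, y, z, x * y, x * z, y * z])
Phi = np.array([features(u) for u in X])
theta, *_ = np.linalg.lstsq(Phi, Y, rcond=None)

# Relative residual of the learned model error on the training data;
# the hybrid model f0 + features(u) @ theta recovers the missing x*y term.
resid = np.linalg.norm(Phi @ theta - Y) / np.linalg.norm(Y)
```

Because the discrepancy (here exactly \(xy\)) lies in the span of the feature dictionary, the fit is essentially exact; in general the dictionary, or a neural network replacing it, only approximates the model error.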
First, we study memoryless model error that is linear in its parametric dependence from a learning-theory perspective, defining excess risk and generalization error. For ergodic continuous-time systems, we prove that both the excess risk and the generalization error are bounded above by terms that diminish with the square root of \(T\), the length of the time interval over which training data are specified.
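In schematic form (the symbols \(\widehat{R}_T\), \(R\), \(f^\dagger\), \(\mu\), \(\hat{\theta}_T\) and the constants \(C_i\) are notation assumed here for illustration, not taken verbatim from the paper), the quantities and rates read:

```latex
% Empirical risk over the training window [0,T] and its ergodic-average limit;
% f_0 is the imperfect mechanistic model, m_\theta the parametric error model.
\[
  \widehat{R}_T(\theta)
    = \frac{1}{T}\int_0^T \bigl\|\dot{x}(t) - f_0(x(t)) - m_\theta(x(t))\bigr\|^2 \,\mathrm{d}t,
  \qquad
  R(\theta)
    = \mathbb{E}_{x \sim \mu}\,\bigl\|f^\dagger(x) - f_0(x) - m_\theta(x)\bigr\|^2,
\]
% with \mu the invariant measure of the ergodic truth dynamics
% \dot{x} = f^\dagger(x).  For the empirical minimizer \hat{\theta}_T,
% the bounds then take the schematic form
\[
  \underbrace{R(\hat{\theta}_T) - \inf_{\theta} R(\theta)}_{\text{excess risk}}
    \le \frac{C_1}{\sqrt{T}},
  \qquad
  \underbrace{\sup_{\theta}\,\bigl|\widehat{R}_T(\theta) - R(\theta)\bigr|}_{\text{generalization error}}
    \le \frac{C_2}{\sqrt{T}}.
\]
```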
Second, we study scenarios that benefit from modeling with memory, proving universal approximation theorems for two classes of continuous-time recurrent neural networks (RNNs): both can learn memory-dependent model error, provided that it is governed by a finite-dimensional hidden variable and that, together, the observed and hidden variables form a continuous-time Markovian system. In addition, we connect one class of RNNs to reservoir computing, thereby relating the learning of memory-dependent error to recent work on supervised learning between Banach spaces using random features.
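The reservoir-computing connection can be sketched as follows (a toy example, not the paper's configuration: the reservoir dimension, weight scaling, and exponentially fading-memory target are illustrative assumptions). Only the linear readout is trained; the recurrent weights stay random, as in random-feature methods:

```python
import numpy as np

rng = np.random.default_rng(1)
d_r, dt, n_steps = 200, 0.01, 5000

# Random, untrained reservoir weights: recurrence A and input map B.
A = rng.normal(0.0, 1.0 / np.sqrt(d_r), (d_r, d_r))
B = rng.normal(0.0, 1.0, d_r)

# Scalar input signal and a memory-dependent target: an exponentially
# fading integral of the input, dy/dt = -y + x (a simple stand-in for
# memory-dependent model error).
t = np.arange(n_steps) * dt
x = np.sin(t) + 0.5 * np.sin(2.3 * t)
target = np.zeros(n_steps)
for k in range(1, n_steps):
    target[k] = target[k - 1] + dt * (-target[k - 1] + x[k - 1])

# Continuous-time reservoir dr/dt = -r + tanh(A r + B x), Euler-discretized.
R = np.zeros((n_steps, d_r))
r = np.zeros(d_r)
for k in range(1, n_steps):
    r = r + dt * (-r + np.tanh(A @ r + B * x[k - 1]))
    R[k] = r

# Ridge-regress the readout C on the second half (transients washed out).
burn = n_steps // 2
G = R[burn:]
C = np.linalg.solve(G.T @ G + 1e-6 * np.eye(d_r), G.T @ target[burn:])
err = np.linalg.norm(G @ C - target[burn:]) / np.linalg.norm(target[burn:])
```

Because the readout is linear in the randomly generated reservoir states, training reduces to a least-squares problem, which is the sense in which memory-dependent learning here resembles random-feature regression.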
Numerical results are presented for the Lorenz '63 and multiscale Lorenz '96 systems to compare purely data-driven and hybrid approaches, finding hybrid methods to be less data-hungry and more parametrically efficient. We also find that, while a continuous-time framing allows for robustness to irregular sampling and desirable domain interpretability, a discrete-time framing can provide similar or better predictive performance, especially when data are undersampled and the vector field defining the true dynamics cannot be identified. Finally, we demonstrate numerically how data assimilation can be leveraged to learn hidden dynamics from noisy, partially observed data, and we illustrate the challenges both in representing memory by this approach and in training such models.
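The role data assimilation plays can be illustrated in miniature (a sketch under stated assumptions, not the paper's method: the fixed gain corresponds to the synchronization-filter limit of 3DVAR, and the noise level and initial estimate are arbitrary choices). Observing only the first Lorenz '63 component, with noise, still recovers the hidden components:

```python
import numpy as np

# Lorenz '63 vector field and an RK4 integrator step.
def f(u, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = u
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def step(u, dt=0.01):
    k1 = f(u); k2 = f(u + 0.5 * dt * k1)
    k3 = f(u + 0.5 * dt * k2); k4 = f(u + dt * k3)
    return u + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

rng = np.random.default_rng(2)
H = np.array([[1.0, 0.0, 0.0]])      # observe x only, with noise
K = np.array([[1.0], [0.0], [0.0]])  # fixed gain: replace x by its observation

u_true = np.array([1.0, 2.0, 20.0])
u_hat = np.array([-5.0, -5.0, 5.0])  # deliberately poor initial estimate
for _ in range(5000):
    u_true = step(u_true)
    y_obs = H @ u_true + 0.1 * rng.normal()
    # Forecast with the model, then correct with the innovation.
    u_hat = step(u_hat)
    u_hat = u_hat + K @ (y_obs - H @ u_hat)

# Relative state-estimation error after assimilating 50 time units of data.
err = np.linalg.norm(u_hat - u_true) / np.linalg.norm(u_true)
```

The unobserved components synchronize with the truth because the x-driven Lorenz subsystem is contracting; in the paper's setting, such assimilated estimates of hidden states are what make training on partially observed data possible.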

MSC:

68T05 Learning and adaptive systems in artificial intelligence
37A30 Ergodic theorems, spectral theory, Markov operators
37M10 Time series analysis of dynamical systems
41A05 Interpolation in approximation theory

References:

[1] Alexander, Romeo, Operator-theoretic framework for forecasting nonlinear time series with kernel analog techniques, Phys. D, 132520, 24 pp. (2020) · Zbl 1496.37085 · doi:10.1016/j.physd.2020.132520
[2] Ranjan Anantharaman, Yingbo Ma, Shashi Gowda, Chris Laughman, Viral Shah, Alan Edelman, and Chris Rackauckas, Accelerating simulation of stiff nonlinear systems using continuous-time echo state networks, https://arxiv.org/abs/2010.04004v6, 2020.
[3] Asch, Mark, Data assimilation, Fundamentals of Algorithms, xvii+306 pp. (2016), Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA · Zbl 1361.93001 · doi:10.1137/1.9781611974546.pt1
[4] Ibrahim Ayed, Emmanuel de B\'ezenac, Arthur Pajot, Julien Brajard, and Patrick Gallinari, Learning dynamical systems from partial observations, Second Workshop on Machine Learning and the Physical Sciences (NeurIPS 2019), Vancouver, Canada, February 2019. · Zbl 07570161
[5] Yunhao Ba, Guangyuan Zhao, and Achuta Kadambi, Blending diverse physical priors with neural networks, 1910.00201, 2019.
[6] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio, Neural machine translation by jointly learning to align and translate, 1409.0473, 2016.
[7] Bahsoun, Wael, Variance continuity for Lorenz flows, Ann. Henri Poincar\'{e}, 1873-1892 (2020) · Zbl 1448.37007 · doi:10.1007/s00023-020-00913-5
[8] Randall D. Beer, On the dynamics of small continuous-time recurrent neural networks, Adapt. Behav. 3 (1995), no. 4, 469-509, http://journals.sagepub.com/doi/10.1177/105971239500300405.
[9] Bensoussan, A., Asymptotic analysis for periodic structures, xii+398 pp. (2011), AMS Chelsea Publishing, Providence, RI · Zbl 1229.35001 · doi:10.1090/chel/374
[10] Jos\'e Bento, Morteza Ibrahimi, and Andrea Montanari, Information theoretic limits on learning stochastic differential equations, 2011 IEEE International Symposium on Information Theory Proceedings, IEEE, 2011, pp. 855-859.
[11] Beucler, Tom, Enforcing analytic constraints in neural networks emulating physical systems, Phys. Rev. Lett., Paper No. 098302, 7 pp. (2021) · doi:10.1103/physrevlett.126.098302
[12] Bhattacharya, Kaushik, Model reduction and neural networks for parametric PDEs, SMAI J. Comput. Math., 121-157 (2021) · Zbl 1481.65260
[13] Marc Bocquet, Julien Brajard, Alberto Carrassi, and Laurent Bertino, Bayesian inference of chaotic dynamics by merging data assimilation, machine learning and expectation-maximization, Found. Data Sci. 2 (2020), no. 1, 55, https://www.aimsciences.org/article/doi/10.3934/fods.2020004.
[14] Borra, Francesco, Effective models and predictability of chaotic multiscale systems via machine learning, Phys. Rev. E, 052203, 11 pp. (2020) · doi:10.1103/physreve.102.052203
[15] Brajard, Julien, Combining data assimilation and machine learning to emulate a dynamical model from sparse and noisy observations: a case study with the Lorenz 96 model, J. Comput. Sci., 101171, 11 pp. (2020) · doi:10.1016/j.jocs.2020.101171
[16] Brajard, Julien, Combining data assimilation and machine learning to infer unresolved scale parametrization, Philos. Trans. Roy. Soc. A, Paper No. 20200086, 16 pp. (2021) · doi:10.1098/rsta.2020.0086
[17] Leo Breiman, Bagging predictors, Mach. Learn., 24 (1996), no. 2, 123-140. · Zbl 0858.68080
[18] N. D. Brenowitz and C. S. Bretherton, Prognostic validation of a neural network unified physics parameterization, Geophys. Res. Lett. 45, no. 12, 6289-6298, 2018. https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2018GL078510.
[19] Brunton, Steven L., Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proc. Natl. Acad. Sci. USA, 3932-3937 (2016) · Zbl 1355.94013 · doi:10.1073/pnas.1517384113
[20] Burov, Dmitry, Kernel analog forecasting: multiscale test problems, Multiscale Model. Simul., 1011-1040 (2021) · Zbl 1487.60088 · doi:10.1137/20M1338289
[21] Champion, Kathleen, Data-driven discovery of coordinates and governing equations, Proc. Natl. Acad. Sci. USA, 22445-22451 (2019) · Zbl 1433.68396 · doi:10.1073/pnas.1906995116
[22] Bo Chang, Minmin Chen, Eldad Haber, and Ed H. Chi, AntisymmetricRNN: A dynamical system view on recurrent neural networks, 1902.09689, 2019.
[23] Ashesh Chattopadhyay, Pedram Hassanzadeh, and Devika Subramanian, Data-driven predictions of a multiscale Lorenz 96 chaotic system using machine-learning methods: reservoir computing, artificial neural network, and long short-term memory network, Nonlinear Process. Geophys. 27 (2020), no. 3, 373-389, https://npg.copernicus.org/articles/27/373/2020/.
[24] Ashesh Chattopadhyay, Adam Subel, and Pedram Hassanzadeh, Data-driven super-parameterization using deep learning: experimentation with multiscale Lorenz 96 systems and transfer learning. J. Adv. Model. Earth Sys. 12 (2020), no. 11, e2020MS002084, https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2020MS002084. · Zbl 07527717
[25] Chen, Yifan, Solving and learning nonlinear PDEs with Gaussian processes, J. Comput. Phys., Paper No. 110668, 29 pp. (2021) · Zbl 07516428 · doi:10.1016/j.jcp.2021.110668
[26] Chen, Yuming, Autodifferentiable ensemble Kalman filters, SIAM J. Math. Data Sci., 801-833 (2022) · Zbl 1493.62499 · doi:10.1137/21M1434477
[27] Chkrebtii, Oksana A., Bayesian solution uncertainty quantification for differential equations, Bayesian Anal., 1239-1267 (2016) · Zbl 1357.62108 · doi:10.1214/16-BA1017
[28] Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio, On the properties of neural machine translation: encoder-decoder approaches, 1409.1259, 2014.
[29] Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, Learning phrase representations using rnn encoder-decoder for statistical machine translation, 1406.1078, 2014.
[30] Alexandre J. Chorin and Fei Lu, Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics, Proc. Natl. Acad. Sci. 112 (2015), no. 32, 9804-9809, https://www.pnas.org/content/112/32/9804.
[31] Chorin, Alexandre J., Optimal prediction and the Mori-Zwanzig representation of irreversible processes, Proc. Natl. Acad. Sci. USA, 2968-2973 (2000) · Zbl 0968.60036 · doi:10.1073/pnas.97.7.2968
[32] Colton, David, Inverse acoustic and electromagnetic scattering theory, Applied Mathematical Sciences, xiv+405 pp. (2013), Springer, New York · Zbl 1266.35121 · doi:10.1007/978-1-4614-4942-3
[33] Wahba, Grace, Smoothing noisy data with spline functions, Numer. Math., 383-393 (1975) · Zbl 0299.65008 · doi:10.1007/BF01437407
[34] Cybenko, G., Approximation by superpositions of a sigmoidal function, Math. Control Signals Systems, 303-314 (1989) · Zbl 0679.94019 · doi:10.1007/BF02551274
[35] Eric Darve, Jose Solomon, and Amirali Kia, Computing generalized Langevin equations and generalized Fokker-Planck equations, Proc. Natl. Acad. Sci. 106 (2009), no. 27, 10884-10889.
[36] DeVore, Ronald A., Constructive approximation, Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], x+449 pp. (1993), Springer-Verlag, Berlin · Zbl 0797.41016
[37] Jonathan Dong, Ruben Ohana, Mushegh Rafayelyan, and Florent Krzakala, Reservoir computing meets recurrent kernels and structured transforms, 2006.07310, 2020.
[38] Dormand, J. R., A family of embedded Runge-Kutta formulae, J. Comput. Appl. Math., 19-26 (1980) · Zbl 0448.65045 · doi:10.1016/0771-050X(80)90013-3
[39] Du, Qiang, The discovery of dynamics via linear multistep methods and deep learning: error estimation, SIAM J. Numer. Anal., 2014-2045 (2022) · Zbl 1506.65105 · doi:10.1137/21M140691X
[40] Duraisamy, Karthik, Annual review of fluid mechanics. Vol. 51. Turbulence modeling in the age of data, Annu. Rev. Fluid Mech., 357-377 (2019), Annual Reviews, Palo Alto, CA · Zbl 1412.76040
[41] E, Weinan, A priori estimates of the population risk for two-layer neural networks, Commun. Math. Sci., 1407-1425 (2019) · Zbl 1427.68277 · doi:10.4310/CMS.2019.v17.n5.a11
[42] N. Benjamin Erichson, Omri Azencot, Alejandro Queiruga, Liam Hodgkinson, and Michael W. Mahoney, Lipschitz recurrent neural networks, 2006.12070, 2020.
[43] Alban Farchi, Patrick Laloyaux, Massimo Bonavita, and Marc Bocquet, Using machine learning to correct model error in data assimilation and forecast applications, 2010.12605, 2021.
[44] Fatkullin, Ibrahim, A computational strategy for multiscale systems with applications to Lorenz 96 model, J. Comput. Phys., 605-638 (2004) · Zbl 1058.65065 · doi:10.1016/j.jcp.2004.04.013
[45] Freno, Brian A., Machine-learning error models for approximate solutions to parameterized systems of nonlinear equations, Comput. Methods Appl. Mech. Engrg., 250-296 (2019) · Zbl 1440.65058 · doi:10.1016/j.cma.2019.01.024
[46] Roger Frigola, Yutian Chen, and Carl Edward Rasmussen, Variational Gaussian process state-space models, Adv. Neural Inform. Process. Systems 27 (2014), https://proceedings.neurips.cc/paper/2014/hash/139f0874f2ded2e41b0393c4ac5644f7-Abstract.html.
[47] Funahashi, Ken-ichi, Proceedings of the 5th International Colloquium on Differential Equations, Vol. 2. Approximation theory, dynamical systems, and recurrent neural networks, 51-58 (1994), Sci. Cult. Technol. Publ., Singapore · Zbl 0882.34051
[48] Daniel J. Gauthier, Erik Bollt, Aaron Griffith, and Wendson A. S. Barbosa, Next generation reservoir computing, Nat. Comm. 12 (2021), no. 1, 5564, ISSN 2041-1723.
[49] Gilani, Faheem, Kernel-based prediction of non-Markovian time series, Phys. D, Paper No. 132829, 16 pp. (2021) · Zbl 1490.62289 · doi:10.1016/j.physd.2020.132829
[50] R. Gonz\'alez-Garc\'ia, R. Rico-Mart\'inez, and I. G. Kevrekidis, Identification of distributed parameter systems: a neural net based approach, Comput. Chem. Eng. 22 (1998), S965-S968, https://linkinghub.elsevier.com/retrieve/pii/S0098135498001914.
[51] Goodfellow, Ian, Deep learning, Adaptive Computation and Machine Learning, xxii+775 pp. (2016), MIT Press, Cambridge, MA · Zbl 1373.68009
[52] Gottwald, Georg A., Combining machine learning and data assimilation to forecast dynamical systems from noisy partial observations, Chaos, Paper No. 101103, 8 pp. (2021) · Zbl 07867355 · doi:10.1063/5.0066080
[53] Gottwald, Georg A., Supervised learning from noisy observations: combining machine-learning techniques with data assimilation, Phys. D, Paper No. 132911, 15 pp. (2021) · Zbl 1508.68288 · doi:10.1016/j.physd.2021.132911
[54] Gouasmi, Ayoub, {\it A priori} estimation of memory effects in reduced-order models of nonlinear systems using the Mori-Zwanzig formalism, Proc. A., 20170385, 24 pp. (2017) · Zbl 1402.76065 · doi:10.1098/rspa.2017.0385
[55] Wojciech W. Grabowski, Coupling cloud processes with the large-scale dynamics using the cloud-resolving convection parameterization (CRCP), J. Atmos. Sci. 58 (2001), no. 9, 978-997, https://journals.ametsoc.org/view/journals/atsc/58/9/1520-0469_2001_058_0978_ccpwtl_2.0.co_2.xml.
[56] Lyudmila Grigoryeva and Juan-Pablo Ortega, Echo state networks are universal, 1806.00797, 2018. · Zbl 1434.68409
[57] Grimmett, Geoffrey R., Probability and random processes, xii+669 pp. (2020), Oxford University Press, Oxford · Zbl 1437.60004
[58] Gupta, Abhinav, Neural closure models for dynamical systems, Proc. A., Paper No. 20201004, 29 pp. (2021)
[59] Haber, Eldad, Stable architectures for deep neural networks, Inverse Problems, 014004, 22 pp. (2018) · Zbl 1426.68236 · doi:10.1088/1361-6420/aa9a90
[60] Franz Hamilton, Alun L. Lloyd, and Kevin B. Flores, Hybrid modeling and prediction of dynamical systems, PLOS Comput. Biol. 13 (2017), no. 7, e1005655, https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1005655.
[61] Hamzi, Boumediene, Learning dynamical systems from data: a simple cross-validation perspective, part I: Parametric kernel flows, Phys. D, Paper No. 132817, 10 pp. (2021) · Zbl 1509.68217 · doi:10.1016/j.physd.2020.132817
[62] Harlim, John, Machine learning for prediction with missing dynamics, J. Comput. Phys., Paper No. 109922, 22 pp. (2021) · Zbl 07511409 · doi:10.1016/j.jcp.2020.109922
[63] Fabr\'icio P. H\"arter and Haroldo Fraga de Campos Velho, Data assimilation procedure by recurrent neural network, Eng. Appl. Comput. Fluid Mech. 6 (2012), 224-233, https://doi.org/10.1080/19942060.2012.11015417.
[64] New directions in statistical signal processing: from systems to brain, Neural Information Processing Series, x+514 pp. (2007), MIT Press, Cambridge, MA · Zbl 1190.62183
[65] Dan Hendrycks and Kevin Gimpel, Gaussian error linear units (gelus), https://arxiv.org/abs/1606.08415, 2016.
[66] Carmen Hij\'on, Pep Espa\~nol, Eric Vanden-Eijnden, and Rafael Delgado-Buscalioni, Mori-Zwanzig formalism as a practical computational tool, Faraday Discuss. 144 (2010), 301-322.
[67] Sepp Hochreiter and J\"urgen Schmidhuber. Long short-term memory, Neural Comput. 9 (1997), 1735-1780.
[68] Holland, Mark, Central limit theorems and invariance principles for Lorenz attractors, J. Lond. Math. Soc. (2), 345-364 (2007) · Zbl 1126.37006 · doi:10.1112/jlms/jdm060
[69] Herbert Jaeger, The “echo state” approach to analysing and training recurrent neural networks-with an erratum note’, German National Research Center for Information Technology GMD Technical Report, Bonn, Germany, January 2001, p. 148.
[70] Xiaowei Jia, Jared Willard, Anuj Karpatne, Jordan S. Read, Jacob A. Zwart, Michael Steinbach, and Vipin Kumar, Physics-guided machine learning for scientific discovery: an application in simulating lake temperature profiles, ACM/IMS Trans. Data Sci. 2 (2021), no. 3, 20:1-20:26, https://doi.org/10.1145/3447814.
[71] Jiang, Shixiao W., Modeling of missing dynamical systems: deriving parametric models using a nonparametric framework, Res. Math. Sci., Paper No. 16, 25 pp. (2020) · Zbl 1447.37067 · doi:10.1007/s40687-020-00217-4
[72] Kadierdan Kaheman, Eurika Kaiser, Benjamin Strom, J. Nathan Kutz, and Steven L. Brunton, Learning discrepancy models from experimental data, 1909.08574, 2019.
[73] Kadierdan Kaheman, Steven L. Brunton, and J. Nathan Kutz, Automatic differentiation to simultaneously identify nonlinear dynamics and extract noise probability distributions from data, Mach. Learn. Sci. Technol. 3 (2022), no. 1, 015031, https://doi.org/10.1088/2632-2153/ac567a. · Zbl 1473.93007
[74] Kaipio, Jari, Statistical and computational inverse problems, Applied Mathematical Sciences, xvi+339 pp. (2005), Springer-Verlag, New York · Zbl 1068.65022
[75] J. Nagoor Kani and Ahmed H. Elsheikh, DR-RNN: a deep residual recurrent neural network for model reduction, 1709.00939, 2017.
[76] Kashinath, K., Physics-informed machine learning: case studies for weather and climate modelling, Philos. Trans. Roy. Soc. A, Paper No. 20200093, 36 pp. (2021) · doi:10.1098/rsta.2020.0093
[77] Keller, Rachael T., Discovery of dynamics using linear multistep methods, SIAM J. Numer. Anal., 429-455 (2021) · Zbl 1466.65050 · doi:10.1137/19M130981X
[78] Kemeth, Felix P., Initializing LSTM internal states via manifold learning, Chaos, Paper No. 093111, 14 pp. (2021) · Zbl 07866704 · doi:10.1063/5.0055371
[79] Marat F. Khairoutdinov and David A. Randall, A cloud resolving model as a cloud parameterization in the NCAR community climate system model: preliminary results, Geophys. Res. Lett. 28 (2001), no. 18, 3617-3620, https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2001GL013552.
[80] Diederik P. Kingma and Jimmy Ba, Adam: A method for stochastic optimization, 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, Conference Track Proceedings, 1412.6980, 2015.
[81] Kocijan, Ju\v{s}, Modelling and control of dynamic systems using Gaussian process models, Advances in Industrial Control, xvi+267 pp. (2016), Springer, Cham · Zbl 1339.93004 · doi:10.1007/978-3-319-21021-6
[82] Korda, Milan, Data-driven spectral analysis of the Koopman operator, Appl. Comput. Harmon. Anal., 599-629 (2020) · Zbl 1436.37093 · doi:10.1016/j.acha.2018.08.002
[83] K. Krischer, R. Rico-Mart\'inez, I. G. Kevrekidis, H. H. Rotermund, G. Ertl, and J. L. Hudson, Model identification of a spatiotemporally varying catalytic reaction, AIChE J. 39 (1993), no. 1, 89-98, January 1993, http://doi.wiley.com/10.1002/aic.690390110.
[84] Kullback, S., On information and sufficiency, Ann. Math. Statistics, 79-86 (1951) · Zbl 0042.38403 · doi:10.1214/aoms/1177729694
[85] Kutoyants, Yury A., Statistical inference for ergodic diffusion processes, Springer Series in Statistics, xiv+481 pp. (2004), Springer-Verlag London, Ltd., London · Zbl 1038.62073 · doi:10.1007/978-1-4471-3866-2
[86] Lagaris, I. E., Mathematical methods in scattering theory and biomedical technology. A hardware implementable non-linear method for the solution of ordinary, partial and integrodifferential equations, Pitman Res. Notes Math. Ser., 110-126 (1997), Longman, Harlow
[87] Law, Kody, Analysis of the 3DVAR filter for the partially observed Lorenz’63 model, Discrete Contin. Dyn. Syst., 1061-1078 (2014) · Zbl 1283.62194 · doi:10.3934/dcds.2014.34.1061
[88] Law, Kody, Data assimilation, Texts in Applied Mathematics, xviii+242 pp. (2015), Springer, Cham · Zbl 1353.60002 · doi:10.1007/978-3-319-20325-6
[89] Youming Lei, Jian Hu, and Jianpeng Ding, A hybrid model based on deep LSTM for predicting high-dimensional chaotic systems, 2002.00799, 2020.
[90] Zhen Li, Hee Sun Lee, Eric Darve, and George Em Karniadakis, Computing the non-Markovian coarse-grained interactions derived from the Mori-Zwanzig formalism in molecular systems: application to polymer melts, J. Chem. Phys. 146, no. 1, 014104.
[91] Zhong Li, Jiequn Han, Weinan E, and Qianxiao Li, On the curse of memory in recurrent neural networks: approximation and optimization analysis, 2009.07799, 2020. · Zbl 07625195
[92] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar, Fourier neural operator for parametric partial differential equations, 2010.08895, 2021.
[93] Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar, Markov neural operators for learning chaotic systems, 2106.06898, 2021.
[94] Lin, Kevin K., Data-driven model reduction, Wiener projections, and the Koopman-Mori-Zwanzig formalism, J. Comput. Phys., Paper No. 109864, 33 pp. (2021) · Zbl 07508469 · doi:10.1016/j.jcp.2020.109864
[95] Ori Linial, Neta Ravid, Danny Eytan, and Uri Shalit, Generative ODE modeling with known unknowns, Proceedings of the Conference on Health, Inference, and Learning, CHIL ’21, New York, NY, USA, Association for Computing Machinery, April 2021, pp. 79-94, https://doi.org/10.1145/3450439.3451866.
[96] E. Lorenz, Predictability-a problem partly solved, Proc. Seminar on Predictability, Reading, UK, ECMWF, 1996. https://ci.nii.ac.jp/naid/10015392260/en/.
[97] Lorenz, Edward N., Deterministic nonperiodic flow, J. Atmospheric Sci., 130-141 (1963) · Zbl 1417.37129 · doi:10.1175/1520-0469(1963)020$\langle
[98] Robert J. Lovelett, Jos\'e L. Avalos, and Ioannis G. Kevrekidis, Partial observations and conservation laws: gray-box modeling in biotechnology and optogenetics, Ind. Eng. Chem. Res. 59 (2020), no. 6, 2611-2620, https://doi.org/10.1021/acs.iecr.9b04507.
[99] Lu, Fei, Data-driven model reduction for stochastic Burgers equations, Entropy, Paper No. 1360, 22 pp. (2020) · doi:10.3390/e22121360
[100] Lu, Fei, Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems, Commun. Appl. Math. Comput. Sci., 187-216 (2016) · doi:10.2140/camcos.2016.11.187
[101] Lu, Fei, Data-based stochastic model reduction for the Kuramoto-Sivashinsky equation, Phys. D, 46-57 (2017) · Zbl 1376.35100 · doi:10.1016/j.physd.2016.09.007
[102] Lu Lu, Pengzhan Jin, and George Em Karniadakis, DeepONet: learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators, 1910.03193, 2020.
[103] Lu, Zhixin, Attractor reconstruction by machine learning, Chaos, 061104, 9 pp. (2018) · doi:10.1063/1.5039508
[104] Mantas Lukosevicius and Herbert Jaeger, Reservoir computing approaches to recurrent neural network training, Comput. Sci. Rev. 3 (2009), no. 3, 127-149, August 2009, https://www.sciencedirect.com/science/article/pii/S1574013709000173. · Zbl 1302.68235
[105] Chao Ma, Jianchun Wang, and Weinan E, Model reduction with memory and the machine learning of dynamical systems, Commun. Comput. Phys. 25 (2019), no. 4, http://www.global-sci.com/intro/article_detail/cicp/12885.html. · Zbl 1473.35450
[106] Romit Maulik, Bethany Lusch, and Prasanna Balaprakash, Reduced-order modeling of advection-dominated systems with recurrent neural networks and convolutional autoencoders, Phys. Fluids 33 (2021), no. 3, 037106, March 2021, https://aip.scitation.org/doi/abs/10.1063/5.0039986.
[107] McGoff, Kevin, Consistency of maximum likelihood estimation for some dynamical systems, Ann. Statist., 1-29 (2015) · Zbl 1319.37006 · doi:10.1214/14-AOS1259
[108] Meng, Xiao-Li, The EM algorithm-an old folk-song sung to a fast new tune, J. Roy. Statist. Soc. Ser. B, 511-567 (1997) · Zbl 1090.62518 · doi:10.1111/1467-9868.00082
[109] Andrew C. Miller, Nicholas J. Foti, and Emily Fox, Learning insulin-glucose dynamics in the wild, 2008.02852, 2020.
[110] Andrew C. Miller, Nicholas J. Foti, and Emily B. Fox, Breiman’s two cultures: you don’t have to choose sides, 2104.12219, 2021.
[111] Mockus, Jonas, Bayesian approach to global optimization, Mathematics and its Applications (Soviet Series), xiv+254 pp. (1989), Kluwer Academic Publishers Group, Dordrecht · Zbl 0693.49001 · doi:10.1007/978-94-009-0909-0
[112] Kumpati S. Narendra and Kannan Parthasarathy, Neural networks and dynamical systems, Internat. J. Approx. Reason. 6 (1992), no. 2, 109-131, https://www.sciencedirect.com/science/article/pii/0888613X9290014Q. · Zbl 0767.93035
[113] Nelsen, Nicholas H., The random feature model for input-output maps between Banach spaces, SIAM J. Sci. Comput., A3212-A3243 (2021) · Zbl 07398767 · doi:10.1137/20M133957X
[114] Duong Nguyen, Said Ouala, Lucas Drumetz, and Ronan Fablet, EM-like learning chaotic dynamics from noisy and partial observations, 1903.10335, 2019.
[115] Murphy Yuezhen Niu, Lior Horesh, and Isaac Chuang, Recurrent neural networks in the eye of differential equations, 1904.12933, 2019.
[116] Fernando Nogueira, Bayesian optimization: open source constrained global optimization tool for Python, 2014. https://github.com/fmfn/BayesianOptimization.
[117] Paul A. O’Gorman and John G. Dwyer, Using machine learning to parameterize moist convection: potential for modeling of climate, climate change, and extreme events, J. Adv. Model. Earth Syst. 10 (2018), no. 10, 2548-2563, https://agupubs.onlinelibrary.wiley.com/doi/abs/10.1029/2018MS001351.
[118] Ouala, S., Learning latent dynamics for partially observed chaotic systems, Chaos, 103121, 17 pp. (2020) · Zbl 1456.37099 · doi:10.1063/5.0019309
[119] Parish, Eric J., A dynamic subgrid scale model for large eddy simulations based on the Mori-Zwanzig formalism, J. Comput. Phys., 154-175 (2017) · Zbl 1380.76024 · doi:10.1016/j.jcp.2017.07.053
[120] Jaideep Pathak, Brian Hunt, Michelle Girvan, Zhixin Lu, and Edward Ott, Model-free prediction of large spatiotemporally chaotic systems from data: a reservoir computing approach, Phys. Rev. Lett. 120 (2018), no. 2, 024102, https://link.aps.org/doi/10.1103/PhysRevLett.120.024102. · Zbl 1390.37138
[121] Pathak, Jaideep, Hybrid forecasting of chaotic processes: using machine learning in conjunction with a knowledge-based model, Chaos, 041101, 9 pp. (2018) · doi:10.1063/1.5028373
[122] Pavliotis, Grigorios A., Multiscale methods, Texts in Applied Mathematics, xviii+307 pp. (2008), Springer, New York · Zbl 1160.35006
[123] Plumlee, Matthew, Bayesian calibration of inexact computer models, J. Amer. Statist. Assoc., 1274-1285 (2017) · doi:10.1080/01621459.2016.1211016
[124] Plumlee, Matthew, Orthogonal Gaussian process models, Statist. Sinica, 601-619 (2018) · Zbl 1390.62047
[125] Manuel Pulido, Pierre Tandeo, Marc Bocquet, Alberto Carrassi, and Magdalena Lucini, Stochastic parameterization identification using ensemble Kalman filtering combined with maximum likelihood methods, Tellus A Dyn. Meteorology Oceanogr. 70 (2018), no. 1, 1-17.
[126] Pyle, Ryan, Domain-driven models yield better predictions at lower cost than reservoir computers in Lorenz systems, Philos. Trans. Roy. Soc. A, Paper No. 20200246, 22 pp. (2021) · doi:10.1103/physrevlett.120.024102
[127] Zhaozhi Qian, William R. Zame, Lucas M. Fleuren, Paul Elbers, and Mihaela van der Schaar, Integrating expert ODEs into neural ODEs: pharmacology and disease progression, 2106.02875, 2021.
[128] Alejandro F. Queiruga, N. Benjamin Erichson, Dane Taylor, and Michael W. Mahoney, Continuous-in-depth neural networks, 2008.02389, 2020.
[129] Christopher Rackauckas, Yingbo Ma, Julius Martensen, Collin Warner, Kirill Zubov, Rohit Supekar, Dominic Skinner, Ali Ramadhan, and Alan Edelman, Universal differential equations for scientific machine learning, 2001.04385, 2020.
[130] Christopher Rackauckas, Roshan Sharma, and Bernt Lie, Hybrid mechanistic + neural model of laboratory helicopter, pages 264-271, March 2021, pp. 264-271, https://ep.liu.se/en/conference-article.aspx?series=ecp&issue=176&Article_No=37.
[131] Ali Rahimi and Benjamin Recht, Random features for large-scale kernel machines, Adv. Neural Inform. Process. Syst. 20 (2008), Curran Associates, Inc., https://proceedings.neurips.cc/paper/2007/file/013a006f03dbc5392effeb8f18fda755-Paper.pdf.
[132] Ali Rahimi and Benjamin Recht, Uniform approximation of functions with random bases, 2008 46th Annual Allerton Conference on Communication, Control, and Computing, IEEE, 2008, pp. 555-561.
[133] Ali Rahimi and Benjamin Recht, Weighted sums of random kitchen sinks: replacing minimization with randomization in learning, Nips, Citeseer, 2008, pp. 1313-1320.
[134] Raissi, M., Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 686-707 (2019) · Zbl 1415.68175 · doi:10.1016/j.jcp.2018.10.045
[135] Raissi, M., Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 686-707 (2019) · Zbl 1415.68175 · doi:10.1016/j.jcp.2018.10.045
[136] Rasmussen, Carl Edward, Gaussian processes for machine learning, Adaptive Computation and Machine Learning, xviii+248 pp. (2006), MIT Press, Cambridge, MA · Zbl 1177.68165
[137] Stephan Rasp, Michael S. Pritchard, and Pierre Gentine, Deep learning to represent subgrid processes in climate models, Proc. Natl. Acad. Sci. USA 115 (2018), no. 39, 9684-9689, https://www.pnas.org/content/115/39/9684.
[138] Reich, Sebastian, Probabilistic forecasting and Bayesian data assimilation, x+297 pp. (2015), Cambridge University Press, New York · Zbl 1314.62005 · doi:10.1017/CBO9781107706804
[139] R. Rico-Martines, I. G. Kevrekidis, M. C. Kube, and J. L. Hudson, Discrete- vs. continuous-time nonlinear signal processing: Attractors, transitions and parallel implementation issues, 1993 American Control Conference, June 1993, pp. 1475-1479, San Francisco, CA, USA, IEEE, ISBN 978-0-7803-0860-2, https://ieeexplore.ieee.org/document/4793116/.
[140] R. Rico-Mart\'inez, K. Krischer, I. G. Kevrekidis, M. C. Kube, and J. L. Hudson. Discrete- vs. continuous-time nonlinear signal processing of Cu electrodissolution data, Chemical Engineering Communications, 118 (1): 25-48, November 1992. https://www.tandfonline.com/doi/full/10.1080/00986449208936084.
[141] R. Rico-Martinez, J. S. Anderson, and I. G. Kevrekidis, Continuous-time nonlinear signal processing: a neural network based approach for gray box identification, Proceedings of IEEE Workshop on Neural Networks for Signal Processing, Ermioni, Greece, IEEE, 1994, pp. 596-605. ISBN 978-0-7803-2026-0, http://ieeexplore.ieee.org/document/366006/.
[142] Yulia Rubanova, Ricky T. Q. Chen, and David Duvenaud, Latent ODEs for irregularly-sampled time series, 1907.03907, 2019.
[143] David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams, Learning representations by back-propagating errors, Nature 323 (1986), no. 6088, 533-536. · Zbl 1369.68284
[144] Matteo Saveriano, Yuchao Yin, Pietro Falco, and Dongheui Lee, Data-efficient control policy search using residual dynamics learning, 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), September 2017, pp. 4709-4715, ISSN 2153-0866
[145] Hayden Schaeffer, Giang Tran, and Rachel Ward, Learning dynamical systems and bifurcation via group sparsity, Preprint, 1709.01558, 2017.
[146] Schaeffer, Hayden, Extracting sparse high-dimensional dynamics from limited data, SIAM J. Appl. Math., 3279-3295 (2018) · Zbl 1405.62127 · doi:10.1137/18M116798X
[147] Schaeffer, Hayden, Extracting structured dynamical systems using sparse optimization with very few samples, Multiscale Model. Simul., 1435-1461 (2020) · Zbl 1528.65035 · doi:10.1137/18M1194730
[148] Anton Maximilian Schäfer and Hans-Georg Zimmermann, Recurrent neural networks are universal approximators, International Journal of Neural Systems 17 (2007), no. 4, 253-263.
[149] Robert E. Schapire, The strength of weak learnability, Mach. Learn. 5 (1990), no. 2, 197-227. · Zbl 0747.68058
[150] Tapio Schneider, Learning stochastic closures using ensemble Kalman inversion, Trans. Math. Appl., Paper No. tnab003, 31 pp. (2021) · Zbl 1478.60129 · doi:10.1093/imatrm/tnab003
[151] Skipper Seabold and Josef Perktold, statsmodels: econometric and statistical modeling with python, 9th Python in Science Conference, 2010.
[152] Alex Sherstinsky, Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network, Phys. D, 132306, 28 pp. (2020) · Zbl 1508.68298 · doi:10.1016/j.physd.2019.132306
[153] Guanya Shi, Xichen Shi, Michael O'Connell, Rose Yu, Kamyar Azizzadenesheli, Animashree Anandkumar, Yisong Yue, and Soon-Jo Chung, Neural Lander: stable drone landing control using learned dynamics, Preprint, arXiv:1811.08027, 2018.
[154] Jonathan D. Smith, Kamyar Azizzadenesheli, and Zachary E. Ross, EikoNet: solving the eikonal equation with deep neural networks, IEEE Transactions on Geoscience and Remote Sensing, 2020, pp. 1-12.
[155] Peter D. Sottile, David Albers, Peter E. DeWitt, Seth Russell, J. N. Stroh, David P. Kao, Bonnie Adrian, Matthew E. Levine, Ryan Mooney, Lenny Larchick, Jean S. Kutner, Matthew K. Wynia, Jeffrey J. Glasheen, and Tellen D. Bennett, Real-time electronic health record mortality prediction during the COVID-19 pandemic: A prospective cohort study, medRxiv, Cold Spring Harbor Laboratory Press, January 2021, p. 2021.01.14.21249793, https://www.medrxiv.org/content/10.1101/2021.01.14.21249793v1.
[156] Langxuan Su and Sayan Mukherjee, A large deviation approach to posterior consistency in dynamical systems, Preprint, arXiv:2106.06894, 2021.
[157] Floris Takens, Detecting strange attractors in turbulence, Dynamical Systems and Turbulence, Warwick 1980 (Coventry, 1979/1980), Lecture Notes in Math., vol. 898, Springer, Berlin-New York, 1981, pp. 366-381. · Zbl 0513.58032
[158] Zhihong Tan, Colleen M. Kaul, Kyle G. Pressel, Yair Cohen, Tapio Schneider, and João Teixeira, An extended eddy-diffusivity mass-flux scheme for unified representation of subgrid-scale turbulence and convection, J. Adv. Model. Earth Sys. 10 (2018), no. 3, 770-800.
[159] Giang Tran, Exact recovery of chaotic systems from highly corrupted data, Multiscale Model. Simul., 1108-1129 (2017) · doi:10.1137/16M1086637
[160] Jonathan H. Tu, On dynamic mode decomposition: theory and applications, J. Comput. Dyn., 391-421 (2014) · Zbl 1346.37064 · doi:10.3934/jcd.2014.1.391
[161] Eric Vanden-Eijnden, Numerical techniques for multi-scale dynamical systems with stochastic effects, Commun. Math. Sci., 385-391 (2003) · Zbl 1088.60060
[162] Vladimir N. Vapnik, The nature of statistical learning theory, xvi+188 pp. (1995), Springer-Verlag, New York · Zbl 0833.62008 · doi:10.1007/978-1-4757-2440-0
[163] Pauli Virtanen, Ralf Gommers, Travis E. Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, Stéfan J. van der Walt, Matthew Brett, Joshua Wilson, K. Jarrod Millman, Nikolay Mayorov, Andrew R. J. Nelson, Eric Jones, Robert Kern, Eric Larson, C. J. Carey, Ilhan Polat, Yu Feng, Eric W. Moore, Jake VanderPlas, Denis Laxalde, Josef Perktold, Robert Cimrman, Ian Henriksen, E. A. Quintero, Charles R. Harris, Anne M. Archibald, Antônio H. Ribeiro, Fabian Pedregosa, and Paul van Mulbregt, SciPy 1.0: fundamental algorithms for scientific computing in Python, Nat. Methods 17 (2020), no. 3, 261-272, https://www.nature.com/articles/s41592-019-0686-2.
[164] P. R. Vlachas, J. Pathak, B. R. Hunt, T. P. Sapsis, M. Girvan, E. Ott, and P. Koumoutsakos, Backpropagation algorithms and reservoir computing in recurrent neural networks for the forecasting of complex spatiotemporal dynamics, Neural Netw. 126 (2020), 191-217, https://linkinghub.elsevier.com/retrieve/pii/S0893608020300708.
[165] Jack Wang, Aaron Hertzmann, and David J. Fleet, Gaussian process dynamical models, Adv. Neural Inform. Process. Syst. 18 (2005), https://papers.nips.cc/paper/2005/hash/ccd45007df44dd0f12098f486e7e8a0f-Abstract.html.
[166] Qian Wang, Recurrent neural network closure of parametric POD-Galerkin reduced-order models based on the Mori-Zwanzig formalism, J. Comput. Phys., 109402, 32 pp. (2020) · Zbl 1436.65093 · doi:10.1016/j.jcp.2020.109402
[167] Peter A. G. Watson, Applying machine learning to improve simulations of a chaotic dynamical system using empirical error correction, Preprint, arXiv:1904.10904, 2019.
[168] Alexander Wikner, Combining machine learning with knowledge-based modeling for scalable forecasting and subgrid-scale closure of large, complex, spatiotemporal systems, Chaos, 053111, 16 pp. (2020) · doi:10.1063/5.0005541
[169] Jared Willard, Xiaowei Jia, Shaoming Xu, Michael Steinbach, and Vipin Kumar, Integrating scientific knowledge with machine learning for engineering and environmental systems, Preprint, arXiv:2003.04919, 2021.
[170] J. A. Wilson and L. F. M. Zorzetto, A generalised approach to process state estimation using hybrid artificial neural network/mechanistic models, Comput. Chem. Eng. 21 (1997), no. 9, 951-963, http://linkinghub.elsevier.com/retrieve/pii/S0098135496003365.
[171] Armand Wirgin, The inverse crime, Preprint, arXiv:math-ph/0401050, 2004.
[172] David H. Wolpert, Stacked generalization, Neural Netw. 5 (1992), no. 2, 241-259, https://www.sciencedirect.com/science/article/pii/S0893608005800231.
[173] Zhong Yi Wan, Bubbles in turbulent flows: data-driven, kinematic models with history terms, Int. J. Multiph. Flow, 103286, 11 pp. (2020) · doi:10.1016/j.ijmultiphaseflow.2020.103286
[174] Yuan Yin, Augmenting physical models with deep networks for complex dynamics forecasting, J. Stat. Mech. Theory Exp., Paper No. 124012, 30 pp. (2021) · Zbl 1539.68325 · doi:10.1088/1742-5468/ac3ae5
[175] He Zhang, John Harlim, and Xiantao Li, Estimating linear response statistics using orthogonal polynomials: an RKHS formulation, Found. Data Sci. 2 (2020), no. 4, 443-485, http://aimsciences.org//article/doi/10.3934/fods.2020021.
[176] He Zhang, Error bounds of the invariant statistics in machine learning of ergodic Itô diffusions, Phys. D, Paper No. 133022, 28 pp. (2021) · Zbl 1482.60108 · doi:10.1016/j.physd.2021.133022
[177] Jian Zhu and Masafumi Kamachi, An adaptive variational method for data assimilation with imperfect models, Tellus A Dyn. Meteorology Oceanogr. 52 (2000), no. 3, 265-279, https://doi.org/10.3402/tellusa.v52i3.12265.
[178] Yuanran Zhu, On the estimation of the Mori-Zwanzig memory integral, J. Math. Phys., 103501, 39 pp. (2018) · Zbl 1402.60044 · doi:10.1063/1.5003467
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.