Deep reinforcement learning for the control of conjugate heat transfer. (English) Zbl 07513859

Summary: This research gauges the ability of deep reinforcement learning (DRL) techniques to assist the control of conjugate heat transfer systems governed by the coupled Navier-Stokes and heat equations. It uses a novel, “degenerate” version of the proximal policy optimization (PPO) algorithm, intended for situations where the optimal policy to be learnt by a neural network does not depend on state, as is notably the case in optimization and open-loop control problems. The numerical reward fed to the neural network is computed with an in-house stabilized finite element environment combining variational multi-scale (VMS) modeling of the governing equations, the immersed volume method, and multi-component anisotropic mesh adaptation. Several test cases of natural and forced convection in two and three dimensions are used as a testbed for developing the methodology. The approach successfully alleviates the natural-convection-induced enhancement of heat transfer in a two-dimensional, differentially heated square cavity controlled by piece-wise constant fluctuations of the sidewall temperature. It also proves capable of improving the homogeneity of temperature across the surface of two- and three-dimensional hot workpieces under impingement cooling. Various cases are tackled, in which the position of multiple cold air injectors is optimized relative to a fixed workpiece position. The flexibility of the numerical framework also makes it tractable to solve the inverse problem, i.e., to optimize the workpiece position relative to a fixed injector distribution. The obtained results showcase the potential of the method for black-box optimization of practically meaningful computational fluid dynamics (CFD) conjugate heat transfer systems. More significantly, they stress how DRL can reveal unanticipated solutions or parameter relations (as the optimal workpiece position under symmetrical actuation turns out to be offset from the symmetry axis), in addition to being a tool for optimizing searches in large parameter spaces.
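
The idea of a state-independent policy trained from single-step episodes can be illustrated with a minimal Python sketch. This is not the authors' implementation, only a toy under stated assumptions: the neural network is replaced by a directly learned mean and log-standard deviation of a Gaussian policy, the CFD-computed reward is replaced by a hypothetical quadratic function `toy_reward`, and the usual Adam optimizer is replaced by plain gradient ascent on the PPO clipped surrogate, with the mean gradient rescaled by sigma squared to keep steps proportional to the exploration width. The name `single_step_ppo` and all hyperparameter values are illustrative choices.

```python
import numpy as np


def toy_reward(action):
    # Hypothetical stand-in for the CFD-computed reward: maximal at (0.3, -0.2).
    target = np.array([0.3, -0.2])
    return -np.sum((action - target) ** 2)


def log_prob(actions, mu, log_sigma):
    # Log-density of a diagonal Gaussian, summed over action dimensions.
    z = (actions - mu) / np.exp(log_sigma)
    return np.sum(-0.5 * z**2 - log_sigma - 0.5 * np.log(2.0 * np.pi), axis=1)


def single_step_ppo(reward_fn, dim, generations=150, batch=16,
                    epochs=8, lr=0.1, clip=0.2, seed=0):
    rng = np.random.default_rng(seed)
    # State-independent policy: only a mean and a log-std per action dimension.
    mu = np.zeros(dim)
    log_sigma = np.full(dim, np.log(0.5))
    for _ in range(generations):
        # One-step episodes: sample actions, evaluate each one exactly once.
        actions = mu + np.exp(log_sigma) * rng.standard_normal((batch, dim))
        rewards = np.array([reward_fn(a) for a in actions])
        adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
        logp_old = log_prob(actions, mu, log_sigma)
        for _ in range(epochs):
            sigma = np.exp(log_sigma)
            ratio = np.exp(log_prob(actions, mu, log_sigma) - logp_old)
            # The clipped surrogate min(r*A, clip(r)*A) has zero gradient
            # for samples whose clipped branch is the active one.
            clipped = ((adv > 0) & (ratio > 1 + clip)) | \
                      ((adv < 0) & (ratio < 1 - clip))
            w = (adv * ratio * ~clipped)[:, None]
            z = (actions - mu) / sigma
            # d(log p)/d(mu) = z / sigma, rescaled here by sigma**2
            # (assumption: natural-gradient-style preconditioning).
            mu = mu + lr * sigma * np.mean(w * z, axis=0)
            # d(log p)/d(log_sigma) = z**2 - 1
            log_sigma = log_sigma + lr * np.mean(w * (z**2 - 1.0), axis=0)
    return mu, np.exp(log_sigma)


if __name__ == "__main__":
    mu, sigma = single_step_ppo(toy_reward, dim=2)
    print("learned mean:", mu, "final std:", sigma)
```

In a setting like the one described in the summary, `reward_fn` would wrap one full simulation per sampled action, which is why the batch size and number of generations are kept small in this sketch.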

MSC:

76Mxx Basic methods in fluid mechanics
76Dxx Incompressible viscous fluids
76Rxx Diffusion and convection

References:

[1] Jameson, A., Aerodynamic design via control theory, J. Sci. Comput., 3, 233-260 (1988)
[2] Gunzburger, M. D., Perspectives in Flow Control and Optimization (2002), SIAM: SIAM Philadelphia
[3] Bewley, T., Flow control: new challenges for a new renaissance, Prog. Aerosp. Sci., 37, 21-58 (2001)
[4] Momose, K.; Abe, K.; Kimoto, H., Reverse computation of forced convection heat transfer for optimal control of thermal boundary conditions, Heat Transf. Asian Res., 33, 161-174 (2004)
[5] Belmiloudi, A., Robin-type boundary control problems for the nonlinear Boussinesq type equations, J. Math. Anal. Appl., 273, 428-456 (2002) · Zbl 1032.76017
[6] Bärwolff, G.; Hinze, M., Optimization of semiconductor melts, Z. Angew. Math. Mech., 86, 423-437 (2006) · Zbl 1098.76027
[7] Boldrini, J.; Fernàndez-Cara, E.; Rojas-Medar, M., An optimal control problem for a generalized Boussinesq model: the time dependent case, Rev. Mat. Comput., 20, 339-366 (2007) · Zbl 1131.49002
[8] Karkaba, H.; Dbouk, T.; Habchi, C.; Russeil, S.; Lemenand, T.; Bougeard, D., Multi objective optimization of vortex generators for heat transfer enhancement using large design space exploration, Chem. Eng. Process., 154, Article 107982 pp. (2020)
[9] Meliga, P.; Chomaz, J.-M., Global modes in a confined impinging jet: application to heat transfer and control, Theor. Comput. Fluid Dyn., 25, 1, 179-193 (2011) · Zbl 1272.76123
[10] Rumelhart, D.; Hinton, G.; Williams, R., Learning representations by back-propagating errors, Nature, 323, 533-536 (1986) · Zbl 1369.68284
[11] Kober, J.; Bagnell, J. A.; Peters, J., Reinforcement learning in robotics: a survey, Int. J. Robot. Res., 32, 11, 1238-1274 (2013)
[12] Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A. A.; Bellemare, M.; Graves, A.; Riedmiller, M.; Fidjeland, A. K.; Ostrovski, G.; Petersen, S.; Beattie, C.; Sadik, A.; Antonoglou, I.; King, H.; Kumaran, D.; Wierstra, D.; Legg, S.; Hassabis, D., Human-level control through deep reinforcement learning, Nature, 518, 7540 (2015)
[13] Hinton, G. E.; Krizhevsky, A.; Sutskever, I., Imagenet classification with deep convolutional neural networks, Adv. Neural Inf. Process. Syst., 25, 1106-1114 (2012)
[14] Lusch, B.; Kutz, J. N.; Brunton, S. L., Deep learning for universal linear embeddings of nonlinear dynamics, Nat. Commun., 9, 1, 1-10 (2018)
[15] Raissi, M.; Yazdani, A.; Karniadakis, G. E., Hidden fluid mechanics: a Navier-Stokes informed deep learning framework for assimilating flow visualization data (2018)
[16] Beck, A. D.; Flad, D. G.; Munz, C.-D., Deep neural networks for data-driven turbulence models (2018)
[17] Brunton, S. L.; Noack, B.; Koumoutsakos, P., Machine learning for fluid mechanics, Annu. Rev. Fluid Mech., 52, 477-508 (2020) · Zbl 1439.76138
[18] Silver, D.; Schrittwieser, J.; Simonyan, K.; Antonoglou, I.; Huang, A.; Guez, A.; Hubert, T.; Baker, L.; Lai, M.; Bolton, A.; Chen, Y.; Lillicrap, T.; Hui, F.; Sifre, L.; van den Driessche, G.; Graepel, T.; Hassabis, D., Mastering the game of go without human knowledge, Nature, 550, 7676, 354-359 (2017)
[19] Hwangbo, J.; Lee, J.; Dosovitskiy, A.; Bellicoso, D.; Tsounis, V.; Koltun, V.; Hutter, M., Learning agile and dynamic motor skills for legged robots, Sci. Robot., 4, 26, Article eaau5872 pp. (2019)
[20] Bernstein, A.; Burnaev, E., Reinforcement learning in computer vision, (Proc. SPIE 10696, 10th International Conference on Machine Vision, ICMV 2017 (2018))
[21] Lillicrap, T. P.; Hunt, J. J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D., Continuous control with deep reinforcement learning (2015), arXiv e-prints
[22] Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O., Proximal policy optimization algorithms (Jul. 2017), arXiv e-prints
[23] Belus, V.; Rabault, J.; Viquerat, J.; Che, Z.; Hachem, E.; Réglade, U., Exploiting locality and translational invariance to design effective deep reinforcement learning control of the 1-dimensional unstable falling liquid film, AIP Adv., 9, Article 125014 pp. (2019)
[24] Bucci, M. A.; Semeraro, O.; Allauzen, A.; Wisniewski, G.; Cordier, L.; Mathelin, L., Control of chaotic systems by deep reinforcement learning, Proc. R. Soc. A, 475, Article 20190351 pp. (2019) · Zbl 1472.68171
[25] Novati, G.; Mahadevan, L.; Koumoutsakos, P., Controlled gliding and perching through deep-reinforcement-learning, Phys. Rev. Fluids, 4, Article 093902 pp. (2019)
[26] Novati, G.; Verma, S.; Alexeev, D.; Rossinelli, D.; van Rees, W. M.; Koumoutsakos, P., Synchronisation through learning for two self-propelled swimmers, Bioinspir. Biomim., 12, Article 036001 pp. (2017)
[27] Verma, S.; Novati, G.; Koumoutsakos, P., Efficient collective swimming by harnessing vortices through deep reinforcement learning, Proc. Natl. Acad. Sci. USA, 115, 5849-5854 (2018)
[28] Lee, K.; Kim, S.; Choi, J., Deep reinforcement learning in continuous action spaces: a case study in the game of simulated curling, (Procs. of the 35th International Conference on Machine Learning (2018)), 4587-4596
[29] Yan, X.; Zhu, J.; Kuang, M.; Wang, X., Aerodynamic shape optimization using a novel optimizer based on machine learning techniques, Aerosp. Sci. Technol., 86, 826-835 (2019)
[30] Viquerat, J.; Rabault, J.; Kuhnle, A.; Ghraieb, H.; Hachem, E., Direct shape optimization through deep reinforcement learning (2019), arXiv preprint
[31] Ma, P.; Tian, Y.; Pan, Z.; Ren, B.; Manocha, D., Fluid directed rigid body control using deep reinforcement learning, ACM Trans. Graph., 37, 4, 1-11 (2018)
[32] Biferale, L.; Bonaccorso, F.; Buzzicotti, M.; Clark Di Leoni, P.; Gustavsson, K., Zermelo’s problem: optimal point-to-point navigation in 2D turbulent flows using reinforcement learning, Chaos, 29, Article 103138 pp. (2019)
[33] Rabault, J.; Kuchta, M.; Jensen, A.; Réglade, U.; Cerardi, N., Artificial neural networks trained through deep reinforcement learning discover control strategies for active flow control, J. Fluid Mech., 865, 281-302 (2019) · Zbl 1415.76222
[34] Ren, F.; Hu, H.; Tang, H., Active flow control using machine learning: a brief review, J. Hydrodyn., 32, 247-253 (2020)
[35] Tang, H.; Rabault, J.; Kuhnle, A.; Wang, Y.; Wang, T., Robust active flow control over a range of Reynolds numbers using an artificial neural network trained through deep reinforcement learning, Phys. Fluids, 32, Article 053605 pp. (2020)
[36] Paris, R.; Beneddine, S.; Dandois, J., Robust flow control and optimal sensor placement using deep reinforcement learning (2020), arXiv preprint
[37] Xu, H.; Zhang, W.; Deng, J.; Rabault, J., Active flow control with rotating cylinders by an artificial neural network trained by deep reinforcement learning, J. Hydrodyn., 32, 254-258 (2020)
[38] Fan, D.; Yang, L.; Wang, Z.; Triantafyllou, M. S.; Karniadakis, G. E., Reinforcement learning for bluff body active flow control in experiments and simulations, Proc. Natl. Acad. Sci. USA, 117, 26091-26098 (2020)
[39] Beintema, G.; Corbetta, A.; Biferale, L.; Toschi, F., Controlling Rayleigh-Bénard convection via reinforcement learning (2020), arXiv preprint
[40] Kazmi, H.; Mehmood, F.; Lodeweyckx, S.; Driesen, J., Gigawatt-hour scale savings on a budget of zero: deep reinforcement learning based optimal control of hot water systems, Energy, 144, 159-168 (2018)
[41] Zhang, T.; Luo, J.; Chen, P.; Liu, J., Flow rate control in smart district heating systems using deep reinforcement learning (2019), arXiv preprint
[42] Ghraieb, H.; Meliga, P.; Viquerat, J.; Hachem, E., Single-step deep reinforcement learning for open-loop control of laminar and turbulent flows, accepted for publication in Phys. Rev. Fluids (2021)
[43] Gruau, C.; Coupez, T., 3d tetrahedral, unstructured and anisotropic mesh generation with adaptation to natural and multidomain metric, Comput. Methods Appl. Mech. Eng., 194, 4951-4976 (2005) · Zbl 1102.65122
[44] Bernacki, M.; Chastel, Y.; Coupez, T.; Logé, R., Level set framework for the numerical modelling of primary recrystallization in polycrystalline materials, Scr. Mater., 58, 1129-1132 (2008)
[45] Mesri, Y.; Digonnet, H.; Coupez, T., Advanced parallel computing in material forming with CIMLib, Eur. J. Comput. Mech., 18, 669-694 (2009) · Zbl 1278.74177
[46] Coupez, T., Metric construction by length distribution tensor and edge based error for anisotropic adaptive meshing, J. Comput. Phys., 230, 2391-2405 (2011) · Zbl 1218.65139
[47] Patankar, S., Numerical Heat Transfer and Fluid Flow (1980), Taylor and Francis · Zbl 0521.76003
[48] Patankar, S. V., A numerical method for conduction in composite materials, flow in irregular geometries and conjugate heat transfer, (Procs. of the 6th International Heat Transfer Conference (1978)), 297-302
[49] Hughes, T. J.R.; Feijóo, G. R.; Mazzei, L.; Quincy, J.-B., The variational multiscale method - a paradigm for computational mechanics, Comput. Methods Appl. Mech. Eng., 166, 3-24 (1998) · Zbl 1017.65525
[50] Codina, R., Stabilization of incompressibility and convection through orthogonal sub-scales in finite element methods, Comput. Methods Appl. Mech. Eng., 190, 1579-1599 (2000) · Zbl 0998.76047
[51] Bazilevs, Y.; Calo, V. M.; Cottrell, J. A.; Hughes, T. J.R.; Reali, A.; Scovazzi, G., Variational multiscale residual-based turbulence modeling for large eddy simulation of incompressible flows, Comput. Methods Appl. Mech. Eng., 197, 173-201 (2007) · Zbl 1169.76352
[52] Hachem, E.; Rivaux, B.; Kloczko, T.; Digonnet, H.; Coupez, T., Stabilized finite element method for incompressible flows with high Reynolds number, J. Comput. Phys., 229, 23, 8643-8665 (2010) · Zbl 1282.76120
[53] Hachem, E.; Kloczko, T.; Digonnet, H.; Coupez, T., Stabilized finite element solution to handle complex heat and fluid flows in industrial furnaces using the immersed volume method, Int. J. Numer. Methods Fluids, 68, 99-121 (2012) · Zbl 1319.76027
[54] Codina, R., Stabilized finite element approximation of transient incompressible flows using orthogonal subscales, Comput. Methods Appl. Mech. Eng., 191, 4295-4321 (2002) · Zbl 1015.76045
[55] Hachem, E.; Feghali, S.; Codina, R.; Coupez, T., Immersed stress method for fluid-structure interaction using anisotropic mesh adaptation, Int. J. Numer. Methods Eng., 94, 805-825 (2013) · Zbl 1352.74108
[56] Codina, R., Comparison of some finite element methods for solving the diffusion-convection-reaction equation, Comput. Methods Appl. Mech. Eng., 156, 185-210 (1998) · Zbl 0959.76040
[57] Badia, S.; Codina, R., Analysis of a stabilized finite element approximation of the transient convection-diffusion equation using an ALE framework, SIAM J. Numer. Anal., 44, 2159-2197 (2006) · Zbl 1126.65079
[58] Brooks, A. N.; Hughes, T., Streamline upwind/Petrov-Galerkin formulations for convection dominated flows with particular emphasis on the incompressible Navier-Stokes equations, Comput. Methods Appl. Mech. Eng., 32, 199-259 (1982) · Zbl 0497.76041
[59] Galeão, A.; Do Carmo, E., A consistent approximate upwind Petrov-Galerkin method for convection-dominated problems, Comput. Methods Appl. Mech. Eng., 68, 83-95 (1988) · Zbl 0626.76091
[60] Hachem, E.; Digonnet, H.; Massoni, E.; Coupez, T., Immersed volume method for solving natural convection, conduction and radiation of a hat-shaped disk inside a 3d enclosure, Int. J. Numer. Methods Heat Fluid Flow (2012) · Zbl 1356.76165
[61] Hachem, E.; Digonnet, H.; Massoni, E.; Coupez, T., Immersed volume method for solving natural convection, conduction and radiation of a hat-shaped disk inside a 3d enclosure, Int. J. Numer. Methods Heat Fluid Flow, 22, 718-741 (2012) · Zbl 1356.76165
[62] Hachem, E.; Jannoun, G.; Veysset, J.; Henri, M.; Pierrot, R.; Poitrault, I.; Massoni, E.; Coupez, T., Modeling of heat transfer and turbulent flows inside industrial furnaces, Simul. Model. Pract. Theory, 30, 35-53 (2013)
[63] Goodfellow, I.; Bengio, Y.; Courville, A., The Deep Learning Book (2017), MIT Press
[64] Sutton, R. S.; Barto, A. G., Reinforcement Learning: An Introduction (2018), MIT Press · Zbl 1407.68009
[65] Kakade, S., A natural policy gradient, Adv. Neural Inf. Process. Syst., 14, 1531-1538 (2001)
[66] Schulman, J.; Levine, S.; Moritz, P.; Jordan, M. I.; Abbeel, P., Trust region policy optimization (Feb. 2015), arXiv e-prints
[67] Wang, Y.; He, H.; Tan, X.; Gan, Y., Trust region-guided proximal policy optimization (2019), arXiv preprint
[68] Hill, A.; Raffin, A.; Ernestus, M.; Gleave, A.; Kanervisto, A.; Traore, R.; Dhariwal, P.; Hesse, C.; Klimov, O.; Nichol, A.; Plappert, M.; Radford, A.; Schulman, J.; Sidor, S.; Wu, Y., Stable baselines (2018)
[69] Brockman, G.; Cheung, V.; Pettersson, L.; Schneider, J.; Schulman, J.; Tang, J.; Zaremba, W., OpenAI Gym (2016)
[70] de Vahl Davis, G.; Jones, I., Natural convection in a square cavity: a comparison exercise, Int. J. Numer. Methods Fluids, 3, 227-248 (1983) · Zbl 0538.76076
[71] Dixit, H.; Babu, V., Simulation of high Rayleigh number natural convection in a square cavity using the lattice Boltzmann method, Int. J. Heat Mass Transf., 49, 727-739 (2006) · Zbl 1189.76529
[72] Markatos, N.; Pericleous, K., Laminar and turbulent natural convection in an enclosed cavity, Int. J. Heat Mass Transf., 27, 772-775 (1984) · Zbl 0542.76112
[73] Barakos, G.; Mitsoulis, E., Natural convection flow in a square cavity revisited: laminar and turbulent models with wall functions, Int. J. Numer. Methods Fluids, 18, 695-719 (1994) · Zbl 0806.76055
[74] Khanafer, K.; Vafai, K.; Lightstone, M., Buoyancy-driven heat transfer enhancement in a two-dimensional enclosure utilizing nanofluids, Int. J. Heat Mass Transf., 46, 3639-3653 (2003) · Zbl 1042.76586
[75] Lazaric, A.; Restelli, M.; Bonarini, A., Reinforcement learning in continuous action spaces through sequential Monte Carlo methods, Adv. Neural Inf. Process. Syst., 20 (2007)
[76] Sari, J.; Cremonesi, F.; Khalloufi, M.; Cauneau, F.; Meliga, P.; Mesri, Y.; Hachem, E., Anisotropic adaptive stabilized finite element solver for RANS models, Int. J. Numer. Methods Fluids, 86, 717-736 (2018)
[77] Guiza, G.; Larcher, A.; Goetz, A.; Billon, L.; Meliga, P.; Hachem, E., Anisotropic boundary layer mesh generation for reliable 3D unsteady RANS simulations, Finite Elem. Anal. Des., 170, Article 103345 pp. (2020)
[78] Meliga, P.; Hachem, E., Time-accurate calculation and bifurcation analysis of the incompressible flow over a square cavity using variational multiscale modeling, J. Comput. Phys., 376, 952-972 (2019) · Zbl 1416.76120
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented/enhanced by data from zbMATH Open. The list attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.