
Physics-constrained non-Gaussian probabilistic learning on manifolds. (English) Zbl 07841255

Summary: An extension of probabilistic learning on manifolds (PLoM), recently introduced by the authors, is presented: in addition to the initial data set given for performing the probabilistic learning, constraints are specified that correspond to statistics of experiments or of physical models. We consider a non-Gaussian random vector whose unknown probability distribution must satisfy these constraints. The method consists of constructing a generator using the PLoM and the classical Kullback-Leibler minimum cross-entropy principle. The resulting optimization problem is reformulated using Lagrange multipliers associated with the constraints. The optimal values of the Lagrange multipliers are computed with an efficient iterative algorithm. Each iteration uses the Markov chain Monte Carlo algorithm developed for the PLoM, which consists of solving an Itô stochastic differential equation projected on a diffusion-maps basis. The method and the algorithm are efficient and allow the construction of probabilistic models for high-dimensional problems from small initial data sets and with an arbitrary number of specified constraints. The first application is simple enough to be easily reproduced; the second concerns a stochastic elliptic boundary value problem in high dimension.
{© 2019 John Wiley & Sons, Ltd.}
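To make the construction described in the summary concrete, the following is a minimal sketch of the classical Kullback-Leibler minimum cross-entropy formulation with Lagrange multipliers; the notation (p_H for the density learned by PLoM, g for the constraint functions, b for the target statistics) is illustrative and not taken from the paper.

% Minimal sketch (illustrative notation, not the paper's): minimum
% cross-entropy relative to the PLoM density p_H under moment-type
% constraints E[g(H)] = b.
\[
  p^{\ast} \;=\; \arg\min_{p}\;
  \int_{\mathbb{R}^{\nu}} p(\mathbf{h})\,
  \log\frac{p(\mathbf{h})}{p_{\mathbf{H}}(\mathbf{h})}\, d\mathbf{h}
  \quad\text{subject to}\quad
  \int_{\mathbb{R}^{\nu}} \mathbf{g}(\mathbf{h})\, p(\mathbf{h})\, d\mathbf{h} = \mathbf{b},
  \qquad
  \int_{\mathbb{R}^{\nu}} p(\mathbf{h})\, d\mathbf{h} = 1 .
\]
% The stationarity conditions yield an exponential tilting of p_H by the
% Lagrange multipliers lambda associated with the constraints:
\[
  p_{\boldsymbol{\lambda}}(\mathbf{h})
  \;=\; c(\boldsymbol{\lambda})\, p_{\mathbf{H}}(\mathbf{h})\,
  \exp\bigl(-\langle \boldsymbol{\lambda}, \mathbf{g}(\mathbf{h}) \rangle \bigr),
\]
% where c(lambda) is the normalization constant and lambda is fixed by
% requiring E_lambda[g] = b, for instance with a Newton-type iteration
% whose gradient and Hessian (the covariance of g under p_lambda) are
% estimated from MCMC samples of the projected Ito SDE:
\[
  \boldsymbol{\lambda}^{i+1}
  \;=\; \boldsymbol{\lambda}^{i}
  \;-\; \bigl[\operatorname{Cov}_{\boldsymbol{\lambda}^{i}}(\mathbf{g})\bigr]^{-1}
  \bigl(\mathbf{b} - E_{\boldsymbol{\lambda}^{i}}[\mathbf{g}]\bigr).
\]

This update is the Newton step on the convex dual of the constrained problem; the paper's actual iterative algorithm may differ in its damping and in how the expectations are estimated.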

MSC:

65Cxx Probabilistic methods, stochastic differential equations
62Hxx Multivariate analysis
68Txx Artificial intelligence
Full Text: DOI
