×

Verification of general Markov decision processes by approximate similarity relations and policy refinement. (English) Zbl 1367.93615

Summary: In this work we introduce new approximate similarity relations that are shown to be key for policy (or control) synthesis over general Markov decision processes. The models of interest are discrete-time Markov decision processes, endowed with uncountably infinite state spaces and metric output (or observation) spaces. The new relations, underpinned by the use of metrics, allow, in particular, for a useful trade-off between deviations over probability distributions on states, and distances between model outputs. We show that the new probabilistic similarity relations, inspired by a notion of simulation developed for finite-state models, can be effectively employed over general Markov decision processes for verification purposes, and specifically for control refinement from abstract models.

MSC:

93E03 Stochastic systems in control theory (general)
90C40 Markov and semi-Markov decision processes
93B50 Synthesis problems
68Q60 Specification and verification (program logics, model checking, etc.)
93C55 Discrete-time control/observation systems

Software:

CVX; FAUST2
Full Text: DOI

References:

[1] A. Abate, {\it Approximation metrics based on probabilistic bisimulations for general state-space Markov processes: a survey}, Electron. Notes Theor. Comput. Sci., 297 (2013), pp. 3-25. · Zbl 1334.68128
[2] A. Abate, M. Kwiatkowska, G. Norman, and D. Parker, {\it Probabilistic model checking of labelled Markov processes via finite approximate bisimulations}, in Horizons of the Mind – P. Panangaden Festschrift, Springer, Cham, Switzerland,, 2014, pp. 40-58. · Zbl 1407.68275
[3] A. Abate, M. Prandini, J. Lygeros, and S. Sastry, {\it Probabilistic reachability and safety for controlled discrete time stochastic hybrid systems}, Automatica J. IFAC, 44 (2008), pp. 2724-2734. · Zbl 1152.93051
[4] A. Abate, F. Redig, and I. Tkachev, {\it On the effect of perturbation of conditional probabilities in total variation}, Statist. Probab. Lett., 88 (2014), pp. 1-80. · Zbl 1370.60006
[5] R. Alur, T. A. Henzinger, O. Kupferman, and M. Y. Vardi, {\it Alternating refinement relations}, in CONCUR ’98, Springer, London, 1998, pp. 163-178. · Zbl 1070.68524
[6] P. Bacher and H. Madsen, {\it Identifying suitable models for the heat dynamics of buildings}, Energy Build., 43 (2011), pp. 1511-1522.
[7] D. Bertsekas and S. E. Shreve, {\it Stochastic Optimal Control: The Discrete Time Case}, Athena Scientific, Belmont, MA, 1996. · Zbl 0471.93002
[8] R. Blute, J. Desharnais, A. Edalat, and P. Panangaden, {\it Bisimulation for labelled Markov processes}, in Proceeding of the 12th Annual IEEE Symposium on Logic in Computer Science, 1997 IEEE, IEEE Computer Society, Los Alamitos, CA, 1997, pp. 149-158. · Zbl 1096.68103
[9] V. I. Bogachev, {\it Measure Theory}, Springer, Berlin, 2007. · Zbl 1120.28001
[10] V. S. Borkar, {\it Probability Theory: An Advanced Course}, Springer, New York, 1995. · Zbl 0838.60001
[11] S. Boyd and L. Vandenberghe, {\it Convex Optimization}, Cambridge University Press, Cambridge, 2004. · Zbl 1058.90049
[12] M. L. Bujorianu, J. Lygeros, and M. C. Bujorianu, {\it Bisimulation for general stochastic hybrid systems}, in International Workshop on Hybrid Systems: Computation and Control, Springer, Berlin, 2005, pp. 198-214. · Zbl 1078.93062
[13] J. Desharnais, A. Edalat, and P. Panangaden, {\it A logical characterization of bisimulation for labeled Markov processes}, in Proceedings of the Thirteenth Annual IEEE Symposium on Logic in Computer Science, 1998, IEEE, IEEE Computer Society, Los Alamitos, CA, 1998, pp. 478-487.
[14] J. Desharnais, A. Edalat, and P. Panangaden, {\it Bisimulation for labelled Markov processes}, Inform. and Comput., 179 (2002), pp. 163-193. · Zbl 1096.68103
[15] J. Desharnais, V. Gupta, R. Jagadeesan, and P. Panangaden, {\it Approximating labelled Markov processes}, Inform. and Comput., 184 (2003), pp. 160-200. · Zbl 1028.68091
[16] J. Desharnais, V. Gupta, R. Jagadeesan, and P. Panangaden, {\it Metrics for labelled Markov processes}, Theoret. Comput. Sci., 318 (2004), pp. 323-354. · Zbl 1068.68093
[17] J. Desharnais, F. Laviolette, and M. Tracol, {\it Approximate analysis of probabilistic processes: Logic, simulation and games}, Conference on Quantitative Evaluation of Systems, IEEE Computer Society, Los Alamitos, CA, 2008, pp. 264-273.
[18] A. D’Innocenzo, A. Abate, and J.-P. Katoen, {\it Robust PCTL model checking}, in Proceedings of the 15th ACM International Conference on Hybrid Systems: Computation and Control, ACM, New York, 2012, pp. 275-285. · Zbl 1361.68140
[19] A. Edalat, {\it Semi-pullbacks and bisimulation in categories of Markov processes}, Math. Structures Comput. Sci., 9 (1999), pp. 523-543. · Zbl 0937.18005
[20] S. Esmaeil Zadeh Soudjani and A. Abate, {\it Adaptive and sequential gridding procedures for the abstraction and verification of stochastic processes}, SIAM J. Appl. Dyn. Syst., 12 (2013), pp. 921-956. · Zbl 1278.93243
[21] S. Esmaeil Zadeh Soudjani and A. Abate, {\it Quantitative approximation of the probability distribution of a Markov process by formal abstractions}, Log. Methods Comput. Sci., 11 (2015), 8. · Zbl 1342.60120
[22] S. Esmaeil Zadeh Soudjani, C. Gevaerts, and A. Abate, {\it FAUST \(^2\): Formal Abstractions of Uncountable-STate STochastic Processes}, in Tools and Algorithms for the Construction and Analysis of Systems (TACAS), Lecture Notes in Comput. Sci. 9035, Springer, Berlin, 2015, pp. 272-286.
[23] A. Girard and G. J. Pappas, {\it Hierarchical control system design using approximate simulation}, Automatica J. IFAC, 45 (2009), pp. 566-571. · Zbl 1158.93301
[24] M. Grant and S. Boyd, {\it CVX: Matlab Software for Disciplined Convex Programming, version} 2.1, (2014).
[25] S. Haesaert, A. Abate, and P. M. J. Van den Hof, {\it Correct-by-design output feedback of LTI systems}, in Proceedings of Conference on Decision and Control, IEEE, Piscataway, NJ, 2015, pp. 6159-6164.
[26] S. Haesaert, A. Abate, and P. M. J. Van den Hof, {\it Verification of general Markov decision processes by approximate similarity relations and policy refinement}, in 13th International Conference on Quantitative Evaluation of Systems, QEST 2016, Quebec City, Canada, Springer, 2016, pp. 227-243. · Zbl 1377.68131
[27] O. Hernández-Lerma and J. B. Lasserre, {\it Discrete-Time Markov Control Processes}, Appl. Math. 30, Springer, New York, 1996. · Zbl 0853.93106
[28] O. Holub and K. Macek, {\it HVAC simulation model for advanced diagnostics}, in Symposium on Intelligent Signal Processing, IEEE, Piscataway, NJ, 2013, pp. 93-96.
[29] B. Jonsson and K. G. Larsen, {\it Specification and refinement of probabilistic processes}, in Proceedings of the Sixth Annual IEEE Symposium on Logic in Computer Science, IEEE Computer Society, Los Alamitos, CA, 1991, pp. 266-277.
[30] A. A. Julius and G. J. Pappas, {\it Approximations of stochastic hybrid systems}, IEEE Trans. Automat. Control, 54 (2009), pp 1193-1203. · Zbl 1367.93618
[31] K. G. Larsen and A. Skou, {\it Bisimulation through probabilistic testing}, Inform. and Comput., 94 (1991), pp. 1-28. · Zbl 0756.68035
[32] T. Lindvall, {\it Lectures on the coupling method}, Dover, Mineola, NY, 2002. · Zbl 1013.60001
[33] M. Mazo Jr, A. Davitian, and P. Tabuada, {\it PESSOA: Towards the automatic synthesis of correct-by-design control software}, presented at Work-in-Progress HSCC, 2010.
[34] S. P. Meyn and R. L. Tweedie, {\it Markov Chains and Stochastic Stability}, Comm. Control Engrg., Springer, London, 1993. · Zbl 0925.60001
[35] G. Pola, C. Manes, A. J. van der Schaft, and M. D. Di Benedetto, {\it On equivalence notions for discrete-time stochastic control systems}, in 2015 54th IEEE Conference on Decision and Control, IEEE, Piscataway, NJ, 2015, pp. 1180-1185.
[36] R. Segala, {\it Modeling and Verification of Randomized Distributed Real-Time Systems}, Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, 1995.
[37] R. Segala and N. Lynch, {\it Probabilistic simulations for probabilistic processes}, Nordic J. Comput., 2 (1995), pp. 250-273. · Zbl 0839.68067
[38] H. J. Skala, {\it The existence of probability measures with given marginals}, Ann. Probab., 21 (1993), pp. 136-142. · Zbl 0768.60004
[39] V. Strassen, {\it The existence of probability measures with given marginals}, Ann. Math. Statist., 36 (1965), pp. 423-439. · Zbl 0135.18701
[40] S. Strubbe and A. Van Der Schaft, {\it Bisimulation for communicating piecewise deterministic Markov processes (CPDPS)}, in International Workshop on Hybrid Systems: Computation and Control, Springer, Berlin, 2005, pp. 623-639. · Zbl 1078.93065
[41] S. Strubbe and A. van der Schaft, {\it Communicating piecewise deterministic Markov processes}, in Stochastic Hybrid Systems, Springer, Berlin, 2006, pp. 65-104. · Zbl 1130.93051
[42] P. Tabuada, {\it Verification and Control of Hybrid Systems: A Symbolic Approach}, Springer, New York, 2009. · Zbl 1195.93001
[43] C. W. Therrien, {\it Discrete Random Signals and Statistical Signal Processing}, Prentice Hall, Englewood Cliffs, NJ, 1992. · Zbl 0747.94004
[44] M. Zamani, P. M. Esfahani, R. Majumdar, A. Abate, and J. Lygeros, {\it Symbolic control of stochastic systems via approximately bisimilar finite abstractions}, IEEE Trans. Automat. Control, 59 (2014), pp. 3135-3150. · Zbl 1360.93445
[45] C. Zhang and J. Pang, {\it On probabilistic alternating simulations}, in Theoretical Computer Science, C. Calude and V. Sassone, eds., IFIP Adv. Inf. Commun. Technol. 323, Springer, Berlin, 2010, pp. 71-85. · Zbl 1202.68464
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.