Linear-Quadratic Stochastic Delayed Control and Deep Learning Resolution


Abstract

We consider a simple class of stochastic control problems with a delayed control, in both the drift and the diffusion part of the state stochastic differential equation. We provide a new characterization of the solution in terms of a set of Riccati partial differential equations. Existence and uniqueness of a solution are obtained under a sufficient condition expressed directly as a relation between the time horizon, the drift, the volatility and the delay. Furthermore, a deep learning scheme (the code is available in an IPython notebook) is designed and used to illustrate the effect of the delay feature on the Markowitz portfolio allocation problem with execution delay.

Notes

When $d=0$, recall that $\alpha _t^* =\frac{\lambda _1P_t}{\sigma _1(1-\rho ^2)}(1-\rho \frac{\lambda _2}{{\lambda _1}})(\xi ^*-X_t^*)$ and $\beta ^* =\frac{\lambda _2P_t}{\sigma _2(1-\rho ^2)}(1-\rho \frac{\lambda _1}{{\lambda _2}})(\xi ^*-X_t^*)$, with $P$ being a positive function and $\xi ^* \ge X^*$. Thus, in the classical setting, the buy or sell thresholds are $(1-\rho \frac{\lambda _2}{{\lambda _1}})$ and $(1-\rho \frac{\lambda _1}{{\lambda _2}})$.

References

Alekal, Y., Brunovsky, P., Chyung, D., Lee, E.: The quadratic problem for systems with time delays. IEEE Trans. Autom. Control 16(6), 673–687 (1971)
Asea, P.K., Zak, P.J.: Time-to-build and cycles. J. Econ. Dyn. Control 23(8), 1155–1175 (1999)
Bambi, M.: Endogenous growth and time-to-build: the AK case. J. Econ. Dyn. Control 32(4), 1015–1040 (2008)
Bambi, M., Fabbri, G., Gozzi, F.: Optimal policy and consumption smoothing effects in the time-to-build AK model. Econ. Theory 50(3), 635–669 (2012)
Bensoussan, A., Da Prato, G., Delfour, M.C., Mitter, S.: Representation and Control of Infinite Dimensional Systems, 2nd edn. Birkhäuser, Boston, MA (2007)
Carmona, R., Fouque, J.-P., Mousavi, M., Sun, L.-H.: Systemic risk and stochastic games with delay. J. Optim. Theory Appl. 179(2), 366–399 (2018)
d'Albis, H., Augeraud-Véron, E., Venditti, A.: Business cycle fluctuations and learning-by-doing externalities in a one-sector model. J. Math. Econ. 48(5), 295–308 (2012)
El Karoui, N.: Les Aspects Probabilistes Du Controle Stochastique. 9th Saint Flour Probability Summer School-1979. Lecture Notes in Mathematics, vol. 876, pp. 73–238. Springer, Berlin (1981)
Fabbri, G., Federico, S.: On the infinite-dimensional representation of stochastic controlled systems with delayed control in the diffusion term. Math. Econ. Lett. 2(3–4), 33–43 (2014)
Fouque, J.-P., Zhang, Z.: Deep learning methods for mean field control problems with delay. Front. Appl. Math. Stat. 6, 11 (2020)
Gozzi, F., Marinelli, C.: Stochastic optimal control of delay equations arising in advertising models. In: Stochastic Partial Differential Equations and Applications—VII. Lecture Notes in Pure and Applied Mathematics, vol. 245, pp. 133–148. Chapman & Hall/CRC, Boca Raton (2006)
Gozzi, F., Marinelli, C., Savin, S.: On controlled linear diffusions with delay in a model of optimal advertising under uncertainty with memory effects. J. Optim. Theory Appl. 142(2), 291–321 (2009)
Hall, R.E., Sims, C.A., Modigliani, F., Brainard, W.: Investment, interest rates, and the effects of stabilization policies. Brook. Pap. Econ. Act. 1977(1), 61–121 (1977)
Han, J., Hu, R.: Recurrent neural networks for stochastic control problems with delay. Math. Control Signals Syst. (2021). https://doi.org/10.1007/s00498-021-00300-3
Huzmezan, M., Gough, W.A., Dumont, G.A., Kovac, S.: Time delay integrating systems: a challenge for process control industries. A practical solution. Control Eng. Pract. 10(10), 1153–1161 (2002)
Ichikawa, A.: Quadratic control of evolution equations with delays in control. SIAM J. Control Optim. 20(5), 645–668 (1982)
Jarlebring, E., Damm, T.: The Lambert W function and the spectrum of some multidimensional time-delay systems. Automatica 43(12), 2124–2128 (2007)
Kydland, F.E., Prescott, E.C.: Time to build and aggregate fluctuations. Econometrica 50(6), 1345–1370 (1982)
Markowitz, H.: Portfolio selection. J. Financ. 7(1), 77–91 (1952)
Pauwels, W.: Optimal dynamic advertising policies in the presence of continuously distributed time lags. J. Optim. Theory Appl. 22(1), 79–89 (1977)
Pham, H.: Continuous-Time Stochastic Control and Optimization with Financial Applications. Springer, Berlin (2009)
Raissi, M., Perdikaris, P., Karniadakis, G.E.: Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 378, 686–707 (2019)
Sethi, S.P.: Sufficient conditions for the optimal control of a class of systems with continuous lags. J. Optim. Theory Appl. 13(5), 545–552 (1974)
Sipahi, R., Niculescu, S.-I., Abdallah, C.T., Michiels, W., Gu, K.: Stability and stabilization of systems with time delay. IEEE Control Syst. Mag. 31(1), 38–65 (2011)
Sirignano, J., Spiliopoulos, K.: DGM: A deep learning algorithm for solving partial differential equations. J. Comput. Phys. 375, 1339–1364 (2018)
Tian, Y.-C., Gao, F.: Control of integrator processes with dominant time delay. Ind. Eng. Chem. Res. 38(8), 2979–2983 (1999)
Tsoukalas, J.D.: Time to build capital: revisiting investment-cash flow sensitivities. J. Econ. Dyn. Control 35(7), 1000–1016 (2011)

Acknowledgements

We would like to thank Salvatore Federico and Huyên Pham for their useful comments and remarks that helped to improve this article. The authors are also grateful to the handling Associate Editor and anonymous reviewer for comments and suggestions that contributed to improving the quality of the manuscript. This work is issued from a CIFRE collaboration between BNP Paribas Global Markets and LPSM.

Author information

Authors and Affiliations

BNP Paribas Global Markets, 20 Boulevard des Italiens, 75009, Paris, France
William Lefebvre
Laboratoire de Probabilités, Statistique et Modélisation (LPSM, UMR CNRS 8001), Université de Paris and Sorbonne Université, Building Sophie Germain, Avenue de France, 75013, Paris, France
William Lefebvre & Enzo Miller

Authors

William Lefebvre
Enzo Miller

Corresponding author

Correspondence to William Lefebvre.

Additional information

Communicated by Bruno Bouchard.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Proof of Proposition 3.3

Our proof extends [1, Theorem 5] to the case where the volatility is controlled. It consists in cutting the domain ${\mathcal {D}}$ into slices of width $d$ and proceeding by backward recursion. More precisely, we show existence and uniqueness over a sequence of slices $\left( [T-(n+1)d, T-nd] \times [-d, 0]^2\right) _{n}$. We then concatenate the sequence of absolutely continuous solutions obtained, which yields a piecewise absolutely continuous solution. In each slice, the proof consists of the following steps:

(1) Show that there exists a unique solution on a small interval;
(2) Prove that the local solution is Lipschitz;
(3) As a result, extend the solution to the whole slice.

We finally concatenate the sequence of solutions obtained above.

A.1 Slice $t \in [T-d, T]$, Initialization

On ${\mathcal {D}}_b \cup {\mathcal {D}}_c$, the constraints (2.9)–(2.10)–(2.11) on $P_{12}, P_{\hat{22}}$ and $P_{22}$ reduce to linear homogeneous transport equations admitting closed form solutions given, for every $(t,s,r)\in {\mathcal {D}}_{b} \cup {\mathcal {D}}_c$, by

$$\begin{aligned}&P_{12}(t,s) = b P_{11}(t+s+d)1_{t+s+d \le T}, \qquad P_{\hat{22}}(t,s) = \sigma ^2 P_{11}(t+s+d)1_{t+s+d \le T}, \\&P_{22}(t,s,r) = b^2 P_{11}(t+ s \vee r + d)1_{t+s\vee r+d \le T}. \end{aligned}$$

Now, since $P_{11}(t) = 1$ for any $t \ge T-d$, we have, for every $(t,s,r)\in {\mathcal {D}}_{b} \cup {\mathcal {D}}_c$,

$$\begin{aligned}&P_{11}(t) = 1, \qquad P_{12}(t,s) = b1_{t+s+d \le T}, \\&P_{\hat{22}}(t,s) = \sigma ^21_{t+s+d \le T}, \qquad P_{22}(t,s,r) = b^21_{t+s\vee r+d \le T}. \end{aligned}$$

The existence and uniqueness in the sense of Definition 2.6 are thus trivially proved on $[T-d, T]$.
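
As a quick sanity check, the closed-form expressions above can be evaluated directly. The sketch below is a minimal illustration and not the authors' code: the function name `last_slice_kernels` and the dictionary keys are hypothetical, and the function only evaluates the displayed formulas for $t \in [T-d, T]$ without testing membership in ${\mathcal {D}}_b \cup {\mathcal {D}}_c$, whose definition is not reproduced in this appendix.

```python
def last_slice_kernels(t, s, r, T, d, b, sigma):
    """Evaluate the closed-form kernels on the slice t in [T - d, T].

    On this slice P11 = 1, so every kernel is a constant times an indicator.
    Names and return format are illustrative assumptions.
    """
    if not (T - d <= t <= T and -d <= s <= 0.0 and -d <= r <= 0.0):
        raise ValueError("expected T - d <= t <= T and -d <= s, r <= 0")
    # Indicator 1_{t + s + d <= T} used by P12 and P22_hat
    ind_s = 1.0 if t + s + d <= T else 0.0
    # Indicator 1_{t + max(s, r) + d <= T} used by P22
    ind_sr = 1.0 if t + max(s, r) + d <= T else 0.0
    return {
        "P11": 1.0,
        "P12": b * ind_s,
        "P22_hat": sigma ** 2 * ind_s,
        "P22": b ** 2 * ind_sr,
    }


if __name__ == "__main__":
    print(last_slice_kernels(0.8, -0.1, -0.2, T=1.0, d=0.25, b=0.5, sigma=2.0))
```

The indicators switch the kernels off as soon as the delayed control would take effect after the horizon $T$.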

A.2 Slice $[T - 2d, T- d]$

On $[T-2d, T-d]\times [-d, 0]^2$, we have $P_{\hat{22}}(t,s) = \sigma ^2 P_{11}(t+s+d)$ so that $P_{\hat{22}}(t,0) = \sigma ^2 P_{11}(t+d)=\sigma ^2$. Consequently, the system (2.9)–(2.10)–(2.11) reduces to

$$\begin{aligned}&\dot{P}_{11}(t) =\frac{P_{12}(t,0)^{2}}{\sigma ^2}, \nonumber \\&(\partial _t - \partial _s)(P_{12})(t,s) =\frac{P_{12}(t,0)P_{22}(t,s,0)}{\sigma ^2}, \nonumber \\&(\partial _t - \partial _s-\partial _r)(P_{22})(t,s,r) =\frac{P_{22}(t,s,0)P_{22}(t,0,r)}{\sigma ^2}, \end{aligned}$$

(A.1)

with terminal conditions

$$\begin{aligned} P_{11}(T-d) =1, \quad P_{12}(T-d, s) = b, \quad P_{22}(T-d, s, r) = b^2, \end{aligned}$$

(A.2)

and boundary constraints

$$\begin{aligned}&P_{12}(t,-d) = b P_{11}(t), \quad P_{22}(t,s,-d) = b P_{12}(t,s). \end{aligned}$$

(A.3)

Thus, for every $(t,s,r)\in [T-2d, T-d]\times [-d,0]^2$, the set of equations (A.1) and constraints (A.2)–(A.3) can be rewritten in the following integral form

$$\begin{aligned} P_{11}(t)&= 1- \sigma ^{-2}\int _{t}^{T-d} {P_{12}(x,0)^{2}}\mathrm{d}x,\nonumber \\ P_{12}(t,s)&= b P_{11}((T-d) \wedge (t+s+d)) \nonumber \\&\quad - \sigma ^{-2} \int _{t}^{(T-d) \wedge (t+s+d)} {P_{12}(x,0)P_{22}(x,t+s-x,0)}\mathrm{d}x,\nonumber \\ P_{22}(t,s,r)&= b P_{12}((T-d) \wedge (t+s \wedge r+d),(s-r) \vee (r-s)-d)\nonumber \\&\quad - \sigma ^{-2} \int _{t}^{(T-d) \wedge (t+s \wedge r+d)} {P_{22}(x,t+s-x,0) P_{22}(x,0,t+r-x)}\mathrm{d}x. \end{aligned}$$

(A.4)
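
To see where the second line of (A.4) comes from, one can integrate the transport equation for $P_{12}$ along its characteristic. The following is a sketch of that computation; the auxiliary notation $g$ and $u^*$ is introduced here for illustration only and does not appear in the original argument.

$$\begin{aligned} g(u)&:= P_{12}(t+u, s-u), \qquad 0 \le u \le u^* := (T-d-t) \wedge (s+d),\\ g'(u)&= (\partial _t - \partial _s)(P_{12})(t+u,s-u) = \sigma ^{-2} P_{12}(t+u,0)P_{22}(t+u,s-u,0),\\ g(u^*)&= b P_{11}((T-d) \wedge (t+s+d)),\\ P_{12}(t,s)&= g(0) = g(u^*) - \int _0^{u^*} g'(u)\mathrm{d}u \\&= b P_{11}((T-d) \wedge (t+s+d)) - \sigma ^{-2}\int _{t}^{(T-d) \wedge (t+s+d)} P_{12}(x,0)P_{22}(x,t+s-x,0)\mathrm{d}x. \end{aligned}$$

The value of $g(u^*)$ follows from the terminal condition (A.2) when the characteristic first reaches $t+u = T-d$ (recall $P_{11}(T-d)=1$), and from the boundary constraint (A.3) when it first reaches $s-u=-d$. The last equality uses the change of variable $x = t+u$. A similar computation along $u \mapsto (t+u, s-u, r-u)$ is behind the third line.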

We then make use of the following lemma to prove local existence of a solution.

Lemma A.1

There exists $\tau \in (0, d]$ such that system (A.4) has a unique absolutely continuous solution on $[T-\tau -d, T-d]\times [-d, 0]^2$.

Proof

Let $\tau \in (0, d]$ and ${\mathcal {S}}_\tau $ denote the Banach space of absolutely continuous functions $\xi =\left( \xi _{1}(\cdot ),\xi _{2} (\cdot ,\cdot ),\xi _{3}(\cdot ,\cdot ,\cdot )\right) $ defined on

$$\begin{aligned} {\mathcal {D}}_{\tau }=\left\{ (t,s,r)|\ T-d-\tau \le t\le T-d, -d\le s,r\le 0 \right\} , \end{aligned}$$

endowed with the sup-norm

$$\begin{aligned} \Vert \xi \Vert _\infty =\Vert \xi _1\Vert _\infty + \Vert \xi _2\Vert _\infty + \Vert \xi _3\Vert _\infty , \end{aligned}$$

where $\Vert \xi _1\Vert _\infty , \Vert \xi _2\Vert _\infty $ and $\Vert \xi _3\Vert _\infty $ denote, with a slight abuse of notation, the respective sup-norm on $[T-d-\tau , T-d]$, $[T-d-\tau , T-d]\times [-d, 0]$ and $[T-d-\tau , T-d]\times [-d, 0]^2$. Let ${\mathcal {B}}_\tau $ denote the ball in ${\mathcal {S}}_\tau $

$$\begin{aligned} {\mathcal {B}}_\tau = \{ (\xi _1, \xi _2, \xi _3) \in {\mathcal {S}}_\tau : \quad \Vert \xi _1-1\Vert _{{ \infty }} \le 1/2, \quad \Vert \xi _2 -b\Vert _{{ \infty }} \le |b|/2, \quad \Vert \xi _3 - b^2\Vert _{{ \infty }} \le b^2/2 \}, \end{aligned}$$

On ${\mathcal {B}}_\tau $, we denote by $\phi =\left( \phi _{1},\phi _{2},\phi _{3}\right) $ the operator defined as follows

$$\begin{aligned} \left( \phi _{1}\xi \right) (t)&= 1 - \sigma ^{-2}\int _{t}^{T-d}{\xi _{2}(x,0)^{2}}\mathrm{d}x,\\ \left( \phi _{2}\xi \right) (t,s)&= b \phi _1(\xi )((T-d) \wedge (t+s+d)) \\&\quad -\sigma ^{-2}\int _{t}^{(T-d) \wedge (t+s+d)} {\xi _{2}(x,0)\xi _{3}(x,t+s-x,0)}\mathrm{d}x,\\ \left( \phi _{3}\xi \right) (t,s,r)&= {b\phi _2(\xi ) \left( (T-d) \wedge (t+s \wedge r+d),(s-r) \vee (r-s)-d \right) }\\&\quad -\sigma ^{-2}\int _{t}^{(T-d) \wedge (t+s \wedge r+d)} {\xi _{3} (x,t+s-x,0)\xi _{3}(x,0,t+r-x)} \mathrm{d}x. \end{aligned}$$

Clearly, there exists $\tilde{\tau }>0$ such that for any $\tau \le \tilde{\tau } $, $\phi ({\mathcal {B}}_{ \tau }) \subset {\mathcal {B}}_{ \tau }$. We now show that $\phi $ is a contraction. For any $\xi , \xi ' \in {\mathcal {B}}_{\tau }$, we have the following inequalities

$$\begin{aligned} \Vert \phi _{1}(\xi )-\phi _{1}(\xi ')\Vert _\infty&\le 4\tau \sigma ^{-2} |b| \Vert \xi _2 - \xi _2' \Vert _\infty ,\\ \Vert \phi _{2}(\xi )- \phi _{2}(\xi ') \Vert _{\infty }&\le 4\tau \sigma ^{-2} \left( |b| \Vert \xi _2-\xi _2' \Vert _\infty + |b|^2 \Vert \xi _3-\xi _3' \Vert _\infty \right) \\&\quad + |b| \Vert \phi _1(\xi ) - \phi _1( \xi ')\Vert _\infty ,\\ \Vert \phi _{3}(\xi ) - \phi _{3}(\xi ')\Vert _\infty&\le |b| \Vert \phi _2(\xi ) -\phi _2(\xi ')\Vert _\infty + 4\tau \sigma ^{-2} |b|^2 \Vert \xi _3-\xi _3'\Vert _\infty . \end{aligned}$$
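
For instance, the first inequality can be checked as follows (a sketch of the omitted step). On ${\mathcal {B}}_\tau $ one has $\Vert \xi _2\Vert _\infty \le 3|b|/2$, so that, for any $t \in [T-d-\tau , T-d]$,

$$\begin{aligned} \left| (\phi _1\xi )(t) - (\phi _1\xi ')(t) \right|&\le \sigma ^{-2}\int _t^{T-d} \left| \xi _2(x,0)-\xi _2'(x,0)\right| \left| \xi _2(x,0)+\xi _2'(x,0)\right| \mathrm{d}x\\&\le 3\tau \sigma ^{-2}|b| \Vert \xi _2-\xi _2'\Vert _\infty \le 4\tau \sigma ^{-2}|b| \Vert \xi _2-\xi _2'\Vert _\infty . \end{aligned}$$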

Consequently, the operator $\phi $ satisfies

$$\begin{aligned} \Vert \phi (\xi ) - \phi (\xi ')\Vert _\infty \le \tau m \Vert \xi -\xi '\Vert _\infty , \end{aligned}$$

where $m>0$ depends on $b$ and $\sigma $. Therefore, for $\tau < \tilde{\tau } \wedge m^{-1}$, the operator $\phi $ is a contraction of ${\mathcal {B}}_\tau $ into itself. Thus, $\phi $ admits a unique fixed point in ${\mathcal {B}}_\tau $, which is the solution to (A.4) on ${\mathcal {D}}_\tau $. $\square $
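
The fixed-point mechanism can be illustrated numerically on a toy problem. The sketch below is not the system (A.4): it is a scalar analogue, $\xi (t) = 1 - c\int _t^{T_0} \xi (x)^2 \mathrm{d}x$ with $c$ playing the role of $(b/\sigma )^2$, chosen here only because its exact solution $\xi (t) = 1/(1+c(T_0-t))$ is known. The function name `picard_scalar`, the grid size and the trapezoidal quadrature are all assumptions made for the illustration.

```python
def picard_scalar(c, tau, n_grid=201, n_iter=30):
    """Picard iteration for xi(u) = 1 - c * int_0^u xi(v)^2 dv on [0, tau].

    Here u = T0 - t is the time to the right endpoint T0, so this is the
    backward equation xi(t) = 1 - c * int_t^{T0} xi(x)^2 dx rewritten forward.
    Returns the grid, the last iterate and the sup-norm gaps between
    successive iterates.
    """
    h = tau / (n_grid - 1)
    grid = [k * h for k in range(n_grid)]
    xi = [1.0] * n_grid  # initial guess: the terminal value everywhere
    gaps = []
    for _ in range(n_iter):
        new = [1.0]
        integral = 0.0
        for k in range(1, n_grid):
            # trapezoidal rule on [grid[k-1], grid[k]] with the current iterate
            integral += 0.5 * h * (xi[k - 1] ** 2 + xi[k] ** 2)
            new.append(1.0 - c * integral)
        gaps.append(max(abs(a - b) for a, b in zip(new, xi)))
        xi = new
    return grid, xi, gaps


if __name__ == "__main__":
    grid, xi, gaps = picard_scalar(c=0.5, tau=0.5)
    exact = [1.0 / (1.0 + 0.5 * u) for u in grid]
    print("max error vs exact solution:", max(abs(a - e) for a, e in zip(xi, exact)))
    print("first gaps between iterates:", gaps[:5])
```

On a short interval the gaps between successive iterates shrink quickly, which is the behaviour the contraction estimate predicts for small $\tau $.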

Lemma A.2

Let $\xi = (\xi _1, \xi _2, \xi _3)$ denote the absolutely continuous solution of (A.4) on ${\mathcal {D}}_{\tau }$ from Lemma A.1. Then $\xi $ is Lipschitz in each variable on ${\mathcal {D}}_\tau $.

Proof

As $\xi _1$, $\xi _2$ and $\xi _3$ are continuous on the compact set ${\mathcal {D}}_\tau $, there exists a constant $m>0$ such that $|\xi _1| \vee |\xi _2| \vee |\xi _3| \le m$ on ${\mathcal {D}}_\tau $. Thus, $\xi _1$ is Lipschitz with constant $\kappa =m^2\sigma ^{-2}$. Let us now show that $\xi _2$ and $\xi _3$ are Lipschitz in the $s$-variable. Fix $t \in [T-d - \tau , T-d]$ and $\eta >0$. Then, for any $s \in [-d, 0]$ such that $s+\eta \in [-d, 0]$, we have

$$\begin{aligned} \left| \xi _2(t,s) - \xi _2(t,s+\eta ) \right|&\le \kappa \eta +{\sigma ^{-2}} \Bigg | \int _{t}^{(T-d) \wedge (t+s+\eta +d)} \xi _2(x,0)\xi _3(x,t+s+\eta -x,0)\mathrm{d}x\\&\quad - \int _{t}^{(T-d) \wedge (t+s+d)} \xi _2(x,0)\xi _3(x,t+s-x,0)\mathrm{d}x \Bigg | \\&\le \kappa \eta + \mathbf{I}(t,s) + \mathbf{II}(t,s), \end{aligned}$$

where $\mathbf{I}$ collects the contribution of the common integration range and $\mathbf{II}$ that of the remaining range. Since $|\xi _2| \le m$, this yields

$$\begin{aligned} \mathbf{I}(t,s)&\le \int _t^{(T-d) \wedge (t+s+d)} \big |\xi _2(x,0)\big | \big | \xi _3(x,t+s+\eta -x,0) - \xi _3(x,t+s-x,0) \big |\mathrm{d}x \\&\le m \int _t^{(T-d) \wedge (t+s+d)} \epsilon (x) \mathrm{d}x, \end{aligned}$$

where $\epsilon $ is defined as

$$\begin{aligned} \epsilon (x) =\sup _{(s,r) \in [-d,0]^2} \left| \xi _3(x,s,r)- \xi _3(x,s + \eta ,r) \right| + \sup _{s\in [-d,0]}\left| \xi _2(x,s)-\xi _2(x,s + \eta ) \right| . \end{aligned}$$

Furthermore, as $|\xi _2| \vee |\xi _3| \le m$ on ${\mathcal {D}}_\tau $, we have

$$\begin{aligned} \mathbf{II}(t,s)&\le \int _{(T-d) \wedge (t+s+d) }^{(T-d) \wedge (t+s+\eta +d)} |\xi _2(x,0)\xi _3(x,t+s+\eta -x,0)|\mathrm{d}x \\&\le m^2 \eta . \end{aligned}$$

Consequently, for any $t \in [T-d-\tau , T-d]$, we obtain

$$\begin{aligned} \underset{s}{\sup }\left| \xi _2(t,s)- \xi _2(t,s+\eta ) \right| \le m^2 \eta + m \int _t^{T-d} \epsilon (r)\mathrm{d}r. \end{aligned}$$

(A.5)

Looking at the equation of $\xi _3$ in system (A.4), we obtain in a similar manner

$$\begin{aligned} \left| \xi _3(t,s,r)-\xi _3(t,s+\eta ,r) \right| \le&|b| \mathbf{I}(t,s,r) +\sigma ^{-2}{} \mathbf{II}(t,s,r). \end{aligned}$$

(A.6)

An application of the triangle inequality, combined with (A.5) and the Lipschitz continuity of $\xi _1$, leads to

$$\begin{aligned} \mathbf{I}(t,s,r)&\le |\xi _2((T-d) \wedge (t+(s+\eta ) \wedge r+d), (s+\eta -r) \vee (r-(s+\eta ))-d) \nonumber \\&\quad - \xi _2((T-d) \wedge (t+s \wedge r+d),(s-r) \vee (r-s)-d)|\nonumber \\&\le (\kappa + m^2(1+ \sigma ^{-2})) \eta + m \int _{(T-d) \wedge (t+s \wedge r+d)}^{T-d} \epsilon (x)\mathrm{d}x\nonumber \\&\le (1+2\kappa ) \eta + m \int _{t}^{T-d} \epsilon (x)\mathrm{d}x. \end{aligned}$$

(A.7)

Furthermore

$$\begin{aligned} \mathbf{II}(t,s,r)&\le \Big |\int _{t}^{(T-d) \wedge (t+(s+\eta ) \wedge r+d)} \xi _3(x,t+s+\eta -x,0)\xi _3(x,0,t+r-x)\mathrm{d}x\nonumber \\&\quad -\int _{t}^{(T-d) \wedge (t+s \wedge r+d)} \xi _3(x,t+s-x,0)\xi _3(x,0,t+r-x)\mathrm{d}x \Big |\nonumber \\&\le \int _{t}^{(T-d) \wedge (t+s \wedge r+d)} |\xi _3(x,0,t+r-x)| |\xi _3(x,t+s-x,0) \nonumber \\&\quad - \xi _3(x,t+(s+\eta )-x,0)| \mathrm{d}x\nonumber \\&\quad + \int ^{(T-d) \wedge (t+(s+\eta ) \wedge r+d)}_{(T-d) \wedge (t+s \wedge r+d)} | \xi _3(x,t+(s+\eta )-x,0)\xi _3(x,0,t+r-x) | \mathrm{d}x\nonumber \\&\le m^2 \eta + \int _t^{T-d} \epsilon (x) \mathrm{d}x. \end{aligned}$$

(A.8)

Thus, inequalities (A.6), (A.7) and (A.8) together yield the existence of a positive constant $c>0$, independent of $\eta $, such that

$$\begin{aligned} \sup _{(s,r) \in [-d,0]^2} \left| \xi _3(t,s,r) -\xi _3(t,s+\eta ,r) \right| \le c \left( \eta + \int _t^{T-d} \epsilon (x) \mathrm{d}x \right) , \end{aligned}$$

which, combined with (A.5) leads, for any $t\in [T-d-\tau , T-d]$, to

$$\begin{aligned} \epsilon (t) \le c \left( \eta + \int _t^{T-d} \epsilon (r)\mathrm{d}r \right) . \end{aligned}$$

Consequently, an application of Gronwall’s lemma yields $\epsilon (t) \le m' \eta $ on $[T-d-\tau , T-d]$, with $m'>0$ independent of $\eta $. Thus, $\xi _2$ and $\xi _3$ are Lipschitz in the $s$-variable. The arguments showing that $\xi _2$ and $\xi _3$ are Lipschitz in the $t$-variable and that $\xi _3$ is Lipschitz in the $r$-variable follow the same lines. $\square $
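
For reference, the backward form of Gronwall's lemma invoked in the proof above can be written out explicitly. This is the standard statement, transcribed with the notation of the proof: for a nonnegative bounded function $\epsilon $ on $[T-d-\tau , T-d]$,

$$\begin{aligned} \epsilon (t) \le c \eta + c\int _t^{T-d} \epsilon (x)\mathrm{d}x \quad \Longrightarrow \quad \epsilon (t) \le c \eta \, e^{c (T-d-t)} \le c e^{c\tau } \eta , \qquad t \in [T-d-\tau , T-d], \end{aligned}$$

so that one admissible choice of the constant is $m' = c e^{c\tau }$.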

Lemma A.3

There exists a unique absolutely continuous solution $\xi =(\xi _1, \xi _2, \xi _3)$ of (A.4) on $[T-2 d, T-d]\times [-d,0]^2$ such that $\xi _1\ge 1- d \left( \frac{b}{\sigma } \right) ^2 >0$.

Proof

Let $\theta \in [T - 2d, T - d)$ denote the infimum of all times $\theta '$ such that there exists an absolutely continuous solution $(\xi _1, \xi _2, \xi _3)$ to (A.4) on $[\theta ', T-d]$. Assume $\theta > T - 2 d$. From Lemma A.2, $\xi _1$, $\xi _2$ and $\xi _3$ are Lipschitz in each variable and thus admit a limit as $t \rightarrow \theta $, which is Lipschitz. Therefore, the argument of Lemma A.1 can be repeated to extend the existence and uniqueness of the solution of system (A.4) to $[\theta ', T-d]$ for some $T-2d \le \theta ' < \theta $, which contradicts the definition of $\theta $. As a result, we necessarily have $\theta = T - 2 d$. It remains to prove that $\xi _1 > 0$. For this, note that since $\xi _1$ is a solution to (A.4), we have

$$\begin{aligned} \Vert \xi _1 - 1\Vert _\infty \le \frac{d}{\sigma ^{2}} \sup _{t \in [T-2 d, T-d]} |\xi _2(t,0)|^2. \end{aligned}$$

(A.9)

By substituting the terminal condition (A.2) into system (A.4), one notes that $t\in [T-2 d, T-d] \mapsto \xi _2(t, 0)$ is a solution to

$$\begin{aligned} \xi _2(t, 0) = b - \sigma ^{-2}\int _t^{T-d} \xi _2(x, 0)\xi _3(x,t-x, 0) \mathrm{d}x, \qquad T-2 d \le t \le T-d. \end{aligned}$$

But for every $t \in [T-2d , T-d]$, the function $f_t : x\in [t, T-d] \mapsto f_t(x) := \xi _3(x,t-x,0)$ takes only positive values, as $f_t$ is a solution to the system

$$\begin{aligned} f_t(x)&= b^2 - \sigma ^{-2}\int _x^{T-d} f_t(u) \xi _3(u, 0,x-u) \mathrm{d}u, \qquad x \in [t,T-d],\\ f_t(T-d)&= b^2, \end{aligned}$$

which can be proven to admit, through a contraction proof in the Banach space $C([t,T-d], {\mathbb {R}})$, a unique positive solution since $\xi $ and its derivatives are bounded. Similarly, we also have $\xi _2(t,0) \ge 0$ for any $t \in [T-2d, T-d]$. As a result, we have $\textit{sign}(\xi _2) = \textit{sign}(b)$ and

$$\begin{aligned} \sup _{t \in [T-2 d, T-d]} |\xi _2(t,0)| \le |b|. \end{aligned}$$

(A.10)

Consequently, (A.9) and (A.10) yield that for any $T-2 d \le t \le T-d$, we have $\xi _1 \ge 1- d \left( \frac{b}{\sigma } \right) ^2 = a_2 >0$ as ${\mathcal {N}}(d, b, \sigma )$ is assumed to be greater than 2. $\square $

Finally, by setting $P_{11}(t) = \xi _1(t)$, $P_{12}(t,s) =\xi _2(t,s)$, $P_{22}(t,s,r)=\xi _3(t,s,r)$ and $P_{\hat{22}}(t,s) =\sigma ^2 P_{11}(t+s+d)$ for any $(t,s,r) \in [T-2 d, T-d] \times [-d, 0]^2$, Lemma A.3 yields the existence and uniqueness of a solution $P$ to (2.9)–(2.10)–(2.11) in the sense of Definition 2.6 on $[T-2 d, T-d]$. The concatenation of the unique solutions of (2.9)–(2.10)–(2.11) on $[T-d, T]$ and $[T-2d, T-d]$ leads to a unique solution on $[T-2d, T]$.

A.3 From Slice $[T- n d, T]$ to $[T-(n+1)d, T]$

Let n be an integer such that $2 \le n < {\mathcal {N}}(d, b, \sigma )$. Assume that there exists a solution P to (2.9)–(2.10)–(2.11) in the sense of Definition 2.6 on $[T-nd, T]$ such that $0<a_n \le P_{11}(t) \le 1$, for any $t \ge T-nd$. Recall the Definition 3.7 of $(a_n)_{n \ge 0}$. Consider the following system on $[T-(n+1)d, T-n d]\times [-d, 0]^2$

$$\begin{aligned} P_{11}(t)&= P_{11}(T-nd)- \int _{t}^{T-nd} \frac{P_{12}(x,0)^{2}}{\sigma ^2 P_{11} (x+d)}\mathrm{d}x,\nonumber \\ P_{12}(t,s)&= b P_{11}((T-nd) \wedge (t+s+d))\nonumber \\&\quad -\int _{t}^{(T-n d) \wedge (t+s+d)} \frac{P_{12}(x,0)P_{22}(x,t+s-x,0)}{\sigma ^2 P_{11} (x+d)}\mathrm{d}x,\nonumber \\ P_{22}(t,s,r)&= b P_{12}((T-nd) \wedge (t+s \wedge r+d),(s-r) \vee (r-s)-d)\nonumber \\&\quad -\int _{t}^{(T-nd) \wedge (t+s \wedge r+d)} \frac{P_{22}(x,t+s-x,0) P_{22}(x,0,t+r-x)}{\sigma ^2 P_{11} (x+d)}\mathrm{d}x. \end{aligned}$$

(A.11)

Note that this system is the same as (A.4), the only difference being the term $x\in [T-(n+1)d, T-nd] \mapsto P_{11}(x+d)$, which comes from the previous slice $[T-nd, T-(n-1)d]$. Therefore, by the induction hypothesis, it can be treated as a positive continuous coefficient. As a result, existence and uniqueness on $[T-(n+1)d, T-nd]$ can be proven in the same fashion as in Lemmas A.1–A.2–A.3. It remains to prove that $P_{11}(t)\ge a_{n+1}$ for any $t\in [T-(n+1)d, T-nd]$. As in Lemma A.3, and by using the induction hypothesis, we have

$$\begin{aligned} |P_{12}(t, 0)| \le |bP_{11}(T-nd)| \le |b|, \qquad t \in [T-(n+1)d, T-nd]. \end{aligned}$$

(A.12)

Furthermore, $P_{11}$ satisfies (A.11) on $[T-(n+1)d, T-nd]$, which, combined with $P_{11} \ge a_n$ on $[T-nd, T-(n-1)d]$ and (A.12) yields

$$\begin{aligned} P_{11}(t)&\ge P_{11}(T-nd) - \frac{d}{a_n} \left( \frac{b}{\sigma }\right) ^2 \\&\ge a_n - \frac{d}{a_n} \left( \frac{b}{\sigma }\right) ^2 = a_{n+1}>0, \end{aligned}$$

for any $t \in [T-(n+1) d, T-n d]$, which ends the proof.
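
The recursion $a_{n+1} = a_n - \frac{d}{a_n} \left( \frac{b}{\sigma }\right) ^2$ appearing in the last display is easy to iterate numerically. The sketch below is illustrative only: Definition 3.7 is not reproduced in this appendix, so the starting value $a_1 = 1$ is an assumption (chosen because it gives back $a_2 = 1 - d (b/\sigma )^2$ from Lemma A.3), and the function name `slice_lower_bounds` is hypothetical.

```python
def slice_lower_bounds(d, b, sigma, max_slices=1000):
    """Iterate a_{n+1} = a_n - (d / a_n) * (b / sigma)**2 from a_1 = 1.

    Returns [a_1, a_2, ...], stopping before the first non-positive term.
    The starting value a_1 = 1 is an assumption, see the text above.
    """
    k = d * (b / sigma) ** 2
    bounds = [1.0]
    while len(bounds) < max_slices:
        nxt = bounds[-1] - k / bounds[-1]
        if nxt <= 0.0:
            break  # the lower bound on P11 is no longer positive
        bounds.append(nxt)
    return bounds


if __name__ == "__main__":
    for n, a_n in enumerate(slice_lower_bounds(d=0.25, b=1.0, sigma=1.0), start=1):
        print(f"a_{n} = {a_n:.6f}")
```

With $d = 0.25$ and $b = \sigma = 1$, only $a_1$, $a_2$ and $a_3$ are positive under this assumption, which shows how the delay, the drift and the volatility jointly limit the number of slices on which the lower bound on $P_{11}$ stays positive.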


About this article

Cite this article

Lefebvre, W., Miller, E. Linear-Quadratic Stochastic Delayed Control and Deep Learning Resolution. J Optim Theory Appl 191, 134–168 (2021). https://doi.org/10.1007/s10957-021-01923-x

Received: 12 March 2021
Accepted: 11 August 2021
Published: 23 September 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s10957-021-01923-x
