Abstract
In many applications, obtaining ordinary differential equation descriptions of dynamic processes is scientifically important. In both Bayesian and likelihood approaches for estimating the parameters of ordinary differential equations, the speed and convergence of the estimation procedure may depend crucially on the choice of initial parameter values. Extending previous work, we show in this paper how window smoothing yields a fast estimator for systems that are linear in the parameters. Under weak assumptions on the measurement error, we prove that the proposed estimator is \(\sqrt{n}\)-consistent. The estimator requires no initial guess for the parameters and is computationally fast; it can therefore serve as a good initial estimate for more efficient estimators. We illustrate the performance of the proposed estimator in simulation studies and on real data.
References
Bellman, R., Roth, S.: The use of splines with unknown end points in the identification of systems. J. Math. Anal. Appl. 34(1), 26–33 (1971)
Bernstein, D.S.: Matrix Mathematics: Theory, Facts, and Formulas. Princeton University Press, Princeton, NJ (2009)
Bickel, P.J., Ritov, Y.: Nonparametric estimators which can be “plugged-in”. Ann. Stat. 31(4), 1033–1053 (2003)
Bock, H.G.: Recent advances in parameter identification techniques for ODEs. In: Numerical Treatment of Inverse Problems in Differential and Integral Equations, vol. 2, pp. 95–121 (1983)
Bonhoeffer, S., May, R.M., Shaw, G.M., Nowak, M.A.: Virus dynamics and drug therapy. Proc. Natl. Acad. Sci. 94(13), 6971–6976 (1997)
Brunel, N.J.-B.: Parameter estimation of ODE's via nonparametric estimators. Electron. J. Stat. 2, 1242–1267 (2008)
Campbell, D., Steele, J.: Smooth functional tempering for nonlinear differential equation models. Stat. Comput. 22(2), 429–443 (2012)
Dattner, I., Klaassen, C.A.J.: Estimation in systems of ordinary differential equations linear in the parameters. arXiv preprint arXiv:1305.4126 (2013)
Earn, D.J., Rohani, P., Bolker, B.M., Grenfell, B.T.: A simple model for complex dynamical transitions in epidemics. Science 287(5453), 667–670 (2000)
Edelstein-Keshet, L.: Mathematical Models in Biology, vol. 46. SIAM, Philadelphia, PA (2005)
Fang, Y., Wu, H., Zhu, L.X.: A two-stage estimation method for random coefficient differential equation models with application to longitudinal HIV dynamic data. Statistica Sinica 21(3), 1145 (2011)
Fine, P.E., Clarkson, J.A.: Measles in England and Wales-I: an analysis of factors underlying seasonal patterns. Int. J. Epidemiol. 11(1), 5–14 (1982)
Finkenstädt, B.F., Grenfell, B.T.: Time series modelling of childhood diseases: a dynamical systems approach. J. R. Stat. Soc. Ser. C (Appl. Stat.) 49(2), 187–205 (2000)
FitzHugh, R.: Impulses and physiological states in theoretical models of nerve membrane. Biophys. J. 1(6), 445–466 (1961)
González, J., Vujačić, I., Wit, E.: Inferring latent gene regulatory network kinetics. Stat. Appl. Genet. Mol. Biol. 12(1), 109–127 (2013)
González, J., Vujačić, I., Wit, E.: Reproducing kernel Hilbert space based estimation of systems of ordinary differential equations. Pattern Recognit. Lett. 45, 26–32 (2014)
Gugushvili, S., Klaassen, C.A.J.: \(\sqrt{n}\)-consistent parameter estimation for systems of ordinary differential equations: bypassing numerical integration via smoothing. Bernoulli 18, 1061–1098 (2012)
Gugushvili, S., Spreij, P.: Parametric inference for stochastic differential equations: a smooth and match approach. Lat. Am. J. Probab. Math. Stat. 9(2), 609–635 (2012)
Goldstein, L., Messer, K.: Optimal plug-in estimators for nonparametric functional estimation. Ann. Stat. 20, 1306–1328 (1992)
He, D., Ionides, E.L., King, A.A.: Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. J. R. Soc. Interface 7(43), 271–283 (2010)
Himmelblau, D.M., Jones, C.R., Bischoff, K.B.: Determination of rate constants for complex kinetics models. Ind. Eng. Chem. Fundam. 6(4), 539–543 (1967)
Hooker, G., Ellner, S.P., Roditi, L.D.V., Earn, D.J.D.: Parameterizing state-space models for infectious disease dynamics by generalized profiling: measles in Ontario. J. R. Soc. Interface 8(60), 961–974 (2011)
Hooker, G., Xiao, L., Ramsay, J.: CollocInfer: collocation inference for dynamic systems. R package version 0.1.7 (2012), http://CRAN.R-project.org/package=CollocInfer
Huppert, A., Barnea, O., Katriel, G., Yaari, R., Roll, U., Stone, L.: Modeling and statistical analysis of the spatio-temporal patterns of seasonal influenza in Israel. PLoS ONE 7(10), e45107 (2012)
Khanin, R., Vinciotti, V., Wit, E.C.: Reconstructing repressor protein levels from expression of gene targets in Escherichia coli. Proc. Natl. Acad. Sci. 103(49), 18592–18596 (2006)
Khanin, R., Vinciotti, V., Mersinias, V., Smith, C.P., Wit, E.C.: Statistical reconstruction of transcription factor activity using Michaelis-Menten kinetics. Biometrics 63(3), 816–823 (2007)
Liang, H., Wu, H.: Parameter estimation for differential equation models using a framework of measurement error in regression models. J. Am. Stat. Assoc. 103(484), 1570–1583 (2008)
Loader, C.: Local Regression and Likelihood. Springer, Berlin (1999)
Miao, H., Dykes, C., Demeter, L.M., Cavenaugh, J., Park, S.Y., Perelson, A.S., Wu, H.: Modeling and estimation of kinetic parameters and replicative fitness of HIV-1 from flow-cytometry-based growth competition experiments. Bull. Math. Biol. 70(6), 1749–1771 (2008)
Nagumo, J., Arimoto, S., Yoshizawa, S.: An active pulse transmission line simulating nerve axon. Proc. IRE 50(10), 2061–2070 (1962)
Olinky, R., Huppert, A., Stone, L.: Seasonal dynamics and thresholds governing recurrent epidemics. J. Math. Biol. 56(6), 827–839 (2008)
Qi, X., Zhao, H.: Asymptotic efficiency and finite-sample properties of the generalized profiling estimation of parameters in ordinary differential equations. Ann. Stat. 38(1), 435–481 (2010)
Ramsay, J.O., Hooker, G., Campbell, D., Cao, J.: Parameter estimation for differential equations: a generalized smoothing approach. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 69(5), 741–796 (2007)
Steinke, F., Schölkopf, B.: Kernels, regularization and differential equations. Pattern Recognit. 41(11), 3271–3286 (2008)
Stone, L., Olinky, R., Huppert, A.: Seasonal dynamics of recurrent epidemics. Nature 446(7135), 533–536 (2007)
Varah, J.: A spline least squares method for numerical parameter estimation in differential equations. SIAM J. Sci. Stat. Comput. 3(1), 28–46 (1982)
Wood, S.N., Lindgren, F.: APTS statistical computing. http://www2.warwick.ac.uk/fac/sci/statistics/apts/students/resources/apts-statcomp.pdf (2013). Accessed 27 Dec 2013
Xue, H., Miao, H., Wu, H.: Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error. Ann. Stat. 38(4), 2351–2387 (2010)
Xun, X., Cao, J., Mallick, B., Carroll, R.J., Maity, A.: Parameter estimation of partial differential equation models. J. Am. Stat. Assoc., to appear (2013)
Acknowledgments
The authors are very grateful to the reviewers and to the Associate Editor for their very useful comments, which enabled us to greatly improve the manuscript. Vujačić acknowledges financial support through the JoinEUsee project scholarship. Wit and González acknowledge financial support through the NWO SBC-EMA-435065 project grant. The authors would like to thank Amit Huppert (Gertner Institute for Epidemiology & Health Policy Research, Israel) for useful discussions concerning the real data example. We also thank Prof. H.G. Bock (Heidelberg University) and Dr. A. Abbruzzo (University of Palermo) for providing important references for this work.
Appendices
Appendix 1: Proof of Theorem 1
Denote the number of time points in subinterval \(S_i\) by \(I_i\). The length of each subinterval \(S_i\) is \(T/I\). We have that \(\sum _{i=1}^I I_i=n\) and \(\lfloor \sqrt{n}\rfloor \le I_i\le \lfloor \sqrt{n}\rfloor +2\), which implies
$$\begin{aligned} \sum _{i=1}^I \frac{1}{I_i}\le \frac{I}{\lfloor \sqrt{n}\rfloor }=O(1). \end{aligned}$$(14)
With the following notation
$$\begin{aligned} \widehat{x}_n(t)=\frac{1}{|S(t)|}\sum _{t_j\in S(t)}Y(t_j), \end{aligned}$$(15)
we have
$$\begin{aligned} \widehat{x}_n(t)-\text {E}\widehat{x}_n(t)=\frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon (t_j), \end{aligned}$$(16)
and for any \(t\in [0,T]\) we have
$$\begin{aligned} \text {E}\Big \Vert \frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon (t_j)\Big \Vert ^2=\frac{d\sigma ^2_{\varepsilon }}{|S(t)|}. \end{aligned}$$(17)
We use these results throughout the proof. The proof consists of verifying the following conditions, as stated in Theorem 2 of Dattner and Klaassen (2013).
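As an illustration of the window construction above (a sketch of ours, not code from the paper), the following Python snippet partitions \(n\) consecutive observations into \(I\) windows of roughly \(\sqrt{n}\) points each and averages within each window; the names `window_partition` and `window_estimator` are hypothetical.

```python
import math

def window_partition(n):
    """Split indices 0..n-1 into I consecutive windows of roughly sqrt(n) points."""
    m = math.isqrt(n)            # floor(sqrt(n))
    I = n // m                   # number of windows
    base, extra = divmod(n, I)   # window sizes are `base` or `base + 1`
    sizes = [base + 1] * extra + [base] * (I - extra)
    windows, start = [], 0
    for size in sizes:
        windows.append(range(start, start + size))
        start += size
    return windows

def window_estimator(Y, windows):
    """Piecewise-constant curve estimate: one local average per window."""
    return [sum(Y[j] for j in w) / len(w) for w in windows]
```

For \(n=50\) this produces \(I=7\) windows whose sizes (7 or 8) sum to \(n\), matching the counting bound \(\lfloor \sqrt{n}\rfloor \le I_i\le \lfloor \sqrt{n}\rfloor +2\) used in the proof.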
1.
\(\Vert \text {E}\widehat{x}_n-x\Vert ^2_{\infty }=O(\frac{1}{n})\). For nonnegative numbers \(a_1,\ldots ,a_m\) it holds that
$$\begin{aligned} \Big (\sum _{k=1}^m a_k\Big )^2 \le m \sum _{k=1}^m a_k^2. \end{aligned}$$(18)
This inequality is a consequence of the inequality between the arithmetic and the quadratic mean. Applying the triangle inequality and (18) yields
$$\begin{aligned}&\Vert \text {E}\widehat{x}_n-x\Vert ^2_{\infty }\\&\quad ={\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}x(t_j)-x(t)\Vert ^2\\&\quad ={\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\{x(t_j)-x(t)\}\Vert ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\frac{1}{I_i^2}\left\{ \sum _{t_j\in S_i}\Vert \int \limits _{t}^{t_j}x'(s)\mathrm{{d}}s\Vert \right\} ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\frac{I}{I_i^2}\sum _{t_j\in S_i}\Vert \int \limits _{t}^{t_j}g(x(s))\theta \mathrm{{d}}s\Vert ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}{\mathop {\sup }\limits _{t\in S_i}}\frac{I}{I_i^2}\sum _{t_j\in S_i}\sup _{s\in S_i}\Vert g(x(s))\Vert ^2\Vert \theta \Vert ^2(t_j-t)^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}\frac{I}{I_i^2}\sum _{t_j\in S_i}\frac{T^2}{I^2}\sup _{s\in S_i}\Vert g(x(s))\Vert ^2\Vert \theta \Vert ^2\\&\quad \le {\mathop {\max }\limits _{1\le i\le I}}\frac{I}{I_i^2}\frac{I_i}{I^2}\cdot O(1)=O\Big (\frac{1}{\sqrt{n}\sqrt{n}}\Big )=O\Big (\frac{1}{n}\Big ). \end{aligned}$$Here we used the boundedness of the parameter space \(\varTheta \); moreover, since \(g(\cdot )\) is continuous and \(x(\cdot )\) is bounded, \(g(x(\cdot ))\) is bounded as well.
2.
\(\Vert \widehat{x}_n\Vert _{\infty }=O_p(1)\)
$$\begin{aligned}&P(\Vert \widehat{x}_n-\text {E}\widehat{x}_n\Vert _{\infty }\ge M)\\&\quad =P({\mathop {\sup }\limits _{t\in [0,T]}}\Vert \widehat{x}_n(t)-\text {E}\widehat{x}_n(t)\Vert \ge M)\\&\quad =P\left( {\mathop {\sup }\limits _{t\in [0,T]}}\Vert \frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon (t_j)\Vert \ge M\right) \\&\quad =P\Big ({\mathop {\max }\limits _{1\le i\le I}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \ge M\Big )\\&\quad =1-P\Big ({\mathop {\max }\limits _{1\le i\le I}}\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \le M\Big )\\&\quad =1-\prod _{i=1}^I P\Big (\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \le M\Big )\\&\quad =1- \prod _{i=1}^I\Big \{1-P\Big (\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j)\Vert \ge M\Big )\Big \}\\&\quad \le 1-\prod _{i=1}^I\Big (1-\frac{d\sigma ^2_{\varepsilon }}{M^2I_i}\Big )\\&\quad \le 1- \Big (1-\sum _{i=1}^I\frac{d\sigma ^2_{\varepsilon }}{M^2}\frac{1}{I_i}\Big ) \le \frac{d\sigma ^2_{\varepsilon }}{M^2}\sum _{i=1}^I\frac{1}{I_i}. \end{aligned}$$The first inequality above follows from Chebyshev’s inequality and the second from Lemma 1, which is given in Appendix 2. We conclude that \(\widehat{x}_n-\text {E}\widehat{x}_n\) is bounded in probability. Since we have
$$\begin{aligned} \widehat{x}_n=\underbrace{\widehat{x}_n-\text {E}\widehat{x}_n}_{O_p(1)}+\underbrace{\text {E}\widehat{x}_n-x}_{O(\frac{1}{n})}+\underbrace{x}_{O(1)}, \end{aligned}$$it follows that the sup norm of \(\widehat{x}_n\) is bounded in probability, i.e.
$$\begin{aligned} \Vert \widehat{x}_n\Vert _{\infty }=O_p(1). \end{aligned}$$
3.
\(\text {E}(\int _0^T\Vert \widehat{x}_n(t)-\text {E}\widehat{x}_n(t)\Vert ^2\mathrm{{d}}t)=O(\frac{1}{\sqrt{n}} )\)
$$\begin{aligned}&\text {E}\Big (\int \limits _0^T\Vert \widehat{x}_n(t)-\text {E}\widehat{x}_n(t)\Vert ^2\mathrm{{d}}t\Big )\\&\quad \mathop {=}\limits ^{(16)}\text {E}\Big (\sum _{i=1}^I\int \limits _{S_i}\Vert \frac{1}{|S(t)|}\sum _{t_j\in S(t)}\varepsilon (t_j) \Vert ^2\mathrm{{d}}t\Big )\\&\quad =\text {E}\Big (\frac{1}{I}\sum _{i=1}^I\Vert \frac{1}{I_i}\sum _{t_j\in S_i}\varepsilon (t_j) \Vert ^2\Big )\\&\quad =\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\text {E}\Big [\sum _{h=1}^d\big \{\sum _{t_j\in S_i}\varepsilon _h(t_j)\big \}^2\Big ]\\&\quad =\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\sum _{h=1}^d\sum _{t_j\in S_i}\text {E}\varepsilon _h(t_j)^2\\&\quad \quad +\,\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\sum _{h=1}^d\sum _{t_j,t_k\in S_i: j\ne k}\text {E}\varepsilon _h(t_j)\varepsilon _h(t_k)\\&\quad \mathop {=}\limits ^{(17)}\frac{1}{I}\sum _{i=1}^I\frac{1}{I^2_i}\sum _{h=1}^dI_i\sigma ^2_{\varepsilon }=\frac{1}{I}\sum _{i=1}^I\frac{d\sigma ^2_{\varepsilon }}{I_i}\\&\quad \mathop {=}\limits ^{(14)}O\Big (\frac{1}{I}\Big )=O\Big (\frac{1}{\sqrt{n}}\Big ).\\ \end{aligned}$$
4.
\(\int _0^T\text {var}\{\int _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{d}{s}\}\mathrm{{d}}t=O(\frac{1}{n})\). By using Lemma 2 (Appendix 2) we obtain
$$\begin{aligned}&\int \limits _0^T\text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} \mathrm{{d}}t\\&\quad =\sum _{i=1}^I\int \limits _{S_i}\text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} \mathrm{{d}}t\\&\quad =\sum _{i=1}^I\int \limits _{S_i}\Big [\sum _{l=1}^I\Big \{\int \limits _{S_l\cap [0,t]} f(s)\mathrm{{d}}s\Big \}^2\frac{\sigma _{\epsilon }^2}{I_l}\Big ]\mathrm{{d}}t\\&\quad \le \Vert f\Vert _{\infty }^2\sum _{i=1}^I\int \limits _{S_i}\mathrm{{d}}t\sum _{l=1}^I \frac{\sigma _{\epsilon }^2}{I_l^3}\\&\quad =\Vert f\Vert _{\infty }^2\sum _{i=1}^I\frac{1}{I_i}\sum _{l=1}^I \frac{\sigma _{\epsilon }^2}{I_l^3}\\&\quad =O\Big (\frac{1}{n}\Big ), \end{aligned}$$
5.
\(\text {var}\{\int _0^T f(t)\widehat{x}_{n,h}(t)\mathrm{{d}}t\}=O(\frac{1}{n})\). According to Lemma 2 (Appendix 2), for any \(t\in [0,T]\)
$$\begin{aligned} \text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} =\sum _{i=1}^I \left\{ \int \limits _{S_i\cap [0,t]}f(s)\mathrm{{d}}s\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}, \end{aligned}$$whence it follows that
$$\begin{aligned}&\text {var}\left\{ \int \limits _0^T f(t)\widehat{x}_{n,h}(t)\mathrm{{d}}t\right\} \\&\quad =\sum _{i=1}^I \left\{ \int \limits _{S_i\cap [0,T]}f(t)\mathrm{{d}}t\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}\\&\quad =\sum _{i=1}^I \left\{ \int \limits _{S_i}f(t)\mathrm{{d}}t\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}\\&\quad \le \sum _{i=1}^I \left( \Vert f\Vert _{\infty }\frac{1}{I}\right) ^2\frac{\sigma _{\epsilon }^2}{I_i}\\&\quad = \Vert f\Vert _{\infty }^2\frac{\sigma _{\epsilon }^2}{I^2}\sum _{i=1}^I\frac{1}{I_i}\\&\quad \mathop {=}\limits ^{(14)}O\Big (\frac{1}{I^2}\Big )=O\Big (\frac{1}{n}\Big ). \end{aligned}$$
Appendix 2: Auxiliary results
Lemma 1
For any \(0\le a_1,\ldots ,a_n\le 1\) the following inequality holds:
$$\begin{aligned} \prod _{i=1}^n(1-a_i)\ge 1-\sum _{i=1}^n a_i. \end{aligned}$$
Proof
The proof is by induction on \(n\). For \(n=1\) the claim holds with equality, and if it holds for \(n\), then \(\prod _{i=1}^{n+1}(1-a_i)\ge (1-\sum _{i=1}^{n}a_i)(1-a_{n+1})\ge 1-\sum _{i=1}^{n+1}a_i\), since the cross term is nonnegative.
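Lemma 1 is the classical Weierstrass-type product inequality \(\prod _{i=1}^n(1-a_i)\ge 1-\sum _{i=1}^n a_i\) for \(a_i\in [0,1]\). A quick numeric spot-check (illustrative only; `weierstrass_gap` is our name):

```python
import random

def weierstrass_gap(a):
    """Return prod(1 - a_i) - (1 - sum(a_i)); Lemma 1 states this is >= 0."""
    prod = 1.0
    for x in a:
        prod *= 1.0 - x
    return prod - (1.0 - sum(a))

# Spot-check on random vectors with entries in [0, 1]; the tiny negative
# tolerance only allows for floating-point rounding.
random.seed(0)
samples = [[random.random() for _ in range(k)] for k in range(1, 8)]
all_nonnegative = all(weierstrass_gap(a) >= -1e-12 for a in samples)
```

For \(a=(1/2,1/2)\) the gap is \(1/4-0=1/4\), showing the inequality can be strict.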
Lemma 2
For any \(t\in [0,T]\) the following equality holds:
$$\begin{aligned} \text {var}\left\{ \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s\right\} =\sum _{i=1}^I \left\{ \int \limits _{S_i\cap [0,t]}f(s)\mathrm{{d}}s\right\} ^2\frac{\sigma _{\epsilon }^2}{I_i}. \end{aligned}$$
Proof
Let \(C_i\) denote \(\widehat{x}_{n,h}(S_i)\). Since \([0,T]=\cup _{i=1}^I S_i\), for any \(t\in [0,T]\) it holds that \([0,t]=\cup _{i=1}^I (S_i \cap [0,t])\). Now we have
$$\begin{aligned} \int \limits _0^{t} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s=\sum _{i=1}^I\int \limits _{S_i\cap [0,t]} f(s)\widehat{x}_{n,h}(s)\mathrm{{d}}s=\sum _{i=1}^I C_i\int \limits _{S_i\cap [0,t]} f(s)\mathrm{{d}}s, \end{aligned}$$
where in the second equality we have used the fact that \(\widehat{x}_{n,h}\) is constant on \(S_i\) and equal to \(C_i\). We need \(\text {var}(C_i)\) and \(\text {cov}(C_k,C_l)\). Since \(C_i\) is the average of the \(I_i\) observations in \(S_i\), and the measurement errors in disjoint subintervals are uncorrelated,
$$\begin{aligned} \text {var}(C_i)=\frac{\sigma _{\epsilon }^2}{I_i},\qquad \text {cov}(C_k,C_l)=0,\quad k\ne l. \end{aligned}$$
Plugging the variance and covariance terms into the original expression we obtain the formula.
Appendix 3: Calculation of the algorithmic complexity
The following results are taken from Wood and Lindgren (2013) and are used in the sequel.
Lemma 3
If \(A\) and \(B\) are matrices of dimension \(n\times m\), then the computational cost of addition and subtraction, i.e. \(A+B\) and \(A-B\), is \(O(mn)\) flops.
Lemma 4
If \(A\) is a matrix of dimension \(n\times m\) and \(B\) is a matrix of dimension \(m\times p\), then the computational cost of the matrix multiplication \(AB\) is \(O(nmp)\) flops.
Lemma 5
If a matrix \(A\) is of order \(n\), then the computational cost of the matrix inversion \(A^{-1}\) is \(O(n^3)\) flops.
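The multiplication bound of Lemma 4 can be made concrete by counting scalar operations in the naive triple-loop product. This sketch (ours, not from the paper) counts exactly \(2nmp\) flops, one multiply and one add per innermost iteration:

```python
def matmul_count(A, B):
    """Naive (n x m) by (m x p) matrix product with an explicit operation
    counter, making the O(nmp) cost of Lemma 4 concrete."""
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    flops = 0
    for i in range(n):
        for k in range(p):
            s = 0.0
            for j in range(m):
                s += A[i][j] * B[j][k]   # one multiply and one add
                flops += 2
            C[i][k] = s
    return C, flops
```

For a \(3\times 4\) by \(4\times 2\) product the counter returns \(2\cdot 3\cdot 4\cdot 2=48\).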
The matrices that appear in the formulas for the estimators have the following dimensions:
- \(I_d\) and \(\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\) are of dimension \(d\times d\).
- \(\widehat{A}_n,\,\widehat{G}_n(S_i),\,\widehat{A}_n\widehat{B}_n^{-1}\) and \(g(\widehat{x}_n(t))\) are of dimension \(d\times p\).
- \(\widehat{A}_n^{\top }\), \(\widehat{G}_n^{\top } (t_i)\) and \(g (\widehat{x}_n(t))^{\top }\) are of dimension \(p\times d\).
- \(\widehat{B}_n\) is of dimension \(p\times p\).
- \(\widehat{x}_n(t_i),\,Y(t_i)\) and \(\widehat{\xi }_n\) are of dimension \(d\times 1\).
1.
The cost for \(\widehat{x}_n(S_i)\) and \(\widehat{G}_n(a_i)\). We calculate \(\widehat{x}_n(S_i)\), for \(i=1,\ldots ,I\). The cost for each \(\widehat{x}_n(S_i)\) is \(O((I-1)d)\), so the total cost is \(O(I(I-1)d)=O(I^2d)=O(nd)\). The calculation of \(\widehat{G}_n(a_i)\) involves the calculation of \(g(\widehat{x}_n(S_i))\), and since the construction of the matrix \(g\) depends on the form of the ODE, we assume that every entry of the matrix is obtained by at most a constant number of flops \(C\) that does not depend on \(n\), \(p\) and \(d\). Then the cost of obtaining the matrix \(g(\widehat{x}_n(S_i))\), for any \(i=1,\ldots ,I\), is \(O(Cdp)=O(dp)\) flops. The estimator \(\widehat{G}_n(t)\) can be calculated recursively by
$$\begin{aligned} \widehat{G}_n(a_1)&= g(\widehat{x}_n(S_1))a_1,\nonumber \\ \widehat{G}_n(a_i)&= \widehat{G}_n(a_{i-1})+g(\widehat{x}_n(S_i))\varDelta ,\quad i=2,\ldots ,I. \end{aligned}$$(19)Since the cost for computing \(g(\widehat{x}_n(S_i))\), for \(i=1,\ldots ,I\), is \(O(dp)\) flops, from (19) it follows that each \(\widehat{G}_n(a_i),\,i=1,\ldots ,I\), can be computed in \(O(dp)\) flops and therefore, summing over \(i=1,\ldots ,I\), we obtain that the total cost is
$$\begin{aligned} O(Idp)=O(\sqrt{n}dp) \end{aligned}$$flops.
2.
The cost for \(\widehat{A}_n,\,\widehat{A}_n(i)\) and \(\widehat{B}_n\). Since the cost for computing \(g(\widehat{x}_n(S_i))\) and \(\widehat{G}_n(a_i)\) for each \(i\) is \(O(dp)\) flops, from (9) it follows that the cost for \(\widehat{A}_n(i)\) is \(O(dp)\) flops. The equality \(\widehat{A}_n=\sum _{i=1}^I\widehat{A}_n(i)\) implies that the total cost for computing \(\widehat{A}_n\) and \(\widehat{A}_n(i),\,i=1,\ldots ,I\), is \(O(Idp)\) flops. For the cost of \(\widehat{B}_n\) we analyze each term in (10).
- \((\widehat{G}_n(T)^{\top },\widehat{A}_n)\mapsto \widehat{G}_n(T)^{\top }\widehat{A}_n\) costs \(O(p^2d)\) flops.
- \(\left( g(\widehat{x}_{n}(S_i))^{\top }, t^2\widehat{G}_n(t) \Big |_{a_{i-1}}^{a_i}\right) \mapsto g(\widehat{x}_{n}(S_i))^{\top }t^2\widehat{G}_n(t) \Big |_{a_{i-1}}^{a_i}\) costs \(O(p^2d)\) flops.
- \((R_i^{\top },g(\widehat{x}_{n}(S_i))(2i-1))\mapsto R_i^{\top }g(\widehat{x}_{n}(S_i))(2i-1)\) costs \(O(p^2d)\) flops.
- \((g(\widehat{x}_{n}(S_i))^{\top },g(\widehat{x}_{n}(S_i))(3i^2-1))\mapsto g(\widehat{x}_{n}(S_i))^{\top }g(\widehat{x}_{n}(S_i))(3i^2-1)\) costs \(O(p^2d)\) flops.
Each term above has complexity \(O(p^2d)\) and summing over \(i\) we get that the cost for \(\widehat{B}_n\) is \(O(Ip^2d)\). Finally, the total cost for \(\widehat{A}_n,\,\widehat{A}_n(i),\,i=1,\ldots ,I\), and \(\widehat{B}_n\) is \(O(Idp)+O(Ip^2d)=O(Ip^2d)\), which is equal to
$$\begin{aligned} O(\sqrt{n}p^2d) \end{aligned}$$flops.
3.
The cost for \(\widehat{\xi }_n\). We first consider the term \(\left( T I_d - \widehat{A}_n \widehat{B}_n^{-1} \widehat{A}_n^{\top }\right) ^{-1}\):
- \(\widehat{B}_n\mapsto \widehat{B}_n^{-1}\) costs \(O(p^3)\) flops.
- \((\widehat{A}_n,\widehat{B}_n^{-1})\mapsto \widehat{A}_n\widehat{B}_n^{-1}\) costs \(O(p^2d)\) flops.
- \((\widehat{A}_n\widehat{B}_n^{-1},\widehat{A}_n^{\top })\mapsto \widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\) costs \(O(d^2p)\) flops.
- \((I_d,\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top })\mapsto TI_d-\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\) costs \(O(d^2)\) flops.
- \(T I_d-\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top }\mapsto (T I_d-\widehat{A}_n\widehat{B}_n^{-1}\widehat{A}_n^{\top })^{-1}\) costs \(O(d^3)\) flops.
Thus, the computational cost of \(\left( T I_d - \widehat{A}_n \widehat{B}_n^{-1} \widehat{A}_n^{\top }\right) ^{-1}\) is \(O(p^3+p^2d+d^2p+d^2+d^3)\), which is equal to \(O(p^3+d^3)\) flops.
Next we consider the term \(T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I-\widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\):
- \(\widehat{x}_{n}(S_i)\mapsto T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I\) costs \(O(Id)\) flops.
- \((\widehat{A}_n(i)^{\top },\widehat{x}_{n}(S_i))\mapsto \sum _{i=1}^I\widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) costs \(O(Ipd)\) flops.
- \((\widehat{A}_n \widehat{B}_n^{-1},\sum _{i=1}^I\widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i))\mapsto \widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) costs \(O(dp)\) flops.
In total, the cost for \(T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I-\widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) is \(O(Idp)=O(\sqrt{n}dp)\) flops. Finally we have the following:
- \(\left( T I_d - \widehat{A}_n \widehat{B}_n^{-1} \widehat{A}_n^{\top }\right) ^{-1}\) costs \(O(p^3+d^3)\) flops.
- \(T\sum _{i=1}^I \widehat{x}_{n}(S_i)/I-\widehat{A}_n \widehat{B}_n^{-1}\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i)\) costs \(O(\sqrt{n}dp)\) flops.
- Multiplication of the two aforementioned terms costs \(O(d^2)\) flops.
In total, the calculation of \(\widehat{\xi }_n \) costs \(O(p^3+d^3+\sqrt{n}dp+d^2)\) flops, which is equal to
$$\begin{aligned} O(p^3+d^3+\sqrt{n}dp) \end{aligned}$$flops.
4.
The cost for \(\widehat{\theta }_n\). We consider the term \(\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i) -\widehat{A}_n^{\top }\widehat{\xi }_n\):
- \((\widehat{A}_n(i)^{\top },\widehat{x}_{n}(S_i))\mapsto \sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i) \) costs \(O(Ipd)\) flops.
- \((\widehat{A}_n^{\top },\widehat{\xi }_n)\mapsto \widehat{A}_n^{\top }\widehat{\xi }_n\) costs \(O(pd)\) flops.
- Subtraction of the two aforementioned terms costs \(O(p)\) flops.
In total, the cost for calculating \(\sum _{i=1}^I \widehat{A}_n(i)^{\top }\widehat{x}_{n}(S_i) -\widehat{A}_n^{\top }\widehat{\xi }_n\) is \(O(Ipd)=O(\sqrt{n}pd)\) flops. The multiplication of \(\widehat{B}_n^{-1}\) with the last mentioned term costs \(O(p^2)\) flops, so the total cost for \(\widehat{\theta }_n\) is
$$\begin{aligned} O(\sqrt{n}pd+p^2) \end{aligned}$$flops.
Hence, the computational complexity of the procedure is \(O(\sqrt{n}dp+\sqrt{n}dp^2+p^3+d^3+\sqrt{n}dp+p^2+\sqrt{n}pd)\) flops, which is equal to
$$\begin{aligned} O(\sqrt{n}dp^2+p^3+d^3) \end{aligned}$$flops.
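The \(O(\sqrt{n}dp)\) cost in step 1 rests on the running-sum recursion (19), which updates each entry of \(\widehat{G}_n(a_i)\) in constant time. A scalar-entry Python sketch (ours; `cumulative_G` and its arguments are hypothetical names) makes the one-update-per-window structure explicit:

```python
def cumulative_G(g_vals, delta):
    """Evaluate G(a_i) = integral_0^{a_i} g(s) ds for piecewise-constant g
    (value g_vals[i-1] on the i-th window of length delta) via the running-sum
    recursion G(a_i) = G(a_{i-1}) + g_vals[i-1] * delta, cf. (19)."""
    G, total = [], 0.0
    for g in g_vals:
        total += g * delta   # one O(1) update per window and matrix entry
        G.append(total)
    return G
```

For example, `cumulative_G([1.0, 2.0, 3.0], 0.5)` returns `[0.5, 1.5, 3.0]`; for a \(d\times p\) matrix the same update is done entrywise, giving the \(O(dp)\) per-step cost quoted above.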
Appendix 4: Calculation of the integrals
Let us first recall that \(S_i=[a_{i-1},a_i]\), for \(i=1,\ldots ,I-1\), and \(S_I=[a_{I-1},T]\). The length of each subinterval \(S_i\) is \(\varDelta =T/I\) and the boundary points of the subintervals are \(a_i=i\varDelta \), for \(i=0,\ldots ,I\). To shorten the notation we let \(G_1(t)\) stand for an arbitrary entry of the matrix \(\widehat{G}_n(t)\). Similarly, let \(g_1(\widehat{x}_n(t))\) denote the entry of the matrix \(g(\widehat{x}_n(t))\) that corresponds to \(G_1(t)\), i.e. \(G_1(t)=\int _0^tg_1(\widehat{x}_n(s))\mathrm{{d}}s\). Let \(C_i=g_1(\widehat{x}_n(S_i))\), for \(i=1,\ldots ,I\). Using integration by parts yields
In the same manner, we obtain that for \(t\in S_i\) it holds
where we have used that \(a_m^2-a^2_{m-1}=(2m-1)\varDelta ^2\). Recall the convention that \(\sum _{m=1}^{i-1}(2m-1)C_m\) is equal to zero for \(i=1\). Setting \(t=T\) in (21) we obtain
Applying (20) and (22) to every entry of the matrices \(\int _{S_i}\widehat{G}_n(t)\mathrm{{d}}t\) and \(\int _0^T\widehat{G}_n(t)\mathrm{{d}}t\), respectively, we conclude that
To obtain \(\widehat{B}_n\), we need integrals of the form \(\int _0^T G_1(t)G_2(t)\mathrm{{d}}t\) where, to simplify notation, we have denoted by \(G_1(t)\) and \(G_2(t)\) entries of the matrix \(\widehat{G}_n(t)\). Similarly, \(g_1(\widehat{x}_n(t))\) and \(g_2(\widehat{x}_n(t))\) denote the corresponding entries of the matrix \(g(\widehat{x}_n(t))\), i.e. \(G_i(t)=\int _0^tg_i(\widehat{x}_n(s))\mathrm{{d}}s,\,i=1,2\). Integration by parts yields
According to (21), for any \(t\in S_i\)
and hence
Again, using integration by parts, we obtain
Since \(a_i=i\varDelta \) for \(i=0,\ldots ,I\), it follows that
Also, the following identity holds
where \(r_i=\sum _{m=i+1}^{I}C_m\). By using the previous identities, we finally obtain
Since the \((r,k)\)th entry of the matrix \(\int _{0}^{T} \widehat{G}_n(t)^{\top } \widehat{G}_n(t)\mathrm{{d}}t \) is the sum over \(l\) of integrals \(\int _{0}^{T} \widehat{G}_n(t)[l,r]\,\widehat{G}_n(t)[l,k]\,\mathrm{{d}}t\), each of the form above, we obtain that
where \(R_i=\sum _{m=i+1}^{I}g(\widehat{x}_n(S_m))\).
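The integrals in this appendix all reduce to finite sums over the window values because \(\widehat{G}_n\) is piecewise linear. The following sketch (ours, a simplified scalar analogue rather than the paper's exact formulas) computes \(\int _0^T G_1(t)\,\mathrm{{d}}t\) in closed form for piecewise-constant \(g_1\) and verifies it against exact trapezoidal integration:

```python
def integral_of_G(C, delta):
    """Closed form for integral_0^T G(t) dt, where G(t) = integral_0^t g(s) ds
    and g is piecewise constant, equal to C[i] on the (i+1)-th window of
    length delta.  Summing the exact per-window integrals gives
    delta^2 * sum_i (I - i - 1/2) * C[i]  (0-based i)."""
    I = len(C)
    return delta ** 2 * sum((I - i - 0.5) * c for i, c in enumerate(C))

def integral_of_G_trapezoid(C, delta):
    """Independent check: G is piecewise linear, so the trapezoid rule on the
    window endpoints integrates it exactly."""
    total, G_prev = 0.0, 0.0
    for c in C:
        G_next = G_prev + c * delta      # G at the right endpoint of the window
        total += delta * (G_prev + G_next) / 2.0
        G_prev = G_next
    return total
```

For \(C=(1,1)\) and \(\varDelta =1\), \(G(t)=t\) on \([0,2]\) and both routines return \(\int _0^2 t\,\mathrm{{d}}t=2\). Only window sums and squared window lengths enter, which is why no numerical quadrature is needed anywhere in the estimator.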
Vujačić, I., Dattner, I., González, J. et al. Time-course window estimator for ordinary differential equations linear in the parameters. Stat Comput 25, 1057–1070 (2015). https://doi.org/10.1007/s11222-014-9486-9