
ForLion: a new algorithm for D-optimal designs under general parametric statistical models with mixed factors

  • Original Paper
  • Published in: Statistics and Computing

Abstract

In this paper, we address the problem of designing an experimental plan with both discrete and continuous factors under fairly general parametric statistical models. We propose a new algorithm, named ForLion, to search for locally optimal approximate designs under the D-criterion. The algorithm performs an exhaustive search in a design space with mixed factors while keeping high efficiency and reducing the number of distinct experimental settings. Its optimality is guaranteed by the general equivalence theorem. We present the relevant theoretical results for multinomial logit models (MLM) and generalized linear models (GLM), and demonstrate the superiority of our algorithm over state-of-the-art design algorithms using real-life experiments under MLM and GLM. Our simulation studies show that the ForLion algorithm could reduce the number of experimental settings by 25% or improve the relative efficiency of the designs by 17.5% on average. Our algorithm can help the experimenters reduce the time cost, the usage of experimental devices, and thus the total cost of their experiments while preserving high efficiencies of the designs.


Data availability

No datasets were generated or analysed during the current study.

References

  • Atkinson, A.C., Donev, A.N., Tobias, R.D.: Optimum Experimental Designs, with SAS. Oxford University Press, Oxford (2007)
  • Ai, M., Ye, Z., Yu, J.: Locally D-optimal designs for hierarchical response experiments. Stat. Sin. 33, 381–399 (2023)
  • Byrd, R.H., Lu, P., Nocedal, J., Zhu, C.: A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 16, 1190–1208 (1995)
  • Bu, X., Majumdar, D., Yang, J.: D-optimal designs for multinomial logistic models. Ann. Stat. 48(2), 983–1000 (2020)
  • Broyden, C.G.: A class of methods for solving nonlinear simultaneous equations. Math. Comput. 19(92), 577–593 (1965)
  • Duarte, B.P.M., Atkinson, A.C., Granjo, J.F.O., Oliveira, N.M.C.: Optimal design of experiments for implicit models. J. Am. Stat. Assoc. 117(539), 1424–1437 (2022)
  • Duarte, B.P.M., Granjo, J.F.O., Wong, W.K.: Optimal exact designs of experiments via mixed integer nonlinear programming. Stat. Comput. 30(1), 93–112 (2020)
  • Dennis, J.E., Jr., Moré, J.J.: Quasi-Newton methods, motivation and theory. SIAM Rev. 19(1), 46–89 (1977)
  • Duarte, B.P.M., Sagnol, G., Wong, W.K.: An algorithm based on semidefinite programming for finding minimax optimal designs. Comput. Stat. Data Anal. 119, 99–117 (2018)
  • Duarte, B.P.M.: Exact optimal designs of experiments for factorial models via mixed-integer semidefinite programming. Mathematics 11(4), 854 (2023)
  • Duarte, B.P.M., Wong, W.K.: A semi-infinite programming based algorithm for finding minimax optimal designs for nonlinear models. Stat. Comput. 24(6), 1063–1080 (2014)
  • Duarte, B.P.M., Wong, W.K.: Finding Bayesian optimal designs for nonlinear models: a semidefinite programming-based approach. Int. Stat. Rev. 83(2), 239–262 (2015)
  • Duarte, B.P.M., Wong, W.K., Dette, H.: Adaptive grid semidefinite programming for finding optimal designs. Stat. Comput. 28(2), 441–460 (2018)
  • Fedorov, V.V.: Theory of Optimal Experiments. Academic Press, Cambridge (1972)
  • Fedorov, V.V., Hackl, P.: Model-Oriented Design of Experiments. Springer, New York (1997)
  • Fedorov, V.V., Leonov, S.L.: Optimal Design for Nonlinear Response Models. Chapman & Hall/CRC, Baton Rouge (2014)
  • Fletcher, R., Reeves, C.M.: Function minimization by conjugate gradients. Comput. J. 7(2), 149–154 (1964)
  • Givens, G.H., Hoeting, J.A.: Computational Statistics, 2nd edn. Wiley, New Jersey (2013)
  • Glonek, G.F.V., McCullagh, P.: Multivariate logistic models. J. Roy. Stat. Soc. B 57, 533–546 (1995)
  • Harman, R., Filová, L.: Computing efficient exact designs of experiments using integer quadratic programming. Comput. Stat. Data Anal. 71, 1159–1167 (2014)
  • Harman, R., Filová, L., Richtárik, P.: A randomized exchange algorithm for computing optimal approximate designs of experiments. J. Am. Stat. Assoc. 115(529), 348–361 (2020)
  • Harman, R., Jurík, T.: Computing c-optimal experimental designs using the simplex method of linear programming. Comput. Stat. Data Anal. 53(2), 247–254 (2008)
  • Harman, R., Rosa, S.: On greedy heuristics for computing D-efficient saturated subsets. Oper. Res. Lett. 48(2), 122–129 (2020)
  • Hestenes, M.R., Stiefel, E.: Methods of conjugate gradients for solving linear systems. J. Res. Natl. Bur. Stand. 49(6), 409–436 (1952)
  • Huang, Y., Tong, L., Yang, J.: Constrained D-optimal design for paid research study. Stat. Sin. (2023). https://www3.stat.sinica.edu.tw/ss_newpaper/SS-2022-0414_na.pdf
  • Itepan, N.M.: Aumento do período de aceitabilidade de pupas de Musca domestica L., 1758 (Diptera: Muscidae), irradiadas com raios gama, como hospedeira de parasitoides (Hymenoptera: Pteromalidae). Master's thesis, Universidade de São Paulo (1995)
  • Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN'95-International Conference on Neural Networks, vol. 4, pp. 1942–1948. IEEE (1995)
  • Kirkpatrick, S., Gelatt, C.D., Jr., Vecchi, M.P.: Optimization by simulated annealing. Science 220(4598), 671–680 (1983)
  • Kiefer, J.: General equivalence theory for optimum designs (approximate theory). Ann. Stat. 2, 849–879 (1974)
  • Lukemire, J., Mandal, A., Wong, W.K.: d-QPSO: a quantum-behaved particle swarm technique for finding D-optimal designs with discrete and continuous factors and a binary response. Technometrics 61(1), 77–87 (2019)
  • Lukemire, J., Mandal, A., Wong, W.K.: Optimal experimental designs for ordinal models with mixed factors for industrial and healthcare applications. J. Qual. Technol. 54(2), 184–196 (2022)
  • McCullagh, P., Nelder, J.: Generalized Linear Models, 2nd edn. Chapman and Hall/CRC, Baton Rouge (1989)
  • Nelder, J.A., Mead, R.: A simplex method for function minimization. Comput. J. 7(4), 308–313 (1965)
  • Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, New York (2006)
  • Phadke, M.S.: Quality Engineering Using Robust Design. Prentice-Hall, Englewood Cliffs (1989)
  • Poli, R., Kennedy, J., Blackwell, T.: Particle swarm optimization: an overview. Swarm Intell. 1, 33–57 (2007)
  • Pukelsheim, F.: Optimal Design of Experiments. Wiley, New Jersey (1993)
  • Pukelsheim, F.: Optimal Design of Experiments. SIAM, Philadelphia (2006)
  • Sagnol, G.: Computing optimal designs of multiresponse experiments reduces to second-order cone programming. J. Stat. Plann. Inference 141(5), 1684–1708 (2011)
  • Seber, G.A.F.: A Matrix Handbook for Statisticians. Wiley, New Jersey (2008)
  • Sagnol, G., Harman, R.: Computing exact D-optimal designs by mixed integer second-order cone programming. Ann. Stat. 43(5), 2198–2224 (2015)
  • Silvey, S.D., Titterington, D.M., Torsney, B.: An algorithm for optimal designs on a finite design space. Commun. Stat. Theory Methods 14, 1379–1389 (1978)
  • Stufken, J., Yang, M.: Optimal designs for generalized linear models. In: Hinkelmann, K. (ed.) Design and Analysis of Experiments, Volume 3: Special Designs and Applications, pp. 137–164 (Chap. 4). Wiley, New Jersey (2012)
  • Titterington, D.M.: Algorithms for computing D-optimal designs on finite design spaces. In: Proc. of the 1976 Conf. on Information Science and Systems, vol. 3, pp. 213–216. Johns Hopkins University, Baltimore (1976)
  • Titterington, D.M.: Estimation of correlation coefficients by ellipsoidal trimming. J. R. Stat. Soc. Ser. C (Appl. Stat.) 27, 227–234 (1978)
  • Venables, W.N., Ripley, B.D.: Modern Applied Statistics with S, 4th edn. Springer, New York (2002)
  • Vo-Thanh, N., Jans, R., Schoen, E.D., Goos, P.: Symmetry breaking in mixed integer linear programming formulations for blocking two-level orthogonal experimental designs. Comput. Oper. Res. 97, 96–110 (2018)
  • Whitman, C., Gilbert, T.M., Rahn, A.M., Antonell, J.A.: Determining factors affecting ESD failure voltage using DOE. Microelectron. Reliab. 46(8), 1228–1237 (2006)
  • Wu, F.-C.: Simultaneous optimization of robust design with quantitative and ordinal data. Int. J. Ind. Eng. Theory Appl. Pract. 5, 231–238 (2008)
  • Wong, W.K., Zhou, J.: Using CVX to construct optimal designs for biomedical studies with multiple objectives. J. Comput. Graph. Stat. 32(2), 744–753 (2023)
  • Yang, M., Biedermann, S., Tang, E.: On optimal designs for nonlinear models: a general and efficient algorithm. J. Am. Stat. Assoc. 108, 1411–1420 (2013)
  • Yang, J., Mandal, A.: D-optimal factorial designs under generalized linear models. Commun. Stat. Simul. Comput. 44, 2264–2277 (2015)
  • Yang, J., Mandal, A., Majumdar, D.: Optimal designs for \(2^k\) factorial experiments with binary response. Stat. Sin. 26, 385–411 (2016)
  • Yang, J., Tong, L., Mandal, A.: D-optimal designs with ordered categorical data. Stat. Sin. 27, 1879–1902 (2017)
  • Yu, Y.: D-optimal designs via a cocktail algorithm. Stat. Comput. 21, 475–481 (2011)
  • Ye, J.J., Zhou, J.: Minimizing the condition number to construct design points for polynomial regression models. SIAM J. Optim. 23(1), 666–686 (2013)
  • Zocchi, S.S., Atkinson, A.C.: Optimum experimental designs for multinomial logistic models. Biometrics 55, 437–444 (1999)


Acknowledgements

The authors gratefully acknowledge the authors of Ai et al. (2023), Lukemire et al. (2019) and Lukemire et al. (2022) for kindly sharing their source codes, which we used to implement and compare their methods with ours. The authors gratefully acknowledge the support from the U.S. NSF grants DMS-1924859 and DMS-2311186.

Author information

Contributions

Conceptualization and methodology, all authors; software, Y.H., K.L., and J.Y.; validation, Y.H., A.M., and J.Y.; formal analysis, Y.H. and K.L.; investigation, Y.H. and J.Y.; resources, all authors; data curation, Y.H.; writing-original draft preparation, all authors; writing-review and editing, all authors; supervision, J.Y. and A.M.; project administration, J.Y. and A.M.; funding acquisition, J.Y. and A.M. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Jie Yang.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (PDF 1561 KB)

Appendices

Computing \(u_{st}^{{\textbf{x}}}\) in Fisher information \({\textbf{F}}_{{\textbf{x}}}\)

In this section, we provide more technical details for Sect. 3.1 and Theorem 2.

For MLM (1), Corollary 3.1 in Bu et al. (2020) provides the alternative form \({{\textbf{F}}}_{{{\textbf{x}}}_i} = {\textbf{X}}_i^T {{\textbf{U}}}_i {{\textbf{X}}}_i\), which we use to compute the Fisher information \({{\textbf{F}}}_{{\textbf{x}}}\) at an arbitrary \({{\textbf{x}}} \in {{\mathcal {X}}}\). More specifically, the corresponding model matrix at \({{\textbf{x}}}\) is

$$\begin{aligned} {{\textbf{X}}}_{{\textbf{x}}}= \begin{pmatrix} {{\textbf{h}}}_1^T({{\textbf{x}}}) & {\varvec{0}}^T & \cdots & {\varvec{0}}^T & {{\textbf{h}}}_c^T({{\textbf{x}}})\\ {\varvec{0}}^T & {{\textbf{h}}}_2^T({{\textbf{x}}}) & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & {\varvec{0}}^T & {{\textbf{h}}}_c^T({{\textbf{x}}})\\ {\varvec{0}}^T & \cdots & {\varvec{0}}^T & {{\textbf{h}}}_{J-1}^T({{\textbf{x}}}) & {{\textbf{h}}}_c^T({{\textbf{x}}})\\ {\varvec{0}}^T & \cdots & \cdots & {\varvec{0}}^T & {\varvec{0}}^T \end{pmatrix}_{J \times p} \end{aligned}$$
(A1)

where \({{\textbf{h}}}^T_j(\cdot ) = (h_{j1}(\cdot ), \ldots , h_{jp_j}(\cdot ))\) and \({{\textbf{h}}}^T_c(\cdot ) = (h_1(\cdot ), \ldots , h_{p_c}(\cdot ))\) are known predictor functions. Let \(\varvec{\beta }_j\) and \(\varvec{\zeta }\) denote the model parameters associated with \({{\textbf{h}}}^T_j({{\textbf{x}}})\) and \({{\textbf{h}}}^T_c({{\textbf{x}}})\), respectively. Then the model parameter vector is \(\varvec{\theta }=(\varvec{\beta }_{1},\varvec{\beta }_{2},\ldots ,\varvec{\beta }_{J-1},\varvec{\zeta })^T \in {\mathbb {R}}^p\), and the linear predictor is \(\varvec{\eta }_{{\textbf{x}}} = {{\textbf{X}}}_{{\textbf{x}}} \varvec{\theta }= (\eta _1^{{\textbf{x}}}, \ldots , \eta _{J-1}^{{\textbf{x}}}, 0)^T \in {\mathbb {R}}^J\), where \(\eta _j^{{\textbf{x}}} = {{\textbf{h}}}_j^T({{\textbf{x}}}) \varvec{\beta }_j + {{\textbf{h}}}_c^T({{\textbf{x}}}) \varvec{\zeta }\), \(j=1, \ldots , J-1\).
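To make (A1) concrete, the following minimal Python sketch (our own illustration; the function names and the toy predictors are assumptions, not from the paper) assembles the model matrix from user-supplied predictor functions \({{\textbf{h}}}_j\) and \({{\textbf{h}}}_c\):

```python
import numpy as np

def model_matrix(x, h_list, h_c, J):
    """Assemble the J x p model matrix X_x of (A1).

    h_list : list of J-1 functions; h_list[j](x) returns the array of
             category-specific predictors h_{j+1}(x).
    h_c    : function returning the array of common predictors h_c(x).
    """
    hj = [np.atleast_1d(h(x)) for h in h_list]     # h_1(x), ..., h_{J-1}(x)
    hc = np.atleast_1d(h_c(x))
    p = sum(len(v) for v in hj) + len(hc)
    X = np.zeros((J, p))
    col = 0
    for j in range(J - 1):                         # block-diagonal beta_j part
        X[j, col:col + len(hj[j])] = hj[j]
        col += len(hj[j])
    X[:J - 1, col:] = hc                           # common zeta columns; row J stays 0
    return X

# toy example: d = 1, J = 3, h_j(x) = (1), h_c(x) = (x)
X = model_matrix(np.array([2.0]),
                 [lambda x: np.array([1.0]), lambda x: np.array([1.0])],
                 lambda x: np.array([x[0]]), J=3)
print(X)  # rows (1, 0, 2), (0, 1, 2), (0, 0, 0)
```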

According to Lemmas S.10, S.12 and S.13 in the Supplementary Material of Bu et al. (2020), the categorical probabilities \(\varvec{\pi }_{{\textbf{x}}} = (\pi _1^{{\textbf{x}}}, \ldots , \pi _J^{{\textbf{x}}})^T \in {\mathbb {R}}^J\) at \({{\textbf{x}}}\) for baseline-category, adjacent-categories and continuation-ratio logit models can be expressed as follows:

$$\begin{aligned} \pi _j^{{\textbf{x}}} = \left\{ \begin{array}{cl} \frac{\exp \{\eta _j^{{\textbf{x}}}\}}{\exp \{\eta _1^{{\textbf{x}}}\} + \cdots + \exp \{\eta _{J-1}^{{\textbf{x}}}\} + 1} & \text{ baseline-category }\\ \frac{\exp \{\eta _{J-1}^{{\textbf{x}}} + \cdots +\eta _j^{{\textbf{x}}}\}}{D_j} & \text{ adjacent-categories }\\ \exp \{\eta _j^{{\textbf{x}}}\} \prod _{l=1}^j (\exp \{\eta _l^{{\textbf{x}}}\} + 1)^{-1} & \text{ continuation-ratio } \end{array}\right. \end{aligned}$$
(A2)

for \(j=1, \ldots , J-1\), where \(D_j = \exp \{\eta _{J-1}^{{\textbf{x}}} + \cdots + \eta _1^{{\textbf{x}}}\} + \exp \{\eta _{J-1}^{{\textbf{x}}} + \cdots + \eta _2^{{\textbf{x}}}\} + \cdots +\exp \{\eta _{J-1}^{{\textbf{x}}}\} + 1\), and

$$\begin{aligned} \pi _J^{{\textbf{x}}} = \left\{ \begin{array}{cl} \frac{1}{\exp \{\eta _1^{{\textbf{x}}}\} + \cdots + \exp \{\eta _{J-1}^{{\textbf{x}}}\} + 1} & \text{ baseline-category }\\ \frac{1}{D_J} & \text{ adjacent-categories }\\ \prod _{l=1}^{J-1} (\exp \{\eta _l^{{\textbf{x}}}\} + 1)^{-1} & \text{ continuation-ratio } \end{array}\right. \end{aligned}$$

where \(D_J = \exp \{\eta _{J-1}^{{\textbf{x}}} + \cdots + \eta _1^{{\textbf{x}}}\} + \exp \{\eta _{J-1}^{{\textbf{x}}} + \cdots + \eta _2^{{\textbf{x}}}\} + \cdots +\exp \{\eta _{J-1}^{{\textbf{x}}}\} + 1\). Note that we provide the expression of \(\pi _J^{{\textbf{x}}}\) for completeness; for numerical calculations it is easier to use \(\pi _J^{{\textbf{x}}} = 1 - \pi _1^{{\textbf{x}}} - \cdots - \pi _{J-1}^{{\textbf{x}}}\).
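For implementation, the probabilities in (A2) can be evaluated directly from the linear predictors; the Python sketch below is our own (assuming \(\varvec{\eta }\) is passed without the final structural zero, so its length is \(J-1\)):

```python
import numpy as np

def class_probs(eta, model):
    """Category probabilities (pi_1, ..., pi_J) per (A2), given
    eta = (eta_1, ..., eta_{J-1}) without the final structural zero."""
    eta = np.asarray(eta, dtype=float)
    if model == "baseline-category":
        w = np.append(np.exp(eta), 1.0)      # exp(eta_j) for j < J, 1 for j = J
        return w / w.sum()                   # denominator exp(eta_1)+...+exp(eta_{J-1})+1
    if model == "adjacent-categories":
        tail = np.cumsum(eta[::-1])[::-1]    # eta_j + ... + eta_{J-1}
        w = np.append(np.exp(tail), 1.0)     # pi_J proportional to 1
        return w / w.sum()                   # denominator is D_j
    if model == "continuation-ratio":
        q = 1.0 / (1.0 + np.exp(eta))        # factors (exp(eta_l) + 1)^{-1}
        pi = np.exp(eta) * np.cumprod(q)     # pi_j, j = 1, ..., J-1
        return np.append(pi, np.prod(q))     # pi_J = prod_l (exp(eta_l) + 1)^{-1}
    raise ValueError("unknown model: " + model)

print(class_probs([0.5, -0.2], "continuation-ratio").sum())  # 1.0 up to rounding
```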

As for cumulative logit models, the candidate \({{\textbf{x}}}\) must satisfy \(-\infty< \eta _1^{{\textbf{x}}}< \eta _2^{{\textbf{x}}}< \cdots< \eta _{J-1}^{{\textbf{x}}} < \infty \). Otherwise, \(0< \pi _j^{\textbf{x}} < 1\) might be violated for some \(j=1, \ldots , J\). In other words, the feasible design region should be

$$\begin{aligned} {{\mathcal {X}}}_{\varvec{\theta }} = \{{{\textbf{x}}}\in {{\mathcal {X}}} \mid -\infty< \eta _1^{{\textbf{x}}}< \eta _2^{{\textbf{x}}}< \cdots< \eta _{J-1}^{{\textbf{x}}} < \infty \} \end{aligned}$$
(A3)

which depends on the regression parameter \(\varvec{\theta }\) (see Section S.14 in the Supplementary Material of Bu et al. (2020) for such an example). For cumulative logit models, if \({{\textbf{x}}} \in {{\mathcal {X}}}_{\varvec{\theta }}\), then

$$\begin{aligned} \pi _j^{{\textbf{x}}} = \left\{ \begin{array}{ll} \frac{\exp \{\eta _1^{{\textbf{x}}}\}}{1+\exp \{\eta _1^{{\textbf{x}}}\}} & j=1\\ \frac{\exp \{\eta _j^{{\textbf{x}}}\}}{1+\exp \{\eta _j^{{\textbf{x}}}\}} - \frac{\exp \{\eta _{j-1}^{{\textbf{x}}}\}}{1+\exp \{\eta _{j-1}^{{\textbf{x}}}\}} & 1< j < J\\ \frac{1}{1+\exp \{\eta _{J-1}^{{\textbf{x}}}\}} & j=J \end{array}\right. \end{aligned}$$
(A4)

according to Lemma S.11 of Bu et al. (2020).
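A matching sketch for the cumulative logit case (again ours, not from the paper) combines the feasibility check (A3) with (A4):

```python
import numpy as np

def cumulative_probs(eta):
    """(pi_1, ..., pi_J) for the cumulative logit model, per (A4).

    Returns None when x lies outside the feasible region (A3),
    i.e. when eta_1 < eta_2 < ... < eta_{J-1} fails.
    """
    eta = np.asarray(eta, dtype=float)
    if np.any(np.diff(eta) <= 0):        # feasibility check (A3)
        return None
    gamma = 1.0 / (1.0 + np.exp(-eta))   # cumulative probabilities gamma_j
    return np.diff(np.concatenate(([0.0], gamma, [1.0])))

print(cumulative_probs([-1.0, 0.5, 2.0]))   # valid: eta strictly increasing
print(cumulative_probs([0.5, -1.0, 2.0]))   # None: outside X_theta
```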

Once \(\varvec{\pi }_{{\textbf{x}}} \in {\mathbb {R}}^J\) is obtained, we can calculate \(u_{st}^{{\textbf{x}}} = u_{st}({\varvec{\pi }}_{\textbf{x}})\) based on Theorem A.2 in Bu et al. (2020) as follows (a numerical sketch is given after the list):

  1. (i) \(u_{st}^{{\textbf{x}}} = u_{ts}^{{\textbf{x}}}\), \(s,t=1, \ldots , J\);

  2. (ii) \(u_{sJ}^{{\textbf{x}}} = 0\) for \(s=1, \ldots , J-1\) and \(u_{JJ}^{{\textbf{x}}} = 1\);

  3. (iii) For \(s=1, \ldots , J-1\), \(u_{ss}^{{\textbf{x}}}\) is

     $$\begin{aligned} \left\{ \begin{array}{ll} \pi _s^{{\textbf{x}}} (1-\pi _s^{{\textbf{x}}}) & \text{ for baseline-category},\\ (\gamma _s^{{\textbf{x}}})^2(1-\gamma _s^{{\textbf{x}}})^2\left( (\pi _s^{{\textbf{x}}})^{-1} + (\pi _{s+1}^{{\textbf{x}}})^{-1}\right) & \text{ for cumulative},\\ \gamma _s^{{\textbf{x}}}(1-\gamma _s^{{\textbf{x}}}) & \text{ for adjacent-categories},\\ \pi _s^{{\textbf{x}}}(1-\gamma _s^{{\textbf{x}}})(1-\gamma _{s-1}^{{\textbf{x}}})^{-1} & \text{ for continuation-ratio}; \end{array}\right. \end{aligned}$$

  4. (iv) For \(1\le s < t \le J-1\), \(u_{st}^{{\textbf{x}}}\) is

     $$\begin{aligned} \left\{ \begin{array}{ll} -\pi _s^{{\textbf{x}}} \pi _t^{{\textbf{x}}} & \text{ for baseline-category},\\ -\gamma _s^{{\textbf{x}}}\gamma _t^{{\textbf{x}}}(1-\gamma _s^{{\textbf{x}}})(1-\gamma _t^{{\textbf{x}}})(\pi _t^{{\textbf{x}}})^{-1} & \text{ for cumulative}, \ t-s=1,\\ 0 & \text{ for cumulative}, \ t-s>1,\\ \gamma _s^{{\textbf{x}}}(1-\gamma _t^{{\textbf{x}}}) & \text{ for adjacent-categories},\\ 0 & \text{ for continuation-ratio}; \end{array}\right. \end{aligned}$$

where \(\gamma _j^{{\textbf{x}}} = \pi _1^{{\textbf{x}}} + \cdots + \pi _j^{{\textbf{x}}}\), \(j=1, \ldots , J-1\); \(\gamma _0^{{\textbf{x}}}\equiv 0\) and \(\gamma _J^{{\textbf{x}}}\equiv 1\).
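The four cases translate directly into code. The sketch below is our own; it uses 0-based indexing, so `gamma[s]` stands for \(\gamma _{s+1}^{{\textbf{x}}}\):

```python
import numpy as np

def U_matrix(pi, model):
    """J x J matrix U_x = (u_st) from Theorem A.2 of Bu et al. (2020),
    as listed in (i)-(iv) above; `pi` are the category probabilities."""
    pi = np.asarray(pi, dtype=float)
    J = len(pi)
    gamma = np.cumsum(pi)                # gamma_j = pi_1 + ... + pi_j
    U = np.zeros((J, J))
    U[J - 1, J - 1] = 1.0                # (ii): u_JJ = 1, u_sJ = 0 otherwise
    for s in range(J - 1):               # (iii): diagonal entries
        if model == "baseline-category":
            U[s, s] = pi[s] * (1 - pi[s])
        elif model == "cumulative":
            U[s, s] = gamma[s]**2 * (1 - gamma[s])**2 * (1/pi[s] + 1/pi[s + 1])
        elif model == "adjacent-categories":
            U[s, s] = gamma[s] * (1 - gamma[s])
        elif model == "continuation-ratio":
            g_prev = gamma[s - 1] if s > 0 else 0.0   # gamma_0 = 0
            U[s, s] = pi[s] * (1 - gamma[s]) / (1 - g_prev)
    for s in range(J - 1):               # (iv): off-diagonal entries, s < t
        for t in range(s + 1, J - 1):
            if model == "baseline-category":
                u = -pi[s] * pi[t]
            elif model == "cumulative":
                u = (-gamma[s] * gamma[t] * (1 - gamma[s]) * (1 - gamma[t]) / pi[t]
                     if t - s == 1 else 0.0)
            elif model == "adjacent-categories":
                u = gamma[s] * (1 - gamma[t])
            else:                        # continuation-ratio
                u = 0.0
            U[s, t] = U[t, s] = u        # (i): symmetry
    return U

print(U_matrix([0.2, 0.3, 0.5], "cumulative"))
```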

Example that \({{\textbf{F}}}_{{{\textbf{x}}}} = {{\textbf{F}}}_{{{\textbf{x}}}'}\) with \({{\textbf{x}}} \ne {{\textbf{x}}}'\)

Consider a special MLM (1) with proportional odds (po) (see Section S.7 in the Supplementary Material of Bu et al. (2020) for more technical details). Suppose \(d=2\) and a feasible design point \({{\textbf{x}}} = (x_1, x_2)^T \in [a, b]\times [-c, c] = \mathcal{X}\), \(c > 0\), \(J\ge 2\), \({{\textbf{h}}}_c({{\textbf{x}}}) = (x_1, x_2^2)^T\). Then the model matrix at \({{\textbf{x}}} = (x_1, x_2)^T\) is

$$\begin{aligned} {{\textbf{X}}}_{{\textbf{x}}} = \left( \begin{array}{cccccc} 1 & 0 & \cdots & 0 & x_1 & x_2^2\\ 0 & 1 & \ddots & \vdots & \vdots & \vdots \\ \vdots & \ddots & \ddots & 0 & x_1 & x_2^2\\ 0 & \cdots & 0 & 1 & x_1 & x_2^2\\ 0 & \cdots & 0 & 0 & 0 & 0 \end{array}\right) _{J\times (J+1)} \end{aligned}$$

Then \(p=J+1\). Let \(\varvec{\theta }= (\beta _1, \ldots , \beta _{J-1}, \zeta _1, \zeta _2)^T \in {\mathbb {R}}^{J+1}\) be the model parameters (since \(\varvec{\theta }\) is fixed, we may assume that \(\mathcal{X} = \mathcal{X}_{\varvec{\theta }}\) if the model is a cumulative logit model). Let \({{\textbf{x}}}' = (x_1, -x_2)^T\). Then \({{\textbf{X}}}_{{\textbf{x}}} = {{\textbf{X}}}_{{{\textbf{x}}}'}\) and thus \(\varvec{\eta }_{{\textbf{x}}} = \varvec{\eta }_{{{\textbf{x}}}'}\). According to (A2) (or (A4)), we obtain \(\varvec{\pi }_{{\textbf{x}}} = \varvec{\pi }_{{{\textbf{x}}}'}\) and then \({{\textbf{U}}}_{{\textbf{x}}} = {{\textbf{U}}}_{{{\textbf{x}}}'}\). The Fisher information matrix at \({{\textbf{x}}}\) is \({\textbf{F}}_{{\textbf{x}}} = {{\textbf{X}}}_{{\textbf{x}}}^T {{\textbf{U}}}_{{\textbf{x}}} {{\textbf{X}}}_{{\textbf{x}}} = {{\textbf{X}}}_{{{\textbf{x}}}'}^T {\textbf{U}}_{{{\textbf{x}}}'} {{\textbf{X}}}_{{{\textbf{x}}}'} = {{\textbf{F}}}_{{{\textbf{x}}}'}\). Note that \({{\textbf{x}}} \ne {{\textbf{x}}}'\) whenever \(x_2 \ne 0\).
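This symmetry is easy to confirm numerically; the toy instance below (our own, with \(J = 3\)) builds \({{\textbf{X}}}_{{\textbf{x}}}\) for the proportional-odds structure above and checks \({{\textbf{X}}}_{{\textbf{x}}} = {{\textbf{X}}}_{{{\textbf{x}}}'}\):

```python
import numpy as np

def X_po(x, J=3):
    """Model matrix of the proportional-odds example: h_c(x) = (x1, x2^2)."""
    X = np.zeros((J, J + 1))
    X[:J - 1, :J - 1] = np.eye(J - 1)        # intercepts beta_1, ..., beta_{J-1}
    X[:J - 1, J - 1] = x[0]                  # zeta_1 column: x1
    X[:J - 1, J] = x[1]**2                   # zeta_2 column: x2^2
    return X

x  = np.array([0.4,  0.7])
xp = np.array([0.4, -0.7])                   # x' = (x1, -x2)
print(np.array_equal(X_po(x), X_po(xp)))     # True -> eta, pi, U, F all agree
```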

First-order derivative of sensitivity function

As mentioned in Sect. 3.1, to apply Algorithm 1 for MLM, we need to calculate the first-order derivative of the sensitivity function \(d({\textbf{x}}, \varvec{\xi })\).

Recall that the first k (\(1\le k\le d\)) factors are continuous. Given \({{\textbf{x}}} = (x_1, \ldots , x_d)^T \in \mathcal{X}\), for each \(i=1, \ldots , k\), according to Formulae 17.1(a), 17.2(a) and 17.7 in Seber (2008),

$$\begin{aligned} \frac{\partial d({{\textbf{x}}}, \varvec{\xi })}{\partial x_i}&= \frac{\partial \textrm{tr}({{\textbf{F}}}(\varvec{\xi })^{-1} {{\textbf{F}}}_{{\textbf{x}}})}{\partial x_i} = \textrm{tr}\left( {{\textbf{F}}}(\varvec{\xi })^{-1} \frac{\partial {{\textbf{F}}}_{{\textbf{x}}}}{\partial x_i}\right) \\&= \textrm{tr}\left( {{\textbf{F}}}(\varvec{\xi })^{-1} \left[ \frac{\partial {{\textbf{X}}}_{{\textbf{x}}}^T}{\partial x_i} {{\textbf{U}}}_{{\textbf{x}}} {{\textbf{X}}}_{{\textbf{x}}} + {{\textbf{X}}}_{{\textbf{x}}}^T \frac{\partial {{\textbf{U}}}_{{\textbf{x}}}}{\partial x_i} {{\textbf{X}}}_{{\textbf{x}}} + {{\textbf{X}}}_{{\textbf{x}}}^T {{\textbf{U}}}_{{\textbf{x}}} \frac{\partial {{\textbf{X}}}_{{\textbf{x}}}}{\partial x_i}\right] \right) \end{aligned}$$
(C5)

where

$$\begin{aligned} \frac{\partial {{\textbf{X}}}_{{\textbf{x}}}}{\partial x_i}= \begin{pmatrix} \frac{\partial {{\textbf{h}}}_1^T({{\textbf{x}}})}{\partial x_i} & {\varvec{0}}^T & \cdots & {\varvec{0}}^T & \frac{\partial {{\textbf{h}}}_c^T({{\textbf{x}}})}{\partial x_i}\\ {\varvec{0}}^T & \frac{\partial {{\textbf{h}}}_2^T({{\textbf{x}}})}{\partial x_i} & \ddots & \vdots & \vdots \\ \vdots & \ddots & \ddots & {\varvec{0}}^T & \frac{\partial {{\textbf{h}}}_c^T({{\textbf{x}}})}{\partial x_i}\\ {\varvec{0}}^T & \cdots & {\varvec{0}}^T & \frac{\partial {{\textbf{h}}}_{J-1}^T({{\textbf{x}}})}{\partial x_i} & \frac{\partial {{\textbf{h}}}_c^T({{\textbf{x}}})}{\partial x_i}\\ {\varvec{0}}^T & \cdots & \cdots & {\varvec{0}}^T & {\varvec{0}}^T \end{pmatrix}_{J \times p} \end{aligned}$$
(C6)

and \(\frac{\partial {{\textbf{U}}}_{{\textbf{x}}}}{\partial x_i} = \left( \frac{\partial u^{{\textbf{x}}}_{st}}{\partial x_i}\right) _{s,t=1, \ldots , J}\) with

$$\begin{aligned} \frac{\partial u^{{\textbf{x}}}_{st}}{\partial x_i} = \frac{\partial u^{{\textbf{x}}}_{st}}{\partial \varvec{\pi }_{{\textbf{x}}}^T} \cdot \frac{\partial \varvec{\pi }_{{\textbf{x}}}}{\partial \varvec{\eta }_{{\textbf{x}}}^T} \cdot \frac{\partial \varvec{\eta }_{{\textbf{x}}}}{\partial x_i} = \frac{\partial u^{{\textbf{x}}}_{st}}{\partial \varvec{\pi }_{{\textbf{x}}}^T} \cdot \left( {{\textbf{C}}}^T {\textbf{D}}_{{\textbf{x}}}^{-1} {{\textbf{L}}}\right) ^{-1} \cdot \frac{\partial {{\textbf{X}}}_{{\textbf{x}}}}{\partial x_i} \cdot \varvec{\theta } \end{aligned}$$
(C7)

where \({{\textbf{C}}}\) and \({{\textbf{L}}}\) are defined as in (1), and \({{\textbf{D}}}_{{\textbf{x}}} = \textrm{diag}({{\textbf{L}}} \varvec{\pi }_{{\textbf{x}}})\). An explicit formula for \(({{\textbf{C}}}^T {{\textbf{D}}}_{{\textbf{x}}}^{-1} {{\textbf{L}}})^{-1}\) can be found in Section S.3 of the Supplementary Material of Bu et al. (2020), with \({{\textbf{x}}}_i\) there replaced by \({{\textbf{x}}}\). As for \(\frac{\partial u^{{\textbf{x}}}_{st}}{\partial \varvec{\pi }_{{\textbf{x}}}^T}\), we have the following explicit formulae:

  1. (i) \(\frac{\partial u_{st}^{{\textbf{x}}}}{\partial \varvec{\pi }_{{\textbf{x}}}} = \frac{\partial u_{ts}^{{\textbf{x}}}}{\partial \varvec{\pi }_{{\textbf{x}}}}\), \(s,t=1, \ldots , J\);

  2. (ii) \(\frac{\partial u_{sJ}^{{\textbf{x}}}}{\partial \varvec{\pi }_{{\textbf{x}}}} = {{\textbf{0}}} \in {\mathbb {R}}^J\) for \(s=1, \ldots , J\);

  3. (iii) For \(s=1, \ldots , J-1\), \(\frac{\partial u_{ss}^{{\textbf{x}}}}{\partial \varvec{\pi }_{{\textbf{x}}}}\) is

     $$\begin{aligned} \left\{ \begin{array}{cl} \left( \pi _s^{{\textbf{x}}} {{\textbf{1}}}_{s-1}^T,\ 1-\pi _s^{{\textbf{x}}},\ \pi _s^{{\textbf{x}}} {{\textbf{1}}}_{J-s}^T\right) ^T & \text{ for baseline-category }\\ u_{ss}^{{\textbf{x}}} \left[ \left( \frac{2}{\gamma _s^{{\textbf{x}}}} {{\textbf{1}}}_s^T,\ \frac{2}{1-\gamma _s^{{\textbf{x}}}} {{\textbf{1}}}_{J-s}^T\right) ^T - \frac{\pi _{s+1}^{{\textbf{x}}} {{\textbf{e}}}_s}{\pi _s^{{\textbf{x}}} (\pi _s^{{\textbf{x}}} + \pi _{s+1}^{{\textbf{x}}})} - \frac{\pi _s^{{\textbf{x}}} {{\textbf{e}}}_{s+1}}{\pi _{s+1}^{{\textbf{x}}} (\pi _s^{{\textbf{x}}} + \pi _{s+1}^{{\textbf{x}}})}\right] & \text{ for cumulative }\\ \left( (1-\gamma _s^{{\textbf{x}}}) {{\textbf{1}}}_s^T,\ \gamma _s^{{\textbf{x}}} {{\textbf{1}}}_{J-s}^T\right) ^T & \text{ for adjacent-categories }\\ \left( {{\textbf{0}}}_{s-1}^T,\ \frac{(1-\gamma _s^{{\textbf{x}}})^2}{(1-\gamma _{s-1}^{{\textbf{x}}})^2},\ \frac{(\pi _s^{{\textbf{x}}})^2 {{\textbf{1}}}_{J-s}^T}{(1-\gamma _{s-1}^{{\textbf{x}}})^2}\right) ^T & \text{ for continuation-ratio } \end{array}\right. \end{aligned}$$

     where \({{\textbf{e}}}_s\) is the \(J\times 1\) vector with the \(s\)th coordinate 1 and all others 0, \({{\textbf{1}}}_s\) is the \(s\times 1\) vector of all 1, and \({{\textbf{0}}}_s\) is the \(s\times 1\) vector of all 0.

  4. (iv) For \(1\le s < t \le J-1\), \(\frac{\partial u_{st}^{{\textbf{x}}}}{\partial \varvec{\pi }_{{\textbf{x}}}}\) is

     $$\begin{aligned} \left\{ \begin{array}{cl} \left( {{\textbf{0}}}_{s-1}^T,\ -\pi _t^{{\textbf{x}}},\ {{\textbf{0}}}_{t-s-1}^T,\ -\pi _s^{{\textbf{x}}},\ {{\textbf{0}}}_{J-t}^T\right) ^T & \text{ for baseline-category }\\ \left( -(1-\gamma _s^{{\textbf{x}}})(1-\gamma _t^{{\textbf{x}}})\left( 1 + \frac{2\gamma _s^{{\textbf{x}}}}{\pi _t^{{\textbf{x}}}}\right) {{\textbf{1}}}_s^T,\ -\gamma _s^{{\textbf{x}}} (1-\gamma _t^{{\textbf{x}}}) \left[ 1 - \frac{\gamma _s^{{\textbf{x}}} (1-\gamma _t^{{\textbf{x}}})}{(\pi _t^{{\textbf{x}}})^2}\right] ,\ -\gamma _s^{{\textbf{x}}} \gamma _t^{{\textbf{x}}} \left[ 1 + \frac{2(1-\gamma _t^{{\textbf{x}}})}{\pi _t^{{\textbf{x}}}}\right] {{\textbf{1}}}_{J-s-1}^T\right) ^T & \text{ for cumulative }, \ t-s=1\\ {{\textbf{0}}}_J & \text{ for cumulative }, \ t-s>1\\ \left( (1-\gamma _t^{{\textbf{x}}}) {{\textbf{1}}}_s^T,\ {{\textbf{0}}}_{t-s}^T,\ \gamma _s^{{\textbf{x}}} {{\textbf{1}}}_{J-t}^T\right) ^T & \text{ for adjacent-categories }\\ {{\textbf{0}}}_J & \text{ for continuation-ratio } \end{array}\right. \end{aligned}$$

where \(\gamma _j^{{\textbf{x}}} = \pi _1^{{\textbf{x}}} + \cdots + \pi _j^{{\textbf{x}}}\), \(j=1, \ldots , J-1\); \(\gamma _0^{{\textbf{x}}}\equiv 0\) and \(\gamma _J^{{\textbf{x}}}\equiv 1\).

Thus the explicit formulae for \(\frac{\partial d({{\textbf{x}}}, \varvec{\xi })}{\partial x_i}\), \(i=1, \ldots , k\), can be obtained via (C5). Among its components, only \(\frac{\partial {\textbf{X}}_{{\textbf{x}}}}{\partial x_i}\) depends on \(i\), which can be exploited to speed up the computations.
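Since the model-specific pieces (C6) and (C7) are easy to get wrong in code, a central finite-difference check of (C5) is a useful validation. The sketch below is ours; `fisher_info` is a hypothetical user-supplied function returning \({{\textbf{F}}}_{{\textbf{x}}}\), and the toy \({{\textbf{F}}}(\varvec{\xi })={{\textbf{I}}}\) is for illustration only:

```python
import numpy as np

def sens_grad_fd(x, F_xi_inv, fisher_info, k, h=1e-6):
    """Finite-difference check of (C5): d/dx_i tr(F(xi)^{-1} F_x),
    differentiating only the first k (continuous) coordinates of x.

    fisher_info : user-supplied function returning F_x (hypothetical name).
    """
    grad = np.zeros(k)
    for i in range(k):
        e = np.zeros_like(x)
        e[i] = h
        dF = (fisher_info(x + e) - fisher_info(x - e)) / (2 * h)  # ~ dF_x/dx_i
        grad[i] = np.trace(F_xi_inv @ dF)
    return grad

# toy check with F(xi) = I and a known F_x = A(x), so d = tr(A(x))
A = lambda x: np.array([[1 + x[0]**2, x[0] * x[1]],
                        [x[0] * x[1], 1 + x[1]**2]])
print(sens_grad_fd(np.array([0.3, 0.5]), np.eye(2), A, k=2))
# exact: d tr(A)/dx0 = 2*x0 = 0.6, d tr(A)/dx1 = 2*x1 = 1.0
```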

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Huang, Y., Li, K., Mandal, A. et al. ForLion: a new algorithm for D-optimal designs under general parametric statistical models with mixed factors. Stat Comput 34, 157 (2024). https://doi.org/10.1007/s11222-024-10465-x

