Abstract
Interim analysis techniques for clinical trials provide improved power with smaller average sample sizes. These techniques crucially require multivariate probability calculations for determining critical values. Most existing techniques rely on multivariate normal approximations to the joint null distribution of test statistics evaluated on potential interim and full data sets. More accurate critical values for nonparametric testing with an interim analysis are given, using a new multivariate Cornish–Fisher expansion. While earlier authors demonstrated that such an expansion is possible, it has never been implemented before this manuscript. Generally, the superior accuracy of power calculations via an Edgeworth series is demonstrated. Example calculations giving sample sizes from desired power are provided. Calculations are implemented in an R package.
References
Amrhein, P. (1995). An example of a two-sided Wilcoxon signed rank test which is not unbiased. Annals of the Institute of Statistical Mathematics, 47(1), 167–170.
Cahalin, L. P., Arena, R., & Guazzi, M. (2012). Comparison of heart rate recovery after the six-minute walk test to cardiopulmonary exercise testing in patients with heart failure and reduced and preserved ejection fraction. The American Journal of Cardiology, 110(3), 467–468. http://www.sciencedirect.com/science/article/pii/S0002914912013318.
Chow, S.-C., & Huang, Z. (2020). Innovative design and analysis for rare disease drug development. Journal of Biopharmaceutical Statistics, 30(3), 537–549. PMID: 32065047. https://doi.org/10.1080/10543406.2020.1726371.
Cornish, E. A., & Fisher, R. A. (1938). Moments and cumulants in the specification of distributions. Revue de l’Institut International de Statistique/Review of the International Statistical Institute, 5(4), 307–320. http://www.jstor.org/stable/1400905.
Fix, E., & Hodges, J. L. (1955). Significance probabilities of the Wilcoxon test. Annals of Mathematical Statistics, 26(2), 301–312. https://doi.org/10.1214/aoms/1177728547.
Henricson, E., Abresch, R., Han, J. J., Nicorici, A., Goude Keller, E., Elfring, G., Reha, A., Barth, J., & McDonald, C. M. (2012). Percent-predicted 6-minute walk distance in Duchenne muscular dystrophy to account for maturational influences. PLOS Currents, 4, RRN1297–RRN1297. https://www.ncbi.nlm.nih.gov/pubmed/22306689.
Kolassa, J. (1989). Topics in Series Approximations to Distribution Functions. Ph.D. thesis, University of Chicago.
Kolassa, J. E. (1995). A comparison of size and power calculations for the Wilcoxon statistic for ordered categorical data. Statistics in Medicine, 14(14), 1577–1581.
Kolassa, J. E. (2003). Multivariate saddlepoint tail probability approximations. The Annals of Statistics, 31(1), 274–286. http://www.jstor.org/stable/3448375.
Kolassa, J. E., & McCullagh, P. (1990). Edgeworth series for lattice distributions. The Annals of Statistics, 18(2), 981–985. https://doi.org/10.1214/aos/1176347637.
Kolassa, J. E., & Seifu, Y. (2013). Nonparametric multivariate inference on shift parameters. Academic Radiology, 20(7), 883–888. http://www.sciencedirect.com/science/article/pii/S1076633213001645.
Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18(1), 50–60. https://doi.org/10.1214/aoms/1177730491.
McCullagh, P. (1987). Tensor methods in statistics. Monographs on statistics and applied probability. Chapman and Hall. https://books.google.com/books?id=JEjvAAAAMAAJ.
O’Brien, P. C., & Fleming, T. R. (1979). A multiple testing procedure for clinical trials. Biometrics, 35(3), 549–556. http://www.jstor.org/stable/2530245.
Peripheral and Central Nervous System Drugs Advisory Committee (2016). NDA 206488: Eteplirsen. Technical report. U.S. Food and Drug Administration.
Pharmaceutical Research and Manufacturers of America (2019). Spurring innovation in rare diseases. Downloaded 22 March 2020. https://www.phrma.org/-/media/Project/PhRMA/PhRMA-Org/PhRMA-Org/PDF/P-R/RareDisease_Backgrounder1.pdf.
Pocock, S. J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika, 64(2), 191–199. https://doi.org/10.1093/biomet/64.2.191.
Rahardja, D., Zhao, Y. D., & Qu, Y. (2009). Sample size determinations for the Wilcoxon-Mann-Whitney test: A comprehensive review. Statistics in Biopharmaceutical Research, 1(3), 317–322. https://doi.org/10.1198/sbr.2009.0016.
Rom, D. M., & McTague, J. A. (2020). Exact critical values for group sequential designs with small sample sizes. Journal of Biopharmaceutical Statistics, 30(4), 1–13. PMID: 32151177. https://doi.org/10.1080/10543406.2020.1730878.
Shieh, G., Jan, S., & Randles, R. H. (2006). On power and sample size determinations for the Wilcoxon-Mann-Whitney test. Journal of Nonparametric Statistics, 18(1), 33–43. https://doi.org/10.1080/10485250500473099.
Spurrier, J. D., & Hewett, J. E. (1976). Two-stage Wilcoxon tests of hypotheses. Journal of the American Statistical Association, 71(356), 982–987. https://www.tandfonline.com/doi/abs/10.1080/01621459.1976.10480981.
Sundrum, R. M. (1954). A further approximation to the distribution of Wilcoxon’s statistic in the general case. Journal of the Royal Statistical Society. Series B (Methodological), 16(2), 255–260. http://www.jstor.org/stable/2984051.
Takemura, A., & Takeuchi, K. (1988). Some results on univariate and multivariate Cornish-Fisher expansion: Algebraic properties and validity. Sankhyā: The Indian Journal of Statistics, Series A (1961–2002), 50(1), 111–136. http://www.jstor.org/stable/25050684.
Takeuchi, K. (1978). A multivariate generalization of Cornish Fisher expansion and its applications (in Japanese). Keizaigaku Ronshu, 44(2), 1–12.
Whitehead, J., & Jaki, T. (2009). One- and two-stage design proposals for a phase II trial comparing three active treatments with control using an ordered categorical endpoint. Statistics in Medicine, 28(5), 828–847. https://onlinelibrary.wiley.com/doi/abs/10.1002/sim.3508.
Wilding, G. E., Shan, G., & Hutson, A. D. (2011). Exact two-stage designs for phase II activity trials with rank-based endpoints. Contemporary Clinical Trials, 33(2), 332–341. https://doi.org/10.1016/j.cct.2011.10.008.
Wold, H. (1934). Sheppard’s correction formulae in several variables. Scandinavian Actuarial Journal, 1934(1), 248–255. https://doi.org/10.1080/03461238.1934.10419243.
Wolfram Research, Inc. (2018). Mathematica, Version 11.3. Champaign, IL.
Zhong, D., & Kolassa, J. (2017). Moments and cumulants of the two-stage Mann-Whitney statistic. Technical report.
Acknowledgements
The authors thank Yuzo Maruyama for an English summary of Takeuchi (1978) and Todd Kuffner for the pointer to Takemura and Takeuchi (1988). John Kolassa thanks NSF DMS 1712839 for partial support for this project. The authors thank the editors and referees for helpful comments.
Appendices
Appendix 1: A Bivariate Recursion for Exact Probabilities
The univariate recursion counts the data set orderings that lead to a given statistic value by decomposing them into orderings with one fewer value from the first group and orderings with one fewer value from the second group. The bivariate recursion extends this idea and is similar to that of Wilding et al. (2011).
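Let b(u1, u2, m1, n1, m2, n2) represent the number of orderings of (4) for which U1 = u1 and U2 = u2. This number is zero if any of the sample sizes m1, n1, m2, n2 is negative, if either statistic value is negative, or if either statistic value is larger than its maximum value. It is also zero if both additional sample sizes for stage 2 are zero but the second statistic value exceeds the first. If all sample sizes are zero, then the sums in (1) and (3) are empty, and both statistic values are zero; hence b(0, 0, 0, 0, 0, 0) = 1. Writing u_1^{\max} and u_2^{\max} for the maximum values of U1 and U2, these end conditions are given by
\[
b(u_1, u_2, m_1, n_1, m_2, n_2) =
\begin{cases}
0 & \text{if } \min(m_1, n_1, m_2, n_2) < 0, \\
0 & \text{if } u_1 < 0 \text{ or } u_2 < 0, \\
0 & \text{if } u_1 > u_1^{\max} \text{ or } u_2 > u_2^{\max}, \\
0 & \text{if } m_2 = n_2 = 0 \text{ and } u_2 > u_1, \\
1 & \text{if } m_1 = n_1 = m_2 = n_2 = 0 \text{ and } u_1 = u_2 = 0.
\end{cases}
\]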
Otherwise, the number of rearrangements of the data (4) giving rise to statistic values u1 and u2 is the sum of four contributions, one for each possible origin of the largest value. First, add those with sample sizes m1 − 1, n1, m2, n2, with an additional value from the first group in the first sample that exceeds all values in the sample, and hence leaves both statistic values unchanged. Second, add those with sample sizes m1, n1 − 1, m2, n2, with an additional value from the second group in the first sample that exceeds all values in the sample, and hence increases the first statistic by m1 and the second statistic by m1 + m2. Third, add those with sample sizes m1, n1, m2 − 1, n2, with an additional value from the first group in the second sample that exceeds all values in the sample, and hence leaves both statistic values unchanged. Fourth, add those with sample sizes m1, n1, m2, n2 − 1, with an additional value from the second group in the second sample that exceeds all values in the sample, and hence leaves U1 unchanged and increases the second statistic by m1 + m2. This leads to the following recursion:
\[
\begin{aligned}
b(u_1, u_2, m_1, n_1, m_2, n_2) ={}& b(u_1, u_2, m_1 - 1, n_1, m_2, n_2) \\
&+ b(u_1 - m_1, u_2 - m_1 - m_2, m_1, n_1 - 1, m_2, n_2) \\
&+ b(u_1, u_2, m_1, n_1, m_2 - 1, n_2) \\
&+ b(u_1, u_2 - m_1 - m_2, m_1, n_1, m_2, n_2 - 1).
\end{aligned}
\]
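The four contributions and the end conditions described above can be sketched as a memoized recursion. This is an illustrative sketch, not the authors' R implementation. The function names `b` and `joint_counts` are hypothetical, and the maximum statistic values are assumed to be m1 n1 for U1 and (m1 + m2)(n1 + n2) for U2, which is what the increments in the recursion suggest.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def b(u1, u2, m1, n1, m2, n2):
    """Count orderings with stage-1 statistic u1 and pooled statistic u2."""
    # End conditions: impossible sample sizes or negative statistic values.
    if min(m1, n1, m2, n2) < 0 or u1 < 0 or u2 < 0:
        return 0
    # Assumed maxima (not stated in the text above): every first-group value
    # lies below every second-group value, in stage 1 and in the pooled data.
    if u1 > m1 * n1 or u2 > (m1 + m2) * (n1 + n2):
        return 0
    # With no stage-2 data, the second statistic cannot exceed the first.
    if m2 == 0 and n2 == 0 and u2 > u1:
        return 0
    # Empty data set: the checks above already force u1 == u2 == 0.
    if m1 == n1 == m2 == n2 == 0:
        return 1
    m = m1 + m2  # number of first-group values in the pooled data
    return (
        b(u1, u2, m1 - 1, n1, m2, n2)              # largest: group 1, stage 1
        + b(u1 - m1, u2 - m, m1, n1 - 1, m2, n2)   # largest: group 2, stage 1
        + b(u1, u2, m1, n1, m2 - 1, n2)            # largest: group 1, stage 2
        + b(u1, u2 - m, m1, n1, m2, n2 - 1)        # largest: group 2, stage 2
    )


def joint_counts(m1, n1, m2, n2):
    """Return {(u1, u2): count} for every pair with a nonzero count."""
    counts = {}
    for u1 in range(m1 * n1 + 1):
        for u2 in range((m1 + m2) * (n1 + n2) + 1):
            c = b(u1, u2, m1, n1, m2, n2)
            if c:
                counts[(u1, u2)] = c
    return counts
```

Dividing each count by the total number of orderings, (m1 + n1 + m2 + n2)! / (m1! n1! m2! n2!), gives the exact joint null probabilities. For example, the counts from `joint_counts(2, 2, 2, 2)` should sum to 8!/(2!)^4 = 2520.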
The numerical examples in Figs. 1 and 2 compare approximations to bivariate probabilities and quantiles for U = (U1, U2) with the exact values.
Appendix 2: A Continuous Example with Nonzero Skewness
Our aim in determining the expansion for c2 is to apply the techniques to Wilcoxon testing, but the same quantile approximation may be used more generally. The Wilcoxon statistic is somewhat atypical: its third cumulants are zero, which makes the effect of the correction less dramatic, and it is discrete, and hence lacks the continuity for which the technique was developed. Before applying the approximation to the Wilcoxon statistic, we therefore present a more general example consisting of a continuous distribution with nonzero third-order cumulants.
Consider independent exponential random variables Y1, Y2, Y3, and let U1 = Y1 + Y3 and U2 = Y2 + Y3. Figure 3 compares the Edgeworth (E) or Cornish–Fisher (CF) approximation and the normal (N) approximation with Monte Carlo (MC) values, computed from 500,000 samples and treated as the truth. The panels are as follows:
(a) Difference of the E and N upper-tail univariate probability approximations from the MC approximation, as a function of the MC approximation.
(b) Error of the E and N upper-tail bivariate probability approximations, as a function of the ordinate.
(c) Contours of the MC upper-tail probabilities.
(d) CF and N approximations to the upper tail, plotted against the MC value. The CF and N values depend somewhat on the target for the first univariate tail, and so are represented as a range.
Panel (b) shows that the Edgeworth approximation fails to dominate the normal approximation only in a narrow band between the contours marked 0 in the middle of the plot.
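The univariate comparison in panel (a) can be illustrated with a short sketch, under assumptions that go beyond the text. The exponentials are taken to have unit rate (the rate is not specified above), so that U1 = Y1 + Y3 has a Gamma(2, 1) distribution with mean 2, variance 2, third cumulant 4 and fourth cumulant 12. The closed-form gamma tail replaces the Monte Carlo reference. The Edgeworth series is the standard univariate expansion carried to the fourth-cumulant term, which may differ from the order used by the authors. All function names are hypothetical.

```python
import math

# Cumulants of U1 = Y1 + Y3 for unit-rate exponentials (an assumption).
MEAN, VAR = 2.0, 2.0
RHO3 = 4.0 / VAR ** 1.5   # standardized third cumulant
RHO4 = 12.0 / VAR ** 2    # standardized fourth cumulant


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def edgeworth_upper_tail(z, rho3, rho4):
    """Upper tail from the Edgeworth series through the fourth cumulant."""
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    he2 = z ** 2 - 1.0                       # Hermite polynomial He_2
    he3 = z ** 3 - 3.0 * z                   # He_3
    he5 = z ** 5 - 10.0 * z ** 3 + 15.0 * z  # He_5
    correction = rho3 * he2 / 6.0 + rho4 * he3 / 24.0 + rho3 ** 2 * he5 / 72.0
    return normal_upper_tail(z) + phi * correction


def exact_upper_tail(u):
    """P(U1 > u) for a Gamma(2, 1) variable, valid for u >= 0."""
    return math.exp(-u) * (1.0 + u)


def compare(u):
    """Return (exact, normal, Edgeworth) upper-tail probabilities at u."""
    z = (u - MEAN) / math.sqrt(VAR)
    return (exact_upper_tail(u),
            normal_upper_tail(z),
            edgeworth_upper_tail(z, RHO3, RHO4))


if __name__ == "__main__":
    for u in (3.0, 4.0, 5.0, 6.0):
        exact, normal, edge = compare(u)
        print(f"u={u:.1f} exact={exact:.5f} normal={normal:.5f} edgeworth={edge:.5f}")
```

The skewness correction is the term in `RHO3`; it vanishes for the Wilcoxon statistic under the null, which is why this example shows a larger improvement over the normal approximation.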
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this chapter
Kolassa, J., Chen, X., Seifu, Y., Zhong, D. (2023). Power Calculations and Critical Values for Two-Stage Nonparametric Testing Regimes. In: Yi, M., Nordhausen, K. (eds) Robust and Multivariate Statistical Methods. Springer, Cham. https://doi.org/10.1007/978-3-031-22687-8_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22686-1
Online ISBN: 978-3-031-22687-8
eBook Packages: Mathematics and Statistics (R0)