Abstract
Phase II clinical trials investigate whether a new drug or treatment has sufficient evidence of effectiveness against the disease under study. Two-stage designs are popular for phase II since they can stop in the first stage if the drug is ineffective. Investigators often face difficulties in determining the target response rates, and adaptive designs can help to set the target response rate tested in the second stage based on the number of responses observed in the first stage. Popular adaptive designs consider two alternate response rates, and they generally minimise the expected sample size at the maximum uninteresting response rate. Moreover, these designs consider only futility as the reason for early stopping and have high expected sample sizes if the drug under study is effective. Motivated by this problem, we propose an adaptive design that enables us to terminate the single-arm trial at the first stage for efficacy and conclude which alternate response rate to choose. Comparing the proposed design with a popular adaptive design from the literature reveals that the expected sample size decreases notably if either of the two target response rates is correct. In contrast, the expected sample size remains almost the same under the null hypothesis.
Keywords: Phase II trial, two-stage design, optimal design, single-arm trial, sample size
1. Introduction
After obtaining the dose with an acceptable level of toxicity in phase I, we move to phase II for screening out the drugs that have little or no effect on the disease while minimising the number of patients exposed. Phase II trials can be further divided into single-arm or double-arm. The single-arm trials are often known as IIa trials, where the drug's efficacy is compared with the fixed standard response rate. Similarly, double-arm trials are known as phase IIb trials, where the experimental drug is compared with the other standard or experimental drugs so that the most promising one can be carried to the next phase for large scale evaluation [1]. Compared to phase IIa trials, phase IIb trials require a larger sample size. Since the paper is devoted to single-arm trials, we restrict ourselves mostly to phase IIa designs. Moreover, we exclusively use phase II to mean a phase IIa trial. Fleming [6] proposed a design for phase II that calculates critical values for testing the null hypothesis using the O'Brien and Fleming multiple testing procedure [19]. This design allowed early stopping under controlled type I and II error rates, and there was no attempt to be ‘optimal’ in terms of minimising the expected sample size. Multi-stage designs are more popular than the single-stage designs since they can stop the study early if the drug is ineffective. The very first two-stage design was proposed by Gehan [7]. This design was highly criticized as it has a high probability of going to the second stage even for an inferior performing drug, which contradicts the main idea of using multi-stage designs.
Simon [26] proposed two-stage designs, optimal and minimax, which minimise the expected sample size and maximum sample size, respectively, under the null hypothesis. The idea behind the two-step implementation is that it is not ethical to proceed further if the drug is not active and to terminate the study at the first stage for futility. The problem arises when we have an efficacious drug since we cannot stop early using Simon's designs as they do not consider efficacy as a stopping rule at the first stage. The average sample size approaches close to the maximum sample size since the probability of early termination due to futility becomes close to zero. One possible solution to this problem might be constructing designs that minimise the expected sample size under the alternate response rate. Nevertheless, for such designs, the expected sample size under the null response rate would be larger than Simon's optimal design. There have been several extensions to the Simon two-stage designs, including the optimal three-stage design [2], optimal three-stage design stopping for efficacy [3], and admissible designs that balance the optimisation criteria of expected sample size and maximum sample size [10]. The list also includes a predictive probability design [15], balanced two-stage designs [27], adaptive two-stage optimal design [23], etc. All these papers only consider the optimal design under the null response rate.
Mander and Thompson [17] showed that in situations where an agent is active, Simon's two-stage design is not optimal. The authors proposed designs that also consider efficacy as a reason for early termination. They showed that if a trial can stop early for both futility and efficacy, then the expected sample size reduces in almost every case, because the additional stopping rule generally increases the probability of early termination. In designing clinical trials, especially in the early investigation of new treatments, researchers often face uncertainty about the variability of the response variable and/or the magnitude of the treatment effect. A natural way to resolve this problem is to choose the alternate response rate with some flexibility. Lin and Shih [16] introduced an adaptive two-stage phase II design that addresses the specification of the alternative response rate and the associated power. They considered two alternate response rates and their associated pre-specified powers. The design takes a primary sample in the first stage and, based on the number of responses in the first stage, decides which alternate response rate is to be tested. Like Simon's design, this design also considers futility as the only reason for early termination.
Sambucini [20] used a Bayesian predictive strategy to derive an adaptive two-stage design, where the second stage sample size is not selected in advance but depends on the first stage responses. Englert and Kieser [4] considered the loss of power while transforming continuous test statistic into discrete test statistic and proposed a method based on the conditional error function principle that directly accounts for the discreteness of the outcome. Englert and Kieser [5] proposed a design that allows an arbitrary modification of the sample size of the second stage using the results of the interim analysis or external information while controlling the type I error rate. Shan et al. [22] proposed an adaptive design that used a branch-and-bound algorithm to find the optimal design with the smallest expected sample size under the null hypothesis. Kim and Wong [12] developed a design that considered three alternative response rates with their associated powers. This paper is an extension of Lin and Shih [16] to include three alternative response rates. Sambucini [21] took efficacy and safety as bivariate binary outcomes and proposed design using Bayesian predictive strategy for interim monitoring. Jin and Yin [8] proposed the Bayesian enhancement two-stage design to strengthen the passing criterion to the second stage. Mander et al. [18] combined the maximum sample size and the two expected sample sizes under null and alternative hypotheses to produce an expected loss function to find admissible designs.
Jung [9] considered phase II trials randomising patients between a prospective control and an experimental therapy. This design is analogous to Simon's design for a single-arm trial. Lai et al. [13] expanded a randomised phase II study of response rate seamlessly into a randomised phase III study of time to failure. This approach is based on advances in group sequential designs and joint modeling of the response rate and time to the event. Shi and Yin [24] proposed a Bayesian two-stage design with changing hypothesis tests to bridge the single- and double-arm schemes in one phase II clinical trial. Shi and Yin [25] proposed a two-stage design, in which the first stage takes a single-arm comparison of the experimental drug with the standard response rate, and the second stage imposes a two-arm comparison by adding an active control arm.
This paper is organised as follows. Section 2 presents the methodology of the work. A new design is proposed in Section 2.4 and its computational algorithm is presented in Section 2.5. The numerical results of the proposed design are available in Section 3. Finally, we conclude with a discussion in Section 6.
2. Methodology
Usually, the primary endpoint for a phase II clinical trial is categorised as a response or no response. For cancer trials, the clinical response is either a complete response (the patient is cured completely) or a partial response. Partial response is often defined as a 50% or more tumor volume shrinkage based on a two-dimensional measurement. Assume that $p$ is the true response rate for an experimental drug. Then the hypothesis to be tested is
$$H_0: p \le p_0 \quad \text{versus} \quad H_1: p \ge p_1,$$
where $p_0$ is the maximum uninteresting response rate and $p_1$ is the minimum desired response rate. Generally, the value of $p_0$ is below 0.3 and the improvement of the target response rate ($p_1 - p_0$) is between 0.1 and 0.3 [14].
2.1. Simon's two-stage design
According to Simon's design [26], $n_1$ patients are recruited in the first stage. Let $x$ be the number of responses in the first stage. If $x \le r_1$, then the trial is stopped for futility, and the drug is identified as ineffective. If more than $r_1$ responses are observed in the first stage, then the study proceeds to the second stage, and $n_2$ ($= n - n_1$) more patients are recruited. If the total number of responses observed from the two stages is more than $r$, then $H_0$ is rejected. Simon's design is indexed by the four numbers $r_1$, $n_1$, $r$ and $n$, and is referred to as an '$r_1/n_1$ $r/n$' design. The probability that the design would not proceed to the second stage, that is, the probability of early termination, is
$$PET(p) = B(r_1; n_1, p).$$
The probability of not recommending a drug to proceed further is
$$R(p) = B(r_1; n_1, p) + \sum_{x = r_1 + 1}^{\min(n_1, r)} b(x; n_1, p)\, B(r - x; n_2, p), \qquad (1)$$
where $b(\cdot\,; n, p)$ and $B(\cdot\,; n, p)$ are the probability mass function and cumulative distribution function of the binomial distribution, respectively. The expected sample size is then obtained as
$$EN(p) = n_1 + \{1 - PET(p)\}\, n_2. \qquad (2)$$
Both $R(p)$ and $EN(p)$ are expressed as a function of the true response rate $p$. Let α and β be the type I and type II error probabilities, respectively. For predetermined $p_0$, $p_1$, α and β, an acceptable design should satisfy the error constraints $R(p_0) \ge 1 - \alpha$ and $R(p_1) \le \beta$. Let Ω be the set of all such designs. Simon's optimal design is the one in Ω that has the minimum expected sample size $EN(p_0)$ under the null response rate. On the other hand, the minimax design is the one that has the smallest $EN(p_0)$ among those designs in Ω with the smallest $n$.
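The quantities in this subsection can be checked numerically. The following is a minimal Python sketch (the paper's own computations were done in R), assuming the standard reading of Simon's rule: stop for futility when the first-stage count is at most r1, and reject when the total count exceeds r. The function names and the illustrative design 0/9 2/17 with response rates 0.05 and 0.25 are assumptions made for this example, not values taken from the paper.

```python
from math import comb

def binom_pmf(k, n, p):
    """b(k; n, p): probability of exactly k responses among n patients."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """B(k; n, p): probability of at most k responses among n patients."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))

def simon_oc(r1, n1, r, n, p):
    """Return (PET, probability of not rejecting H0, expected sample size)."""
    n2 = n - n1
    pet = binom_cdf(r1, n1, p)  # stop for futility when x <= r1
    not_reject = pet + sum(
        binom_pmf(x, n1, p) * binom_cdf(r - x, n2, p)
        for x in range(r1 + 1, min(n1, r) + 1)
    )
    en = n1 + (1 - pet) * n2
    return pet, not_reject, en

# Illustrative design 0/9 2/17 evaluated at p0 = 0.05 and p1 = 0.25
pet0, not_reject0, en0 = simon_oc(0, 9, 2, 17, 0.05)
pet1, not_reject1, en1 = simon_oc(0, 9, 2, 17, 0.25)
type_one_error = 1 - not_reject0
type_two_error = not_reject1
```

Here `type_one_error` and `type_two_error` are the quantities that the error constraints bound by α and β, and `en0` is the criterion minimised by the optimal design.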
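2.2. Mander and Thompson's design
Like Simon's design, Mander and Thompson's design [17] also recruits $n_1$ patients in the first stage and has a similar stopping rule in the second stage. The only exception is that the design stops for efficacy if $x > r_2$, where $x$ is the number of responses in the first stage. The design rejects $H_0$ in that case and proceeds to the second stage only if $r_1 < x \le r_2$. It recruits $n_2$ patients in the second stage, where $n_2 = n - n_1$. This design is indexed by five numbers and will be referred to as an '$(r_1, r_2)/n_1$ $r/n$' design. The probability of early termination for this design is
$$PET(p) = B(r_1; n_1, p) + 1 - B(r_2; n_1, p),$$
with $b$ and $B$ the binomial probability mass and cumulative distribution functions. The probability of not rejecting the null hypothesis is given as
$$R(p) = B(r_1; n_1, p) + \sum_{x = r_1 + 1}^{\min(r_2, r)} b(x; n_1, p)\, B(r - x; n_2, p). \qquad (3)$$
Then the expected number of patients is
$$EN(p) = n_1 + \{1 - PET(p)\}\, n_2. \qquad (4)$$
As before, we require the error probabilities to be constrained by $R(p_0) \ge 1 - \alpha$ and $R(p_1) \le \beta$. Among all designs that satisfy these constraints, Mander and Thompson used four optimality criteria and named the resulting designs as:
$H_0$-optimal: $EN(p_0)$ is the smallest.
$H_0$-minimax: $EN(p_0)$ is the smallest among those designs with the smallest $n$.
$H_1$-optimal: $EN(p_1)$ is the smallest.
$H_1$-minimax: $EN(p_1)$ is the smallest among those designs with the smallest $n$.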
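To illustrate the effect of an efficacy boundary, here is a small Python sketch under the assumption that the trial stops for efficacy when the first-stage count exceeds an upper bound r2 and continues only when the count lies strictly above r1 and at most r2. The function name `mt_oc` and the design numbers are hypothetical and chosen only for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k responses among n patients."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Probability of at most k responses among n patients."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))

def mt_oc(r1, r2, n1, r, n, p):
    """Two-stage design with futility bound r1 and efficacy bound r2.

    Returns (PET, probability of not rejecting H0, expected sample size).
    """
    n2 = n - n1
    pet = binom_cdf(r1, n1, p) + 1 - binom_cdf(r2, n1, p)
    not_reject = binom_cdf(r1, n1, p) + sum(
        binom_pmf(x, n1, p) * binom_cdf(r - x, n2, p)
        for x in range(r1 + 1, min(r2, r) + 1)
    )
    en = n1 + (1 - pet) * n2
    return pet, not_reject, en

# Hypothetical design: futility bound 0, efficacy bound 2, n1 = 9, r = 2, n = 17
pet_eff, nr_eff, en_eff = mt_oc(0, 2, 9, 2, 17, 0.25)
# Same design with the efficacy bound set to n1, so efficacy stopping never occurs
pet_base, nr_base, en_base = mt_oc(0, 9, 9, 2, 17, 0.25)
```

In this example the efficacy bound equals the final cut-off, so stopping early for efficacy only skips second-stage patients whose outcome could no longer change the decision: the probability of not rejecting is unchanged while the expected sample size falls.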
2.3. Lin and Shih's design
Lin and Shih [16] emphasised the uncertainty that investigators often face while choosing the target response rate. They proposed an adaptive two-stage design with two choices of the target response rates $p_1$ and $p_2$ ($p_1 < p_2$). This design has a fixed sample size in the first stage, like Simon's design, but the sample size in the second stage depends on the number of responses observed in the first stage. Let $x$ be the number of responses observed in the first stage. Then Lin and Shih's design proceeds as follows:
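If $x \le s_1$, stop the trial for futility.
If $s_1 < x \le r_1$, power the study at $1 - \beta_1$ for $p_1$ and enter $m - n_1$ additional patients into the study. Reject the null hypothesis ($H_0$) if the total number of responses exceeds $s$ out of $m$ patients.
If $x > r_1$, power the study at $1 - \beta_2$ for $p_2$ and enter $n - n_1$ additional patients into the study. Reject the null hypothesis ($H_0$) if the total number of responses exceeds $r$ out of $n$ patients.
This design is indexed by the seven numbers $s_1$, $r_1$, $n_1$, $s$, $m$, $r$ and $n$, and we refer to it as ($s_1$, $r_1/n_1$) ($s/m$) ($r/n$). The probability of terminating the study at the first stage is
$$PET(p) = B(s_1; n_1, p).$$
The probability of not recommending the drug is
$$R(p) = B(s_1; n_1, p) + \sum_{x = s_1 + 1}^{r_1} b(x; n_1, p)\, B(s - x; m - n_1, p) + \sum_{x = r_1 + 1}^{n_1} b(x; n_1, p)\, B(r - x; n - n_1, p), \qquad (5)$$
where $B(k; \cdot, \cdot) = 0$ for $k < 0$. Finally, the expected sample size is given as
$$EN(p) = n_1 + (m - n_1)\{B(r_1; n_1, p) - B(s_1; n_1, p)\} + (n - n_1)\{1 - B(r_1; n_1, p)\}. \qquad (6)$$
For predetermined $p_0$, $p_1$, $p_2$, α, $\beta_1$ and $\beta_2$, an acceptable design must satisfy the following error constraints
$$R(p_0) \ge 1 - \alpha, \qquad R(p_1) \le \beta_1, \qquad R(p_2) \le \beta_2. \qquad (7)$$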
Although not required, it is reasonable to let $\beta_2 \le \beta_1$ in practice because we would like to have higher power for detecting more improvement of the new therapy ($p_2$ versus $p_0$). From the feasibility aspect, we need to compromise the power somewhat for less improvement ($p_1$ versus $p_0$) because the sample size cannot be too large for a phase II study. Among the designs that satisfy the error constraints, Lin and Shih [16] proposed designs based on the following optimality conditions.
Optimality type 1 (O1): $EN(p_0)$, the expected sample size under $p_0$, is the smallest.
Optimality type 2 (O2): is the smallest.
Optimality type 3 (O3): $\max(m, n)$ is the smallest among all feasible solutions and $EN(p_0)$ is the smallest among such solutions.
Optimality type 4 (O4): $\max(m, n)$ is the smallest among all feasible solutions and the O2 criterion is the smallest among such solutions.
O1 and O3 are extensions to Simon's ‘optimal design’ and ‘minimax design’ criteria. That is, if $p_1 = p_2$ and $\beta_1 = \beta_2$, then $s_1 = r_1$, $s = r$, $m = n$, and O1 and O3 reduce to Simon's optimal and minimax designs, respectively.
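The operating characteristics of a Lin and Shih design can be sketched in Python as follows. The reading of the first-stage bands (at most s1: stop; above s1 and at most r1: continue to a total of m patients; above r1: continue to a total of n patients) and the function name `lin_shih_oc` are assumptions of this sketch. The design evaluated is the LS row of optimality type 1 for a null rate of 0.05 in Table 1.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k responses among n patients."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """Probability of at most k responses among n patients."""
    if k < 0:
        return 0.0
    return sum(binom_pmf(i, n, p) for i in range(min(k, n) + 1))

def lin_shih_oc(s1, r1, n1, s, m, r, n, p):
    """Return (PET, probability of not rejecting H0, expected sample size)."""
    pet = binom_cdf(s1, n1, p)
    # Lower band: continue to m patients in total, reject if total exceeds s
    low = sum(
        binom_pmf(x, n1, p) * binom_cdf(s - x, m - n1, p)
        for x in range(s1 + 1, r1 + 1)
    )
    # Upper band: continue to n patients in total, reject if total exceeds r
    high = sum(
        binom_pmf(x, n1, p) * binom_cdf(r - x, n - n1, p)
        for x in range(r1 + 1, n1 + 1)
    )
    not_reject = pet + low + high
    p_low = binom_cdf(r1, n1, p) - binom_cdf(s1, n1, p)
    p_high = 1 - binom_cdf(r1, n1, p)
    en = n1 + (m - n1) * p_low + (n - n1) * p_high
    return pet, not_reject, en

# LS design of optimality type 1 for p0 = 0.05 as listed in Table 1
pet0, not_reject0, en0 = lin_shih_oc(0, 2, 9, 3, 31, 5, 43, 0.05)
alpha0 = 1 - not_reject0
```

Under these assumptions the sketch agrees with the corresponding row of Table 1 to the reported precision (probability of early termination 0.630, expected sample size 17.23, and a type I error just below 0.05).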
2.4. Proposed design
Like Simon's design, the design by Lin and Shih [16] considers only futility as a reason for early stopping in the first stage, so it has a larger expected sample size if the proposed drug is effective. We propose an adaptive design that considers both futility and efficacy as reasons for early stopping. Although it is easier to set the uninteresting rate $p_0$, the same is not true for $p_1$, the alternate response rate. The challenge in selecting a single alternate response rate can be reduced by specifying two alternate response rates instead. Therefore, as in Lin and Shih's design, the proposed design also considers two alternate response rates. Based on the number of responses in the first stage, it determines which target response rate is to be tested and enrols patients in the second stage accordingly. Let $x$ be the number of responses in the first stage out of the $n_1$ patients enrolled. The proposed design then proceeds as follows:
If , stop the trial for futility.
If , power the study at for and enter additional patients into the study. Reject the null hypothesis ( ) if the total number of responses out of m patients.
If , power the study at for and enter additional patients into the study. Reject the null hypothesis ( ) if the total number of responses out of n patients.
If , stop for efficacy in the first stage and reject the null hypothesis for : .
If , stop for efficacy and reject the null hypothesis for : .
This design is indexed by the nine parameters , , , , , s, m, r and n, where . It is referred as ( ) (s/m) (r/n). The probability of early termination in the proposed design is
Then the probability of not rejecting is
(8) |
Finally, the expected sample size is
(9) |
where is the probability that the design would stop at the first stage, is the probability that the design would proceed to the second stage and test for the target response rate and is the probability that the design would proceed to the second stage and test for the target response rate . Note that . For a predetermined level of ϕ= , we find the optimal designs under the four optimality criteria proposed by Lin and Shih [16], and these designs satisfy the error constrains in (7). From Equation (8), we can see that the probability of not rejecting does not depend on . So the value of will be determined based on the following hypothesis
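As a numerical illustration of the early-termination probability and the expected sample size of the proposed design, consider the Python sketch below. The boundary names `fut`, `split` and `eff` are hypothetical labels for the first three first-stage boundaries (the paper's own symbols are not reproduced here), and the band structure is an assumption inferred from Table 1: at most `fut` responses stops for futility, more than `eff` responses stops for efficacy, and the two bands in between continue to totals of m and n patients, respectively. The fourth boundary only decides which alternative is concluded after an efficacy stop, so it does not enter these two quantities.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k responses among n patients."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1 - p)**(n - k)

def proposed_pet_en(n1, fut, split, eff, m, n, p):
    """Return (probability of early termination, expected sample size).

    fut, split, eff are hypothetical names for the first-stage boundaries.
    """
    probs = [binom_pmf(x, n1, p) for x in range(n1 + 1)]
    p_to_m = sum(probs[fut + 1: split + 1])   # continue, total of m patients
    p_to_n = sum(probs[split + 1: eff + 1])   # continue, total of n patients
    pet = 1.0 - p_to_m - p_to_n               # futility or efficacy stop
    en = n1 + (m - n1) * p_to_m + (n - n1) * p_to_n
    return pet, en

# Proposed design of optimality type 1 for p0 = 0.05 as listed in Table 1
pet0, en0 = proposed_pet_en(10, 0, 1, 2, 28, 38, 0.05)
pet1, en1 = proposed_pet_en(10, 0, 1, 2, 28, 38, 0.20)
```

With these assumptions the sketch reproduces the tabulated values for that row to the reported precision (early-termination probabilities 0.610 and 0.430, expected sample sizes 17.76 and 23.29), which supports this reading of the stopping bands.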
2.5. Algorithm
Now we discuss the algorithm to find the optimal solutions. For specified values of $p_0$, $p_1$, $p_2$, α, $\beta_1$ and $\beta_2$, feasible values of $m$ and $n$ are calculated first. The range of $m$ is taken as 0.85 to 1.5 times the sample size of the one-stage design for testing $p_0$ versus $p_1$ at the significance level α and power $1 - \beta_1$. This strategy is similar to the one used by Simon [26] in his algorithm. To ensure that the end points of the range are integers, the lower end point is rounded down to the largest integer less than or equal to it, and the upper end point is rounded up to the smallest integer greater than or equal to it. Similarly, the feasible range of $n$ is calculated from the sample size of the one-stage design for testing $p_0$ versus $p_2$ at the significance level α and power $1 - \beta_2$.
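The search range for the total sample sizes can be sketched in Python as below. The text does not say which one-stage sample size formula is used, so this sketch assumes an exact binomial one-stage design (the smallest sample size for which some cut-off meets both error constraints); the function names are hypothetical. Integer arithmetic is used for the rounding to avoid floating-point surprises.

```python
from math import comb

def binom_sf(k, n, p):
    """P(X > k) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, i) * p**i * (1 - p)**(n - i)
        for i in range(max(k + 1, 0), n + 1)
    )

def one_stage_design(p0, p1, alpha, beta, n_max=500):
    """Smallest n and cut-off r such that rejecting H0 when X > r gives
    type I error <= alpha at p0 and type II error <= beta at p1."""
    for n in range(1, n_max + 1):
        for r in range(n):
            if binom_sf(r, n, p0) <= alpha:
                # The smallest r that controls type I error has the highest
                # power, so it is the only cut-off worth checking for this n.
                if 1 - binom_sf(r, n, p1) <= beta:
                    return n, r
                break
    raise ValueError("no one-stage design found up to n_max")

def candidate_range(n_single):
    """Integer range from floor(0.85 * n_single) to ceil(1.5 * n_single)."""
    low = (85 * n_single) // 100
    high = (3 * n_single + 1) // 2
    return low, high

n_single, r_single = one_stage_design(0.05, 0.20, 0.05, 0.20)
m_low, m_high = candidate_range(n_single)
```

The same two calls with the larger target rate and its power give the candidate range for the other total sample size.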
For each combination of m, n and in the range , at first is calculated as the th quantile of . Following this, is taken in the range , in the range , in the range , s in the range , and r in the range . Then we find the designs with parameters , , , , , s, m, r, and n those satisfy the error constraints associated with and . Then we check which remaining designs fulfill the error constraint associated with α and call them ‘set of feasible designs’. For every feasible design, the expected sample sizes , , and are calculated.
To find the designs of optimality type 1, we arrange the designs in order of $EN(p_0)$. For optimality type 2, the O2 criterion is calculated for each design and the designs are ordered accordingly. For optimality types 3 and 4, a column containing $\max(m, n)$ is first calculated. For optimality type 3, the designs are ordered first by $\max(m, n)$ and then by $EN(p_0)$. Finally, for optimality type 4, the ordering is done first by $\max(m, n)$ and then by the O2 criterion. Under every optimality criterion, the first ten ordered designs are kept and the first design is called the optimal design under that criterion. All the designs are obtained numerically using self-written R code with parallel computation.
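The ordering step can be sketched in a few lines of Python. This sketch assumes that type 1 orders by the expected sample size under the null rate and that type 3 orders first by the larger of the two total sample sizes and then by that expected sample size; types 2 and 4 are omitted because they need the O2 criterion. The three records reuse the second-stage sizes and expected sample sizes of the first three Proposed rows of Table 1 for a null rate of 0.05; the labels and field names are hypothetical.

```python
# Feasible designs as plain records; only the fields needed for ordering
designs = [
    {"label": "A", "m": 28, "n": 38, "en_p0": 17.76},
    {"label": "B", "m": 28, "n": 26, "en_p0": 18.84},
    {"label": "C", "m": 27, "n": 27, "en_p0": 18.60},
]

# Type 1: smallest expected sample size under the null response rate
ordered_o1 = sorted(designs, key=lambda d: d["en_p0"])

# Type 3: smallest max(m, n), ties broken by the expected sample size
ordered_o3 = sorted(designs, key=lambda d: (max(d["m"], d["n"]), d["en_p0"]))

# Keep the first ten designs and take the first as the optimal one
top_o1, top_o3 = ordered_o1[:10], ordered_o3[:10]
best_o1, best_o3 = top_o1[0], top_o3[0]
```

Sorting on a tuple key gives the two-level ordering directly, which mirrors the "first by one column, then by another" description.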
3. Numerical results
3.1. Proposed design
We have considered $p_0$ from 0.05 to 0.50 at regular intervals of 0.05, and the differences $p_1 - p_0$ and $p_2 - p_0$ are kept fixed at 0.15 and 0.20, respectively. The level of significance α is taken as 0.05 and 0.10, and the maximum type II errors $\beta_1$ and $\beta_2$ are kept fixed at 0.20 and 0.10 for the target response rates $p_1$ and $p_2$, respectively. The designs for $\alpha = 0.05$ and $\alpha = 0.10$ are shown in Tables 1 and 2, respectively.
Table 1.
First-stage columns: $n_1$ to efficacy $p_1$/$p_2$ bound. Second-stage columns: $m$, $s$, $n$, $r$.
| $p_0$ | $p_1$ | $p_2$ | Design | Optimal type | $n_1$ | $s_1$ | $r_1$ | Efficacy bound | Efficacy $p_1$/$p_2$ bound | $m$ | $s$ | $n$ | $r$ | True α | True $\beta_1$ | True $\beta_2$ | $PET(p_0)$ | $PET(p_1)$ | $PET(p_2)$ | $EN(p_0)$ | $EN(p_1)$ | $EN(p_2)$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.20 | 0.25 | Proposed | 1 | 10 | 0 | 1 | 2 | 3 | 28 | 3 | 38 | 4 | 0.042 | 0.199 | 0.086 | 0.610 | 0.430 | 0.531 | 17.76 | 23.29 | 21.26 |
| | | | | 2 | 12 | 0 | 1 | 2 | 3 | 28 | 3 | 26 | 3 | 0.049 | 0.197 | 0.080 | 0.560 | 0.510 | 0.640 | 18.84 | 19.27 | 17.28 |
| | | | | 3 | 12 | 0 | 1 | 2 | 3 | 27 | 3 | 27 | 3 | 0.049 | 0.198 | 0.080 | 0.560 | 0.510 | 0.640 | 18.60 | 19.34 | 17.38 |
| | | | | 4 | 12 | 0 | 1 | 2 | 3 | 27 | 3 | 27 | 3 | 0.049 | 0.198 | 0.080 | 0.560 | 0.510 | 0.640 | 18.60 | 19.34 | 17.38 |
| | | | LS | 1 | 9 | 0 | 2 | | | 31 | 3 | 43 | 5 | 0.049 | 0.200 | 0.094 | 0.630 | 0.134 | 0.075 | 17.23 | 31.19 | 34.14 |
| | | | | 2 | 18 | 0 | 2 | | | 29 | 3 | 23 | 3 | 0.044 | 0.199 | 0.080 | 0.397 | 0.018 | 0.006 | 24.28 | 24.43 | 23.75 |
| | | | | 3 | 21 | 0 | 1 | | | 26 | 2 | 26 | 3 | 0.047 | 0.197 | 0.076 | 0.341 | 0.009 | 0.002 | 24.30 | 25.95 | 25.99 |
| | | | | 4 | 21 | 0 | 1 | | | 26 | 2 | 26 | 3 | 0.047 | 0.197 | 0.076 | 0.341 | 0.009 | 0.002 | 24.30 | 25.95 | 25.99 |
| 0.10 | 0.25 | 0.30 | Proposed | 1 | 18 | 2 | 3 | 5 | 6 | 37 | 6 | 50 | 8 | 0.049 | 0.200 | 0.078 | 0.740 | 0.420 | 0.530 | 24.13 | 34.41 | 31.82 |
| | | | | 2 | 17 | 1 | 3 | 4 | 6 | 40 | 7 | 38 | 7 | 0.049 | 0.199 | 0.068 | 0.504 | 0.480 | 0.690 | 28.29 | 28.61 | 25.12 |
| | | | | 3 | 23 | 2 | 4 | 6 | 7 | 38 | 7 | 37 | 6 | 0.050 | 0.199 | 0.067 | 0.598 | 0.395 | 0.576 | 28.97 | 31.70 | 29.06 |
| | | | | 4 | 24 | 2 | 4 | 6 | 8 | 38 | 7 | 36 | 6 | 0.049 | 0.198 | 0.066 | 0.572 | 0.430 | 0.620 | 29.84 | 31.23 | 28.72 |
| | | | LS | 1 | 18 | 2 | 3 | | | 38 | 6 | 49 | 8 | 0.048 | 0.199 | 0.077 | 0.734 | 0.135 | 0.060 | 24.40 | 42.93 | 45.99 |
| | | | | 2 | 21 | 2 | 4 | | | 44 | 8 | 29 | 5 | 0.048 | 0.199 | 0.068 | 0.648 | 0.075 | 0.027 | 28.30 | 32.80 | 31.35 |
| | | | | 3 | 18 | 1 | 3 | | | 37 | 6 | 38 | 7 | 0.050 | 0.199 | 0.068 | 0.450 | 0.039 | 0.014 | 28.54 | 36.94 | 37.57 |
| | | | | 4 | 26 | 2 | 4 | | | 38 | 7 | 35 | 6 | 0.050 | 0.198 | 0.067 | 0.511 | 0.026 | 0.007 | 31.54 | 35.24 | 35.14 |
| 0.15 | 0.30 | 0.35 | Proposed | 1 | 19 | 3 | 4 | 6 | 7 | 54 | 12 | 59 | 13 | 0.050 | 0.200 | 0.074 | 0.700 | 0.468 | 0.578 | 30.12 | 39.55 | 35.43 |
| | | | | 2 | 25 | 4 | 6 | 7 | 9 | 57 | 13 | 41 | 10 | 0.050 | 0.199 | 0.065 | 0.708 | 0.580 | 0.700 | 33.65 | 35.74 | 31.65 |
| | | | | 3 | 38 | 6 | 9 | 10 | 14 | 46 | 11 | 46 | 10 | 0.050 | 0.200 | 0.061 | 0.679 | 0.650 | 0.840 | 40.56 | 40.78 | 39.31 |
| | | | | 4 | 38 | 6 | 9 | 10 | 14 | 46 | 11 | 46 | 10 | 0.050 | 0.200 | 0.061 | 0.679 | 0.650 | 0.840 | 40.56 | 40.78 | 39.31 |
| | | | LS | 1 | 19 | 3 | 6 | | | 55 | 12 | 46 | 10 | 0.050 | 0.199 | 0.074 | 0.684 | 0.133 | 0.059 | 30.22 | 47.20 | 48.20 |
| | | | | 2 | 27 | 4 | 7 | | | 51 | 12 | 34 | 8 | 0.049 | 0.198 | 0.061 | 0.619 | 0.059 | 0.018 | 35.48 | 39.57 | 37.27 |
| | | | | 3 | 38 | 6 | 9 | | | 46 | 11 | 46 | 10 | 0.050 | 0.200 | 0.061 | 0.659 | 0.036 | 0.008 | 40.73 | 45.71 | 45.94 |
| | | | | 4 | 38 | 6 | 9 | | | 46 | 11 | 46 | 10 | 0.050 | 0.200 | 0.061 | 0.659 | 0.036 | 0.008 | 40.73 | 45.71 | 45.94 |
| 0.20 | 0.35 | 0.40 | Proposed | 1 | 23 | 5 | 6 | 9 | 10 | 56 | 15 | 69 | 19 | 0.049 | 0.199 | 0.068 | 0.704 | 0.390 | 0.500 | 34.74 | 49.46 | 45.19 |
| | | | | 2 | 26 | 5 | 8 | 9 | 11 | 63 | 18 | 47 | 14 | 0.049 | 0.199 | 0.059 | 0.601 | 0.490 | 0.660 | 40.20 | 42.20 | 36.46 |
| | | | | 3 | 28 | 5 | 6 | 11 | 12 | 52 | 15 | 53 | 15 | 0.050 | 0.200 | 0.058 | 0.505 | 0.290 | 0.460 | 40.18 | 45.79 | 41.48 |
| | | | | 4 | 38 | 8 | 9 | 13 | 16 | 53 | 16 | 53 | 15 | 0.050 | 0.199 | 0.057 | 0.667 | 0.510 | 0.720 | 42.99 | 45.30 | 42.18 |
| | | | LS | 1 | 23 | 5 | 6 | | | 56 | 15 | 66 | 18 | 0.049 | 0.199 | 0.068 | 0.695 | 0.131 | 0.054 | 34.67 | 59.14 | 62.98 |
| | | | | 2 | 35 | 8 | 11 | | | 60 | 17 | 40 | 12 | 0.049 | 0.196 | 0.058 | 0.745 | 0.089 | 0.026 | 40.69 | 45.81 | 43.25 |
| | | | | 3 | 31 | 6 | 12 | | | 53 | 15 | 40 | 13 | 0.050 | 0.200 | 0.058 | 0.571 | 0.046 | 0.013 | 40.38 | 48.56 | 46.48 |
| | | | | 4 | 31 | 6 | 12 | | | 53 | 15 | 40 | 13 | 0.050 | 0.200 | 0.058 | 0.571 | 0.046 | 0.013 | 40.38 | 48.56 | 46.48 |
| 0.25 | 0.40 | 0.45 | Proposed | 1 | 26 | 7 | 8 | 12 | 13 | 55 | 18 | 74 | 24 | 0.050 | 0.199 | 0.064 | 0.690 | 0.320 | 0.420 | 38.31 | 56.62 | 52.73 |
| | | | | 2 | 34 | 9 | 10 | 13 | 16 | 87 | 29 | 61 | 21 | 0.050 | 0.200 | 0.055 | 0.692 | 0.580 | 0.730 | 45.58 | 46.97 | 41.38 |
| | | | | 3 | 37 | 9 | 11 | 17 | 18 | 58 | 19 | 59 | 20 | 0.050 | 0.200 | 0.054 | 0.552 | 0.220 | 0.400 | 46.60 | 54.13 | 50.26 |
| | | | | 4 | 37 | 8 | 14 | 15 | 18 | 59 | 20 | 57 | 19 | 0.050 | 0.200 | 0.054 | 0.411 | 0.420 | 0.650 | 49.92 | 49.54 | 44.50 |
| | | | LS | 1 | 23 | 6 | 7 | | | 59 | 19 | 74 | 24 | 0.049 | 0.197 | 0.066 | 0.654 | 0.124 | 0.051 | 38.41 | 65.98 | 70.44 |
| | | | | 2 | 38 | 9 | 14 | | | 65 | 22 | 41 | 15 | 0.050 | 0.200 | 0.058 | 0.513 | 0.027 | 0.006 | 50.32 | 50.18 | 45.62 |
| | | | | 3 | 37 | 9 | 11 | | | 58 | 19 | 59 | 20 | 0.050 | 0.200 | 0.054 | 0.550 | 0.035 | 0.008 | 46.64 | 58.14 | 58.79 |
| | | | | 4 | 42 | 9 | 16 | | | 59 | 20 | 51 | 17 | 0.050 | 0.200 | 0.058 | 0.371 | 0.009 | 0.001 | 52.53 | 54.58 | 52.81 |
| 0.30 | 0.45 | 0.50 | Proposed | 1 | 28 | 9 | 10 | 14 | 15 | 57 | 22 | 80 | 30 | 0.050 | 0.199 | 0.061 | 0.690 | 0.350 | 0.470 | 41.21 | 59.45 | 54.50 |
| | | | | 2 | 34 | 11 | 13 | 15 | 18 | 75 | 29 | 71 | 28 | 0.050 | 0.200 | 0.053 | 0.720 | 0.560 | 0.700 | 45.15 | 50.82 | 44.52 |
| | | | | 3 | 39 | 12 | 14 | 19 | 21 | 63 | 24 | 64 | 25 | 0.050 | 0.200 | 0.051 | 0.622 | 0.310 | 0.510 | 48.22 | 56.02 | 51.16 |
| | | | | 4 | 37 | 10 | 11 | 17 | 20 | 60 | 23 | 64 | 25 | 0.050 | 0.200 | 0.051 | 0.437 | 0.410 | 0.630 | 51.63 | 52.92 | 46.90 |
| | | | LS | 1 | 28 | 9 | 10 | | | 58 | 22 | 77 | 29 | 0.049 | 0.200 | 0.061 | 0.682 | 0.119 | 0.044 | 41.16 | 69.38 | 73.94 |
| | | | | 2 | 42 | 14 | 18 | | | 70 | 27 | 45 | 19 | 0.050 | 0.200 | 0.056 | 0.743 | 0.085 | 0.022 | 48.54 | 53.94 | 49.90 |
| | | | | 3 | 39 | 12 | 16 | | | 64 | 25 | 63 | 24 | 0.050 | 0.199 | 0.051 | 0.618 | 0.050 | 0.012 | 48.50 | 62.11 | 62.87 |
| | | | | 4 | 37 | 9 | 17 | | | 64 | 25 | 45 | 18 | 0.049 | 0.200 | 0.051 | 0.289 | 0.008 | 0.001 | 55.95 | 56.42 | 52.02 |
| 0.35 | 0.50 | 0.55 | Proposed | 1 | 27 | 10 | 11 | 15 | 16 | 77 | 33 | 79 | 34 | 0.050 | 0.198 | 0.06 | 0.678 | 0.340 | 0.450 | 43.47 | 60.87 | 55.50 |
| | | | | 2 | 34 | 12 | 16 | 17 | 20 | 77 | 34 | 59 | 27 | 0.050 | 0.200 | 0.05 | 0.616 | 0.490 | 0.660 | 50.02 | 53.36 | 45.77 |
| | | | | 3 | 41 | 14 | 15 | 23 | 24 | 66 | 30 | 66 | 29 | 0.050 | 0.200 | 0.049 | 0.528 | 0.200 | 0.390 | 52.80 | 60.89 | 56.24 |
| | | | | 4 | 47 | 17 | 18 | 25 | 27 | 66 | 30 | 66 | 29 | 0.050 | 0.199 | 0.049 | 0.634 | 0.320 | 0.550 | 53.95 | 59.93 | 55.55 |
| | | | LS | 1 | 27 | 10 | 11 | | | 69 | 30 | 80 | 34 | 0.050 | 0.198 | 0.06 | 0.670 | 0.124 | 0.046 | 43.10 | 72.37 | 76.97 |
| | | | | 2 | 43 | 17 | 20 | | | 87 | 38 | 45 | 21 | 0.050 | 0.200 | 0.057 | 0.785 | 0.111 | 0.030 | 50.66 | 56.09 | 50.70 |
| | | | | 3 | 42 | 15 | 22 | | | 66 | 29 | 65 | 29 | 0.050 | 0.200 | 0.049 | 0.608 | 0.044 | 0.009 | 51.41 | 64.62 | 65.20 |
| | | | | 4 | 50 | 19 | 26 | | | 66 | 29 | 62 | 29 | 0.050 | 0.200 | 0.049 | 0.726 | 0.059 | 0.012 | 54.36 | 63.71 | 63.36 |
| 0.40 | 0.55 | 0.60 | Proposed | 1 | 28 | 12 | 14 | 17 | 18 | 83 | 40 | 82 | 39 | 0.050 | 0.199 | 0.06 | 0.703 | 0.350 | 0.450 | 44.23 | 63.39 | 57.92 |
| | | | | 2 | 39 | 17 | 20 | 21 | 24 | 83 | 41 | 67 | 33 | 0.050 | 0.199 | 0.048 | 0.763 | 0.600 | 0.730 | 49.00 | 54.68 | 47.94 |
| | | | | 3 | 37 | 15 | 19 | 22 | 23 | 70 | 34 | 70 | 35 | 0.049 | 0.199 | 0.047 | 0.602 | 0.290 | 0.480 | 50.13 | 60.28 | 54.24 |
| | | | | 4 | 43 | 17 | 23 | 24 | 27 | 69 | 34 | 70 | 34 | 0.050 | 0.197 | 0.045 | 0.554 | 0.430 | 0.670 | 54.62 | 57.96 | 51.80 |
| | | | LS | 1 | 26 | 11 | 12 | | | 79 | 38 | 82 | 39 | 0.050 | 0.200 | 0.062 | 0.674 | 0.135 | 0.052 | 43.89 | 74.13 | 78.93 |
| | | | | 2 | 40 | 17 | 21 | | | 81 | 40 | 45 | 23 | 0.050 | 0.199 | 0.049 | 0.689 | 0.077 | 0.019 | 51.36 | 57.51 | 51.75 |
| | | | | 3 | 36 | 14 | 19 | | | 69 | 34 | 66 | 32 | 0.050 | 0.200 | 0.047 | 0.518 | 0.038 | 0.008 | 51.77 | 66.11 | 66.13 |
| | | | | 4 | 39 | 15 | 22 | | | 69 | 34 | 45 | 23 | 0.050 | 0.200 | 0.046 | 0.491 | 0.028 | 0.005 | 53.95 | 59.29 | 53.98 |
| 0.45 | 0.60 | 0.65 | Proposed | 1 | 33 | 16 | 17 | 21 | 22 | 74 | 40 | 79 | 42 | 0.050 | 0.199 | 0.052 | 0.729 | 0.400 | 0.540 | 44.94 | 60.32 | 54.06 |
| | | | | 2 | 40 | 19 | 23 | 24 | 27 | 77 | 42 | 57 | 31 | 0.050 | 0.199 | 0.044 | 0.704 | 0.510 | 0.700 | 50.53 | 55.40 | 48.60 |
| | | | | 3 | 50 | 23 | 26 | 32 | 33 | 67 | 36 | 68 | 37 | 0.050 | 0.199 | 0.044 | 0.616 | 0.270 | 0.510 | 56.66 | 63.05 | 58.78 |
| | | | | 4 | 45 | 19 | 26 | 28 | 30 | 68 | 37 | 67 | 36 | 0.050 | 0.200 | 0.044 | 0.420 | 0.340 | 0.600 | 58.32 | 59.96 | 54.00 |
| | | | LS | 1 | 31 | 15 | 16 | | | 74 | 39 | 79 | 42 | 0.050 | 0.200 | 0.054 | 0.713 | 0.128 | 0.042 | 44.23 | 72.38 | 76.75 |
| | | | | 2 | 43 | 20 | 25 | | | 75 | 41 | 46 | 26 | 0.049 | 0.199 | 0.044 | 0.639 | 0.051 | 0.010 | 53.68 | 57.68 | 51.92 |
| | | | | 3 | 38 | 16 | 18 | | | 68 | 36 | 68 | 37 | 0.050 | 0.200 | 0.044 | 0.425 | 0.019 | 0.003 | 55.26 | 67.42 | 67.90 |
| | | | | 4 | 46 | 19 | 27 | | | 68 | 37 | 66 | 35 | 0.050 | 0.200 | 0.044 | 0.363 | 0.008 | 0.001 | 59.97 | 66.79 | 66.44 |
| 0.50 | 0.65 | 0.70 | Proposed | 1 | 30 | 16 | 18 | 21 | 22 | 76 | 44 | 80 | 47 | 0.050 | 0.199 | 0.051 | 0.716 | 0.350 | 0.470 | 43.45 | 61.57 | 55.94 |
| | | | | 2 | 39 | 21 | 23 | 25 | 28 | 81 | 48 | 72 | 43 | 0.050 | 0.199 | 0.043 | 0.765 | 0.590 | 0.740 | 48.19 | 54.13 | 47.42 |
| | | | | 3 | 41 | 21 | 24 | 29 | 30 | 67 | 40 | 68 | 40 | 0.050 | 0.199 | 0.042 | 0.625 | 0.220 | 0.410 | 50.87 | 61.76 | 56.86 |
| | | | | 4 | 45 | 23 | 26 | 30 | 32 | 68 | 40 | 67 | 40 | 0.050 | 0.198 | 0.041 | 0.625 | 0.390 | 0.640 | 53.52 | 58.54 | 52.95 |
| | | | LS | 1 | 30 | 16 | 17 | | | 69 | 40 | 79 | 46 | 0.049 | 0.200 | 0.052 | 0.708 | 0.126 | 0.040 | 43.21 | 71.88 | 76.59 |
| | | | | 2 | 43 | 23 | 27 | | | 77 | 46 | 46 | 28 | 0.049 | 0.197 | 0.041 | 0.729 | 0.079 | 0.016 | 51.20 | 56.84 | 51.39 |
| | | | | 3 | 53 | 26 | 32 | | | 66 | 39 | 67 | 40 | 0.050 | 0.200 | 0.042 | 0.500 | 0.012 | 0.001 | 59.55 | 66.56 | 66.90 |
| | | | | 4 | 38 | 16 | 22 | | | 67 | 40 | 66 | 39 | 0.050 | 0.200 | 0.042 | 0.209 | 0.003 | 0.000 | 60.82 | 66.13 | 66.07 |
Table 2.
First-stage columns: $n_1$ to efficacy $p_1$/$p_2$ bound. Second-stage columns: $m$, $s$, $n$, $r$.
| $p_0$ | $p_1$ | $p_2$ | Design | Optimal type | $n_1$ | $s_1$ | $r_1$ | Efficacy bound | Efficacy $p_1$/$p_2$ bound | $m$ | $s$ | $n$ | $r$ | True α | True $\beta_1$ | True $\beta_2$ | $PET(p_0)$ | $PET(p_1)$ | $PET(p_2)$ | $EN(p_0)$ | $EN(p_1)$ | $EN(p_2)$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.05 | 0.20 | 0.25 | Proposed | 1 | 10 | 0 | 1 | 2 | 3 | 22 | 2 | 23 | 2 | 0.092 | 0.198 | 0.092 | 0.610 | 0.430 | 0.530 | 14.75 | 17.15 | 15.91 |
| | | | | 2 | 12 | 0 | 1 | 2 | 3 | 22 | 2 | 20 | 2 | 0.082 | 0.194 | 0.086 | 0.560 | 0.51 | 0.641 | 16.20 | 16.33 | 15.13 |
| | | | | 3 | 12 | 0 | 1 | 2 | 3 | 21 | 2 | 21 | 2 | 0.080 | 0.197 | 0.087 | 0.560 | 0.51 | 0.641 | 15.96 | 16.41 | 15.23 |
| | | | | 4 | 12 | 0 | 1 | 2 | 3 | 21 | 2 | 21 | 2 | 0.080 | 0.197 | 0.087 | 0.560 | 0.51 | 0.641 | 15.96 | 16.41 | 15.23 |
| | | | LS | 1 | 9 | 0 | 1 | | | 28 | 2 | 27 | 2 | 0.092 | 0.199 | 0.100 | 0.630 | 0.134 | 0.075 | 14.46 | 23.38 | 24.75 |
| | | | | 2 | 12 | 0 | 1 | | | 24 | 2 | 18 | 2 | 0.086 | 0.200 | 0.093 | 0.540 | 0.069 | 0.032 | 16.81 | 18.82 | 18.57 |
| | | | | 3 | 17 | 0 | 1 | | | 18 | 1 | 20 | 2 | 0.091 | 0.200 | 0.087 | 0.418 | 0.023 | 0.008 | 18.00 | 19.74 | 19.89 |
| | | | | 4 | 17 | 0 | 1 | | | 18 | 1 | 20 | 2 | 0.091 | 0.200 | 0.087 | 0.418 | 0.023 | 0.008 | 18.00 | 19.74 | 19.89 |
| 0.10 | 0.25 | 0.30 | Proposed | 1 | 13 | 1 | 2 | 3 | 4 | 32 | 5 | 34 | 5 | 0.097 | 0.200 | 0.088 | 0.656 | 0.542 | 0.643 | 19.74 | 22.2 | 20.22 |
| | | | | 2 | 15 | 1 | 2 | 3 | 5 | 29 | 5 | 32 | 5 | 0.098 | 0.198 | 0.081 | 0.605 | 0.619 | 0.738 | 20.92 | 21.01 | 19.17 |
| | | | | 3 | 20 | 1 | 3 | 4 | 7 | 28 | 5 | 27 | 4 | 0.099 | 0.200 | 0.080 | 0.435 | 0.609 | 0.77 | 24.43 | 22.93 | 21.71 |
| | | | | 4 | 20 | 1 | 3 | 4 | 7 | 28 | 5 | 27 | 4 | 0.099 | 0.200 | 0.080 | 0.435 | 0.609 | 0.77 | 24.43 | 22.93 | 21.71 |
| | | | LS | 1 | 14 | 1 | 2 | | | 26 | 4 | 34 | 5 | 0.095 | 0.199 | 0.085 | 0.585 | 0.101 | 0.047 | 20.25 | 30.54 | 32.14 |
| | | | | 2 | 17 | 1 | 3 | | | 32 | 5 | 22 | 4 | 0.091 | 0.200 | 0.084 | 0.482 | 0.05 | 0.019 | 23.95 | 24.78 | 23.73 |
| | | | | 3 | 20 | 1 | 3 | | | 28 | 5 | 27 | 4 | 0.099 | 0.200 | 0.080 | 0.392 | 0.024 | 0.008 | 24.73 | 27.03 | 27.05 |
| | | | | 4 | 20 | 1 | 3 | | | 28 | 5 | 27 | 4 | 0.099 | 0.200 | 0.080 | 0.392 | 0.024 | 0.008 | 24.73 | 27.03 | 27.05 |
| 0.15 | 0.30 | 0.35 | Proposed | 1 | 15 | 2 | 4 | 5 | 6 | 39 | 8 | 43 | 9 | 0.099 | 0.200 | 0.084 | 0.621 | 0.405 | 0.497 | 24.27 | 30.1 | 27.91 |
| | | | | 2 | 20 | 3 | 4 | 5 | 8 | 41 | 9 | 40 | 9 | 0.100 | 0.197 | 0.074 | 0.715 | 0.691 | 0.799 | 25.88 | 26.32 | 24.09 |
| | | | | 3 | 28 | 3 | 6 | 7 | 11 | 34 | 7 | 34 | 8 | 0.100 | 0.200 | 0.075 | 0.426 | 0.651 | 0.822 | 31.45 | 30.09 | 29.07 |
| | | | | 4 | 28 | 3 | 6 | 7 | 11 | 34 | 7 | 34 | 8 | 0.100 | 0.200 | 0.075 | 0.426 | 0.651 | 0.822 | 31.45 | 30.09 | 29.07 |
| | | | LS | 1 | 15 | 2 | 3 | | | 34 | 7 | 45 | 9 | 0.098 | 0.199 | 0.085 | 0.604 | 0.127 | 0.062 | 24.47 | 39.32 | 41.93 |
| | | | | 2 | 22 | 3 | 5 | | | 41 | 9 | 26 | 6 | 0.096 | 0.199 | 0.077 | 0.575 | 0.068 | 0.025 | 28.57 | 29.41 | 27.98 |
| | | | | 3 | 26 | 2 | 5 | | | 34 | 8 | 33 | 7 | 0.100 | 0.200 | 0.075 | 0.230 | 0.007 | 0.001 | 31.98 | 33.11 | 33.05 |
| | | | | 4 | 26 | 2 | 5 | | | 34 | 8 | 33 | 7 | 0.100 | 0.200 | 0.075 | 0.230 | 0.007 | 0.001 | 31.98 | 33.11 | 33.05 |
| 0.20 | 0.35 | 0.40 | Proposed | 1 | 17 | 3 | 4 | 6 | 7 | 42 | 11 | 48 | 13 | 0.100 | 0.198 | 0.074 | 0.587 | 0.484 | 0.600 | 28.56 | 32.20 | 28.97 |
| | | | | 2 | 22 | 4 | 5 | 7 | 10 | 44 | 12 | 43 | 12 | 0.099 | 0.199 | 0.070 | 0.599 | 0.600 | 0.720 | 30.61 | 30.53 | 27.57 |
| | | | | 3 | 23 | 4 | 5 | 8 | 10 | 39 | 10 | 40 | 11 | 0.100 | 0.199 | 0.070 | 0.528 | 0.470 | 0.630 | 30.83 | 31.95 | 29.24 |
| | | | | 4 | 24 | 4 | 5 | 8 | 10 | 38 | 10 | 40 | 11 | 0.099 | 0.199 | 0.070 | 0.496 | 0.520 | 0.69 | 31.67 | 31.61 | 28.98 |
| | | | LS | 1 | 17 | 3 | 5 | | | 39 | 10 | 52 | 14 | 0.099 | 0.200 | 0.076 | 0.549 | 0.103 | 0.046 | 28.30 | 44.28 | 47.55 |
| | | | | 2 | 25 | 5 | 7 | | | 53 | 15 | 29 | 8 | 0.100 | 0.194 | 0.068 | 0.617 | 0.083 | 0.029 | 33.11 | 34.03 | 31.86 |
| | | | | 3 | 23 | 4 | 5 | | | 40 | 10 | 40 | 11 | 0.098 | 0.200 | 0.070 | 0.501 | 0.055 | 0.019 | 31.49 | 39.06 | 39.68 |
| | | | | 4 | 26 | 4 | 8 | | | 40 | 11 | 33 | 9 | 0.099 | 0.200 | 0.069 | 0.383 | 0.024 | 0.007 | 34.22 | 35.53 | 34.49 |
| 0.25 | 0.40 | 0.45 | Proposed | 1 | 20 | 5 | 6 | 9 | 10 | 44 | 14 | 55 | 17 | 0.098 | 0.200 | 0.076 | 0.631 | 0.370 | 0.460 | 31.06 | 40.67 | 37.94 |
| | | | | 2 | 21 | 5 | 7 | 8 | 10 | 58 | 19 | 42 | 14 | 0.099 | 0.199 | 0.07 | 0.623 | 0.570 | 0.700 | 33.78 | 34.05 | 29.88 |
| | | | | 3 | 25 | 5 | 10 | 11 | 12 | 43 | 14 | 43 | 13 | 0.099 | 0.200 | 0.067 | 0.389 | 0.300 | 0.470 | 36.00 | 37.65 | 34.61 |
| | | | | 4 | 31 | 7 | 8 | 12 | 15 | 41 | 13 | 43 | 14 | 0.100 | 0.199 | 0.067 | 0.502 | 0.510 | 0.710 | 36.66 | 36.76 | 34.50 |
| | | | LS | 1 | 20 | 5 | 6 | | | 43 | 13 | 54 | 17 | 0.100 | 0.199 | 0.075 | 0.617 | 0.126 | 0.055 | 31.16 | 48.36 | 51.30 |
| | | | | 2 | 28 | 7 | 10 | | | 49 | 16 | 32 | 11 | 0.097 | 0.199 | 0.068 | 0.600 | 0.074 | 0.024 | 35.25 | 37.22 | 35.12 |
| | | | | 3 | 25 | 5 | 10 | | | 43 | 14 | 39 | 12 | 0.100 | 0.200 | 0.067 | 0.378 | 0.029 | 0.009 | 36.07 | 40.81 | 40.38 |
| | | | | 4 | 24 | 4 | 10 | | | 43 | 14 | 33 | 11 | 0.099 | 0.200 | 0.067 | 0.247 | 0.013 | 0.004 | 38.10 | 39.25 | 37.47 |
| 0.30 | 0.45 | 0.50 | Proposed | 1 | 20 | 6 | 9 | 10 | 11 | 55 | 20 | 59 | 22 | 0.100 | 0.200 | 0.075 | 0.625 | 0.380 | 0.470 | 33.24 | 42.36 | 39.27 |
| | | | | 2 | 26 | 8 | 10 | 11 | 14 | 61 | 23 | 46 | 18 | 0.100 | 0.200 | 0.067 | 0.688 | 0.630 | 0.730 | 35.96 | 36.65 | 32.70 |
| | | | | 3 | 29 | 8 | 11 | 15 | 15 | 46 | 17 | 47 | 18 | 0.100 | 0.199 | 0.064 | 0.483 | 0.220 | 0.370 | 37.92 | 42.75 | 40.26 |
4 | 34 | 10 | 13 | 15 | 18 | 47 | 18 | 46 | 17 | 0.100 | 0.199 | 0.064 | 0.581 | 0.520 | 0.710 | 39.36 | 40.01 | 37.60 | ||||
LS | 1 | 20 | 6 | 10 | 55 | 20 | 44 | 16 | 0.100 | 0.199 | 0.075 | 0.608 | 0.130 | 0.058 | 33.53 | 47.71 | 48.45 | |||||
2 | 32 | 10 | 13 | 53 | 20 | 35 | 14 | 0.097 | 0.199 | 0.066 | 0.644 | 0.082 | 0.025 | 38.23 | 40.07 | 37.87 | ||||||
3 | 29 | 8 | 11 | 46 | 17 | 47 | 18 | 0.099 | 0.200 | 0.064 | 0.479 | 0.043 | 0.012 | 37.99 | 45.99 | 46.66 | ||||||
4 | 34 | 10 | 13 | 47 | 18 | 46 | 17 | 0.098 | 0.200 | 0.064 | 0.554 | 0.047 | 0.012 | 39.68 | 45.66 | 45.96 | ||||||
0.35 | 0.50 | 0.55 | Proposed | 1 | 22 | 8 | 9 | 12 | 13 | 54 | 22 | 65 | 27 | 0.100 | 0.198 | 0.073 | 0.665 | 0.400 | 0.500 | 34.83 | 46.29 | 42.86 |
2 | 24 | 8 | 10 | 12 | 14 | 61 | 26 | 48 | 21 | 0.099 | 0.200 | 0.064 | 0.568 | 0.500 | 0.640 | 38.15 | 38.65 | 33.98 | ||||
3 | 26 | 8 | 9 | 14 | 15 | 45 | 19 | 49 | 21 | 0.100 | 0.200 | 0.062 | 0.426 | 0.320 | 0.480 | 38.56 | 41.54 | 37.83 | ||||
4 | 32 | 11 | 12 | 16 | 19 | 49 | 21 | 49 | 21 | 0.100 | 0.198 | 0.061 | 0.578 | 0.490 | 0.670 | 39.17 | 40.75 | 37.63 | ||||
LS | 1 | 20 | 7 | 8 | 54 | 22 | 60 | 25 | 0.099 | 0.2 | 0.076 | 0.601 | 0.132 | 0.058 | 34.99 | 54.02 | 57.24 | |||||
2 | 30 | 11 | 14 | 58 | 25 | 34 | 15 | 0.097 | 0.199 | 0.063 | 0.655 | 0.100 | 0.033 | 38.10 | 41.46 | 38.61 | ||||||
3 | 30 | 10 | 15 | 49 | 21 | 39 | 17 | 0.100 | 0.199 | 0.062 | 0.508 | 0.049 | 0.014 | 39.05 | 43.78 | 42.29 | ||||||
4 | 29 | 9 | 15 | 49 | 21 | 34 | 16 | 0.099 | 0.200 | 0.062 | 0.408 | 0.031 | 0.008 | 40.54 | 43.05 | 40.31 | ||||||
0.40 | 0.55 | 0.60 | Proposed | 1 | 24 | 10 | 11 | 15 | 15 | 50 | 23 | 62 | 29 | 0.100 | 0.200 | 0.069 | 0.658 | 0.310 | 0.380 | 35.36 | 49.03 | 46.78 |
2 | 30 | 12 | 15 | 16 | 19 | 54 | 26 | 51 | 25 | 0.100 | 0.199 | 0.059 | 0.627 | 0.570 | 0.720 | 38.82 | 39.8 | 36.01 | ||||
3 | 27 | 10 | 15 | 16 | 17 | 50 | 24 | 49 | 23 | 0.100 | 0.200 | 0.06 | 0.472 | 0.310 | 0.470 | 39.13 | 42.73 | 38.99 | ||||
4 | 30 | 11 | 16 | 17 | 19 | 50 | 24 | 47 | 23 | 0.100 | 0.200 | 0.06 | 0.452 | 0.390 | 0.590 | 40.87 | 41.72 | 37.86 | ||||
LS | 1 | 22 | 9 | 13 | 58 | 27 | 60 | 28 | 0.100 | 0.200 | 0.07 | 0.624 | 0.133 | 0.055 | 35.57 | 53.77 | 56.92 | |||||
2 | 33 | 14 | 17 | 56 | 27 | 37 | 18 | 0.100 | 0.198 | 0.06 | 0.681 | 0.101 | 0.031 | 39.11 | 42.43 | 40.20 | ||||||
3 | 25 | 9 | 14 | 50 | 24 | 50 | 23 | 0.100 | 0.200 | 0.06 | 0.425 | 0.044 | 0.013 | 39.38 | 48.9 | 49.67 | ||||||
4 | 30 | 11 | 17 | 50 | 24 | 35 | 18 | 0.100 | 0.200 | 0.059 | 0.431 | 0.033 | 0.008 | 41.06 | 43.94 | 41.16 | ||||||
0.45 | 0.60 | 0.65 | Proposed | 1 | 22 | 10 | 11 | 14 | 15 | 47 | 25 | 66 | 34 | 0.099 | 0.200 | 0.067 | 0.628 | 0.411 | 0.521 | 35.51 | 45.90 | 41.95 |
2 | 27 | 12 | 13 | 16 | 18 | 69 | 36 | 52 | 28 | 0.100 | 0.199 | 0.059 | 0.603 | 0.530 | 0.670 | 39.39 | 39.88 | 35.20 | ||||
3 | 35 | 15 | 16 | 23 | 24 | 50 | 27 | 49 | 26 | 0.100 | 0.200 | 0.057 | 0.473 | 0.230 | 0.410 | 42.52 | 45.88 | 43.30 | ||||
4 | 38 | 17 | 23 | 24 | 26 | 49 | 26 | 50 | 27 | 0.100 | 0.200 | 0.057 | 0.562 | 0.330 | 0.540 | 42.83 | 45.49 | 43.17 | ||||
LS | 1 | 22 | 10 | 11 | 48 | 25 | 60 | 31 | 0.100 | 0.200 | 0.066 | 0.604 | 0.121 | 0.047 | 35.25 | 54.13 | 57.48 | |||||
2 | 32 | 15 | 18 | 62 | 33 | 34 | 19 | 0.097 | 0.197 | 0.06 | 0.654 | 0.092 | 0.027 | 40.35 | 42.33 | 38.67 | ||||||
3 | 35 | 15 | 16 | 50 | 27 | 49 | 26 | 0.100 | 0.200 | 0.057 | 0.469 | 0.030 | 0.006 | 42.57 | 48.61 | 48.92 | ||||||
4 | 35 | 15 | 16 | 50 | 27 | 49 | 26 | 0.100 | 0.200 | 0.057 | 0.469 | 0.030 | 0.006 | 42.57 | 48.61 | 48.92 | ||||||
0.50 | 0.65 | 0.70 | Proposed | 1 | 25 | 13 | 15 | 17 | 18 | 54 | 31 | 58 | 33 | 0.100 | 0.200 | 0.061 | 0.677 | 0.430 | 0.560 | 34.75 | 42.78 | 39.07 |
2 | 27 | 14 | 16 | 17 | 20 | 65 | 38 | 51 | 30 | 0.100 | 0.198 | 0.056 | 0.710 | 0.630 | 0.740 | 37.12 | 39.03 | 34.36 | ||||
3 | 28 | 14 | 15 | 19 | 20 | 50 | 29 | 50 | 29 | 0.100 | 0.199 | 0.055 | 0.593 | 0.380 | 0.550 | 36.96 | 41.58 | 37.94 | ||||
4 | 27 | 12 | 16 | 18 | 20 | 50 | 29 | 49 | 29 | 0.100 | 0.199 | 0.054 | 0.377 | 0.380 | 0.580 | 41.24 | 40.93 | 36.34 | ||||
LS | 1 | 20 | 10 | 11 | 51 | 29 | 58 | 33 | 0.099 | 0.199 | 0.064 | 0.588 | 0.122 | 0.048 | 34.53 | 52.56 | 55.72 | |||||
2 | 30 | 15 | 19 | 53 | 31 | 34 | 20 | 0.099 | 0.200 | 0.052 | 0.572 | 0.065 | 0.017 | 38.90 | 41.85 | 38.73 | ||||||
3 | 25 | 12 | 13 | 47 | 27 | 50 | 29 | 0.099 | 0.200 | 0.056 | 0.500 | 0.060 | 0.017 | 37.04 | 48.29 | 49.48 | ||||||
4 | 36 | 19 | 23 | 50 | 29 | 39 | 24 | 0.099 | 0.199 | 0.055 | 0.691 | 0.088 | 0.022 | 39.97 | 43.34 | 41.59 |
Let us assume that the maximum uninteresting response rate is 0.05, and we are not sure whether the target response rate is 0.20 or 0.25. If the maximum type II errors are considered as and for the target response rates and , then our proposed design for is (0/1/2/3/10) (3/28) (4/38) under the optimality criterion 1 (O1): see Table 1. That is, at the first stage, 10 patients would be recruited. If none of the patients respond, then the design would be stopped for futility. If one patient responds, the design will proceed to the second stage to test the target response rate as at maximum type II error rate and 28−10 = 18 more patients would be recruited. If more than 3 respondents are observed at the second stage, the null hypothesis would be rejected for an alternate response rate . If 2 respondents are observed at the first stage, the design will proceed to the second stage, and 38−10 = 28 more patients would be recruited to test for the target response rate as at maximum type II error rate. If more than 4 respondents are observed, the null hypothesis would be rejected for an alternate response rate . Otherwise, the drug would be identified as ineffective, and the trial would be stopped. However, if three respondents are observed in the first stage, then the design would be stopped for efficacy, and the null hypothesis would be rejected for the target response rate . Finally, if the number of respondents observed at the first stage is more than 3, the design would be stopped, and the null hypothesis would be rejected for the target response rate .
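The stopping rule just described can be checked numerically. The following is a minimal Python sketch, not the authors' code (the paper reports using R): it assumes the first-stage response count is binomial, that exactly one response leads to a total of 28 patients, exactly two responses lead to a total of 38, and every other count stops the trial at 10 patients. The function and argument names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_sample_size(p, n1=10, m=28, n=38, to_m=(1,), to_n=(2,)):
    """Expected sample size for the example design (0/1/2/3/10) (3/28) (4/38).

    to_m / to_n list the first-stage response counts that continue to a
    total of m or n patients; every other count stops the trial at n1.
    """
    prob_m = sum(binom_pmf(x, n1, p) for x in to_m)
    prob_n = sum(binom_pmf(x, n1, p) for x in to_n)
    prob_stop = 1.0 - prob_m - prob_n
    # Weighted sum of the three possible final sample sizes.
    return n1 * prob_stop + m * prob_m + n * prob_n


for p in (0.05, 0.20, 0.25):
    print(p, round(expected_sample_size(p), 2))
```

Under these assumptions the sketch gives 17.76, 23.29 and 21.26 for true response rates 0.05, 0.20 and 0.25, which agrees with the expected sample sizes reported for this design.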
The expected sample sizes are 17.76, 23.29 and 21.26, if the true response rates are 0.05, 0.20 and 0.25, respectively. Note that the expected sample size is much higher if is the true response rate than if or is the true response rate. This trend is found for almost every design, with few exceptions. The reason is that the proposed design is less likely to proceed to the second stage if the drug is either futile or highly effective. If is true, then we can say that the drug is more effective than it was at . The optimality criterion for the O2 design is to minimise max and in every case is the highest. So we can say that our proposed O2 design has the minimum among all the designs fulfilling the error constraints. Although not presented in the paper, for every set of parameter values, the first ten designs under each optimality criterion are computed. For ϕ , the O2 design is (9/10/13/16/34) (29/87) (21/61) with 45.58, 46.97 and 41.37 as the expected sample sizes under , and , respectively. Though not presented here, the second design under the same optimality criterion is (9/11/13/16/34) (23/68) (22/64), where the expected sample sizes are 44.1, 47.17 and 41.73. The O2 design has and for the second design, . The second design may be more attractive to some investigators than the optimal one because of its lower max(m,n). Note that the expected sample sizes are very similar in this case.
The O1 designs have some common features that can be observed from Tables 1 and 2. E is the lowest among the designs under the four optimality criteria. This is expected, as the optimality criterion for these designs ensures the smallest expected sample size under the null hypothesis. The maximum difference in the expected sample sizes between O2 and O1 is observed for , which is 45.58−38.31 = 7.27. This difference is higher for designs at significance level than for designs at significance level. For the same values of design parameters but at significance level, the difference is 33.78−31.06 = 2.72. This is because the sample size in the first stage and the total sample sizes m and n are higher at the smaller significance level. Also, the probability of early termination if the null hypothesis is true, , is the highest, and the sample size in the first stage ( ) is the lowest for designs under optimality criterion 1. If is high, then the design is less likely to proceed to the second stage when the drug is ineffective. In Equation (9), we have expressed the expected sample size as a weighted sum of , m and n. For O1 designs, the is lowest because or is highest and the expected sample size is dominated by . Generally, for O1 designs is higher than for the other three designs.
O2 designs generally have larger than those for the O1 designs but smaller than those for O3 and O4 designs. Only two exceptions are observed in Tables 1 and 2. For , for O2 design is 17, which is smaller than for O1 design . For this set of parameters, for O1 design is 0.740, which is the highest for the designs we have computed. Another exception is observed for , where for O2 is 30, which is larger than for O3 . Moreover, the O2 designs have the highest and smallest max among the four optimal designs. Between the O3 and O4 designs, O3 designs generally have smaller , except in the cases where it is the same for both designs. Tables 1 and 2 show that in many cases the sample sizes in the first stage are the same under both conditions, and in some cases the two designs coincide. For significance level in Table 2, we see that in the first three cases, O3 and O4 designs coincide. At significance level, for the first and third cases, these two designs coincide. Between these two, the O3 designs have larger while the O4 designs have larger .
3.2. Comparison with Lin and Shih's design
We now compare the proposed design with Lin and Shih's (LS) design. The expected sample sizes are notably smaller for the proposed design if the target response rates ( or ) are true. The difference is higher if is the true response rate. For , 0.20, 0.25, 0.10, 0.20, , the LS design under the optimality criterion O1 is and , and are 17.23, 31.19 and 34.14, respectively. The difference in for the O1 design is 31.19−23.29 = 7.90 while the difference in is 34.14−21.26 = 12.88. The highest differences between and are observed in the O1 design for the set (0.45, 0.60, 0.65, 0.05, 0.20, , which are 72.38−60.32 = 12.06 and 76.75−54.06 = 22.69, respectively. The lowest differences are observed for O4 designs for (0.40, 0.55, 0.60, 0.05, 0.20, , which are 59.29−57.96 = 1.33 and 53.98−51.80 = 2.18, respectively.
If is the true response rate, the expected sample size in the LS design is generally smaller than that of the proposed design, but the difference is tiny. For , 0.20, 0.25, 0.10, 0.20, , is 17.23, which is smaller than that of the proposed design (17.76). This happens because the probabilities of early termination under the null hypothesis are high for both the LS and proposed designs. But and for the LS design are very low, which means that these designs have a very high chance of proceeding to the second stage, and this results in larger expected sample sizes. In the case of the proposed design, and are notably larger than for the LS design.
Figure 1 shows and versus p for , 0.20, 0.25, 0.05, 0.20, . We see that PET decreases as the true response rate increases for the LS design. However, for the proposed design, PET starts increasing after reaching a minimum value. Note that the curves for O3 and O4 designs are not shown since the main goal of those designs is to minimise the maximum sample size. For the LS O1 design, the expected sample size is a monotonic non-decreasing curve, and eventually, it approaches max(m,n). For the LS O2 design, there is no common trend because of the optimality criterion, but the curve also approaches max(m,n). For our proposed design, the expected sample size curve starts to increase as the true response rate increases. However, after a certain point, it starts decreasing and eventually reaches the sample size at the first stage ( ). As stated earlier, is smaller for the LS design in this case. The expected sample sizes for the LS and proposed designs are very similar if the true response rate is close to under optimality criterion 1. After a small increase in p, the proposed design's expected sample size seems to be much lower than that of the LS design.
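The qualitative shape of the PET curves can be illustrated with a short Python sketch. It does not reproduce Figure 1: as an assumption for illustration it reuses the first stage of the (0/1/2/3/10) (3/28) (4/38) design discussed earlier, where the trial continues only after one or two responses among 10 patients, and contrasts it with a hypothetical futility-only rule that stops only when no responses are seen. The function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pet_two_sided(p, n1=10, continue_counts=(1, 2)):
    """PET when the trial stops at stage one unless X1 is in continue_counts."""
    return 1.0 - sum(binom_pmf(x, n1, p) for x in continue_counts)


def pet_futility_only(p, n1=10, s=0):
    """PET for a hypothetical rule that stops only when X1 <= s."""
    return sum(binom_pmf(x, n1, p) for x in range(s + 1))


grid = [i / 20 for i in range(1, 20)]  # p = 0.05, 0.10, ..., 0.95
for p in grid:
    print(f"{p:.2f}  {pet_two_sided(p):.3f}  {pet_futility_only(p):.3f}")
```

On this grid the two-sided PET falls to a minimum at an intermediate response rate and then rises again, whereas the futility-only PET decreases throughout, which is the contrast described above.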
4. Application on Lin and Shih's VBG study
A study was conducted by Lin and Shih [16] to investigate the efficacy of the combination of vinorelbine, bleomycin and gemcitabine (VBG) for treating patients with recurrent or refractory Hodgkin disease. In their study, the maximum uninteresting response rate is considered as , and the target response rate may vary from to . For , Lin and Shih considered two target response rates, and at and . Table 1 shows that for the LS design is 43.89 under O1 if the true response rate is . However, if the true response rate is or , then are 74.13 and 78.93, which are considerably higher than the expected sample size when the true response rate is . Under the same setup, the proposed design is under O1. for the proposed design is 44.23 when the true response rate is , which is almost the same as for the LS design. But when the true response rate is and , are 63.39 and 57.92 respectively, which are notably lower than those for the LS design. The same holds for the other three optimality criteria: see Table 1.
5. Comparison with Mander and Thompson's design
The main difference between these two designs lies in the number of target response rates being considered. In the proposed design, against one maximum uninteresting response rate , we consider two target response rates and , whereas Mander and Thompson's design [17] considers only one . If and then , and m = n, and the proposed design becomes Mander and Thompson's design. Continuing the VBG example of Lin and Shih's study presented in Section 4, against one maximum uninteresting response rate ( ), it is not possible to test two different target response rates ( and ) by using Mander and Thompson's design at the same time. One possible solution may be conducting two separate tests as vs. and vs. . Appropriate designs for these separate tests are given in Table 3. These designs are computed by grid searching over different combinations of n, , r, and using self-written code in R.
Table 3.
| Null rate | Target rate | β | First stage: sample size | First stage: futility bound | First stage: efficacy bound | Second stage: n | Second stage: r | PET (null rate) | PET (target rate) | True α | True β | Expected sample size (null rate) | Expected sample size (target rate) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.40 | 0.55 | 0.20 | 26 | 11 | 17 | 84 | 40 | 0.676 | 0.237 | 0.050 | 0.194 | 44.78 | 70.23 |
| | | | 41 | 16 | 23 | 69 | 34 | 0.530 | 0.414 | 0.050 | 0.199 | 54.17 | 57.41 |
| | | | 44 | 19 | 23 | 80 | 40 | 0.759 | 0.663 | 0.049 | 0.200 | 52.69 | 56.12 |
| | | | 41 | 16 | 23 | 69 | 34 | 0.530 | 0.414 | 0.050 | 0.199 | 54.17 | 57.41 |
| 0.40 | 0.60 | 0.10 | 25 | 11 | 17 | 66 | 32 | 0.733 | 0.231 | 0.049 | 0.098 | 35.93 | 56.51 |
| | | | 29 | 12 | 19 | 54 | 27 | 0.639 | 0.248 | 0.049 | 0.099 | 38.03 | 47.81 |
| | | | 27 | 10 | 15 | 62 | 32 | 0.492 | 0.626 | 0.048 | 0.099 | 44.77 | 40.09 |
| | | | 36 | 16 | 21 | 54 | 27 | 0.772 | 0.561 | 0.050 | 0.098 | 40.10 | 43.91 |
For testing a response rate of 0.40 vs. 0.55, the design will be (11 17)/26 40/84. That means at the first stage, 26 patients will be recruited. If 11 or fewer patients respond, the study will be stopped due to futility, and if the number is more than 17, we will stop the study because of efficacy and reject the null hypothesis for the target response rate. However, suppose the number of responses is more than 11 and less than or equal to 17. In that case, we will proceed to the second stage, recruit 58 additional patients, and reject the null hypothesis only if the total number of responses is more than 40. Similarly, for testing 0.40 vs. 0.60, the design will be (11 17)/25 32/66. Although it is possible to stop the study early for both futility and efficacy in Mander and Thompson's design, it is impossible to mitigate the uncertainty that arises while selecting the target response rate.
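The operating characteristics of the (11 17)/26 40/84 design can be recomputed from the rule described above. This is a minimal Python sketch under stated assumptions, not the authors' R code: responses are binomial, the trial stops for futility when the first-stage count is at most 11 and for efficacy when it exceeds 17, and otherwise the null hypothesis is rejected when the total count exceeds 40. The function names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_sf(k, n, p):
    """P(X > k) for X ~ Binomial(n, p); handles k < 0 by summing from 0."""
    return sum(binom_pmf(x, n, p) for x in range(max(k + 1, 0), n + 1))


def operating_characteristics(p, n1, lower, upper, n, r):
    """Return (PET, expected sample size, rejection probability) at rate p."""
    stop_futility = 1.0 - binom_sf(lower, n1, p)  # P(X1 <= lower)
    stop_efficacy = binom_sf(upper, n1, p)        # P(X1 > upper)
    pet = stop_futility + stop_efficacy
    expected_n = n1 + (1.0 - pet) * (n - n1)
    # Reject at stage one for efficacy, or at stage two if the total exceeds r.
    reject = stop_efficacy + sum(
        binom_pmf(x1, n1, p) * binom_sf(r - x1, n - n1, p)
        for x1 in range(lower + 1, upper + 1)
    )
    return pet, expected_n, reject


pet0, en0, alpha = operating_characteristics(0.40, 26, 11, 17, 84, 40)
pet1, en1, power = operating_characteristics(0.55, 26, 11, 17, 84, 40)
print(round(pet0, 3), round(en0, 2), round(alpha, 3))
print(round(pet1, 3), round(en1, 2), round(1 - power, 3))
```

If the assumptions match the authors' conventions, the printed values should be close to the first row of Table 3 (0.676, 44.78 and 0.050 at a response rate of 0.40; 0.237, 70.23 and 0.194 at 0.55); small differences in the last digit may come from rounding.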
6. Discussion
The phase II clinical trial aims to determine whether a drug is effective and to screen out ineffective drugs. Phase II is an early phase of a clinical trial, and recruiting fewer patients is desirable. The adaptive phase II design by Lin and Shih [16] considers only futility as a reason to stop early and has a large expected sample size if the proposed drug is effective. In this paper, we have discussed why efficacy should also be considered as a reason for early termination. A design has been proposed for a single-arm phase II clinical trial that, along with futility, also considers efficacy as a reason to stop early. The proposed design can achieve a notable reduction in the expected sample size if the drug is effective without affecting the sample size when the drug is ineffective.
One difficulty is that the proposed design takes a long time to compute. Designs at significance level take less time than designs at significance level. This is because the sample sizes in both stages are notably larger for significance level. It will take even more time if we consider designs at significance level. The other difficulty is calculating the values of . As discussed at the beginning, Kim and Wong [12] proposed an adaptive phase II clinical trial design that allows three target response rates against one null response rate. However, they did not allow early stopping for efficacy and instead considered futility as the only reason. The authors used the Particle Swarm Optimisation (PSO) technique introduced by Kennedy and Eberhart [11] to find the solutions of parameters for their optimal design. One possible extension of the proposed design could be the use of PSO to find the solutions. The design can also be extended to three or more target response rates and their associated type II error rates against one maximum uninteresting response rate. Finally, the paper's findings should encourage stopping early for both futility and efficacy in two-stage adaptive designs for phase II trials.
Acknowledgments
The authors would like to thank the reviewers for their useful suggestions to improve the paper. The first author also would like to thank the Ministry of Science and Technology, Government of Bangladesh, for providing him the National Science and Technology Fellowship during this work.
Disclosure statement
No potential conflict of interest was reported by the author(s).
References
1. Berry S.M., Carlin B.P., Lee J.J., and Muller P., Bayesian Adaptive Methods for Clinical Trials, CRC Press, New York, 2010.
2. Chen T.T., Optimal three-stage designs for phase II cancer clinical trials, Stat. Med. 16 (1998), pp. 2701–2711.
3. Chen K. and Shan M., Optimal and minimax three-stage designs for phase II oncology clinical trials, Contemp. Clin. Trials 29 (2008), pp. 32–41.
4. Englert S. and Kieser M., Improving the flexibility and efficiency of phase II designs for oncology trials, Biometrics 68 (2011), pp. 886–892.
5. Englert S. and Kieser M., Adaptive designs for single-arm phase II trials in oncology, Pharm. Stat. 11 (2012), pp. 241–249.
6. Fleming T.R., One-sample multiple testing procedure for phase II clinical trials, Biometrics 38 (1982), pp. 143–151.
7. Gehan E.A., The determination of the number of patients required in a preliminary and a follow-up trial of a new chemotherapeutic agent, J. Chronic Dis. 13 (1961), pp. 346–353.
8. Jin H. and Yin G., Bayesian enhancement two-stage design with error control for phase II clinical trials, Stat. Med. 39 (2020), pp. 4452–4465.
9. Jung S.-H., Randomized phase II trials with a prospective control, Stat. Med. 27 (2008), pp. 568–583.
10. Jung S.-H., Lee T., Kim K., and George S.L., Admissible two-stage designs for phase II cancer clinical trials, Stat. Med. 23 (2004), pp. 561–569.
11. Kennedy J. and Eberhart R., Particle swarm optimization, Proc. Int. Conf. Neural Networks 4 (1995), pp. 1942–1948.
12. Kim S. and Wong W.K., Extended two-stage adaptive design with three target responses for phase II clinical trial, Stat. Methods Med. Res. 27 (2017), pp. 3628–3642.
13. Lai T.L., Lavori P.W., and Shih M.-C., Sequential design of phase II-III cancer trials, Stat. Med. 31 (2012), pp. 1944–1960.
14. Lee J.J. and Feng L., Randomized phase II designs in cancer clinical trials: Current status and future directions, J. Clin. Oncol. 23 (2005), pp. 4450–4457.
15. Lee J.J. and Liu D.D., A predictive probability design for phase II cancer clinical trials, Clin. Trials 5 (2008), pp. 93–106.
16. Lin Y. and Shih W.J., Adaptive two-stage designs for single-arm phase IIA cancer clinical trials, Biometrics 60 (2004), pp. 482–490.
17. Mander A.P. and Thompson S.G., Two-stage designs optimal under the alternative hypothesis for phase II cancer clinical trials, Contemp. Clin. Trials 31 (2010), pp. 572–578.
18. Mander A.P., Wason J.M.S., Sweeting M.J., and Thompson S.G., Admissible two-stage designs for phase II cancer clinical trials that incorporate the expected sample size under the alternative hypothesis, Pharm. Stat. 11 (2012), pp. 91–96.
19. O'Brien P.C. and Fleming T.R., A multiple testing procedure for clinical trials, Biometrics 35 (1979), pp. 549–556.
20. Sambucini V., A Bayesian predictive strategy for an adaptive two-stage design in phase II clinical trials, Stat. Med. 29 (2010), pp. 1430–1442.
21. Sambucini V., Bayesian predictive monitoring with bivariate binary outcomes in phase II clinical trials, Comput. Stat. Data Anal. 132 (2019), pp. 18–30.
22. Shan G., Wilding G.E., Hutson A.D., and Gerstenberger S., Optimal adaptive two-stage designs for early phase II clinical trials, Stat. Med. 35 (2015), pp. 1257–1266.
23. Shan G., Zhang H., and Jiang T., Adaptive two-stage optimal designs for phase II clinical studies that allow early futility stopping, Seq. Anal. 38 (2019), pp. 199–213.
24. Shi H. and Yin G., Bayesian two-stage design for phase II clinical trials with switching hypothesis tests, Bayesian Anal. 12 (2017), pp. 31–51.
25. Shi H. and Yin G., Two-stage seamless transition design from open-label single-arm to randomized double-arm clinical trials, Stat. Methods Med. Res. 27 (2018), pp. 158–171.
26. Simon R., Optimal two-stage designs for phase II clinical trials, Control. Clin. Trials 10 (1989), pp. 1–10.
27. Ye F. and Shyr Y., Balanced two-stage designs for phase II clinical trials, Clin. Trials 4 (2007), pp. 514–524.