Cluster randomized trial (CRT) designs are commonly selected for pragmatic clinical trials (PCTs), because individual-level randomization often raises practical implementation challenges and because outcomes within clusters tend to be correlated. There is an extensive literature on the inefficiency of simple cluster randomization (ie, parallel cluster randomization) compared to individual-level randomization and on approaches to accounting for this inefficiency in sample size calculations (Donner et al, 1981; Hsieh, 1988; Donner, 1992; Donner and Klar, 1996; Campbell et al, 2001). However, modified cluster randomized designs, such as cluster randomization with crossover, may reduce the required sample size and may be particularly feasible in PCTs in healthcare systems with electronic health records. In this section, we describe alternative design choices for cluster-with-crossover randomized trials and their implications for statistical power and sample size calculations.
Simple Cluster vs Individual-Level Randomized Trials
It is well known that simple CRTs have less statistical power than individual-level RCTs because of correlation within clusters. Specifically, in a trial designed to determine whether there is a significant difference between interventions A and B on response Y, randomization at the cluster level to either A or B would require a larger sample size to obtain the same statistical power as randomization at the individual level. The magnitude of loss of statistical power is related to the cluster size, the balance of cluster sizes, and the correlation within the clusters.
For a given total sample size, statistical power increases as the number of clusters increases. This makes intuitive sense: as the number of clusters increases, the size of each cluster decreases toward 1, which corresponds to individual-level randomization. Moreover, as the cluster sizes become more imbalanced, statistical power decreases (Eldridge et al, 2009). Power also decreases as the correlation within clusters increases. If there were no intracluster correlation, the power would be the same as with individual-level randomization.
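The relationships described above can be illustrated with a minimal sketch. It assumes a continuous outcome with unit variance, a two-sided z test, and the widely used design effect 1 + (m - 1) x ICC, extended by a commonly cited approximation for unequal cluster sizes based on the coefficient of variation (CV) of cluster size. The function names, parameter names, and example values are ours and are for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(mean_cluster_size, icc, cv=0.0):
    """Variance inflation of a parallel CRT relative to individual randomization.

    cv is the coefficient of variation of the cluster sizes (assumed input);
    cv = 0 reduces to the familiar 1 + (m - 1) * icc.
    """
    return 1.0 + ((cv ** 2 + 1.0) * mean_cluster_size - 1.0) * icc


def power_two_arm(n_per_arm, effect_size, deff=1.0, alpha=0.05):
    """Approximate power of a two-sided z test for a standardized mean difference.

    The variance of the estimated difference is inflated by deff.
    The negligible contribution of the opposite tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1.0 - alpha / 2.0)
    se = sqrt(2.0 * deff / n_per_arm)
    return z.cdf(effect_size / se - z_crit)


# Illustrative values only: 500 participants per arm, standardized effect 0.2.
n, delta, icc = 500, 0.2, 0.05
print("individual randomization:", round(power_two_arm(n, delta), 3))
for clusters, size in [(10, 50), (50, 10)]:
    deff = design_effect(size, icc)
    print(f"{clusters} clusters of {size} per arm:",
          round(power_two_arm(n, delta, deff), 3))
# Same mean cluster size, but unequal cluster sizes (CV = 0.5).
print("10 clusters, CV 0.5:",
      round(power_two_arm(n, delta, design_effect(50, icc, cv=0.5)), 3))
```

With the total sample size held fixed, the sketch reproduces the qualitative pattern in the text: more, smaller clusters give higher power, while greater imbalance in cluster sizes or a higher intracluster correlation gives lower power.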
Inefficiency is not the only problem with simple CRT designs. Another challenge is the potential for imbalance in baseline factors, especially with large clusters. For example, in study designs that involve clustering at the clinic level, individual clinics may differ in the size and demographic characteristics of their patient populations. These challenges may be overcome by adding a crossover, that is, a switch to the other intervention within each cluster during the trial.
Cluster With Crossover
We define a cluster-with-crossover design as a randomization design in which each cluster is randomly assigned to a study arm at the beginning of the study and, after a certain period of time, switches (ie, crosses over) to the other study arm. Timing the crossover to occur approximately halfway through the study achieves balance between the study arms, including balance on baseline factors.
A cluster-with-crossover design is only feasible if the intervention can be turned off and on without “learning,” such that residual practices are not carried over from the precrossover period to the postcrossover period. A carryover effect would cause contamination between the study arms. Implementing a washout period after the crossover, during which the data from the clusters are discarded, may help to prevent contamination, though washout periods are not always feasible. For example, in a device trial in which hospitals are randomly assigned to the device intervention or usual care, and in which the outcome of interest is patient survival, a carryover effect may occur despite a washout period due to the time sequence of other potential confounding treatments (eg, a new protocol introduced into the system that may improve survival midway through the trial). However, there would likely be a balance of these time effects across the study arms.
When a cluster-with-crossover design is feasible, it is statistically more efficient than individual-level randomization in certain situations. However, because of challenges with feasibility (ie, turning the intervention on and off) and carryover effects, we advocate the cluster-with-crossover approach not as a replacement for individual-level randomization but as a viable alternative to simple cluster randomization. The efficiency gained with a cluster-with-crossover design is analogous to that gained from a paired t test over an independent t test. More power is gained as the between-period correlation within cluster increases (Li et al, 2018). Power also tends to be greatest when the precrossover and postcrossover periods are balanced. As the periods become less balanced, power decreases and the study design moves toward a simple cluster randomized design.
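The paired t test analogy can be made concrete with a minimal sketch. It assumes a cross-sectional design with 2 balanced periods, m participants per cluster-period, a random-effects model with cluster and cluster-by-period effects, and an analysis based on within-cluster differences in period means. Under those assumptions, the variance of the treatment effect estimate relative to individual randomization with the same total sample size is 1 + (m - 1) x rho - m x eta for the crossover design and 1 + (m - 1) x rho + m x eta for a parallel design with the same clusters, where rho is the within-period intracluster correlation and eta is the between-period intracluster correlation. The function and parameter names are ours.

```python
def _check(icc_within, icc_between):
    # Under the assumed model, the between-period correlation cannot
    # exceed the within-period correlation.
    if not 0.0 <= icc_between <= icc_within <= 1.0:
        raise ValueError("require 0 <= icc_between <= icc_within <= 1")


def crossover_design_effect(m, icc_within, icc_between):
    """Cluster-with-crossover design, 2 balanced periods of m participants each."""
    _check(icc_within, icc_between)
    return 1.0 + (m - 1.0) * icc_within - m * icc_between


def parallel_design_effect(m, icc_within, icc_between):
    """Parallel CRT observed over the same 2 periods (2 * m per cluster)."""
    _check(icc_within, icc_between)
    return 1.0 + (m - 1.0) * icc_within + m * icc_between


# Illustrative values only: 50 participants per cluster-period, rho = 0.05.
m, rho = 50, 0.05
for eta in (0.0, 0.025, 0.05):
    print(f"eta={eta}: crossover",
          round(crossover_design_effect(m, rho, eta), 3),
          "parallel", round(parallel_design_effect(m, rho, eta), 3))
```

A value below 1 means the design needs fewer participants than individual randomization for the same power. In this sketch, that occurs for the crossover design only when the between-period correlation is close to the within-period correlation, which is consistent with the "certain situations" noted above.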
Cluster With Partial Crossover
If an intervention cannot be turned off and on, another simple alternative is to collect data from all clusters during a baseline period (ie, before the intervention is introduced), then assign half of the clusters to the intervention and continue to collect data. This approach involving an untreated baseline period followed by parallel cluster randomization has statistical advantages because data are available from some clusters to efficiently estimate a within-cluster effect without the potential for “learning” contamination, or carryover effect, that could occur with cluster-with-crossover designs. Moreover, if outcome data are already being collected through the electronic health record or medical billing claims, this design is more powerful than a simple CRT design without cost to the study, because the data are already available and easily obtained. A limitation of the design is that not all clusters receive the intervention, unlike other designs such as the stepped-wedge trial.
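A rough sense of the gain from a baseline period can be obtained with the following sketch. It assumes the same random-effects model as a cross-sectional cluster design with m participants per cluster in both the baseline and follow-up periods, a cluster-level analysis of covariance that adjusts each cluster's follow-up mean for its baseline mean, and a large number of clusters. Under those assumptions, adjustment multiplies the usual design effect by 1 - r^2, where r = m x eta / (1 + (m - 1) x rho) is the correlation between a cluster's baseline and follow-up means. This is an illustration under stated assumptions rather than a sample size formula for any specific trial, and the function and parameter names are ours.

```python
def baseline_adjusted_design_effect(m, icc_within, icc_between):
    """Design effect of a parallel CRT after adjusting for baseline cluster means.

    m           participants per cluster in each period (assumed equal)
    icc_within  within-period intracluster correlation (rho)
    icc_between between-period intracluster correlation (eta)
    """
    if not 0.0 <= icc_between <= icc_within <= 1.0:
        raise ValueError("require 0 <= icc_between <= icc_within <= 1")
    unadjusted = 1.0 + (m - 1.0) * icc_within
    # Correlation between baseline and follow-up cluster means.
    r = m * icc_between / unadjusted
    return unadjusted * (1.0 - r * r)


# Illustrative values only: 50 participants per cluster-period, rho = 0.05.
m, rho = 50, 0.05
for eta in (0.0, 0.025, 0.05):
    print(f"eta={eta}: design effect",
          round(baseline_adjusted_design_effect(m, rho, eta), 3))
```

When the between-period correlation is 0, the baseline data add nothing and the result equals the design effect of a simple CRT. As the between-period correlation rises, the adjusted design effect falls, which reflects the added power from baseline data that are already being collected.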
To implement new cluster-with-crossover designs, there is a need for sample size calculations that are more feasible than currently available simulation approaches. These calculations require derivation of variance formulas for different designs incorporating the potential for unbalanced cluster sizes or crossover periods. The NIH Collaboratory's Biostatistics and Study Design Core is working on deriving these formulas for future trials.
References
Campbell MK, Mollison J, Grimshaw JM. 2001. Cluster trials in implementation research: estimation of intracluster correlation coefficients and sample size. Stat Med. 20:391-399. PMID: 11180309.
Donner A. 1992. Sample size requirements for stratified cluster randomization designs. Stat Med. 11:743-750. PMID: 1594813.
Donner A, Birkett N, Buck C. 1981. Randomization by cluster. Sample size requirements and analysis. Am J Epidemiol. 114:906-914. PMID: 7315838.
Donner A, Klar N. 1996. Statistical considerations in the design and analysis of community intervention trials. J Clin Epidemiol. 49:435-439. PMID: 8621994.
Eldridge SM, Ukoumunne OC, Carlin JB. 2009. The intra-cluster correlation coefficient in cluster randomized trials: a review of definitions. Int Stat Rev. 77:378-394. doi: 10.1111/j.1751-5823.2009.00092.
Hsieh FY. 1988. Sample size formulae for intervention studies with the cluster as unit of randomization. Stat Med. 7:1195-1201. PMID: 3201045.
Li F, Forbes AB, Turner EL, Preisser JS. 2018. Power and sample size requirements for GEE analyses of cluster randomized crossover trials. Stat Med. 2018 Oct 8. doi: 10.1002/sim.7995. PMID: 30298551. [Epub ahead of print]