Experimental Designs and Randomization Schemes
Section 7
Covariate-Constrained Randomization
When implemented correctly and with a sufficiently large sample size, randomization helps provide an unbiased causal estimate of the treatment effect (Armitage 1982). This is because randomization enables researchers to compare groups that are similar on measured and unmeasured baseline characteristics other than the treatment condition. In addition to randomization, trialists also employ concealment and blinding, trial monitoring, strategies that minimize loss to follow-up, and intention-to-treat analyses to enhance the rigor of a trial. These techniques reduce the chances of conscious or unconscious bias and increase confidence that observed differences between treatment groups are attributable to the intervention and not to confounding factors.
Randomization can occur at the individual level or cluster level. In individually randomized trials, the participant is the unit of randomization, the intervention is applied to the participant, and analyses are conducted at the individual level. In cluster randomized trials (CRTs), randomization occurs at the level of a clinic, hospital, healthcare system, city block, or another unit that comprises multiple patients or participants. In CRTs, the intervention can be applied at the individual or cluster level, and the analysis can be conducted at the individual or cluster level. (See the Cluster Randomized Trials section in this chapter of the Living Textbook.)
Simple random allocation is the easiest randomization method: every "randomization unit" is allocated entirely at random, like flipping a coin. However, there is a moderate to high chance of imbalance in baseline characteristics between treatment arms when a small number of units is randomized (for example, fewer than 20) (Perry et al 2010). In trials with individual-level randomization, some have suggested randomizing at least 1000 participants per arm to provide sufficient protection against imbalance in baseline characteristics (Chu et al 2012). However, it is often impractical or impossible to randomize that many units in CRTs. One review of 300 CRTs found that half of the trials randomly allocated fewer than 21 clusters, and three-quarters of the trials randomized fewer than 52 clusters (Ivers et al 2011).
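To illustrate how the chance of imbalance shrinks as more units are randomized, the following minimal Python sketch simulates unrestricted allocation to 2 arms. Everything in it is an assumption made for illustration: a single binary baseline characteristic with 30% prevalence, equal arm sizes (a shuffle-and-split variant of simple randomization rather than independent coin flips), and "imbalance" defined as a between-arm difference in proportions of more than 25 percentage points.

```python
import random


def imbalance_probability(n_units, prevalence=0.3, threshold=0.25,
                          n_sims=2000, seed=1):
    """Estimate how often the two arms differ by more than `threshold` in the
    proportion of units that have a binary baseline characteristic."""
    rng = random.Random(seed)
    half = n_units // 2
    imbalanced = 0
    for _ in range(n_sims):
        # Hypothetical binary characteristic (for example, "rural site").
        has_trait = [rng.random() < prevalence for _ in range(n_units)]
        # Unrestricted allocation to two equal-sized arms: shuffle, then split.
        rng.shuffle(has_trait)
        prop_a = sum(has_trait[:half]) / half
        prop_b = sum(has_trait[half:]) / (n_units - half)
        if abs(prop_a - prop_b) > threshold:
            imbalanced += 1
    return imbalanced / n_sims


# Estimated probability of imbalance for increasing numbers of units.
for n in (10, 20, 50, 200):
    print(n, imbalance_probability(n))
```

The estimated probability falls steadily as the number of randomized units grows, which is why small CRTs are the setting where restricted randomization is most useful.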
Imbalances in baseline characteristics in randomized controlled trials (RCTs) complicate the interpretation of observed treatment effects and may also threaten a trial’s internal validity (Raab and Butcher 2001; Carter and Hood 2008; Ivers et al 2012). Using "restricted" or "constrained" randomization procedures can help minimize the chance of baseline imbalance (and improve statistical efficiency) in parallel-arm CRTs (Ivers et al 2012). These constrained procedures include stratification, matching, minimization, and covariate-constrained randomization. (See also the Pair Matching and Stratification With Cluster Designs section in this chapter of the Living Textbook.) All restricted methods require prior knowledge about the clusters and the baseline measures used for the restriction process.
Applying Covariate-Constrained Randomization in CRTs
Covariate-constrained randomization calculates the baseline imbalance of cluster-level variables, or covariates, using a prespecified balancing metric and selects a randomization scheme at random from those with an "acceptable" balance (Ivers et al 2012; Ciolino et al 2019). In both 2-arm and multi-arm trials (Ciolino et al 2019; Watson et al 2020; Zhou et al 2022), covariate-constrained randomization can (1) provide a better balance on baseline characteristics than other allocation procedures, such as simple randomization, stratification, and minimization; and (2) overcome the limitations of other randomization methods (for example, sparse strata when using stratification or additional complexity estimating the intracluster correlation coefficient when matching) (Perry et al 2010; Ivers et al 2012; Moulton 2004; Xiao et al 2011; Zhou et al 2022).
The covariate-constrained randomization process includes the following steps:
1. Identify a small number (for example, fewer than 5) of important prognostic cluster-level or individual-level variables available at the time of randomization.
2. Either enumerate all allocation schemes when fewer than 20 clusters are randomized or generate at least 100,000 allocation schemes (Carter and Hood 2008).
3. For each allocation scheme in step 2, estimate the balance of the variables chosen in step 1 according to a balancing metric (for example, absolute differences, standardized differences, or another measure [Lee et al 2016]).
4. Select the subset of allocation schemes that are acceptably balanced on the constrained variables (Li et al 2016; Li et al 2017; Yu et al 2019).
5. Randomly select one scheme from the constrained randomization space identified in step 4. This allocation is then used in the trial.
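The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any particular trial or software package: the clinic data are hypothetical, the balance metric (the sum over covariates of squared differences in standardized arm means) is only one possible choice, and the rule of keeping the best-balanced 10% of schemes (`keep_fraction`) is an arbitrary cutoff chosen for the example.

```python
import itertools
import random


def constrained_randomization(covariates, keep_fraction=0.10, seed=2024):
    """Pick one two-arm allocation at random from the best-balanced schemes.

    covariates: dict mapping cluster id -> list of cluster-level covariates.
    Returns (arm_a, arm_b, n_acceptable, n_total).
    """
    clusters = sorted(covariates)
    n_vars = len(covariates[clusters[0]])

    # Standardize each covariate across clusters so that variables measured
    # on different scales contribute comparably to the balance score.
    z = {c: [] for c in clusters}
    for j in range(n_vars):
        values = [covariates[c][j] for c in clusters]
        mean = sum(values) / len(values)
        sd = (sum((v - mean) ** 2 for v in values) / (len(values) - 1)) ** 0.5
        for c in clusters:
            z[c].append((covariates[c][j] - mean) / sd if sd > 0 else 0.0)

    def balance_score(arm_a):
        # Sum over covariates of the squared difference in arm means.
        arm_b = [c for c in clusters if c not in arm_a]
        score = 0.0
        for j in range(n_vars):
            mean_a = sum(z[c][j] for c in arm_a) / len(arm_a)
            mean_b = sum(z[c][j] for c in arm_b) / len(arm_b)
            score += (mean_a - mean_b) ** 2
        return score

    # Step 2: enumerate every allocation of half the clusters to arm A.
    schemes = list(itertools.combinations(clusters, len(clusters) // 2))
    # Step 3: score each scheme (lower means better balanced).
    scored = sorted((balance_score(arm_a), arm_a) for arm_a in schemes)
    # Step 4: keep the best-balanced fraction as the constrained space.
    n_keep = max(1, int(len(scored) * keep_fraction))
    acceptable = [arm_a for _, arm_a in scored[:n_keep]]
    # Step 5: draw one acceptable scheme at random; this is the allocation.
    arm_a = random.Random(seed).choice(acceptable)
    arm_b = tuple(c for c in clusters if c not in arm_a)
    return arm_a, arm_b, len(acceptable), len(schemes)


# Step 1: hypothetical clinics with [number of patients, % with diabetes].
clinics = {
    "A": [120, 22.0], "B": [340, 31.5], "C": [95, 18.0], "D": [410, 35.0],
    "E": [150, 25.5], "F": [280, 29.0], "G": [200, 20.5], "H": [310, 33.0],
}
arm_a, arm_b, n_acceptable, n_total = constrained_randomization(clinics)
print(f"{n_acceptable} of {n_total} schemes kept; arm A = {arm_a}, arm B = {arm_b}")
```

With 8 clusters there are only 70 equal-sized allocations, so full enumeration is trivial. With many more clusters the enumeration in step 2 would be replaced by a large random sample of schemes, as described above.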
There are trade-offs between better balance achieved on the constrained baseline characteristics and the consequences of a highly restricted randomization procedure (Moulton 2004; Li et al 2017). The trade-offs include the following:
- The "random" element of randomization can be threatened when pairs of clusters always or rarely appear in the same arm (Moulton 2004; Li et al 2017). Note that this is different from other approaches (such as matching) in which pairs of centers can never appear together by design.
- There can be a departure from the nominal type I error rate when correlated clusters (such as academic vs community hospitals) have a high or low probability of being included in the same treatment arm (Moulton 2004; Li et al 2017).
- There can be a loss of statistical power when constrained variables are not associated with the trial outcomes (Moulton 2004; Li et al 2017). At the end of the trial period, variables used to constrain the randomization must be considered during the analysis; it is often more statistically efficient to adjust at the individual level rather than the cluster level (Ford et al 1995; Ford and Norrie 2002; Kahan et al 2014).
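One way to check for the first two trade-offs is to tabulate, for every pair of clusters, how often the pair shares an arm across the acceptable schemes. The sketch below is illustrative only: the function name `same_arm_frequency`, the 4 hypothetical clusters, and the list of acceptable schemes are all assumptions made for the example.

```python
from itertools import combinations


def same_arm_frequency(clusters, acceptable_schemes):
    """For each pair of clusters, return the fraction of acceptable schemes
    in which both clusters are allocated to the same arm.

    acceptable_schemes: list of tuples, each holding the clusters in arm A
    (all remaining clusters are in arm B).
    """
    freq = {}
    for a, b in combinations(clusters, 2):
        together = sum((a in arm_a) == (b in arm_a)
                       for arm_a in acceptable_schemes)
        freq[(a, b)] = together / len(acceptable_schemes)
    return freq


# Hypothetical constrained space for 4 clusters (arm A membership only).
clusters = ["A", "B", "C", "D"]
acceptable = [("A", "B"), ("A", "C"), ("C", "D"), ("B", "D")]

freq = same_arm_frequency(clusters, acceptable)
# Flag pairs that always (1.0) or never (0.0) share an arm.
flagged = {pair: f for pair, f in freq.items() if f in (0.0, 1.0)}
print(flagged)
```

In this toy constrained space, clusters A and D never share an arm, and neither do B and C, which signals that the constraint may be too tight. For comparison, when all 6 equal-sized allocations of 4 clusters are allowed, any given pair shares an arm in one-third of them.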
Recommendations
Several recommendations have been proposed for using covariate-constrained randomization in CRTs (Al-Jaishi et al 2021):
- Identify beforehand the prognostic variables associated with the trial outcome. Previous reports show that constraining the randomization on conditions that are prevalent (for example, a prevalence between 10% and 50%) and strongly associated with the outcome improves the precision of treatment effect estimates and statistical power (Kahan et al 2014; Hernández et al 2004; Hernández et al 2006; Raab et al 2000). This information can come from background literature, previous trials or observational studies, or historical data (such as administrative data, electronic health records, and registries).
- Consider generating all (or at least 1000) simple random allocations to identify baseline characteristics that are always balanced (for example, at least 95% of the time). There is no need to constrain the randomization on these variables; however, these variables can be included in model-based adjustment to improve the precision of treatment effect estimates. All prognostic variables used for model adjustment should be prespecified at the design stage (Raab et al 2000).
- Consider the number of variables used to constrain the randomization process. Over-constraining the randomization process could result in clusters with correlated outcomes having a lower or higher probability of being allocated to the same trial arm (Moulton 2004; Freedman et al 1990). This correlation in cluster assignment can violate the independence assumption that underlies significance tests and the associated 95% CIs for the estimated treatment effect. An invalid design could result in incorrect type I error rates and CIs with incorrect coverage (Hayes and Moulton 2009).
- Consider using a dimensionality-reduction method (such as principal component analysis or propensity scores) to reduce many prognostic variables to a few criterion variables used in the constrained randomization process. These criterion variables must be considered at the analysis stage in the model adjustment (Silipo and Widmann 2019).
- Although the constraining process uses aggregate patient-level and cluster-level data, researchers should consider the missingness of constrained variables. Variables with missing information should be imputed before implementing the constraining process (Fiero et al 2016). A similar imputation approach should be implemented at the analysis stage.
- Consider enumerating all possible randomization schemes when there are fewer than 20 clusters or generating at least 100,000 randomization schemes otherwise (Li et al 2017).
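The screening idea in the second recommendation (generate many simple random allocations and see which variables are balanced nearly every time) can be sketched as follows. The data are hypothetical, and several choices are assumptions made only for illustration: "balanced" is defined as an absolute standardized difference of at most 0.25, the standardized difference uses the overall standard deviation across clusters rather than a pooled within-arm value, and the arms are equal in size.

```python
import random
import statistics


def balance_frequency(covariates, n_allocations=1000, max_std_diff=0.25, seed=7):
    """covariates: dict mapping variable name -> list of cluster-level values
    (same cluster order in every list).
    Returns, per variable, the fraction of simple random allocations in which
    the absolute standardized difference between arms is <= max_std_diff."""
    rng = random.Random(seed)
    n_clusters = len(next(iter(covariates.values())))
    half = n_clusters // 2
    sds = {name: statistics.stdev(values) for name, values in covariates.items()}
    balanced = {name: 0 for name in covariates}
    for _ in range(n_allocations):
        # One simple random allocation: shuffle cluster indices, then split.
        order = list(range(n_clusters))
        rng.shuffle(order)
        arm_a, arm_b = order[:half], order[half:]
        for name, values in covariates.items():
            mean_a = sum(values[i] for i in arm_a) / len(arm_a)
            mean_b = sum(values[i] for i in arm_b) / len(arm_b)
            std_diff = abs(mean_a - mean_b) / sds[name] if sds[name] > 0 else 0.0
            if std_diff <= max_std_diff:
                balanced[name] += 1
    return {name: count / n_allocations for name, count in balanced.items()}


# Hypothetical cluster-level data for 12 centers.
centers = {
    "pct_female": [48, 51, 50, 49, 52, 50, 47, 51, 50, 49, 50, 51],
    "n_patients": [80, 95, 110, 60, 450, 120, 75, 300, 90, 105, 70, 520],
}
for name, share in balance_frequency(centers).items():
    verdict = "no need to constrain" if share >= 0.95 else "consider constraining"
    print(f"{name}: balanced in {share:.0%} of allocations ({verdict})")
```

Variables that fall below the chosen share (95% here, following the example in the text) are candidates for the constraint; the remaining variables can still be prespecified for model-based adjustment.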
REFERENCES
Al-Jaishi A, Dixon S, McArthur E, Devereaux P, Thabane L, Garg A. 2021. Simple compared to covariate-constrained randomization methods in balancing baseline characteristics: a case study of randomly allocating 72 hemodialysis centers in a cluster trial. Trials. 22(1):626. doi: 10.1186/s13063-021-05590-1. PMID: 34526092.
Armitage P. 1982. The role of randomization in clinical trials. Stat Med. 1(4):345-52. doi: 10.1002/sim.4780010412. PMID: 7187102.
Carter BR, Hood K. 2008. Balance algorithm for cluster randomized trials. BMC Med Res Methodol. 8:65. doi: 10.1186/1471-2288-8-65. PMID: 18844993.
Chu R, Walter SD, Guyatt G, et al. 2012. Assessment and implication of prognostic imbalance in randomized controlled trials with a binary outcome--a simulation study. PLoS One. 7(5):e36677. doi: 10.1371/journal.pone.0036677. PMID: 22629322.
Ciolino JD, Diebold A, Jensen JK, Rouleau GW, Koloms KK, Tandon D. 2019. Choosing an imbalance metric for covariate-constrained randomization in multiple-arm cluster-randomized trials. Trials. 20(1):293. doi: 10.1186/s13063-019-3324-5. PMID: 31138319.
Fiero MH, Huang S, Oren E, Bell ML. 2016. Statistical analysis and handling of missing data in cluster randomized trials: a systematic review. Trials. 17:72. doi: 10.1186/s13063-016-1201-z. PMID: 26862034.
Ford I, Norrie J. 2002. The role of covariates in estimating treatment effects and risk in long-term clinical trials. Stat Med. 21(19):2899-2908. doi: 10.1002/sim.1294. PMID: 12325106.
Ford I, Norrie J, Ahmadi S. 1995. Model inconsistency, illustrated by the Cox proportional hazards model. Stat Med. 14(8):735-746. doi: 10.1002/sim.4780140804. PMID: 7644855.
Freedman LS, Green SB, Byar DP. 1990. Assessing the gain in efficiency due to matching in a community intervention study. Stat Med. 9(8):943-52. doi: 10.1002/sim.4780090810. PMID: 2218196.
Hayes RJ, Moulton LH. 2009. Cluster Randomised Trials. Boca Raton, FL: CRC Press/Chapman & Hall.
Hernández AV, Steyerberg EW, Habbema JDF. 2004. Covariate adjustment in randomized controlled trials with dichotomous outcomes increases statistical power and reduces sample size requirements. J Clin Epidemiol. 57(5):454-460. doi: 10.1016/j.jclinepi.2003.09.014. PMID: 15196615.
Hernández AV, Eijkemans MJ, Steyerberg EW. 2006. Randomized controlled trials with time-to-event outcomes: how much does prespecified covariate adjustment increase power? Ann Epidemiol. 16(1):41-8. doi: 10.1016/j.annepidem.2005.09.007. PMID: 16275011.
Ivers NM, Halperin IJ, Barnsley J, et al. 2012. Allocation techniques for balance at baseline in cluster randomized trials: a methodological review. Trials. 13:120. doi: 10.1186/1745-6215-13-120. PMID: 22853820.
Ivers NM, Taljaard M, Dixon S, et al. 2011. Impact of CONSORT extension for cluster randomised trials on quality of reporting and study methodology: review of random sample of 300 trials, 2000-8. BMJ. 343:d5886. doi: 10.1136/bmj.d5886. PMID: 21948873.
Kahan BC, Jairath V, Doré CJ, Morris TP. 2014. The risks and rewards of covariate adjustment in randomized trials: an assessment of 12 outcomes from 8 studies. Trials. 15(1):139. doi: 10.1186/1745-6215-15-139. PMID: 24755011.
Li F, Lokhnygina Y, Murray DM, Heagerty PJ, DeLong ER. 2016. An evaluation of constrained randomization for the design and analysis of group-randomized trials. Stat Med. 35(10):1565-1579. doi: 10.1002/sim.6813. PMID: 26598212.
Li F, Turner EL, Heagerty PJ, Murray DM, Vollmer WM, DeLong ER. 2017. An evaluation of constrained randomization for the design and analysis of group-randomized trials with binary outcomes. Stat Med. 36(24):3791-3806. doi: 10.1002/sim.7410. PMID: 28786223.
Moulton LH. 2004. Covariate-based constrained randomization of group-randomized trials. Clin Trials. 1(3):297-305. doi: 10.1191/1740774504cn024oa. PMID: 16279255.
Perry M, Faes M, Reelick MF, Olde Rikkert MG, Borm GF. 2010. Studywise minimization: a treatment allocation method that improves balance among treatment groups and makes allocation unpredictable. J Clin Epidemiol. 63(10):1118-22. doi: 10.1016/j.jclinepi.2009.11.014. PMID: 20304606.
Raab GM, Butcher I. 2001. Balance in cluster randomized trials. Stat Med. 20(3):351-65. doi: 10.1002/1097-0258(20010215)20:3<351::aid-sim797>3.0.co;2-c. PMID: 11180306.
Raab GM, Day S, Sales J. 2000. How to select covariates to include in the analysis of a clinical trial. Control Clin Trials. 21(4):330-42. doi: 10.1016/s0197-2456(00)00061-1. PMID: 10913808.
Silipo R, Widmann M. 2019. 3 new techniques for data-dimensionality reduction in machine learning. The New Stack. https://thenewstack.io/3-new-techniques-for-data-dimensionality-reduction-in-machine-learning/. Published August 9, 2019. Accessed July 19, 2022.
Watson SI, Girling A, Hemming K. 2020. Design and analysis of three-arm parallel cluster randomized trials with small numbers of clusters. Stat Med. 40(5):1133-46. doi: 10.1002/sim.8828. PMID: 33258219.
Xiao L, Lavori PW, Wilson SR, Ma J. 2011. Comparison of dynamic block randomization and minimization in randomized trials: a simulation study. Clin Trials. 8(1):59-69. doi: 10.1177/1740774510391683. PMID: 21335590.
Yu H, Li F, Gallis JA, Turner EL. 2019. cvcrand: A package for covariate-constrained randomization and the clustered permutation test for cluster randomized trials. R J. 11(2):1-14. doi: 10.32614/RJ-2019-027.
Zhou Y, Turner EL, Simmons RA, Li F. 2022. Constrained randomization and statistical inference for multi-arm parallel cluster randomized controlled trials. Stat Med. 41(10):1862-83. doi: 10.1002/sim.9333. PMID: 35146788.