Analysis Plan

Section 2 Intraclass Correlation

A primary driver of whether a study should randomize at a cluster level is the intraclass correlation coefficient (ICC). (For other considerations in choosing between cluster and individual randomization, see Choosing Between Cluster and Individual Randomization in the Experimental Designs and Randomization Schemes chapter of the Living Textbook.) The ICC is a measure of how similar the outcomes of individuals within a cluster are likely to be, relative to those of other clusters. For example, an intervention designed to enhance medication adherence might be implemented in several communities, where individuals of any single community might belong to the same socioeconomic group and behave similarly in terms of ability to pay for and remember to take their medication. Hence, if the primary outcome of the study is a measure of compliance, there is likely to be substantial intraclass correlation.

The ICC is closely linked to the sample size necessary to conduct the trial with adequate statistical power. The ICC ranges from completely correlated (ICC = 1) to no correlation (ICC = 0). In the extreme case of an ICC of 1, all participants in a cluster are likely to have exactly the same outcome; thus, sampling 1 participant from that cluster is as informative as sampling the whole cluster. In other words, each cluster contributes a single data point to the study, and the effective sample size for the study is the number of clusters. On the other hand, if participants in a cluster behave essentially independently of each other and their outcomes are no more related than if they were from different clusters, then the ICC is 0 and the available sample size for the study is the total number of participants. There is a substantial literature on taking account of the ICC when determining sample size requirements for CRTs. It is critical to have preliminary data that provide some estimate of the likely ICC.
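The two extremes described above can be illustrated with the standard design effect for equal-sized clusters, 1 + (m - 1) × ICC, where m is the number of participants per cluster. The following is a minimal sketch, not a full sample size calculation: the function names and the example figures (20 clusters of 50 participants) are illustrative assumptions, and the formula assumes equal cluster sizes.

```python
def design_effect(cluster_size: float, icc: float) -> float:
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must be between 0 and 1")
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_clusters: int, cluster_size: float, icc: float) -> float:
    """Total number of participants divided by the design effect."""
    total_participants = n_clusters * cluster_size
    return total_participants / design_effect(cluster_size, icc)


# Hypothetical trial: 20 clusters of 50 participants each (1000 in total).
for icc in (0.0, 0.05, 1.0):
    ess = effective_sample_size(n_clusters=20, cluster_size=50, icc=icc)
    print(f"ICC = {icc:.2f}: effective sample size is about {ess:.0f}")
```

With an ICC of 1 the effective sample size equals the number of clusters, and with an ICC of 0 it equals the total number of participants. Even a modest ICC between these extremes reduces the effective sample size considerably when clusters are large.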

The level at which to cluster creates a trade-off between potential contamination and bias on the one hand and available sample size on the other. For example, in one NIH Pragmatic Trials Collaboratory Trial, the researchers originally intended to randomize at the provider level. However, because the providers shared clinic staff and facilities, a preliminary evaluation of the ICC demonstrated more than negligible correlation in the potential outcomes among providers within the same clinic. The researchers therefore needed to randomize at the clinic level rather than the provider level and to recruit additional clinics to meet their expectations for statistical power. In contrast, another NIH Collaboratory Trial study team intended to randomize at the clinic level but found negligible correlation in outcomes between providers within clinics and were able to randomize providers.

Much depends on the type of intervention, and it is always prudent to obtain a preliminary estimate of the ICC before planning the study.
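One common way to obtain such a preliminary estimate from pilot data is the one-way analysis of variance (ANOVA) estimator of the ICC. The sketch below assumes a continuous outcome and pilot data organized as one list of outcomes per cluster. The function name and the example data are hypothetical, and other estimators (for example, from a fitted mixed model) may be preferable in practice.

```python
from statistics import mean


def anova_icc(clusters: list[list[float]]) -> float:
    """One-way ANOVA estimate of the ICC from per-cluster outcome lists."""
    k = len(clusters)
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    if k < 2 or min(sizes) < 1 or n_total <= k:
        raise ValueError("need at least 2 non-empty clusters and more observations than clusters")

    grand_mean = sum(sum(c) for c in clusters) / n_total
    cluster_means = [mean(c) for c in clusters]

    # Between-cluster and within-cluster mean squares.
    msb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, cluster_means)) / (k - 1)
    msw = sum((y - m) ** 2 for c, m in zip(clusters, cluster_means) for y in c) / (n_total - k)

    # Adjusted average cluster size, which allows for unequal cluster sizes.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    denominator = msb + (n0 - 1) * msw
    if denominator == 0:
        raise ValueError("no variation in the outcomes; the ICC is undefined")
    return (msb - msw) / denominator


# Hypothetical pilot data: outcomes for three clinics.
pilot = [[4.1, 3.9, 4.3], [5.0, 5.2, 4.8, 5.1], [3.2, 3.5, 3.1]]
print(f"Estimated ICC: {anova_icc(pilot):.2f}")
```

This estimator can return a negative value when outcomes vary more within clusters than between them; such estimates are usually truncated at 0 for planning purposes. Estimates from small pilot samples are imprecise, so it is sensible to consider a range of plausible ICC values rather than a single number.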

Accounting for the ICC in the Analysis

When individual outcomes are recorded within a cluster, it is important to understand how the outcomes relate to the primary hypothesis of the study and what exactly is to be tested and/or estimated. If conclusions are to be made at a cluster level, the individual outcomes might be accumulated into a summary measure for each cluster and the analysis performed at the cluster level. This approach eliminates the need to address the ICC directly, but it assumes that either there is little variation in cluster sizes or the size of the cluster is relatively unimportant. The analysis will place equal weight on each cluster, regardless of its size, unless a weighting mechanism is used.
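The effect of weighting in a cluster-level analysis can be seen in a small sketch. It reduces each cluster to a summary measure (here, the cluster mean) and then averages those summaries either equally or in proportion to cluster size. The function name and the data are hypothetical, and a real analysis would go on to compare the summaries between trial arms.

```python
from statistics import mean


def arm_summary(clusters: list[list[float]], weight_by_size: bool = False) -> float:
    """Average of cluster means within one arm, optionally weighted by cluster size."""
    sizes = [len(c) for c in clusters]
    cluster_means = [mean(c) for c in clusters]
    if weight_by_size:
        # Larger clusters contribute more to the summary.
        return sum(n * m for n, m in zip(sizes, cluster_means)) / sum(sizes)
    # Every cluster contributes equally, regardless of its size.
    return mean(cluster_means)


# Hypothetical arm with one small and one large cluster (1 = adherent, 0 = not adherent).
arm = [[1, 1], [0, 0, 0, 0, 0, 0]]
print(arm_summary(arm))                       # equal weight per cluster
print(arm_summary(arm, weight_by_size=True))  # weight proportional to cluster size
```

When cluster sizes differ as much as they do in this example, the two summaries diverge noticeably, which is why the choice of weighting should follow from whether conclusions are intended at the cluster level or the individual level.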

For analyses performed at the individual level, random-effects models or generalized estimating equations are typically used. Again, the interpretation of the results hinges on the type of analysis, and it is important for the investigators to discuss their hypotheses clearly with the study statisticians.

Resources

The Intraclass Correlation Coefficient (ICC)
Guidance document from the Biostatistics and Study Design Core

Intraclass Correlation Coefficient Cheat Sheet
Introductory description of the ICC

The Intraclass Correlation Coefficient as a Pie Eating Contest
Video tutorial by NIH Collaboratory investigator Dr. Greg Simon

Intraclass Correlation Coefficients for Cluster Randomized Trials With Pain Outcomes
Working document from the NIH-DOD-VA Pain Management Collaboratory Biostatistics/Design Work Group

Version History

June 23, 2022: Updated the name of the NIH Collaboratory in the contributors list, added an item to the Resources sidebar, and made nonsubstantive changes to the text as part of the annual content update (changes made by D. Seils).

July 2, 2020: Minor corrections to layout and formatting (changes made by D. Seils).

May 27, 2020: Added Heagerty to the contributors list and reordered the sections of this chapter as part of the annual content update (changes made by D. Seils).

May 1, 2020: Made nonsubstantive changes to the Resources sidebar as part of the annual content update (changes made by D. Seils).

January 16, 2019: Embedded a video on intraclass correlation, added a resource to the Resource box, and made nonsubstantive changes to the text as part of the annual content update (changes made by D. Seils).

Published August 25, 2017

Rethinking Clinical Trials

A Living Textbook of Pragmatic Clinical Trials