Using Electronic Health Record Data in Pragmatic Clinical Trials

Section 1



Rachel Richesson, MS, PhD, MPH

Richard Platt, MD, MSc

Gregory Simon, MD, MPH

Lesley Curtis, PhD

Reesa Laws, BS

Adrian Hernandez, MD, MSH

Jon Puro, MPA-HA

Doug Zatzick, MD

Erik van Eaton, MD, FACS

Vincent Mor, PhD


Contributing Editor

Karen Staman, MS

Some material in this chapter is based on the Acquiring and Using Electronic Health Record Data chapter originally written by Meredith Zozus et al.


Using electronic health record (EHR) data for research is fundamentally different from collecting research data prospectively, as is traditional for controlled clinical trials. Several features of EHR systems create these differences, the most important being the lack of investigator control over data collection and recording processes in health care facilities. Other factors include the lack of standard definitions for identifying patient cohorts and study-specific outcomes, the challenges associated with completeness of longitudinal data, and potential errors in linkage of records across systems. Together, these factors challenge investigators to assure and demonstrate that data are of adequate quality to support research conclusions. While many of the issues addressed in this chapter apply to a broad range of study designs that might use EHR data, this chapter describes the use cases and associated challenges for using EHR data in pragmatic clinical trials (PCTs), particularly those that include randomization. Specifically, we will discuss:

  • Developing and refining the research question and defining the data that are essential and necessary to answer that question
  • Data sources for explanatory trials vs PCTs
  • The role of data as a partial representation of (or surrogate for) clinical phenomena under investigation
  • Considerations for the use of EHR data, including understanding bias and provenance, completeness and other dimensions of data quality, and methods for linking between multiple data sources

Data Sources for Explanatory Trials vs PCTs

Using data collected within an EHR system for research contrasts markedly with using data that were collected outside of an EHR explicitly for a trial. In a traditional trial, the study protocol specifies the data to be collected, and they are collected through a separate, stand-alone system. The circumstances around data collection, including procedures for taking samples, making observations, and recording data (e.g., patient positioning, timing, and anatomical location), are clearly defined in the protocol, and data are collected accordingly. Further, in traditional research, the protocol defines the timing of data relative to the trial milestones or activities, for example, “the second assessment occurs 14 days post baseline.” In designing traditional or explanatory studies, a top-down approach is usually taken, starting with the research question and working down to the required data.

In contrast, the use of existing data streams, a defining feature of pragmatic clinical trials, presents a number of issues and requires a different approach from that used in traditional explanatory clinical trials. Data captured in EHRs during routine care, or drawn from insurance claims, have a different context from prospectively collected research data. While the context of care and data collection is often unspecified, it is certainly not defined around a research question or protocol. Similarly, the structure and representation of clinical data are imposed by the facility according to its standards for clinical documentation and business needs rather than by the needs of the research study. This structure, along with local context, record linkage considerations, the use of diagnosis codes, and other factors, brings substantial and unique challenges for using data from EHR systems in research.
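The timing contrast described above can be illustrated with a minimal sketch. It assumes a hypothetical patient whose blood pressure was recorded only at routine-care visits, and a hypothetical helper, nearest_in_window, that looks for the reading closest to a protocol-style target day. The readings, dates, function name, and tolerance values are all invented for illustration; this is not a method prescribed by the chapter.

```python
from datetime import date


def nearest_in_window(observations, baseline, target_day, tolerance_days):
    """Return the (date, value) pair closest to baseline + target_day,
    or None if no observation falls within +/- tolerance_days of it."""
    best = None
    best_gap = None
    for obs_date, value in observations:
        # Distance, in days, between this visit and the protocol target day.
        gap = abs((obs_date - baseline).days - target_day)
        if gap <= tolerance_days and (best_gap is None or gap < best_gap):
            best, best_gap = (obs_date, value), gap
    return best


# Hypothetical routine-care systolic blood pressure readings for one patient.
ehr_readings = [
    (date(2018, 3, 1), 142),
    (date(2018, 3, 9), 138),
    (date(2018, 3, 19), 131),
    (date(2018, 4, 20), 127),
]
baseline = date(2018, 3, 1)

# A tight window around day 14 finds no routine-care reading at all.
strict = nearest_in_window(ehr_readings, baseline, target_day=14, tolerance_days=3)

# A wider window finds one, at the cost of looser timing.
relaxed = nearest_in_window(ehr_readings, baseline, target_day=14, tolerance_days=7)

print(strict)
print(relaxed)
```

In a traditional trial the day-14 assessment would exist because the protocol required it. With routine-care data, the investigator has to choose how wide a window to accept and must account for patients who have no qualifying observation.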




Richesson R, Platt R, Simon G, et al. Using Electronic Health Record Data in Pragmatic Clinical Trials: Introduction. In: Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials. Bethesda, MD: NIH Health Care Systems Research Collaboratory. Available at: Updated September 21, 2018. DOI: 10.28929/030.