Assessing Outcomes

Using Electronic Health Record Data in Pragmatic Clinical Trials

Section 7


Rachel Richesson, MS, PhD, MPH

Richard Platt, MD, MSc

Gregory Simon, MD, MPH

Lesley Curtis, PhD

Reesa Laws, BS

Adrian Hernandez, MD, MSH

Jon Puro, MPA-HA

Doug Zatzick, MD

Erik van Eaton, MD, FACS

Vincent Mor, PhD


Contributing Editor

Karen Staman, MS

Because people receive health care from a variety of sources, it is often necessary to collect data from multiple providers in order to adequately determine whether patient outcomes have changed as a result of the intervention. Some considerations when defining outcomes based on the EHR are discussed in the Living Textbook Chapter Choosing and Specifying Endpoints and Outcomes. Key questions from that chapter include:

  • Is the outcome medically significant such that a patient would seek care?
  • How will the endpoint be medically attended or documented?
  • Does it require hospitalization?
  • Is the treatment for the outcome generally inpatient or outpatient?
    • Outpatient events may not be coded very specifically in the EHR or claims.
  • What is the intensity of medical care?
    • If high, as with a myocardial infarction, then there will be a clear record in claims and/or EHR data.
    • If low, as with the gout example, there may or may not be a record of the event. One solution is to use a patient-reported outcome (PRO) and contact the participant at specified intervals.

For more, read the full chapter.

Strategies to mitigate the incomplete capture of patient health care services and follow-up include:

  • Patient-targeted prospective (new) data collection, such as online or telephone surveys regarding patient-reported or patient-centered outcomes, using dedicated research staff
  • Linkage to other sources, such as insurance claims

Case Example: Assessing Outcomes

The goal of the Collaborative Care for Chronic Pain in Primary Care (PPACT) Study (NCT02113592) is to enable patients to adopt self-management skills for chronic pain, limit use of opioid medications, and identify factors amenable to treatment in the primary care setting. Investigators needed patient-reported outcome (PRO) data for their primary endpoints. The study is being conducted in three distinct regions of Kaiser Permanente: the Northwest, Hawaii, and Georgia. Investigators determined that the PRO data collected via standard clinical practices in each region were not sufficient to meet the needs of the project. To address this, project leadership worked with national Kaiser to create buy-in for using a common instrument across the regions, and local IT then built it within each region. In addition, a multi-tiered approach was developed to supplement the clinically collected PRO data at four project-required time points (3, 6, 9, and 12 months). Two tiers were within the clinical system: a secure email with an attached survey was sent from the EHR, followed by an automated interactive voice response (IVR) phone call. A follow-up phone call by research staff was then necessary to maximize data collection at each time point. These follow-up calls were consistent with the standard clinical practice of having medical assistant staff follow up with patients by phone.

While data linkage may appear to be a ‘pragmatic’ option for including data from multiple providers, there is still no guarantee that all patient outcome data can be accessed and included in the PCT. Linkage with other data sources fills in gaps only to the extent that those sources can provide complete data on care provided to the trial population throughout the given study time period. This may not always be true—especially for transient populations or areas where access and benefits for health care are variable and dynamic.
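One way to check how far a linked source can fill gaps is to measure, for each participant, what fraction of the study follow-up window is covered by enrollment in that source. The following is a minimal sketch under stated assumptions: the function name `coverage_fraction` and the representation of enrollment as (start, end) date pairs are illustrative and are not taken from any trial described here.

```python
from datetime import date, timedelta


def coverage_fraction(spans, start, end):
    """Fraction of days in [start, end] covered by at least one enrollment span.

    spans: iterable of (span_start, span_end) date pairs, inclusive.
    This layout is a hypothetical one chosen for illustration.
    """
    window_days = (end - start).days + 1
    # Keep only spans that touch the window, clipped to the window edges.
    clipped = sorted(
        (max(s, start), min(e, end)) for s, e in spans if s <= end and e >= start
    )
    covered = 0
    last_counted = start - timedelta(days=1)  # last day already counted
    for s, e in clipped:
        # Skip any days already counted by an earlier, overlapping span.
        s = max(s, last_counted + timedelta(days=1))
        if e >= s:
            covered += (e - s).days + 1
            last_counted = e
    return covered / window_days


# Example: a participant with a mid-year gap in coverage.
spans = [
    (date(2016, 6, 1), date(2017, 3, 31)),
    (date(2017, 3, 1), date(2017, 4, 30)),
    (date(2017, 10, 1), date(2018, 2, 1)),
]
fraction = coverage_fraction(spans, date(2017, 1, 1), date(2017, 12, 31))
```

Participants with a low fraction are those for whom the linked source cannot be assumed to show all outcomes, so an absence of claims for them should not be read as an absence of events.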

Longitudinal Data Linkage

Compounding the difficulties described above, care systems are highly fragmented, and the EHR from one system will lack information about care that a patient received outside that system. To capture all the care that a patient receives (that is, complete longitudinal data), linking research data to insurance claims data may be necessary. For example, if the outcome being measured is myocardial infarction, the system’s EHR will capture information about the event only if the patient was treated for the myocardial infarction at a participating health system. If the patient travels and is treated in a different facility, then the only way to capture this information is through insurance claims data. Unless the patient gives explicit consent, getting longitudinal data from an insurance carrier can be an insurmountable hurdle, both technically and legally. The lack of a universal patient identifier (Carpenter and Chute 1993) means that various linkage techniques (e.g., probabilistic matching; Dusetzina et al. 2014) must be used, often requiring that personal or identifying data be matched across different sources to identify the same patient. In addition, although claims data may indicate that care was provided to a given patient for a certain condition, they might not contain the medical detail needed for a particular study (e.g., vital signs such as blood pressure, or lab results). Also, if a patient does not have insurance coverage, loses it, or has inadequate coverage, there might not be a claims record of their visits from any source.
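To make the idea of probabilistic linkage concrete, the sketch below scores a pair of records field by field in the style of the Fellegi-Sunter model, which is one common approach and not necessarily the one used in any trial described here. The field names, the m and u probabilities, and the two thresholds are all hypothetical values chosen for illustration; in practice they would be estimated from the data.

```python
import math

# Hypothetical parameters per field (assumptions, not estimates from real data):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "dob": (0.98, 0.003),
    "sex": (0.99, 0.5),
    "zip": (0.90, 0.05),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum of log2 likelihood ratios over the fields present in both records."""
    total = 0.0
    for field, (m, u) in params.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if a == b:
            total += math.log2(m / u)  # agreement raises the weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def classify(weight, upper=8.0, lower=0.0):
    """Three-way decision; both thresholds are illustrative assumptions."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "review"


ehr = {"last_name": "rivera", "dob": "1950-02-14", "sex": "F", "zip": "27701"}
claim_same = {"last_name": "rivera", "dob": "1950-02-14", "sex": "F", "zip": "27705"}
claim_unsure = {"last_name": "rivera", "dob": "1950-12-14", "sex": "F", "zip": None}
claim_other = {"last_name": "nguyen", "dob": "1948-07-01", "sex": "F", "zip": "27701"}
```

Rare, stable fields such as date of birth carry much more weight than a field like sex, which agrees by chance about half the time. Pairs that fall between the two thresholds are typically set aside for manual review.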

Questions to consider when planning data linkage include:

  • Is there a need for vital statistics or mortality data from a state health department (e.g., birth data, death data, cancer repositories), or external claims data repositories?
  • Are data from organizations such as the state or insurers needed?

The challenge of getting claims data is described in the example below.

Case Example: Longitudinal Data Linkage

The Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-term Effectiveness (ADAPTABLE) study is a large, pragmatic trial designed to determine the optimal dose of maintenance aspirin for patients with coronary artery disease, and is expected to enroll approximately 20,000 patients. Patients are referred to the study by their physician, and they enroll through an online portal, which also provides online consent and randomization. Each patient is given a unique identifier in the EHR, and data collected during routine care are used in the study.

However, to accurately assess outcomes—such as myocardial infarction, mortality, hospitalizations—and capture longitudinal information about all the care that each patient receives, claims data needed to be linked to study data.

How claims data are linked in ADAPTABLE depends on the type of insurer.

Centers for Medicare and Medicaid Services (CMS)

The patients on Medicare provide the last four digits of their Social Security number, date of birth (DOB), and sex through the consent form; this information enables linkage of the majority of participants to CMS beneficiary files. Without the explicit consent provided through the online portal, linking the CMS data to data in the EHR would have required health systems to allow access to a patient’s date of birth, Medicare ID, and sex, which are considered protected health information (PHI).
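A linkage of this kind can be sketched as a deterministic match on the three consented fields. The example below is an illustration only and is not the ADAPTABLE implementation: the record layout and the names `ssn_last4`, `dob`, `sex`, `study_id`, and `bene_id` are assumptions. It links a participant only when exactly one beneficiary record shares the key, which is one reason why a majority, rather than all, of the participants would be linked.

```python
from collections import defaultdict


def link_key(rec):
    """Composite key from the three consented fields (hypothetical field names)."""
    return (rec["ssn_last4"], rec["dob"], rec["sex"].upper())


def link(participants, beneficiaries):
    """Return (linked, unlinked, ambiguous) using exact matching on the key."""
    index = defaultdict(list)
    for b in beneficiaries:
        index[link_key(b)].append(b["bene_id"])

    linked, unlinked, ambiguous = {}, [], []
    for p in participants:
        hits = index.get(link_key(p), [])
        if len(hits) == 1:
            linked[p["study_id"]] = hits[0]  # unique match: accept
        elif not hits:
            unlinked.append(p["study_id"])  # no candidate in the file
        else:
            ambiguous.append(p["study_id"])  # several candidates: do not guess
    return linked, unlinked, ambiguous


participants = [
    {"study_id": "P1", "ssn_last4": "1234", "dob": "1950-02-14", "sex": "f"},
    {"study_id": "P2", "ssn_last4": "9999", "dob": "1945-01-01", "sex": "M"},
    {"study_id": "P3", "ssn_last4": "5678", "dob": "1948-07-01", "sex": "M"},
]
beneficiaries = [
    {"bene_id": "B10", "ssn_last4": "1234", "dob": "1950-02-14", "sex": "F"},
    {"bene_id": "B20", "ssn_last4": "5678", "dob": "1948-07-01", "sex": "M"},
    {"bene_id": "B21", "ssn_last4": "5678", "dob": "1948-07-01", "sex": "M"},
]
linked, unlinked, ambiguous = link(participants, beneficiaries)
```

Keeping ambiguous and unlinked participants in separate lists makes it possible to route them to another follow-up method instead of silently dropping them.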

The ADAPTABLE investigators, with funding from PCORI, engaged two large, national insurers (Humana and Blue Cross Blue Shield Anthem) to support record linkage for participating members.

Some issues investigators had with linking to large insurers are described below.

  • Health insurance companies do not collect Social Security numbers, so the team needed both the group number and the member number to accurately identify individuals.
  • Some insurers required modification of consent language to include explicit authorization for the insurer to release certain records.
  • Large, national insurers often have subsidiaries (smaller groups with different names), although they all operate under the same umbrella company.

For patients with other types of insurance, the follow-up strategy is multi-tiered:

  • The participant is asked to log in to the portal and enter information.
  • If the patient does not go back to the portal, the call center at the Duke Clinical Research Institute calls for follow-up.
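The two tiers above amount to a simple escalation rule, which might be sketched as follows. This is a hypothetical illustration, not the study's actual logic: the function name, the status labels, and especially the `grace_days` waiting period before escalating to the call center are assumptions, since the timing is not specified here.

```python
from datetime import date, timedelta


def next_followup_step(due, last_portal_entry, today, grace_days=14):
    """Decide the next follow-up action for one scheduled contact.

    due: date the follow-up is scheduled.
    last_portal_entry: date of the participant's most recent portal entry, or None.
    grace_days: ASSUMED waiting period before the call center takes over.
    """
    if last_portal_entry is not None and last_portal_entry >= due:
        return "complete"  # participant already entered information
    if today < due:
        return "not_due"
    if today <= due + timedelta(days=grace_days):
        return "portal_reminder"  # tier 1: ask the participant to log in
    return "call_center"  # tier 2: call center follows up by phone


due = date(2018, 1, 1)
step_early = next_followup_step(due, None, date(2017, 12, 15))
step_tier1 = next_followup_step(due, None, date(2018, 1, 5))
step_tier2 = next_followup_step(due, None, date(2018, 2, 1))
step_done = next_followup_step(due, date(2018, 1, 3), date(2018, 2, 1))
```

Ordering the tiers from least to most staff-intensive keeps the more costly phone contact for participants who do not respond through the portal.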

All of the above are complicated topics that require a set of multi-disciplinary experts with solutions that are customized to the specific trial (scientific question, study aims, and specific research settings). Because of these complexities, EHRs are rarely sufficient by themselves to support the needs of PCTs. Even though EHRs can be extremely valuable resources for clinical trials (and for some trials, EHRs may be essential), there are many steps and factors to consider, as described in previous sections.





Carpenter PC, Chute CG. 1993. The Universal Patient Identifier: a discussion and proposal. Proc Annu Symp Comput Appl Med Care. 49–53. PMID: 8130521.

Dusetzina SB, Tyree S, Meyer A-M, Meyer A, Green L, Carpenter WR. 2014. Linking Data for Health Services Research: A Framework and Instructional Guide. (Prepared by the University of North Carolina at Chapel Hill under Contract No. 290-2010-000141.) AHRQ Publication No. 14-EHC033-EF. Rockville, MD: Agency for Healthcare Research and Quality. PMID: 25392892.

Version History

November 30, 2018: Updated text as part of annual update (changes made by K. Staman).

Published August 25, 2017


Richesson R, Platt R, Simon G, et al. Using Electronic Health Record Data in Pragmatic Clinical Trials: Assessing Outcomes. In: Rethinking Clinical Trials: A Living Textbook of Pragmatic Clinical Trials. Bethesda, MD: NIH Health Care Systems Research Collaboratory. Available at: Updated December 3, 2018. DOI: 10.28929/036.