Using Electronic Health Record Data in Pragmatic Clinical Trials

Section 8 Assessing Outcomes

Because people receive health care from a variety of sources, it is often necessary to collect data from multiple providers in order to adequately determine whether patient outcomes have changed as a result of the intervention. Some considerations when defining outcomes based on the EHR are discussed in the Living Textbook Chapter Choosing and Specifying Endpoints and Outcomes. Key questions from that chapter include:

Is the outcome medically significant such that a patient would seek care?
How will the endpoint be medically attended or documented?
Does it require hospitalization?
Is the treatment for the outcome generally inpatient or outpatient?
- Outpatient events may not be coded very specifically in the EHR or claims.
What is the intensity of medical care?
- If high, as with a myocardial infarction, then there will be a clear record in claims and/or EHR data.
- If low, as with the gout example, there may or may not be a record of the event. One solution is to use a patient-reported outcome (PRO) and reach out to the participant at specified intervals.

For more, read the full chapter.

Strategies to mitigate the incomplete capture of patient health care services and follow-up include:

Patient-targeted prospective (new) data collection, such as online or telephone surveys regarding patient-reported or patient-centered outcomes, using dedicated research staff
Linkage to other sources, such as insurance claims

Collecting Patient-Reported Outcomes in the EHR

Investigators from six of the NIH Pragmatic Trials Collaboratory’s PCTs shared lessons learned from challenges they encountered while collecting patient-reported outcome (PRO) measures during their trials and the tactics used to mitigate them.

PRO measures reflect meaningful aspects of health and provide information about outcomes that are experienced uniquely by the patient, such as pain intensity, fatigue, and satisfaction with social roles. The authors of a paper on utilizing PROs in the EHR articulate case examples for each of the challenges encountered (Zigler et al 2024), as summarized below.

Case Examples

Healthcare systems do not collect the necessary PRO measures for research.
- Recommendation: Recognize differences in system priorities and facilitate discussion within the healthcare systems to understand how changes would impact their work.
Healthcare systems are complex and often have “information overload.”
- Recommendation: The EHR should display only relevant, actionable information to providers rather than raw PROM data.
Healthcare systems have unique processes, cost structures, and timelines for prioritization. Additionally, health system and health system IT timelines often don’t match grant timelines.
- Recommendation: Create temporary solutions for gathering the necessary PRO data before integration with the EHR.
Data entry is an additional burden on clinical staff, and care teams may not see the benefit.
- Recommendation: Utilize electronic PRO measures as much as possible to lift some of the data entry burden from care teams and demonstrate their value by using them during specific encounters.
Scores must be interpretable; clinicians don’t always know what is clinically significant.
- Recommendation: Ensure the PRO measures are meaningful, and the results can be interpreted by clinicians and patients alike.
Low adoption and reach of technology such as personal health records (PHRs) in low-resource settings such as safety net community health clinics (CHCs).
- Recommendation: Utilize more familiar technologies, such as bidirectional SMS messaging that can connect patients to health services.
PRO measures are typically chosen based on their specific intended use, so the “best” PROM for an electronic pragmatic clinical trial would likely be less optimal in a clinical care setting.
- Recommendation: Clinicians and patients should work together to identify the PRO measures best suited to support their individual treatment goals.

The authors of this paper suggest that these barriers require study teams to use separate data collection systems or integrate externally collected PRO data into the electronic health record.

“When using patient-reported outcome measures for embedded pragmatic clinical trials investigators must make important decisions about whether to use data collected from the participating health system’s electronic health record, integrate externally collected patient-reported outcome data into the electronic health record, or collect these data in separate systems for their studies” (Zigler et al 2024).

Case Example: Assessing Outcomes

The goal of the Collaborative Care for Chronic Pain in Primary Care (PPACT) study was to enable patients to adopt self-management skills for chronic pain, limit use of opioid medications, and identify factors amenable to treatment in the primary care setting. Investigators needed patient-reported outcome (PRO) data for their primary endpoints. The study was conducted in three distinct regions of Kaiser Permanente: the Northwest, Hawaii, and Georgia. Investigators determined that the PRO data collected via standard clinical practices in each region were not sufficient to meet the needs of the project. To address this, project leadership worked with national Kaiser to create buy-in for using a common instrument across the regions, and local IT then built it within each region. In addition, a multi-tiered approach was developed to supplement the clinically collected PRO data at four project-required time points (3, 6, 9, and 12 months). Two tiers were within the clinical system: a secure email with an attached survey was sent from the EHR, followed by an automated interactive voice response (IVR) phone call. A follow-up phone call by research staff was necessary to maximize data collection at each time point. These follow-up calls were consistent with the standard clinical practice of having medical assistant staff follow up with patients via phone calls.

While data linkage may appear to be a ‘pragmatic’ option for including data from multiple providers, there is still no guarantee that all patient outcome data can be accessed and included in the PCT. Linkage with other data sources fills in gaps only to the extent that those sources can provide complete data on care provided to the trial population throughout the given study time period. This may not always be true—especially for transient populations or areas where access and benefits for health care are variable and dynamic.

Endpoints

EHR data can be used to define outcome measures and endpoints for pragmatic research. The AHRQ-sponsored Outcome Measures Framework provides specifications and guidance to support standard approaches for querying heterogeneous and highly granular EHR data for calculated and derived measures that can be used to measure patient outcomes and study endpoints (Leavy et al 2019).

Longitudinal Data Linkage

Compounding the difficulties described above, care systems are highly fragmented, and the EHR from one system will lack information about care a patient received outside that system. To capture all the care that a patient receives (that is, complete longitudinal data), linking research data to insurance claims or other real-world data sources may be necessary. For example, if the outcome being measured is myocardial infarction, the system’s EHR will capture information about the event only if the patient was treated for the myocardial infarction at a participating health system. If the patient travels and is treated in a different facility, then the only way to capture this information is through linkage with insurance claims or data from the treating health system.

The United States still lacks a universal patient identifier (Carpenter and Chute 1993), which means efforts to link patients rely on techniques like privacy-preserving record linkage (PPRL), which match patients based on encrypted combinations of patient identifiers (e.g., first name + last name + date of birth + current zip code) (Marsolo et al 2023). The application of PPRL methods in research is often referred to as “tokenization” because the methods generate encrypted tokens that are used to match patients across sources. In recent years, vendors have begun to provide tokenization services that allow trial data to be linked with real-world data sources like EHRs and claims. Patient consent is usually required, and there are governance considerations in accessing the linked sources, but tokenization remains an option for obtaining additional information on participants.
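The idea behind tokenization can be illustrated with a minimal Python sketch. It is not the method used by any particular vendor or network: it assumes a keyed hash (HMAC-SHA256) as a stand-in for the encryption step, a shared secret key held by a trusted party, and hypothetical function names (normalize, make_token, link_by_token). Production PPRL tools typically generate several tokens per patient from different identifier combinations to tolerate typos and address changes.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Lowercase, strip accents, and drop non-alphanumeric characters."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def make_token(first: str, last: str, dob: str, zip_code: str, secret_key: bytes) -> str:
    """Return a keyed hash of the combined, normalized identifiers.

    Assumption: HMAC-SHA256 stands in for whatever scheme a real service uses.
    """
    message = "|".join(normalize(part) for part in (first, last, dob, zip_code))
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_by_token(trial_tokens: dict, claims_tokens: dict) -> dict:
    """Map trial participant IDs to claims member IDs that share a token."""
    claims_by_token = {token: member_id for member_id, token in claims_tokens.items()}
    return {
        pid: claims_by_token[token]
        for pid, token in trial_tokens.items()
        if token in claims_by_token
    }
```

Because each site computes tokens locally with the same key and normalization rules, only the tokens (not names or dates of birth) need to be exchanged to find overlapping patients.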

Even with tokenization techniques, there are some issues to consider with linkage. For instance, although claims data may indicate that care was provided for a given patient for a certain condition, it might not contain the necessary medical detail needed for a particular study (e.g., vitals readings such as blood pressure; lab results). Also, if a patient does not have or loses insurance coverage or has inadequate coverage, there might not even be a claims record available for their visits from the sources that are available to link.
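These limitations can be made concrete with a short, hypothetical Python sketch of outcome ascertainment for one participant. It assumes event dates have already been extracted from the EHR and from linked claims, and it flags whether claims coverage spans the full study window, since absence of a claim is only informative while the participant is covered. The function and field names are illustrative, and for simplicity the coverage check requires a single span covering the whole window rather than merging adjacent spans.

```python
from datetime import date


def ascertain_outcome(ehr_events, claims_events, coverage, study_start, study_end):
    """Summarize one participant's outcome status from EHR and linked claims.

    ehr_events / claims_events: lists of event dates for the outcome of interest.
    coverage: list of (start, end) date pairs during which claims are available.
    """
    # Combine both sources, drop duplicate dates, and keep events in the window.
    in_window = sorted(
        d for d in set(ehr_events) | set(claims_events)
        if study_start <= d <= study_end
    )
    # Claims can only rule out outside events while the participant is covered.
    fully_covered = any(
        start <= study_start and end >= study_end for start, end in coverage
    )
    return {
        "had_event": bool(in_window),
        "first_event": in_window[0] if in_window else None,
        "claims_complete": fully_covered,
    }
```

A participant with no recorded event and claims_complete set to False should be treated as having unknown rather than negative outcome status for the uncovered period.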

Is there a need for vital statistics or mortality data from a state health department (e.g., birth data, death data, cancer repositories), or external claims data repositories?
Are data from organizations such as the state or insurers needed?

The challenge of getting claims data is described in the example below.

Case Example: Longitudinal Data Linkage

The Aspirin Dosing: A Patient-Centric Trial Assessing Benefits and Long-term Effectiveness (ADAPTABLE) study was a large, pragmatic trial designed to determine the optimal dose of maintenance aspirin for patients with coronary artery disease, and it enrolled approximately 15,000 patients (Jones et al 2021). Patients were referred to the study by their physician and enrolled through an online portal, which also provided online consent and randomization. Each patient was given a unique identifier in the EHR, and data were collected during routine care.

However, to accurately assess outcomes—such as myocardial infarction, mortality, hospitalizations—and capture longitudinal information about all the care that each patient receives, claims data needed to be linked to study data.

How claims data were linked in ADAPTABLE depended on the type of insurer.

Centers for Medicare and Medicaid Services (CMS)

Patients on Medicare provided the last four digits of their Social Security number, date of birth (DOB), and sex through the consent form; this information enabled linkage of the majority of participants to CMS beneficiary files. Without the explicit consent provided through the online portal, linking the CMS data to data in the EHR would have required health systems to allow access to a patient’s date of birth, Medicare ID, and sex, which are considered protected health information (PHI).
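A deterministic match on these three fields might look like the following Python sketch. It is an illustration only, not the procedure used in ADAPTABLE: the record layout and the names link_to_beneficiaries, study_id, and bene_id are assumptions. The sketch links a participant only when exactly one beneficiary shares the same last four Social Security digits, date of birth, and sex, and leaves ambiguous or unmatched participants unlinked.

```python
from collections import defaultdict


def link_to_beneficiaries(participants, beneficiaries):
    """Link on (ssn_last4, dob, sex); keep only unambiguous matches."""
    # Index beneficiary IDs by the three-field key.
    index = defaultdict(list)
    for b in beneficiaries:
        index[(b["ssn_last4"], b["dob"], b["sex"])].append(b["bene_id"])

    linked, unlinked = {}, []
    for p in participants:
        candidates = index.get((p["ssn_last4"], p["dob"], p["sex"]), [])
        if len(candidates) == 1:
            linked[p["study_id"]] = candidates[0]
        else:
            # No candidate, or several candidates share the same key.
            unlinked.append(p["study_id"])
    return linked, unlinked
```

Requiring a unique candidate trades some linkage yield for a lower risk of attaching another person's claims to a participant.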

The ADAPTABLE investigators, with funding from PCORI, engaged two large, national insurers (Humana and Blue Cross Blue Shield Anthem) to support record linkage for participating members.

Some issues investigators had with linking to large insurers are described below.

Health insurance companies do not collect social security numbers, and the team needed both the group number and the member number to accurately identify individuals.
Some insurers required modification of consent language to include explicit authorization for the insurer to release certain records.
Large, national insurers often have subsidiaries (smaller groups with different names), although they all fall under the same umbrella company.

For patients with other types of insurance, the follow-up strategy was multi-tiered:

Participants were asked to log in to the portal and enter information.
If the patient did not go back to the portal, the call center at the Duke Clinical Research Institute called for follow-up.

All of the above are complicated topics that require a set of multi-disciplinary experts with solutions that are customized to the specific trial (scientific question, study aims, and specific research settings). Because of these complexities, EHRs are rarely sufficient by themselves to support the needs of PCTs. Even though EHRs can be extremely valuable resources for clinical trials (and for some trials, EHRs may be essential), there are many steps and factors to consider, as described in previous sections.

REFERENCES

Carpenter PC, Chute CG. 1993. The Universal Patient Identifier: a discussion and proposal. Proc Annu Symp Comput Appl Med Care. 49–53. PMID: 8130521.

Dusetzina SB, Tyree S, Meyer A-M, Meyer A, Green L, Carpenter WR. 2014. Linking Data for Health Services Research: A Framework and Instructional Guide. (Prepared by the University of North Carolina at Chapel Hill under Contract No. 290-2010-000141.) AHRQ Publication No. 14-EHC033-EF. Rockville, MD: Agency for Healthcare Research and Quality. www.effectivehealthcare.ahrq.gov/reports/final.cfm. PMID: 25392892.

Jones WS, Mulder H, Wruck LM, et al. 2021. Comparative Effectiveness of Aspirin Dosing in Cardiovascular Disease. N Engl J Med. 384:1981–1990. doi: 10.1056/NEJMoa2102137. PMID: 33999548.

Leavy MB, Schur C, Kassamali FQ, et al. 2019. Development of Harmonized Outcome Measures for Use in Patient Registries and Clinical Practice: Methods and Lessons Learned. Agency for Healthcare Research and Quality (AHRQ). https://effectivehealthcare.ahrq.gov/topics/registry-of-patient-registries/standardized-library. Accessed March 5, 2022. doi:10.23970/AHRQEPCLIBRARYFINALREPORT.

Marsolo K, Kiernan D, Toh S, et al. 2023. Assessing the impact of privacy-preserving record linkage on record overlap and patient demographic and clinical characteristics in PCORnet®, the National Patient-Centered Clinical Research Network. J Am Med Inform Assoc. 30(3):447-455. doi: 10.1093/jamia/ocac229. PMID: 36451264.

Zigler CK, Adeyemi O, Boyd AD, et al. 2023 Dec. Collecting patient-reported outcome measures in the electronic health record: Lessons from the NIH pragmatic trials Collaboratory. Contemporary Clinical Trials. :107426. doi:10.1016/j.cct.2023.107426.

Version History

October 7, 2025: Updated text as part of annual review (changes made by K. Staman).

April 16, 2024: Added information from Zigler et al. paper (changes made by K. Staman).

August 26, 2022: Updated text as part of annual update (changes made by K. Staman).

July 3, 2020: Minor corrections to layout and formatting (changes made by D. Seils).

November 30, 2018: Updated text as part of annual update (changes made by K. Staman).

Published August 25, 2017

Rethinking Clinical Trials

A Living Textbook of Pragmatic Clinical Trials