Over the course of the years-long pragmatic clinical trials supported by the NIH Pragmatic Trials Collaboratory, many unanticipated challenges have arisen, some with profound effects on usual care, trial implementation, data systems, and staff. These unanticipated changes threatened the trials' ability to answer the questions they were designed to address. A new chapter of the Living Textbook—Navigating the Unknown—describes these challenges and the responses of the study teams.
The chapter describes 3 general categories of challenges, each meriting a different response:
If the challenge is a local or temporary issue (for example, a pandemic temporarily shuts down in-person care, or a partnering health system dissolves or is purchased), but the question is still relevant or important and the trial is still feasible, then a workaround may solve the problem.
If the trial is no longer feasible for some reason (for example, the recruitment process is not feasible, or the intervention cannot be delivered as planned), and the question is still relevant, it is necessary to make significant changes to the protocol.
If the question is no longer relevant or important (for example, new evidence or policy changes have made it obsolete), the trial should not continue as designed. For this challenge, it may be necessary either to stop the trial or to make fundamental changes so that it addresses a different question.
The chapter describes local or temporary challenges some of the study teams faced, such as the COVID-19 pandemic, health system mergers, and changes to the electronic health record (EHR). In these cases, the research questions were still relevant and important and the trial designs were still feasible, so workarounds were created to solve the problems.
Section 2: Study teams responded to staff turnover, leadership changes, and health system acquisitions and mergers.
Section 3: Rapid technology change created unexpected consequences, such as EHR updates causing system changes that affected intervention delivery, and sites switching EHR systems, creating complexities during the trial.
Section 4: COVID-19 had significant impacts on trial activities.
Section 5 of the new chapter addresses barriers arising from aspects of the protocol that impeded recruitment, retention, or implementation in ways that imperiled the trials' ability to answer the research question. In these scenarios, researchers found it appropriate to change the protocol or research question—to pivot—in order to glean meaningful, actionable evidence.
Sections 6 and 7 describe challenges that can fall into either category 1 or 2, and investigators had to decide how to respond in real time.
Section 6: Clinical practice guidelines and policies changed due to new evidence from observational studies, small trials, and shifting expert opinion, and therefore, usual care changed.
Section 7: Quality improvement initiatives were launched to address similar problems, threatening the ability to discern differences between arms of the trial.
The NIH Pragmatic Trials Collaboratory supports pragmatic clinical trials embedded in healthcare systems to test interventions that address urgent public health problems faced by delivery systems. They involve hundreds to thousands of participants and generally include usual care as a control arm. One of the most important lessons learned through the course of these trials is that unexpected change is a given.
The NIH Pragmatic Trials Collaboratory published a new chapter in its Living Textbook of Pragmatic Clinical Trials. The chapter, Patient Engagement, describes principles and strategies for effectively engaging patient partners.
Because patients can provide valuable insights and perspectives about clinical care for specific conditions, they are key partners for pragmatic clinical trials. The chapter highlights the value of patient engagement and provides practical tactics for engaging patients throughout the life cycle of a pragmatic trial. It also identifies potential barriers to patient engagement and presents case studies in the pragmatic trial context.
The chapter provides guidance on responsible conduct of pragmatic clinical trial research involving artificial intelligence and machine learning, including navigating the institutional review board approval process, considerations for data procurement and consent, and choices regarding what data are procured and how they are used to build equity-enhancing algorithmic models.
The goal of the Final NIH Policy for Data Management and Sharing is to “maximize the appropriate sharing of scientific data. Shared data should be of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications.” The policy “does not create a uniform requirement to share all scientific data” in order to preserve “necessary flexibility,” but it makes several key suggestions, including that:
Any limitations on subsequent uses of data should be communicated to sharing platforms; and
Access to scientific data should be “controlled, even if de-identified and lacking explicit limitations on subsequent use” and the policy “strongly encourages the use of established repositories to the extent possible.”
It also emphasizes that nothing in the policy is intended to prevent sharing practices “consistent with consent practices, established norms, and applicable law” including open-sharing to speed scientific progress.
Under the NIH Pragmatic Trials Collaboratory Data Sharing Policy, investigators must share, at a minimum, a final de-identified research data set upon which the accepted primary pragmatic trial publication is based. They must also choose the least restrictive method for sharing research data that provides appropriate protection for participant privacy, health system privacy, and scientific integrity.
The goal of the HEAL Public Access and Data Sharing policy is to ensure that “underlying primary data should be made as widely and freely available as possible while safeguarding the privacy of participants and protecting confidential and proprietary data.” Like the Collaboratory policy, it defines “underlying primary data” as those used to support publications. Although not “prescriptive,” it suggests that primary data should be made “broadly available through an appropriate data repository…” It states that an “appropriate” data sharing plan includes de-identifying data (as defined by HIPAA), but that de-identified data that “contain sensitive information” be additionally deposited in controlled-access repositories. The policy does not define sensitive information, but the goal of the requirement is to provide an additional layer of protection for potentially stigmatizing information.
As described in the previous section, the FDA has provided initial guidance on assessing dataset fitness, outlining the broad concepts of relevance (eg, do the data apply to the question of interest and can they be used in the proposed analysis) and reliability (eg, are the data accurate and complete, is provenance known, and are the data traceable). It is essential to recognize that there is no single approach to assessing dataset fitness; the appropriate approach depends on the study and the context in which the data are used.
The authors of a recent empirical study conducted a series of interviews and surveys with PCT study teams to explore their concerns, practices, and decision-making around the fitness for use of real-world data (RWD), with a focus on EHR data. Their analysis showed that, while many PCTs conducted fitness-for-use assessments, fewer than a quarter did so before choosing a data source. Fitness-for-use activities, findings, and resulting study design changes were seldom publicly documented, and overall costs were barriers to assessments. From their analysis, the study authors developed considerations that could help researchers improve the characterization of RWD in PCTs, including (1) defining and articulating how study-specific fitness for use will be assessed, (2) conducting fitness-for-use assessments before the trial begins, and (3) sharing the results of fitness assessments and relevant challenges and facilitators.
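As a concrete illustration of consideration (2), a fitness-for-use assessment can begin with simple, documented completeness checks on a candidate data source. The sketch below is a minimal Python example; the field names, file name, and 80% completeness threshold are hypothetical and would be replaced by the study's own fitness-for-use criteria.

```python
# Minimal sketch of a pre-trial completeness check on a candidate EHR extract.
# Field names and the completeness threshold are hypothetical.
import pandas as pd

REQUIRED_FIELDS = ["birth_date", "sex", "primary_dx", "hba1c_result"]

def completeness_report(ehr: pd.DataFrame) -> pd.Series:
    """Fraction of records with a non-missing value for each required field."""
    return ehr[REQUIRED_FIELDS].notna().mean().sort_values()

def unfit_fields(report: pd.Series, threshold: float = 0.80) -> list[str]:
    """Fields whose completeness falls below the study-specific threshold."""
    return report[report < threshold].index.tolist()

# Usage with a hypothetical extract file:
# ehr = pd.read_csv("candidate_site_extract.csv")
# report = completeness_report(ehr)
# print(report)
# print("Below threshold:", unfit_fields(report))
```

Documenting such results before committing to a data source, as the study authors recommend, also makes the assessment easy to share.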
The following case study illustrates the challenges associated with obtaining approvals for and acquiring RWD for prospective studies (especially internationally, for US-based researchers), and with evaluating the general fitness for use of RWD for trials.
Case Study: Harmony Outcomes EHR Ancillary Study (eHARMONY)
The Harmony Outcomes trial was designed to determine the effect of albiglutide, when added to standard blood glucose-lowering therapies, on major cardiovascular events in patients with type 2 diabetes. eHARMONY was an ancillary study conducted alongside the Harmony Outcomes trial with 3 objectives related to understanding the potential of RWD in clinical trials:
Understand how EHR data are used to facilitate trial recruitment and the barriers to that use
Evaluate the fitness of RWD for use in populating baseline characteristics in the electronic case report form (eCRF)
Evaluate the fitness of RWD for use in identifying clinical endpoints (Hammill et al. 2022)
Getting approvals for and obtaining RWD for prospective studies
Originally, this multinational ancillary study planned to include RWD from the United States and several European countries, including the National Hospital Discharge registry (Sweden), the National Health Service Register (Denmark), and the National Health Service hospital discharge data (UK). However, beginning in 2018, the General Data Protection Regulation (GDPR) placed new restrictions on the movement of individual-level data outside the European Union. As a result, planned approaches to international data flow in the study were no longer feasible. Instead, the ancillary study had to rely on Medicare insurance claims data and EHR data from selected US study sites. The EHR strategy required sites with a data warehouse based on EHR data; the ability to organize EHR data into a common data format; and an integrated clinical, operational, and technical team. Many of the sites approached for this ancillary study chose not to participate because they knew they could not perform this work.
In the eHARMONY ancillary study, assessing a site’s technical capabilities and the quality of their EHR data up front was often not possible. In the end, not all selected sites were able to contribute meaningful data to the study. Among the lessons learned were:
Standalone clinical research sites had very little extractable EHR data about patients.
Most lab results and medications were either not extractable or not mapped to a useful terminology.
Many sites did not have the ability to transform their data into a common format and had to send rudimentary data extracts to the ancillary study coordinating center. Sites participating in other research networks, such as PCORnet, had no difficulty with this task.
Evaluating the general fitness for use of RWD for trials
The study team compared EHR data and Medicare claims data with the trial-collected data and found that agreement varied by data domain.
Demographics: EHR and claims information consistently agreed with the eCRF.
Medical history: Results were inconsistent, but RWD often had low sensitivity and high specificity.
Medications: EHR data had low sensitivity and high specificity; claims data had substantially higher sensitivity than EHR data.
Lab results: EHR lab results were often missing but agreed with the eCRF when present.
Events: There was a very small number of events in the ancillary study population; EHR data had low sensitivity and high specificity; and claims data had substantially higher sensitivity than the EHR.
The study authors had further recommendations for future studies considering incorporating RWD as a data source:
Define inclusion/exclusion and outcome concepts to be more RWD-friendly. Prevalent disease (eg, cerebrovascular disease, coronary artery disease) is easier to identify than historical clinical events (eg, stroke, myocardial infarction). Focus on what is available in structured data (eg, hospitalization with a primary diagnosis of myocardial infarction), avoiding detailed clinical results (eg, ECG findings); a sketch of such a computable definition follows this list.
When studies have both purpose-collected data (ie, trial database) and RWD, they should perform validation studies as they are able, to contribute to the evidence base for RWD in clinical research.
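To illustrate the first recommendation, the sketch below shows one way a study team might implement an RWD-friendly outcome concept (hospitalization with a primary discharge diagnosis of acute myocardial infarction) over structured claims data. The table layout and column names are hypothetical; ICD-10 codes beginning with I21 denote acute myocardial infarction.

```python
# Sketch of a computable, RWD-friendly outcome definition over structured
# claims data. Column names and data layout are hypothetical.
import pandas as pd

def mi_hospitalizations(claims: pd.DataFrame) -> pd.DataFrame:
    """Inpatient claims with a primary discharge diagnosis of acute MI (ICD-10 I21.*)."""
    inpatient = claims["claim_type"] == "inpatient"
    primary_mi = (claims["dx_position"] == 1) & claims["dx_code"].str.startswith("I21")
    return claims.loc[inpatient & primary_mi, ["patient_id", "admit_date", "dx_code"]]
```

A definition like this trades some clinical nuance for consistency: it can be applied identically across sites, which supports the validation studies recommended above.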
This guidance for industry provides sponsors, researchers, and other interested parties with considerations for the use of EHR and claims data for regulatory decision making.
Hammill BG, Leimberger JD, Lampron Z, Raman SR, O'Brien EC, Wurst KE, Mountcastle S, Cunnington M, Janmohamed S, Curtis LH. 2022. Fitness of real-world data for clinical trial data collection: results and lessons from a HARMONY Outcomes ancillary study. Clin Trials. doi:10.1177/17407745221114298. PMID: 35876156
Raman SR, O’Brien EC, Hammill BG, et al. 2022. Evaluating fitness-for-use of electronic health records in pragmatic clinical trials: reported practices and recommendations. Journal of the American Medical Informatics Association. 29:798–804. doi:10.1093/jamia/ocac004. PMID: 35171985
The FDA is creating 4 methodological Patient-Focused Drug Development (PFDD) guidance documents that describe how to collect and submit patient experience data in clinical research. The guidance series responds to the mandates of the 21st Century Cures Act and other efforts to include patient experience data in regulatory decision-making and medical product development. Patient experience data include information about patients' symptoms; the effects of the disease on patients over time; patients' experiences with, and views about, the disease, treatments, and outcomes, and the relationships among them; and patients' understanding of the progression, severity, and chronicity of the disease.
The guidance series consists of 4 documents:
Guidance 1: Collecting Comprehensive and Representative Input
Guidance 1 was released in June 2020 and is designed for those planning a study and deciding on sampling methods for the collection of patient input. Approaches for patient selection and input collection depend on the research question, and FDA recommends examining previous studies and relevant literature and consulting subject matter experts regarding decisions on methodology and study materials.
Guidance 1 describes the steps in the process of collecting input, which include defining a research question, describing the target population and who will provide the data (patients, caregivers, or clinicians), choosing a data collection methodology, and collecting, storing, and managing the data. Of note, when choosing the data collection methodology and setting, study teams should ensure the data collected are representative of the population and consider including diverse sites to better enable a representative sample.
Guidance 2: Methods to Identify What Is Important to Patients
This draft guidance describes what to ask patients and why. It describes qualitative, quantitative, and mixed methods approaches and provides best practices for eliciting information that is important to patients, including special populations (children, the cognitively impaired, and those with rare diseases) and diverse populations.
Guidance 3: Selecting, Developing, or Modifying Fit-for-Purpose Clinical Outcomes
This guidance is intended to help investigators decide what to measure and how to select or develop clinical outcome assessments (COAs) that are fit for the purpose of assessing outcomes that are important to patients. It “describes how stakeholders (patients, caregivers, researchers, medical product developers, and others) can collect and submit patient experience data and other relevant information from patients and caregivers to be used for medical product development and regulatory decision-making.”
Guidance 4: Incorporating Clinical Outcome Assessments Into Endpoints for Regulatory Decision-Making
This guidance is forthcoming and will describe how to incorporate a given clinical outcome assessment tool or a set of measures into a clinical research study. The guidance will include information about defining meaningful change, and the collection, analysis, interpretation, and submission of data.
Part of the mission of the NIH Pragmatic Trials Collaboratory is to share lessons learned from the NIH Collaboratory Trials. To this end, we asked the principal investigators of the Guiding Good Choices for Health (GGC4H) NIH Collaboratory Trial (Scheuer et al. 2022) to give us their most critical advice and share their expertise and lessons learned regarding the use of PROs.
GGC4H tests the feasibility and effectiveness of implementing Guiding Good Choices, a parenting program for parents of early adolescents ages 11-14 that has been shown to reduce adolescent alcohol, tobacco, and marijuana use; depression; and delinquent behavior (Montero-Zamora et al. 2021). The trial is being conducted in 3 geographically and socioeconomically diverse large integrated healthcare systems, comparing the GGC parenting intervention with usual pediatric primary care practice in approximately 3750 adolescents. The primary outcome is substance use initiation, and the study team also asks adolescents to report on substance use frequency and amount. Secondary outcomes include antisocial behavior and depression (PHQ-9).
Table. Domains Assessed in Youth Surveys for GGC4H

| Primary Outcomes | Secondary Outcomes | Exploratory Outcomes |
| --- | --- | --- |
| Substance use: age of initiation; lifetime frequency; past-year and past 30-day use; past 30-day use amount | Mental health: depression (PHQ-9) | Anxiety (GAD-7) |
| Substances examined: alcohol, marijuana, cigarettes, e-cigarettes, inhalants, opioids, other drugs | Antisocial behavior: ever; past-year | Screen & social media time, sexting |
Because the GGC4H study team has extensive experience collecting and using behavioral health PROs with adolescents, we asked the PIs of GGC4H to share their lessons learned.
What is your most critical advice regarding use of PROs?
Healthcare systems need to shift toward routinely collecting PRO data via the EHR. We had initially planned to use EHR data for our study, but because these data are not routinely collected during clinical encounters, we switched to REDCap, which was more involved and required more resources. Healthcare systems should collect behavioral, mental health, and substance abuse information more systematically and consistently, as these data are important to the care of their patients.
What tips would you share with other study teams?
Use validated measures.
Have an experienced data collection team.
Persistence is key. Reach out in various ways (eg, phone, email, text) to encourage responses.
Why use PROs in this population?
PROs provide a rich data source that complements EHR data and enables the use of quantitative and qualitative analysis methods. In behavioral health studies with a pediatric population (like ours), PROs are likely to be the primary source of outcomes information because (a) these outcomes are not necessarily collected routinely in pediatric practice, (b) even when collected, they are not necessarily saved routinely in EHRs, and (c) even when collected, there may be measurement differences across settings (eg, PHQ-2 vs PHQ-9).
Data on risk and protective factors (eg, peer substance use, family conflict, attachment to parents) are also key predictors of study outcomes, and the study team collects these from adolescents as these data are also not available in EHRs.
Can you tell us what you learned about optimal administration formats within this study (internet vs telephone)?
We offered participants the option to complete online surveys or telephone interviews with trained interviewers. Most adolescents opted for self-administration over the internet, and they tended to answer questions about substance use more frankly in a computerized format.
After the child assents, the data collection team contacts the teenager directly using their preferred method of contact. Researchers might nudge the parent if the child does not complete a questionnaire.
You collected the PHQ-9, which has questions about suicide ideation. Did you have a protocol in place for when these surveys signal potential distress?
When we assented the children and consented the parents, we let them know that if they express anything that makes us think they might be a danger to themselves or others, we will take action. For example, if an adolescent scores over a threshold on the PHQ-9, it triggers one of the interviewers to follow up with the Columbia Suicide Severity Rating Scale to assess degree of risk and respond accordingly. Adolescents commonly endorse distress, so the follow-up with the Columbia scale is important. If, on assessment, they have suicidal ideation, we have an IRB-approved protocol in place whereby they and their families are given resources to address the issue.
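The triggering logic the team describes can be sketched as a simple rule over survey responses. In this illustration the numeric cutoff and the item-level rule are assumptions for demonstration only; the study's IRB-approved protocol defines the actual triggers (PHQ-9 item 9 asks about thoughts of self-harm).

```python
# Illustrative sketch of survey-triggered safety follow-up; the threshold and
# item-9 rule are assumptions, not the study's actual protocol values.
PHQ9_TOTAL_THRESHOLD = 15  # hypothetical cutoff (PHQ-9 totals range 0-27)

def needs_cssrs_followup(phq9_items: list[int]) -> bool:
    """Flag a completed PHQ-9 for interviewer follow-up with the C-SSRS."""
    total = sum(phq9_items)
    item9 = phq9_items[8]  # item 9 screens for thoughts of self-harm
    return total >= PHQ9_TOTAL_THRESHOLD or item9 > 0
```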
The expertise of your data collection team seems really critical to your project’s success. What skills did you include in the job description for the interviewers?
We have teams embedded within the health system at each of our 3 sites, composed of well-trained, skilled, and personable interviewers who are experienced in collecting data from teenagers. Our research group deals exclusively with mental health issues and substance abuse. Adolescents are not always comfortable dealing with adults, and interviewing this population takes specific skills.
If the PROs were not routinely saved in the EHR, how did you save them, and how did you manage security concerns?
We used a separate, independent data collection platform (REDCap) for the surveys. We do not go into the EHR, and as part of the consent/assent process we let families know that the data would not be put into the EHR. We link to EHR data using unique study IDs.
Montero-Zamora P, Brown EC, Ringwalt CL, et al. 2021. Predictors of engagement and attendance of a family-based prevention program for underage drinking in Mexico. Prev Sci. doi:10.1007/s11121-021-01301-z.
Scheuer H, Kuklinski MR, Sterling SA, et al. 2022. Parent-focused prevention of adolescent health risk behavior: study protocol for a multisite cluster-randomized trial implemented in pediatric primary care. Contemp Clin Trials. 112:106621. doi:10.1016/j.cct.2021.106621.
Version History
September 14, 2022: Updated as part of annual review (changes made by K. Staman.)
To monitor the fidelity of an embedded and potentially complex clinical intervention, it is important to identify in advance which features are so essential to its effectiveness that modifying them could negatively affect the study’s outcomes and impact. Equally important is knowing which intervention elements can be adapted to accommodate contextual factors or local needs—without affecting fidelity. Another way of describing this is through the concepts of functions and forms:
Functions are the fundamental purpose or desired effect of the intervention activities, such as the effect on patient or provider behavior.
Forms are the intervention details, components, and activities that may vary, or be adapted, based on contextual factors.
“Functions of a complex intervention represent a purpose or goal, while forms are the tools or processes used to achieve a function. Identifying forms and functions allows adapted complex interventions to retain a level of standardization and integrity in design.” (Hill et al. 2020)
Identifying an intervention’s core functions informs how fidelity is monitored (i.e., by observing fidelity to the core functions) and what adaptations can be made without compromising effectiveness. Such fidelity is critical to demonstrating that the same intervention is being studied across sites and settings (Esmail et al. 2020). For example, take a cognitive behavioral therapy (CBT) intervention whose core function is to train patients to use coping skills. Its corresponding forms are the activities used to achieve that goal, and might include the specific content, number, and length of intervention sessions, and the mode of delivery (e.g., group, individual, online, in-person). Moving the CBT intervention from in-person to online delivery (for example, because of the COVID-19 pandemic) would represent an adaptation to the intervention’s form, but not to its core function.
In this section, we introduce recently developed methodology standards around intervention fidelity and adaptations for study teams to consider.
PCORI Standards for Complex Health Interventions
The Patient-Centered Outcomes Research Institute (PCORI) has synthesized a set of methodology standards to help investigators ensure the “validity, transparency, and reproducibility of studies of complex health interventions” (Esmail et al. 2020). PCORI’s conceptual model intends to help investigators evaluate the aspects of their study that go beyond the intervention’s content and delivery (i.e., the intervention components) toward a deeper delineation of how the intervention achieves its effects, referred to as the “causal pathway.” The standards require that an intervention’s core functions and forms, causal mechanisms, and expected adaptations are described during the design phase and evaluated throughout implementation, interpretation, and reporting.
Employing the PCORI methodology standards can assist study teams as they prepare to monitor fidelity and plan for adaptations or alternative means that will achieve the desired outcomes. Study teams are encouraged to refer to the publication (Esmail et al. 2020) and the PCORI methodology standards website for details on how to apply these standards, which consist of these broad steps (illustrated in the sketch below the list):
Fully describe the intervention and comparator and define their core functions.
Specify the hypothesized causal pathways* through which the core functions influence outcomes.
Specify how adaptations to the form of the intervention and comparator will be allowed and recorded.
Plan and describe a process evaluation.
Select patient outcomes informed by the causal pathway.
*The causal pathway is an assumption about how the intervention works or produces change (Moore et al. 2015).
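To make these steps concrete, the sketch below shows one hypothetical way a study team might record an intervention's core functions, forms, permitted adaptations, and hypothesized causal pathway, using the CBT example discussed earlier in this section. The structure and field names are illustrative only, not part of the PCORI standards.

```python
# Hypothetical record of functions, forms, and planned adaptations for a
# complex intervention; the structure is illustrative only.
from dataclasses import dataclass

@dataclass
class InterventionSpec:
    core_functions: list[str]       # must be preserved across sites and settings
    forms: dict[str, str]           # delivery details that may be adapted
    allowed_adaptations: list[str]  # pre-specified; record each use during the trial
    causal_pathway: str             # hypothesized mechanism linking functions to outcomes

cbt = InterventionSpec(
    core_functions=["Train patients to use coping skills"],
    forms={"mode": "in-person, group", "dose": "8 weekly 60-minute sessions"},
    allowed_adaptations=["Deliver sessions online if in-person care is disrupted"],
    causal_pathway="Coping-skill mastery leads to improved patient outcomes",
)
```

Keeping such a record from the design phase onward makes it straightforward to report, at the end of the trial, which adaptations occurred and whether any touched a core function.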
“By emphasizing fidelity to core function rather than to form, [the PCORI methodology] standards aim to link the underlying purposes or goals of an intervention (rather than its content or activities) to their role in the causal pathway.” (Esmail et al. 2020)
Applying Functions and Forms to Interventions
Read more about how delineating an intervention’s functions and forms can help to advance future implementation:
In 2013, the Office of the Assistant Secretary for Planning and Evaluation (ASPE) of the Department of Health and Human Services developed a research brief focused on “the importance of identifying, operationalizing, and implementing the core components of evidence-based and evidence-informed interventions that likely are critical to producing positive outcomes” (Blase and Fixsen 2013). The brief, Core Intervention Components: Identifying and Operationalizing What Makes Programs Work, serves as an introduction for study teams as they identify, validate, operationalize, and adapt their program’s core components before and during study implementation.
The brief suggests that, in order for an evidence-based intervention to be useful in a real-world setting, the following elements should be present:
A clear description of the context of the program, the principles underlying the intervention, and the population it is intended to serve
A definition of the essential functions of the intervention judged as necessary to produce outcomes in a typical setting
A description of the active ingredients that operationally define the core components; ie, the teachable and learnable activities for providers delivering the intervention
Plans for a practical assessment of the performance of those delivering the intervention so that these measures can be correlated with outcomes over time
Well-defined and measured core intervention components can lead to improved use of staff resources, greater likelihood of accurately interpreting outcomes, increased ability to make appropriate adaptations, and better replication and scale-up of the intervention in other settings.
Blase K, Fixsen D. 2013. Core Intervention Components: Identifying and Operationalizing What Makes Programs Work: Executive Summary. Office of the Assistant Secretary for Planning and Evaluation; Office of Human Services Policy; U.S. Department of Health and Human Services.
Esmail LC, Barasky R, Mittman BS, Hickam DH. 2020. Improving Comparative Effectiveness Research of Complex Health Interventions: Standards from the Patient-Centered Outcomes Research Institute (PCORI). J Gen Intern Med. 35:875-881. doi:10.1007/s11606-020-06093-6. PMID: 33107006.
Hill J, Cuthel AM, Lin P, Grudzen CR. 2020. Primary Palliative Care for Emergency Medicine (PRIM-ER): Applying form and function to a theory-based complex intervention. Contemp Clin Trials Commun. 18:100570. doi:10.1016/j.conctc.2020.100570.
Moore GF, Audrey S, Barker M, et al. 2015. Process evaluation of complex interventions: Medical Research Council guidance. BMJ. 350:h1258. doi:10.1136/bmj.h1258.
Perez Jolles M, Lengnick-Hall R, Mittman BS. 2019. Core Functions and Forms of Complex Health Interventions: a Patient-Centered Medical Home Illustration. J Gen Intern Med. 34:1032-1038. doi:10.1007/s11606-018-4818-7.
Version History
Published July 7, 2021
Contributing Editors: Liz Wing, MA; Karen Staman, MS
Practical Reporting Strategies for Study Teams
“When planning research, by prospectively thinking about how findings will be translated to clinical practice, key considerations may be what clinicians will need to change their behavior and what resources the organization will require to permanently maintain (sustain) the intervention.” – Curtis et al. 2017
As described elsewhere in the Living Textbook, a key element of successful embedded pragmatic clinical trials (ePCTs) is engagement with stakeholders, early and often. The stakeholders who will be particularly interested in the results of your ePCT intervention—and who will be crucial to the adoption of your findings—include the healthcare system leaders responsible for making decisions about which interventions to implement and sustain. Ideally, ePCT study teams will establish ongoing, bidirectional health system partnerships throughout the lifecycle of the trial and will plan their intervention with “implementation in mind.” Such planning includes considering how study results will be conveyed to the health system partners. Continuous engagement with partners can help ensure that the context for the intervention is well understood and that your partners will facilitate knowledge translation and support the intervention after the trial is completed.
Next, we suggest a few practical strategies and tools for sharing study findings with healthcare system partners—beyond the usual journal publications and professional conferences. Study teams should tailor the message and medium to the needs and priorities of the health system: make what you say clear and what you show visually appealing, make your study results actionable, and engage with your health system partners to develop a customized sustainment plan.
1. Plan for impact
From the time the proposal is written, and particularly at the start of funding, invest in relationships with healthcare system stakeholders such as administrators and executives in the C-suite (CEO, CIO, COO, etc.). Over time and with regular communication, study teams can learn the language of the leaders, how the health system functions, and what level of evidence decision-makers will use to justify their decisions. One outcome will be that health system leaders will learn the potential of your embedded intervention for improving quality and efficiency within their health system. For their part, leaders can enhance collaborations by effectively communicating their challenges and learning to frame problems in researchable terms (Alexander et al. 2007). Many embedded researchers use the research funding or seek supplemental funding to disseminate preliminary and summative findings.
Example activities for planning for impact:
Identify why the study findings will matter to health system leaders, whether findings are positive or negative.
Identify other partners, such as patient advocacy organizations, that could amplify the findings if successful.
Assess the context for adopting and sustaining positive findings.
Determine what study tools will be developed, including health system-, provider-, and patient-facing tools and materials.
Establish a dissemination advisory panel and consider carefully who will serve on it.
Set up a website portal for study materials, but do not launch the website until after study results are shared, to prevent contamination of the study arms.
Note: It is important to disseminate findings even if the results are negative, as important learning and insight for future research can be gleaned from all results, including a hypothesized outcome that was not achieved.
Involving health system leadership in trial planning will prompt discussions of key comparators, outcomes, and the appropriateness and feasibility of the intervention. The Health Foundation (UK) has developed a template to help health researchers plan a communication strategy; considerations include the context, key objectives, messaging, audience, communications channels, and additional resources.
[Figure: Excerpt from the Health Foundation’s communication strategy template]
2. Communicate findings in person
Prepare direct, in-person presentations (when possible) for your partners that synthesize your findings in an engaging format. Health system leaders prefer information on effectiveness over efficacy, evidence that is distilled, and results that are actionable (White et al. 2017). Explain how adopting the intervention could work in their health system context, and take time to solicit feedback from stakeholders on your findings, especially on what should be implemented or de-implemented.
Example communication products:
Tailored briefs, executive summaries, or fact sheets
Multimedia slide show
Animated video
Interactive webinar that includes polling
Workflow demo
Role playing
Educational session
3. Provide leave-behind materials and tools
To encourage adoption of an intervention, study teams can leave their partner health systems with easy-to-use tools and other tangible resources. Consider devising solutions such as recommendations, clinical pathways, step-by-step instructions, and point-of-care algorithms (White et al. 2017). Providing these materials signals that resources already exist, leaving less for the health system to produce at the time of implementation and sustainment.
Example resources:
Intervention website
Protocol toolkit
Animated workflow
Infographic or poster
Summary sheet or FAQ
Data visualization
Decision flowchart
Clinical encounter aid
Training video
Implementation guide
Dissemination Case Studies
REDUCE MRSA and ABATE Infection
[Image: Cover of the protocol toolkit]
The REDUCE MRSA trial was a large, cluster-randomized pragmatic trial of 43 hospitals (74 adult ICUs) that demonstrated that universal bathing with chlorhexidine and universal nasal decolonization with mupirocin significantly reduced methicillin-resistant Staphylococcus aureus (MRSA) clinical cultures and all-cause bloodstream infections in adult ICUs. Among the study team’s dissemination activities that supported practice change in the partner health system was a collaboration with the CDC and AHRQ to develop a 52-page protocol for universal decolonization in the ICU, based on the REDUCE MRSA results, that provides “decision-making tools and a rationale to help hospital leaders understand the effectiveness of ICU decolonization with mupirocin and chlorhexidine gluconate (CHG) and determine whether this strategy represents the best course of action for their facility.” The document was designed to help health system leaders understand the study results, decide whether to implement the intervention, and then train staff in it.
Read more about the value of creating targeted tools and the REDUCE MRSA case study illustrating how an embedded PCT planned for and delivered a successful dissemination strategy to their health system partners.
A related study, the ABATE Infection clinical trial, was an NIH Collaboratory Trial conducted in a non-ICU setting, involving 53 hospitals (194 non-critical care units) and a total of 183,000 patients in the intervention period. It found that decolonization with universal daily CHG bathing plus nasal mupirocin for MRSA carriers significantly reduced infections in patients with medical devices. Patients with central lines, midlines, or lumbar drains had 37% fewer clinical cultures with antibiotic-resistant bacteria and 32% fewer all-cause bloodstream infections.
Implementing a targeted intervention (e.g., identifying select inpatients with medical devices) can be challenging. It often requires dedicated information technology (IT) support in the form of targeted order sets and adherence reports for project champions to track and encourage uptake. For this reason, NIH requested that AHRQ support a toolkit to disseminate the results of the ABATE Infection Trial. This toolkit is freely available online and provides decision-making and preparatory steps, protocols, handouts, training documents and videos, adherence and skills assessment checklists, as well as talking points and responses to frequently asked questions.
STOP CRC
The NIH Collaboratory Trial Strategies and Opportunities to Stop Colorectal Cancer in Priority Populations (STOP CRC) was a cluster-randomized trial that tested a culturally tailored, healthcare system–based program to improve colorectal cancer screening rates in 26 federally qualified health center clinics in Oregon and California. The intervention involved embedding a tool in the electronic health record (EHR) to identify patients who were overdue for colorectal cancer screening, mailing a fecal immunochemical test (FIT) kit and reminder letter to eligible patients, and implementing a practice improvement process at participating clinics. Compared with clinics that practiced usual care, intervention clinics had a significantly higher proportion of participants who completed a FIT and any colorectal cancer screening. The improved screening rates occurred despite low and highly variable rates of implementation of the program.
STOP CRC dissemination activities included a website of resources, program materials, and an implementation guide describing the “Mailed FIT Program and how to orient a clinic to the program. Clinics that implement the STOP CRC program need to address technical, workflow, and policy questions before launching it. This guide is intended to address these questions.” This guide was based on lessons from a workshop on fundamental concepts of communications planning led by members of AcademyHealth’s communications team.
STOP CRC Website
Communications Toolkits, Templates, and Resources
Various materials and approaches are available that can assist study teams with their communications planning with health system leaders. We discuss a few below.
This toolkit provides resources “to help with the publication process, ideas for dissemination beyond publishing in a research journal, guidance for managing study data, and specific steps to facilitate the formal closure of a study.”
The Health Foundation (UK) provides a quick guide to help health researchers develop a communication strategy. Considerations include the context, key messaging and objectives, audience, communications channels, and additional resources.
This website offers publicly available Dissemination Toolkits that are free to download and use. These toolkits contain guidelines, strategies, checklists, worksheets, templates, examples, and case studies for developing dissemination plans and products.
This framework was developed by Mathematica, AcademyHealth, and Palladian Partners to provide information and tools for designing and implementing a robust dissemination strategy informed by multiple stakeholder groups.
This website provides resources for knowledge translation, which is a dynamic and iterative process that includes “synthesis, dissemination, exchange, and ethically-sound application of knowledge to improve health.”
Generating Knowledge from Best Care: Advancing the Continuously Learning Health System (Abraham et al. 2016). Among the recommendations for leaders of health systems engaged in embedded clinical research is to “define a course of action that would result from any plausible answer to a good question. For example, if the question involved the effectiveness of a new clinical strategy or approach to delivering care, leaders should anticipate how the program can be disseminated if it is effective, and should be willing to terminate it or modify it if not.”
Some researchers have found it helpful to work with other organizations to disseminate their results, such as funders, nonprofits, and accreditation and advisory groups, including the Agency for Healthcare Research and Quality (AHRQ), National Quality Forum (NQF), Advisory Committee on Immunization Practices (ACIP), Institute for Healthcare Improvement (IHI), and Healthcare Infection Control Practices Advisory Committee (HICPAC). Other communication channels include professional societies, conferences, magazines, academic journals (e.g., Health Affairs), peers, and clinicians. There are an increasing number of repositories for evidence-based practices, as well as collections of systematic reviews and community and clinical guidelines.
Abraham et al. 2016. Generating Knowledge from Best Care: Advancing the Continuously Learning Health System. NAM Perspectives. Discussion Paper, National Academy of Medicine, Washington, DC. doi:10.31478/201609b. Available at: https://nam.edu/generating-knowledge-from-best-care-advancing-the-continuously-learning-health-system/.
Alexander JA, Hearld LR, Jiang HJ, Fraser I. 2007. Increasing the relevance of research to health care managers: hospital CEO imperatives for improving quality and lowering costs. Health Care Manage Rev. 32:150-159. doi:10.1097/01.HMR.0000267792.09686.e3. PMID: 17438398.
Curtis K, Fry M, Shaban RZ, Considine J. 2017. Translating research findings to clinical nursing practice. J Clin Nurs. 26:862-872. doi:10.1111/jocn.13586. PMID: 27649522.
Johnson K, Grossmann C, Anau J, et al. Integrating Research into Health Care Systems: Executives’ Views. Available at: https://nam.edu/perspectives-2015-integrating-research-into-health-care-systems-executives-views/.
Schoelles K, Umscheid CA, Lin JS, et al. AHRQ Methods for Effective Health Care. A Framework for Conceptualizing Evidence Needs of Health Systems. Rockville (MD): Agency for Healthcare Research and Quality (US), 2017.
White CM, Sanders Schmidler GD, Butler M, et al. AHRQ Methods for Effective Health Care. Understanding Health-Systems' Use of and Need for Evidence To Inform Decisionmaking. Rockville (MD): Agency for Healthcare Research and Quality (US), 2017.
Version History
February 25, 2021: Updated the section title (change made by D. Seils).
February 3, 2021: Updated with new strategies and resources (changes made by L. Wing).
December 11, 2018: Updated as part of the annual review process, modified table and added text (changes made by K. Staman).
Although numerous methods have been proposed for collecting inpatient event data, there is little comparative research regarding the relative accuracy of these methods and the resulting implications for pragmatic clinical trial design. Comparative accuracy research is needed to allow clinical trial planners to understand the limitations associated with different data collection methods, properly estimate inpatient event rates, and inform their sample size estimates.
As an example, the Treatment With Adenosine Diphosphate Receptor Inhibitors: Longitudinal Assessment of Treatment Patterns and Events After Acute Coronary Syndrome (TRANSLATE-ACS; ClinicalTrials.gov Identifier: NCT01088503) study was designed to evaluate the use of prasugrel and other ADP receptor inhibitor therapies among myocardial infarction (MI) participants treated with percutaneous coronary intervention (PCI) (Chin et al. 2011). It is one of the few multi-center studies to compare inpatient event data collection methods (Krishnamoorthy et al. 2016; Guimarães et al. 2017). These investigators compared (a) patient-reported versus physician-adjudicated MI events and (b) hospital bill-derived versus physician-adjudicated MI events. We use this as a case study because it provides insights into the limitations inherent in using patient-reported and medical claims data as the sole sources for determining inpatient endpoints.
Case Study: TRANSLATE-ACS
TRANSLATE-ACS was an observational study that enrolled 12,365 patients with acute myocardial infarction. After discharge, a centralized call center conducted telephone interviews with patients at 6 weeks and at 6, 12, and 15 months of follow-up. During these interviews, patients were asked about rehospitalizations, and in a subset of interviews, they were asked for the reason for the rehospitalization. A source document collection process to obtain objective evidence, such as a claim or bill for a hospitalization, was triggered by patient-reported rehospitalizations and by automatic queries from the enrolling hospital at 12 months of follow-up. In a second stage, the investigators requested patient medical records when a patient’s hospital bill indicated the patient may have experienced a major adverse cardiovascular event (MACE). An independent physician committee (a clinical events committee, or CEC) then adjudicated the medical records to validate MACE events. In this study’s context, hospital bill data collection therefore presumes prior patient interviews, and physician adjudication presumes prior triggering of events, collection of source documents for those events, and expert adjudication based on those documents. Subsequent analyses compared results from these different inpatient event data collection methods through 12 months of follow-up. At one year after MI, event rates for MI, stroke, and bleeding were lower when medical claims were used to identify events than when events were adjudicated by physicians (Guimarães et al. 2017).
Validating an inpatient event data collection method is conceptually similar to validating a computable phenotype. The data collection method must identify the endpoint event it purports to identify and meet a desired accuracy level when compared with the best method for assessing the endpoint event (i.e., the gold standard, which here is physician-adjudicated medical records) (Richesson and Smerek 2014). Without such validation, the researcher cannot confidently state that the data support the conclusions; with it, the researcher can provide evidence and will have quantified the uncertainty. There is a trade-off between a definition that can be applied consistently (across multiple reviewers in the case of a CEC, or across EHR data from multiple facilities in the case of computational definitions) and clinical accuracy (agreement with the truth). The existence of this trade-off underscores why the inaccuracy must be measured.
We define “data collection ascertainment accuracy” using three metrics: sensitivity, specificity, and positive predictive value (PPV). We also define Type I and Type II errors associated with data collection accuracy. While these data collection errors may influence traditional hypothesis-testing Type I and Type II error rates, they are distinct and represent another factor to consider in pragmatic trial sample size estimation. Essentially, traditional sample size estimation methods make implicit assumptions regarding data accuracy; here, we make those assumptions explicit.
For TRANSLATE-ACS, data collection accuracy ascertainment is defined as follows:
True positive: both physician adjudication and a patient report or hospital bill indicate that an MI event occurred.
False negative: physician adjudication indicates that an MI event occurred, but according to the patient report or hospital bill, it did not.
False positive: the patient report or hospital bill indicates that an MI event occurred, but according to physician review of the medical record, it did not.
True negative: both physician adjudication and the patient report or hospital bill indicate that no event occurred.
|  | Gold standard: event occurred (physician adjudication) | Gold standard: no event |
| --- | --- | --- |
| Comparator: event occurred (patient report or hospital bill) | True positive (TP) | False positive (FP) |
| Comparator: no event | False negative (FN) | True negative (TN) |
Sensitivity is the true positive (TP) rate: in this case, the proportion of actual inpatient MI events that are identified by a given inpatient event data collection method. Sensitivity is calculated from the number of true positives (TP) and false negatives (FN): TP/(TP + FN).
Sensitivity helps gauge the possibility of a Type II error (a false negative: failing to detect an event that actually occurred) (Sharma et al. 2009); Sharma et al (2009) define the measurement Type II error rate as 1 – sensitivity. As shown in the table below, the sensitivity of patient-reported MI events in TRANSLATE-ACS is quite low, meaning that patients tended to understate their MI events, while the sensitivity of hospital bills is higher and improves with the number of ICD diagnosis codes used to identify the MI event.
Myocardial Infarction Event Data Sources

| Standard | Comparator | TP | FP | FN | TN | Sensitivity TP/(TP+FN) | Specificity TN/(TN+FP) | PPV TP/(TP+FP) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Physician adjudicated | Patient report | 103 | 257 | 254 | NR | 0.289 | NR | 0.286 |
| Physician adjudicated | Hospital bill, 1st dx code | 482 | 66 | 264 | 1145 | 0.646 | 0.945 | 0.880 |
| Physician adjudicated | Hospital bill, 1st or 2nd dx code | 588 | 90 | 158 | 1121 | 0.788 | 0.926 | 0.867 |
| Physician adjudicated | Hospital bill, all dx codes | 625 | 103 | 121 | 1108 | 0.838 | 0.915 | 0.859 |

NR: not reported.
Specificity is the true negative (TN) rate: in this case, the proportion of hospitalizations without an actual MI that are correctly identified as non-events by a given inpatient event data collection method. Specificity is calculated from the number of true negatives (TN) and false positives (FP): TN/(TN + FP).
Specificity helps gauge the possibility of a Type I error (a false positive: identifying an event that did not actually occur) (Sharma et al. 2009); the measurement Type I error rate is defined as 1 – specificity. In the table above, there is no specificity for patient-reported MI events because the TRANSLATE-ACS authors did not report the associated true negatives. In contrast to sensitivity, the specificity of hospital bill inpatient event data collection is high but decreases as the number of ICD diagnosis codes used to identify the MI event increases.
Positive predictive value (PPV) is the proportion of actual inpatient events among all events identified by a given inpatient event data collection method (TP/(TP + FP)); it gauges the precision of the method. As with sensitivity, the PPV was low for patient report and higher for hospital bills, although, unlike sensitivity, it decreased slightly as more ICD diagnosis codes were used to identify the MI event.
We can extend these metrics to compute the measurement analogues of the Type I (1 – specificity) and Type II (1 – sensitivity) error rates (Sharma et al. 2009), as shown in the table below. For hospital bills using all diagnosis codes, the Type I error rate is 0.085 (1 – 0.915) and the Type II error rate is 0.162 (1 – 0.838).
Measurement Error Rates for Myocardial Infarction Event Data Sources

| Standard | Comparator | Type I error (false positive): 1 – specificity | Type II error (false negative): 1 – sensitivity |
| --- | --- | --- | --- |
| Physician adjudicated | Patient report | NR | 0.711 |
| Physician adjudicated | Hospital bill, 1st diagnosis code | 0.055 | 0.354 |
| Physician adjudicated | Hospital bill, 1st or 2nd diagnosis code | 0.074 | 0.212 |
| Physician adjudicated | Hospital bill, all diagnosis codes | 0.085 | 0.162 |

NR: not reported (specificity was unavailable for patient report).
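A short script can reproduce the metrics in both tables directly from the published counts; the sketch below uses the hospital bill (all diagnosis codes) row as a check.

```python
# Reproduce the accuracy metrics and measurement error rates above from the
# TRANSLATE-ACS counts (hospital bill, all diagnosis codes).
def accuracy_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": round(sensitivity, 3),
        "specificity": round(specificity, 3),
        "ppv": round(tp / (tp + fp), 3),
        "type_i_error": round(1 - specificity, 3),   # false positive rate
        "type_ii_error": round(1 - sensitivity, 3),  # false negative rate
    }

print(accuracy_metrics(tp=625, fp=103, fn=121, tn=1108))
# {'sensitivity': 0.838, 'specificity': 0.915, 'ppv': 0.859,
#  'type_i_error': 0.085, 'type_ii_error': 0.162}
```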
Our Type I and Type II error metrics tell us that the patient report and hospital bill data collection methods will both miss true MIs (false negatives) and include events that are not true MIs (false positives). One reason for false positives may be the stringent definitions used in adjudicating endpoints within traditional trials. Nonetheless, this information helps us understand how the accuracy of data collection methods may affect sample size estimation and subsequent hypothesis testing.
Metrology (the science of measurement) evaluates the reliability of measurements in terms of their objectivity and intersubjectivity (Mari et al. 2012). Objectivity guarantees that measurement results are independent of their context (e.g., properties of the object being measured, the measurement system, and the person who is measuring). To gauge objectivity, we can compare the TRANSLATE-ACS hospital billing data collection results with those from the Women’s Health Initiative (WHI) (Hlatky et al. 2014). In the WHI study, physician adjudicators used standardized forms and definitions to review patient medical records and identify MI events. Inpatient hospital bill MI events were then identified by linking study patients with their CMS Medicare Part A claims. The resulting analysis compared physician-adjudicated MIs only with CMS claims-identified MIs in the first or second diagnosis codes. The resulting sensitivity (0.790) and specificity (0.988) are close to their corresponding TRANSLATE-ACS values, but the PPV (0.708) is lower. This difference illustrates how data from different facilities, units, providers, and patient subgroups may yield different and biased answers. A researcher’s options are to (1) measure the impact, or (2) measure the range and location of the errors, simulate the impact, and show whether the error is consequential (Richesson et al. 2013).
Metrology’s intersubjectivity standard requires that measurement results convey both the measurement values and the degree of trust that should be attributed to those values (Mari et al. 2012). This information can be presented as a target measurement uncertainty that defines the minimal quality needed to support a specific decision. The implication is that if the actual uncertainty is greater than the target, the measurement value is not considered valid because it cannot support the intended use. This is similar to the use of Type I and Type II error rates in sample size estimation and hypothesis testing. In the example above, the measurement values are the MI event rates obtained by different data collection methods, and the degree of trust attributed to those values is reflected in their associated Type I and Type II error rates. The unanswered question is whether these rates meet minimal data quality accuracy standards. While it could be argued that data quality errors may have minimal effect on clinical trial outcomes, the burden of proof lies with the investigator. There have been calls for reporting data quality assessment results alongside published research results (Kahn et al. 2015). Those who use these data and rely on the associated study results in decision-making should be aware of potential data quality limitations that may influence their use and interpretation of the results. For this reason, it is essential that minimum inpatient event data quality accuracy standards be developed.
One way to compensate for measurement error is to increase the sample size, but this increases financial costs and risks. Alternatives are to (1) improve the reliability of raters through training or (2) use multiple methods for ascertaining information (Perkins et al. 2000). Meaningful reductions in sample size have been achieved by using the mean of multiple sources and other improvements in reliability (Perkins et al. 2000). Nonetheless, investigators will need to determine the most appropriate data collection methods for their study before designing it and making sample size estimates.
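To see how measurement error propagates into sample size, consider a simplified illustration (not the Perkins et al. method): with nondifferential outcome misclassification, the observed event rate in each arm becomes sensitivity × true rate + (1 – specificity) × (1 – true rate), which attenuates the between-arm difference and inflates the required sample size. The true event rates below are hypothetical; the sensitivity and specificity come from the hospital bill (all diagnosis codes) row above.

```python
# Simplified illustration of sample size inflation under nondifferential
# outcome misclassification. True event rates are hypothetical.
from math import ceil
from statistics import NormalDist

def n_per_arm(p1: float, p2: float, alpha: float = 0.05, power: float = 0.9) -> int:
    """Approximate per-arm n for a two-proportion z-test."""
    z = NormalDist().inv_cdf
    numerator = (z(1 - alpha / 2) + z(power)) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2))
    return ceil(numerator / (p1 - p2) ** 2)

def observed_rate(p: float, sens: float, spec: float) -> float:
    """Event rate actually measured under imperfect sensitivity/specificity."""
    return sens * p + (1 - spec) * (1 - p)

p1, p2 = 0.10, 0.07          # hypothetical true event rates per arm
sens, spec = 0.838, 0.915    # hospital bill, all diagnosis codes (table above)
q1, q2 = observed_rate(p1, sens, spec), observed_rate(p2, sens, spec)

print(n_per_arm(p1, p2))     # n assuming perfect ascertainment
print(n_per_arm(q1, q2))     # substantially larger n under misclassification
```

In this sketch, the attenuated difference and the added false positives roughly triple the required sample size, which is why the error rates above belong in the planning conversation.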
The example above uses CEC-adjudicated endpoints as the gold standard. However, data collection methods vary in their accuracy against this gold standard and in their applicability to clinical practice. Previous research has demonstrated that the accuracy of claims data is high for cardiovascular procedures (e.g., CABG surgery and PCI) and much lower for bleeding events, perhaps because the bleeding event definitions used in explanatory trials (Mehran et al. 2011) are not relevant to clinical practice and could be replaced by the number of transfusions. Similarly, the CEC myocardial infarction definition used in explanatory trials may differ from that commonly used in clinical practice.
All data collection methods are associated with Type I and Type II errors, and these error rates vary by endpoint and data collection method. These errors typically are not accounted for in sample size estimates, and research is needed to determine these error rates and how they may influence those estimates. Moreover, certain endpoints used in explanatory trials may have little relevance in actual practice and may not be recorded in EHRs. Because of these issues, the pragmatic trial community needs to collectively determine which endpoints are relevant for pragmatic trials, how they can be measured and validated, and how the accuracy of these measurement methods may affect sample size estimates for hypothesis testing.
Chin CT, Wang TY, Anstrom KJ, et al. 2011. Treatment with adenosine diphosphate receptor inhibitors-longitudinal assessment of treatment patterns and events after acute coronary syndrome (TRANSLATE-ACS) study design: expanding the paradigm of longitudinal observational research. Am Heart J. 162:844–851. doi:10.1016/j.ahj.2011.08.021.
Guimarães PO, Krishnamoorthy A, Kaltenbach LA, et al. 2017. Accuracy of Medical Claims for Identifying Cardiovascular and Bleeding Events After Myocardial Infarction : A Secondary Analysis of the TRANSLATE-ACS Study. JAMA Cardiol. 2:750–757. doi:10.1001/jamacardio.2017.1460.
Hlatky MA, Ray RM, Burwen DR, et al. 2014. Use of Medicare Data to Identify Coronary Heart Disease Outcomes in the Women’s Health Initiative. Circulation: Cardiovascular Quality and Outcomes. 7:157–162. doi:10.1161/CIRCOUTCOMES.113.000373.
Kahn MG, Brown JS, Chun AT, et al. 2015. Transparent reporting of data quality in distributed data networks. EGEMS (Wash DC). 3:1052. doi:10.13063/2327-9214.1052.
Krishnamoorthy A, Peterson ED, Knight JD, et al. 2016. How Reliable are Patient-Reported Rehospitalizations? Implications for the Design of Future Practical Clinical Studies. J Am Heart Assoc. 5. doi:10.1161/JAHA.115.002695.
Mari L, Carbone P, Petri D. 2012. Measurement Fundamentals: A Pragmatic View. IEEE Trans Instrum Meas. 61:2107–2115. doi:10.1109/TIM.2012.2193693.
Mehran R, Rao SV, Bhatt DL, et al. 2011. Standardized bleeding definitions for cardiovascular clinical trials: a consensus report from the Bleeding Academic Research Consortium. Circulation. 123:2736–2747. doi:10.1161/CIRCULATIONAHA.110.009449.
Perkins DO, Wyatt RJ, Bartko JJ. 2000. Penny-wise and pound-foolish: the impact of measurement error on sample size requirements in clinical trials. Biol Psychiatry. 47:762–766.
Richesson RL, Smerek M. Electronic Health Records-based Phenotyping, in Rethinking Clinical Trials A Living Textbook in Pragmatic Clinical Trials. NIH Health Care Systems Research Collaboratory. Published June 27, 2014.
Sharma D, Yadav UB, Sharma P. 2009. The concept of sensitivity and specificity in relation to two types of errors and its application in medical research. J Reliability Stat Stud. 2:53–58.
Version History
January 17, 2021: Moved from “Inpatient Outcomes” chapter to “Assessing Fitness-for-Use” chapter (changes made by K. Staman).
Published June 19, 2019.