Leslie J. Crofford, MD
Wilson Family Chair in Medicine
Professor of Medicine and Pathology, Microbiology & Immunology
Vanderbilt University Medical Center
Dana Dailey, PT, PhD
Assistant Research Scientist
Physical Therapy and Rehabilitation Science
University of Iowa
Kathleen Sluka, PT, PhD, FAPTA
Professor
University of Iowa
Keywords
Fibromyalgia; Pain; Physical Therapy; Rural Population
Key Points
Fibromyalgia (FM) is widespread pain across four quadrants of the body, exacerbated by movement. The neuropsychiatric symptoms include fatigue, cognitive effects, and syndromes like depression, headache, and abdominal pain. Clinical trial data demonstrate modest effect sizes for all drug classes.
Transcutaneous electrical nerve stimulation (TENS) works by activating central inhibition and reducing central excitability. It has been shown to reduce movement pain safely and effectively. The FM-TIPS pragmatic trial sought to test the feasibility and effectiveness of adding TENS to standard physical therapy care.
The study team found that applying TENS at an optimal dose is clinically effective for reduction of movement-evoked pain in a real-world setting. They concluded that TENS is a safe, inexpensive, and readily available treatment for FM. They shared several tips regarding selection, design, implementation, and maintaining a network and engagement.
Discussion Themes
While TENS was highly effective at modulating pain and fatigue, it resulted in only small, non-clinically relevant changes in physical function. It can be added to an existing FM treatment plan to help manage the pain flares often caused by exercise.
There was a 99% enrollment rate among rural participants who passed screening, which the team attributed to strong, trusted relationships between patients and their local providers in smaller communities.
Dr. Sluka noted her surprise that the PT-only group did not show significant improvement, underscoring the specific efficacy of TENS in this pragmatic setting.
Team cohesion, maintaining relationships with clinical sites, and remaining open to help and new ideas (e.g., bringing in specialists mid-study) were all essential to the study’s success.
Laura Galuchie, BS
Senior Director, Global Clinical Development
Merck & Co, Inc.
Zachary Smith, MA
Assistant Director, Data Sciences & Analytics
Tufts Center for the Study of Drug Development
Tufts University School of Medicine
Keywords
Data; Optimization; Data Collection; Protocol Design
Key Points
The TransCelerate Initiative – comprising a group of pharmaceutical companies with research and development organizations – seeks to identify key considerations in protocol design to optimize procedures and their frequency, while providing tools and a value-based framework for internal evaluation.
Optimized data collection can improve patient and site experience, reduce complexity, enhance trial execution through better design decisions, and maintain (or potentially improve) quality.
A reasoned approach to data collection is timely, as the volume of data is rising (and increasingly exceeds that which is needed). Additionally, recent ICH and ethics updates emphasize fit-for-purpose data and eliminating unnecessary complexity in clinical trials.
The TransCelerate-Tufts Center for Study of Drug Development (CSDD) partnership was born of the need for continued tangible, actionable evidence to demonstrate the opportunity to optimize data collection.
In 2024, they workshopped a data collection instrument and 14 companies collected and provided data. Tufts CSDD conducted data quality checks to ensure accuracy, validity, and completeness and conducted a comprehensive quality control process. Data analysis took place in early 2025. Endpoints were defined as “core” and “non-core” based on procedure type.
The study sought to quantify the collection and use of non-core and extraneous core protocol data; gather updated benchmarks on the amount, purpose, and impact of data collected in clinical trials; and identify ways to improve protocol design by reducing complexity and easing the burden on sites and participants.
The research team found that the mean number of datapoints collected has exploded in the last decade, up from 930,000 in 2012 to nearly 6 million in 2025. More than 1/3 of all data collected comes from non-core and non-essential procedures.
Non-core and other non-essential procedures contribute to 25-30% of total participant and site burden. Note that there may be other benefits to some non-essential procedures; for example, making sure patients are heard through site questionnaires.
The analysis provides empirical evidence encouraging protocol design discussion and a shift towards more intentional and fit-for-purpose data collection strategies. Planning frameworks and collection assessment tools can reduce unnecessary burden on patients, sites, regulators, and other stakeholders, as well as help sponsors critically assess what data are collected and why.
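As a rough illustration of the growth in datapoints described in the Key Points above, the short sketch below works out the fold increase and the implied average annual growth rate. It uses the two figures quoted in the summary; "nearly 6 million" is rounded to 6,000,000 here, which is an assumption, and a constant year-over-year growth rate is assumed purely for the arithmetic.

```python
# Figures quoted in the summary. "Nearly 6 million" is rounded to
# 6,000,000 for illustration (assumption, not a reported value).
datapoints_2012 = 930_000
datapoints_2025 = 6_000_000
years = 2025 - 2012

# Overall fold increase between the two benchmark years.
fold_increase = datapoints_2025 / datapoints_2012

# Implied average annual growth, assuming constant compounding
# (a simplification; the real trajectory is not given in the summary).
annual_growth = fold_increase ** (1 / years) - 1

print(f"Fold increase: {fold_increase:.1f}x")
print(f"Implied average annual growth: {annual_growth:.1%}")
```

Under these assumptions the mean number of datapoints grew a little more than sixfold, which corresponds to compounding growth in the mid-teens percent per year.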
Discussion Themes
When looking at the factors that contributed to the overcollection of data, the study team found that no one function or department was responsible for the majority of the data points and procedures; the distribution of contributing factors was diffuse. It’s an equal-opportunity problem.
Factors driving non-core data collection included teams’ fear of being asked by regulators for data they hadn’t collected and a lack of on-site experience amongst functional areas. In the latter case, teams are focused on their objectives and lack perspective on how data collection translates into the patient experience, the site experience, or an impact on another function within the group.
In addition to financial costs, there are time costs associated with data collection. The study team found a direct correlation between the amount of data and complexity of a trial – and as complexity increased, the time it took to conduct the trial also increased.
Mark Siedner, MD, MPH
Professor of Medicine, Harvard Medical School
Faculty, Africa Health Research Institute
Keywords
Hypertension; Blood Pressure; Community-Based; Implementation Science; Global Health
Key Points
Hypertension (HTN) is the leading preventable cause of death globally. Dr. Siedner’s research typically revolves around HIV, but he turned his attention to HTN after publishing a study on the convergence of infectious and non-communicable disease epidemics in rural South Africa. Unlike HIV, he noted, HTN control remains poor.
The overarching goals of the IMPACT-BP study were to determine causes for poor HTN control in rural South Africa; co-develop an intervention with partners and end-users to address those causes; and implement and evaluate a novel model of care to improve blood pressure (BP) and increase disease control rates.
They began by designing and determining the acceptability of and conducting a readiness assessment for a community-based hypertension control program. The decision to pursue a community-based care model was informed by decades of successful HIV care programs and innovative HTN care programs.
The program had 3 main elements: Patients monitored their BP at home; community health workers (CHWs) visited patients to collect data, address challenges, and deliver medicines; and nurses managed care remotely with mobile health tools and decision support. Program goals included enhancing patient efficacy and self-empowerment; decongesting clinics and decreasing wait times; and task-shifting away from overburdened nurses.
Once the program had been designed and assessed, the study team conducted a randomized trial to determine its effectiveness. The primary outcome was the change in systolic BP from enrollment to 6 months.
Participants were randomized to 3 arms: Standard of care; “CHW,” which included self-monitoring of BP, home visits and medicine delivery by CHWs, and remote management of BP by nurses; and “eCHW+,” which differed from the “CHW” arm in that BP readings were automatically sent to nurses and the CHWs were less involved.
Though the “eCHW+” arm was slightly more successful, the study team observed 8–10 mm Hg reductions in systolic BP and roughly 30% improvements in BP control in both intervention arms.
This was a multidimensional intervention that sought to address multiple barriers to care. The team faced many real-world challenges, including a community health worker labor dispute, persistent nationwide power outages, destructive weather, and a carjacking spree.
Next, the study team will estimate the fidelity, sustainability, acceptability, and cost-effectiveness of the program. Future directions may include an expansion to multimorbidity care; expansion of the model to urban settings; and transportability to the U.S.
Discussion Themes
When it comes to translating these lessons and insights for care coordination in a U.S. setting, a focus on convenience for healthcare workers and for patients will continue to be crucial.
Though the eCHW+ arm was successful, participant feedback indicated that the human element was central to intervention acceptability. Participants felt they were getting a tremendous amount of support from their community health workers, and some expressed anger at the possibility of the intervention ending.
Dr. Siedner noted that he sees the success of the trial more as proof of principle that there are fundamental steps we can take to improve chronic disease care than the unveiling of a one-size-fits-all model.
With a trusted healthcare system and provider providing the right kind of health education, this study demonstrates that you can get people to engage in treatment of an asymptomatic disease.
Daniel Pach, MD
Charité – Universitätsmedizin Berlin
Stefanie Lysk, MS
Charité – Universitätsmedizin Berlin
Newsenselab GmbH (app manufacturer)
Keywords
Decentralized Trial; App; Hybrid Care Model; Digital Tools
Key Points
The 2019 Digital Healthcare Act enabled 73 million Germans in the statutory health insurance system to receive digital applications (AKA apps) on prescription. Apps may be prescribed by physicians and psychotherapists for specific diagnoses and integrated into a patient’s treatment plan.
As a result, the Digital Health Application (DiGA) framework was established. The DiGA framework sets out several requirements: Apps must have a medical purpose and must not be aimed at primary prevention; must be certified as a medical device; and must demonstrate a positive healthcare effect. There are strict requirements regarding safety, data protection, quality, etc., and integration with hardware (e.g. wearables) is possible.
From December 2020 to March 2022, EMMA – a remote, hybrid trial – was conducted to assess the effect of an app for migraine self-management on migraine days per month. The app included a headache and trigger diary; analysis; self-management techniques, including relaxation exercises, endurance training, and education; and behavior change techniques.
The control arm used a diary-only app, with no capacity for feedback or analysis. Symptom intake was the same as the intervention group.
Though both the control and intervention arms saw an average decrease of roughly 2 migraine days per month, the study team found that use of the app lent no added benefit over the diary-only control. Randomized clinical trials are essential to validate digital therapeutics, and the app was withdrawn from the DiGA directory.
Learnings from EMMA were applied to a new project: the Menstrual Health and Hybrid Care Model for All Young Females (MeMaF). MeMaF has 2 stages: 1) digital self-care: evidence-based self-care instructions, symptom- and cycle-diary, knowledge content, and a medical device. 2) For eligible participants, 6 months of on-site and digital hybrid care: medical care, physiotherapy, nutritional counseling, health psychology sessions, additional app content, and telemedicine.
The researchers concluded that sustainable care for complex chronic conditions will be hybrid, combining digital tools with active involvement of healthcare providers. Researchers need to understand hybrid care in real life and plan trials accordingly – though this increases complexity and demands on clinical trials.
Discussion Themes
The control app may have been too powerful; continuous data tracking may itself act as a powerful intervention. The intervention app also had more entry points for data, so participants in the intervention arm may have been more likely to report symptoms.
There is a need for rigorous and well-designed control conditions. To ensure valid and interpretable study outcomes you need to understand your intervention well.
A post-hoc analysis of engagement with the app found that 85% of participants used the application on a daily basis. Specifically, they interacted with the trigger diaries and the analysis of their trigger factors; the self-care and self-management features of the application weren’t used as much, especially after the first 4 weeks. Dr. Pach noted that this is a common trend in digital studies.
Steps like user verification through insurance and a video consultation with a physician prevented fraudulent participation in the EMMA trial.
Peter Margolis, MD, PhD
Adjunct Professor of Pediatrics
Stanford University School of Medicine
Emeritus Professor of Pediatrics
University of Cincinnati School of Medicine
Former Co-Director of the James M. Anderson Center for Health Systems Excellence
Cincinnati Children’s Hospital Medical Center
Sean C. Dowdy, MD, FACS, FACOG
Chief Value Officer
Robert D. and Patricia E. Kern Associate Dean for Practice Transformation
Professor, Division of Gynecologic Oncology
Mayo Clinic
Sarah Greene, MPH
Consultant and Senior Advisor
The National Academy of Medicine
Keywords
Learning Health System; Healthcare; Knowledge
Key Points
In service to the goal of establishing a Learning Health System (LHS), the National Academy of Medicine developed and shared a foundational set of shared commitments. These principles, published in 2024, sought to define a common cause for all healthcare workers.
To build on the concept of the shared commitments, consider the LHS from a systems perspective: not as a sum of its parts, but as the product of their interaction. Discussions around the feasibility of LHSs often center around individual parts, e.g. data and informatics, incentives, and culture; but an LHS succeeds when these pieces are built and interact as part of a whole.
So, what does a coherent system look like? A learning organization is an organization skilled at creating, acquiring, and transferring knowledge between parts and at modifying its behavior to reflect new knowledge and insights.
Turning an LHS from an idea to a lived reality involves intertwining infrastructure with an adaptive cycle, propelled by a defined population or system; methods for system change and learning; and measurement and evaluation.
The LHS is not intended as a single program or one-size-fits-all structure; it’s a system that learns at multiple levels of scale, from the individual level to the population level. Dr. Margolis shared an example of a patient-physician collaboration that resulted in an electronic health record (EHR)-integrated dosing algorithm utilized across the healthcare system.
The Kern Center at Mayo Clinic demonstrates how the shared commitments come to life within an organization over time. Founded 15 years ago, Kern brings together diverse experts who create and evaluate data-driven solutions that transform healthcare for patients, clinicians, and communities. It seeks to generate both clinical and practical knowledge, emphasizing practice impact over research.
Roughly 4 years ago, the Kern Center was experiencing an existential crisis; a suboptimal focus was negatively impacting its reputation within the institution. The center pivoted to deep practice engagement and a focus on defining and pursuing clinical practice priorities, and has become an essential piece of practice transformation. Resources like the Project Dashboard and HealthLocator are facilitating communication and the diffusion of practice impact.
Every organization faces a design challenge. Organizing in traditional ways means imposing resource constraints based on assumptions about who can contribute and how. When organizations prioritize the capacity for information to flow freely, however, their learning capacity expands; they can overcome constraints by leaning on the amount, quality, and diversity of expertise available to a network. To illustrate this, Dr. Margolis shared a few examples of LH network successes.
The approach to the LHS has changed and adapted since its inception. Ms. Greene shared 10 reasons they think it will endure, to serve as a high-level roadmap to bring organizational leaders to the table and to help distinguish it from other approaches for translating knowledge into action.
Discussion Themes
In its first decade, Kern was established as a research arm; while it was intended to be transformative, it wasn’t well-integrated with clinical practice. A hybrid model, rooted in research and practice, didn’t work either. Transitioning to a focus on practice only has allowed them to start doing transformative work.
The speakers discussed a couple of facets of patient engagement with LHSs. First, during startup, a community of patients, clinicians, researchers, and other stakeholders often works together to identify measures of success. Second, patients who enter LHSs are consented. However, there is always more work to be done in terms of engaging patients in a mutual exchange of information.
The pursuit of standardization can come into conflict with a system’s ability to innovate. Over time, Dr. Dowdy noted, he’s started giving more weight to the unique circumstances of each hospital and their pursuant need for freedom to innovate. They’ve started emphasizing standardized expectations, measured via the Mayo Clinic Value index, as opposed to standardized methods.
Renato Lopes, MD, PhD
Professor of Medicine, Division of Cardiology
Duke University Medical Center
Duke Clinical Research Institute
Keywords
Heart Failure; Chagas Disease; Underserved Population
Key Points
Chagas disease is a condition caused by the parasite Trypanosoma cruzi. One of the complications of Chagas is heart failure (HF). Though Latin America has historically borne the brunt of Chagas-related HF, incidence is increasing globally.
Chagas-related HF has unique characteristics, with lower prevalence of hypertension and diabetes, worse health-related quality of life, and higher hospitalization and mortality rates when compared to other types of HF. In short, though patients with Chagas-related HF are less likely to have classic HF risk factors, their event rates are higher. Furthermore, there’s a lack of clinical evidence to guide their treatment.
Prevention And Reduction of Adverse outcomes in Chagasic Heart failUre Trial Evaluation (PARACHUTE-HF) was a prospective randomized trial evaluating the effect of sacubitril/valsartan compared with enalapril in improving a hierarchical composite endpoint including cardiovascular (CV) death, HF hospitalization, and reduction in NT-proBNP levels – in that order, on the basis of clinical importance.
The study population consisted of 922 patients from 83 sites in 4 Latin American countries: Argentina, Brazil, Colombia, and Mexico. Participants had HF with reduced ejection fraction (HFrEF) and a confirmed diagnosis of Chagas disease.
After 12 weeks, the study team found that sacubitril/valsartan was superior to enalapril with respect to the hierarchical composite endpoint. These results were primarily driven by a 32% reduction in average NT-proBNP levels.
There were some limitations, including COVID-19-related changes in clinical practice that may have affected HF hospitalization rates and the observed treatment effect.
PARACHUTE-HF was the first major randomized trial studying HF treatment in HFrEF patients with Chagas disease. While sacubitril/valsartan has extensive clinical and real-world data supporting its effectiveness for HFrEF, this trial further supports its use in treating this specific population.
There is a need for rigorously conducted clinical trials in Chagas disease to better define the cardiovascular benefit/risk profile of new treatment options for this neglected and high-risk population. This trial provides a model for international collaboration among cardiologists and infectious disease physicians with a shared goal.
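To make the idea of a hierarchical composite endpoint more concrete, the sketch below compares every treatment patient with every control patient, walking the three components in the order listed above (CV death, then HF hospitalization, then NT-proBNP change) and stopping at the first component that separates the pair. The pairwise results are summarized as a win ratio, which is one common way to analyze such endpoints; this summary does not state the trial's actual analysis method, so treat it as an assumption. The `Patient` class, the `margin` parameter, and the toy data are hypothetical, and the sketch ignores event timing and censoring.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Patient:
    cv_death: bool            # tier 1 (most important)
    hf_hospitalization: bool  # tier 2
    ntprobnp_change: float    # tier 3: relative change; negative = reduction


def compare(a: Patient, b: Patient, margin: float = 0.0) -> int:
    """Return 1 if a wins, -1 if b wins, 0 for a tie."""
    # Tier 1: the patient who avoided CV death wins.
    if a.cv_death != b.cv_death:
        return -1 if a.cv_death else 1
    # Tier 2: only reached when tier 1 is tied.
    if a.hf_hospitalization != b.hf_hospitalization:
        return -1 if a.hf_hospitalization else 1
    # Tier 3: the larger NT-proBNP reduction wins (hypothetical margin).
    diff = a.ntprobnp_change - b.ntprobnp_change
    if diff < -margin:
        return 1
    if diff > margin:
        return -1
    return 0


def win_ratio(treatment: list[Patient], control: list[Patient]) -> float:
    """Wins divided by losses over all treatment-control pairs."""
    wins = losses = 0
    for t, c in product(treatment, control):
        result = compare(t, c)
        wins += result == 1
        losses += result == -1
    return wins / losses if losses else float("inf")


# Toy data, invented purely for illustration (not trial data).
treatment = [Patient(False, False, -0.4), Patient(False, True, -0.1)]
control = [Patient(True, False, -0.5), Patient(False, False, -0.2)]
ratio = win_ratio(treatment, control)
print(f"Win ratio: {ratio:.2f}")
```

Because the comparison stops at the first tier that differs, a pair is decided by NT-proBNP only when neither death nor hospitalization separates the two patients. This is how a biomarker component can drive the overall result when clinical events are relatively balanced between arms.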
Discussion Themes
The pathophysiology of Chagas is complex. It includes chronic, low-intensity myocarditis, which can lead to changes in coagulation pathways and thrombotic events; a neurological tropism, which can cause sudden arrhythmias; and an autoimmune component.
Dr. Lopes emphasized the impact that an intentional and informed coordinating center; carefully selected sites; deeply invested PIs; and a strong Data and Safety Monitoring Board (DSMB) had on the success of PARACHUTE-HF.
Charles Bailey, MD, PhD
Department of Pediatrics
Perelman School of Medicine
University of Pennsylvania
Biomedical and Health Informatics
Children’s Hospital of Philadelphia
Keith Marsolo, PhD
Associate Professor
Department of Population Health Sciences
Duke Clinical Research Institute
Duke University School of Medicine
Keywords
PCORnet®; Data; Clinical Research Network; Patient-Centered Research; Common Data Model
Key Points
PCORnet® is a clinical research network that connects communities (namely providers; researchers; patients, caregivers, and advocates) and data (EHR, claims, and patient-reported). It functions as a learning health system to help researchers generate answers that advance health outcomes.
The network is made up of healthcare institutions, from large academic health centers to local community clinics. As of August 2025, PCORnet® had collated data from healthcare encounters in all 50 states, representing over 47 million people. There had been 57 PCORnet® studies and 991 publications supported by PCORnet® resources.
To be useful, data have to be standardized across systems. Frequent data curation and a single language enabled by the PCORnet® Common Data Model (CDM) facilitate this. Data that are in the CDM and currently available for use in research include demographics, diagnoses, and vital signs. Data that may or may not be in the CDM and require additional work for research include immunizations, social determinants of health, and patient-reported outcomes.
Quarterly, the PCORnet® team executes a data curation process. This includes a range of checks looking at data completeness; plausibility; persistence; and conformance to the PCORnet® CDM. Over the last decade, PCORnet® network performance has improved in terms of data mapping and latency.
Researchers can approach the PCORnet® Front Door with both simple univariate and bivariate statistical questions – e.g., how often a particular medication is used within the PCORnet® population – and with prep-to-research queries, which may identify an eligible population and generate some information about how that population behaves.
Once a team is running a PCORnet® study, they can submit queries for study-specific data extracts. This involves identifying a cohort and extracting patient-level data.
In the near future, PCORnet® will include additional data visualization options to increase the ease of navigating complex results. The team is also working on a Query Tools repository that will show what other people have already asked about a given set of data.
Because each study operates on specific variables and general characteristics do not predict specific characteristics, study-focused assessment of data fitness is critical.
The presenters walked attendees through 5 different PCORnet® studies and how they utilized this data infrastructure in their projects.
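The kinds of data curation checks mentioned above (completeness, plausibility, and conformance to a common data model) can be sketched in a few lines. This is a minimal illustration only, not the PCORnet® curation process: the record layout, field names, allowed value set, and plausibility bounds below are all hypothetical.

```python
# Hypothetical patient-level records; field names are illustrative
# and are not taken from the PCORnet CDM specification.
records = [
    {"patient_id": "p1", "sex": "F", "systolic_bp": 128},
    {"patient_id": "p2", "sex": "M", "systolic_bp": None},
    {"patient_id": "p3", "sex": "X?", "systolic_bp": 640},
]

ALLOWED_SEX = {"F", "M", "OT", "UN"}  # hypothetical value set
PLAUSIBLE_SBP = (40, 300)             # hypothetical bounds, mm Hg


def completeness(rows, field):
    """Fraction of rows with a non-missing value for the field."""
    return sum(r.get(field) is not None for r in rows) / len(rows)


def conformance_failures(rows, field, allowed):
    """IDs whose non-missing value falls outside the allowed value set."""
    return [
        r["patient_id"]
        for r in rows
        if r.get(field) is not None and r[field] not in allowed
    ]


def plausibility_failures(rows, field, low, high):
    """IDs whose non-missing value falls outside the plausible range."""
    return [
        r["patient_id"]
        for r in rows
        if r.get(field) is not None and not low <= r[field] <= high
    ]


print("SBP completeness:", round(completeness(records, "systolic_bp"), 2))
print("Sex conformance failures:",
      conformance_failures(records, "sex", ALLOWED_SEX))
print("SBP plausibility failures:",
      plausibility_failures(records, "systolic_bp", *PLAUSIBLE_SBP))
```

Keeping each check as a separate function mirrors the point made in the Key Points: a dataset can pass general checks and still be unfit for a particular study, so study teams can rerun the same checks on the specific variables they depend on.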
Discussion Themes
There is no charge for Front Door queries; they are part of the research engagement process. However, prep-to-research queries are limited to those that can be turned around in a reasonable period of time; they don’t extend to statistical modeling or requests that involve asking sites to get new kinds of data. At the pilot level, researchers can execute custom queries that provide a deeper look at the data.
Linkage partners will depend on the needs of a study. For example, PCORnet® Studies have linked to claims data from Centers for Medicare & Medicaid Services, registries that collect lived experience information, and commercial vendors that perform specialty lab or image testing.
An advantage of using PCORnet® for pragmatic and prospective trials is the connection with the health system, local investigators, and data experts. These can serve as valuable resources during the design, recruitment, and analysis stages.
In an assessment of 10 high-income nations, the United States ranked 10th in healthcare system performance despite maintaining a significant lead in terms of healthcare spending.
The capacity of clinical research to improve healthcare is limited by a lack of representation. Patients who are older; live in rural locations; are uninsured; have co-morbid conditions; belong to minority groups; and are more likely to receive non-standard treatment are all inadequately represented in trials.
The NIH CARE for Health Initiative seeks to address these interrelated challenges. It will develop infrastructure for a clinical research network focused on primary care (PC); establish a foundation for sustained engagement with underrepresented communities; implement innovative study designs; integrate research into routine PC without increasing the burden on providers; and facilitate the adoption of evidence-based research findings.
CARE for Health is based in 6 national research hubs. One is the Primary Care Rural and Frontier Clinical Trials Innovation Center (PRaCTICe), a research network partnering with 300 PC practices serving 7 underrepresented populations across Oregon, Washington, Wyoming, Alaska, Montana, and Idaho.
PRaCTICe utilizes a continuum of community engagement, from outreach to shared leadership. Engagement strategies have included community needs assessment reviews, regional listening sessions, and a new study development process that involves co-designing studies with PRaCTICe partners.
In 2024, BeatPain – a pragmatic, decentralized, NIH Collaboratory Trial – was selected as 1 of 2 trials PRaCTICe would partner with during Year 1. By the presentation date, PRaCTICe had referred 165 patients to the BeatPain team, 95% of whom were rural residents.
Rural populations simultaneously have higher incidence of chronic pain and are less likely to receive evidence-based, nonpharmacologic treatment for it. BeatPain seeks to serve this population by delivering physical therapy (PT) to federally qualified health center patients with lower back pain.
Over the course of their collaboration with PRaCTICe, BeatPain investigators have made strides in terms of localizing the study to partnering communities, building trust with referring providers and patients, and coordinating the end of the trial. Decentralized trial methods hold promise for engaging rural residents and clinics in clinical research.
Discussion Themes
Relationships between research staff and a variety of clinic staff were critical to effective engagement. In one example provided by Dr. Tong, staff helped identify which exercises were most effective when it came to getting providers interested in the referral process. Clinics were not passive recipients, but co-developers.
To deliver PT in a rural setting, the BeatPain team delivered a virtual intervention combining traditional PT, health coaching, motivational interviewing, and pain coping strategies. In some care processes, the hands-on component of PT is essential; less so for chronic pain. Strategic use of technology could expand access to nonpharmacologic care.
Research teams will need to be responsive to shifts in the capacity of rural hospitals and clinics due to funding cuts. This may look like designing interventions that don’t increase the burden on staff; supplying resources; and sharing strategies that clinics can use to be financially sustainable.
IT support proved central to the success of this partnered research. When clinic resources are constrained, the ability to help solve problems related to the electronic health record is essential.
The standard of care in myocardial infarction (MI) management has evolved dramatically in the 20th century, shifting from absolute bed rest to early ambulation to the modern cardiac rehabilitation concept focused on physical activity. This typically includes inpatient mobilization, a 6- to 12-week outpatient program, and a maintenance phase.
Traditional cardiac rehabilitation programs have several limitations, including standardized activities, early withdrawal, high costs, and low enrollment of older adults. The latter factor is increasingly significant, as the contemporary MI patient has also changed; 2/3 of MI patients are over 65 years old.
Despite advancements in acute care, older patients presenting with MI are the highest risk population with the worst prognosis. Older adults also represent the least physically active group, often experiencing functional decline, frailty and disability after MI.
The research team sought to assess a physical activity model with both remote and supervised, in-person, monthly sessions. In the HULK pilot study, this intervention was seen to improve short physical performance battery values 6 months after acute coronary syndrome.
The Physical Activity Intervention in Elderly Patients with Myocardial Infarction (PIpELINe) trial evaluated whether an early, tailored, multi-domain rehabilitation intervention improved outcomes in older patients (65+ years old) admitted to the hospital for MI and with impaired physical performance.
PIpELINe was an investigator-initiated, multicenter, prospective, superiority randomized trial conducted across 7 centers in Italy. The intervention included metabolic risk factors management; diet counseling; and exercise training. The primary outcome was cardiovascular (CV) death or CV-related, unplanned hospitalization.
The research team found that the multi-domain cardiac rehabilitation program reduced CV death or CV-related, unplanned hospitalization in their target population by 8% compared to usual care.
Discussion Themes
One difficulty cited by similar projects is older adults’ reluctance to participate in clinical trials. In this case, the research team found that a monthly, sustained program that provided guidance following an MI was attractive to this population. The main barrier to enrollment was the pandemic.
The impact of the intervention on heart failure and unplanned hospitalization may be more pertinent to this population than CV death, as they pertain to functional decline and quality of life.
The monthly pace renders this intervention low-cost with high availability.
The multidimensionality of the trial makes it difficult to identify which factors drove the effectiveness of the intervention and to what extent. Dr. Tonet suspects that the physical activity component was the most impactful.
P-values are a part of the statistical process of hypothesis and significance testing. They quantify the degree of “surprise” in a finding. The result is dichotomous; a P-value of less than 0.05 is considered statistically significant, while a P-value greater than or equal to 0.05 is not.
0.05 is a useful but somewhat arbitrary cutoff. It was probably first described in Statistical Methods for Research Workers by R. A. Fisher: “It is convenient to take this point as a limit in judging whether a deviation is to be considered significant or not.” According to an anecdote shared by Fisher’s daughter, he identified the cutoff as “convincing enough” based on an informal experiment with a colleague.
Using a single threshold to determine significance can be problematic in real-world settings. Healthcare decision-makers are seeking solutions to multi-dimensional problems, and they care about subgroups. Dr. Huang illustrated this point with an overview of the ABATE Infection trial and her team’s subsequent collaboration with decision-makers.
ABATE Infection was a pragmatic, cluster-randomized trial assessing universal decolonization in non-ICUs. While decolonization wasn’t effective for all non-ICU patients, a post-hoc analysis found that the intervention was highly effective in patients with medical devices. This finding was practically significant and was included in national guidance around decolonization.
In a cost-effectiveness analysis of universal, targeted, or no decolonization for patients with medical devices, the ABATE team found that the optimal outcome was dependent on site circumstances, i.e. prevalence of device use, adherence to targeted decolonization, and financial penalties for bloodstream infection.
For years, experts have questioned the reliance on P-values. On the other hand, there are concerns that rejecting “H1 – H0” could prove to be a slippery slope to data dredging and “post-hoc chicanery.”
The dogma of the P-value may be more applicable to a clinical trial setting than to a pragmatic setting. Establishing the standard of care requires a high level of certainty. Scientific rigor demands rules and a threshold that isn’t affected by cost.
In hospitals, clinical decisions are rarely based on certainty; safe interventions that are low-cost and have a possible benefit are given more consideration. Decision-makers should understand the probability of benefit at a given P-value; circumstances may warrant adoption.
In pragmatic trials, valuable information may include the intervention effect size, the effect for various outcomes and on various subgroups, and information pertinent to implementation: fidelity, reach, cost, etc.
Decision-making is complex and multidimensional. What is important may depend on context, audience, or other situational factors. While P-values can be useful in decision-making, they aren’t the only piece of the puzzle.
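The dichotomy described above can be illustrated with a small calculation. The sketch below converts a z statistic into a two-sided P-value under a standard normal reference distribution and applies the 0.05 cutoff. The three z values are hypothetical and chosen only to show how nearly identical results land on opposite sides of the threshold; they are not taken from ABATE Infection or any other trial.

```python
import math

ALPHA = 0.05  # conventional significance threshold


def two_sided_p(z: float) -> float:
    """Two-sided P-value for a z statistic under a standard normal."""
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def is_significant(z: float, alpha: float = ALPHA) -> bool:
    """Apply the dichotomous rule: significant only if P < alpha."""
    return two_sided_p(z) < alpha


# Hypothetical z statistics on either side of the conventional cutoff.
for z in (1.95, 1.96, 1.97):
    label = "significant" if is_significant(z) else "not significant"
    print(f"z = {z:.2f}  P = {two_sided_p(z):.4f}  -> {label}")
```

A result at z = 1.95 and one at z = 1.97 carry almost the same evidence, yet only the latter clears the cutoff. This is the reason the Key Points suggest reading P-values alongside effect sizes, subgroup effects, and implementation information rather than as a stand-alone verdict.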
Discussion Themes
Changing the reliance on P-values would require a multi-pronged, multi-dimensional approach; sponsors, journals, and other stakeholders each uphold the use of P-values for various reasons. Perhaps the best way to start integrating this perspective shift into the clinical trials ecosystem is to hold the line, routinely seeking and providing information about a variety of outcomes and confidence levels.
If we hold that the underlying but unknown truth is fixed, then our process for arriving at conclusions regarding a treatment’s effectiveness (or whether the treatment has a favorable benefit-risk profile) inherently has important operating characteristics, such as the Type I error rate. If we move away from P-values, we will need to define a design approach that considers these operating characteristics.
Maybe it’s more practical to think about homing in on a standard of care as an iterative process, in the way that human learning is an iterative process; to state that we know something to some degree of certainty, then modify, refine, and get closer to defining these truths.