Grand Rounds June 13, 2025: Fit for Purpose: Improving the Ethical Oversight of Pragmatic Clinical Trials (Stephanie Morain, PhD, MPH; Nancy Kass, ScD; Ruth Faden, PhD, MPH)

Speakers

Stephanie Morain, PhD, MPH
Associate Professor, Berman Institute of Bioethics & Department of Health Policy & Management
Johns Hopkins University

Nancy Kass, ScD
Phoebe Berman Professor of Bioethics & Public Health
Berman Institute of Bioethics & Department of Health Policy & Management
Johns Hopkins University

Ruth Faden, PhD, MPH
Philip Franklin Wagley Professor of Biomedical Ethics
Berman Institute of Bioethics & Department of Health Policy & Management
Johns Hopkins University

Keywords

Comparative Effectiveness Research; Research Ethics; Oversight; Fit for Purpose

Key Points

  • There are 2 key problems surrounding comparative effectiveness research (CER): insufficient evidence to guide key clinical decisions, and challenges with ethical oversight for the trials aimed at guiding those decisions.
  • The vast majority of clinical decisions are still made in the absence of high-quality evidence. For example, fewer than 10% of current recommendations in cardiology are based on the highest quality evidence; expert opinion, on the other hand, guides over 40% of recommendations.
  • There are challenges with ethical oversight in clinical trials, particularly when comparing existing therapies in widespread clinical use. The traditional approach to research ethics holds that research is conceptually different from care, undertaken for the sake of future patients. The oversight system, established in the 1970s, was designed to ensure that the risk/benefit ratio is acceptable; that people know they are taking part in a study, and that it is not equivalent to care; and that people can voluntarily agree (or refuse) to take part.
  • But the reality isn’t so tidy: A clinical trial really might be someone’s “best treatment option.” In the meantime, clinical care has wasted billions of dollars delivering care that was unproven, unnecessary, or in error. Ongoing learning in healthcare settings is essential but must have sound ethical oversight.
  • Clinical research is not all the same; oversight must be matched (“fit”) to the specifics of the study. Sometimes it is, e.g. for studies of experimental, pre-market products with high uncertainty. But one-size-fits-all oversight can be problematic, e.g. for CER on approved products, and excessive oversight results in greater-than-appropriate burdens for researchers and collaborating clinicians.
  • The team at the Berman Institute proposed a new model to improve the “fit for purpose” of research ethics oversight that might be feasible within current regulatory structures. There were two key considerations: participation’s impact on welfare and on autonomy. Oversight bodies should consider how much additional risk and burden participation introduces, and whether the study restricts a decision that would otherwise have been available and meaningful to patients.
  • To achieve “fit for purpose” oversight, observational studies will require minimal oversight due to minimal increased risk compared to usual care, and no restriction of meaningful choice. Randomized trials will require case-by-case evaluation.

Discussion Themes

Informed consent has its roots in paternalism, in which a physician makes all the judgement calls on behalf of a patient. Yet researchers and clinicians must make judgement calls about which of the many decision points involved in care are worth highlighting; to run through all of them risks losing an emphasis on those that have more serious implications.

The team noted that respect for autonomy (like other ethics commitments) is not absolute. It is bounded by other morally important duties, such as promoting welfare and seeking justice. In the clinical context, it’s also bounded by tradeoffs where patients have other interests.

Grand Rounds June 6, 2025: The REDCap Advanced Randomization Module: A Trial Innovation Network Project to Support the Needs of Modern Trials (Jonathan D. Casey, MD, MSc)

Speaker

Jonathan D. Casey, MD, MSc
Assistant Professor, Pulmonary & Critical Care, Vanderbilt University
Co-PI, Vanderbilt Trial Innovation Center
Director, Coordinating Center, Pragmatic Critical Care Research Group

Keywords

Randomization; Randomized Clinical Trial; REDCap; Innovation

Key Points

  • In a traditional randomized trial, trial procedures—including eligibility criteria, group allocation, intervention, and sample size—are specified before the trial and proceed unchanged throughout trial conduct. There are challenges associated with this method. Trials are expensive and take a long time; they’re designed to answer only 1 question; they commonly provide results that are underpowered and are therefore uninformative; they aren’t generalizable; and more.
  • For each of these issues, there are proposed solutions. Platform trials attempt to answer multiple questions within a single trial. Adaptive trials either increase allocation for better-performing groups (response-adaptive) or minimize imbalances (covariate-adjusted). Pragmatic or decentralized trials attempt to address generalizability.
  • To support these advanced trial designs, researchers may need a platform with the capacity to add or drop groups; randomize multiple times within the same record; generate randomization sequences that change allocation probabilities; and more.
  • REDCap is a secure web platform for building and managing online databases, available at no cost to nonprofit, academic, and government organizations. It’s one of the most commonly used platforms for clinical trials, and has included an embedded tool for randomization since 2012.
  • The original REDCap randomization tool embedded randomization into the data collection instrument and allowed one randomization for each study record. Users were able to stratify by sites or other key variables. However, there were gaps: users couldn’t add or drop a group; change the allocation ratio; randomize into multiple domains or at multiple points in time; and more.
  • This was identified as an area for improvement by Dr. Casey’s team in 2023, and the REDCap Advanced Randomization Module was developed. The new tool supports multiple randomizations, blinding, more meta-data, integration with non-REDCap systems, and more.
  • The next steps for this project include a methods and dissemination manuscript, which is currently under review; the incorporation of the randomization functionality into REDCap’s 21 CFR Part 11 validation framework; and the exploration of methods to support randomization within the EHR for pragmatic, EHR-embedded trials.
  • Clinical trials need to be efficient because resources, from personnel to funding, are limited. Innovators in the clinical trials space can target logistical efficiency, regulatory efficiency, or statistical efficiency. The REDCap Advanced Randomization Module focuses on statistical efficiency.
  • Dr. Casey posited that the toughest question for clinical trial efficiency is when to use which design tool. Investigators facing these decision points can reach out to the Trial Innovation Center for guidance.
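
The capabilities listed in the key points above (stratifying by site, changing the allocation ratio, adding or dropping a group) can be illustrated with a short Python sketch. This is not REDCap's implementation or API; the class name, the permuted-block approach, the group labels, and the example ratios are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def permuted_block(ratio, rng):
    """Return one shuffled block that honors the allocation ratio.

    ratio maps group name -> integer weight, e.g. {"A": 1, "B": 1}.
    """
    block = [group for group, weight in ratio.items() for _ in range(weight)]
    rng.shuffle(block)
    return block


class StratifiedRandomizer:
    """Hypothetical helper: keeps one block queue per stratum (e.g., per site)."""

    def __init__(self, ratio, seed=0):
        self.ratio = dict(ratio)
        self.rng = random.Random(seed)
        self.queues = {}  # stratum -> assignments left in the current block

    def set_ratio(self, ratio):
        """Add or drop a group, or change the ratio, for future assignments."""
        self.ratio = dict(ratio)
        self.queues.clear()  # discard partly used blocks (a design choice)

    def assign(self, stratum):
        """Return the next group assignment for a record in this stratum."""
        queue = self.queues.setdefault(stratum, [])
        if not queue:
            queue.extend(permuted_block(self.ratio, self.rng))
        return queue.pop()


# Illustrative use: 1:1 allocation, then drop group B and add group C at 2:1.
randomizer = StratifiedRandomizer({"A": 1, "B": 1}, seed=42)
first = [randomizer.assign("site-1") for _ in range(8)]
randomizer.set_ratio({"A": 2, "C": 1})
later = [randomizer.assign("site-1") for _ in range(9)]
print(Counter(first), Counter(later))
```

Because assignments are drawn from complete blocks, the counts within each stratum match the ratio in force at the time, while the order inside each block stays unpredictable.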

Discussion Themes

The Advanced Randomization Module has received a positive reception from the REDCap user base since its release in October. The group has also helped the REDCap team think through ways to refine the new features. Importantly, there have been no documented disruptions to ongoing projects.

In terms of clinical acceptability, the research team can track how many people are using the modules. How clinicians interpret and accept the advanced randomization features is harder to track, but the team is open to ideas.

There are some risks associated with introducing these complex tools into clinical settings; they increase the complexity of interpretation and can lead to unanticipated errors. Some people have raised concerns that a clinician audience might have a limited understanding of the biases that adaptive randomization can introduce.

The REDCap team is working on some exciting features for the next version of REDCap, including identity verification; data sharing and real-time data availability; and participant-mediated data sharing.

Grand Rounds May 30, 2025: Embedding Randomization Into Clinical Care in Learning Healthcare Systems: Insights From the KP-VACCINATE Trial (Ankeet S. Bhatt, MD, MBA, ScM)

Speaker

Ankeet S. Bhatt, MD, MBA, ScM
Cardiologist, Kaiser Permanente San Francisco Medical Center
Research Scientist, Kaiser Permanente Northern California Division of Research
Adjunct Professor, Stanford University School of Medicine

Keywords

Nudges; Behavioral Science; Vaccination; Influenza; Implementation Science

Key Points

  • Implementation science is the scientific study of methods and strategies that facilitate the uptake of evidence-based practice and research into regular use by practitioners and policymakers.
  • While many implementation science interventions have targeted patients and providers, relatively few have been scaled at the system level with the ability to be replicated in other healthcare delivery systems. Dr. Bhatt’s team was interested in using a cyclical framework to address this gap in the evidence.
  • Behavioral science emerged as a promising area for this project. In recent years, the practice of employing nudges – subtle changes in design that can impact human behavior without restricting choice – has gained traction in the tech sector and in the public eye more broadly.
  • Dr. Bhatt’s team had worked with a group of Danish researchers on a sequence of nationwide clinical trials: NUDGE-FLU, NUDGE-FLU-2, and NUDGE-FLU-CHRONIC. These trials improved influenza vaccination rates in Denmark through randomization to different behavioral science-informed messaging strategies.
  • Vaccination rates in the U.S. have been stagnant for many years, and most systems are not reaching the minimum target of 70% compliance. Dr. Bhatt’s team, inspired by the NUDGE trials’ success, launched the Kaiser Permanente VACCination Improvement with Nudge-based CardiovAscular Targeted Engagement (KP-VACCINATE) Trial.
  • KP-VACCINATE is a fully embedded, randomized clinical trial assessing the effectiveness and timing of cardiovascular-focused nudge communication when it comes to vaccine uptake in a diverse U.S. population. It was developed in collaboration with Danish partners from the NUDGE trials and will be one of the largest clinical trials ever completed.
  • At the time of presentation, KP-VACCINATE was an ongoing 4-arm, 1:1:1:1 randomized clinical trial. The primary outcome is influenza vaccination rates assessed with 6 co-primary outcomes. Patients in Arm 1 receive nudges at Touchpoints 1 & 2; patients in Arm 2 receive nudges at Touchpoint 1; patients in Arm 3 receive nudges at Touchpoint 2; and patients in Arm 4 receive usual care.
  • This model is embedded in an integrated healthcare delivery system and may be readily transferable to other areas of patient, clinician, and health system engagement. Seamless collaboration between the research and operational teams was paramount for stakeholder engagement, implementation, and subsequent analysis.
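
The arm structure described in the key points above amounts to a 2×2 layout over the two touchpoints. The Python sketch below shows one way a 1:1:1:1 allocation of that shape could be generated. The summary does not describe the trial's actual randomization procedure, so the shuffle-and-deal method, the function name, and the patient IDs here are assumptions for illustration only.

```python
import random
from collections import Counter

# Arm definitions taken from the summary: which touchpoints carry a nudge.
ARMS = {
    1: {"touchpoint_1": True, "touchpoint_2": True},
    2: {"touchpoint_1": True, "touchpoint_2": False},
    3: {"touchpoint_1": False, "touchpoint_2": True},
    4: {"touchpoint_1": False, "touchpoint_2": False},  # usual care
}


def randomize(patient_ids, seed=0):
    """Assign patients to the 4 arms in equal (1:1:1:1) numbers.

    Assumption: shuffle the IDs, then deal them out to arms in rotation.
    """
    rng = random.Random(seed)
    ids = list(patient_ids)
    rng.shuffle(ids)
    return {pid: (i % 4) + 1 for i, pid in enumerate(ids)}


# Hypothetical cohort of 1,000 patient IDs.
allocation = randomize(range(1000), seed=7)
arm_sizes = Counter(allocation.values())
nudged_at_t1 = sum(ARMS[arm]["touchpoint_1"] for arm in allocation.values())
nudged_at_t2 = sum(ARMS[arm]["touchpoint_2"] for arm in allocation.values())
print(arm_sizes, nudged_at_t1, nudged_at_t2)
```

Since Arms 1 and 2 both receive the nudge at Touchpoint 1, and Arms 1 and 3 at Touchpoint 2, half of all patients are nudged at each touchpoint, which is what allows timing to be compared alongside nudging overall.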

Discussion Themes

In the interest of pragmatic systemwide inclusion, inclusion criteria were broad and most exclusion criteria pertained to an inability to receive health care system outreach. The team also allowed for local adaptation to a unified protocol.

One barrier to conducting this kind of research is that not all healthcare systems are receptive to A/B randomization. When socializing KP-VACCINATE with operational teams, Dr. Bhatt pointed out that many health systems already conduct this kind of testing, albeit informally. Healthcare operates on incomplete evidence; decisions are made based on an integration of clinician judgement and the data we have on hand. This approach could improve systems’ ability to assess these strategies and integrate them into usual care.

Grand Rounds May 16, 2025: Pivoting Clinical Trials Into a New and Evolving World (Jeffrey A. Spaeder, MD; Adrian F. Hernandez, MD, MHS)

Speakers

Jeffrey A. Spaeder, MD
Chief Medical and Scientific Officer
Senior Vice President
IQVIA

Adrian F. Hernandez, MD, MHS
Executive Director
Duke Clinical Research Institute
Vice Dean
Duke University School of Medicine

Keywords

Clinical Research; Clinical Trials; Industry Trials; Accelerating Research

Key Points

  • The U.S. lags behind in recruitment for cardiology trials. Even though the U.S. has the most sites, significantly fewer patients are enrolled. These trends suggest underlying legal, regulatory, and cost-related barriers, highlighting the need for improved clinical trial infrastructure.
  • Additionally, teams are growing, trust in science in the U.S. is low, and some health trends, such as obesity, hypertension, and diabetes, are going the wrong way. The U.S. leads in avoidable deaths per 100,000 and could fall further behind, all while health care costs are rising.
  • If you are looking at the value of healthcare that is delivered in the U.S., we pay more than other countries but have worse outcomes. Healthcare expenses account for about 28% of the U.S. Federal budget. More than 90% of the volume of prescription drugs is generic, with the exception of immunology, obesity, and diabetes. Retail net pricing of prescription drugs in the U.S. accounts for only 14% of all healthcare expenditures. These factors may lead policymakers to see an imbalance of expense to outcome.
  • For clinical trials, this means an increased focus on using real-world data to make informed decisions, shorten timelines, and inform efficacy and safety. It will also be important to make sure endpoints are clinically relevant and to place more emphasis on strategy-focused research. There may also be more emphasis on improving outcomes through a lens of prevention.
  • In industry, biopharma funding levels are increasing, with an emphasis on later-phase assets, and funding trends have returned to pre-COVID-19 levels. An increasing proportion of studies are initiated by emerging biopharma (EBP) sponsors. While overall measures of complexity have increased modestly, sites have experienced a greater increase – and so have study participants.
  • If there is a desire to increase industry-sponsored or other types of studies at academic medical centers, several issues must be addressed: contracting timelines must be reduced; intellectual property causes significant churn with little value; IRB staffing and responsiveness are critical; cost and efficiency need to represent appropriate value for expense; and principal investigators need to have actual availability for study activities.

Discussion Themes

-The U.S. conducts more clinical trials on rare diseases than other countries. Does that impact the numbers? The data in this presentation came from trials in common chronic cardiovascular diseases. The U.S. is an attractive place to conduct studies because the FDA regulatory timelines are predictable and fast, and the U.S. market is attractive, but enrollment per site is generally lower than it is elsewhere.

-How has IQVIA looked at drivers of cost and efficiency? Cost per patient has increased, driven by the complexity of studies and how imaging and other tools get built into studies. Investigators are trying to make studies more patient-centric, collecting data remotely, making visits fewer. There has also been an increased duration of studies, which is more costly. Time has real cost implications for sponsors.

-Is decentralization of trials a solution? It can be – hybridization can be helpful. The question is how to meet patients where they are. Clinician engagement is still needed, but many of the steps from beginning to end could be decentralized. Decentralization has to be used selectively, in the right situations.

-Is a centralized IRB a solution? They have real value if they have expertise and fast turnaround time – if they are credible, rigorous, and have experienced staff. In some situations, studies use a centralized IRB and then go through an institutional IRB as well.

Grand Rounds May 2, 2025: Fluid REStriction in Heart failure versus liberal fluid UPtake: The FRESH-UP Study (Roland RJ van Kimmenade, MD, PhD)

Speaker

Roland RJ van Kimmenade, MD, PhD
Cardiologist, Radboud University Medical Center
Nijmegen, the Netherlands

Keywords

Heart Failure; Fluids; Cardiology

Key Points

  • We are facing a pandemic of heart failure (HF), with an incidence of 1 – 20 cases of heart failure per 1,000 people. The incidence of HF is stable – if not declining – but mortality remains high, at about 15 – 30% after one year. Attributable health care costs are up, and the prevalence of HF in the general population is increasing.
  • Orthopnea and edema are symptoms of heart failure caused by congestion, or “fluid retention.” This has led to an intuitive assumption that patients should limit their fluid intake to 1.5 – 2 L per day (including beverages, ice cream, soup, and some fruit). Patients are advised to do things like chew gum or suck on frozen grapes to relieve dry mouth and thirst.
  • The literature supporting fluid restriction is limited; as of 2018, the studies supporting it had small sample sizes and heterogeneous populations. They found no differences in mortality and hospitalization. Sets of clinical practice guidelines from 2021 and 2022 also noted that more evidence was needed for fluid restriction, and that existing evidence was low quality.
  • To address this gap in the evidence, the research team used crowdfunding to conduct the Fluid REStriction in Heart failure versus liberal fluid UPtake (FRESH-UP) Study. The randomized, open-label, multicenter clinical trial took place between May 17, 2021 and June 13, 2024.
  • The primary outcome was health status at 3 months, as assessed by the Kansas City Cardiomyopathy Questionnaire – Overall Summary Score (KCCQ-OSS). The key secondary outcome was thirst distress at 3 months, as assessed by the Thirst Distress Scale for patients with HF (TDS-HF).
  • Participants were randomized to one of two arms: liberal fluid intake (no restriction) or fluid restriction, with a maximum of 1500 mL of fluid per day.
  • After three months, the research team found that the difference in KCCQ-OSS (adjusted for baseline scores) was 2.17, with a p value of 0.06. These findings favor liberal fluid intake, but the primary outcome was not met.
  • Thirst distress was higher in the fluid restriction group. No differences were observed for safety events between groups.
  • The FRESH-UP study questions the benefit of fluid restriction in chronic HF.
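
To see why a difference of 2.17 with a p value of 0.06 counts as "not met," it can help to back-calculate the uncertainty those two numbers imply. The Python sketch below does this under stated assumptions: a two-sided p value and an approximately normal test statistic. The resulting standard error and interval are rough approximations derived only from the figures in this summary, not the interval reported by the trial.

```python
from statistics import NormalDist

difference = 2.17  # adjusted KCCQ-OSS difference reported in the summary
p_value = 0.06     # assumed here to be two-sided

std_normal = NormalDist()
z = std_normal.inv_cdf(1 - p_value / 2)  # z statistic implied by the p value
standard_error = difference / z          # back-calculated, approximate
z_95 = std_normal.inv_cdf(0.975)         # critical value for a 95% interval

ci_low = difference - z_95 * standard_error
ci_high = difference + z_95 * standard_error

print(f"implied z = {z:.2f}, implied SE = {standard_error:.2f}")
print(f"approximate 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

Under these assumptions the implied z statistic falls just short of the 95% critical value, so the approximate interval narrowly includes zero. That is consistent with the summary's reading: the point estimate favors liberal intake, but the result does not clear the conventional threshold.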

Discussion Themes

Patient-centered research is key in pragmatic trials; this trial came about because a patient voiced their discomfort and questioned the validity of fluid restriction. The researchers took this as a cue to question a key assumption in their field.

The Dutch Heart Foundation facilitated the crowdfunding, from the legal requirements to the website. The money raised from crowdfunding got them far enough to apply for a second grant.

As in clinical practice, the pragmatic nature of the trial made it difficult to guarantee participant fidelity throughout the entire experiment (though they did monitor intake at week six). The research team conducted a survey of participants afterwards to assess adherence. 93% of the patients reported that they adhered well to their regimens.

“Gaps in Evidence” in clinical guidelines is not a summary of failure, but a source of inspiration!

Grand Rounds April 25, 2025: Automated Response Technology Integrated into EMR and Physician-Patient Communication (Ming Tai-Seale, PhD, MPH)

Speaker

Ming Tai-Seale, PhD, MPH
Professor
Departments of Family Medicine and Medicine (Bioinformatics)
University of California San Diego School of Medicine
Director of UC San Diego Learning Health Systems Science Center

Keywords

Electronic Health Record; Artificial Intelligence; MyChart; Patient Messages; Large Language Models; Clinician Well-Being; Mental Health

Key Points

  • Physician work is increasingly centered around the electronic health record (EHR), which consumes nearly 50% of scheduled clinic time. The volume of patient messages in MyChart increased significantly from 2020 to 2022, and remains much higher than pre-pandemic levels.
  • Research published in Health Affairs and JAMA Network Open suggests that this influx of inbox messages is detrimental to physicians’ well-being. The emotional timbre of messages from patients plays a role, as well; in an analysis of EHR inbasket messages, the research team found messages from patients that contained expletives, vitriol and personal attacks.
  • The research team sought to examine the association between generative AI (GenAI)-drafted replies for patient messages and physician time spent on answering messages. They were also looking at the quality of GenAI-drafted replies for messages dealing with mental health concerns.
  • The team created a prompt within the EHR that gave physicians the option to either use an AI-generated response as a starting point or to start with a blank reply. Messages eligible for responses drafted by GenAI included refills, results, paperwork, and general questions.
  • The pilot study took place from June 16 to July 12, 2023, targeting primary care attending physicians at University of California San Diego. 52 physician volunteers received the intervention; the 70 physicians in the control arm did not.
  • In the pilot study, clinicians who were given the option of a GenAI-drafted reply spent more time reading patient messages. There was no change in average reply time.
  • When clinicians received messages dealing with mental health issues, replies drafted by more recent versions of GenAI had more utility than older versions.
  • The physicians expressed that they valued the GenAI-drafted replies as a compassionate starting point for their communication. They noted areas for improvement, like a robotic tone, and emphasized the continued need for human oversight and intervention.
  • The study team acknowledged potential risks when using large language models (LLMs) in mental health communication. These included a loss of human touch and empathy; overreliance and deskilling; and privacy and security risks.
  • This is an ongoing effort. Next steps include using LLMs to facilitate analyses of qualitative data on electronic patient-clinician communication; triangulating qualitative and quantitative data in the EHR; and aiming for a more comprehensive understanding of mental health communication and how LLMs might improve its quality.

Discussion Themes

Anecdotally, the researchers have heard from physicians that the automated response technology (ART) – which Epic and Microsoft continue to refine – seems to have improved. But issues remain, such as GenAI recommending that patients see clinicians from external hospital systems.

When a modified GenAI-drafted reply was sent to a patient, a disclaimer was included: “Part of this message was generated automatically.” The research team felt that it was important to provide this transparency and disclose to patients when AI contributed to the messaging they received.

Health systems and professional organizations must develop standards advocating for equity in the implementation of and access to these tools.

Grand Rounds April 18, 2025: Colchicine and Spironolactone Post-MI — A Review of the Late-Breaking Results of the CLEAR OASIS 9 Trial (Sanjit S. Jolly, MD, MSc, FRCPC)

Speaker

Sanjit S. Jolly, MD, MSc, FRCPC
Interventional Cardiologist, Hamilton Health Sciences
Stuart Connolly Chair in Cardiology
Professor of Medicine, McMaster University

Keywords

Myocardial Infarction; Cardiology; Heart Failure; Colchicine; Spironolactone

Key Points

  • 20 years ago, an article published in Nature hypothesized that if we could find a cardioprotective drug to lower C-reactive protein (CRP), we could eliminate heart disease.
  • Over the last 2 decades, there have been successes and failures on that front. The Cardiovascular Inflammation Reduction Trial (CIRT) found that methotrexate did not reduce the rate of major adverse cardiovascular events. The Canakinumab Anti-inflammatory Thrombosis Outcomes Study (CANTOS) found that higher doses of canakinumab reduced cardiovascular (CV) death, myocardial infarction (MI), or stroke by over 15% during follow-up.
  • The CoLchicine and spironolactonE in patients with myocardial infARction/SYNERGY Stent Registry – Organization to Assess Strategies of Ischemic Syndromes 9 (CLEAR SYNERGY OASIS 9) Trial was a large, simple, randomized trial of 7,000 patients with ST-elevation myocardial infarction or large non-ST-elevation myocardial infarction. Participants were randomized in a 2×2 factorial; first to either colchicine or placebo, then to either spironolactone or placebo.
  • The primary outcome in the first factorial was the effect of treatment with colchicine vs placebo on a composite of CV death, MI, stroke, or ischemia-driven revascularization (IDR). The co-primary outcomes in the subsequent factorial were the effects of spironolactone vs a placebo on 1) a composite of CV death or heart failure (HF) and 2) a composite of CV death, HF, stroke, or MI.
  • There have been 2 large trials looking at colchicine in cardiovascular disease: COLCOT and LoDoCo2. The CLEAR trial started before the COLCOT results were available; the research team believed that a larger confirmatory trial with more power was needed and that replication of results is important for Class 1 indications in guidelines. CLEAR is the largest trial of colchicine in acute MI, with substantially more events than prior trials.
  • In the first factorial, they found that while CRP was reduced with colchicine, acute and long-term colchicine did not reduce the composite of CV death, MI, stroke, or ischemia-driven revascularization. Colchicine was also associated with an increase in diarrhea, a known side effect of the drug. The research team believes the role of colchicine post-MI remains uncertain.
  • There have been 2 trials looking at Mineralocorticoid Receptor Antagonists (MRA) post-MI in patients without HF: REMINDER and ALBATROSS. Their results left some questions unanswered.
  • In the second factorial, they found that routine spironolactone post-MI did not reduce either co-primary outcome. There was a reduction in new or worsening heart failure, and on-treatment analysis suggests a potential benefit.

Discussion Themes

Outcomes have improved remarkably over the last 20 years, such that HF event rates in a population with predominantly ST-elevation MI are around 3%, a significant drop from the roughly 20% HF event rate in that population 20 years ago. That makes it more difficult to show treatment effects in this population.

The study team developed their inclusion criteria to select for a study population that would be applicable in standard clinical practice. The trial became more pragmatic as the study went on as a result of pivots they made in response to the COVID-19 pandemic.

Key challenges were driven by the COVID-19 pandemic. These included shipping expenses, which spiked significantly; shifting logistics, regarding who would receive the materials; and a pause in recruitment. The study team also came up against varying drug approvals in different locations; this was a global trial, taking place over roughly 70 sites in 11 countries.

Grand Rounds April 11, 2025: Pridopidine in ALS: Results from the Healey Platform Trial (Jeremy M. Shefner, MD, PhD)

Speaker

Jeremy M. Shefner, MD, PhD
Professor of Neurology
Barrow Neurological Institute

Keywords

ALS; Platform Trial; Phase 2 Trial

Key Points

  • Amyotrophic Lateral Sclerosis, or ALS, is an age-related degenerative disease affecting the primary motor neurons in the brain and the alpha motor neurons in the spinal cord. It causes weakness in the limbs, breathing muscles, and facial muscles. Males are affected significantly more than females; 1 in 700 men will die of ALS.
  • The average survival after the onset of the first ALS symptom is about 4 years – a small improvement despite decades of clinical trials. Treatment modestly prolongs survival, function, and quality of life for patients with ALS.
  • In the last decade, 2 drugs have been approved by the U.S. Food and Drug Administration (FDA) for use in patients with ALS. These drugs reached Phase 3 or approval on the basis of small trials; however, subsequent large multinational trials failed to meet their primary endpoints, leading to the withdrawal of one agent and a reduction in the use of another.
  • There is a robust pipeline of drugs in early development and a network of high-quality study sites. However, clinical evaluation is slow, for many reasons: the episodic nature of ALS trials requires new staffing and training for every trial; start-up is slow; participants wish to reduce placebo assignments as much as possible; and costs are excessive.
  • The HEALEY phase-2 platform trial was developed to meet the needs of the ALS community and with the ultimate goal of moving drugs that were more likely to succeed to phase 3. It’s intended to function as a perpetual trial, as patient-friendly as possible, with a broad set of inclusion criteria so that a placebo group can be shared between trials.
  • This approach has several advantages over traditional clinical trials: It reduces the number of participants who must be assigned to the placebo group; it cuts time; and it cuts costs. Since 2020, 7 drugs have gone through the pipeline.
  • One of these drugs is pridopidine. Participants were randomized 3:1 to receive either pridopidine or a placebo for 24 weeks. The primary outcomes were change in the ALS Functional Rating Scale-Revised (ALSFRS-R) total score and survival.
  • While pridopidine was safe and well tolerated, there was no overall effect on the primary endpoint. However, potentially meaningful signals were seen suggesting efficacy in secondary endpoints, particularly measures of motor speech performance. And in participants with definite ALS and baseline time from symptom onset of 18 months or less – a prespecified group characterized by rapid disease progression – pridopidine had a greater impact.
  • While sample sizes were small in the subgroup analyses, the magnitude of the effects seen – as well as their consistency – adds to the body of evidence suggesting that pridopidine may have an impact on patients with ALS. The results support further evaluation of pridopidine in a phase 3 study, which is currently being planned.
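
The placebo-sharing advantage described in the key points above is largely a matter of arithmetic, which the Python sketch below works through. Only the 3:1 ratio comes from the summary; the number of concurrent regimens, the enrollment per regimen, the equal split across regimens, and the function names are hypothetical values chosen for illustration.

```python
from fractions import Fraction


def placebo_share(active_to_placebo):
    """Fraction of participants on placebo for an active:placebo ratio."""
    active, placebo = active_to_placebo
    return Fraction(placebo, active + placebo)


def platform_summary(n_regimens, per_regimen, ratio=(3, 1)):
    """Per-regimen comparison sizes when placebo participants are pooled.

    Assumes equal enrollment per regimen and that every regimen's placebo
    participants can serve as controls for every active group.
    """
    share = placebo_share(ratio)
    placebo_per_regimen = int(per_regimen * share)
    active_per_regimen = per_regimen - placebo_per_regimen
    return {
        "active_per_regimen": active_per_regimen,
        "pooled_placebo": placebo_per_regimen * n_regimens,
        "placebo_chance": share,
    }


# Hypothetical numbers: 4 concurrent regimens, 160 participants in each.
summary = platform_summary(n_regimens=4, per_regimen=160)
standalone_chance = placebo_share((1, 1))  # a conventional 1:1 trial
print(summary, standalone_chance)
```

Under these assumed numbers, each active group of 120 is compared against 160 pooled placebo participants, while an individual's chance of receiving placebo is 1 in 4 rather than the 1 in 2 of a conventional 1:1 trial.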

Discussion Themes

HEALEY is a phase 2 platform trial, so the anticipation is that drugs that show promise will go into larger trials. It’s important to note that these individual arms are as large as or larger than other treatment programs that have led to FDA approval in the past.

The research team’s inclusion criteria allow for a broad selection of participants, with up to 3 years since disease onset and a minimum vital capacity of 50%. These criteria are broader than those of other ALS trials and were developed with ALS patients’ need for access to a broader range of experimental therapeutics in mind.

Participants are drawn to this platform because there is a smaller chance that they’ll receive a placebo. However, they must be on board with receiving any of the drugs being studied. This is a different discussion than people are used to having in trial recruitment, where typically a single drug or drug combination is being studied. Investigators must explain that all of the drugs are viable options.

Adaptation is a challenge; changes to the protocol are a big commitment. After a change is made, data from the past placebo groups can’t be used.

Grand Rounds March 28, 2025: A Cross-Sectional Study of GPT-4–Based Plain Language Translation of Clinical Notes to Improve Patient Comprehension of Disease Course and Management (Anivarya Kumar, BA; Matthew Engelhard, MD, PhD)

Speakers

Anivarya Kumar, BA
Fourth-Year Medical Student
Duke University School of Medicine

Matthew Engelhard, MD, PhD
Assistant Professor, Department of Biostatistics & Bioinformatics
Duke University School of Medicine

Keywords

Health Literacy; Large Language Models; Artificial Intelligence; Electronic Health Records

Key Points

  • Limited health literacy (HL) has tangible effects on morbidity and mortality: it’s associated with higher rates of hospital admission and readmission, medication nonadherence, and all-cause mortality, as well as higher healthcare costs. Nine in 10 adults have limited HL, and HL rates are 2 to 3 times lower in marginalized populations.
  • 71% of patients report accessing their electronic health records (EHRs) to read documentation from their clinical visits, particularly the discharge summary notes (DSNs). But clinical notes have low levels of readability, hindering patients’ ability to engage in shared decision-making.
  • The research team looked at whether a plain language translation of DSNs based on Generative Pre-trained Transformer 4 (GPT-4) could improve patient comprehension of disease course and management.
  • 533 patients, recruited from a pool of EHR users, were randomly assigned 4 DSNs to assess. After reading the DSNs – 2 translated into more accessible language, 2 untranslated – patients answered questions assessing their objective comprehension, subjective comprehension, confidence, and time spent on each DSN.
  • Compared to the untranslated DSNs, objective understanding of the translated DSNs increased by 6.1%; subjective understanding increased 18%; confidence increased 45%; and average time spent with the DSNs decreased 51%.
  • The research team concluded that GPT translation of DSNs significantly improved patient comprehension of disease course and management and optimized time spent reading them. The effect was significantly greater in marginalized populations with historically low health literacy, reducing the gap in comprehension scores between patient populations.
  • Limitations included the use of standardized DSNs as opposed to real-world DSNs; the use of MyChart when enrolling patients, leading to a participant group with a higher baseline HL; and the modest number of Hispanic patients enrolled in the study.
  • Race is a significant and independent factor for HL. Preliminary data suggests that GPT translation can help close this gap. The research team identified this as an area for further study.

Discussion Themes

While discharge instructions alone can be great for providing patients with action items, they lack some of the context that DSNs can provide, lending the patient a more complete understanding of their condition.

The advantages of providing pre-generated materials, as opposed to pointing patients to a large language model (LLM) like ChatGPT for a more interactive explanation of their condition, include the potential for screening by a healthcare professional and less of a burden on the patient.

The study team ended up favoring “semantically-focused” translations over translations that focused solely on simplifying the language or avoiding jargon. When the LLM was asked to focus on semantics, it was more likely to define concepts and their implications.

Health literacy and reading level are not necessarily on par, and patient-centric or accessible language/LLMs are very important to consider. This may require further investigation, e.g. through qualitative interviews.

Grand Rounds March 21, 2025: Generative Artificial Intelligence in Clinical Trials: A Driver of Efficiency and Democratization of Care (Alexander J. “AJ” Blood, MD, MSc)

Speaker

Alexander J. “AJ” Blood, MD, MSc
Associate Director, Accelerator for Clinical Transformation Research Group
Instructor of Medicine at Harvard Medical School
Cardiologist and Intensivist
Brigham and Women’s Hospital

Keywords

Artificial Intelligence; Cost; Large Language Models; Enrollment; Eligibility; Recruitment

Key Points

  • The Accelerator for Clinical Transformation (ACT) is a research group that seeks to use emerging technology to expand access to healthcare and improve the quality and quantity of healthcare delivery. They focus on team-based models and scalable applications.
  • It’s becoming more expensive and time-consuming to move a drug from the clinical trial stage to approval. Patient recruitment is the leading driver of costs in clinical trials, and 55% of trials that fail to complete cite low accrual rate as the reason for study termination. There’s pressure from industry to conduct clinical trials in a way that is faster, cheaper, and better for both the patients and the research environment.
  • ACT conducted a pilot study in which they embedded a large language model (LLM) tool called RECTIFIER into an active clinical trial of patients with heart failure. RECTIFIER is an AI-powered, comprehensive software application able to ask and answer questions about unstructured clinical data. In the pilot, RECTIFIER determined patient eligibility with higher accuracy and specificity than study staff, indicating its potential to streamline screening.
  • LLMs are the engines that power the software. Two key challenges must be considered to use these tools effectively: 1) there’s a context window – a limit to the amount of Electronic Health Record (EHR) data you can pull in; and 2) using LLMs is expensive.
  • Following up on the pilot study results, ACT conducted a prospective randomized controlled trial: The Manual Versus AI-Assisted Clinical Trial Screening Using LLMs (MAPS-LLM) trial. MAPS-LLM compared two methods for analyzing a randomized pool of potentially eligible participants: manual review by study staff, and RECTIFIER-augmented review by study staff. Their primary endpoint was eligibility determination.
  • They found that AI-assisted patient screening using the RECTIFIER system significantly improved eligibility determination and enrollment compared with manual screening in a heart failure clinical trial.
  • ACT concluded that implementing AI-assisted tools like RECTIFIER can enhance clinical trial efficiency, reduce resource utilization, and promote equitable recruitment, potentially leading to faster trial completion and earlier patient access to novel therapies. Generative AI is likely to play a significant role in the future of clinical trials.

Discussion Themes

Study staff in the MAPS-LLM intervention arm were able to redirect the time they would have spent reviewing charts and manually screening the EHR toward contacting and managing patients.

The rate of eligibility between the two arms was equivalent; the difference was that the AI-augmented group was able to assess twice as many potentially eligible patients.
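A small worked example may help show why screening volume matters even when the eligibility rate is unchanged. The numbers below are hypothetical placeholders chosen for easy arithmetic, not results from MAPS-LLM, and the function name `eligible_found` is an assumption for illustration.

```python
def eligible_found(patients_screened: int, eligibility_rate: float) -> float:
    """Expected number of eligible patients identified by a screening arm."""
    return patients_screened * eligibility_rate


# Hypothetical inputs: same eligibility rate in both arms,
# but the AI-assisted arm reviews twice as many charts.
RATE = 0.25
manual = eligible_found(1000, RATE)
assisted = eligible_found(2000, RATE)

print(f"manual: {manual:.0f}, AI-assisted: {assisted:.0f}, "
      f"ratio: {assisted / manual:.1f}")
```

With an equal rate, doubling the number of patients assessed doubles the number of eligible patients identified, which is the mechanism described in the discussion.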

While this tool can do a lot of analytical work, a human element will be essential to utilizing it effectively and to bringing “human intelligence” to participant enrollment.

The ACT team has started to pilot this technology in other disease areas, including cardiology more broadly, endocrinology, oncology, and gastroenterology.