Grand Rounds March 28, 2025: A Cross-Sectional Study of GPT-4–Based Plain Language Translation of Clinical Notes to Improve Patient Comprehension of Disease Course and Management (Anivarya Kumar, BA; Matthew Engelhard, MD, PhD)

Speakers

Anivarya Kumar, BA
Fourth-Year Medical Student
Duke University School of Medicine

Matthew Engelhard, MD, PhD
Assistant Professor, Department of Biostatistics & Bioinformatics
Duke University School of Medicine

Keywords

Health Literacy; Large Language Models; Artificial Intelligence; Electronic Health Records

Key Points

  • Limited health literacy (HL) has tangible effects on morbidity and mortality: it’s associated with higher rates of hospital admission and readmission, medication nonadherence, and all-cause mortality, as well as higher healthcare costs. Nine in 10 adults have limited HL, and rates of adequate HL are 2 to 3 times lower in marginalized populations.
  • 71% of patients report accessing their electronic health records (EHRs) to read documentation from their clinical visits, particularly the discharge summary notes (DSNs). But clinical notes have low levels of readability, hindering patients’ ability to engage in shared decision-making.
  • The research team examined whether a Generative Pre-trained Transformer 4 (GPT-4)-based plain language translation of DSNs could improve patient comprehension of disease course and management.
  • 533 patients, recruited from a pool of EHR users, were randomly assigned 4 DSNs to assess. After reading the DSNs – 2 translated into more accessible language, 2 untranslated – patients answered questions assessing their objective comprehension, subjective comprehension, confidence, and time spent on each DSN.
  • Compared to the untranslated DSNs, objective understanding of the translated DSNs increased by 6.1%; subjective understanding increased 18%; confidence increased 45%; and average time spent with the DSNs decreased 51%.
  • The research team concluded that GPT translation of DSNs significantly improved patient comprehension of disease course and management and optimized time spent reading them. The effect was significantly greater in marginalized populations with historically low health literacy, reducing the gap in comprehension scores between patient populations.
  • Limitations included the use of standardized DSNs as opposed to real-world DSNs; the use of MyChart when enrolling patients, leading to a participant group with a higher baseline HL; and the modest number of Hispanic patients enrolled in the study.
  • Race is a significant and independent factor for HL. Preliminary data suggests that GPT translation can help close this gap. The research team identified this as an area for further study.
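
The low readability of clinical notes mentioned in the key points above can be quantified in several ways. The sketch below computes the Flesch Reading Ease score, one widely used measure in which higher values indicate easier text. The summary does not say which readability measure, if any, the research team used; the syllable counter is a crude vowel-group heuristic, and both sample sentences are invented for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of vowels (assumption, not a dictionary lookup)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / len(sentences)) - 84.6 * (syllables / len(words))


# Hypothetical examples, not taken from the study's discharge summary notes.
clinical = "Patient presents with decompensated systolic cardiomyopathy requiring diuresis."
plain = "Your heart is weak. Take your pills each day."

if __name__ == "__main__":
    print(f"clinical wording: {flesch_reading_ease(clinical):.1f}")
    print(f"plain wording:    {flesch_reading_ease(plain):.1f}")
```

A score like this captures only sentence and word length, which is one reason reading level and health literacy are not interchangeable.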

Discussion Themes

While discharge instructions alone can be great for providing patients with action items, they lack some of the context that DSNs can provide, lending the patient a more complete understanding of their condition.

The advantages of providing pre-generated materials, as opposed to pointing patients to a large language model (LLM) like ChatGPT for a more interactive explanation of their condition, include the potential for screening by a healthcare professional and less of a burden on the patient.

The study team ended up favoring “semantically-focused” translations over translations that focused solely on simplifying the language or avoiding jargon. When the LLM was asked to focus on semantics, it was more likely to define concepts and their implications.

Health literacy and reading level are not necessarily equivalent, so patient-centric, accessible language is important to consider when using LLMs. This may require further investigation, e.g., through qualitative interviews.

Grand Rounds March 21, 2025: Generative Artificial Intelligence in Clinical Trials: A Driver of Efficiency and Democratization of Care (Alexander J. “AJ” Blood, MD, MSc)

Speaker

Alexander J. “AJ” Blood, MD, MSc
Associate Director, Accelerator for Clinical Transformation Research Group
Instructor of Medicine at Harvard Medical School
Cardiologist and Intensivist
Brigham and Women’s Hospital

Keywords

Artificial Intelligence; Cost; Large Language Models; Enrollment; Eligibility; Recruitment

Key Points

  • The Accelerator for Clinical Transformation (ACT) is a research group that seeks to use emerging technology to expand access to healthcare and improve the quality and quantity of healthcare delivery. They focus on team-based models and scalable applications.
  • It’s becoming more expensive and time-consuming to move a drug from the clinical trial stage to approval. Patient recruitment is the leading driver of costs in clinical trials, and 55% of trials that fail to complete cite low accrual rate as the reason for study termination. There’s pressure from industry to conduct clinical trials in a way that is faster, cheaper, and better for both the patients and the research environment.
  • ACT conducted a pilot study in which they embedded a Large Language Model (LLM) tool called RECTIFIER into an active clinical trial of patients with heart failure. RECTIFIER is an AI-powered, comprehensive software application able to ask and answer questions about unstructured clinical data. In the pilot, RECTIFIER determined patient eligibility with higher accuracy and specificity than study staff, indicating its potential to streamline screening.
  • LLMs are the engines that power the software. Two key challenges need to be taken into consideration to use these tools effectively: 1) there’s a context window (a limit to the amount of Electronic Health Record (EHR) data you can pull in); and 2) using LLMs is expensive.
  • Following up on the pilot study results, ACT conducted a prospective randomized controlled trial: The Manual Versus AI-Assisted Clinical Trial Screening Using LLMs (MAPS-LLM) trial. MAPS-LLM compared two methods for analyzing a randomized pool of potentially eligible participants: manual review by study staff, and RECTIFIER-augmented review by study staff. Their primary endpoint was eligibility determination.
  • They found that AI-assisted patient screening using the RECTIFIER system significantly improved eligibility determination and enrollment compared with manual screening in a heart failure clinical trial.
  • ACT concluded that implementing AI-assisted tools like RECTIFIER can enhance clinical trial efficiency, reduce resource utilization, and promote equitable recruitment, potentially leading to faster trial completion and earlier patient access to novel therapies. Generative AI is likely to play a significant role in the future of clinical trials.
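
The two constraints named above, a limited context window and per-token cost, can be illustrated with a minimal sketch. It is not RECTIFIER's implementation, which the summary does not describe. The function names, the whitespace-based token estimate, and the price parameter are all assumptions made for illustration; real tokenizers and pricing differ.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word (assumption)."""
    return len(text.split())


def pack_notes(notes: List[str], max_tokens: int) -> List[List[str]]:
    """Greedily group notes so each group's estimated size fits the context budget.

    A single note larger than the budget is placed in a group of its own;
    a real system would need to truncate or summarize it.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    used = 0
    for note in notes:
        size = estimate_tokens(note)
        if current and used + size > max_tokens:
            groups.append(current)
            current, used = [], 0
        current.append(note)
        used += size
    if current:
        groups.append(current)
    return groups


def estimate_cost(notes: List[str], price_per_1k_tokens: float) -> float:
    """Estimated input cost for sending all notes once, at a hypothetical price."""
    total = sum(estimate_tokens(n) for n in notes)
    return total / 1000 * price_per_1k_tokens


if __name__ == "__main__":
    # Placeholder strings standing in for unstructured EHR notes.
    notes = ["a b c", "d e", "f g h i"]
    print(pack_notes(notes, max_tokens=5))
    print(estimate_cost(notes, price_per_1k_tokens=2.0))
```

Each group would become one model call, so a smaller context budget means more calls, and the cost estimate grows with the total volume of chart text reviewed.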

Discussion Themes

Study staff in the MAPS-LLM intervention arm were able to redirect the time they would otherwise have spent reviewing charts and manually screening the EHR towards contacting and managing patients.

The rate of eligibility was equivalent between the two arms; the difference was that the AI-augmented group was able to assess twice as many potentially eligible patients.

While this tool can do a lot of analytical work, a human element will be essential to utilizing it effectively and to bringing “human intelligence” to participant enrollment.

The ACT team has started to pilot this technology in other disease areas, including cardiology more broadly, endocrinology, oncology, and gastroenterology.

March 20, 2025: iPATH Team Explores Integration of Artificial Intelligence Into Analysis of Qualitative Data

Dr. Sara Singer, principal investigator for iPATH

Researchers from iPATH, an NIH Collaboratory Trial, described key considerations for integrating artificial intelligence tools into analyses of qualitative data.

The report was posted this month on the AcademyHealth Blog.

The iPATH trial, led by principal investigator Sara Singer at Stanford University, will test the implementation of a practice transformation strategy for type 2 diabetes in federally qualified health centers in California, Massachusetts, Ohio, and Puerto Rico. In the first phase of the project, the study team is refining the strategy by conducting case studies with 12 health centers to identify organizational conditions and processes that promote or impede the effectiveness of diabetes care.

Interviews for the 12 case studies generated 170 hours of qualitative data plus related materials. The study team explored how rapidly evolving artificial intelligence tools, such as large language models, might enhance researchers’ handling of large qualitative datasets, including labor-intensive and time-consuming processes of transcription, coding, and analysis.

Read the full report.

iPATH is supported by a grant award from the National Institute on Minority Health and Health Disparities. Learn more about iPATH.

Grand Rounds February 21, 2025: Texting for Behavior Change: Lessons Learned Across 2 Interventions to Improve Chronic Care Management (Michael Ho, MD, PhD; Sheana Bull, PhD)

Speakers

Michael Ho, MD, PhD
Kaiser Permanente Colorado

Sheana Bull, PhD
University of Colorado School of Public Health

Keywords

Text Messaging; Artificial Intelligence; Chatbots; Health Behaviors

Key Points

  • Ample evidence now exists demonstrating the benefit of using text messaging in support of health behavior and access to care. Text messaging is ubiquitous, which increases reach; theory-informed message design is impactful; and texting can improve adherence to medical appointments and health behaviors.
  • Two NIH Collaboratory Trials, Nudge and Chat 4 Heart Health (C4HH), test the effectiveness of text messaging interventions to support behavior change. Nudge randomized patients to receive usual care, generic texts, behavioral texts, or behavioral texts plus chatbot messages. Their primary outcome was medication adherence.
  • C4HH, the subsequent trial, is randomizing patients to receive a generic text message curriculum; an AI chatbot messaging curriculum; or AI chatbot messages plus proactive pharmacist support. Their primary outcome is cardiovascular risk factors, as measured by the American Heart Association’s “Life’s Essential 8” adherence.
  • Nudge used an opt-out consent approach, whereas C4HH used an opt-in consent approach. In Nudge, the research team noted, patients who identified as Black or Hispanic and those who were primary Spanish speakers were more likely to remain in the study. An opt-out approach in the appropriate context may be a way to diversify clinical trial populations and improve external validity of results.
  • The use of AI chatbots allows users to generate questions in their own words and the system to retrieve a response from a closed, curated library.
  • Message engagement is key to text messaging interventions. Participants in the Nudge study who were randomized to optimized texts had more questions. Questions were related to medications, refill logistics, and costs. The study team hypothesizes that the optimized texts may have led to greater patient engagement, and therefore more questions about their medications.
  • Over 12 months, the Nudge study found no significant difference in the rates of prescription refills between the 3 intervention arms and usual care. C4HH is ongoing, and will send a higher volume of messages in an effort to engage patients and change patient behavior.
  • So far, the top 5 topics in messages initiated by C4HH participants have been healthy eating, physical activity, managing cholesterol, quitting smoking, and medication management.
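
The idea of answering free-text questions from a closed, curated library can be sketched as follows. This is an illustrative stand-in, not the chatbot used in Nudge or C4HH, whose matching method is not described here. It assumes simple word-overlap (Jaccard) matching, and the library entries, response labels, and threshold are hypothetical placeholders.

```python
import re
from typing import Dict, Optional, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, library: Dict[str, str], min_overlap: float = 0.2) -> Optional[str]:
    """Return the curated response whose key best overlaps the question.

    Returns None when nothing clears the threshold, so the caller can fall
    back to a human (for example, a pharmacist) instead of generating text.
    """
    q_words = tokenize(question)
    best_key, best_score = None, 0.0
    for key in library:
        k_words = tokenize(key)
        union = q_words | k_words
        score = len(q_words & k_words) / len(union) if union else 0.0
        if score > best_score:
            best_key, best_score = key, score
    if best_key is None or best_score < min_overlap:
        return None
    return library[best_key]


# Hypothetical curated library: keys are canonical questions, values are
# placeholder labels standing in for reviewed response text.
LIBRARY = {
    "how do i refill my medication": "REFILL_INFO",
    "what foods are heart healthy": "HEALTHY_EATING_INFO",
}

if __name__ == "__main__":
    print(retrieve("How can I refill my medication?", LIBRARY))
    print(retrieve("Tell me a joke", LIBRARY))
```

Because every possible reply is drawn from the library, the responses can be reviewed in advance, at the cost of returning no answer when a question falls outside the curated topics.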

Discussion Themes

The study team had to be very careful to ensure that patient health data, including cell phone numbers and the messages sent, were encrypted. Vendors and phone carriers were not able to access this data and it was not stored on their servers.

One of the challenges they encountered was that their systems weren’t integrated into the health care organizations’ pharmacies or electronic health records. The integration piece will be key to any future sustainability.

Because technology evolves significantly over the course of, say, a 5-year study, developing the skill set to use interactive interventions or a SMART (sequential, multiple assignment, randomized trial) design could be helpful for investigators interested in conducting research in this area.

Grand Rounds March 21, 2025: Generative Artificial Intelligence in Clinical Trials: A Driver of Efficiency and Democratization of Care (Alexander J. “AJ” Blood, MD, MSc)

Date: Friday, March 21, 2025, 1:00-2:00 p.m. ET

Grand Rounds December 6, 2024: Opportunities and Challenges in the Use of Large Language Models for Post-Marketing Surveillance of Medical Products (Michael E. Matheny, MD, MS, MPH)

Speaker

Michael E. Matheny, MD, MS, MPH
Director, Center for Improving the Public’s Health Through Informatics
Professor of Biomedical Informatics, Biostatistics, and Medicine
Vanderbilt University Medical Center
Staff Scientist, Geriatrics Research Education and Clinical Care Service
Associate Director, VA ORD VINCI
Tennessee Valley Healthcare System VA

Keywords

Artificial Intelligence; Large Language Models; Surveillance; Medical Products

Key Points

  • Increasingly, leaders in many disciplines are finding new applications for Artificial Intelligence (AI). Within healthcare, this technology is being used to support clinical decision-making, image processing, drug discovery, and clinical trials, and as ambient and autonomous AI.
  • Large Language Models (LLMs) are a subset of generative AI. Since 2012, LLMs have emerged as a promising new technology with rapid growth, evolution of capacity and reach, and many potential applications in healthcare and clinical research.
  • There is significant interest in using LLMs to assist with patient trial matching, clinical trial planning, and the development of trial protocols and consent documents.
  • Another key area that LLMs could provide support in is medical product safety surveillance, with potential applications in adverse event detection, probabilistic phenotyping, and information synthesis.
  • The post-marketing surveillance space utilizes an ecosystem of healthcare data, imaging, radiology reports, insurance, structured data, medical literature, and social media. These sources could be integrated to conduct LLM reasoning and extractions.
  • Key challenges in the safe and effective use of LLMs for this purpose include the lack of evaluation for medical product surveillance, the complexities of prompt engineering, hallucination risk (i.e., false positives), and the fact that models evolve over time, which makes stable performance estimates difficult.
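
One way to make the evaluation challenge above concrete is to compare LLM-flagged adverse events against a manually reviewed reference set, treating hallucinated events as false positives. The sketch below is a generic precision and recall calculation, not a method described in the talk; the function name and the counts in the usage example are hypothetical.

```python
from typing import Tuple


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision and recall for flagged events against a reference standard.

    true_pos:  flagged events confirmed by manual review
    false_pos: flagged events not supported by the record (e.g., hallucinations)
    false_neg: real events the model failed to flag
    """
    flagged = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    precision, recall = precision_recall(true_pos=8, false_pos=2, false_neg=4)
    print(f"precision={precision:.2f} recall={recall:.2f}")
```

Repeating such a check whenever the underlying model is updated is one way to notice the performance drift that evolving models can introduce.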

Discussion Themes

A segment of the clinical workforce could be trained to be “super users,” partnering with development teams in order to make sure that these tools are working appropriately in a clinical environment.

There is substantial interest in using LLMs to support clinical decision-making. However, studies have shown that the quality of the AI output can influence the performance of the clinicians. Especially in high-risk clinical environments, any drift in those algorithms could result in adverse clinical outcomes. The life cycle approach to conceptualization, development, implementation, surveillance, and maintenance will be necessary to achieve and maintain performance.

July 1, 2024: Latest Podcast Features Michael Pencina and Brian Anderson of Coalition for Health AI

In a new episode of our Rethinking Clinical Trials podcast, Drs. Michael Pencina and Brian Anderson of the Coalition for Health AI speak with host Dr. Adrian Hernandez about public-private partnerships in a trustworthy health AI ecosystem. Pencina and Anderson presented on their experiences during the March 8 session of PCT Grand Rounds.

Listen and subscribe to the podcast on SoundCloud or Apple Podcasts, and view the full March 8 PCT Grand Rounds webinar.

March 6, 2024: In This Week’s PCT Grand Rounds, Public-Private Partnerships in Health AI

In this Friday’s PCT Grand Rounds, Michael Pencina of Duke University will present “Public-Private Partnerships in the Trustworthy Health AI Ecosystem.”

The Grand Rounds session will be held on Friday, March 8, 2024, at 1:00 pm eastern.

Pencina is a professor of biostatistics and bioinformatics and the vice dean for data science in the Duke University School of Medicine. He is the director of the university’s Duke AI Health initiative and the chief data scientist for Duke Health.

Join the online meeting.

August 2, 2023: Want to Play a Game? AI and Machine Learning in This Week’s PCT Grand Rounds

Dr. Eric Perakslis

In this Friday’s PCT Grand Rounds, Eric Perakslis of Duke University will present “AI & ML: Want to Play a Game?”

The Grand Rounds session will be held on Friday, August 4, 2023, at 1:00 pm eastern.

Perakslis is a professor in population health sciences and the chief research technology strategist in the Duke University School of Medicine.

Join the online meeting.

May 4, 2022: Ethics Core Members Pen Guest Editorial for AJOB Focus on Machine Learning in Healthcare

In a guest editorial in the American Journal of Bioethics, members of the NIH Pragmatic Trials Collaboratory’s Ethics and Regulatory Core introduced the issue’s target article and peer commentaries on artificial intelligence and machine learning in healthcare. Prof. Kayte Spector-Bagdady and Drs. Vasiliki Rahimzadeh and Kaitlyn Jaffe, who are Core members, were joined by coauthor Dr. Jonathan Moreno in writing the editorial.

The target article of the themed collection proposes a research ethics framework for the clinical translation of healthcare machine learning. In several peer commentaries accompanying the article, experts offer their perspectives on the proposed framework, including critiques of “the insufficiency of current ethics and regulatory solutions to adequately protect communities at higher risk for [machine learning] bias.”

Read the full editorial, “Promoting Ethical Deployment of Artificial Intelligence and Machine Learning in Healthcare.” Learn more about our Ethics and Regulatory Core.