Large Language Models Archives - Rethinking Clinical Trials

Grand Rounds April 25, 2025: Automated Response Technology Integrated into EMR and Physician-Patient Communication (Ming Tai-Seale, PhD, MPH)

Posted on May 2, 2025 by Sophia Ramirez

Speaker

Ming Tai-Seale, PhD, MPH
Professor
Departments of Family Medicine and Medicine (Bioinformatics)
University of California San Diego School of Medicine
Director of UC San Diego Learning Health Systems Science Center

Keywords

Electronic Health Record; Artificial Intelligence; MyChart; Patient Messages; Large Language Models; Clinician Well-Being; Mental Health

Key Points

Physician work is increasingly centered around the electronic health record (EHR), which consumes nearly 50% of scheduled clinic time. The volume of patient messages in MyChart increased significantly from 2020 to 2022, and remains much higher than pre-pandemic levels.

Research published in Health Affairs and JAMA Network Open suggests that this influx of inbox messages is detrimental to physicians’ well-being. The emotional timbre of messages from patients plays a role, as well; in an analysis of EHR inbasket messages, the research team found messages from patients that contained expletives, vitriol and personal attacks.

The research team sought to examine the association between generative AI (GenAI)-drafted replies for patient messages and physician time spent on answering messages. They also examined the quality of GenAI-drafted replies for messages dealing with mental health concerns.

The team created a prompt within the EHR that gave physicians the option to either use an AI-generated response as a starting point or to start with a blank reply. Messages eligible for responses drafted by GenAI included refills, results, paperwork, and general questions.

The pilot study took place from June 16 to July 12, 2023, targeting primary care attending physicians at University of California San Diego. 52 physician volunteers received the intervention; the 70 physicians in the control arm did not.

In the pilot study, clinicians who were given the option of a GenAI-drafted reply spent more time reading patient messages. There was no change in average reply time.

When clinicians received messages dealing with mental health issues, replies drafted by more recent versions of GenAI had more utility than older versions.

The physicians expressed that they valued the GenAI-drafted replies as a compassionate starting point for their communication. They noted areas for improvement, like a robotic tone, and emphasized the continued need for human oversight and intervention.

The study team acknowledged potential risks when using large language models (LLMs) in mental health communication. These included a loss of human touch and empathy; overreliance and deskilling; and privacy and security risks.

This is an ongoing effort. Next steps include using LLMs to facilitate analyses of qualitative data on electronic patient-clinician communication; triangulating qualitative and quantitative data in the EHR; and aiming for a more comprehensive understanding of mental health communication and how LLMs might improve its quality.

Discussion Themes

Anecdotally, the researchers have heard from physicians that automated response technology (ART), which Epic and Microsoft continue to refine, seems to have improved. But issues remain, such as GenAI recommending that patients see clinicians from external hospital systems.

When a modified GenAI-drafted reply was sent to a patient, a disclaimer was included: “Part of this message was generated automatically.” The research team felt that it was important to provide this transparency and disclose to patients when AI contributed to the messaging they received.
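
The disclosure step described above can be sketched in a few lines. This is a minimal illustration, not the EHR vendor's implementation: the function name `finalize_reply` and the flag `started_from_ai_draft` are hypothetical, and only the disclaimer wording comes from the talk.

```python
# Disclaimer wording as quoted in the talk summary.
DISCLAIMER = "Part of this message was generated automatically."


def finalize_reply(reply_text: str, started_from_ai_draft: bool) -> str:
    """Append the disclosure line only when the reply began as an AI draft.

    Hypothetical helper: the real system's logic is not described in the source.
    """
    if started_from_ai_draft and DISCLAIMER not in reply_text:
        # Trim trailing whitespace so the disclaimer sits on its own paragraph.
        return f"{reply_text.rstrip()}\n\n{DISCLAIMER}"
    return reply_text


if __name__ == "__main__":
    print(finalize_reply("Your refill has been sent to the pharmacy.", True))
```

Checking for an existing disclaimer keeps the line from being appended twice if a reply passes through the step more than once.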

Health systems and professional organizations must develop standards advocating for equity in the implementation of and access to these tools.

Grand Rounds March 28, 2025: A Cross-Sectional Study of GPT-4–Based Plain Language Translation of Clinical Notes to Improve Patient Comprehension of Disease Course and Management (Anivarya Kumar, BA; Matthew Engelhard, MD, PhD)

Posted on April 3, 2025 by Sophia Ramirez

Speakers

Anivarya Kumar, BA
Fourth-Year Medical Student
Duke University School of Medicine

Matthew Engelhard, MD, PhD
Assistant Professor, Department of Biostatistics & Bioinformatics
Duke University School of Medicine

Keywords

Health Literacy; Large Language Models; Artificial Intelligence; Electronic Health Records

Key Points

Limited health literacy (HL) has tangible effects on morbidity and mortality: it’s associated with higher rates of hospital admissions and readmissions; medication nonadherence; healthcare costs; and all-cause mortality. Nine in 10 adults have limited HL, and HL rates are 2 to 3 times lower in marginalized populations.

71% of patients report accessing their electronic health records (EHRs) to read documentation from their clinical visits, particularly the discharge summary notes (DSNs). But clinical notes have low levels of readability, hindering patients’ ability to engage in shared decision-making.

The research team looked at whether a Generative Pre-trained Transformer 4 (GPT-4)-based plain language translation of DSNs could improve patient comprehension of disease course and management.

533 patients, recruited from a pool of EHR users, were randomly assigned 4 DSNs to assess. After reading the DSNs – 2 translated into more accessible language, 2 untranslated – patients answered questions assessing their objective comprehension, subjective comprehension, confidence, and time spent on each DSN.

Compared to the untranslated DSNs, objective understanding of the translated DSNs increased by 6.1%; subjective understanding increased 18%; confidence increased 45%; and average time spent with the DSNs decreased 51%.

The research team concluded that GPT translation of DSNs significantly improved patient comprehension of disease course and management and optimized time spent reading them. The effect was significantly greater in marginalized populations with historically low health literacy, reducing the gap in comprehension scores between patient populations.

Limitations included the use of standardized DSNs as opposed to real-world DSNs; the use of MyChart when enrolling patients, leading to a participant group with a higher baseline HL; and the modest number of Hispanic patients enrolled in the study.

Race is a significant and independent factor for HL. Preliminary data suggests that GPT translation can help close this gap. The research team identified this as an area for further study.

Discussion Themes

While discharge instructions alone can be great for providing patients with action items, they lack some of the context that DSNs can provide, lending the patient a more complete understanding of their condition.

The advantages of providing pre-generated materials, as opposed to pointing patients to a large language model (LLM) like ChatGPT for a more interactive explanation of their condition, include the potential for screening by a healthcare professional and less of a burden on the patient.

The study team ended up favoring “semantically-focused” translations over translations that focused solely on simplifying the language or avoiding jargon. When the LLM was asked to focus on semantics, it was more likely to define concepts and their implications.

Health literacy and reading level are not necessarily on par, and patient-centric or accessible language/LLMs are very important to consider. This may require further investigation, e.g. through qualitative interviews.

Grand Rounds March 21, 2025: Generative Artificial Intelligence in Clinical Trials: A Driver of Efficiency and Democratization of Care (Alexander J. “AJ” Blood, MD, MSc)

Posted on March 27, 2025 by Sophia Ramirez

Speaker

Alexander J. “AJ” Blood, MD, MSc
Associate Director, Accelerator for Clinical Transformation Research Group
Instructor of Medicine at Harvard Medical School
Cardiologist and Intensivist
Brigham and Women’s Hospital

Keywords

Artificial Intelligence; Cost; Large Language Models; Enrollment; Eligibility; Recruitment

Key Points

The Accelerator for Clinical Transformation (ACT) is a research group that seeks to use emerging technology to expand access to healthcare and improve the quality and quantity of healthcare delivery. They focus on team-based models and scalable applications.

It’s becoming more expensive and time-consuming to move a drug from the clinical trial stage to approval. Patient recruitment is the leading driver of costs in clinical trials, and 55% of trials that fail to complete cite low accrual rate as the reason for study termination. There’s pressure from industry to conduct clinical trials in a way that is faster, cheaper, and better for both the patients and the research environment.

ACT conducted a pilot study in which they embedded a large language model (LLM) tool called RECTIFIER into an active clinical trial of patients with heart failure. RECTIFIER is an AI-powered, comprehensive software application able to ask and answer questions about unstructured clinical data. In that pilot, RECTIFIER determined patient eligibility with higher accuracy and specificity than study staff, indicating its potential to streamline screening.

LLMs are the engines that power the software. Two key challenges need to be taken into consideration to use these tools effectively: 1) there’s a context window, a limit to the amount of Electronic Health Record (EHR) data you can pull in; and 2) using LLMs is expensive.

Following up on the pilot study results, ACT conducted a prospective randomized controlled trial: The Manual Versus AI-Assisted Clinical Trial Screening Using LLMs (MAPS-LLM) trial. MAPS-LLM compared two methods for analyzing a randomized pool of potentially eligible participants: manual review by study staff, and RECTIFIER-augmented review by study staff. Their primary endpoint was eligibility determination.
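
The two constraints noted above (a limit on how much EHR text fits in one request, and a per-use cost) can be illustrated with a rough budgeting sketch. This is not how RECTIFIER works; the source does not describe its internals. The 4-characters-per-token heuristic, the token budget, the price, and all function names below are assumptions chosen only for illustration.

```python
# Assumptions for illustration only: a rough characters-per-token heuristic,
# a hypothetical context budget, and a hypothetical price per 1,000 tokens.
CHARS_PER_TOKEN = 4
CONTEXT_BUDGET_TOKENS = 8_000
PRICE_PER_1K_TOKENS_USD = 0.01


def estimate_tokens(text: str) -> int:
    """Approximate token count; a real system would use the model's tokenizer."""
    return max(1, -(-len(text) // CHARS_PER_TOKEN))  # ceiling division


def select_notes(notes: list[str], budget: int = CONTEXT_BUDGET_TOKENS) -> list[str]:
    """Keep notes in the given order until the next one would exceed the budget."""
    selected: list[str] = []
    used = 0
    for note in notes:
        cost = estimate_tokens(note)
        if used + cost > budget:
            break
        selected.append(note)
        used += cost
    return selected


def estimate_cost_usd(notes: list[str], price_per_1k: float = PRICE_PER_1K_TOKENS_USD) -> float:
    """Estimated input cost of sending the selected notes in one request."""
    total_tokens = sum(estimate_tokens(n) for n in notes)
    return total_tokens / 1000 * price_per_1k


if __name__ == "__main__":
    chart = ["a" * 4_000, "b" * 4_000, "c" * 40_000]  # placeholder note text
    kept = select_notes(chart)
    print(len(kept), "notes kept; estimated cost USD", estimate_cost_usd(kept))
```

The sketch simply truncates in order; a production tool would more likely rank or retrieve the most relevant passages before filling the budget.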

They found that AI-assisted patient screening using the RECTIFIER system significantly improved eligibility determination and enrollment compared with manual screening in a heart failure clinical trial.

ACT concluded that implementing AI-assisted tools like RECTIFIER can enhance clinical trial efficiency, reduce resource utilization, and promote equitable recruitment, potentially leading to faster trial completion and earlier patient access to novel therapies. Generative AI is likely to play a significant role in the future of clinical trials.

Discussion Themes

Study staff in the MAPS-LLM intervention arm were able to redirect the time they would have spent reviewing charts and manually screening the EHR towards contacting and managing patients.

The rate of eligibility between the two arms was equivalent; the difference was that the AI-augmented group was able to assess twice as many potentially eligible patients.
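
The arithmetic behind this point is simple: at an equal eligibility rate, screening twice as many charts surfaces twice as many eligible patients. The numbers below are hypothetical placeholders, not trial results, and the function name is illustrative.

```python
def eligible_found(patients_screened: int, eligibility_rate: float) -> float:
    """Expected number of eligible patients identified at a given rate."""
    return patients_screened * eligibility_rate


# Hypothetical values for illustration; these are NOT figures from MAPS-LLM.
RATE = 0.20              # same eligibility rate in both arms
MANUAL_SCREENED = 100    # charts reviewed manually
ASSISTED_SCREENED = 200  # twice as many charts reviewed with AI assistance

manual = eligible_found(MANUAL_SCREENED, RATE)
assisted = eligible_found(ASSISTED_SCREENED, RATE)

if __name__ == "__main__":
    print(f"manual: {manual:.0f} eligible, assisted: {assisted:.0f} eligible")
    print(f"ratio: {assisted / manual:.1f}x")
```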

While this tool can do a lot of analytical work, a human element will be essential to utilizing it effectively and to bringing “human intelligence” to participant enrollment.

The ACT team has started to pilot this technology in other disease areas, including cardiology more broadly, endocrinology, oncology, and gastroenterology.

Grand Rounds December 6, 2024: Opportunities and Challenges in the Use of Large Language Models for Post-Marketing Surveillance of Medical Products (Michael E. Matheny, MD, MS, MPH)

Posted on December 11, 2024 by Sophia Ramirez

Speaker

Michael E. Matheny, MD, MS, MPH
Director, Center for Improving the Public’s Health Through Informatics
Professor of Biomedical Informatics, Biostatistics, and Medicine
Vanderbilt University Medical Center
Staff Scientist, Geriatrics Research Education and Clinical Care Service
Associate Director, VA ORD VINCI
Tennessee Valley Healthcare System VA

Keywords

Artificial Intelligence; Large Language Models; Surveillance; Medical Products

Key Points

Increasingly, leaders in many disciplines are finding new applications for Artificial Intelligence (AI). Within healthcare, this technology is being used to support clinical decision-making, image processing, drug discovery, and clinical trials, and in the form of ambient and autonomous AI.

Large Language Models (LLMs) are a subset of generative AI. Since 2012, LLMs have emerged as a promising new technology with rapid growth, evolution of capacity and reach, and many potential applications in healthcare and clinical research.

There is significant interest in using LLMs to assist with patient trial matching, clinical trial planning, and the development of trial protocols and consent documents.

Another key area that LLMs could provide support in is medical product safety surveillance, with potential applications in adverse event detection, probabilistic phenotyping, and information synthesis.

The post-marketing surveillance space utilizes an ecosystem of healthcare data, imaging, radiology reports, insurance, structured data, medical literature, and social media. These sources could be integrated to conduct LLM reasoning and extractions.

Key challenges in safe and effective use of LLMs for this purpose include the lack of evaluation for medical product surveillance, the complexities of prompt engineering, hallucination risk (i.e., false positives), and the fact that evolving models over time challenge stable performance estimates.

Discussion Themes

A segment of the clinical workforce could be trained to be “super users,” partnering with development teams in order to make sure that these tools are working appropriately in a clinical environment.

There is substantial interest in using LLMs to support clinical decision-making. However, studies have shown that the quality of the AI output can influence the performance of the clinicians. Especially in high-risk clinical environments, any drift in those algorithms could result in adverse clinical outcomes. The life cycle approach to conceptualization, development, implementation, surveillance, and maintenance will be necessary to achieve and maintain performance.

Grand Rounds December 6, 2024: Opportunities and Challenges in the Use of Large Language Models for Post-Marketing Surveillance of Medical Products (Michael E. Matheny, MD, MS, MPH)

Date: Friday, December 6, 2024, 1:00-2:00 p.m. ET

Rethinking Clinical Trials

A Living Textbook of Pragmatic Clinical Trials
