Grand Rounds April 25, 2025: Automated Response Technology Integrated into EMR and Physician-Patient Communication (Ming Tai-Seale, PhD, MPH)

Speaker

Ming Tai-Seale, PhD, MPH
Professor
Departments of Family Medicine and Medicine (Bioinformatics)
University of California San Diego School of Medicine
Director of UC San Diego Learning Health Systems Science Center

Keywords

Electronic Health Record; Artificial Intelligence; MyChart; Patient Messages; Large Language Models; Clinician Well-Being; Mental Health

Key Points

  • Physician work is increasingly centered around the electronic health record (EHR). It consumes nearly 50% of scheduled clinic time. The volume of patient messages in MyChart increased significantly from 2020 to 2022, and remains much higher than pre-pandemic levels.
  • Research published in Health Affairs and JAMA Network Open suggests that this influx of inbox messages is detrimental to physicians’ well-being. The emotional timbre of messages from patients plays a role, as well; in an analysis of EHR inbasket messages, the research team found messages from patients that contained expletives, vitriol and personal attacks.
  • The research team sought to examine the association between generative AI (GenAI)-drafted replies for patient messages and physician time spent on answering messages. They also examined the quality of GenAI-drafted replies for messages dealing with mental health concerns.
  • The team created a prompt within the EHR that gave physicians the option to either use an AI-generated response as a starting point or to start with a blank reply. Messages eligible for responses drafted by GenAI included refills, results, paperwork, and general questions.
  • The pilot study took place from June 16 to July 12, 2023, targeting primary care attending physicians at University of California San Diego. 52 physician volunteers received the intervention; the 70 physicians in the control arm did not.
  • In the pilot study, clinicians who were given the option of a GenAI-drafted reply spent more time reading patient messages. There was no change in average reply time.
  • When clinicians received messages dealing with mental health issues, replies drafted by more recent versions of GenAI had more utility than older versions.
  • The physicians expressed that they valued the GenAI-drafted replies as a compassionate starting point for their communication. They noted areas for improvement, like a robotic tone, and emphasized the continued need for human oversight and intervention.
  • The study team acknowledged potential risks when using large language models (LLMs) in mental health communication. These included a loss of human touch and empathy; overreliance and deskilling; and privacy and security risks.
  • This is an ongoing effort. Next steps include using LLMs to facilitate analyses of qualitative data on electronic patient-clinician communication; triangulating qualitative and quantitative data in the EHR; and aiming for a more comprehensive understanding of mental health communication and how LLMs might improve its quality.

Discussion Themes

Anecdotally, the researchers have heard from physicians that automated response technology (ART), which Epic and Microsoft continue to refine, seems to have improved. But issues remain, such as GenAI recommending that patients see clinicians from external hospital systems.

When a modified GenAI-drafted reply was sent to a patient, a disclaimer was included: “Part of this message was generated automatically.” The research team felt that it was important to provide this transparency and disclose to patients when AI contributed to the messaging they received.

Health systems and professional organizations must develop standards advocating for equity in the implementation of and access to these tools.

Grand Rounds March 28, 2025: A Cross-Sectional Study of GPT-4–Based Plain Language Translation of Clinical Notes to Improve Patient Comprehension of Disease Course and Management (Anivarya Kumar, BA; Matthew Engelhard, MD, PhD)

Speakers

Anivarya Kumar, BA
Fourth-Year Medical Student
Duke University School of Medicine

Matthew Engelhard, MD, PhD
Assistant Professor, Department of Biostatistics & Bioinformatics
Duke University School of Medicine

Keywords

Health Literacy; Large Language Models; Artificial Intelligence; Electronic Health Records

Key Points

  • Limited health literacy (HL) has tangible effects on morbidity and mortality: it’s associated with higher rates of hospital admissions and readmissions; medication nonadherence; healthcare costs; and all-cause mortality. 9 in 10 adults have limited HL, and HL rates are 2 to 3 times lower in marginalized populations.
  • 71% of patients report accessing their electronic health records (EHRs) to read documentation from their clinical visits, particularly the discharge summary notes (DSNs). But clinical notes have low levels of readability, hindering patients’ ability to engage in shared decision-making.
  • The research team looked at whether a Generative Pre-trained Transformer 4 (GPT-4)-based plain language translation of DSNs could improve patient comprehension of disease course and management.
  • 533 patients, recruited from a pool of EHR users, were randomly assigned 4 DSNs to assess. After reading the DSNs – 2 translated into more accessible language, 2 untranslated – patients answered questions assessing their objective comprehension, subjective comprehension, confidence, and time spent on each DSN.
  • Compared to the untranslated DSNs, objective understanding of the translated DSNs increased by 6.1%; subjective understanding increased 18%; confidence increased 45%; and average time spent with the DSNs decreased 51%.
  • The research team concluded that GPT translation of DSNs significantly improved patient comprehension of disease course and management and optimized time spent reading them. The effect was significantly greater in marginalized populations with historically low health literacy, reducing the gap in comprehension scores between patient populations.
  • Limitations included the use of standardized DSNs as opposed to real-world DSNs; the use of MyChart when enrolling patients, leading to a participant group with a higher baseline HL; and the modest number of Hispanic patients enrolled in the study.
  • Race is a significant and independent factor for HL. Preliminary data suggests that GPT translation can help close this gap. The research team identified this as an area for further study.

Discussion Themes

While discharge instructions alone can be great for providing patients with action items, they lack some of the context that DSNs can provide, lending the patient a more complete understanding of their condition.

The advantages of providing pre-generated materials, as opposed to pointing patients to a large language model (LLM) like ChatGPT for a more interactive explanation of their condition, include the potential for screening by a healthcare professional and less of a burden on the patient.

The study team ended up favoring “semantically-focused” translations over translations that focused solely on simplifying the language or avoiding jargon. When the LLM was asked to focus on semantics, it was more likely to define concepts and their implications.

Health literacy and reading level are not necessarily on par, and patient-centric or accessible language/LLMs are very important to consider. This may require further investigation, e.g. through qualitative interviews.

Grand Rounds March 21, 2025: Generative Artificial Intelligence in Clinical Trials: A Driver of Efficiency and Democratization of Care (Alexander J. “AJ” Blood, MD, MSc)

Speaker

Alexander J. “AJ” Blood, MD, MSc
Associate Director, Accelerator for Clinical Transformation Research Group
Instructor of Medicine at Harvard Medical School
Cardiologist and Intensivist
Brigham and Women’s Hospital

Keywords

Artificial Intelligence; Cost; Large Language Models; Enrollment; Eligibility; Recruitment

Key Points

  • The Accelerator for Clinical Transformation (ACT) is a research group that seeks to use emerging technology to expand access to healthcare and improve the quality and quantity of healthcare delivery. They focus on team-based models and scalable applications.
  • It’s becoming more expensive and time-consuming to move a drug from the clinical trial stage to approval. Patient recruitment is the leading driver of costs in clinical trials, and 55% of trials that fail to complete cite low accrual rate as the reason for study termination. There’s pressure from industry to conduct clinical trials in a way that is faster, cheaper, and better for both the patients and the research environment.
  • ACT conducted a pilot study in which they embedded a Large Language Model (LLM) tool called RECTIFIER into an active clinical trial of patients with heart failure. RECTIFIER is an AI-powered, comprehensive software application able to ask and answer questions about unstructured clinical data. In the pilot study, RECTIFIER determined patient eligibility with higher accuracy and specificity than study staff, indicating its potential to streamline screening.
  • LLMs are the engines that power the software. There are two key challenges that need to be taken into consideration to use these tools effectively: 1) there’s a context window – a limit to the amount of Electronic Health Record (EHR) data you can pull in; and 2) using LLMs is expensive.
  • Following up on the pilot study results, ACT conducted a prospective randomized controlled trial: The Manual Versus AI-Assisted Clinical Trial Screening Using LLMs (MAPS-LLM) trial. MAPS-LLM compared two methods for analyzing a randomized pool of potentially eligible participants: manual review by study staff, and RECTIFIER-augmented review by study staff. Their primary endpoint was eligibility determination.
  • They found that AI-assisted patient screening using the RECTIFIER system significantly improved eligibility determination and enrollment compared with manual screening in a heart failure clinical trial.
  • ACT concluded that implementing AI-assisted tools like RECTIFIER can enhance clinical trial efficiency, reduce resource utilization, and promote equitable recruitment, potentially leading to faster trial completion and earlier patient access to novel therapies. Generative AI is likely to play a significant role in the future of clinical trials.
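
The context-window constraint noted in the key points can be illustrated with a minimal Python sketch. This is not how RECTIFIER works, which the summary does not describe; it only shows the general idea of splitting a long record into pieces that fit a fixed budget. The function name `chunk_text` and the use of word counts as a stand-in for model tokens are assumptions made for illustration.

```python
def chunk_text(text: str, max_words: int) -> list[str]:
    """Split text into consecutive chunks of at most max_words words.

    Word count is only a rough proxy for an LLM's token count (an assumption);
    a real system would measure length with the model's own tokenizer.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in windows of max_words.
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


if __name__ == "__main__":
    note = "Patient reports shortness of breath on exertion for two weeks"
    for piece in chunk_text(note, max_words=4):
        print(piece)
```

Each chunk could then be sent to a model separately, which also hints at the cost concern: more chunks mean more model calls.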

Discussion Themes

Study staff in the MAPS-LLM intervention arm were able to direct more time and effort towards contacting patients and managing patients with the time they would have spent reviewing charts and manually screening the EHR.

The rate of eligibility between the two arms was equivalent; the difference was that the AI-augmented group was able to assess twice as many potentially eligible patients.

While this tool can do a lot of analytical work, a human element will be essential to utilizing it effectively and to bringing “human intelligence” to participant enrollment.

The ACT team has started to pilot this technology in other disease areas, including cardiology more broadly, endocrinology, oncology, and gastroenterology.

Grand Rounds February 21, 2025: Texting for Behavior Change: Lessons Learned Across 2 Interventions to Improve Chronic Care Management (Michael Ho, MD, PhD; Sheana Bull, PhD)

Speakers

Michael Ho, MD, PhD
Kaiser Permanente Colorado

Sheana Bull, PhD
University of Colorado School of Public Health

Keywords

Text Messaging; Artificial Intelligence; Chatbots; Health Behaviors

Key Points

  • Ample evidence now exists demonstrating the benefit of using text messaging in support of health behavior and access to care. It’s ubiquitous, increasing reach; theory in message design is impactful; and it can improve adherence to medical appointments and health behaviors.
  • Two NIH Collaboratory Trials, Nudge and Chat 4 Heart Health (C4HH), test the effectiveness of text messaging interventions to support behavior change. Nudge randomized patients to receive usual care, generic texts, behavioral texts, or behavioral texts plus chatbot messages. Their primary outcome was medication adherence.
  • C4HH, the subsequent trial, is randomizing patients to receive a generic text message curriculum; an AI chatbot messaging curriculum; or AI chatbot messages plus proactive pharmacist support. Their primary outcome is cardiovascular risk factors, as measured by the American Heart Association’s “Life’s Essential 8” adherence.
  • Nudge used an opt-out consent approach, whereas C4HH used an opt-in consent approach. In the former, the research team noted, patients who identified as Black, Hispanic, and primary Spanish speakers were more likely to remain in the study. An opt-out approach in the appropriate context may be a way to diversify clinical trial populations and improve external validity of results.
  • The use of AI chatbots allows users to generate questions in their own words and the system to retrieve a response from a closed, curated library.
  • Message engagement is key to text messaging interventions. Participants in the Nudge study who were randomized to optimized texts asked more questions. Questions were related to medications, refill logistics, and costs. The study team hypothesizes that the optimized texts may have led to greater patient engagement, and therefore more questions about their medications.
  • Over 12 months, the Nudge study found no significant difference in the rates of prescription refills between the 3 intervention arms and usual care. C4HH is ongoing, and will send a higher volume of messages in an effort to engage patients and change patient behavior.
  • So far, the top 5 topics in messages initiated by C4HH participants have been healthy eating, physical activity, managing cholesterol, quitting smoking, and medication management.
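
The closed-library approach described in the key points, where a participant asks a question in their own words and the system returns a pre-approved reply, can be sketched roughly as follows. This is a hypothetical illustration, not the Nudge or C4HH system: the library entries, the fallback text, the `reply` function, and the word-overlap (Jaccard) matching are all assumptions chosen for simplicity.

```python
import re

# Hypothetical curated library: (example phrasing, vetted reply). Not study content.
LIBRARY = [
    ("how do I refill my prescription",
     "You can request a refill by contacting your pharmacy."),
    ("what foods are good for my heart",
     "Here is our approved guide to heart-healthy eating."),
    ("how much does my medication cost",
     "Your pharmacy can tell you the cost of your medication."),
]
FALLBACK = "Sorry, I don't have an answer for that. Please contact your care team."


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of words the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reply(question: str, threshold: float = 0.2) -> str:
    """Return the vetted reply whose example phrasing best matches the question.

    The system never generates free text: it answers from LIBRARY or falls back.
    """
    asked = tokens(question)
    best_score, best_reply = 0.0, FALLBACK
    for phrasing, answer in LIBRARY:
        score = jaccard(asked, tokens(phrasing))
        if score > best_score:
            best_score, best_reply = score, answer
    return best_reply if best_score >= threshold else FALLBACK


if __name__ == "__main__":
    print(reply("How can I refill my prescription?"))
    print(reply("What is the weather tomorrow?"))
```

Because every possible reply is written and reviewed in advance, a design like this trades flexibility for control over what patients are told.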

Discussion Themes

The study team had to be very careful to ensure that patient health data, including cell phone numbers and the messages sent, were encrypted. Vendors and phone carriers were not able to access this data and it was not stored on their servers.

One of the challenges they encountered was that their systems weren’t integrated into the health care organizations’ pharmacies or electronic health records. The integration piece will be key to any future sustainability.

As technology evolves significantly over the course of, say, a 5-year study, developing the skillset to utilize interactive interventions or a SMART design could be helpful for investigators interested in conducting research in this area.

November 12, 2024: Duke-Margolis and Duke Health to Co-Host Webinar on AI Governance in Health Systems

On November 18, 2024, at 2 pm Eastern, the Duke-Margolis Institute for Health Policy and Duke Health will co-host “From Principles to Practice: Exploring AI Governance in Health Systems.” In this public webinar, Duke-Margolis and Duke Health will discuss their newly released white paper on how health systems are navigating the role of AI governance.

The webinar will begin with an overview presentation of key takeaways from the white paper, followed by a fireside chat where experts will discuss the benefits of governance and lessons learned while building their own AI governance processes. After the fireside chat, there will be a panel discussion on methods and supports to facilitate the democratization of AI governance so more health organizations can safely and responsibly use these novel tools.

Registration is required for participation, but there is no cost to attend. Continuing education credits are available for several disciplines for participants who are affiliated with VA. Contact margolisevents@duke.edu with any questions.

Learn more and register today.

Grand Rounds June 28, 2024: Using ChatGPT to Facilitate Truly Informed Medical Consent (Fatima N. Mirza, MD, MPH)

Speaker

Fatima N. Mirza, MD, MPH
Chief Resident
Department of Dermatology
Warren Alpert Medical School of Brown University

Keywords

Artificial Intelligence; ChatGPT; Informed Consent

Key Points

  • Artificial Intelligence (AI), when implemented thoughtfully in clinical settings, can lead to meaningful results for patients.
  • Medicine has long fallen short of the ideal of informed consent, in part due to the limited readability of informed consent forms; as of 2020, 54% of Americans were estimated to read below the sixth-grade reading level. This has implications for patient understanding and quality of care.
  • Using ChatGPT-4, the research team took LifeSpan Healthcare System’s (LHS) surgical consent form from a 12.6 Flesch-Kincaid reading level to a 6.7. After hospital leadership reviewed the revised form and the research team had addressed concerns around biases, approvals, and future needs, the revised form was deployed across LHS.
  • This real-world implementation demonstrated the potential for AI to make meaningful improvements to patient care and communication. As a proof-of-concept, it sparked the interest of other health systems.
  • To investigate the clinical implications of consent form readability, the research team analyzed 798 federally funded clinical trials providing accessible informed consent forms. A 16% increase in the dropout rate was associated with each additional Flesch-Kincaid grade-level increase in the language.
  • The observed association between consent form complexity and participant dropout rate could be attributed to misaligned expectations; erosion of trust; participant surprise and dissatisfaction; and reduced engagement. This highlights the importance of clear, accessible communication throughout the entire trial process, not just enrollment.
  • Improved readability of consent forms is crucial for the design and implementation of clinical trials, potentially leading to more inclusive, efficient, and impactful clinical research.
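
The Flesch-Kincaid grade level cited in the key points is computed from average sentence length and average syllables per word: 0.39 × (words ÷ sentences) + 11.8 × (syllables ÷ words) − 15.59. The Python sketch below applies that standard formula. The syllable counter is a crude heuristic and the two sample sentences are invented for illustration; neither comes from the research team's work, and dedicated readability tools will give somewhat different scores.

```python
import re


def fk_from_counts(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level from raw counts (standard published formula)."""
    return 0.39 * words / sentences + 11.8 * syllables / words - 15.59


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Estimate the grade level of a passage of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return fk_from_counts(len(words), len(sentences), syllables)


if __name__ == "__main__":
    # Invented examples, not text from any real consent form.
    original = ("The undersigned hereby authorizes the administration of "
                "anesthesia as deemed necessary by the attending physician.")
    simplified = ("You agree to get medicine to help you sleep. "
                  "Your doctor will choose it.")
    print(round(flesch_kincaid_grade(original), 1))
    print(round(flesch_kincaid_grade(simplified), 1))
```

Shorter sentences and shorter words both lower the score, which is why plain-language rewrites tend to move a form toward a lower grade level.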

Discussion Themes

Medical malpractice attorneys were part of the cohort that reviewed the simplified consent forms. They shared that when these cases go to court, a jury will often review the informed consent document. In these cases, it can be more protective to have documentation that people can understand.

There is the potential for tailored chatbots that could personalize the consent process for patients, but expert oversight would be crucial. AI models can “hallucinate,” saying something incorrect with absolute certainty. Dr. Mirza indicated that the field isn’t ready for those bespoke consents; centralized documents, at least for the time being, are the way to go.

AI is going to be integrated into our healthcare system. It’s important for clinicians, researchers, and people who really care about patients, as well as the patients themselves, to have a seat at the table discussing these early models.