Speaker
Michael E. Matheny, MD, MS, MPH
Director, Center for Improving the Public’s Health Through Informatics
Professor of Biomedical Informatics, Biostatistics, and Medicine
Vanderbilt University Medical Center
Staff Scientist, Geriatrics Research Education and Clinical Care Service
Associate Director, VA ORD VINCI
Tennessee Valley Healthcare System VA
Keywords
Artificial Intelligence; Large Language Models; Surveillance; Medical Products
Key Points
- Increasingly, leaders in many disciplines are finding new applications for Artificial Intelligence (AI). Within healthcare, this technology is being used to support clinical decision-making, image processing, drug discovery, and clinical trials, as well as ambient and autonomous AI applications.
- Large Language Models (LLMs) are a subset of generative AI. Since 2012, LLMs have emerged as a promising new technology with rapid growth, evolution of capacity and reach, and many potential applications in healthcare and clinical research.
- There is significant interest in using LLMs to assist with patient trial matching, clinical trial planning, and the development of trial protocols and consent documents.
- LLMs could also support medical product safety surveillance, with potential applications in adverse event detection, probabilistic phenotyping, and information synthesis.
- Post-marketing surveillance draws on an ecosystem of data sources: imaging, radiology reports, insurance claims, structured healthcare data, the medical literature, and social media. These sources could be integrated to support LLM-based reasoning and information extraction.
- Key challenges in the safe and effective use of LLMs for this purpose include the lack of evaluation frameworks for medical product surveillance, the complexities of prompt engineering, hallucination risk (e.g., false positives), and model evolution over time, which undermines stable performance estimates.
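To make the extraction and hallucination points above concrete, the sketch below shows one way an adverse event extraction step might be structured: a prompt template that requests structured JSON, and a parser that treats malformed or hallucinated output as "no reliable extraction" rather than trusting it. The prompt wording, JSON schema, and field names are illustrative assumptions, not the speaker's actual pipeline; the model call itself is left out.

```python
import json

# Hypothetical extraction prompt for adverse event detection.
# Schema and wording are illustrative assumptions.
AE_PROMPT_TEMPLATE = (
    "You are reviewing a clinical note for medical product safety surveillance.\n"
    "List every suspected adverse event as JSON: "
    '{{"events": [{{"product": str, "event": str, "evidence": str}}]}}.\n'
    'If none are present, return {{"events": []}}.\n\n'
    "Note:\n{note}"
)

def build_prompt(note: str) -> str:
    """Fill the extraction template with one de-identified note."""
    return AE_PROMPT_TEMPLATE.format(note=note)

def parse_response(raw: str) -> list[dict]:
    """Parse the model's JSON reply defensively: malformed output
    is treated as 'no reliable extraction', not as evidence."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return []
    events = payload.get("events", [])
    # Keep only well-formed records; drop anything missing required fields.
    return [
        e for e in events
        if isinstance(e, dict) and {"product", "event", "evidence"} <= e.keys()
    ]
```

In practice the parsed records would still be candidate signals only, feeding into downstream review rather than being reported directly, given the false-positive risk noted above.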
Discussion Themes
A segment of the clinical workforce could be trained as “super users,” partnering with development teams to ensure that these tools work appropriately in the clinical environment.
There is substantial interest in using LLMs to support clinical decision-making. However, studies have shown that the quality of AI output can influence clinicians’ performance. Especially in high-risk clinical environments, drift in these algorithms could result in adverse clinical outcomes. A life cycle approach spanning conceptualization, development, implementation, surveillance, and maintenance will be necessary to achieve and sustain performance.
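The surveillance and maintenance stages of the life cycle described above can be operationalized as ongoing performance monitoring of the deployed model. Below is a minimal sketch that compares accuracy over a rolling window of recent predictions against a validated baseline and flags potential drift; the window size and tolerance are illustrative choices, not recommendations from the presentation.

```python
from collections import deque

class DriftMonitor:
    """Minimal rolling-window performance monitor for a deployed model.
    Window size and tolerance are illustrative assumptions."""

    def __init__(self, baseline_accuracy: float, window: int = 200,
                 tolerance: float = 0.05):
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # 1 = correct, 0 = incorrect

    def record(self, correct: bool) -> None:
        """Log one adjudicated prediction outcome."""
        self.outcomes.append(1 if correct else 0)

    def current_accuracy(self) -> float:
        if not self.outcomes:
            return self.baseline
        return sum(self.outcomes) / len(self.outcomes)

    def drift_detected(self) -> bool:
        """Flag when recent accuracy falls more than `tolerance`
        below the validated baseline."""
        return self.current_accuracy() < self.baseline - self.tolerance
```

A flagged drift event would trigger the maintenance arm of the life cycle, e.g., re-validation or retraining, rather than silent continued use.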