Ethics for Artificial Intelligence and Machine Learning in Pragmatic Clinical Trials
Section 4
Training Data Generation
Choices regarding what data are procured and how they are used to build algorithmic models are critical to the integrity of the resulting AI/ML system. To enhance equity, researchers should develop AI systems with the practical realities of the point of care in mind (McCradden et al 2022) and therefore confirm that training data are representative of the intended populations and, if not, assess any biases that might result. Researchers must also ascertain who is (and is not) represented in training data and the effects, if any, this will have on scientific endpoints.
Investigator tip: If patient data from electronic health records are used, consider whether healthcare access points affect representation. Biases can arise because training datasets generated from “patients” represent only people with access to healthcare, and are much more likely to be drawn from major academic medical centers with the most up-to-date data technology. Likewise, datasets generated from “research participants” reflect only those recruited to enroll and who subsequently consent to participate, and there are documented differences in consent rates between historically included and excluded populations (Spector-Bagdady et al 2021).
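A representativeness check of the kind described above can be sketched as a simple comparison of demographic shares in the training cohort against reference-population shares. The function name, group labels, reference proportions, and the 0.5 underrepresentation threshold below are all illustrative assumptions, not prescribed values; in practice the reference distribution and cutoff should come from the trial's intended population and protocol.

```python
from collections import Counter

def representation_gaps(cohort_groups, reference_props, threshold=0.5):
    """Flag groups whose share in the training cohort falls below
    `threshold` times their share in the reference population.
    (Group names and the cutoff are illustrative assumptions.)"""
    counts = Counter(cohort_groups)
    n = len(cohort_groups)
    gaps = {}
    for group, ref_p in reference_props.items():
        cohort_p = counts.get(group, 0) / n
        if cohort_p < threshold * ref_p:
            gaps[group] = (cohort_p, ref_p)  # (cohort share, reference share)
    return gaps

# Hypothetical cohort drawn from EHR data vs. reference population shares
cohort = ["A"] * 80 + ["B"] * 15 + ["C"] * 5
reference = {"A": 0.60, "B": 0.25, "C": 0.15}
print(representation_gaps(cohort, reference))  # group C is underrepresented
```

A flagged group would then prompt the bias assessment the section calls for, rather than automatic exclusion or reweighting.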
REFERENCES
McCradden MD, Anderson JA, Stephenson EA, et al. 2022. A research ethics framework for the clinical translation of healthcare machine learning. Am J Bioeth. 22(5):8-22. doi:10.1080/15265161.2021.2013977. PMID: 35048782.
Spector-Bagdady K, Tang S, Jabbour S, et al. 2021. Respecting autonomy and enabling diversity: The effect of eligibility and enrollment on research data demographics. Health Aff. 40(12):1892-1899. doi:10.1377/hlthaff.2021.01197. PMID: 34871076.