Result: Raising awareness of potential biases in medical machine learning: Experience from a Datathon.

Title:
Raising awareness of potential biases in medical machine learning: Experience from a Datathon.
Authors:
Hochheiser H; Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA., Klug J; UPMC Intensive Care Unit Service Center, UPMC, Pittsburgh, PA, USA., Mathie T; Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA., Pollard TJ; MIT Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA., Raffa JD; MIT Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA., Ballard SL; Health Informatics, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA., Conrad EA; Health Informatics, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA., Edakalavan S; Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA., Joseph A; Division of Critical Care Medicine, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA., Alnomasy N; Health Informatics, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA.; College of Nursing, Medical Surgical Department, University of Ha'il, Ha'il, Saudi Arabia., Nutman S; Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA., Hill V; Health Informatics, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA., Kapoor S; Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA., Claudio EP; Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA., Kravchenko OV; Department of Family and Community Medicine, University of Pittsburgh, Pittsburgh, PA, USA., Li R; Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA., Nourelahi M; Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA., Diaz J; Health Informatics, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, USA., Taylor WM; Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA., Rooney SR; Division of Cardiology, Department of Pediatrics, Children's Hospital of Pittsburgh, University of Pittsburgh, Pittsburgh, PA, USA., Woeltje M; Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA., Celi LA; MIT Laboratory for Computational Physiology, Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA.; Division of Pulmonary, Critical Care and Sleep Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA.; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA., Horvat CM; Department of Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA, USA.
Source:
MedRxiv : the preprint server for health sciences [medRxiv] 2024 Nov 02. Date of Electronic Publication: 2024 Nov 02.
Publication Type:
Journal Article; Preprint
Language:
English
Journal Info:
Country of Publication: United States NLM ID: 101767986 Publication Model: Electronic Cited Medium: Internet NLM ISO Abbreviation: medRxiv Subsets: PubMed not MEDLINE
Comments:
Update in: PLOS Digit Health. 2025 Jul 11;4(7):e0000932. doi: 10.1371/journal.pdig.0000932. (PMID: 40644462)
Grant Information:
R01 NS118716 United States NS NINDS NIH HHS; R01 EB017205 United States EB NIBIB NIH HHS; U54 TW012043 United States TW FIC NIH HHS; OT2 OD032701 United States OD NIH HHS; T15 LM007059 United States LM NLM NIH HHS; K23 HD099331 United States HD NICHD NIH HHS
Entry Date(s):
Date Created: 20241106 Latest Revision: 20250715
Update Code:
20250715
PubMed Central ID:
PMC11537317
DOI:
10.1101/2024.10.21.24315543
PMID:
39502657
Database:
MEDLINE

Further Information

Objective: To challenge clinicians and informaticians to learn about potential sources of bias in medical machine learning models through investigation of data and predictions from an open-source severity of illness score.
Methods: Over a two-day period (total elapsed time approximately 28 hours), we conducted a datathon that challenged interdisciplinary teams to investigate potential sources of bias in the Global Open Source Severity of Illness Score (GOSSIS-1). Teams were invited to develop hypotheses, to use tools of their choosing to identify potential sources of bias, and to provide a final report.
Results: Five teams participated, three of which included both informaticians and clinicians. Four of the five teams used Python for analyses; the remaining team used R. Common analysis themes included the relationship of the GOSSIS-1 prediction score with demographics and care-related variables; relationships between demographics and outcomes; calibration and factors related to the context of care; and the impact of missingness. Representativeness of the population, differences in calibration and model performance among groups, and differences in performance across hospital settings were identified as possible sources of bias.
Discussion: Datathons are a promising approach for challenging developers and users to explore questions relating to unrecognized biases in medical machine learning algorithms.
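
Illustrative example (not from the datathon): one of the analysis themes named in the Results, differences in calibration among groups, can be sketched as a comparison of mean predicted risk against the observed event rate within each group. The Python sketch below is a minimal, hypothetical illustration. The function name, the field names (hospital_type, predicted_risk, died), and the toy rows are all assumptions invented here; they are not GOSSIS-1 variables and not data or code from any participating team.

```python
from collections import defaultdict


def calibration_by_group(records, group_key,
                         pred_key="predicted_risk", outcome_key="died"):
    """Summarize calibration-in-the-large per group.

    Returns {group: (n, mean predicted risk, observed event rate, O/E ratio)}.
    All field names are hypothetical placeholders.
    """
    # Per group: [row count, sum of predicted risks, number of events]
    sums = defaultdict(lambda: [0, 0.0, 0])
    for row in records:
        acc = sums[row[group_key]]
        acc[0] += 1
        acc[1] += row[pred_key]
        acc[2] += row[outcome_key]

    summary = {}
    for group, (n, pred_sum, events) in sums.items():
        mean_pred = pred_sum / n
        observed = events / n
        # Observed/expected ratio; values far from 1 suggest miscalibration.
        ratio = observed / mean_pred if mean_pred > 0 else float("nan")
        summary[group] = (n, mean_pred, observed, ratio)
    return summary


# Synthetic toy rows, invented for illustration only (not datathon data).
toy = [
    {"hospital_type": "teaching", "predicted_risk": 0.10, "died": 0},
    {"hospital_type": "teaching", "predicted_risk": 0.30, "died": 1},
    {"hospital_type": "community", "predicted_risk": 0.20, "died": 1},
    {"hospital_type": "community", "predicted_risk": 0.20, "died": 1},
]

for group, (n, mean_pred, observed, ratio) in calibration_by_group(
        toy, "hospital_type").items():
    print(f"{group}: n={n} mean_pred={mean_pred:.2f} "
          f"observed={observed:.2f} O/E={ratio:.2f}")
```

An observed/expected ratio that differs markedly between groups is only a prompt for further investigation; with real data, group sizes, confidence intervals, and missingness patterns would also need to be examined before drawing conclusions about bias.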