Result: Raising awareness of potential biases in medical machine learning: Experience from a Datathon.
Further Information
Objective: To challenge clinicians and informaticians to learn about potential sources of bias in medical machine learning models through investigation of data and predictions from an open-source severity of illness score.
Methods: Over a two-day period (total elapsed time approximately 28 hours), we conducted a datathon that challenged interdisciplinary teams to investigate potential sources of bias in the Global Open Source Severity of Illness Score (GOSSIS-1). Teams were invited to develop hypotheses, to use tools of their choosing to identify potential sources of bias, and to provide a final report.
Results: Five teams participated, three of which included both informaticians and clinicians. Most (4/5) used Python for analyses; the remaining team used R. Common analysis themes included the relationship of the GOSSIS-1 prediction score with demographic and care-related variables; relationships between demographics and outcomes; calibration and factors related to the context of care; and the impact of missingness. Representativeness of the population, differences in calibration and model performance among groups, and differences in performance across hospital settings were identified as possible sources of bias.
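The subgroup comparisons described above (calibration, discrimination, and missingness by demographic group) can be sketched in a few lines of Python. The snippet below is a minimal illustration of that kind of analysis, not the teams' actual code; the file name and column names (gossis1_prob, hospital_death, ethnicity) are hypothetical placeholders standing in for whatever the GOSSIS-1 dataset actually provides.

```python
# Minimal sketch: per-group calibration, discrimination, and missingness for a severity score.
# Column and file names are hypothetical placeholders, not the datathon's actual data schema.
import pandas as pd
from sklearn.metrics import brier_score_loss, roc_auc_score

df = pd.read_csv("gossis1_predictions.csv")  # hypothetical export of predictions and outcomes

rows = []
for group, sub in df.groupby("ethnicity"):
    observed = sub["hospital_death"]   # binary outcome (0/1)
    predicted = sub["gossis1_prob"]    # model-predicted mortality probability
    rows.append({
        "group": group,
        "n": len(sub),
        "observed_rate": observed.mean(),
        "mean_predicted": predicted.mean(),  # calibration-in-the-large: compare to observed_rate
        "brier": brier_score_loss(observed, predicted),
        "auroc": roc_auc_score(observed, predicted) if observed.nunique() > 1 else float("nan"),
        "frac_rows_with_missing": sub.isna().any(axis=1).mean(),  # crude proxy for missingness burden
    })

print(pd.DataFrame(rows).sort_values("n", ascending=False).to_string(index=False))
```

Large gaps between observed and mean predicted rates, or markedly different Brier scores and AUROCs across groups, would flag the kinds of calibration and performance disparities the teams reported.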
Discussion: Datathons are a promising approach for challenging developers and users to explore questions relating to unrecognized biases in medical machine learning algorithms.
(Copyright: © 2025 Hochheiser et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
The authors have declared that no competing interests exist.