Result: Machine learning in epidemiology: An introduction, comparison with traditional methods, and a case study of predicting extreme longevity.

Title:
Machine learning in epidemiology: An introduction, comparison with traditional methods, and a case study of predicting extreme longevity.
Authors:
Atias D; Department of Epidemiology and Preventive Medicine, School of Public Health, Gray Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel; Department of Family Medicine, Maccabi Healthcare Services, Tel Aviv, Israel.
Ashri S; Department of Epidemiology and Preventive Medicine, School of Public Health, Gray Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel.
Goldbourt U; Department of Epidemiology and Preventive Medicine, School of Public Health, Gray Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel.
Benyamini Y; Bob Shapell School of Social Work, Tel Aviv University, Tel Aviv, Israel.
Gilad-Bachrach R; Department of Biomedical Engineering, Tel Aviv University, Tel Aviv, Israel; Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel.
Hasin T; Jesselson Integrated Heart Center, The Eisenberg R&D Authority, Shaare Zedek Medical Center, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem, Israel.
Gerber Y; Department of Epidemiology and Preventive Medicine, School of Public Health, Gray Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel.
Obolski U; Department of Epidemiology and Preventive Medicine, School of Public Health, Gray Faculty of Medical & Health Sciences, Tel Aviv University, Tel Aviv, Israel. Electronic address: uriobols@tauex.tau.ac.il.
Source:
Annals of epidemiology [Ann Epidemiol] 2025 Oct; Vol. 110, pp. 23-33. Date of Electronic Publication: 2025 Jul 21.
Publication Type:
Journal Article; Comparative Study
Language:
English
Journal Info:
Publisher: Elsevier; Country of Publication: United States; NLM ID: 9100013; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1873-2585 (Electronic); Linking ISSN: 10472797; NLM ISO Abbreviation: Ann Epidemiol; Subsets: MEDLINE
Imprint Name(s):
Original Publication: New York, NY : Elsevier, c1990-
Contributed Indexing:
Keywords: Centenarians; Interpretable artificial intelligence; Machine learning; Prediction model
Entry Date(s):
Date Created: 20250723; Date Completed: 20251012; Latest Revision: 20251012
Update Code:
20251013
DOI:
10.1016/j.annepidem.2025.07.024
PMID:
40701371
Database:
MEDLINE

Further information

Background: The volume of healthcare data is expanding steadily, presenting both challenges and opportunities. Traditional statistical methods applied in epidemiology, such as logistic regression (LR), although widely used, have limited ability to handle the complexity and high dimensionality of modern datasets. In contrast, machine learning (ML) methods can model complex, non-linear relationships and are less constrained by parametric assumptions, which makes them well suited to uncovering hidden patterns.
Methods: In this study, we introduce ML applications for epidemiologic research and explore three predictive models: LR as a traditional modeling approach, and least absolute shrinkage and selection operator (LASSO) regression and eXtreme Gradient Boosting (XGBoost) as ML approaches. We demonstrate how ML approaches, particularly XGBoost, can benefit epidemiologic research through a real-world case study. We present the common steps of data preprocessing, model creation, and model evaluation. Additionally, we address the "black box" nature of ML models and present post hoc explanation tools to enhance interpretability.
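As a rough illustration of the contrast between LR and LASSO regression named in the Methods, the following Python sketch fits both models on synthetic data using only the standard library. Everything in it is an assumption made for illustration and is not taken from the study: the synthetic predictors and their coefficients, the penalty strength `lam`, the step size, and the proximal gradient solver. XGBoost is omitted because it requires an external library.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrink toward zero by `threshold`.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_logistic(X, y, lam=0.0, step=0.5, n_iter=500):
    """Fit logistic regression by proximal gradient descent.

    lam = 0 gives ordinary (unpenalized) LR; lam > 0 adds a LASSO (L1)
    penalty on the coefficients. The intercept is never penalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - label
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * row[j] / n
        b -= step * grad_b
        # Gradient step on the log-loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical synthetic data: five predictors, only the first two carry signal.
rng = random.Random(42)
true_w = [2.0, -2.0, 0.0, 0.0, 0.0]
X = [[rng.gauss(0.0, 1.0) for _ in true_w] for _ in range(500)]
y = [1 if rng.random() < sigmoid(sum(w * x for w, x in zip(true_w, row))) else 0
     for row in X]

w_lr, b_lr = fit_logistic(X, y, lam=0.0)        # traditional LR
w_lasso, b_lasso = fit_logistic(X, y, lam=0.1)  # LASSO-penalized LR

print("LR coefficients:   ", [round(w, 3) for w in w_lr])
print("LASSO coefficients:", [round(w, 3) for w in w_lasso])
```

In this toy setting the L1 penalty shrinks all coefficients and is expected to set those of the pure-noise predictors exactly to zero, which is the variable-selection behavior that distinguishes LASSO from unpenalized LR. In practice the penalty strength would be chosen by cross-validation rather than fixed by hand.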
Results: We examined the prediction of near-centenarianism (reaching an age of 95 years or older) using midlife predictors (i.e., demographic, clinical, lifestyle, occupational and dietary variables) in a cohort of approximately 10,000 middle-aged working men recruited in 1963 and followed until death or until 2019. Models were fitted and calibrated on a training set and showed good predictive performance on a separate test set. XGBoost, LASSO regression, and LR achieved ROC-AUC values of 0.72 (95% CI: 0.66-0.75), 0.71 (95% CI: 0.67-0.74) and 0.69 (95% CI: 0.66-0.73), respectively. Explainability analysis identified key predictors for longevity, including systolic blood pressure, smoking status, and a history of myocardial infarction, consistent with prior studies.
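To make the reported metric concrete, the sketch below computes a ROC-AUC and a 95% confidence interval for a set of test-set predictions. The abstract does not state how its intervals were obtained, so the percentile bootstrap used here is an assumption, as are the function names, the number of resamples, and the synthetic labels and scores.

```python
import random


def roc_auc(y_true, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formulation.

    Tied scores receive their average rank, so a tie between a positive
    and a negative case counts as half a correct ordering.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined when only one class is present")
    rank_sum_pos = sum(r for r, label in zip(ranks, y_true) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the ROC-AUC."""
    rng = random.Random(seed)
    n = len(y_true)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_boot)]
    upper = aucs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical test-set labels and predicted scores (not the study's data).
rng = random.Random(1)
y_demo = [1 if rng.random() < 0.3 else 0 for _ in range(200)]
s_demo = [rng.gauss(0.8 * label, 1.0) for label in y_demo]

auc = roc_auc(y_demo, s_demo)
lo, hi = bootstrap_auc_ci(y_demo, s_demo)
print(f"ROC-AUC = {auc:.2f} (95% CI: {lo:.2f}-{hi:.2f})")
```

Overlapping intervals such as those reported for the three models indicate that a test set of this kind may not be able to separate their discriminative performance, so a paired comparison on the same resamples would be a more direct check.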
Conclusions: Our findings highlight the potential of ML to enhance epidemiological studies by handling complex interactions and high-dimensional data, suggesting a complementary approach to traditional methods.
(Copyright © 2025 The Authors. Published by Elsevier Inc. All rights reserved.)

Declaration of Competing Interest: The authors have no conflicts of interest to disclose.