Title:
Distributed learning from multiple EHR databases: Contextual embedding models for medical events.
Authors:
Li Z; Emory University, Department of Biostatistics and Bioinformatics, Atlanta, GA 30332, USA., Roberts K; University of Texas, Health Science Center at Houston, School of Biomedical Informatics, Houston, TX 77030, USA., Jiang X; University of Texas, Health Science Center at Houston, School of Biomedical Informatics, Houston, TX 77030, USA. Electronic address: Xiaoqian.Jiang@uth.tmc.edu., Long Q; University of Pennsylvania, Perelman School of Medicine, Department of Biostatistics, Epidemiology and Informatics, Philadelphia, PA 19104, USA. Electronic address: qlong@pennmedicine.upenn.edu.
Source:
Journal of biomedical informatics [J Biomed Inform] 2019 Apr; Vol. 92, pp. 103138. Date of Electronic Publication: 2019 Feb 27.
Publication Type:
Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Language:
English
Journal Info:
Publisher: Elsevier Country of Publication: United States NLM ID: 100970413 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1532-0480 (Electronic) Linking ISSN: 15320464 NLM ISO Abbreviation: J Biomed Inform Subsets: MEDLINE
Imprint Name(s):
Publication: Orlando : Elsevier
Original Publication: San Diego, CA : Academic Press, c2001-
Grant Information:
R01 HG008802 United States HG NHGRI NIH HHS; R21 NS091630 United States NS NINDS NIH HHS; R21 LM012060 United States LM NLM NIH HHS; U54 HL108460 United States HL NHLBI NIH HHS; U01 EB023685 United States EB NIBIB NIH HHS; R01 GM124111 United States GM NIGMS NIH HHS; R01 GM118574 United States GM NIGMS NIH HHS; P30 CA016520 United States CA NCI NIH HHS; R01 GM114612 United States GM NIGMS NIH HHS; R01 GM118609 United States GM NIGMS NIH HHS
Contributed Indexing:
Keywords: Contextual embedding models; Diagnoses prediction; Distributed computing; EHR data
Entry Date(s):
Date Created: 20190303 Date Completed: 20200625 Latest Revision: 20200625
Update Code:
20250114
PubMed Central ID:
PMC6533615
DOI:
10.1016/j.jbi.2019.103138
PMID:
30825539
Database:
MEDLINE

Further Information

Electronic health record (EHR) data provide promising opportunities to explore personalized treatment regimes and to make clinical predictions. Compared with regular clinical data, EHR data are known for their irregularity and complexity. In addition, analyzing EHR data raises privacy issues, and sharing such data among multiple research sites is often infeasible due to regulatory and other hurdles. A recently published work uses contextual embedding models and successfully builds a single predictive model for more than seventy common diagnoses. Despite its high predictive power, that model cannot be generalized to other institutions without sharing data. In this work, a novel method based on Distributed Noise Contrastive Estimation (Distributed NCE) is proposed to learn from multiple databases and build predictive models. We use differential privacy to safeguard the sharing of intermediary information. A numerical study with a real dataset demonstrates that the proposed method not only builds predictive models in a distributed manner with privacy protection, but also preserves model structure well and achieves comparable prediction accuracy. The proposed methods have been implemented as a stand-alone Python library, and the implementation is available on GitHub (https://github.com/ziyili20/DistributedLearningPredictor) with installation instructions and use cases.
(Copyright © 2019. Published by Elsevier Inc.)
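
The general idea named in the abstract, sites sharing only privacy-protected intermediary information rather than patient records, can be illustrated with a minimal sketch. This is not the authors' Distributed NCE algorithm or the API of their library. It assumes a simplified negative-sampling-style logistic objective in place of full NCE, a clip-and-add-Gaussian-noise step as the differential privacy mechanism, and plain averaging at a coordinator. All function names and parameters (nce_gradient, clip_and_noise, distributed_round, clip_norm, noise_multiplier) are hypothetical.

```python
import numpy as np


def nce_gradient(emb, pos_pairs, neg_pairs):
    """Gradient of a logistic, negative-sampling-style loss w.r.t. embeddings.

    Assumption: observed (event, context) index pairs are labelled 1 and
    sampled noise pairs are labelled 0. Full NCE also includes a correction
    term for the noise distribution, which is omitted here.
    """
    grad = np.zeros_like(emb)
    for pairs, label in ((pos_pairs, 1.0), (neg_pairs, 0.0)):
        for i, j in pairs:
            score = emb[i] @ emb[j]
            prob = 1.0 / (1.0 + np.exp(-score))  # sigmoid of the pair score
            err = prob - label
            grad[i] += err * emb[j]
            grad[j] += err * emb[i]
    return grad / (len(pos_pairs) + len(neg_pairs))


def clip_and_noise(grad, clip_norm, noise_multiplier, rng):
    """Bound the gradient's L2 norm, then add Gaussian noise (hypothetical DP step)."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise


def distributed_round(emb, site_data, lr, clip_norm, noise_multiplier, rng):
    """One round: each site sends only a noisy gradient; the coordinator averages."""
    noisy_grads = [
        clip_and_noise(nce_gradient(emb, pos, neg), clip_norm, noise_multiplier, rng)
        for pos, neg in site_data
    ]
    return emb - lr * np.mean(noisy_grads, axis=0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(scale=0.1, size=(6, 4))  # 6 toy medical-event codes, 4 dims
    # Two toy sites, each holding (positive pairs, noise pairs) of event indices.
    sites = [([(0, 1), (1, 2)], [(0, 5)]), ([(3, 4)], [(3, 0), (4, 2)])]
    for _ in range(10):
        emb = distributed_round(emb, sites, lr=0.5, clip_norm=1.0,
                                noise_multiplier=0.1, rng=rng)
    print(emb.shape)
```

In this sketch the raw pairs never leave a site; only clipped, noised gradients are exchanged. The actual privacy guarantee would depend on how clip_norm and noise_multiplier are calibrated over the number of rounds, which is not specified here.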