srBERT: automatic article classification model for systematic review using BERT.
Background: Systematic reviews (SRs) are recognized as reliable evidence, enabling evidence-based medicine to be applied in clinical practice. However, because an SR demands significant effort, its creation is time-consuming, and results are often out of date by the time it is completed. Tools for automating SR tasks have therefore been considered; however, applying a general natural language processing model to domain-specific articles, together with insufficient text data for training, poses challenges.
Methods: The research objective is to automate the classification of included articles using the Bidirectional Encoder Representations from Transformers (BERT) algorithm. In particular, srBERT models based on the BERT algorithm are pre-trained using abstracts of articles from two types of datasets, and the resulting model is then fine-tuned using the article titles. The performances of our proposed models are compared with those of existing general machine-learning models.
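The methods describe pre-training on abstracts with a generated, domain-specific vocabulary and then fine-tuning on article titles. As a rough illustration of the vocabulary-generation and title-encoding steps, here is a minimal sketch; the frequency-based vocabulary, the special tokens, and the function names are simplifying assumptions (the paper presumably used a subword scheme such as WordPiece), not the authors' actual implementation:

```python
from collections import Counter
import re

SPECIALS = ("[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]")  # assumed BERT-style tokens

def build_vocab(abstracts, max_size=30000):
    """Frequency-based vocabulary from a domain corpus of abstracts
    (a simplified stand-in for the paper's generated vocabulary)."""
    counts = Counter()
    for text in abstracts:
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    vocab = {tok: i for i, tok in enumerate(SPECIALS)}
    for tok, _ in counts.most_common(max_size - len(SPECIALS)):
        vocab[tok] = len(vocab)
    return vocab

def encode_title(title, vocab):
    """Map a title to token ids, as input to the fine-tuning stage;
    out-of-vocabulary words fall back to [UNK]."""
    ids = [vocab["[CLS]"]]
    for tok in re.findall(r"[a-z0-9]+", title.lower()):
        ids.append(vocab.get(tok, vocab["[UNK]"]))
    ids.append(vocab["[SEP]"])
    return ids

# Toy corpus standing in for the abstract datasets described above.
abstracts = [
    "randomized controlled trial of acupuncture for chronic pain",
    "systematic review of acupuncture trials and meta analysis",
]
vocab = build_vocab(abstracts)
ids = encode_title("acupuncture trial", vocab)
```

In the actual pipeline these ids would feed a BERT encoder with a classification head over the `[CLS]` position; the sketch only shows how a domain vocabulary changes which title words survive tokenization instead of collapsing to `[UNK]`.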
Results: Our results indicate that the proposed srBERT-my model, pre-trained with abstracts of articles and a generated vocabulary, achieved state-of-the-art performance in both classification and relation-extraction tasks; for the first task, it achieved an accuracy of 94.35% (89.38%), F1 score of 66.12 (78.64), and area under the receiver operating characteristic curve of 0.77 (0.9) on the original and (generated) datasets, respectively. In the second task, the model achieved an accuracy of 93.5% with a loss of 27%, thereby outperforming the other evaluated models, including the original BERT model.
Conclusions: Our research shows the possibility of automatic article classification using machine-learning approaches to support SR tasks and its broad applicability. However, because the performance of our model depends on the size and class ratio of the training dataset, it is important to secure a dataset of sufficient quality, which may pose challenges.
(© 2021. The Author(s).)