Result: A large dataset of software mentions in the biomedical literature (the code)

Title:
A large dataset of software mentions in the biomedical literature (the code)
Publisher Information:
Zenodo
Publication Year:
2022
Collection:
Zenodo
Document Type:
other/unknown material
Language:
unknown
DOI:
10.5281/zenodo.7041594
Rights:
info:eu-repo/semantics/openAccess ; Creative Commons Attribution 4.0 International ; https://creativecommons.org/licenses/by/4.0/legalcode
Accession Number:
edsbas.284B4285
Database:
BASE

Further Information

The code accompanying our new dataset of software mentions in biomedical papers (dataset, preprint). Plain-text software mentions are extracted with a trained SciBERT model from several sources: the NIH PubMed Central collection and papers provided by various publishers to the Chan Zuckerberg Initiative. The dataset provides sources, context, and metadata for each mention and, for a subset of mentions, the disambiguated software entities and links.