Result: A large dataset of software mentions in the biomedical literature (the code)
Title:
A large dataset of software mentions in the biomedical literature (the code)
Authors:
Publisher Information:
Zenodo
Publication Year:
2022
Collection:
Zenodo
Subject Terms:
Document Type:
other/unknown material
Language:
unknown
Relation:
https://doi.org/10.5281/zenodo.7041593; https://doi.org/10.5281/zenodo.7041594; oai:zenodo.org:7041594
DOI:
10.5281/zenodo.7041594
Availability:
Rights:
info:eu-repo/semantics/openAccess ; Creative Commons Attribution 4.0 International ; https://creativecommons.org/licenses/by/4.0/legalcode
Accession Number:
edsbas.284B4285
Database:
BASE
Further Information
This is the code accompanying our new dataset of software mentions in biomedical papers (dataset, preprint). Plain-text software mentions are extracted with a trained SciBERT model from several sources: the NIH PubMed Central collection and papers provided by various publishers to the Chan Zuckerberg Initiative. The dataset provides sources, context, and metadata and, for a number of mentions, the disambiguated software entities and links.