Result: Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction.

Title:
Auto-generated materials database of Curie and Néel temperatures via semi-supervised relationship extraction.
Authors:
Court CJ; Cavendish Laboratory, Department of Physics, University of Cambridge, J.J. Thomson Avenue, Cambridge CB3 0HE, UK.
Cole JM; Cavendish Laboratory, Department of Physics, University of Cambridge, J.J. Thomson Avenue, Cambridge CB3 0HE, UK.; ISIS Neutron and Muon Source, STFC Rutherford Appleton Laboratory, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0QX, UK.; Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439, USA.; Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Philippa Fawcett Drive, Cambridge CB3 0FS, UK.
Source:
Scientific data [Sci Data] 2018 Jun 19; Vol. 5, pp. 180111. Date of Electronic Publication: 2018 Jun 19.
Publication Type:
Journal Article; Research Support, U.S. Gov't, Non-P.H.S.; Research Support, Non-U.S. Gov't
Language:
English
Journal Info:
Publisher: Nature Publishing Group; Country of Publication: England; NLM ID: 101640192; Publication Model: Electronic; Cited Medium: Internet; ISSN: 2052-4463 (Electronic); Linking ISSN: 20524463; NLM ISO Abbreviation: Sci Data; Subsets: PubMed not MEDLINE
Imprint Name(s):
Original Publication: London : Nature Publishing Group, 2014-
References:
J Cheminform. 2011 May 16;3(1):17. (PMID: 21575201)
J Cheminform. 2014 Apr 28;6:17. (PMID: 24834132)
Sci Data. 2017 Sep 12;4:170127. (PMID: 28895943)
Chem Rev. 2017 Jun 28;117(12):7673-7761. (PMID: 28475312)
J Chem Inf Model. 2016 Oct 24;56(10):1894-1904. (PMID: 27669338)
Molecular Sequence:
figshare 10.6084/m9.figshare.c.3954418
Entry Date(s):
Date Created: 20180620 Date Completed: 20190225 Latest Revision: 20190225
Update Code:
20250114
PubMed Central ID:
PMC6007086
DOI:
10.1038/sdata.2018.111
PMID:
29917013
Database:
MEDLINE

Further Information

Large auto-generated databases of magnetic materials properties have the potential for great utility in materials science research. This article presents an auto-generated database of 39,822 records containing chemical compounds and their associated Curie and Néel magnetic phase transition temperatures. The database was produced using natural language processing and semi-supervised quaternary relationship extraction, applied to a corpus of 68,078 chemistry and physics articles. Evaluation of the database shows an estimated overall precision of 73%. Within the database, records processed with the text-mining toolkit ChemDataExtractor were assisted by a modified Snowball algorithm, whose original binary relationship extraction capabilities were extended to quaternary relationship extraction. Consequently, its machine learning component can now train with ≤ 500 seeds, rather than the 4,000 originally used. Data processed with the modified Snowball algorithm reach 82% precision. Database records are available in MongoDB, CSV and JSON formats, which can easily be read using Python, R, Java and MATLAB. This makes the database easy to query for big-data materials science initiatives and provides a basis for magnetic materials discovery.
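
To illustrate how records distributed as CSV or JSON might be read and queried in Python, here is a minimal sketch. It is not the authors' code. The column names (compound, specifier, value, unit), the reading of a quaternary relationship as those four fields, and the helper names load_records and select are all assumptions made for illustration; the actual file layout should be checked against the figshare deposit. The three sample rows are approximate textbook values used only as placeholders and are not taken from the database.

```python
import csv
import io
import json
from typing import NamedTuple


class Record(NamedTuple):
    """One assumed quaternary relationship: compound, specifier, value, unit."""
    compound: str
    specifier: str  # assumed labels: "Curie" or "Néel"
    value: float
    unit: str


# Hypothetical CSV layout with placeholder rows (not drawn from the database).
SAMPLE_CSV = """compound,specifier,value,unit
Fe,Curie,1043,K
Ni,Curie,627,K
MnO,Néel,118,K
"""


def load_records(handle):
    """Parse CSV rows from a file-like object into Record tuples."""
    reader = csv.DictReader(handle)
    return [
        Record(row["compound"], row["specifier"], float(row["value"]), row["unit"])
        for row in reader
    ]


def select(records, specifier, min_kelvin=0.0):
    """Keep records of one transition type at or above a temperature in kelvin."""
    return [
        r for r in records
        if r.specifier == specifier and r.unit == "K" and r.value >= min_kelvin
    ]


records = load_records(io.StringIO(SAMPLE_CSV))
hot_curie = select(records, "Curie", min_kelvin=700.0)
print([r.compound for r in hot_curie])

# The same records serialised as JSON, keeping non-ASCII characters readable.
as_json = json.dumps([r._asdict() for r in records], ensure_ascii=False)
print(as_json)
```

For a real file, io.StringIO(SAMPLE_CSV) would be replaced by an opened file handle, for example open(path, newline="", encoding="utf-8"), where path points to the downloaded CSV.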