A framework to evaluate machine learning crystal stability predictions
English
From OAIster®, provided by the OCLC Cooperative.
Acknowledgements: J.R. acknowledges support from the German Academic Scholarship Foundation (Studienstiftung). A.A.L. acknowledges support from the Royal Society. A.J. and K.A.P. acknowledge the US Department of Energy, Office of Science, Office of Basic Energy Sciences, Materials Sciences and Engineering Division under contract no. DE-AC02-05-CH11231 (Materials Project programme KC23MP). This work used computational resources provided by the National Energy Research Scientific Computing Center (NERSC), a US Department of Energy Office of Science User Facility operated under contract no. DE-AC02-05-CH11231. We thank H.-C. Wang, S. Botti and M. A. L. Marques for their valuable contribution in crafting and freely sharing the WBM dataset. We thank R. Armiento, F. A. Faber and A. S. Parackal for helping develop the evaluation procedures for Wren upon which this work builds. We also thank R. Elijosius for assisting in the initial implementation of Wrenformer and M. Neumann, L. Barroso-Luque and Y. Park for submitting compliant models to the leaderboard. We thank J. Blake Gibson, S. Ping Ong, C. Chen, T. Xie, P. Zhong and E. Dogus Cubuk for helpful discussions.
<jats:title>Abstract</jats:title> <jats:p>The rapid adoption of machine learning in various scientific domains calls for the development of best practices and community agreed-upon benchmarking tasks and metrics. We present Matbench Discovery as an example evaluation framework for machine learning energy models, here applied as pre-filters to first-principles computed data in a high-throughput search for stable inorganic crystals. We address the disconnect between (1) thermodynamic stability and formation energy and (2) retrospective and prospective benchmarking for materials discovery. Alongside this paper, we publish a Python package to aid with future model submissions and a growing online leaderboard with adaptive user-defined weighting of various performance metrics, allowing researchers to prioritize the metrics they value most. To answer the question of which machine learning methodology performs best at materials discovery, our initial release includes random forests, graph neural networks, one-shot predictors, iterative Bayesian optimizers and universal interatomic potentials. We highlight a misalignment between commonly used regression metrics and more task-relevant classification metrics for materials discovery. Accurate regressors are susceptible to unexpectedly high false-positive rates if those accurate predictions lie close to the decision boundary at 0 eV per atom above the convex hull. The benchmark results demonstrate that universal interatomic potentials have advanced sufficiently to effectively and cheaply pre-screen thermodynamically stable hypothetical materials in future expansions of high-throughput materials databases.</jats:p>