Result: An asynchronous writing method for restart files in the gysela code in prevision of exascale systems

Title:
An asynchronous writing method for restart files in the gysela code in prevision of exascale systems
Contributors:
Institut de Recherche sur la Fusion par confinement Magnétique (IRFM), Direction de Recherche Fondamentale (CEA) (DRF (CEA)), Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Maison de la Simulation (MDLS), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Institut National de Recherche en Informatique et en Automatique (Inria)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
Source:
CEMRACS 2012, Jul 2012, Luminy, France. ⟨10.1051/proc/201343007⟩
Publisher Information:
CCSD, 2012.
Publication Year:
2012
Collection:
collection:CEA
collection:CNRS
collection:MDLS
collection:DSM-IRFM
collection:UVSQ
collection:GENCI
collection:CEA-DRF
collection:CEA-CAD
Original Identifier:
HAL: hal-01048745
Document Type:
Conference conferenceObject
Conference papers
Language:
English
Relation:
info:eu-repo/semantics/altIdentifier/doi/10.1051/proc/201343007
DOI:
10.1051/proc/201343007
Rights:
info:eu-repo/semantics/OpenAccess
Accession Number:
edshal.hal.01048745v1
Database:
HAL

Further Information

The present work deals with an optimization procedure developed in the full-f global GYrokinetic SEmi-LAgrangian code (GYSELA). Optimizing the writing of the restart files is necessary to reduce the computational cost of crashes. These files are very large, particularly for very large mesh sizes. The limited bandwidth of the data pipe between the computing nodes and the storage system induces a non-scalable part in the GYSELA code, which increases with the mesh size. Indeed, the time needed to transfer data from RAM to the storage system depends linearly on the file size. A non-synchronized file-writing procedure is therefore essential. A new GYSELA module has been developed for this purpose. This asynchronous procedure allows the restart files to be written frequently while avoiding the severe slowdown caused by the limited writing bandwidth. The method has been extended to generate a checksum for the restart files and to automatically rerun the code after a crash, whatever its cause.
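
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the GYSELA module itself: the function names `write_restart_async` and `read_restart`, the use of a Python thread, the SHA-256 checksum, and the file layout (digest followed by payload) are all assumptions chosen for illustration. The solver state is copied into a private buffer, a background thread writes it together with its checksum, and the reader rejects any file whose checksum does not match.

```python
import hashlib
import os
import threading
from typing import Optional

# Hypothetical layout: 32-byte SHA-256 digest followed by the payload.
DIGEST_SIZE = hashlib.sha256().digest_size


def write_restart_async(state: bytearray, path: str) -> threading.Thread:
    """Start writing `state` to `path` in the background; return the thread."""
    # Private immutable copy, taken before returning, so the caller can keep
    # modifying `state` while the write is in progress.
    snapshot = bytes(state)

    def worker() -> None:
        tmp = path + ".tmp"
        digest = hashlib.sha256(snapshot).digest()
        with open(tmp, "wb") as f:
            f.write(digest)
            f.write(snapshot)
            f.flush()
            os.fsync(f.fileno())  # push the data to the storage system
        # Rename only after a complete write, so `path` never holds a
        # partially written file.
        os.replace(tmp, path)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def read_restart(path: str) -> Optional[bytes]:
    """Return the payload if the file exists and its checksum matches."""
    try:
        with open(path, "rb") as f:
            blob = f.read()
    except OSError:
        return None
    digest, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != digest:
        return None
    return payload
```

In this sketch the caller would continue computing after `write_restart_async` returns and call `join()` on the returned thread before starting the next write. A launcher script could call `read_restart` at startup and fall back to an older restart file when it returns `None`; the automatic rerun mentioned in the abstract is not shown here.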