Result: Massively Parallel Algorithms and Data Structures for Extreme-Scale Matrix-Free Simulations on Hybrid Tetrahedral Grids
Further information
Simulations are crucial to answer fundamental and practical questions of science and engineering, especially when theoretical assessments and experiments are insufficient, ineligible, or unfeasible. Unfortunately, analytical solutions of partial differential equations that model relevant physical phenomena are rarely known, motivating the development of efficient methods for their numerical approximation. Meaningful conclusions require high spatial and temporal resolution, leading to discrete problems that only efficient computational resources can handle. To demonstrate the challenges of the approximation of partial differential equations and their solutions at the extreme scale, this thesis utilizes the simulation of convection in the Earth’s mantle as an example application. Subject to buoyancy-driven creeping motion, the mantle material’s movement critically impacts the development of mountain ranges, mid-ocean ridges, earthquakes, and volcanism. The lack of suitable measurement techniques for the experimental quantification of its properties calls for the aid of computational methods. However, highly resolved mantle circulation simulations are an ambitious endeavor since discretizing the underlying mathematical model with a global resolution of 1 km results in linear systems with trillions ($10^{12}$) of unknowns. Their sheer size requires the design of specialized solution algorithms and data structures. In particular, extreme-scalable, massively parallel, matrix-free methods of optimal space- and time-complexity are necessary. This thesis addresses those requirements via the concept of Hybrid Tetrahedral Grids. The fundamental data structure builds upon an unstructured coarse mesh approximating the domain geometry. Subsequent regular refinement leads to a hierarchical, block-structured, tetrahedral grid. 
It enables the implementation of efficient, massively parallel, matrix-free finite element methods, focusing on geometric multigrid to solve the arising linear systems. A significant contribution of this thesis is the development of the open-source software HyTeG, which realizes the features above in a modular, sustainable, and flexible framework for state-of-the-art and future supercomputers. Five core publications describe and study the architecture and implementation of relevant algorithms and data structures, strongly focusing on extreme scalability. The concrete contributions include the generalization of the core concepts to enable arbitrary finite element discretizations. Extreme-scale studies of matrix-free multigrid solvers for the Stokes system demonstrate the parallel performance subject to the notion of textbook multigrid efficiency, solving saddle point problems with about $3 \cdot 10^{12}$ unknowns on more than 140,000 processes. The thesis contributes a massively parallel particle-based Eulerian-Lagrangian method for the advection-diffusion equation and corresponding scalability experiments with more than $5 \cdot 10^{10}$ particles. Finally, it presents a massively parallel in-memory checkpointing scheme that enables efficient recovery after runtime faults during simulations and scales to more than 260,000 processes. This thesis’ detailed description of the corresponding algorithms, data structures, discretization techniques, communication patterns, and performance models, as well as the contribution to the software itself, builds a sustainable foundation for the simulation of even more complex physical systems at the extreme scale. Those are essential to study, validate, and verify detailed, quantitative assertions related to various applications.
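The abstract's claim that a global 1 km resolution leads to about $10^{12}$ unknowns, and that matrix-free methods are therefore necessary, can be checked with a back-of-envelope estimate. The following is a minimal sketch and not part of the thesis or of HyTeG: the two radii are commonly quoted approximate values assumed here, the mantle is treated as a perfect spherical shell, one unknown per cubic kilometer is assumed, and the 15 nonzeros per matrix row are a hypothetical figure chosen only for illustration.

```python
import math

# Assumed, approximate values (not taken from the abstract).
EARTH_RADIUS_KM = 6371.0   # mean Earth radius
CMB_RADIUS_KM = 3480.0     # radius of the core-mantle boundary


def shell_volume_km3(r_outer_km: float, r_inner_km: float) -> float:
    """Volume of a spherical shell in cubic kilometers."""
    return 4.0 / 3.0 * math.pi * (r_outer_km**3 - r_inner_km**3)


def estimate_cells(resolution_km: float) -> float:
    """Number of cubes of edge length `resolution_km` that fill the mantle shell."""
    volume = shell_volume_km3(EARTH_RADIUS_KM, CMB_RADIUS_KM)
    return volume / resolution_km**3


def vector_bytes(n_unknowns: float, bytes_per_value: int = 8) -> float:
    """Memory for one solution vector in double precision."""
    return n_unknowns * bytes_per_value


def csr_matrix_bytes(n_rows: float, nnz_per_row: int) -> float:
    """Memory for a CSR matrix with 8-byte values and 8-byte indices.

    64-bit indices are assumed because the row count exceeds 2**31.
    """
    values_and_columns = n_rows * nnz_per_row * (8 + 8)
    row_pointers = (n_rows + 1) * 8
    return values_and_columns + row_pointers


if __name__ == "__main__":
    n = estimate_cells(resolution_km=1.0)
    tb = 1e12  # bytes per terabyte (decimal)
    print(f"cells at 1 km resolution: {n:.2e}")
    print(f"one vector:               {vector_bytes(n) / tb:.1f} TB")
    # Hypothetical 15 nonzeros per row, for illustration only.
    print(f"assembled CSR matrix:     {csr_matrix_bytes(n, 15) / tb:.1f} TB")
```

Under these assumptions, the cell count lands just below $10^{12}$, and an assembled sparse matrix needs roughly 30 times the memory of a single solution vector. That ratio, rather than the exact numbers, is what makes avoiding matrix storage attractive at this scale.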