Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 1

Time and Date: 13:35 - 15:15 on 11th June 2018

Room: M3

Chair: Xing Cai

319 Development of scalable three-dimensional elasto-plastic nonlinear wave propagation analysis method for earthquake damage estimation of soft grounds
Abstract: In soft, complex grounds, earthquakes cause damage involving large deformation, such as landslides and subsidence. Elasto-plastic models are suitable constitutive equations for soils when evaluating nonlinear wave propagation with large ground deformation. However, no existing elasto-plastic nonlinear wave propagation analysis method is capable of simulating a large-scale soil deformation problem. In this study, we developed a scalable elasto-plastic nonlinear wave propagation analysis program based on the three-dimensional nonlinear finite-element method. The program attains 86.2% strong scaling efficiency from 240 CPU cores to 3840 CPU cores of the PRIMEHPC FX10 based Oakleaf-FX, with 8.85 TFLOPS (15.6% of peak) performance on 3840 CPU cores. We verified the elasto-plastic nonlinear wave propagation program through convergence analysis and conducted an analysis with large deformation for an actual soft ground modeled using 47,813,250 degrees of freedom.
Atsushi Yoshiyuki, Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori and Lalith Wijerathne
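As a reading aid for the scaling figures quoted in abstract 319, the short Python sketch below works through the arithmetic. It assumes the usual definition of strong scaling efficiency (measured speedup divided by the ideal speedup given by the core-count ratio); the function names are hypothetical and the abstract itself does not spell out the definition used.

```python
def strong_scaling_speedup(efficiency, base_cores, scaled_cores):
    """Speedup implied by a strong scaling efficiency.

    Assumes efficiency = measured speedup / ideal speedup,
    where the ideal speedup is scaled_cores / base_cores.
    """
    return efficiency * scaled_cores / base_cores


def implied_peak(achieved, fraction_of_peak):
    """Peak performance implied by an achieved rate and its fraction of peak."""
    return achieved / fraction_of_peak


# Figures taken from the abstract: 86.2% efficiency from 240 to 3840 cores,
# and 8.85 TFLOPS reported as 15.6% of peak on 3840 cores.
speedup = strong_scaling_speedup(0.862, 240, 3840)
peak_tflops = implied_peak(8.85, 0.156)

print(f"implied speedup over the 240-core run: {speedup:.2f}x (ideal: 16x)")
print(f"implied peak on 3840 cores: {peak_tflops:.1f} TFLOPS")
```

Under that assumption, a 16-fold increase in cores at 86.2% efficiency corresponds to a speedup of roughly 13.8, and the quoted fraction of peak implies a machine peak of about 57 TFLOPS on 3840 cores.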
297 A New Matrix-free Approach for Large-scale Geodynamic Simulations and its Performance
Abstract: We report on a two-scale approach for efficient matrix-free finite element simulations. The proposed method is based on surrogate element matrices constructed by low-order polynomial approximations and is applied to a Stokes-like PDE system with variable viscosity, a key component in mantle convection models. We set the basis for a rigorous performance analysis inspired by the concept of parallel textbook multigrid efficiency and study the weak scaling behavior on SuperMUC, a peta-scale supercomputer system. For a real-world geodynamical model, we achieve a parallel efficiency of 95% on up to 47,250 compute cores. Our largest simulation uses a trillion (O(10^12)) degrees of freedom for a global mesh resolution of 1.7 km.
Simon Bauer, Markus Huber, Marcus Mohr, Ulrich Rüde and Barbara Wohlmuth
384 Viscoelastic Crustal Deformation Computation Method with Reduced Random Memory Accesses for GPU-based Computers
Abstract: The computation of crustal deformation following a given fault slip is important for understanding earthquake generation processes and reducing damage. In crustal deformation analysis, reflecting the complex geometry and material heterogeneity of the crust is important, and a large-scale unstructured finite-element method is suitable for this purpose. However, since the computation area is large, the computation cost has been a bottleneck. In this study, we develop a fast unstructured finite-element solver for GPU-based large-scale computers. By computing several time steps together, we reduce random memory accesses, and we use predictors suited to viscoelastic analysis to reduce the total computational cost. The developed solver achieved a 2.79-fold speedup over the conventional solver. We show an application example of the developed method through a viscoelastic deformation analysis of the Eastern Mediterranean crust and mantle following a hypothetical M 9 earthquake in Greece, using a finite-element model with 2,403,562,056 degrees of freedom.
Takuma Yamaguchi, Kohei Fujita, Tsuyoshi Ichimura, Anne Glerum, Ylona van Dinther, Takane Hori, Olaf Schenk, Muneo Hori and Maddegedara Lalith
26 An Event Detection Framework for Virtual Observation System: Anomaly Identification for an ACME Land Simulation
Abstract: Based on previous work on in-situ data transfer infrastructure and compiler-based software analysis, we have designed a virtual observation system for real-time computer simulations. This paper presents an event detection framework for a virtual observation system. By applying signal processing and detection approaches to memory-based data streams, this framework can be reconfigured to capture both high-frequency and low-frequency events. The approaches used in the framework can dramatically reduce the data transfer needed for in-situ data analysis (between distributed computing nodes or between CPU/GPU nodes). In the paper, we also use a terrestrial ecosystem simulation within the Earth System Model to demonstrate the practical value of this effort.
Zhuo Yao and Dali Wang
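To make the idea in abstract 26 concrete, the Python sketch below shows one simple way to detect events on an in-memory data stream so that only the flagged samples need to be transferred. It is an illustration under stated assumptions, not the authors' framework: the rolling z-score rule, the function name `detect_events`, and the `window` and `threshold` parameters are all hypothetical choices made for this example.

```python
from collections import deque
from statistics import mean, pstdev


def detect_events(stream, window=8, threshold=3.0):
    """Yield (index, value) for samples far from the recent rolling window.

    Hypothetical rule: a sample is an event when it lies more than
    `threshold` population standard deviations from the mean of the
    previous `window` samples.
    """
    recent = deque(maxlen=window)
    for i, value in enumerate(stream):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield i, value
        # The sample joins the window only after it has been tested.
        recent.append(value)


# Synthetic stream: small fluctuations around 1.0 with one spike at index 8.
samples = [1.0, 1.1, 0.9, 1.0, 1.1, 0.9, 1.0, 1.1, 9.0, 1.0, 1.1, 0.9]
events = list(detect_events(samples))
print(events)
```

In this sketch a shorter window reacts to fast changes and a longer one to slow drifts, which is one possible reading of reconfiguring a detector for high-frequency or low-frequency events; only the yielded pairs would have to leave the compute node.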

Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 2

Time and Date: 15:45 - 17:25 on 11th June 2018

Room: M3

Chair: Xing Cai

184 Enabling Adaptive Mesh Refinement for Single Components of ECHAM6
Abstract: Adaptive mesh refinement (AMR) can be used to improve climate simulations, since these exhibit features on multiple scales that would be too expensive to resolve using a uniform mesh. In particular, paleo-climate simulations, as done in the framework of the German PalMod project, only allow for low-resolution simulations. Instead of constructing a complex model such as an earth system model (ESM) based on AMR, it is desirable to apply AMR to single components of the existing ESM. We explore the applicability of a forest-of-trees data structure for incorporating AMR into an existing model. The performance of the data structure is tested in an idealized test case using a numerical scheme for tracer transport in ECHAM6. The numerical results show that the data structure is compatible with the data structure of the original model and demonstrate improved efficiency compared to non-adaptive meshes.
Yumeng Chen, Konrad Simon and Jörn Behrens
228 Efficient and accurate evaluation of Bezier tensor product surfaces
Abstract: This article proposes a bivariate compensated Volk and Schumaker (CompVSTP) algorithm, which extends the compensated Volk and Schumaker (CompVS) algorithm, to evaluate Bezier tensor product surfaces with floating-point coefficients and coordinates. The CompVSTP algorithm is obtained by applying error-free transformations to improve the traditional Volk and Schumaker tensor product (VSTP) algorithm. We study in detail the forward error analysis of the VSTP, CompVS and CompVSTP algorithms. Our numerical experiments illustrate that the CompVSTP algorithm is much more accurate than the VSTP algorithm, relegating the influence of the condition numbers up to second order in the rounding unit of the computer.
Jing Lan, Hao Jiang and Peibing Du
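Abstract 228 relies on error-free transformations, which recover the exact rounding error of a floating-point addition or multiplication so that it can be added back as a correction. The Python sketch below illustrates that idea on a simpler, well-known relative: compensated Horner evaluation of a univariate polynomial in the monomial basis. It is not the Volk and Schumaker algorithm and not the authors' CompVSTP algorithm; the function names are chosen for this example, and IEEE 754 double precision without overflow or underflow is assumed.

```python
from fractions import Fraction


def two_sum(a, b):
    """Error-free addition (Knuth): returns (s, e) with s + e == a + b exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def split(a):
    """Veltkamp splitting of a double into high and low parts."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Error-free multiplication (Dekker): returns (p, e) with p + e == a * b exactly."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, e


def horner(coeffs, x):
    """Plain Horner evaluation; coeffs are ordered from highest degree down."""
    r = coeffs[0]
    for c in coeffs[1:]:
        r = r * x + c
    return r


def comp_horner(coeffs, x):
    """Compensated Horner: accumulates the rounding errors and adds them back."""
    r, corr = coeffs[0], 0.0
    for c in coeffs[1:]:
        p, e_mul = two_prod(r, x)
        r, e_add = two_sum(p, c)
        corr = corr * x + (e_mul + e_add)
    return r + corr


# (x - 1)^7 in expanded form is badly conditioned near x = 1.
coeffs = [1.0, -7.0, 21.0, -35.0, 35.0, -21.0, 7.0, -1.0]
x = 1.001
exact = (Fraction(x) - 1) ** 7  # exact value for the stored double x
err_plain = abs(Fraction(horner(coeffs, x)) - exact)
err_comp = abs(Fraction(comp_horner(coeffs, x)) - exact)
print(float(err_plain), float(err_comp))
```

The exact reference value is computed with rational arithmetic, so the two printed numbers are the true absolute errors of the plain and compensated evaluations for this input.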