Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Cockatoo

Chair: K. Nakajima

454 Progress towards nonhydrostatic adaptive mesh dynamics for multiscale climate models (Invited) [abstract]
Abstract: Many of the atmospheric phenomena with the greatest potential impact in future warmer climates are inherently multiscale. Such meteorological systems include hurricanes and tropical cyclones, atmospheric rivers, and other types of hydrometeorological extremes. These phenomena are challenging to simulate in conventional climate models because their uniform model resolutions are coarse relative to the native nonhydrostatic scales of the phenomenological dynamics. To enable studies of these systems with sufficient local resolution for the multiscale dynamics yet with sufficient speed for climate-change studies, we have adapted existing adaptive mesh dynamics packages for global atmospheric modeling. In this talk, we present an adaptive, conservative finite volume approach for moist non-hydrostatic atmospheric dynamics. The approach is based on the compressible Euler equations on 3D thin spherical shells, where the radial direction is treated implicitly (using a fourth-order Runge-Kutta IMEX scheme) to eliminate time step constraints from vertical acoustic waves. Refinement is performed only in the horizontal directions. The spatial discretization is the equiangular cubed-sphere mapping, with a fourth-order accurate discretization to compute flux averages on faces. By using both space- and time-adaptive mesh refinement, the solver allocates computational effort only where greater accuracy is needed. The resulting method is demonstrated to be highly accurate for model problems, robust at solution discontinuities, and stable for large aspect ratios. We present dynamical-core comparisons of moist physics using a simplified physics package. Bio: William D. Collins is an internationally recognized expert in climate modeling and climate change science. His personal research concerns the interactions among greenhouse gases and aerosols, the coupled climate system, and global environmental change. At Lawrence Berkeley National Laboratory (LBNL), Dr. 
Collins serves as the Director for the Climate and Ecological Sciences Division. At the University of California, Berkeley, he teaches in the Department of Earth and Planetary Science and directs the new multi-campus Climate Readiness Institute (CRI). Dr. Collins’s role in launching the Department of Energy’s Accelerated Climate Model for Energy (ACME) program was awarded the U.S. Department of Energy Secretary’s Achievement Award on May 7, 2015. He is also a Fellow of the American Association for the Advancement of Science (AAAS). He was a Lead Author on the Fourth Assessment of the Intergovernmental Panel on Climate Change (IPCC), for which the IPCC was awarded the 2007 Nobel Peace Prize, and was also a Lead Author on the recent Fifth Assessment. Before joining Berkeley and Berkeley Lab, Dr. Collins was a senior scientist at the National Center for Atmospheric Research (NCAR) and served as Chair of the Scientific Steering Committee for the DOE/NSF Community Climate System Model project. Dr. Collins received his undergraduate degree in physics from Princeton University and earned an M.S. and Ph.D. in astrophysics from the University of Chicago.
William Collins, Hans Johansen, Travis O'Brien, Jeff Johnson, Elijah Goodfriend and Noel Keen
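As a toy illustration of the implicit-vertical idea in the abstract above: treating a stiff term implicitly removes its time-step constraint while the slow dynamics stay explicit. This is a minimal first-order scalar sketch under assumed parameters, not the paper's fourth-order IMEX Runge-Kutta scheme on the cubed sphere; all names here are illustrative.

```python
import numpy as np

def imex_euler_step(u, dt, k_fast, slow_rhs):
    """One first-order IMEX Euler step for du/dt = slow_rhs(u) - k_fast*u.

    The stiff linear term (-k_fast*u, standing in for vertical acoustic
    dynamics) is treated implicitly, so dt is not limited by 1/k_fast.
    """
    # Explicit contribution from the slow ("horizontal") dynamics, then an
    # implicit solve: u_new * (1 + dt*k_fast) = u + dt*slow_rhs(u).
    return (u + dt * slow_rhs(u)) / (1.0 + dt * k_fast)

# Stiff model problem: fast decay rate 1e4, slow relaxation toward 1.
k_fast = 1.0e4
slow = lambda u: 1.0 - u
u, dt = 0.0, 0.1          # dt * k_fast = 1000 >> 1: explicit Euler would blow up
for _ in range(100):
    u = imex_euler_step(u, dt, k_fast, slow)
# u settles at the steady state 1/(1 + k_fast) despite the large time step.
```

The same split is what lets the paper refine only horizontally: the vertical (stiff, implicit) direction never dictates the global time step.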
276 Towards characterizing the variability of statistically consistent Community Earth System Model simulations [abstract]
Abstract: Large, complex codes such as earth system models are in a constant state of development, requiring frequent software quality assurance. The recently developed Community Earth System Model (CESM) Ensemble Consistency Test (CESM-ECT) provides an objective measure of statistical consistency for new CESM simulation runs, which has greatly facilitated error detection and rapid feedback for model users and developers. CESM-ECT determines consistency based on an ensemble of simulations that represent the same earth system model. Its statistical distribution embodies the natural variability of the model. Clearly the composition of the employed ensemble is critical to CESM-ECT's effectiveness. In this work we examine whether the composition of the CESM-ECT ensemble is adequate for characterizing the variability of a consistent climate. To this end, we introduce minimal code changes into CESM that should pass the CESM-ECT, and we evaluate the composition of the CESM-ECT ensemble in this context. We suggest an improved ensemble composition that better captures the accepted variability induced by code changes, compiler changes, and optimizations, thus more precisely facilitating the detection of errors in the CESM hardware or software stack as well as enabling more in-depth code optimization and the adoption of new technologies.
Daniel Milroy, Allison Baker, Dorit Hammerling, John Dennis, Sheri Mickelson, Elizabeth Jessup
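To make the consistency-testing idea above concrete, here is a deliberately simplified sketch: a new run is flagged when a summary statistic falls outside the ensemble's spread. The real CESM-ECT operates in a PCA-transformed space over many variables; this plain per-variable z-score check, with made-up variable values, only illustrates the principle.

```python
import numpy as np

def ensemble_consistency(ensemble, new_run, n_sigma=3.0):
    """Return a boolean array: True where the new run lies within the
    ensemble spread (|z| <= n_sigma per variable).

    ensemble: (n_members, n_vars) summary statistics, one row per member.
    new_run:  (n_vars,) statistics for the run under test.
    """
    mu = ensemble.mean(axis=0)
    sigma = ensemble.std(axis=0, ddof=1)
    z = np.abs(new_run - mu) / sigma
    return z <= n_sigma

# Synthetic 151-member ensemble of two global means (values are made up).
rng = np.random.default_rng(0)
ens = rng.normal(loc=[280.0, 1.0e5], scale=[0.5, 50.0], size=(151, 2))
ok = ensemble_consistency(ens, np.array([280.2, 1.0e5]))   # within spread
bad = ensemble_consistency(ens, np.array([295.0, 1.0e5]))  # far outside in var 0
```

The paper's question, in these terms, is whether the ensemble rows capture enough accepted variability (compilers, optimizations, benign code changes) that such a test rejects only genuine errors.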
318 A New Approach to Ocean Eddy Detection, Tracking, and Event Visualization -Application to the Northwest Pacific Ocean- [abstract]
Abstract: High-resolution ocean general circulation models have advanced the numerical study of ocean eddies. To gain an understanding of ocean eddies from the large volume of data produced by simulations, visualizing just the distribution of eddies at each time step is insufficient; time-variations in eddy events and phenomena must also be considered. However, existing methods cannot accurately detect and track eddy events such as amalgamation and bifurcation. In this study, we propose a new approach for eddy detection, tracking, and event visualization based on an eddy classification system. The proposed method detects streams and currents in addition to eddies, and it classifies detected eddies into several categories using the additional stream and current information. By tracking how the classified eddies vary over time, it is possible to detect events such as eddy amalgamation and bifurcation as well as the interaction between eddies and ocean currents. We visualize the detected eddies and events in a time series of images (or animation), enabling us to gain an intuitive understanding of a region of interest hidden in a high-resolution data set.
Daisuke Matsuoka, Fumiaki Araki, Yumi Inoue, Hideharu Sasaki
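For readers unfamiliar with eddy detection, a classical baseline is the Okubo-Weiss criterion: rotation-dominated regions (W < 0) mark eddy cores. The sketch below uses that standard criterion on a synthetic velocity field; it is not the paper's classification scheme, which additionally detects streams and currents to categorize and track eddies.

```python
import numpy as np

def okubo_weiss(u, v, dx=1.0, dy=1.0):
    """Okubo-Weiss parameter W = s_n^2 + s_s^2 - omega^2 on a 2-D grid.

    W < 0 where rotation dominates strain, i.e. inside eddy cores.
    u, v are velocity components on a regular grid (axis 0 = y, axis 1 = x).
    """
    du_dy, du_dx = np.gradient(u, dy, dx)
    dv_dy, dv_dx = np.gradient(v, dy, dx)
    s_n = du_dx - dv_dy        # normal strain
    s_s = dv_dx + du_dy        # shear strain
    omega = dv_dx - du_dy      # relative vorticity
    return s_n**2 + s_s**2 - omega**2

# Solid-body vortex u = -y, v = x: pure rotation, so W = -4 everywhere.
y, x = np.mgrid[-1:1:21j, -1:1:21j]
W = okubo_weiss(-y, x, dx=0.1, dy=0.1)
```

Tracking then amounts to matching W < 0 regions between time steps; amalgamation and bifurcation appear as regions merging or splitting, which is where the paper's classification-based approach improves on this baseline.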
285 SC-ESAP: A Parallel Application Platform for Earth System Model [abstract]
Abstract: Earth system models are among the most complex simulation software ever developed; they form the basis for understanding and predicting climate change and are important tools for supporting climate-related decisions. CAS-ESM, the Chinese Academy of Sciences Earth System Model, is developed by the Institute of Atmospheric Physics (IAP) and its collaborators. The system contains the complete set of components of the climate and ecological-environment systems, including a global atmospheric general circulation model (AGCM), a global oceanic general circulation model (OGCM), an ice model, a land model, an atmospheric chemistry model, a dynamic global vegetation model (DGVM), an ocean biogeochemistry model (OBM), and a regional climate model (RCM). Since CAS-ESM is designed as a scalable and pluggable system, a parallel software platform (SC-ESSP) is needed. SC-ESSP will be developed as an open software platform running on China's earth system numerical simulation facilities for a range of developers and users, which requires standard, unified component models and a pluggable, high-performance, easy-to-use platform. To achieve this goal, a parallel software application platform named SC-ESAP is designed for CAS-ESM based on the Community Earth System Model (CESM) platform; it mainly comprises compile and run scripts, standard and unified component models, a 3-D coupler component, a coupler interface creator, and parallelization and optimization work. In the longer term, a component framework, SC-ESMF, will be developed based on the SC-Tangram framework.
Jinrong Jiang, Tianyi Wang, Xuebin Chi, Huiqun Hao, Yuzhu Wang, Yiran Chen, He Zhang
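The coupled-platform design above can be pictured with a minimal component/coupler sketch: every model exposes the same init/run/finalize interface and exchanges fields through shared coupled state. This is a generic ESMF-style illustration under assumed names (Component, couple, the toy fields), not the actual SC-ESAP API.

```python
class Component:
    """Minimal pluggable component interface (init/run/finalize)."""
    name = "base"
    def init(self, state): pass
    def run(self, state, dt): pass
    def finalize(self, state): pass

class Atmosphere(Component):
    name = "atm"
    def run(self, state, dt):
        # Toy dynamics: relax air temperature toward the sea surface temperature.
        state["t_air"] += dt * (state["sst"] - state["t_air"])

class Ocean(Component):
    name = "ocn"
    def run(self, state, dt):
        # Toy dynamics: SST responds weakly to the air temperature.
        state["sst"] += 0.1 * dt * (state["t_air"] - state["sst"])

def couple(components, state, dt, n_steps):
    """Sequential coupler: each step, run every component on the shared state."""
    for c in components:
        c.init(state)
    for _ in range(n_steps):
        for c in components:
            c.run(state, dt)
    for c in components:
        c.finalize(state)
    return state

state = couple([Atmosphere(), Ocean()],
               {"t_air": 0.0, "sst": 10.0}, dt=0.5, n_steps=200)
```

The pluggability requirement in the abstract corresponds to the uniform interface here: a new component model (ice, land, chemistry, ...) slots into the coupler loop without changing it.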

Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 2

Time and Date: 14:30 - 16:10 on 6th June 2016

Room: Cockatoo

Chair: Yifeng Cui

554 Xeon and Xeon Phi-Aware Kernel Design for Seismic Simulations using ADER-DG FEM (Invited) [abstract]
Abstract: Kernels in the ADER-DG method, when solving the elastic wave equations, boil down to sparse and dense matrix multiplications of small sizes. Using the earthquake simulation code SeisSol as an example, we will investigate how these routines can be implemented and sped up with the code-generation tool LIBXSMM. Along these lines, we will analyze the tradeoffs of switching from sparse to dense matrix multiplication kernels and report performance with respect to time-to-solution and energy consumption. Bio: Alexander Heinecke studied Computer Science and Finance and Information Management at Technische Universität München, Germany. In 2010 and 2012, he completed internships at Intel in Munich, Germany and at Intel Labs in Santa Clara, CA, USA. In 2013 he completed his Ph.D. studies at TUM and joined Intel's Parallel Computing Lab in Santa Clara in 2014. His core research topic is the use of multi- and many-core architectures in advanced scientific computing applications. In 2014, he and his co-authors were selected as Gordon Bell finalists for running multi-physics earthquake simulations at multi-petaflop performance on more than 1.5 million cores.
Alexander Heinecke
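The sparse-versus-dense tradeoff mentioned above can be sketched as a simple cost model: a sparse kernel executes fewer flops but at lower machine efficiency (irregular access, no clean vectorization), while a dense kernel wastes flops on zeros but runs near peak. The efficiency numbers below are made-up placeholders, not LIBXSMM measurements.

```python
import numpy as np

def choose_kernel(A, dense_efficiency=0.9, sparse_efficiency=0.25):
    """Pick sparse vs dense multiplication for a small operator matrix A.

    Cost = useful work / achieved efficiency; the dense kernel touches all
    A.size entries, the sparse one only the nonzeros. Efficiencies here are
    illustrative assumptions.
    """
    nnz = np.count_nonzero(A)
    dense_cost = A.size / dense_efficiency
    sparse_cost = nnz / sparse_efficiency
    return "sparse" if sparse_cost < dense_cost else "dense"

stiffness = np.triu(np.ones((9, 9)))   # ~55% nonzero: dense kernel wins
very_sparse = np.eye(9)                # ~11% nonzero: sparse kernel wins
```

In a real code generator this decision is made per operator at build time, which is why the talk can report both time-to-solution and energy for each choice.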
391 Octree-Based Multiple-Material Parallel Unstructured Mesh Generation Method for Seismic Response Analysis of Soil-Structure Systems [abstract]
Abstract: We developed an unstructured finite element mesh generation method capable of modeling multiple-material complex geometry problems for large-scale seismic analysis of soil-structure systems. We used an octree structure to decompose the target domain into small subdomains and use the multiple material marching cubes method for robust and parallel tetrahedralization of each subdomain. By using the developed method on a 32 core shared memory machine, we could generate a 594,168,792 tetrahedral element soil-structure model of a power plant in 13 h 01 min. The validity of the generated model was confirmed by conducting a seismic response analysis on 2,304 compute nodes of the K computer at RIKEN. Although the model contains a small approximation in geometry (half of the minimum octree size) at present, we can expect fast and high quality meshing of large-scale models by making geometry correction in the future, which is expected to help improve the seismic safety of important structures and complex urban systems.
Kohei Fujita, Keisuke Katsushima, Tsuyoshi Ichimura, Muneo Hori, Maddegedara Lalith
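The decomposition step described above can be illustrated with a bare-bones octree: subdivide the domain recursively until each leaf is simple enough to mesh independently (here, "simple" just means few points). The real method additionally meshes each leaf in parallel with the multiple-material marching cubes method; everything below is an illustrative sketch.

```python
def build_octree(points, bounds, max_points=8, depth=0, max_depth=8):
    """Recursively subdivide a cubic domain (xmin, ymin, zmin, size) until
    each leaf holds at most max_points points. Returns a nested dict with
    either "points" (leaf) or "children" (8 sub-cells)."""
    if len(points) <= max_points or depth == max_depth:
        return {"bounds": bounds, "points": points}
    x0, y0, z0, s = bounds
    h = s / 2.0
    children = []
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                b = (x0 + i * h, y0 + j * h, z0 + k * h, h)
                sub = [p for p in points
                       if b[0] <= p[0] < b[0] + h
                       and b[1] <= p[1] < b[1] + h
                       and b[2] <= p[2] < b[2] + h]
                children.append(build_octree(sub, b, max_points, depth + 1, max_depth))
    return {"bounds": bounds, "children": children}

def leaf_point_count(node):
    """Total points over all leaves (sanity check: nothing is lost)."""
    if "points" in node:
        return len(node["points"])
    return sum(leaf_point_count(c) for c in node["children"])

# Points clustered near the origin: only that corner of the tree refines.
points = [(0.1 * i / 16, 0.1 * i / 16, 0.1 * i / 16) for i in range(16)]
tree = build_octree(points, (0.0, 0.0, 0.0, 1.0))
```

Because leaves are independent, each can be tetrahedralized in parallel, which is what makes the 594-million-element model in the abstract feasible on a 32-core machine.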
385 Parallel Iterative Solvers for Ill-conditioned Problems with Heterogeneous Material Properties [abstract]
Abstract: The efficiency and robustness of preconditioned parallel iterative solvers, based on domain decomposition for ill-conditioned problems with heterogeneous material properties, are evaluated in the present work. The preconditioning method is based on the BILUT(p,d,t) method proposed by the author in a previous study, and two types of domain decomposition procedures, LBJ (Localized Block Jacobi) and HID (Hierarchical Interface Decomposition), are considered. The proposed methods are implemented using the Hetero3D code, which is a parallel finite-element benchmark program for solid mechanics problems, and the code provides excellent scalability and robustness on up to 240 nodes (3,840 cores) of the Fujitsu PRIMEHPC FX10 (Oakleaf-FX) at the Information Technology Center, the University of Tokyo. Generally, HID provides better efficiency and robustness than LBJ for a wide range of values of parameters.
Kengo Nakajima
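A serial analogue of the Localized Block Jacobi (LBJ) approach above is block-Jacobi preconditioned CG: each diagonal block of the matrix plays the role of one subdomain and is inverted independently, with no coupling across blocks. This sketch uses exact block inverses rather than the paper's BILUT(p,d,t) factorization, and omits HID entirely.

```python
import numpy as np

def block_jacobi_pcg(A, b, block, tol=1e-10, max_iter=500):
    """Conjugate gradients with a block-Jacobi preconditioner."""
    n = len(b)
    # One inverse per diagonal block ("subdomain"); blocks never couple.
    inv_blocks = [np.linalg.inv(A[i:i + block, i:i + block])
                  for i in range(0, n, block)]
    def apply_M(r):
        z = np.empty_like(r)
        for idx, i in enumerate(range(0, n, block)):
            z[i:i + block] = inv_blocks[idx] @ r[i:i + block]
        return z
    x = np.zeros(n)
    r = b - A @ x
    z = apply_M(r)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = apply_M(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# Heterogeneous 1-D diffusion: the coefficient jumps by 1e4 mid-domain,
# a small analogue of the ill-conditioned heterogeneous problems in the paper.
n = 64
coef = np.where(np.arange(n + 1) < n // 2, 1.0, 1.0e4)
A = np.diag(coef[:-1] + coef[1:]) - np.diag(coef[1:-1], 1) - np.diag(coef[1:-1], -1)
b = np.ones(n)
x = block_jacobi_pcg(A, b, block=8)
```

The HID alternative studied in the paper keeps some coupling through a hierarchy of interface unknowns, which is why it tends to be more robust than the fully decoupled variant sketched here.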

Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 3

Time and Date: 16:40 - 18:20 on 6th June 2016

Room: Cockatoo

Chair: Yifeng Cui

549 Inside the Pascal GPU Architecture and Benefits to Seismic Applications (Invited) [abstract]
Abstract: Stencil computations are one of the major computational patterns for seismic applications. In this talk I will first describe techniques to implement stencil computations efficiently on GPU. Then I will introduce the Pascal architecture in NVIDIA's latest Tesla P100 GPU, especially focusing on new architecture features such as HBM2 and NVLINK. I will highlight how those features will enable significant performance improvement for seismic applications. Pascal also introduces GPU page faulting, which enables Unified Virtual Memory (UVM) on the GPU. I will illustrate how UVM simplifies GPU programming by removing the need to manage GPU data manually in the code while still delivering good performance in most cases. Bio: Peng Wang is a senior engineer in the HPC developer technology group of NVIDIA, where he works on parallelizing and optimizing scientific applications on GPU. One of his main focuses is optimizing seismic algorithms on GPU. He received his Ph.D. in computational astrophysics from Stanford University.
Peng Wang
433 High-productivity Framework for Large-scale GPU/CPU Stencil Applications [abstract]
Abstract: A high-productivity framework for multi-GPU and multi-CPU computation of stencil applications is proposed. Our framework is implemented in C++ and CUDA. It automatically translates user-written stencil functions that update a grid point into both GPU and CPU code. Programmers write user code in plain C++ and can execute the translated code, with optimizations, on either multiple multicore CPUs or multiple GPUs. The user code can be executed on multiple GPUs with an auto-tuning mechanism and an overlapping method that hides communication cost behind computation. It can also be executed on multiple CPUs with OpenMP. A compressible flow code on GPU exploiting the optimizations provided by the framework ran 2.7 times faster than the non-optimized version.
Takashi Shimokawabe, Takayuki Aoki, Naoyuki Onodera
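The core abstraction above — the user writes only a per-point update function, the framework supplies the loops and parallelization — can be mimicked in a few lines. This plain Python/NumPy sketch stands in for the paper's C++/CUDA code generation; the function names are illustrative.

```python
import numpy as np

def apply_stencil(update, grid):
    """Apply a user-written point-update function at every interior point.

    This loop is what the framework generates (as optimized GPU/CPU code);
    the user supplies only `update`.
    """
    new = grid.copy()
    ny, nx = grid.shape
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            new[j, i] = update(grid, j, i)
    return new

def diffusion(g, j, i, nu=0.1):
    """User-written 5-point diffusion stencil for a single grid point."""
    return g[j, i] + nu * (g[j - 1, i] + g[j + 1, i]
                           + g[j, i - 1] + g[j, i + 1] - 4.0 * g[j, i])

grid = np.zeros((32, 32))
grid[16, 16] = 1.0            # point source of heat
for _ in range(10):
    grid = apply_stencil(diffusion, grid)
# Heat spreads symmetrically; total heat is conserved in the interior.
```

In the multi-GPU setting, the framework additionally splits `grid` across devices and overlaps the halo exchange at subdomain edges with the interior computation, which is the overlapping method the abstract refers to.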
305 GPU acceleration of a non-hydrostatic ocean model with a multigrid Poisson/Helmholtz solver [abstract]
Abstract: To meet the demand for fast and detailed calculations in numerical ocean simulations, we implemented a non-hydrostatic ocean model on a graphics processing unit (GPU). We improved the model’s Poisson/Helmholtz solver by optimizing the memory access, using instruction-level parallelism, and applying a mixed precision calculation to the preconditioning of the Poisson/Helmholtz solver. The GPU-implemented model was 4.7 times faster than a comparable central processing unit execution. The output errors due to this implementation will not significantly influence oceanic studies.
Takateru Yamagishi, Yoshimasa Matsumura
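The mixed-precision idea above — the preconditioner only needs to approximate the inverse, so it can run in reduced precision without degrading the final answer — can be sketched with CG in float64 whose preconditioning step runs in float32. This is a simple Jacobi-preconditioned illustration, not the model's actual multigrid preconditioner.

```python
import numpy as np

def mixed_precision_pcg(A, b, tol=1e-10, max_iter=1000):
    """Float64 CG with the Jacobi preconditioner applied in float32."""
    d32 = np.diag(A).astype(np.float32)   # preconditioner data in float32
    x = np.zeros_like(b)
    r = b - A @ x
    # Low-precision preconditioner application, promoted back to float64.
    z = (r.astype(np.float32) / d32).astype(np.float64)
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = (r.astype(np.float32) / d32).astype(np.float64)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

# 1-D Poisson test problem (tridiagonal, symmetric positive definite).
n = 100
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
x = mixed_precision_pcg(A, b)
```

On a GPU the payoff is bandwidth: the float32 preconditioner halves the memory traffic of that step, which is typically the dominant cost of a Poisson/Helmholtz solve.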