### Large Scale Computational Physics (LSCP) Session 1

#### Time and Date: 10:15 - 11:55 on 2nd June 2015

#### Room: V102

#### Chair: Fukuko YUASA

757 | Workshop on Large Scale Computational Physics - LSCP | Abstract: The LSCP workshop focuses on symbolic and numerical methods and simulations, and on algorithms and tools (software and hardware) for developing and running large-scale computations in the physical sciences. Special attention goes to parallelism, scalability and high numerical precision. System architectures are also of interest as long as they support physics-related calculations, such as massively parallel systems, GPUs, many-integrated-cores, distributed (cluster, grid/cloud) computing, and hybrid systems. Topics are chosen from areas including: theoretical physics (high energy physics, nuclear physics, astrophysics, cosmology, quantum physics, accelerator physics), plasma physics, condensed matter physics, chemical physics, molecular dynamics, bio-physical system modeling, material science/engineering, nanotechnology, fluid dynamics, complex and turbulent systems, and climate modeling. |
Elise de Doncker, Fukuko Yuasa |

96 | The Particle Accelerator Simulation Code PyORBIT | Abstract: The particle accelerator simulation code PyORBIT is presented. The structure, implementation, history, parallel and simulation capabilities, and future development of the code are discussed. The PyORBIT code is a new implementation and extension of the algorithms of the original ORBIT code, which was developed for the Spallation Neutron Source accelerator at the Oak Ridge National Laboratory. The PyORBIT code has a two-level structure. The upper level uses the Python programming language to control the flow of intensive calculations performed by the lower-level code, which is implemented in C++. The parallel capabilities are based on MPI communications. PyORBIT is an open-source code accessible to the public through the Google Open Source Projects Hosting service. |
Andrei Shishlo |

115 | Simulations of several finite-sized objects in plasma | Abstract: The interaction of plasma with finite-sized objects is one of the central problems in plasma physics. Since object charging is often nonlinear and involved, it is advisable to address this problem with numerical simulations. First-principle simulations allow the trajectories of charged plasma particles to be studied in self-consistent force fields. One such approach is the particle-in-cell (PIC) method, where the use of a spatial grid for the force calculation significantly reduces the computational complexity. Implementing finite-sized objects in PIC simulations is often a challenging task. In this work we present simulation results and discuss the numerical representation of objects in the DiP3D code, which enables studies of several independent objects in various plasma environments. |
Wojciech Miloch |
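
To illustrate why a spatial grid reduces the cost of the force calculation, here is a minimal 1D periodic PIC force step. It is a sketch under stated assumptions, not the DiP3D implementation: the function names, the cloud-in-cell weighting, the unit choice (permittivity set to 1) and the 1D Gauss's-law field solve are all chosen here for illustration, and no finite-sized objects are represented. The work scales with the number of particles plus the number of grid cells, rather than with the number of particle pairs.

```python
def deposit_charge(positions, charges, n_cells, length):
    """Cloud-in-cell deposit: share each charge linearly between its two
    neighbouring grid nodes on a periodic domain of the given length."""
    dx = length / n_cells
    rho = [0.0] * n_cells
    for x, q in zip(positions, charges):
        s = (x % length) / dx          # position in units of grid cells
        j = int(s) % n_cells           # node to the left of the particle
        w = s - int(s)                 # fractional distance to that node
        rho[j] += q * (1.0 - w) / dx
        rho[(j + 1) % n_cells] += q * w / dx
    return rho


def grid_field(rho, length):
    """Integrate dE/dx = rho (permittivity = 1, an assumption) on the grid.
    The mean charge is removed so that a periodic solution exists, and the
    mean field is set to zero."""
    n = len(rho)
    dx = length / n
    mean_rho = sum(rho) / n
    r = [v - mean_rho for v in rho]
    field = [0.0] * n
    for j in range(n - 1):
        field[j + 1] = field[j] + 0.5 * (r[j] + r[j + 1]) * dx
    mean_field = sum(field) / n
    return [e - mean_field for e in field]


def gather_forces(positions, charges, field, length):
    """Interpolate the grid field back to each particle with the same
    linear weights used in the deposit, then multiply by the charge."""
    n = len(field)
    dx = length / n
    forces = []
    for x, q in zip(positions, charges):
        s = (x % length) / dx
        j = int(s) % n
        w = s - int(s)
        e = (1.0 - w) * field[j] + w * field[(j + 1) % n]
        forces.append(q * e)
    return forces


# Two opposite charges on a unit-length periodic domain with 8 cells.
positions = [0.25, 0.5]
charges = [1.0, -1.0]
rho = deposit_charge(positions, charges, 8, 1.0)
forces = gather_forces(positions, charges, grid_field(rho, 1.0), 1.0)
```

In this example the two particles attract each other with equal and opposite forces. In two or three dimensions the field solve becomes a Poisson solve on the grid, but the deposit and gather steps keep the same structure.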

196 | DiamondTorre GPU implementation algorithm of the RKDG solver for fluid dynamics and its using for the numerical simulation of the bubble-shock interaction problem | Abstract: In this paper, a solver based on the RKDG method for the three-dimensional Euler equations of gas dynamics is considered. The numerical scheme is implemented on GPUs with an algorithm called DiamondTorre, which speeds up the calculations. The interaction of a spherical bubble with a planar shock wave is considered in the three-dimensional setting. The computed results agree with known experimental results and numerical simulations. The results were obtained on a PC. |
Boris Korneev, Vadim Levchenko |

460 | Optimal Temporal Blocking for Stencil Computation | Abstract: Temporal blocking is a class of algorithms which reduces the required memory bandwidth (B/F ratio) of a given stencil computation, by “blocking” multiple time steps. In this paper, we prove that, under certain conditions, a lower limit exists for the B/F ratio attainable by temporal blocking. We introduce PiTCH tiling, an example of a temporal blocking method that achieves the optimal B/F ratio. We estimate the performance of PiTCH tiling for various stencil applications on several modern CPUs. We show that PiTCH tiling achieves 1.5 to 2 times better B/F reduction in three-dimensional applications, compared to other temporal blocking schemes. We also show that PiTCH tiling can remove the bandwidth bottleneck from most of the stencil applications considered. |
Takayuki Muranushi, Junichiro Makino |
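
The idea of blocking multiple time steps can be sketched with a 1D three-point stencil. The example below uses simple overlapped tiling with redundant halo computation, which is not the PiTCH tiling of the paper; the function names, the averaging stencil and the fixed-end boundary condition are assumptions made for illustration. Each tile is read from the full array once, together with a halo of `steps` cells on each side, advanced `steps` time steps locally, and written back once, so the traffic to main memory per time step is reduced at the cost of some recomputation in the halos.

```python
def step(u):
    """One sweep of a 3-point averaging stencil; both end points stay fixed."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = (u[i - 1] + u[i] + u[i + 1]) / 3.0
    return new


def naive(u, steps):
    """Reference version: sweep the whole array once per time step."""
    for _ in range(steps):
        u = step(u)
    return u


def temporally_blocked(u, steps, tile):
    """Overlapped temporal blocking: advance each tile by `steps` time steps
    using a local copy that carries a halo of `steps` cells per side."""
    n = len(u)
    out = u[:]
    for start in range(0, n, tile):
        end = min(start + tile, n)
        lo = max(start - steps, 0)     # halo, clipped at the domain edge
        hi = min(end + steps, n)
        local = u[lo:hi]               # the only read from the full array
        for _ in range(steps):
            local = step(local)
        # Halo cells are contaminated by the artificial local boundary,
        # one more cell per step, so only the tile interior is written back.
        out[start:end] = local[start - lo:end - lo]
    return out


# The blocked version reproduces the naive result.
u0 = [float((7 * i) % 5) for i in range(40)]
blocked = temporally_blocked(u0, 4, 8)
reference = naive(u0, 4)
```

In plain Python the two versions cost about the same; the benefit appears in compiled code when a tile plus its halo fits in cache while the full array does not.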