Large Scale Computational Physics (LSCP) Session 1

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG D 5.2

Chair: Fukuko Yuasa

-6	Workshop on Large-Scale Computational Physics - LSCP 2017 [abstract] Abstract: [No abstract available]	Elise de Doncker and Fukuko Yuasa
272	Solution of Few-Body Coulomb Problems with Latent Matrices on Multicore Processors [abstract] Abstract: We re-formulate a classical numerical method for the solution of systems of linear equations to tackle problems with latent data, that is, linear systems whose dimension is a priori unknown. This type of system appears in the solution of few-body Coulomb problems for Atomic Simulation Physics, in the form of multidimensional partial differential equations (PDEs) that require the numerical solution of a sequence of recurrent dense linear systems of growing scale. The large dimension of these systems, with up to several hundred thousand unknowns, is tackled in our approach via a task-parallel implementation of a solver based on the QR factorization. This method is parallelized using the OmpSs framework, showing fair strong and weak scalability on a multicore processor equipped with 12 Intel cores.	Luis Biedma, Flavio Colavecchia and Enrique S. Quintana-Orti
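As a rough illustration of the building block named in the abstract above, the following is a minimal serial sketch of solving a dense linear system through a QR factorization. It is not the authors' task-parallel OmpSs solver and does not handle the latent (a priori unknown) dimension; the function names `back_substitute` and `solve_via_qr` are hypothetical, and NumPy is assumed.

```python
import numpy as np

def back_substitute(R, y):
    """Solve R x = y for an upper-triangular, nonsingular R."""
    n = R.shape[0]
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        # Subtract the already-known part of row k, then divide by the pivot.
        x[k] = (y[k] - R[k, k + 1:] @ x[k + 1:]) / R[k, k]
    return x

def solve_via_qr(A, b):
    """Solve the square dense system A x = b using A = Q R."""
    Q, R = np.linalg.qr(A)      # Q has orthonormal columns, R is upper triangular
    return back_substitute(R, Q.T @ b)

# Illustrative use on a sequence of systems of growing size (synthetic data).
rng = np.random.default_rng(0)
for n in (10, 20, 40):
    A = rng.standard_normal((n, n)) + n * np.eye(n)
    x_true = rng.standard_normal(n)
    x = solve_via_qr(A, A @ x_true)
    print(n, np.max(np.abs(x - x_true)))
```

Each system in the loop is factorized from scratch, which is the simplest possible choice and ignores any reuse of earlier factorizations that a production solver for recurrent systems might exploit.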
340	A Global Network for Non-Collective Communication in Autonomous Systems [abstract] Abstract: Large-scale simulation enables realistic 3D reproductions of micro-structure evolution in many problems of computational material science [1]. With an increasing number of processing units, global communications become a bottleneck and limit the scalability. Therefore, NAStJA decomposes the simulated domain into small blocks and distributes those blocks over the processing units. Interacting processing units build a local neighborhood and act autonomously in this neighborhood. This limits the number of connections for each processing unit and therefore the local communication overhead, and leads to high scalability. Apart from the communication between local neighborhoods, a global information exchange is required. We explain the conditions and requirements for this exchange and present the benefits of a multidimensional Manhattan street network [2-4]. It is simple but sufficiently fast for a global information exchange, if the information is not time critical, i.e. the exchange has to be global only after several time steps. This global network satisfies the requirements for a global block management that connects the autonomous processes. Because of its super-linear scaling the approach is very useful for massively parallel simulations. The block distribution scales in a linear manner, and the communication overhead of the global block management can be neglected, such that small blocks benefit from cache effects and result in a super-linear scaling, i.e. efficiency higher than unity. The global information exchange is based on a multi-hop exchange, where each message is sent to the direct neighbors and then spread to the whole network in a specified number of hops. Between these hops the computation goes on, so that the global exchange overlaps with the computation. The number of hops must be small enough not to influence the simulated physics.
NAStJA supports regular grids with a calculating stencil sweeping through the simulated domain. In computational material science many problems can be described using phase-field methods or cellular automata, both based on a regular grid. This is a rewarding task for parallel programming. However, many of these problems require calculations only in small regions of the simulated domain. This is why NAStJA allocates and distributes only those blocks that contain such a computing region. As the computing regions move in the simulated domain throughout the simulation, the corresponding blocks are created or deleted autonomously by the processes in the local neighborhood. The overhead for the local neighborhood communication is acceptable compared to the allocation of unneeded blocks. The current implementation of NAStJA is under heavy development; however, it is already being employed for a phase-field method specifically for droplets [5], a phase-field crystal model [6, 7] and for the Potts model, a cellular automaton for biological cells [8]. It can be simply extended with a wide range of algorithms that work on finite-difference schemes or other regular grid methods. These techniques allow advancing to previously infeasible, extremely large-scale simulation. Especially for phase-field simulations, the computing region is only a small part of the simulated domain. Here the calculation occurs only in the interface region between the phases. As an illustration, the morphology of a water droplet on a structured surface simulated with the phase-field method has a small computing region, which is the interface region between the water and the surrounding gas. The simulated quantities are constant inside and outside of the droplet. In phase-field simulation the width of the interface is chosen as about 10 cells. Using a regular grid, the mandatory resolution of the finest structure defines the scale and thus the total number of cells in the simulation domain.
For a 1 µl droplet and a structure size of 20 nm with a resolution of at least twice the interface width, this results in a simulation domain of > 10^12 cells. This is too large for a traditional phase-field code that allocates the whole simulated domain and results in an intractable computational task. The presented techniques from NAStJA address these issues and improve the feasibility of large-scale simulation. We show measurements and theoretical calculations for the Manhattan street network compared to a global collective communication. As an example application we present the phase-field method.
[1] Martin Bauer, Johannes Hötzer, Marcus Jainta, Philipp Steinmetz, Marco Berghoff, Florian Schornbaum, Christian Godenschwager, Harald Köstler, Britta Nestler, and Ulrich Rüde. Massively parallel phase-field simulations for ternary eutectic directional solidification. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, page 8. ACM, 2015. [2] Bhumip Khasnabish. Topological properties of Manhattan street networks. Electronics Letters, 25(20):1388–1389, 1989. [3] Tein-Yaw Chung and Dharma P. Agrawal. Design and analysis of multidimensional Manhattan street networks. IEEE Transactions on Communications, 41(2):295–298, 1993. [4] Francesc Comellas, Cristina Dalfó, and Miguel Angel Fiol. Multidimensional Manhattan street networks. SIAM Journal on Discrete Mathematics, 22(4):1428–1447, 2008. [5] Marouen Ben Said, Michael Selzer, Britta Nestler, Daniel Braun, Christian Greiner, and Harald Garcke. A phase-field approach for wetting phenomena of multiphase droplets on solid surfaces. Langmuir, 30(14):4033–4039, 2014. [6] Ken R. Elder, Nikolas Provatas, Joel Berry, Peter Stefanovic, and Martin Grant. Phase-field crystal modeling and classical density functional theory of freezing. Physical Review B, 75(6):064107, 2007. [7] Marco Berghoff and Britta Nestler. Phase field crystal modeling of ternary solidification microstructures. Computational Condensed Matter, 4:46–58, 2015. [8] François Graner and James A. Glazier. Simulation of biological cell sorting using a two-dimensional extended Potts model. Physical Review Letters, 69(13):2013, 1992.	Marco Berghoff and Ivan Kondov
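The multi-hop exchange described in the abstract above can be illustrated with a toy model. The sketch below is not NAStJA code: it assumes a two-dimensional Manhattan street network on a torus in which even rows point east, odd rows west, even columns south and odd columns north (one common convention), and it only counts hops, ignoring the computation that the abstract says runs between them. The function names `out_neighbors` and `hops_until_global` are hypothetical.

```python
def out_neighbors(i, j, rows, cols):
    """Return the two directed links leaving node (i, j) (assumed convention)."""
    dj = 1 if i % 2 == 0 else -1   # even rows run east, odd rows run west
    di = 1 if j % 2 == 0 else -1   # even columns run south, odd columns north
    return [(i, (j + dj) % cols), ((i + di) % rows, j)]

def hops_until_global(rows, cols):
    """Count hops until every node holds the message of every other node.

    In each hop, every node forwards everything it currently knows to its
    two direct out-neighbors.
    """
    if rows % 2 or cols % 2:
        raise ValueError("rows and cols must both be even")
    nodes = [(i, j) for i in range(rows) for j in range(cols)]
    known = {n: {n} for n in nodes}       # each node starts with its own message
    total = len(nodes)
    hops = 0
    while any(len(k) < total for k in known.values()):
        if hops >= total:
            raise RuntimeError("exchange did not become global")
        # Collect incoming messages from the state before this hop.
        incoming = {n: set() for n in nodes}
        for n in nodes:
            for m in out_neighbors(*n, rows, cols):
                incoming[m] |= known[n]
        for n in nodes:
            known[n] |= incoming[n]
        hops += 1
    return hops

if __name__ == "__main__":
    for size in (4, 8, 16):
        print(size * size, "nodes:", hops_until_global(size, size), "hops")
```

Each node keeps only two outgoing links regardless of the network size, which is the property the abstract relies on to bound the per-process communication overhead.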
419	Parallel Acoustic Field Simulation with Respect to Scattering of Sound on Local Inhomogeneities [abstract] Abstract: The report presents an approach developed for the simulation of acoustic fields in enclosed media. This method is based on the use of Rayleigh's integral for the calculation of secondary sources generated by a wave falling onto media boundaries. The algorithm is highly parallelizable: it consists of loosely coupled parallel branches with only a few points of inter-thread communication. On the other hand, the algorithm is exponential in the average number of reflections undergone by a single wave element emitted by a primary source, although for practical applications this number can be reduced enough to provide accurate results with reasonable time and space consumption. The proposed algorithm is based on the approximate superposition of acoustical fields and provides adequate results, as long as the equations of acoustics used are linear. To calculate scattering properties of reflecting boundaries, the algorithm represents a geometric model of the sound propagation medium as a set of small flat vibrating pistons. Each wave element falling onto such a piston makes it radiate reflected sound in all directions, which makes it possible to construct an algorithm that accepts sets of sources and reflecting surfaces. It also yields a field distribution over specified points such that each source, primary or secondary, can be associated with an element of parallel execution and be managed via a list of polymorphic sources implementing a task list. The report covers a mathematical formulation of the problem, defines an object model used to implement the algorithm, and provides some analysis of the algorithm in sequential and parallel forms.	Andrey Chusov, Lubov Statsenko, Alexsey Lysenko, Sergey Kuligin, Nina Cherkassova, Petr Unru and Maya Bernavskaya
508	Large-Scale Simulation of Cloud Cavitation Collapse [abstract] Abstract: We present a high-performance computing framework for large-scale simulation of compressible multicomponent flows, applied to cloud cavitation collapse. The governing equations are discretized by a Godunov-type finite volume method on a uniform structured grid. The bubble interface is captured by a diffuse interface method and treated as a mixing region of the liquid and gas phases. The framework is based on our Cubism library, which enables the efficient treatment of high-order compact stencil schemes, can harness the capabilities of massively parallel computer architectures, and allows for processing up to 10^13 computational cells. We present validations of our approach on several classical benchmark examples and study the collapse of a cloud of O(10^3) bubbles.	Ursula Rasthofer, Fabian Wermelinger, Panagiotis Hadjidoukas and Petros Koumoutsakos
597	Feynman loop numerical integral expansions for 3-loop vertex diagrams [abstract] Abstract: We address 3-loop vertex Feynman diagrams that have massless internal lines and may exhibit UV singularities. The computational methods target automatic numerical integration and extrapolation to approximate the leading coefficients of the integral expansion with respect to the dimensional regularization parameter. Convergence acceleration is achieved using linear extrapolation. Multivariate integration is performed with the ParInt software package, layered over MPI (Message Passing Interface) to speed up the computations. Integrand transformations alleviate the effect of singular behavior in the integrand.	Elise de Doncker and Fukuko Yuasa
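The idea of extracting leading expansion coefficients by linear extrapolation can be sketched as follows, under stated assumptions. The integral is assumed to behave like I(eps) = C_{-1}/eps + C_0 + C_1 eps + ... in the regularization parameter eps, and a synthetic closed-form function stands in for the numerically integrated values. Evaluating at n values of eps and truncating the expansion after n terms gives a linear system for the coefficients. The function name `extrapolate_coefficients` and the parameter `lowest_power` are hypothetical; this is not the ParInt interface, and NumPy is assumed.

```python
import numpy as np

def extrapolate_coefficients(f, eps_values, lowest_power=-1):
    """Approximate C_k in f(eps) ~ sum_k C_k eps**k, k = lowest_power, ...

    One equation per eps value; the number of unknown coefficients equals
    the number of eps values, so the truncated system is square.
    """
    eps = np.asarray(eps_values, dtype=float)
    powers = lowest_power + np.arange(eps.size)
    A = eps[:, None] ** powers[None, :]          # A[l, k] = eps_l ** power_k
    b = np.array([f(e) for e in eps])            # stand-in for integral values
    return np.linalg.solve(A, b)

# Synthetic stand-in with known coefficients C_{-1} = 2, C_0 = 3, C_1 = 0.5.
def synthetic(eps):
    return 2.0 / eps + 3.0 + 0.5 * eps

coeffs = extrapolate_coefficients(synthetic, [0.5, 0.25, 0.125])
print(coeffs)
```

With real integral values the truncation is only approximate, so in practice one would watch how the leading coefficients change as further eps values are added rather than trust a single solve.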