Mathematical Methods and Algorithms for Extreme Scale (MMAES) Session 1

Time and Date: 16:20 - 18:00 on 7th June 2016

Room: Rousseau West

Chair: Vassil Alexandrov

482 Reducing Communication in Distributed Asynchronous Iterative Methods [abstract]
Abstract: Communication costs have become an important factor in evaluating the performance of massively parallel algorithms. Asynchronous iterative methods have the potential to reduce these communication costs compared to their synchronous counterparts for solving systems of equations. The goal of this paper is to develop a communication-avoiding iterative method using an asynchronous implementation. Implemented using passive one-sided remote memory access (RMA) MPI functions, the method presented is a variation of the asynchronous Gauss-Seidel method. The variation is based on the Southwell method, where rows are relaxed greedily, instead of sequentially, by choosing the row with the maximum residual value. By comparing its residual value to those of its neighbors, a process decides to relax if it holds the residual of maximum modulus. Additionally, a parameter is experimentally determined that dictates how long a process waits after a successful relaxation in order to let its update propagate through its neighbors. Experimental results show that this method reduces communication costs compared to several other asynchronous iterative methods and the classic synchronous Jacobi method.
Jordi Wolfson-Pou, Edmond Chow
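The greedy row selection at the heart of the Southwell method can be illustrated with a minimal sequential sketch; the paper's actual contribution is a distributed asynchronous MPI variant, which is not reproduced here. All function and variable names below are illustrative, not from the paper.

```python
import numpy as np

def southwell(A, b, tol=1e-8, max_relaxations=10000):
    """Sequential Southwell iteration for Ax = b.

    Unlike Gauss-Seidel, which sweeps rows in order, each step
    greedily relaxes the row whose residual has the largest
    modulus -- the serial analogue of the selection rule in the
    abstract, where a process relaxes only if it holds the
    maximum-modulus residual among its neighbors.
    """
    n = len(b)
    x = np.zeros(n)
    r = b - A @ x                      # residual vector
    for _ in range(max_relaxations):
        i = int(np.argmax(np.abs(r)))  # row with maximum |residual|
        if abs(r[i]) < tol:
            break
        dx = r[i] / A[i, i]            # Gauss-Seidel-style update of x_i
        x[i] += dx
        r -= dx * A[:, i]              # incremental residual update
    return x

# Usage: a small diagonally dominant system, for which the
# iteration is guaranteed to converge.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x = southwell(A, b)
```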
321 A Robust Technique to Make a 2D Advection Solver Tolerant to Soft Faults [abstract]
Abstract: We present a general technique for solving Partial Differential Equations, called robust stencils, which makes solvers tolerant to soft faults, i.e. bit flips arising in memory or in CPU calculations. We show how it can be applied to a two-dimensional Lax-Wendroff solver. The resulting 2D robust stencils are derived using an orthogonal application of their 1D counterparts. Combinations of 3 to 5 base stencils can then be created. We describe how these are implemented in a parallel advection solver. Various robust stencil combinations are explored, representing tradeoffs between performance and robustness. The results indicate that the 3-stencil robust combinations are slightly faster on large parallel workloads than Triple Modular Redundancy (TMR), with one third of the memory footprint. We expect the improvement to be significant if suitable optimizations are performed. Because faults are avoided each time new points are computed, the proposed stencils are also comparably robust to TMR for a large range of error rates. The technique can be generalized to 3D (or higher dimensions) with similar benefits.
Peter Strazdins, Brian Lee, Brendan Harding, Jackson Mayo, Jaideep Ray, Robert Armstrong
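For reference, the 1D Lax-Wendroff stencil named in the abstract, together with the TMR baseline the paper compares against, can be sketched as follows. This is only a minimal illustration: the paper's robust stencils derive alternative stencils rather than triplicating the same one, and that derivation is not reproduced here.

```python
import numpy as np

def lax_wendroff_step(u, c):
    """One periodic 1D Lax-Wendroff step for the advection equation
    u_t + a u_x = 0, with Courant number c = a*dt/dx."""
    up = np.roll(u, -1)  # u[i+1] with periodic boundary
    um = np.roll(u, 1)   # u[i-1] with periodic boundary
    return u - 0.5 * c * (up - um) + 0.5 * c**2 * (up - 2 * u + um)

def tmr_step(u, c):
    """Triple Modular Redundancy baseline: compute the same update
    three times and take the pointwise median, which masks a fault
    corrupting any single copy."""
    copies = np.stack([lax_wendroff_step(u, c) for _ in range(3)])
    return np.median(copies, axis=0)

# Usage: advect a sine wave one step. At c = 1 the Lax-Wendroff
# update reduces to an exact one-cell shift, a handy sanity check.
u = np.sin(2 * np.pi * np.arange(64) / 64)
u_next = tmr_step(u, 1.0)
```

The memory-footprint claim in the abstract follows directly from this structure: TMR keeps three full copies of the state, whereas a robust stencil combination reuses one copy of the grid with differently shaped stencils.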