Tools for Program Development and Analysis in Computational Science (TOOLS) Session 1

Time and Date: 11:00 - 12:40 on 12th June 2014

Room: Bluewater II

Chair: Jie Tao

335 High Performance Message-Passing InfiniBand Communication Device for Java HPC
Abstract: MPJ Express is a Java messaging system that implements an MPI-like interface. It is used for writing parallel Java applications on High Performance Computing (HPC) hardware, including commodity clusters. The software can execute in multicore and cluster modes. In cluster mode, it currently supports Ethernet and Myrinet based interconnects and provides specialized communication devices for these networks. One recent trend in distributed memory parallel hardware is the emergence of the InfiniBand interconnect, a high-performance proprietary network that provides low latency and high bandwidth for parallel MPI applications. Currently there is no direct support available in Java (and hence in MPJ Express) to exploit the performance benefits of InfiniBand networks. The only option for running distributed Java programs over InfiniBand networks is to rely on TCP/IP emulation layers such as IP over InfiniBand (IPoIB) and Sockets Direct Protocol (SDP), which provide poor communication performance. To tackle this issue in the context of MPJ Express, this paper presents a low-level communication device called ibdev that can be used to execute parallel Java applications on InfiniBand clusters. MPJ Express is based on a layered architecture, so users can opt to use ibdev at runtime on an InfiniBand-equipped commodity cluster. ibdev improves Java application performance by accessing InfiniBand hardware through the native verbs API. Our performance evaluation reveals that MPJ Express achieves lower latency and higher bandwidth with this new device than with IPoIB and SDP. The improvement in communication performance is also evident in NAS parallel benchmark results, where ibdev helps MPJ Express achieve better scalability and speedups than IPoIB and SDP. The results show that it is possible to reduce the performance gap between Java and native languages with efficient support for low-level communication libraries.
Omar Khan, Mohsan Jameel, Aamir Shafi
300 A High Level Programming Environment for Accelerator-based Systems
Abstract: Some of the critical hurdles for the widespread adoption of accelerators in high performance computing are portability and programming difficulty. To be an effective HPC platform, these systems need a high level software development environment that facilitates the porting and development of applications, so that they are portable and run efficiently on either accelerators or CPUs. In this paper we present a high level parallel programming environment for accelerator-based systems, which consists of tightly coupled compilers, tools, and libraries that can interoperate and hide the complexity of the system. Ease of use comes from compilers that let users write applications in Fortran, C, or C++ with OpenACC directives; tools that help users port, debug, and optimize for both accelerators and conventional multi-core CPUs; and auto-tuned scientific libraries.
Luiz Derose, Heidi Poxon, James Beyer, Alistair Hart
277 Supporting relative debugging for large-scale UPC programs
Abstract: Relative debugging is a useful technique for locating errors that emerge from porting existing code to a new programming language or a new computing platform. Recent attention to the UPC programming language has resulted in a number of conventional parallel programs, for example MPI programs, being ported to UPC. This paper gives an overview of the data distribution concepts used in UPC and establishes the challenges in supporting the relative debugging technique for UPC programs that run on large supercomputers. The proposed solution is implemented in an existing parallel relative debugger, ccdb, and its performance is evaluated on a Cray XE6 system with 16,348 cores.
Minh Ngoc Dinh, David Abramson, Jin Chao, Bob Moench, Andrew Gontarek, Luiz Derose

Tools for Program Development and Analysis in Computational Science (TOOLS) Session 2

Time and Date: 14:10 - 15:50 on 12th June 2014

Room: Bluewater II

Chair: Jie Tao

97 Near Real-time Data Analysis of Core-Collapse Supernova Simulations With Bellerophon
Abstract: We present an overview of a software system, Bellerophon, built to support a production-level HPC application called CHIMERA, which simulates core-collapse supernova events at the petascale. Developed over the last four years, Bellerophon enables CHIMERA’s geographically dispersed team of collaborators to perform data analysis in near real-time. Its n-tier architecture provides an encapsulated, end-to-end software solution that enables the CHIMERA team to quickly and easily access highly customizable animated and static views of results from anywhere in the world via a web-deliverable, cross-platform desktop application. In addition, Bellerophon addresses software engineering tasks for the CHIMERA team by providing an automated mechanism for performing regression testing on a variety of supercomputing platforms. Elements of the team’s workflow management needs are met with software tools that dynamically generate code repository statistics, access important online resources, and monitor the current status of several supercomputing resources.
E. J. Lingerfelt, O. E. B. Messer, S. S. Desai, C. A. Holt, E. J. Lentz
148 Toward Better Understanding of the Community Land Model within the Earth System Modeling Framework
Abstract: One key factor in the improved understanding of earth system science is the development and improvement of high fidelity earth system models. As understanding of system processes deepens, the complexity of the software behind these modeling systems becomes a barrier to further rapid model improvement and validation. In this paper, we present our experience in better understanding the Community Land Model (CLM) within an earth system modeling framework. First, we give an overview of the software system of the global offline CLM system. Second, we present our approach to better understanding the CLM software structure and data structure using advanced software tools. After that, we focus on the practical issues related to CLM computational performance and individual ecosystem function. Since better software engineering practices are much needed for general scientific software systems, we hope these considerations can be beneficial to many other modeling research programs involving multiscale system dynamics.
Dali Wang, Joseph Schuchart, Tomislav Janjusic, Frank Winkler, Yang Xu, Christos Kartsaklis
155 Detecting and visualising process relationships in Erlang
Abstract: Static software analyser tools can help in program comprehension by detecting relations among program parts. Detecting relations among concurrent program parts, e.g. relations between processes, is not straightforward. In the case of dynamic languages, only a (good) approximation of the real dependencies can be calculated. In this paper we present algorithms to build a process relation graph for Erlang programs. The graph contains direct relations through message passing and hidden relations represented by ETS tables.
Melinda Tóth, István Bozó