Session 5: 16:20 - 18:00 on 13th June 2017

ICCS 2017 Main Track (MT) Session 5

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG F 30

Chair: Eleni Chatzi

267 Support managing population aging stress of emergency departments in a computational way [abstract]
Abstract: Old people usually have more complex health problems and use healthcare services more frequently than young people. It is obvious that the increase in old people, both in number and in proportion, will challenge emergency departments (EDs). This paper first presents a way to quantitatively predict and explain this challenge by using simulation techniques. Then, we outline the capability of simulation for decision support to overcome this challenge. Specifically, we use simulation to predict and explain the impact of population aging on an ED, in which a precise ED simulator, which has been validated for a public hospital ED, will be used to predict the behavior of an ED under population aging over the next 15 years. Our prediction shows that the stress of population aging on EDs can no longer be ignored and that ED upgrades must be carefully planned. Based on this prediction, the costs and benefits of several upgrade proposals are evaluated.
Zhengchun Liu, Dolores Rexachs, Francisco Epelde and Emilio Luque
146 Hemocell: a high-performance microscopic cellular library [abstract]
Abstract: We present a high-performance computational framework (Hemocell) with validated cell-material models, which provides the necessary tool to target challenging biophysical questions in relation to blood flows, e.g. the influence of transport characteristics on platelet bonding and aggregation. The dynamics of blood plasma are resolved by using the lattice Boltzmann method (LBM), while the cellular membranes are implemented using a discrete element method (DEM) coupled to the fluid as immersed boundary method (IBM) surfaces. In the current work a selected set of viable technical solutions is introduced and discussed, whose application translates to significant performance benefits. These solutions extend the applicability of our framework to up to two orders of magnitude larger, physiologically relevant settings.
Gábor Závodszky, Britt van Rooij, Victor Azizi, Saad Alowayyed and Alfons Hoekstra
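As background for the fluid component described in the abstract above, the sketch below shows a minimal single-component D2Q9 lattice Boltzmann (BGK) collision-and-streaming step in NumPy. It is a generic illustration of the LBM only, not Hemocell code; the grid size, relaxation time and periodic boundaries are arbitrary choices, and the DEM/IBM membrane coupling of the paper is not reproduced.

```python
# Minimal D2Q9 lattice Boltzmann (BGK) step; generic sketch, not Hemocell.
import numpy as np

# D2Q9 lattice velocities and weights
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)

def equilibrium(rho, u):
    """Second-order truncated equilibrium distribution (lattice units, cs^2 = 1/3)."""
    cu = np.einsum('qd,xyd->qxy', c, u)           # c_q . u
    usq = np.einsum('xyd,xyd->xy', u, u)          # |u|^2
    return rho * w[:, None, None] * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)

def bgk_step(f, tau):
    """One collision + periodic streaming step; f has shape (9, nx, ny)."""
    rho = f.sum(axis=0)
    u = np.einsum('qd,qxy->xyd', c, f) / rho[..., None]
    f = f + (equilibrium(rho, u) - f) / tau       # BGK collision
    for q in range(9):                            # streaming along c_q
        f[q] = np.roll(f[q], shift=tuple(c[q]), axis=(0, 1))
    return f

# usage: start from a fluid at rest and relax (illustrative sizes)
nx, ny, tau = 64, 64, 0.8
f = equilibrium(np.ones((nx, ny)), np.zeros((nx, ny, 2)))
for _ in range(100):
    f = bgk_step(f, tau)
```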
275 Brownian dynamics simulations to explore experimental microsphere diffusion with optical tweezers. [abstract]
Abstract: We develop two-dimensional Brownian dynamics simulations to examine the motion of disks under thermal fluctuations and Hookean forces. Our simulations are designed to be experiment-like, since the experimental conditions define the available time scales which characterize the solution of the Langevin equations. To define the fluid model and methodology, we explain the basics of the theory of Brownian motion applicable to quasi-two-dimensional diffusion of optically trapped microspheres. Using the data produced by the simulations, we propose an alternative methodology to calculate diffusion coefficients. We find that, using typical input parameters of video-microscopy experiments, the averaged values of the diffusion coefficient differ from the theoretical one by less than 1%.
Manuel Pancorbo, Miguel Ángel Rubio and Pablo Domínguez-García
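To make the setting of the abstract above concrete, here is a minimal 2-D overdamped Brownian dynamics sketch of a bead in a harmonic (optical-trap-like) potential, integrated with the Euler-Maruyama scheme, with the diffusion coefficient re-estimated from the short-lag mean-squared displacement. This is not the authors' code; the bead radius, trap stiffness, temperature and time step are assumed, illustrative values.

```python
# 2-D overdamped Brownian dynamics of a trapped bead; illustrative parameters.
import numpy as np

rng = np.random.default_rng(0)

kB, T = 1.380649e-23, 295.0           # J/K, K
a = 0.5e-6                            # bead radius (m), assumed
eta = 1.0e-3                          # water viscosity (Pa s)
gamma = 6*np.pi*eta*a                 # Stokes drag coefficient
D_theory = kB*T/gamma                 # Stokes-Einstein diffusion coefficient
k_trap = 1e-6                         # trap stiffness (N/m), assumed
dt, nsteps = 1e-4, 50_000             # video-microscopy-like sampling

x = np.zeros((nsteps, 2))
noise = rng.normal(size=(nsteps-1, 2)) * np.sqrt(2*D_theory*dt)
for i in range(nsteps-1):
    x[i+1] = x[i] - (k_trap/gamma)*x[i]*dt + noise[i]   # Euler-Maruyama step

# estimate D from the short-lag mean-squared displacement: MSD(t) ~ 4 D t in 2-D
lags = np.arange(1, 20)
msd = np.array([np.mean(np.sum((x[l:] - x[:-l])**2, axis=1)) for l in lags])
D_est = np.polyfit(lags*dt, msd, 1)[0] / 4
print(f"D_theory = {D_theory:.3e}  D_est = {D_est:.3e}  m^2/s")
```

Because the lags are kept well below the trap relaxation time gamma/k_trap, the MSD slope is dominated by free diffusion and the estimate lands close to the Stokes-Einstein value.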
377 Numerical simulation of a compound capsule in a constricted microchannel [abstract]
Abstract: Simulations of the passage of eukaryotic cells through a constricted channel aid in studying the properties of cancer cells and their transport through the bloodstream. Compound capsules, which explicitly model the outer cell membrane and nuclear lamina, have the potential to improve fidelity of computational models. However, general simulations of compound capsules through a constricted microchannel have not been conducted and the influence of the compound capsule model on computational performance is not well known. In this study, we extend a parallel hemodynamics application to simulate the fluid-structure interaction between compound capsules and fluid. With this framework, we compare the deformation of simple and compound capsules in constricted microchannels, and explore how this deformation depends on the capillary number and on the volume fraction of the inner membrane. The parallel performance of the computational framework in this setting is evaluated and lessons for future development are discussed.
John Gounley, Erik Draeger and Amanda Randles

ICCS 2017 Main Track (MT) Session 12

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG D 1.1

Chair: Manfred Trummer

490 Parallel Parity Games: a Multicore Attractor for the Zielonka Recursive Algorithm [abstract]
Abstract: Parity games are abstract infinite-duration two-player games, widely studied in computer science. Several solution algorithms have been proposed and also implemented in the community tool of choice called PGSolver, which has declared the Zielonka Recursive (ZR) algorithm the best performing on randomly generated games. With the aim of scaling and solving wider classes of parity games, several improvements and optimizations have been proposed over the existing algorithms. However, no one has yet explored the benefit of using the full computational power of which even common modern multicore processors are capable. This is even more surprising considering that most of the advanced algorithms in PGSolver are sequential. In this paper we introduce and implement, on a multicore architecture, a parallel version of the Attractor algorithm, which is the main kernel of the ZR algorithm. This choice follows our investigation showing that more than 99% of the execution time of the ZR algorithm is spent in this module. We provide tests on graphs with up to 20K nodes generated through PGSolver and we discuss a performance analysis in terms of strong and weak scaling.
Umberto Marotta, Aniello Murano, Rossella Arcucci and Loredana Sorrentino
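For reference, the sketch below gives the standard sequential attractor computation that the paper parallelises: the set of vertices from which a given player can force the play into a target set, computed backwards with successor counters. The graph encoding (dict of successor lists) and the toy game are illustrative choices, not taken from the paper.

```python
# Sequential attractor computation for parity/reachability games.
from collections import deque

def attractor(graph, owner, A, player):
    """graph: v -> list of successors; owner: v -> 0 or 1; A: target set."""
    attr = set(A)
    preds = {v: [] for v in graph}
    for v, succs in graph.items():
        for w in succs:
            preds[w].append(v)
    out_count = {v: len(graph[v]) for v in graph}   # edges not yet leading into attr
    queue = deque(attr)
    while queue:
        w = queue.popleft()
        for v in preds[w]:
            if v in attr:
                continue
            if owner[v] == player:
                attr.add(v); queue.append(v)        # one edge into attr suffices
            else:
                out_count[v] -= 1
                if out_count[v] == 0:               # every edge leads into attr
                    attr.add(v); queue.append(v)
    return attr

# toy game: player 0 owns even vertices, player 1 odd ones
g = {0: [1, 2], 1: [0], 2: [3], 3: [2]}
own = {v: v % 2 for v in g}
print(attractor(g, own, {0}, player=0))   # {0, 1}
```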
492 Replicated Synchronization for Imperative BSP Programs [abstract]
Abstract: The BSP model (Bulk Synchronous Parallel) simplifies the construction and evaluation of parallel algorithms, with its simplified synchronization structure and cost model. Nevertheless, imperative BSP programs can suffer from synchronization errors. Programs with textually aligned barriers are free from such errors, and this structure eases program comprehension. We propose a simplified formalization of barrier inference as data flow analysis, which verifies statically whether an imperative BSP program has replicated synchronization, which is a sufficient condition for textual barrier alignment.
Arvid Jakobsson, Frederic Dabrowski, Wadoud Bousdira, Frederic Loulergue and Gaetan Hains
496 IMCSim: Parameterized Performance Prediction for Implicit Monte Carlo Codes [abstract]
Abstract: We design an application model (IMCSim) of the implicit Monte Carlo particle code IMC using the Performance Prediction Toolkit (PPT), a discrete-event simulation-based modeling framework for predicting code performance on a large range of parallel platforms. We present validation results for IMCSim. We then use the fast parameter scanning that such a high-level loop-structure model of a complex code enables to predict optimal IMC parameter settings for interconnect latency hiding. We find that variations in interconnect bandwidth have a significant effect on optimal parameter values, thus suggesting the use of IMCSim as a pre-step to substantial IMC runs to quickly identify optimal parameter values for the specific hardware platform that IMC runs on.
Stephan Eidenbenz, Alex Long, Jason Liu, Olena Tkachenko and Robert Zerr
532 Efficient Implicit Parallel Patterns for GIS [abstract]
Abstract: With the growth of data, the need to parallelize processing becomes crucial in numerous domains. But for non-specialists it is still difficult to tackle parallelism technicalities such as data distribution, communications or load balancing. For the geoscience domain we propose a solution based on implicit parallel patterns. These patterns are abstract models for a class of algorithms which can be customized and automatically transformed into a parallel execution. In this paper, we describe a pattern for stencil computation and a novel pattern dealing with computation following a pre-defined order. They are particularly used in geosciences and we illustrate them with the flow direction and the flow accumulation computations.
Kevin Bourgeois, Sophie Robert, Sébastien Limet and Victor Essayan
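As an illustration of the kind of stencil-style kernel such patterns target, the sketch below computes D8 flow directions on a tiny elevation grid: each interior cell points to its steepest downslope neighbour. This is a plain sequential NumPy version with an assumed ESRI-style direction encoding, not the authors' pattern framework.

```python
# D8 flow-direction kernel on a digital elevation model; sequential sketch.
import numpy as np

# 8 neighbour offsets and an assumed ESRI-style direction code for each
OFFSETS = [(-1,-1),(-1,0),(-1,1),(0,1),(1,1),(1,0),(1,-1),(0,-1)]
CODES   = [32, 64, 128, 1, 2, 4, 8, 16]

def flow_direction(dem):
    """For each interior cell, return the D8 code of the steepest downslope neighbour."""
    nrows, ncols = dem.shape
    fdir = np.zeros_like(dem, dtype=np.int32)
    for i in range(1, nrows-1):
        for j in range(1, ncols-1):
            best, code = 0.0, 0
            for (di, dj), c in zip(OFFSETS, CODES):
                drop = (dem[i, j] - dem[i+di, j+dj]) / np.hypot(di, dj)
                if drop > best:
                    best, code = drop, c
            fdir[i, j] = code          # 0 marks a pit or flat cell
    return fdir

dem = np.array([[5., 5., 5., 5.],
                [5., 4., 3., 5.],
                [5., 3., 2., 5.],
                [5., 5., 1., 5.]])
print(flow_direction(dem))
```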
103 Taking Lessons Learned from a Proxy Application to a Full Application for SNAP and PARTISN [abstract]
Abstract: SNAP is a proxy application which simulates the computational motion of a neutral particle transport code, PARTISN. In this work, we have adapted parts of SNAP separately; we have re-implemented the iterative shell of SNAP in the task-model runtime Legion, showing an improvement to the original schedule, and we have created multiple Kokkos implementations of the computational kernel of SNAP, displaying similar performance to the native Fortran. We then translate our Kokkos experiments in SNAP to PARTISN, necessitating engineering development, regression testing, and further thought.
Geoffrey Womeldorff, Joshua Payne and Benjamin Bergen

Workshop on Teaching Computational Science and Bridging the HPC Talent Gap with Computational Science Research Methods (WTCS) Session 3

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG D 1.2

Chair: Angela B. Shiflet

260 Learning Outcomes based Evaluation of HPC Professional Training [abstract]
Abstract: Very often, when professional training courses and events are evaluated, the analysis is reduced to a set of statistics about who attended, how many they were, and whether they would come again. The more pertinent questions about the value and relevance of the acquired skills, and how the attendees could apply them to future work, tend to be side-tracked. BSC is committed to providing an international high-level education and training program. As part of that commitment, we are conducting a study of the learning outcomes of our courses, and looking into the effect they have on the transfer of knowledge into improved work methodologies/routines. Supercomputing centres worldwide, such as BSC, face the challenge of supporting a growing base of HPC users with little or no HPC experience and a mainly scientific background. In addition, HPC training centres need to respond to the clearly observed convergence of Data Science and Computational Science research methods into data- and compute-intensive science methods by enriching their programs with the necessary courses for a diverse audience of domain science researchers who need HPC skills to tackle societal, economic and scientific challenges. The goals of the training evaluation are to capture the impact of the training program and to provide insight not only into the personal progress of the attendees but also into how to support the attendees in implementing the learned methodologies/tools in their work. The trainees should be able to bridge the gap between the training classroom and their subsequent work/studies and to develop the ability to recognise the context for direct implementation or re-design of a methodology/tool solution. These skills are directly linked to the impact of training and should be perceived as a learning outcome. Usually such skills are developed by using appropriate teaching methods over time, which is possible in the context of a longer-term learning environment, e.g. a university degree. The Kirkpatrick model and the related practices suggested by the company behind it (http://www.kirkpatrickpartners.com/) support the on-the-job implementation of training by building an on-line continuation of a training event, which creates a context similar to that of longer courses and thus fosters the motivation to implement new skills by changing an established routine. That is time and resource consuming, and thus the challenge is to facilitate the needed support of on-the-job implementation with a minimal cost impact for the HPC centres.
Nia Alexandrov and Maria-Ribera Sancho
77 Teaching High Performance Computing at a US Regional university: curriculum, resources, student projects. [abstract]
Abstract: In this talk, we focus on the challenges and opportunities in the development of a comprehensive upper-undergraduate/first-year-graduate course in high performance scientific computing at a regional university in the United States. The details of this presentation are based on a new course given at Idaho State University during the 2016/2017 academic year. In the first part of the presentation, we focus on the curriculum, discussing the four major parts of the course: serial code optimization and OpenMP, MPI and CUDA programming. In this part of the presentation, we focus on the time requirements and the accessibility of the material to motivated math, science, and engineering students. In the next part of the talk, we focus on resources, and discuss the existing supercomputer educational projects available at the national level if a regional university does not have adequate resources to support a class on HPC. We describe our experience accessing supercomputer facilities in the United States through the support of national laboratories and federally supported projects at top US universities. (Mostly, we focus on our collaboration with the Idaho National Laboratory and the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.) In the final part of the presentation, we review the results of the student projects related to parallel implementations of the well-known Krylov-FFT solvers for large sparse algebraic systems appearing in different relevant applications.
Yury Gryazin
583 Data processing as a basic competence of engineering education [abstract]
Abstract: The need for data science specialists is growing very fast; nevertheless, the knowledge gap is mostly bridged with special study programs. Quite often data science skills are needed in addition to, or embedded in, the main study subject. This is especially the case in engineering study programs. In this position paper we discuss several possibilities to fill this gap.
Andreas Pester and Thomas Klinger
542 Towards Data Science Literacy [abstract]
Abstract: Promoting data science represents an increasingly important facet of general education. This paper describes the design and implementation of a course targeted at a non-technical audience and centered on data science literacy, with a focus on collecting, processing, analyzing and using data. The objective is through general education to prime students at an early stage of their college education for the changes in the data-driven society and to provide them with skills to harness the power of data. Our experience and evaluation results indicate that it is realistic for a diverse population of undergraduate students to acquire data science literacy and practical skills through a general education course.
Christo Dichev and Darina Dicheva
431 A Way How to Impart Data Science Skills to Computer Science Students Exemplified by OBDA-Systems Development [abstract]
Abstract: Nowadays, to explore and examine data from a variety of angles to tackle Big Data problems and to devise data-driven solutions to the most pressing challenges, it is necessary to build a multidisciplinary skill set for innovative methods not only in Masters in Data Science programs but in traditional Computer Science programs too. In the paper, we describe how teaching methods and tools, which are used to train students to develop Ontology-Based Data Access systems with a natural language interface to relational databases, help Master's Degree students in Computer Science to collaborate with data scientists in real-world interdisciplinary projects and prepare them for a data science career. We use ontology engineering in combination with Natural Language Processing methods based on lexico-syntactic patterns, in particular, to extract the needed data from structured, semi-structured and unstructured datasets in a uniform way to analyze real-world Russian social networks related to a new building area.
Svetlana Chuprina, Igor Postanogov and Taisiya Kostareva

Agent-based simulations, adaptive algorithms and solvers (ABS-AAS) Session 5

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG D 7.1

Chair: Kamil Piętak

472 Declarative Representation and Solution of Vehicle Routing with Pickup and Delivery Problem [abstract]
Abstract: Recently we have proposed a multi-agent system that provides an intelligent logistics brokerage service focusing on the transport activity for the efficient allocation of transport resources (vehicles or trucks) to the transport applications. The freight broker agent plays a major role in coordinating the transportation arrangements of transport customers (usually shippers and consignees) with transport resource providers or carriers, following the freight broker business model. We focus on the fundamental function of this business, which aims to find available trucks and to define their feasible routes for transporting requested customer loads. The main contribution of this paper is the formulation of our scheduling problem as a special type of vehicle routing with pickup and delivery problem. We propose a new set partitioning model of our specific problem. Vehicle routes are defined on the graph of cities, rather than on the graph of customer orders, as typically proposed by set partitioning formulations. This approach is particularly useful when a large number of customer orders sharing a significantly lower number of pickup and delivery points must be scheduled. Our achievement is the declarative representation and solution of the model using the state-of-the-art ECLiPSe constraint logic programming system.
Amelia Badica, Costin Badica, Florin Leon and Lucian Luncean
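For orientation, a generic set-partitioning formulation of such routing problems reads as follows; this is the textbook form, not necessarily the authors' city-graph variant. Here R is the set of feasible routes, O the set of customer orders, V the vehicles, c_r the cost of route r, and a_{or}, b_{vr} indicate whether route r serves order o or is driven by vehicle v:

```latex
\min_{x} \sum_{r \in R} c_r x_r
\quad \text{s.t.} \quad
\sum_{r \in R} a_{or}\, x_r = 1 \;\; \forall o \in O,
\qquad
\sum_{r \in R} b_{vr}\, x_r \le 1 \;\; \forall v \in V,
\qquad
x_r \in \{0,1\} \;\; \forall r \in R.
```

The modelling choice discussed in the abstract (routes over cities instead of over customer orders) changes how the set R and the coefficients a_{or} are generated, while the partitioning structure above stays the same.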
153 A multi-world agent-based model working at several spatial and temporal scales for simulating complex geographic systems [abstract]
Abstract: Interest in the modelling and simulation of complex systems with processes occurring at several spatial and temporal scales is increasing, particularly in biological, historical and geographic studies. In this multi-scale modelling study, we propose a generic model to account for processes operating at several scales. In this approach, a ‘world’ corresponds to a complete and self-sufficient submodel with its own places, agents, spatial resolution and temporal scale. Represented worlds can be nested: a world (with new scales) may have a greater level of detail than the model at the next level up, making it possible to study phenomena with greater precision. This process can be reiterated, to create many additional scales, with no formal limit. Worlds’ simulations can be triggered simultaneously or in cascade. Within a world, agents can choose destinations in other worlds, to which they can travel using routes and inter-world ‘gates’. Once they arrive in a destination world, the agents ‘fit’ the new scale. An agent in a given world can also perceive and interact with other agents, regardless of the world to which they belong, provided they are encompassed by its perception disc. We present an application of this model to the issue of the spread of black rats by means of commercial transportation in Senegal (West Africa).
Pape Adama Mboup, Karim Konaté and Jean Le Fur
466 Role of Behavioral Heterogeneity in Aggregate Financial Market Behavior: An Agent-Based Approach [abstract]
Abstract: In this paper, an agent-based model of a stock market is proposed to study the effects of the cognitive processes and behaviors of traders (e.g. decision-making, interpretation of public information and learning) on the emergent phenomena of financial markets. In financial markets, the psychology and sociology of traders play a critical role in giving rise to unique and unexpected (emergent) macroscopic properties. This study suggests that local interactions, rational and irrational decision-making approaches and heterogeneity, which has been incorporated into different aspects of agent design, are among the key elements in modeling financial markets. When the heterogeneity of the strategies used by the agents increases, volatility clustering and excess kurtosis arise in the model, which is in agreement with real market fluctuations. To evaluate the effectiveness and validity of the approach, a series of statistical analyses was conducted to test the artificial data with respect to a benchmark provided by the Bank of America (BAC) stock over a sufficiently long period of time. The results revealed that the model was able to reproduce and explain some of the most important stylized facts observed in actual financial time series and was consistent with empirical observations.
Yasaman Kamyab Hessary and Mirsad Hadzikadic
233 A case based reasoning based multi-agent system for the reactive container stacking in seaport terminals [abstract]
Abstract: With the continuous development of seaports, problems related to the storage of containers in terminals have emerged. Unfortunately, existing systems suffer from limitations related to distributed monitoring and control, the efficiency of real-time stacking strategies and their ability to handle dangerous containers. In this paper, we suggest a multi-agent architecture based on a set of knowledge models and learning mechanisms for disturbance management and reactive decision making. The suggested system is able to capture, store and reuse knowledge in order to detect disturbances and select the most appropriate container location by using a Case Based Reasoning (CBR) approach. The proposed system takes into account the storage of dangerous containers and combines Multi-Agent Systems (MAS) and case based reasoning to handle different types of containers.
Ines Rekik, Sabeur Elkosantini and Habib Chabchoub

Data-Driven Computational Sciences (DDCS) Session 2

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG D 7.2

Chair: Craig Douglas

242 Human Identification and Localization by Robots in Collaborative Environments [abstract]
Abstract: Environments in which mobile robots and humans must coexist tend to be quite dangerous to the humans. Many employers have resorted to separating the two groups, since the robots move quickly and do not maneuver around humans easily, resulting in human injuries. In this paper we provide a roadmap towards being able to integrate the two worker groups (humans and robots) to increase both efficiency and safety. Improved human-to-robot communication and collaboration has implications for multiple applications. For example: (1) Robots that manage all aspects of dispensing items (e.g., drugs in pharmacies or supplies and tools in a remote workplace), reducing human errors. (2) Robots capable of operating in dangerous locations that triage injured subjects using remote sensing of vital signs. (3) 'Smart' crash carts that move themselves to a required location in a hospital or in the field, help dispense drugs and tools, save time and money, and prevent accidents.
Craig C. Douglas and Robert A. Lodder
257 Data-driven design of an Ebola therapeutic [abstract]
Abstract: Data-driven computational science has found many applications in drug design. Molecular data are commonly used to design new drug molecules. Engineering process simulations guide the development of the Chemistry, Manufacturing, and Controls (CMC) section of Investigational New Drug (IND) applications filed at the FDA. Computer simulations can also guide the design of human clinical trials. Formulation is very important in drug delivery. The wrong formulation can render a drug product useless. The amount of preclinical (animal and in vitro) work that must be done before a new drug candidate can be tested in humans can be a problem. The cost of these cGxP studies is typically $3-$5 million. If the wrong drug product formulation is tested, new iterations of the formulation must be tested with additional costs of $3 to $5 million each. Data-driven computational science can help reduce this cost. In the absence of existing human exposure, a battery of tests involving acute and chronic toxicology, cardiovascular, central nervous system, and respiratory safety pharmacology must be performed in at least two species before the FDA will permit testing in humans. However, for many drugs (such as those beginning with natural products) there is a history of human exposure. In these cases, computer modeling of a population to determine human exposure may be adequate to permit phase 1 studies with a candidate formulation in humans. The CDC’s National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations. The NHANES database can be mined to determine the average and 90th percentile exposures to a food additive, and early human formulation testing can be conducted at levels beneath those to which the US population is ordinarily exposed through food. These data can be combined with data mined from international chemical shipments to validate an exposure model. This paper describes the data-driven formulation testing process using a new candidate Ebola treatment that, unlike vaccines, can be used after a person has contracted the disease. This drug candidate’s mechanism of action permits it to be potentially used against all strains of the virus, a characteristic that vaccines might not share.
Robert Lodder
383 Transforming a Local Medical Image Analysis for Running on a Hadoop Cluster [abstract]
Abstract: There is a progressive digitization in many medical fields, such as digital microscopy, which leads to an increase in data volume and processing demands on the underlying computing infrastructure. This paper explores the scaling behaviour of a Ki-67 analysis application, which processes medical image tiles originating from a WSI (Whole Slide Image) file format. Furthermore, it describes how the software is transformed from a Windows PC to a distributed Linux cluster environment. A test for platform independence revealed a non-deterministic behaviour of the application, which has been fixed successfully. The speedup of the application is determined. The slope of the increase is quite close to 1, i.e. there is almost no loss due to parallelization overhead. Beyond the cluster's hardware limit (72 cores, 144 threads, 216 GB RAM) the speedup saturates at a value around 64. This is a strong improvement over the original software, whose speedup is limited to two.
Marco Strutz, Hermann Heßling and Achim Streit
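A reported speedup plateau can be read through Amdahl's law (a simplifying assumption; contention and thread oversubscription beyond the physical core count can also cap the speedup). With parallel fraction f and p workers,

```latex
S(p) = \frac{1}{(1 - f) + f/p}, \qquad \lim_{p \to \infty} S(p) = \frac{1}{1 - f},
```

so a saturation near 64 would correspond, under that reading, to a serial fraction of roughly 1 - f ≈ 1/64 ≈ 1.6%.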
208 Decentralized Dynamic Data-Driven Monitoring of Dispersion Processes on Partitioned Domains [abstract]
Abstract: The application of mobile sensor-carrying vehicles for the online estimation of dynamic dispersion processes is extremely beneficial. Based on current estimates that rely on past measurements and forecasts obtained from a discretized PDE model, the movement of the vehicles can be adapted, resulting in measurements at more informative locations. In this work, a novel decentralized monitoring approach based on a partitioning of the spatial domain into several subdomains is proposed. Each sensor is assigned to the subdomain it is located in and is only required to maintain a process and multi-vehicle model related to its subdomain. In this way, the vast communication requirements of related centralized approaches and costly full model simulations are avoided, making the presented approach more scalable with respect to a larger number of sensor-carrying vehicles and a larger problem domain. The approach consists of a new prediction and update method based on a domain decomposition method and a partitioned variant of the Ensemble Square Root Filter that requires only a minimal exchange of data between sensors on neighboring subdomains. Furthermore, a cooperative vehicle controller is applied in such a way that a dynamic adaptation of the sensor distribution becomes possible.
Tobias Ritter, Stefan Ulbrich and Oskar von Stryk
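For readers unfamiliar with ensemble filtering, the sketch below shows a generic perturbed-observation ensemble Kalman analysis step in NumPy, just to fix notation; the paper's method is a partitioned Ensemble Square Root Filter operating per subdomain, which is not reproduced here, and all sizes in the toy example are arbitrary.

```python
# Generic (non-partitioned) perturbed-observation ensemble Kalman analysis step.
import numpy as np

def enkf_update(X, y, H, R, rng):
    """X: (n, N) state ensemble, y: (m,) observation, H: (m, n) operator, R: (m, m) obs covariance."""
    n, N = X.shape
    A = X - X.mean(axis=1, keepdims=True)        # ensemble anomalies
    HA = H @ A
    P_HT = A @ HA.T / (N - 1)                    # sample estimate of P H^T
    S = HA @ HA.T / (N - 1) + R                  # innovation covariance
    K = P_HT @ np.linalg.inv(S)                  # Kalman gain
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T
    return X + K @ (Y - H @ X)                   # analysis ensemble

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 20))                     # toy 4-state, 20-member ensemble
H = np.array([[1., 0., 0., 0.]])                 # observe the first state component
X_a = enkf_update(X, np.array([0.5]), H, np.array([[0.1]]), rng)
print(X_a.shape)                                 # (4, 20)
```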
265 A Framework for Direct and Transparent Data Exchange of Filter-stream Applications in Multi-GPUs Architectures [abstract]
Abstract: Massive data generation has been pushing for significant advances in computing architectures, reflected in heterogeneous architectures composed of different types of processing units. The filter-stream paradigm is typically used to exploit the parallel processing power of these new architectures. The efficiency of applications in this paradigm is achieved by exploiting a set of interconnected computers (a cluster) using filters and communication between them in a coordinated way. In this work we propose, implement and test a generic abstraction for direct and transparent data exchange of filter-stream applications in heterogeneous clusters with multi-GPU (Graphics Processing Units) architectures. This abstraction allows hiding from the programmers all the low-level implementation details related to GPU communication and the control related to the location of filters. Further, we consolidate such abstraction into a framework. Empirical assessments using a real application show that the proposed abstraction layer eases the implementation of filter-stream applications without compromising the overall application performance.
Leonardo Rocha, Gabriel Ramons, Guilherme Andrade, Rafael Sachetto, Daniel Madeira, Renan Carvalho, Renato Ferreira and Fernando Mourão

Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 2

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG D 3.2

Chair: Xing Cai

590 Fast Finite Element Analysis Method Using Multiple GPUs for Crustal Deformation and its Application to Stochastic Inversion Analysis with Geometry Uncertainty [abstract]
Abstract: Crustal deformation computation using 3-D high-fidelity models has been in heavy demand due to the accumulation of observational data. This approach is computationally expensive and more than 100,000 repetitive computations are required for various applications including Monte Carlo simulation, stochastic inverse analysis, and optimization. To handle the massive computation cost, we develop a fast Finite Element (FE) analysis method using multiple GPUs for crustal deformation. We use algorithms appropriate for GPUs and accelerate calculations such as the sparse matrix-vector product. By reducing the computation time, we are able to conduct multiple crustal deformation computations in a feasible timeframe. As an application example, we conduct a stochastic inverse analysis considering uncertainties in geometry and estimate the coseismic slip distribution of the 2011 Tohoku Earthquake, by performing 360,000 crustal deformation computations for different 80,000,000-DOF FE models using the proposed method.
Takuma Yamaguchi, Kohei Fujita, Tsuyoshi Ichimura, Takane Hori, Muneo Hori and Lalith Wijerathne
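The sparse matrix-vector product mentioned above is the classic kernel to accelerate; a plain CPU reference in compressed sparse row (CSR) format looks like the sketch below, where in a GPU version each row would typically map to a thread or warp. This is a generic illustration, not the authors' multi-GPU implementation.

```python
# Reference CSR sparse matrix-vector product y = A @ x.
import numpy as np

def csr_spmv(data, indices, indptr, x):
    """data/indices/indptr are the standard CSR arrays of matrix A."""
    n = len(indptr) - 1
    y = np.zeros(n)
    for i in range(n):                       # one row per (GPU) thread in practice
        start, end = indptr[i], indptr[i+1]
        y[i] = np.dot(data[start:end], x[indices[start:end]])
    return y

# toy 3x3 matrix [[4,0,1],[0,3,0],[2,0,5]]
data    = np.array([4., 1., 3., 2., 5.])
indices = np.array([0, 2, 1, 0, 2])
indptr  = np.array([0, 2, 3, 5])
print(csr_spmv(data, indices, indptr, np.array([1., 1., 1.])))   # [5. 3. 7.]
```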
599 Optimizing domain decomposition in an ocean model: the case of NEMO [abstract]
Abstract: Earth System Models are critical tools for the study of our climate and its future trends. These models are in constant evolution and their growing complexity entails an increasing demand for the resources they require. Since the cost of using these state-of-the-art models is huge, looking closely at the factors that can impact their computational performance is mandatory. In the case of the state-of-the-art ocean model NEMO (Nucleus for European Modelling of the Ocean), used in many projects around the world, not enough attention has been given to the domain decomposition. In this work we show the impact that the selection of a particular domain decomposition can have on computational performance and how the proposed methodology substantially improves it.
Oriol Tintó Prims, Mario Acosta, Miguel Castrillo, Ana Cortés, Alícia Sanchez, Kim Serradell and Francisco J. Doblas-Reyes
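A toy way to see why the decomposition choice matters is to enumerate, for a fixed number of MPI tasks, the factorisations px × py of the task count and compare the halo (perimeter) cells each subdomain must exchange, as in the sketch below. The grid size and task count are illustrative assumptions, and this is not NEMO's actual decomposition logic.

```python
# Compare 2-D domain decompositions by per-subdomain halo size (1-cell-wide halo).
import math

def decompositions(nprocs, nx, ny):
    out = []
    for px in range(1, nprocs + 1):
        if nprocs % px:
            continue
        py = nprocs // px
        sx, sy = math.ceil(nx / px), math.ceil(ny / py)   # local subdomain size
        halo = 2 * (sx + sy)                              # cells exchanged per step
        out.append((px, py, sx, sy, halo))
    return sorted(out, key=lambda t: t[-1])

# illustrative ocean-like grid and task count (assumed values)
for px, py, sx, sy, halo in decompositions(96, 362, 292):
    print(f"{px:3d} x {py:3d}  local {sx:4d} x {sy:4d}  halo cells {halo}")
```

The best and worst factorisations can differ by a large factor in halo volume, which is one reason the paper's attention to decomposition pays off.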
154 Data Management and Volcano Plume Simulation with Parallel SPH Method and Dynamic Halo Domains [abstract]
Abstract: This paper presents data management and implementation strategies for the smoothed particle hydrodynamics (SPH) method to simulate volcano plumes. These simulations require a careful definition of the domain of interest and of the multi-phase material involved in the flow, both of which change over time and involve transport over vast distances in a short time. Computational strategies are developed to overcome these challenges by building mechanisms for efficient creation and deletion of particles, parallel processing (using the Message Passing Interface (MPI)) and a dynamically defined halo domain (a domain that "optimally" captures all the material involved in the flow). A background grid is adopted to reduce neighbor-search costs and to decompose the domain. A Space Filling Curve (SFC) based ordering is used to assign unique identifiers to background grid entities and particles. Time-dependent SFC-based indices are assigned to particles to guarantee uniqueness of the identifier. Both particles and background grids are managed by hash tables, which ensure quick and flexible access. An SFC-based three-dimensional (3D) domain decomposition and a dynamic load balancing strategy are implemented to ensure good load balance. Several strategies are developed to improve performance: dynamic halo domains, calibrated particle weight and optimized workload check intervals. Numerical tests show that our code has good scalability and performance. The strategies described in this paper can be further applied to many other implementations of mesh-free methods, especially those that require flexibility in adding and deleting particles.
Zhixuan Cao, Abani Patra and Matthew Jones
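As a concrete example of an SFC-based identifier, the sketch below computes a 3-D Morton (Z-order) key by bit interleaving; the 21-bit-per-axis layout and the axis ordering are assumptions for illustration, not necessarily the key layout used in the paper.

```python
# 3-D Morton (Z-order) key via bit interleaving.
def part1by2(n):
    """Spread the low 21 bits of n so that two zero bits separate consecutive bits."""
    n &= 0x1FFFFF
    n = (n | (n << 32)) & 0x1F00000000FFFF
    n = (n | (n << 16)) & 0x1F0000FF0000FF
    n = (n | (n << 8))  & 0x100F00F00F00F00F
    n = (n | (n << 4))  & 0x10C30C30C30C30C3
    n = (n | (n << 2))  & 0x1249249249249249
    return n

def morton3d(i, j, k):
    """Interleave the bits of grid indices (i, j, k) into a single Z-order key."""
    return part1by2(i) | (part1by2(j) << 1) | (part1by2(k) << 2)

assert morton3d(0, 0, 0) == 0
assert morton3d(1, 0, 0) == 0b001
assert morton3d(0, 1, 0) == 0b010
assert morton3d(1, 1, 1) == 0b111
print(hex(morton3d(1023, 511, 255)))
```

Sorting grid blocks or particles by such keys keeps spatially close entities close in memory and yields the contiguous index ranges used for domain decomposition and load balancing.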

Large Scale Computational Physics (LSCP) Session 1

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG D 5.2

Chair: Fukuko Yuasa

-6 Workshop on Large-Scale Computational Physics - LSCP 2017 [abstract]
Abstract: [No abstract available]
Elise de Doncker and Fukuko Yuasa
272 Solution of Few-Body Coulomb Problems with Latent Matrices on Multicore Processors [abstract]
Abstract: We re-formulate a classical numerical method for the solution of systems of linear equations to tackle problems with latent data, that is, linear systems whose dimension is a priori unknown. This type of system appears in the solution of few-body Coulomb problems in atomic simulation physics, in the form of multidimensional partial differential equations (PDEs) that require the numerical solution of a sequence of recurrent dense linear systems of growing scale. The large dimension of these systems, with up to several hundred thousand unknowns, is tackled in our approach via a task-parallel implementation of a solver based on the QR factorization. This method is parallelized using the OmpSs framework, showing fair strong and weak scalability on a multicore processor equipped with 12 Intel cores.
Luis Biedma, Flavio Colavecchia and Enrique S. Quintana-Orti
340 A Global Network for Non-Collective Communication in Autonomous Systems [abstract]
Abstract: Large-scale simulation enables realistic 3D reproductions of micro-structure evolution in many problems of computational material science [1]. With an increasing number of processing units, global communications become a bottleneck and limit the scalability. Therefore, NAStJA decomposes the simulated domain into small blocks and distributes those blocks over the processing units. Interacting processing units build a local neighborhood and act autonomously in this neighborhood. This limits the number of connections for each processing unit and therefore the local communication overhead, and leads to high scalability. Apart from the communication between local neighborhoods, a global information exchange is required. We explain the conditions and requirements for this exchange and present the benefits of a multidimensional Manhattan street network [2-4]. It is simple but sufficiently fast for a global information exchange if the information is not time critical, i.e. the exchange has to be global only after several time steps. This global network satisfies the requirements for a global block management that connects the autonomous processes. Because of its super-linear scaling the approach is very useful for massively parallel simulations. The block distribution scales in a linear manner, and the communication overhead of the global block management can be neglected, such that small blocks benefit from cache effects and result in super-linear scaling, i.e. efficiency higher than unity. The global information exchange is based on a multi-hop exchange, where each message is sent to the direct neighbors and then spread to the whole network in a specified number of hops. Between these hops the computation goes on, so that the global exchange overlaps with the computation. The number of hops must be small enough not to influence the simulated physics. NAStJA supports regular grids with a calculation stencil sweeping through the simulated domain. In computational material science many problems can be described using phase-field methods or cellular automata, both based on a regular grid. This is a task well suited to parallel programming. However, many of these problems require calculations only in small regions of the simulated domain. This is why NAStJA allocates and distributes only those blocks that contain such a computing region. As the computing regions move in the simulated domain throughout the simulation, the corresponding blocks are created or deleted autonomously by the processes in the local neighborhood. The overhead for the local neighborhood communication is acceptable compared to the allocation of unneeded blocks. The current implementation of NAStJA is heavily under development; however, it is already being employed for a phase-field method specifically for droplets [5], a phase-field crystal model [6, 7] and for the Potts model, a cellular automaton for biological cells [8]. It can be easily extended with a wide range of algorithms that work on finite difference schemes or other regular grid methods. These techniques allow advancing to previously unfeasible, extremely large-scale simulations. Especially for phase-field simulations, the computing region is only a small part of the simulated domain; here the calculation occurs only in the interface region between the phases.
As an illustration, the morphology of a water droplet on a structured surface simulated with the phase-field method has a small computing region, which is the interface region between the water and the surrounding gas. The simulated quantities are constant inside and outside of the droplet. In phase-field simulations the width of the interface is chosen as about 10 cells. Using a regular grid, the mandatory resolution of the finest structure defines the scale and thus the total number of cells in the simulation domain. For a 1 µl droplet and a structure size of 20 nm with a resolution of at least twice the interface width, this results in a simulation domain of > 10^12 cells. This is too large for a traditional phase-field code that allocates the whole simulated domain and results in an intractable computational task. The presented techniques from NAStJA address these issues and improve the feasibility of large-scale simulation. We show measurements and theoretical calculations for the Manhattan street network compared to a global collective communication. As an example application we present the phase-field method. [1] Martin Bauer, Johannes Hötzer, Marcus Jainta, Philipp Steinmetz, Marco Berghoff, Florian Schornbaum, Christian Godenschwager, Harald Köstler, Britta Nestler, and Ulrich Rüde. Massively parallel phase-field simulations for ternary eutectic directional solidification. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, page 8. ACM, 2015. [2] Bhumip Khasnabish. Topological properties of Manhattan street networks. Electronics Letters, 25(20):1388–1389, 1989. [3] Tein-Yaw Chung and Dharma P. Agrawal. Design and analysis of multidimensional Manhattan street networks. IEEE Transactions on Communications, 41(2):295–298, 1993. [4] Francesc Comellas, Cristina Dalfó, and Miguel Angel Fiol. Multidimensional Manhattan street networks. SIAM Journal on Discrete Mathematics, 22(4):1428–1447, 2008. [5] Marouen Ben Said, Michael Selzer, Britta Nestler, Daniel Braun, Christian Greiner, and Harald Garcke. A phase-field approach for wetting phenomena of multiphase droplets on solid surfaces. Langmuir, 30(14):4033–4039, 2014. [6] Ken R. Elder, Nikolas Provatas, Joel Berry, Peter Stefanovic, and Martin Grant. Phase-field crystal modeling and classical density functional theory of freezing. Physical Review B, 75(6):064107, 2007. [7] Marco Berghoff and Britta Nestler. Phase field crystal modeling of ternary solidification microstructures. Computational Condensed Matter, 4:46–58, 2015. [8] François Graner and James A. Glazier. Simulation of biological cell sorting using a two-dimensional extended Potts model. Physical Review Letters, 69(13):2013, 1992.
Marco Berghoff and Ivan Kondov
419 Parallel Acoustic Field Simulation with Respect to Scattering of Sound on Local Inhomogeneities [abstract]
Abstract: The report presents a developed approach to the simulation of acoustic fields in enclosed media. The method is based on the use of Rayleigh's integral for the calculation of secondary sources generated by a wave falling onto the media boundaries. The implemented algorithm is highly parallelizable: it implies loosely coupled parallel branches with only a few points of inter-thread communication. On the other hand, the algorithm is exponential in the average number of reflections which occur to a single wave element emitted by a primary source, although for practical applications this number can be reduced enough to provide accurate results with reasonable time and space consumption. The proposed algorithm is based on the approximate superposition of acoustic fields and provides adequate results as long as the used equations of acoustics are linear. To calculate the scattering properties of reflecting boundaries, the algorithm represents a geometric model of the sound propagation medium as a set of small flat vibrating pistons. Each wave element falling onto such a piston makes it radiate reflected sound in all directions, which makes it possible to construct an algorithm that accepts sets of sources and reflecting surfaces. It also yields a field distribution over specified points such that each source, primary or secondary, can be associated with an element of parallel execution and be managed via a list of polymorphic sources implementing a task list. The report covers a mathematical formulation of the problem, defines an object model used to implement the algorithm, and provides some analysis of the algorithm in sequential and parallel forms.
Andrey Chusov, Lubov Statsenko, Alexsey Lysenko, Sergey Kuligin, Nina Cherkassova, Petr Unru and Maya Bernavskaya
508 Large-Scale Simulation of Cloud Cavitation Collapse [abstract]
Abstract: We present a high performance computing framework for the large-scale simulation of compressible multicomponent flows, applied to cloud cavitation collapse. The governing equations are discretized by a Godunov-type finite volume method on a uniform structured grid. The bubble interface is captured by a diffuse interface method and treated as a mixing region of the liquid and gas phases. The framework is based on our Cubism library, which enables the efficient treatment of high-order compact stencil schemes, can harness the capabilities of massively parallel computer architectures and allows for processing up to 10^13 computational cells. We present validations of our approach on several classical benchmark examples and study the collapse of a cloud of O(10^3) bubbles.
Ursula Rasthofer, Fabian Wermelinger, Panagiotis Hadjidoukas and Petros Koumoutsakos
597 Feynman loop numerical integral expansions for 3-loop vertex diagrams [abstract]
Abstract: We address 3-loop vertex Feynman diagrams with massless internal lines, which may exhibit UV singularities. The computational methods target automatic numerical integration and extrapolation to approximate the leading coefficients of the integral expansion with respect to the dimensional regularization parameter. Convergence acceleration is achieved using linear extrapolation. Multivariate integration is performed with the ParInt software package, layered over MPI (Message Passing Interface) to speed up the computations. Integrand transformations relieve the effect of singular behavior in the integrand.
Elise de Doncker and Fukuko Yuasa
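The linear-extrapolation idea can be illustrated as follows: sample the regulated quantity at a decreasing sequence of values of the regularization parameter and solve a small linear system for the leading expansion coefficients. The toy integrand with known coefficients below is invented for illustration; the actual ParInt/MPI integration pipeline is not reproduced.

```python
# Recover leading expansion coefficients C_k from samples f(eps) ~ sum_k C_k eps^k.
import numpy as np

def leading_coefficients(f, eps_seq, powers):
    """Fit f(eps) to sum_k C_k eps^powers[k] using samples at eps_seq (len >= len(powers))."""
    A = np.array([[e**p for p in powers] for e in eps_seq])
    samples = np.array([f(e) for e in eps_seq])
    C, *_ = np.linalg.lstsq(A, samples, rcond=None)
    return C

# toy "integral" with a double pole: 2/eps^2 - 3/eps + 5 + 0.7*eps
f = lambda e: 2/e**2 - 3/e + 5 + 0.7*e
eps_seq = [2.0**(-l) for l in range(1, 7)]          # geometric sequence of eps values
print(leading_coefficients(f, eps_seq, powers=[-2, -1, 0, 1]))
# approximately [ 2. -3.  5.  0.7]
```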

Solving Problems with Uncertainties (SPU) Session 2

Time and Date: 16:20 - 18:00 on 13th June 2017

Room: HG F 33.1

Chair: Vassil Alexandrov

406 Recommendation of Short-Term Activity Sequences During Distributed Events [abstract]
Abstract: The amount of social events has increased significantly and location-based services have become an integral part of our life. This makes the recommendation of activity sequences an important emerging application. Recently, the notion of a distributed event (e.g., a music festival or cruise) that gathers multiple competitive activities has appeared in the literature. An attendee of such events is overwhelmed by numerous possible activities and faces the problem of activity selection with the goal of maximising the satisfaction of the experience. This selection is subject to various uncertainties. In this paper, we formulate the problem of recommendation of activity sequences as a combination of personalised event recommendation and a scheduling problem. We present a novel integrated framework to solve it and two computation strategies to analyse users' categorical, temporal and textual interests. We mine the users' historical traces to extract their behavioural patterns and use them in the construction of the itinerary. The evaluation of our approach on a dataset built over a cruise program shows an average improvement of 10.4% over the state-of-the-art.
Diana Nurbakova, Léa Laporte, Sylvie Calabretto and Jérôme Gensel
396 Optimal pricing model based on reduction dimension: A case of study for convenience stores [abstract]
Abstract: Pricing is one of the most vital and highly demanded components of the marketing mix, along with Product, Place and Promotion. An organization can adopt a number of pricing strategies, typically based on corporate objectives. This paper proposes a methodology to define an optimal pricing strategy for convenience stores based on dimension reduction methods and the uncertainty of the data. The solution approach involves a multiple linear regression as well as a linear programming optimization model with several variables to consider. A strategy to select a set of important variables among a large number of predictors, using a mix of PCA and best-subset methods, is presented. A linear optimization model is then solved using uncertain data and diverse business rules. To show the value of the proposed methodology, the computed optimal prices are compared with previous results obtained in a pilot performed for selected stores. This strategy provides an alternative solution that allows the decision maker to include the business rules of their particular environment in order to define a pricing strategy that meets the business goals.
Laura Hervert-Escobar, Oscar Alejandro Esquivel-Flores and Raul Valente Ramirez-Velarde
388 Identification of Quasi-Stationary Dynamic Objects with the Use of Derivative Disproportion Functions [abstract]
Abstract: This paper presents an algorithm for designing a cryptographic system, in which the derivative disproportion functions (key functions) are used. This cryptographic system is used for an operative identification of a differential equation describing the movement of quasi-stationary objects. The symbols to be transmitted are encrypted by the sum of at least two of these functions combined with random coefficients. A new algorithm is proposed for decoding the received messages making use of important properties of the derivative disproportion functions. Numerical experiments are reported to demonstrate the algorithm’s reliability and robustness.
Vyacheslav V. Kalashnikov, Viktor V. Avramenko, Nataliya I. Kalashnykova and Nikolay Yu. Slipushko
369 Symbol and Bit Error Probability for Coded TQAM in AWGN Channel [abstract]
Abstract: The performance of a coded modulation scheme based on the application of integer codes to a TQAM constellation with $2^{2m}$ points is investigated. A method of calculating the exact value of the SER in the case of TQAM over an AWGN channel combined with encoding by integer codes is described. The results (SER and BER) of simulations for coded 16-, 64-, and 256-TQAM are given.
Hristo Kostadinov and Nikolai Manev
462 A comparative study of evolutionary statistical methods for uncertainty reduction in forest fire propagation prediction [abstract]
Abstract: Predicting the propagation of forest fires is crucial for mitigating their effects. Therefore, several computational tools or simulators have been developed to predict fire propagation. Such tools consider the scenario (topography, vegetation types, fire front situation) and the particular conditions under which the fire is evolving (vegetation conditions, meteorological conditions) to predict the fire propagation. However, these parameters are usually difficult to measure or estimate precisely, and there is a high degree of uncertainty in many of them. This uncertainty causes a certain lack of accuracy in the predictions, with the consequent risks. So, it is necessary to apply methods to reduce the uncertainty in the input parameters. This work presents a comparison of ESSIM-EA and ESSIM-DE, two methods to reduce the uncertainty in the input parameters. These methods combine Evolutionary Algorithms, Parallelism and Statistical Analysis to improve the propagation prediction.
María Laura Tardivo, Paola Caymes-Scutari, Germán Bianchini, Miguel Méndez-Garabetti, Andrés Cencerrado and Ana Cortés