Session 5: 16:20 - 18:00 on 11th June 2014

Main Track (MT) Session 5

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Kuranda

Chair: M. Wagner

288 OS Support for Load Scheduling in Accelerator-based Heterogeneous Systems [abstract]
Abstract: Accelerators are becoming widespread in the field of heterogeneous processing, performing computation tasks across a wide range of applications. With the variety of computing architectures available today, the need for a system-wide multitasking environment is increasing. Therefore, we present an OpenCL-based scheduler that is designed as a multi-user computing environment to exploit the full potential of available resources while running as a daemon. Multiple tasks can be issued by means of a C++ API that relies on the OpenCL C++ wrapper. At this point, the daemon immediately takes over control and performs load scheduling. Due to its implementation, our approach can easily be applied to a common OS. We validate our method through extensive experiments deploying a set of applications, which show that the low scheduling costs remain constant in total over a wide range of input sizes. Besides different CPUs, a variety of modern GPU and other accelerator architectures are used in the experiments.
Ayman Tarakji, Niels Ole Salscheider, David Hebbeker
369 Efficient Global Element Indexing for Parallel Adaptive Flow Solvers [abstract]
Abstract: Many grid-based solvers for partial differential equations (PDE) assemble matrices explicitly for discretizing the underlying PDE operators and/or for the underlying (non-)linear systems of equations. Often, the data structures or solver packages require a consecutive global numbering of the degrees of freedom across the boundaries of different parallel subdomains. Straightforward approaches to realize this global indexing in parallel frequently result in serial parts of the assembly algorithms, which cause a considerable bottleneck, in particular in large-scale applications. We present an efficient way to set up such a global numbering scheme for large configurations via position-based numbering performed locally on all parallel processes. The global number of shared nodes is determined via a tree-based communication pattern. We verified our implementation via state-of-the-art benchmark scenarios for incompressible flow simulations. A small performance study shows the parallel capability of our approach. The corresponding results can be generalized to other grid-based solvers that require global indexing in the context of large-scale parallelization.
Michael Lieb, Tobias Neckel, Hans-Joachim Bungartz, Thomas Schöps
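As a rough illustration of how a consecutive global numbering can be derived from local counts, the following mpi4py sketch uses an exclusive prefix sum over per-process degree-of-freedom counts. It is an editorial example, not the authors' implementation; the counts and variable names are hypothetical.

```python
# Editorial sketch: consecutive global numbering of locally owned DOFs via an
# exclusive scan over per-process counts (mpi4py); not the paper's code.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Hypothetical input: number of DOFs owned by this process (shared nodes are
# assumed to be counted only once, by their owning process).
n_owned = 1000 + 10 * rank

# Exclusive scan gives the number of DOFs owned by all lower-ranked processes,
# i.e. the first global index of this process; mpi4py returns None on rank 0.
offset = comm.exscan(n_owned)
if offset is None:
    offset = 0

global_ids = np.arange(offset, offset + n_owned, dtype=np.int64)

# Ghost/shared copies would then obtain their global indices from the owning
# process, e.g. via point-to-point communication along subdomain interfaces.
print(f"rank {rank}: global indices {offset}..{offset + n_owned - 1}")
```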
382 Performance Improvements for a Large-Scale Geological Simulation [abstract]
Abstract: Geological models have been successfully used to identify and study geothermal energy resources. Many computer simulations based on these models are data-intensive applications. Large-scale geological simulations require high performance computing (HPC) techniques to run within reasonable time constraints and performance levels. One research area that can benefit greatly from HPC techniques is the modeling of heat flow beneath the Earth's surface. This paper describes the application of HPC techniques to increase the scale of research with a well-established geological model. Recently, a serial C++ application based on this geological model was ported to a parallel HPC application using MPI. A major focus was to increase the performance of the MPI version to enable state- or regional-scale simulations using large numbers of processors. First, synchronous communication among MPI processes was replaced by overlapping communication and computation (asynchronous communication). Asynchronous communication improved performance over synchronous communication by an average of 28% using 56 cores in one environment and 46% using 56 cores in another. Second, a load balancing approach that repartitions the data at the start of the program resulted in runtime improvements of 32% using 48 cores in the first environment and 14% using 24 cores in the second when compared to the asynchronous version. An additional feature, modeling of erosion, was also added to the MPI code base. The performance improvement techniques were less effective when erosion modeling was enabled.
David Apostal, Kyle Foerster, Travis Desell, Will Gosnold
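The overlap of communication and computation mentioned in the abstract is typically realized with non-blocking MPI calls. The following self-contained mpi4py sketch shows the general pattern only; it is not taken from the paper's code base, and the array sizes and the placeholder update rule are invented.

```python
# Minimal sketch of overlapping a halo exchange with computation using
# non-blocking MPI calls (mpi4py); sizes and update rule are placeholders.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
left, right = (rank - 1) % size, (rank + 1) % size

field = np.random.rand(1000, 1000)        # local block of the simulation grid
send_r = np.ascontiguousarray(field[-1])  # boundary row for the right neighbour
send_l = np.ascontiguousarray(field[0])   # boundary row for the left neighbour
recv_r = np.empty_like(send_r)
recv_l = np.empty_like(send_l)

# Post non-blocking sends/receives for the halo rows ...
reqs = [comm.Isend(send_r, dest=right, tag=0),
        comm.Isend(send_l, dest=left, tag=1),
        comm.Irecv(recv_l, source=left, tag=0),
        comm.Irecv(recv_r, source=right, tag=1)]

# ... and do interior work (placeholder update) while messages are in flight.
interior = field[1:-1, :]
interior += 0.01 * (np.roll(interior, 1, 0) + np.roll(interior, -1, 0) - 2 * interior)

MPI.Request.Waitall(reqs)   # halos arrived; boundary rows can now be updated
```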
168 Lattice Gas Model for Budding Yeast: A New Approach for Density Effects [abstract]
Abstract: Yeasts in culture media grow exponentially in the early period but eventually stop growing. This saturation of population growth is due to the "density effect". The budding yeast, Saccharomyces cerevisiae, is known to exhibit age-dependent cell division. A daughter cell, which has not yet given birth, has a longer generation time than its mother because the daughter needs a maturing period. So far, investigations of the exponential growth period have accumulated intensively; very little is known about the stage dependence of the density effect. Here we present an "in vivo" study of the density effect, applying a lattice gas model to explore the age-structure dynamics. It is, however, hard to solve the basic equations, because they have an infinite number of variables and parameters. We therefore construct several simplified models with few variables and parameters from the basic equations. These simplified models are compared with experimental data to report two findings for the stage-dependent density effect: 1) a paradox of declining birthrate (PDB), and 2) mass suicide. These events occur suddenly and temporarily at an early stage of the density effect. The mother-daughter model leads to the PDB: when the birthrate of the population decreases, the fraction of daughters abruptly increases. Moreover, we find that the average age of the yeast population suddenly decreases at the inflection point, which indicates mass apoptosis of aged mothers. Our results imply the existence of several types of "pheromones" that specifically inhibit population growth.
Kei-Ichi Tainaka, Takashi Ushimaru, Toshiyuki Hagiwara, Jin Yoshimura
185 Characteristics of displacement data due to time scale for the combination of Brownian motion with intermittent adsorption [abstract]
Abstract: Single-molecule tracking data near solid surfaces contains information on diffusion that is potentially affected by adsorption. However, molecular adsorption can occur in an intermittent manner, and the overall phenomenon is regarded as slower yet normal diffusion if the time scale of each adsorption event is sufficiently shorter than the interval of data acquisition. We compare simple numerical model systems that vary in the time scale of adsorption events while sharing the same diffusion coefficient, and show that the shape of the displacement distribution depends on the time resolution. We also evaluate the characteristics through statistical quantities related to the large deviation principle.
Itsuo Hanasaki, Satoshi Uehara, Satoyuki Kawano
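A toy model in the spirit of the abstract above, written for illustration only (all parameters are invented): a particle alternates between free Brownian motion and immobile adsorbed intervals, and the displacement statistics are inspected at two different acquisition intervals.

```python
# Toy model (editorial illustration, not the authors' code): Brownian motion
# with intermittent adsorption; the apparent displacement distribution depends
# on the sampling interval even though the long-time diffusion is the same.
import numpy as np

rng = np.random.default_rng(0)
dt, n_steps = 1e-3, 100_000          # fine simulation time step and length
D = 1.0                              # free diffusion coefficient (arbitrary units)
p_ads, p_des = 0.02, 0.05            # adsorption / desorption probabilities per step

x = np.zeros(n_steps)
adsorbed = False
for i in range(1, n_steps):
    if adsorbed:
        x[i] = x[i - 1]              # immobile while adsorbed
        adsorbed = rng.random() >= p_des
    else:
        x[i] = x[i - 1] + rng.normal(0.0, np.sqrt(2 * D * dt))
        adsorbed = rng.random() < p_ads

# Displacement statistics at two data-acquisition intervals (in fine steps).
for lag in (10, 1000):
    dx = x[lag:] - x[:-lag]
    print(f"lag {lag * dt:g}: var={dx.var():.4g}, kurtosis~{(dx**4).mean() / dx.var()**2:.2f}")
```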

Main Track (MT) Session 12

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Tully I

Chair: Luiz DeRose

187 The K computer Operations: Experiences and Statistics [abstract]
Abstract: The K computer, released on September 29, 2012, is a large-scale parallel supercomputer system consisting of 82,944 compute nodes. We have been able to resolve a significant number of operational issues since its release. Some system software components have been fixed and improved to obtain higher stability and utilization. We achieved 94% service availability thanks to a low hardware failure rate, and approximately 80% node utilization through careful adjustment of operation parameters. We found that the K computer is an extremely stable and highly utilized system.
Keiji Yamamoto, Atsuya Uno, Hitoshi Murai, Toshiyuki Tsukamoto, Fumiyoshi Shoji, Shuji Matsui, Ryuichi Sekizawa, Fumichika Sueyasu, Hiroshi Uchiyama, Mitsuo Okamoto, Nobuo Ohgushi, Katsutoshi Takashina, Daisuke Wakabayashi, Yuki Taguchi, Mitsuo Yokokawa
195 Quantum mechanics study of hexane isomers through gamma-ray and graph theory combined with C1s binding energy and nuclear magnetic spectra (NMR) [abstract]
Abstract: Quantum mechanically calculated positron-electron annihilation gamma-ray spectra, C1s binding energy spectra and NMR spectra are employed to study the electronic structures of hexane and its isomers, assisted by graph theory. Our recent positron-electron annihilation gamma-ray spectral study of n-hexane in the gas phase and core ionization (IPs) spectral studies of small alkanes and their isomers have paved the way for the present correlation study, where quantum mechanics is combined with graph theory, C1s ionization spectroscopy and nuclear magnetic resonance (NMR) to further understand the electronic structure and topology of the hexane isomers. The low-energy plane wave positron (LEPWP) model indicated that the positrophilic electrons of a molecule are dominated by the electrons in the lowest occupied valence orbital (LOVO). The most recent results using NOMO indicated that the electronic wave functions dominate the electron-positron wave functions for molecular systems. In addition to quantum mechanics, chemical graphs are also studied and presented.
Subhojyoti Chatterjee and Feng Wang
257 Dendrogram Based Algorithm for Dominated Graph Flooding [abstract]
Abstract: In this paper, we are concerned with the problem of flooding undirected weighted graphs under ceiling constraints. We provide a new algorithm based on a hierarchical structure called a dendrogram, which offers the significant advantage that it can be used for multiple floodings with various scenarios of ceiling values. In addition, when exploring the graph through its dendrogram structure in order to calculate the flooding levels, independent sub-dendrograms are generated, thus offering a natural way for parallel processing. We provide an efficient implementation of our algorithm through suitable data structures and an optimal organisation of the computations. Experimental results show that our algorithm outperforms well-established classical algorithms, and reveal that the cost of building the dendrogram strongly dominates the total running time, thus validating both the efficiency and the hallmark of our method. Moreover, we exploit the potential parallelism exposed by the flooding procedure to design a multi-threaded implementation. As the underlying parallelism is created on the fly, we use a queue to store the list of sub-dendrograms to be explored, and then use dynamic round-robin scheduling to assign them to the participating threads. This yields a load-balanced and scalable process, as shown by additional benchmark results. Our program runs in a few seconds on an ordinary computer to flood graphs with more than 20 million nodes.
Claude Tadonki
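The dynamic distribution of independently generated sub-dendrograms to worker threads can be sketched with a shared work queue. The Python snippet below illustrates only that scheduling pattern; the dendrogram construction and the flooding-level computation are placeholders and are not the paper's algorithm.

```python
# Editorial sketch of queue-based dynamic work distribution: sub-dendrograms
# discovered during the traversal are pushed into a shared queue and pulled by
# worker threads, yielding a load-balanced, on-the-fly parallelisation.
import queue
import threading

work = queue.Queue()

def process(subdendrogram):
    """Placeholder: compute flooding levels for one sub-dendrogram and
    return the independent child sub-dendrograms discovered on the way."""
    node, depth = subdendrogram
    return [(node * 2 + k, depth + 1) for k in range(2)] if depth < 5 else []

def worker():
    while True:
        task = work.get()
        if task is None:            # poison pill: no more work
            work.task_done()
            return
        for child in process(task):
            work.put(child)         # newly exposed sub-dendrograms
        work.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

work.put((1, 0))                    # root of the dendrogram
work.join()                         # wait until all sub-dendrograms are done
for _ in threads:
    work.put(None)
for t in threads:
    t.join()
```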
278 HP-DAEMON: High Performance Distributed Adaptive Energy-efficient Matrix-multiplicatiON [abstract]
Abstract: Improving the energy efficiency of high performance scientific applications is a crucial demand nowadays. Software-controlled hardware solutions directed by Dynamic Voltage and Frequency Scaling (DVFS) have shown their effectiveness extensively. Although DVFS is beneficial to green computing, introducing DVFS itself can incur non-negligible overhead if it issues a large number of frequency switches. In this paper, we propose a strategy to achieve optimal energy savings for distributed matrix multiplication by algorithmically trading more computation and communication at a time, adaptively and within user-specified memory costs, for fewer DVFS switches, which saves 7.5% more energy on average than a classic strategy. Moreover, we leverage a high performance communication scheme that fully exploits network bandwidth via pipeline broadcast. Overall, the integrated approach achieves substantial energy savings (up to 51.4%) and performance gains (28.6% on average) compared to ScaLAPACK pdgemm() on a cluster with an Ethernet switch, and outperforms ScaLAPACK and DPLASMA pdgemm() by 33.3% and 32.7% on average, respectively, on a cluster with an InfiniBand switch.
Li Tan, Longxiang Chen, Zizhong Chen, Ziliang Zong, Rong Ge, Dong Li
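A pipeline (chain) broadcast, as leveraged in the abstract above for exploiting network bandwidth, can be sketched with mpi4py as follows. This is a generic illustration of the technique, not the tuned implementation used in HP-DAEMON; the matrix size and chunk count are arbitrary.

```python
# Generic sketch of a pipelined broadcast along a chain of ranks: the root
# streams the matrix in chunks, and because each rank forwards a chunk right
# after receiving it, different ranks handle different chunks concurrently.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n, n_chunks = 2048, 16
data = np.random.rand(n, n) if rank == 0 else np.empty((n, n))
chunks = np.array_split(data, n_chunks, axis=0)   # views into `data`

for i, chunk in enumerate(chunks):
    if rank > 0:
        comm.Recv(chunk, source=rank - 1, tag=i)  # receive chunk i from predecessor
    if rank < size - 1:
        comm.Send(chunk, dest=rank + 1, tag=i)    # forward it down the pipeline

# After the loop, every rank holds the full matrix in `data`.
```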
279 Evaluating the Performance of Multi-tenant Elastic Extension Tables [abstract]
Abstract: An important challenge in the design of databases that support multi-tenant applications is to provide a platform to manage large volumes of data collected from different businesses, social media networks, emails, news, online texts, documents, and other data sources. To overcome this challenge we proposed in our previous work a multi-tenant database schema called Elastic Extension Tables (EET) that combines multi-tenant relational tables and virtual relational tables in a single database schema. Using this approach, the tenants' tables can be extended to support the requirements of individual tenants. In this paper, we discuss the potential of using the EET multi-tenant database schema and show how it can be used for managing physical and virtual relational data. We perform several experiments to measure the feasibility and effectiveness of EET by comparing it with a commercially available multi-tenant schema mapping technique used by SalesForce.com. We report significant performance improvements obtained using EET compared to Universal Table Schema Mapping (UTSM), making the EET schema a good candidate for the management of multi-tenant data in Software as a Service (SaaS) and Big Data applications.
Haitham Yaish, Madhu Goyal, George Feuerlicht

Workshop on Computational Finance and Business Intelligence (CFBI) Session 1

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Tully II

Chair: ?

100 Twin Support Vector Machine in Linear Programs [abstract]
Abstract: This paper proposes a new algorithm for the binary classification problem, termed LPTWSVM, which seeks two nonparallel hyperplanes and improves on TWSVM. We improve the recently proposed ITSVM and develop a Generalized ITSVM. A linear function is chosen in the objective function of the Generalized ITSVM, which leads to the primal problems of LPTWSVM. Compared with TWSVM, a 1-norm regularization term is introduced into the objective function to implement structural risk minimization, and the quadratic programming problems are replaced by linear programming problems which can be solved quickly and easily. We therefore do not need to compute large inverse matrices or use any optimization trick in solving our linear programs, and the dual problems are unnecessary. Kernel functions can be introduced directly in the nonlinear case, which overcomes a serious drawback of TWSVM. The numerical experiments verify that our LPTWSVM is very effective.
Dewei Li, Yingjie Tian
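For reference, the classical TWSVM problem for the first of the two nonparallel hyperplanes is recalled below (A holds the samples of class +1, B those of class -1, and e_1, e_2 are vectors of ones). LPTWSVM modifies this kind of formulation by introducing 1-norm regularization and turning the quadratic programs into linear programs; the exact LPTWSVM formulation is given in the paper.

```latex
% Classical TWSVM problem for the first hyperplane x^T w_1 + b_1 = 0
% (background reference only, not the LPTWSVM formulation itself).
\begin{aligned}
\min_{w_1,\, b_1,\, \xi}\quad & \tfrac{1}{2}\,\lVert A w_1 + e_1 b_1 \rVert^2 + c_1\, e_2^{\top}\xi \\
\text{s.t.}\quad & -(B w_1 + e_2 b_1) + \xi \ge e_2, \qquad \xi \ge 0 .
\end{aligned}
```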
240 Determining the time window threshold to identify user sessions of stakeholders of a commercial bank portal [abstract]
Abstract: In this paper, we focus on finding a suitable value of the time threshold, which is then used in the time-based method of user session identification. To determine its value, we used the Length variable representing the time a user spent on a particular site. We compared two values of the time threshold with experimental methods of user session identification based on the structure of the web: Reference Length and H-ref. When comparing the usefulness of the rules extracted using all four methods, we proved that the time threshold calculated from the quartile range is the most appropriate method for identifying sessions for web usage mining.
Jozef Kapusta, Michal Munk, Peter Svec, Anna Pilkova
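One possible way to derive a session time-out threshold from the quartile range of the Length variable is sketched below. This is an illustrative assumption using the common upper-fence rule Q3 + 1.5 IQR; the paper's exact rule may differ, and the Length values here are invented.

```python
# Illustrative computation of a session time-out threshold from the quartile
# range of page-view Lengths (the exact rule used in the paper may differ).
import numpy as np

# Hypothetical Length values: seconds spent on a page by portal visitors.
lengths = np.array([5, 12, 30, 45, 60, 75, 90, 120, 240, 600], dtype=float)

q1, q3 = np.percentile(lengths, [25, 75])
iqr = q3 - q1
threshold = q3 + 1.5 * iqr     # a common upper-fence choice based on the IQR

# Two consecutive page views separated by more than `threshold` seconds
# would then be assigned to different user sessions.
print(f"Q1={q1:.1f}s, Q3={q3:.1f}s, session threshold={threshold:.1f}s")
```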
183 Historical Claims Data Based Hybrid Predictive Models for Hospitalization [abstract]
Abstract: Over $30 billion is wasted on unnecessary hospitalization each year; therefore, a better quantitative way is needed to identify patients who are most likely to be hospitalized and then provide them with the utmost care. As a starting point, the objective of this paper was to develop a predictive model of how many days patients may spend in the hospital next year, based on patients' historical claims data provided by the Heritage Health Prize Competition. The proposed predictive model applies an ensemble of binary classification and regression techniques. The model is evaluated on a testing dataset in terms of the Root Mean Square Error (RMSE). The best RMSE score was 0.474, and the corresponding prediction accuracy of 81.9% was reasonably high. It is therefore convincing to conclude that predictive models have the potential to predict hospitalization and improve patients' quality of life.
Chengcheng Liu, Yong Shi
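The hybrid classification-plus-regression idea can be illustrated with a small scikit-learn sketch on synthetic data. The features, models and evaluation details of the paper are not reproduced here, and the log-scale RMSE shown is only one plausible reading of the competition metric.

```python
# Editorial sketch of an ensemble of binary classification ("any hospital
# stay?") and regression ("how many days?") on synthetic claims-like data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))                        # synthetic claim features
days = np.clip(np.round(np.exp(X[:, 0]) - 1), 0, 15)   # days in hospital next year

X_tr, X_te, y_tr, y_te = X[:1500], X[1500:], days[:1500], days[1500:]

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr > 0)            # hospitalized?
reg = GradientBoostingRegressor().fit(X_tr[y_tr > 0], y_tr[y_tr > 0])  # for how long?

pred = np.where(clf.predict(X_te), np.maximum(reg.predict(X_te), 0), 0.0)

# The competition scored log-transformed days; plain log-scale RMSE shown here.
rmse = np.sqrt(mean_squared_error(np.log1p(y_te), np.log1p(pred)))
print(f"RMSE (log scale): {rmse:.3f}")
```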

Solving Problems with Uncertainties (SPU) Session 1

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Tully III

Chair: Vassil Alexandrov

37 Wind field uncertainty in forest fire propagation prediction [abstract]
Abstract: Forest fires are a significant problem, especially in Mediterranean countries. To fight these hazards, it is necessary to have an accurate prediction of the fire's evolution beforehand. Propagation models have therefore been developed to determine the expected evolution of a forest fire. Such propagation models require input parameters to produce the predictions, and these parameters must be as accurate as possible in order to provide a prediction adjusted to the actual fire behavior. However, in many cases the values of the input parameters are obtained by indirect measurements, and such indirect estimations imply a degree of uncertainty in the parameter values. This problem is very significant for parameters that have a spatial distribution or variation, such as wind. The wind provided by a global weather forecast model, or measured at a meteorological station at some particular point, is modified by the topography of the terrain and has a different value at every point of the terrain. To estimate the wind speed and direction at each point of the terrain it is necessary to apply a wind field model that determines those values at each point depending on the terrain topography. WindNinja is a wind field simulator that provides an estimated wind direction and speed at each point of the terrain given a meteorological wind. However, the calculation of the wind field takes considerable time when the map is large (30x30 km) and the resolution is high (30x30 meters). This time penalizes the prediction of forest fire spread and may eventually make effective prediction of fire spread with a wind field impractical. Moreover, the data structures needed to calculate the wind field of a large map require a large amount of memory that may not be available on a single node of a current system. To reduce the computation time of the wind field, a data partitioning method has been applied: the wind field is calculated in parallel on each part of the map, and the wind fields of the different parts are then joined to form the global wind field. Furthermore, by partitioning the terrain map, the data structures necessary to resolve the wind field in each part are reduced significantly and can be stored in the memory of a single node of a current parallel system. Therefore, the existing nodes can perform computation in parallel with data that fit the memory capacity of each node. However, the calculation of the wind field is a complex problem with certain border effects: the wind direction and speed at points next to the border of each part may vary and differ from the values they would have far from the border, for example if the wind field were calculated over a single complete map. To solve this problem, it is necessary to include a degree of overlap among the map parts, so there is a margin between the beginning of each part and the part's own cells. The overall wind field is obtained by aggregating the parts after discarding the overlap margins. The inclusion of an overlap in each part increases the execution time, but the variation in the wind field is reduced. The methodology has been tested with several terrain maps, and it was found that parts of 400x400 cells with an overlap of 50 cells per side provide a reasonable execution time (150 sec) with virtually no variation with respect to the wind field obtained with a global map. With this type of partitioning, each process solves an effective part of the map of 300x300 cells.
Gemma Sanjuan, Carlos Brun, Tomas Margalef, Ana Cortes
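The overlap-based partitioning described in the abstract above can be illustrated with a small array sketch: the map is split into tiles with a halo margin, a placeholder solver stands in for WindNinja, and only the core of each tile is kept when the global field is reassembled. The tile and overlap sizes follow the 400x400 / 50-cell figures quoted in the abstract; everything else is invented.

```python
# Simplified illustration of tile-with-overlap partitioning; the per-tile wind
# field solver is a placeholder, not WindNinja.
import numpy as np

def wind_field(tile):
    """Placeholder for the per-tile wind field computation."""
    return 0.25 * (np.roll(tile, 1, 0) + np.roll(tile, -1, 0) +
                   np.roll(tile, 1, 1) + np.roll(tile, -1, 1))

core, halo = 300, 50                      # effective part and overlap per side
terrain = np.random.rand(1200, 1200)      # synthetic elevation map
result = np.empty_like(terrain)

for i0 in range(0, terrain.shape[0], core):
    for j0 in range(0, terrain.shape[1], core):
        # Tile including the overlap margin, clipped at the map border.
        r0, r1 = max(i0 - halo, 0), min(i0 + core + halo, terrain.shape[0])
        c0, c1 = max(j0 - halo, 0), min(j0 + core + halo, terrain.shape[1])
        tile = wind_field(terrain[r0:r1, c0:c1])
        # Keep only the core region; the overlap margins are discarded.
        result[i0:i0 + core, j0:j0 + core] = tile[i0 - r0:i0 - r0 + core,
                                                  j0 - c0:j0 - c0 + core]
```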
307 A Framework for Evaluating Skyline Query over Uncertain Autonomous Databases [abstract]
Abstract: The idea of a skyline query is to find the set of objects that are preferred in all dimensions. While this concept is easily applicable to a certain and complete database, when integrating databases that each represent data differently in the same dimension, it becomes difficult to determine the dominance relation between the underlying data. In this paper, we propose a framework, SkyQUD, to efficiently compute the skyline probability of datasets in uncertain dimensions. We explore the effects of datasets with uncertain dimensions on the dominance relation theory and propose a framework that is able to support skyline queries on this type of dataset.
Nurul Husna Mohd Saad, Hamidah Ibrahim, Ali Amer Alwan, Fatimah Sidi, Razali Yaakob
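For reference, the classical certain-data dominance test underlying skyline computation is shown below; SkyQUD's probabilistic dominance over uncertain dimensions, which is the contribution of the paper, is not reproduced here.

```python
# Classical Pareto dominance and naive skyline on certain data ("smaller is
# better" in every dimension); shown only as background for the abstract above.
def dominates(a, b):
    """a dominates b: at least as good everywhere, strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(objects):
    return [o for o in objects
            if not any(dominates(other, o) for other in objects if other is not o)]

hotels = [(120, 1.2), (95, 3.5), (150, 0.4), (110, 2.0)]   # (price, distance)
print(skyline(hotels))   # objects not dominated by any other object
```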
253 Efficient Data Structures for Risk Modelling in Portfolios of Catastrophic Risk Using MapReduce [abstract]
Abstract: The QuPARA Risk Analysis Framework is an analytical framework implemented using MapReduce and designed to answer a wide variety of complex risk analysis queries on massive portfolios of catastrophic risk contracts. In this paper, we present data structure improvements that greatly accelerate QuPARA's computation of Exceedance Probability (EP) curves with secondary uncertainty.
Andrew Rau-Chaplin, Zhimin Yao, Norbert Zeh
40 Argumentation Approach and Learning Methods in Intelligent Decision Support Systems in the Presence of Inconsistent Data [abstract]
Abstract: This paper describes methods and algorithms for working with inconsistent data in intelligent decision support systems. An argumentation approach and the application of rough sets to generalization problems are considered. Methods for finding conflicts and a generalization algorithm based on rough sets are proposed. Noise models in the generalization algorithm are examined, and experimental results are presented. A solution to some problems that are not solvable in classical logics is given.
Vadim N. Vagin, Marina Fomina, Oleg Morosin
365 Enhancing Monte Carlo Preconditioning Methods for Matrix Computations [abstract]
Abstract: An enhanced version of a stochastic SParse Approximate Inverse (SPAI) preconditioner for general matrices is presented. This method is used in contrast to the standard deterministic preconditioners computed by the deterministic SPAI and its further optimized parallel variant, the Modified SParse Approximate Inverse preconditioner (MSPAI). We thus present a Monte Carlo preconditioner that relies on Markov Chain Monte Carlo (MCMC) methods to compute a rough matrix inverse first, which is then refined by an iterative filter process and a parallel refinement to enhance the accuracy of the preconditioner. Monte Carlo methods quantify the uncertainties by enabling us to estimate the non-zero elements of the inverse matrix with a given precision and a certain probability. The advantage of this approach is that we use sparse Monte Carlo matrix inversion whose complexity is linear in the size of the matrix. The behaviour of the proposed algorithm is studied, and its performance is measured and compared with MSPAI.
Janko Strassburg, Vassil Alexandrov
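The textbook Monte Carlo building block behind such preconditioners, estimating entries of the inverse from weighted random walks over the Neumann series of C = I - A, can be sketched as follows. This is an editorial illustration under the assumption that the series converges; the paper's stochastic SPAI adds sparsification, an iterative filter and parallel refinement on top of such an estimator.

```python
# Plain Monte Carlo estimate of one entry of A^{-1} via the Neumann series
# A^{-1} = sum_k C^k, C = I - A (assumes spectral radius of C below 1).
import numpy as np

def mc_inverse_entry(A, i, j, n_walks=20000, max_steps=50, rng=None):
    rng = rng or np.random.default_rng(0)
    C = np.eye(A.shape[0]) - A
    row_sums = np.abs(C).sum(axis=1)
    total = 0.0
    for _ in range(n_walks):
        state, weight = i, 1.0
        acc = 1.0 if i == j else 0.0            # k = 0 term of the series
        for _ in range(max_steps):
            if row_sums[state] == 0.0:          # walk cannot continue
                break
            probs = np.abs(C[state]) / row_sums[state]
            nxt = rng.choice(A.shape[0], p=probs)
            weight *= C[state, nxt] / probs[nxt]  # importance-sampling weight
            state = nxt
            if state == j:
                acc += weight                   # contributes to the k-th term
        total += acc
    return total / n_walks

# Small test: scale a diagonally dominant matrix so that ||I - A|| < 1.
A = np.array([[2.0, -0.3, 0.1], [-0.2, 2.5, -0.4], [0.1, -0.3, 3.0]]) / 4.0
print(mc_inverse_entry(A, 0, 1), np.linalg.inv(A)[0, 1])
```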

Workshop on Advances in the Kepler Scientific Workflow System and Its Applications (KEPLER) Session 1

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Bluewater I

Chair: Ilkay Altintas

260 Design and Implementation of Kepler Workflows for BioEarth [abstract]
Abstract: BioEarth is an ongoing research initiative for the development of a regional-scale Earth System Model (EaSM) for the U.S. Pacific Northwest. Our project seeks to couple and integrate multiple stand-alone EaSMs developed through independent efforts for capturing natural and human processes in various realms of the biosphere: atmosphere (weather and air quality), terrestrial biota (crop, rangeland, and forest agro-ecosystems) and aquatic (river flows, water quality, and reservoirs); hydrology links all these realms. Due to the need to manage numerous complex simulations, the application of automated workflows was essential. In this paper, we present a case study of workflow design for the BioEarth project using the Kepler system to manage applications of the Regional Hydro-Ecologic Simulation System (RHESSys) model. In particular, we report on the design of Kepler workflows to support: 1) standalone executions of the RHESSys model in serial and parallel settings, and 2) a more complex case of performing calibration runs involving multiple preprocessing modules, iterative exploration of parameters and parallel RHESSys executions. We exploited various Kepler features including a user-friendly design interface and support for parallel execution on a cluster. Our experiments show a performance speedup of between 7x and 12x using 16 cores of a Linux cluster, and demonstrate the general effectiveness of our Kepler workflows in managing RHESSys runs. This study shows the potential of Kepler to serve as the primary integration platform for the BioEarth project, with implications for other data- and compute-intensive Earth systems modeling projects.
Tristan Mullis, Mingliang Liu, Ananth Kalyanaraman, Joseph Vaughan, Christina Tague, Jennifer Adam
327 Tools, methods and services enhancing the usage of the Kepler-based scientific workflow framework [abstract]
Abstract: Scientific workflow systems are designed to compose and execute either a series of computational or data manipulation steps, or workflows, in a scientific application. They are usually part of a larger eScience environment. The usage of workflow systems, while very beneficial, is often not trivial for scientists. There are many requirements for additional functionality around scientific workflow systems that need to be taken into account, such as the ability to share workflows, the provision of user-friendly GUI tools for automating certain tasks, or submission to distributed computing infrastructures. In this paper we present tools developed in response to the requirements of three different scientific communities. These tools simplify and empower their work with the Kepler scientific workflow system. The usage of these tools and services is illustrated with Nanotechnology, Astronomy and Fusion example scenarios.
Marcin Plociennik, Szymon Winczewski, Paweł Ciecieląg, Frederic Imbeaux, Bernard Guillerminet, Philippe Huynh, Michał Owsiak, Piotr Spyra, Thierry Aniel, Bartek Palak, Tomasz Żok, Wojciech Pych, Jarosław Rybicki
371 Progress towards automated Kepler scientific workflows for computer-aided drug discovery and molecular simulations [abstract]
Abstract: We describe the development of automated workflows that support computer-aided drug discovery (CADD) and molecular dynamics (MD) simulations and are included as part of the National Biomedical Computational Resource (NBCR). The main workflow components include: file-management tasks, ligand force field parameterization, receptor-ligand molecular dynamics (MD) simulations, job submission and monitoring on relevant high-performance computing (HPC) resources, receptor structural clustering, virtual screening (VS), and statistical analyses of the VS results. The workflows aim to standardize simulation and analysis and promote best practices within the molecular simulation and CADD communities. Each component is developed as a stand-alone workflow, which allows easy integration into larger frameworks built to suit user needs, while remaining intuitive and easy to extend.
Pek U. Ieong, Jesper Sørensen, Prasantha L. Vemu, Celia W. Wong, Özlem Demir, Nadya P. Williams, Jianwu Wang, Daniel Crawl, Robert V. Swift, Robert D. Malmstrom, Ilkay Altintas, Rommie E. Amaro
341 Flexible approach to astronomical data reduction workflows in Kepler [abstract]
Abstract: The growing scale and complexity of cataloguing and analyzing astronomical data forces scientists to look for new technologies and tools. Workflow environments appear best suited to their needs, but in practice they prove to be too complicated for most users. Before such environments come into common use, they have to be properly adapted to domain-specific needs. To that end, we have created a universal solution based on the Kepler workflow environment. It consists of a library of domain modules, ready-to-use workflows and additional services for sharing and running workflows. There are three access levels depending on the needs and skills of the user: 1) a desktop application, 2) a web application, and 3) an on-demand Virtual Research Environment. Everything is set up in the context of the Polish grid infrastructure, enabling access to its resources. For flexibility, our solution includes interoperability mechanisms with domain-specific applications and services (including the astronomical Virtual Observatory) as well as with other domain grid services.
Paweł Ciecieląg, Marcin Płóciennik, Piotr Spyra, Michał Urbaniak, Tomasz Żok, Wojciech Pych
282 Identifying Information Requirement for Scheduling Kepler Workflow in the Cloud [abstract]
Abstract: The Kepler scientific workflow system has been used to help scientists automatically perform experiments in various domains on distributed computing systems. The execution of a workflow in Kepler is controlled by a director assigned to the workflow. However, users still need to specify the compute resources on which the tasks in the workflow are executed. To further ease the technical effort required of scientists, a workflow scheduler that can assign workflow tasks to resources for execution is necessary. To this end, we identify, from a review of several cloud workflow scheduling techniques, the information that should be made available for a scheduler to schedule Kepler workflows in the cloud computing context. To justify its usefulness, we discuss each type of information regarding workflow tasks, cloud resources, and cloud providers based on its benefit to workflow scheduling.
Sucha Smanchat, Kanchana Viriyapant

Modeling and Simulation of Large-scale Complex Urban Systems (MASCUS) Session 1

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Bluewater II

Chair: Heiko Aydt

111 Analysing the Effectiveness of Wearable Wireless Sensors in Controlling Crowd Disasters [abstract]
Abstract: The Love Parade disaster in Duisburg, Germany, led to several deaths and injuries. Disasters like this occur due to high crowd densities in a limited area. We propose a wearable electronic device that helps reduce such disasters by directing people and thus controlling the density of the crowd. We investigate the design and effectiveness of such a device through an agent-based simulation using the social force model. We also investigate the effect of device failure and of participants not paying attention, in order to determine the critical number of devices and attentive participants required for the device to be effective.
Teo Yu Hui Angela, Vaisagh Viswanathan, Michael Lees, Wentong Cai
204 Individual-Oriented Model Crowd Evacuations Distributed Simulation [abstract]
Abstract: Emergency plan design is an important problem in building design, with the aim of evacuating people as fast as possible. Evacuation exercises such as fire drills do not provide a realistic situation for understanding the behaviour of people, and in the case of crowd evacuations the complexity and uncertainty of the system increase. Computer simulation allows us to run crowd dynamics models and extract information about emergency situations. Several models address the emergency evacuation problem. Individual-oriented modelling allows us to describe rules for individuals and to simulate the interactions between them. Because of the variation in emergency situations, results have to be statistically reliable, and this reliability increases the computing demand. Distributed and parallel paradigms solve this performance problem. In the present work we developed a model to simulate crowd evacuations and implemented two versions of it: one using NetLogo and another using C with MPI. We chose a real environment to test the simulator: building 2 of the Fira de Barcelona, able to hold thousands of people. The distributed simulator was tested with 62,820 runs in a distributed environment with 15,000 individuals. We show that the simulator achieves linear speedup and scales efficiently.
Albert Gutierrez-Milla, Francisco Borges, Remo Suppi, Emilio Luque
133 Simulating Congestion Dynamics of Train Rapid Transit using Smart Card Data [abstract]
Abstract: Investigating congestion in train rapid transit systems (RTS) in today's urban cities is a challenge compounded by limited data availability and difficulties in model validation. Here, we integrate information from travel smart card data, a mathematical model of route choice, and a full-scale agent-based model of the Singapore RTS to provide a more comprehensive understanding of the congestion dynamics than can be obtained through analytical modelling alone. Our model is empirically validated, and allows for close inspection of the dynamics including station crowdedness, average travel duration, and frequency of missed trains, all highly pertinent factors in service quality. Using current data, the crowdedness in all 121 stations appears to be distributed log-normally. In our preliminary scenarios, we investigate the effect of population growth on service quality. We find that the current population (2 million) lies below a critical point, and that increasing it by more than approximately 10% leads to an exponential deterioration in service quality. We also predict that incentivizing commuters to avoid the most congested hours can bring modest improvements to the service quality, provided the population remains under the critical point. Finally, our model can be used to generate simulated data for statistical analysis when such data are not empirically available, as is often the case.
Nasri Othman, Erika Fille Legara, Vicknesh Selvam, Christopher Monterola
177 A method to ascertain rapid transit systems' throughput distribution using network analysis [abstract]
Abstract: We present a method of predicting the distribution of passenger throughput across the stations and lines of a city rapid transit system by calculating the normalized betweenness centrality of the nodes (stations) and edges of the rail network. The method is evaluated by correlating the distribution of betweenness centrality against the throughput distribution calculated from actual passenger ridership data. Our ticketing data come from the rail transport system of Singapore and comprise more than 14 million journeys over the span of one week. We demonstrate that removal of outliers representing about 10% of the stations produces a statistically significant correlation above 0.7. Interestingly, these outliers coincide with stations that opened six months before the ridership data were collected, hinting that travel routines along these stations have not yet settled to their equilibrium. The correlation improves significantly when the data points are split according to their separate lines, illustrating differences in the intrinsic characteristics of each line. The simple procedure established here shows that static network analysis of the structure of a transport network can allow transport planners to predict passenger ridership with sufficient accuracy, without requiring dynamic and complex simulation methods.
Muhamad Azfar Ramli, Christopher Monterola, Gary Kee Khoon Lee, Terence Gih Guang Hung
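The core of the analysis, correlating normalized betweenness centrality with observed throughput, can be sketched with networkx and scipy on a toy network. The Singapore network and the smart card counts are of course not included, so the graph and throughput numbers below are purely synthetic.

```python
# Editorial sketch: station betweenness centrality vs. synthetic throughput.
import networkx as nx
import numpy as np
from scipy.stats import pearsonr

G = nx.Graph()
G.add_edges_from([("A", "B"), ("B", "C"), ("C", "D"),
                  ("B", "E"), ("E", "D"), ("D", "F")])   # toy rail network

bc = nx.betweenness_centrality(G, normalized=True)       # station centrality
stations = sorted(G.nodes)

# Hypothetical observed throughput per station (e.g. weekly tap-ins/outs).
throughput = np.array([1.0, 5.2, 3.1, 4.8, 2.9, 1.1])

r, p = pearsonr([bc[s] for s in stations], throughput)
print(f"Pearson correlation between centrality and throughput: r={r:.2f} (p={p:.3f})")
```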
236 Fast and Accurate Optimization of a GPU-accelerated CA Urban Model through Cooperative Coevolutionary Particle Swarms [abstract]
Abstract: The calibration of Cellular Automata (CA) models for simulating land-use dynamics requires the use of formal, well-structured and automated optimization procedures. A typical approach used in the literature to tackle the calibration problem consists of using general optimization metaheuristics. However, the latter often require thousands of runs of the model to provide reliable results, thus involving remarkable computational costs. Moreover, all optimization metaheuristics are plagued by the so-called curse of dimensionality, that is, a rapid deterioration of efficiency as the dimensionality of the search space increases. Therefore, in the case of models depending on a large number of parameters, the calibration problem requires the use of advanced computational techniques. In this paper, we investigate the effectiveness of combining two computational strategies. On the one hand, we greatly speed up CA simulations by using general-purpose computing on graphics processing units. On the other hand, we use a specifically designed cooperative coevolutionary Particle Swarm Optimization algorithm, which is known for its ability to operate effectively in search spaces with a high number of dimensions.
Ivan Blecic, Arnaldo Cecchini, Giuseppe A. Trunfio
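A compact sketch of the cooperative coevolutionary PSO idea: the parameter vector is split into groups, each optimized by its own sub-swarm, and every particle is evaluated in the context of the best known values of the other groups. The CA calibration objective is replaced here by a simple quadratic benchmark and GPU acceleration is omitted; this illustrates the general technique, not the algorithm designed in the paper.

```python
# Editorial sketch of cooperative coevolutionary PSO with dimension groups.
import numpy as np

def objective(x):                  # stand-in for the CA calibration error
    return np.sum((x - 0.3) ** 2)

rng = np.random.default_rng(1)
dim, n_groups, swarm_size, iters = 12, 4, 10, 200
groups = np.array_split(np.arange(dim), n_groups)

context = rng.random(dim)                      # best-known full parameter vector
pos = [rng.random((swarm_size, len(g))) for g in groups]
vel = [np.zeros_like(p) for p in pos]
pbest = [p.copy() for p in pos]
pbest_val = [np.full(swarm_size, np.inf) for _ in groups]

for _ in range(iters):
    for k, g in enumerate(groups):
        for i in range(swarm_size):
            trial = context.copy()
            trial[g] = pos[k][i]               # evaluate particle in context
            val = objective(trial)
            if val < pbest_val[k][i]:
                pbest_val[k][i], pbest[k][i] = val, pos[k][i]
            if val < objective(context):
                context[g] = pos[k][i]         # update the shared context vector
        gbest = pbest[k][np.argmin(pbest_val[k])]
        r1, r2 = rng.random(pos[k].shape), rng.random(pos[k].shape)
        vel[k] = 0.72 * vel[k] + 1.49 * r1 * (pbest[k] - pos[k]) + 1.49 * r2 * (gbest - pos[k])
        pos[k] = np.clip(pos[k] + vel[k], 0.0, 1.0)

print("best parameters:", np.round(context, 3), "error:", objective(context))
```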

Multiscale Modelling and Simulation (MSCALE) Session 1

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Rosser

Chair: Valeria Krzhizhanovskaya

126 Restrictions in model reduction for polymer chain models in dissipative particle dynamics [abstract]
Abstract: We model high molecular weight homopolymers at semidilute concentration via Dissipative Particle Dynamics (DPD). We show that in model reduction methodologies for polymers it is not enough to preserve system properties (i.e., density ρ, pressure p, temperature T, radial distribution function g(r)); preserving the characteristic shape and length scale of the polymer chain model is also necessary. In this work we apply a recently proposed DPD model reduction methodology and demonstrate why its applicability is limited to a certain maximum polymer length and why it is not suitable for solvent coarse graining.
Nicolas Moreno, Suzana Nunes, Victor M. Calo
353 Simulation platform for multiscale and multiphysics modeling of OLEDs [abstract]
Abstract: We present a simulation platform which serves as an integrated framework for multiscale and multiphysics modeling of Organic Light Emitting Diodes (OLEDs) and their components. The platform is aimed at the designers of OLEDs with various areas of expertise ranging from the fundamental theory to the manufacturing technology. The platform integrates an extendable set of in-house and third-party computational programs that are used for predictive modeling of the OLED parameters important for device performance. These computational tools describe properties of atomistic, mesoscale and macroscopic levels. The platform automates data exchange between these description levels and allows one to build simulation workflows and manage remote task execution. The integrated database provides data exchange and storage for the calculated and experimental results.
Maria Bogdanova, Sergey Belousov, Ilya Valuev, Andrey Zakirov, Mikhail Okun, Denis Shirabaykin, Vasily Chorkov, Petr Tokar, Andrey Knizhnik, Boris Potapkin, Alexander Bagaturyants, Ksenia Komarova, Mikhail Strikhanov, Alexey Tishchenko, Vladimir Nikitenko, Vasili Sukharev, Natalia Sannikova, Igor Morozov
336 Scaling acoustical calculations on multicore, multiprocessor and distributed computer environment [abstract]
Abstract: Using computer systems to calculate acoustic fields is common practice because of the generally high complexity of such tasks. However, implementing algorithmic and software solutions for acoustic field calculation faces a wide variety of problems, caused by the impossibility of algorithmically representing all of the physical laws involved in calculating the field distribution for every medium and for arbitrary sets of field parameters and sources. There are therefore many limitations on the tasks that can be solved by a single simulation system. At the same time, a large number of calculations are required to perform general simulation tasks over all sets of input parameters. It is thus important to develop new algorithmic solutions for calculating acoustic fields over a wider range of input parameters, scalable to many parallel and distributed computers, in order to increase the maximum feasible computation load at adequate time and cost. Tasks of calculating acoustic fields may belong to various domains with respect to the physical laws involved in the calculation. In this article, a general architecture of the simulation system is presented, describing the structure and functionality of the system at the top level together with its domain-independent subsystems. The complete architecture can be defined only for a specific class of calculation tasks; two such classes are described: simulating acoustic fields in enclosed rooms and in natural stochastic deep-water waveguides.
Andrey Chusov, Lubov Statsenko, Yulia Mirgorodskaya, Boris Salnikov and Evgeniya Salnikova
384 PyGrAFT: Tools for Representing and Managing Regular and Sparse Grids [abstract]
Abstract: Many computational science applications perform compute-intensive operations on scalar and vector fields residing on multidimensional grids. Typically these codes run on supercomputers: large multiprocessor commodity clusters or hybrid platforms that combine CPUs with accelerators such as GPUs. The Python Grids and Fields Toolkit (PyGrAFT) is a set of classes, methods, and library functions for representing scalar and vector fields residing on multidimensional, logically Cartesian (including curvilinear) grids. The aim of PyGrAFT is to accelerate the development of numerical analysis applications by combining the high programmer productivity of Python with the high performance of lower-level programming languages. The PyGrAFT data model, which leverages the NumPy ndarray class, enables representation of tensor-product grids of arbitrary dimension and collections of scalar and/or vector fields residing on these grids. Furthermore, the PyGrAFT data model allows the user to choose the field storage ordering for optimal performance in the target application. Class support methods and library functions employ, where possible, reliable, well-tested, high-performance packages from the Python software ecosystem (e.g., NumPy, SciPy, mpi4py). The PyGrAFT data model and library are applicable to global address spaces and to distributed-memory platforms that utilise MPI. Library operations include intergrid interpolation and support for multigrid solver techniques such as the sparse grid combination technique. We describe the PyGrAFT data model, its parallelisation, and strategies currently underway to explore opportunities for providing multilevel parallelism with relatively little user effort. We illustrate the PyGrAFT data model, library functions, and resultant programming model in action for a variety of applications, including function evaluation, PDE solvers, and sparse grid combination technique solvers. We demonstrate the language interoperability of PyGrAFT with a C++ example, and outline strategies for using PyGrAFT with legacy codes written in other programming languages. We explore the implications of this programming model for an emerging problem in computational science and engineering: modelling multiphysics and multiscale systems. We conclude with an outline of the PyGrAFT development roadmap, including full support for vector fields and calculations in curvilinear coordinates, support for GPUs and other parallelisation schemes, and extensions to the PyGrAFT model to accommodate general multiresolution numerical methods.
Jay Larson

Urgent Computing: Computations for Decision Support in Critical Situations (UC) Session 2

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Mossman

Chair: Alexander Boukhanovsky

366 Hybrid scheduling algorithm in early warning [abstract]
Abstract: Research into the development of efficient early warning systems (EWS) is essential for the prediction of, and warning about, upcoming natural hazards. Besides the provision of communication- and computation-intensive infrastructure, high resource reliability and hard deadlines are required for processing EWS scenarios in order to obtain guaranteed information under time-limited conditions. In this paper the planning of EWS scenario execution is investigated, and an efficient hybrid algorithm for urgent workflow scheduling is developed, based on traditional heuristic and meta-heuristic approaches within state-of-the-art cloud computing principles.
Denis Nasonov, Nikolay Butakov
400 On-board Decision Support System for Ship Flooding Emergency Response [abstract]
Abstract: The paper describes a real-time software system to support emergency planning decisions when ship flooding occurs. The events of grounding and collision are considered, where the risk of subsequent flooding of hull compartments is very high and must be avoided or at least minimized. The system is based on a highly optimized algorithm that estimates, ahead in time, the progressive flooding of the compartments according to the current ship status and existing damage. Flooding times and stability parameters are measured, allowing the crew to take adequate measures, such as isolating or counter-flooding compartments, before the flooding takes on uncontrollable proportions. The simulation is visualized in real time in a Virtual Environment, which provides all the functionality needed to evaluate the seriousness and consequences of the situation, as well as to test, monitor and carry out emergency actions. Because flooding is a complex physical phenomenon that occurs in an equally complex structure such as a ship, the real-time flooding simulation combined with the Virtual Environment requires large computational power to ensure the reliability of the simulation results. Moreover, the distress normally experienced by the crew in such situations, and the urgent (and hopefully appropriate) counter-measures required, leave no room for inaccuracies or misinterpretations caused by a lack of computational power. For the events considered, the system is primarily used as a decision support tool for taking urgent actions in order to avoid, or at least minimize, disastrous consequences such as oil spilling, sinking, or even the loss of human lives.
Jose Varela, Jose Rodrigues, Carlos Guedes Soares