PRACE User Forum (PRACE) Session 1

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M110

Chair: Derek Groen

742	Understanding Scientific Application’s Performance using BSC tools [abstract] Abstract: The current trend in supercomputer architectures is leading scientific applications to use parallel programming models such as the Message Passing Interface (MPI) to use computing resources properly. Understanding how these applications behave is not straightforward, yet it is crucial for achieving good performance and efficiency. Here we present a real case study of a cutting-edge scientific application. NEMO is a state-of-the-art Ocean Global Circulation Model (OGCM) with hundreds of users around the world. It is used for oceanographic research, operational oceanography, seasonal forecast and climate studies. In the framework of PRACE, the projects HiResClim and HiResClim II, related to High Resolution Climate projections, use NEMO as an oceanic model and were granted more than 38 million core hours in the 5th PRACE Regular Call for Proposals and 50 million core hours in the 7th PRACE Regular Call for Proposals, both on the Tier-0 machine MareNostrum. That huge amount of computation time justifies the effort to analyze and optimize the application’s performance. Using the performance tools developed at Barcelona Supercomputing Center (BSC) it is possible to analyse the behaviour of the application. We studied different executions of the NEMO model and performed different analyses of the computational phases, analyzing how CPU and memory behave, as well as communication patterns. We also ran strong and weak scaling tests to find bottlenecks constraining the scalability of the application. With this analysis, we could confirm some of the problems envisaged in previous performance analyses of the application and identify others not seen before. 
With Paraver it is possible both to see the internal behaviour of the application in great detail (for example, when, by whom and to whom every message is sent) and to compute metrics that extract useful information (such as parallel efficiency, load balance and many more). Dimemas allows us to simulate the behaviour of the application under different conditions. It can be used to analyze the sensitivity to network parameters and, for example, to assess whether an application could run properly in a cloud computing environment. Other tools being developed at BSC and used in this work are Clustering and Folding. The clustering tool uses a data mining technique to identify regions of code with similar performance trends. This makes it possible to group together and study different iterations, using the folding tool, in order to get instantaneous performance metrics inside the routines, finding areas of interest that have poor hardware usage. To demonstrate the power of these tools we will show some success stories for NEMO using BSC tools, reporting how we identified specific bottlenecks, proposed some solutions and finally confirmed the impact of the changes.	Oriol Tinto, Miguel Castrillo, Kim Serradel, Oriol Mula Valls, Ana Cortes and Francisco J. Doblas Reyes
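The abstract above names parallel efficiency and load balance as metrics but does not define them. The sketch below uses one commonly used multiplicative decomposition (load balance as average over maximum useful compute time, communication efficiency as maximum compute time over runtime); this definition, the function name and the timings are assumptions for illustration, not Paraver's API or the authors' exact method.

```python
def efficiency_metrics(compute_times, runtime):
    """Return (load_balance, communication_efficiency, parallel_efficiency).

    compute_times: useful computation time per MPI rank, in seconds.
    runtime: wall-clock time of the whole run, in seconds.
    Assumed model: parallel efficiency = load balance * communication efficiency.
    """
    if not compute_times or runtime <= 0 or max(compute_times) <= 0:
        raise ValueError("need positive compute times and a positive runtime")
    average = sum(compute_times) / len(compute_times)
    slowest = max(compute_times)
    load_balance = average / slowest              # 1.0 means perfectly balanced
    communication_efficiency = slowest / runtime  # time not lost outside compute
    parallel_efficiency = load_balance * communication_efficiency
    return load_balance, communication_efficiency, parallel_efficiency


# Hypothetical timings for four ranks in a 10-second run.
lb, comm, par = efficiency_metrics([8.0, 6.0, 7.0, 7.0], runtime=10.0)
```

Under these assumed definitions, parallel efficiency reduces to the average useful compute time divided by the runtime, so a low value can be traced either to imbalance between ranks or to time spent outside computation.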
746	Using High Performance Computing to Model Clay-Polymer Nanocomposites [abstract] Abstract: Using a three-level multiscale modelling scheme and several Petascale supercomputers, we have been able to model the dynamical process of polymer intercalation into clay tactoids and the ensuing aggregation of polymer-entangled tactoids into larger structures. In our approach, we use quantum mechanical and atomistic descriptions to derive a coarse-grained yet chemically specific representation that can resolve processes on hitherto inaccessible length and time scales. We applied our approach to study collections of clay mineral tactoids interacting with two synthetic polymers, poly(ethylene glycol) and poly(vinyl alcohol). The controlled behavior of layered materials in a polymer matrix is centrally important for many engineering and manufacturing applications, and opens up a route to computing the properties of complex soft materials based on knowledge of their chemical composition, molecular structure, and processing conditions. In this talk I will present the work we have performed, as well as the techniques we used to enable the model coupling and the deployment on large infrastructures.	Derek Groen
744	Developing HPC aspects for High order DGM for industrial LES [abstract] Abstract: TBD	Koen Hillewaert
747	Introducing the Partnership for Advanced Computing in Europe - PRACE [abstract] Abstract: The remarkable developments and advances in High Performance Computing (HPC) and communications technology over the last decades have made possible many achievements and benefits across a wide variety of academic and industrial branches. Thus, it is well established that HPC is a key technology and enabling resource for science, industry and business activities, especially for large and complex problems where the scale of the problem being tackled creates challenges or the time to solution is important. Envisioned to create a world-class competitive and persistent pan-European Research Infrastructure (RI) HPC Service, the Partnership for Advanced Computing in Europe (PRACE) was established in 2010, as a Belgian international not-for-profit association (aisbl) with its seat in Brussels, Belgium. Today, PRACE is one of the world’s leading providers of HPC to research and industry (in particular SME) communities. Out of 25 participating country members within and beyond Europe, 4 “Hosting Members” (France, Germany, Spain and Italy) are in-kind contributors, providing access to 6 leading edge supercomputers in all major architectural classes: JUQUEEN (GCS – FZJ, Germany), CURIE (GENCI – CEA, France), HORNET (GCS – HLRS, Germany), SuperMUC (GCS – LRZ, Germany), MareNostrum (BSC, Spain) and FERMI (CINECA, Italy). The Hosting Members committed a total funding of €400 million for the initial PRACE systems and operations. To keep pace with the dynamic needs of a variety of scientific and industry communities and numerous technical changes and developments, PRACE hosting members' systems are continuously updated and upgraded to make the most advanced HPC technologies accessible to European scientists and industry. 
By pooling national computing resources, PRACE is able to award access to the Hosting Members' HPC resources through a unified, open and fair European peer-review process, with calls for proposals handled through a web tool. Two types of calls for proposals are offered to cover the needs expressed by the research and industry communities and to enable the participating hosting members to synchronize access to the resources, namely the Preparatory Access Call (permanent open call) and the Regular Call for Project Access (twice a year calls). The Preparatory Access is intended for short-term access (2 or 6 months) to resources, for code-enabling and porting, required to prepare proposals for Project Access and to demonstrate the scalability of codes. Project Access is intended for large-scale projects of excellent scientific merit for which clear European added value and major impact at international level are expected; it can be used for 12, 24 or 36 months (in the case of Multi-Year Access) for production runs. PRACE reserves a level of resources for Centres of Excellence (CoE), selected by the EC under the E-INFRA-5-2015 call for proposals. In 2013, PRACE initiated the SME HPC Adoption Programme in Europe (SHAPE), a pan-European programme to support greater HPC adoption by SMEs. This partnership powers excellent science and engineering in academia and industry, addressing society’s grand challenges. Open to all disciplines of research, and industry for open R&D, the PRACE infrastructure is a vital catalyst in fostering European competitiveness. Up to the 10th PRACE Call for Project Access (February, 2015), PRACE has awarded 10.2 thousand million core hours to 394 R&D projects from 38 countries, enabling them to come to fruition and yield unprecedented results. 
The growing range of disciplines that now depend on HPC can also be observed in the upward trend and evolution of the number and quality of project applications received and resources requested via the PRACE Calls for Project Access. PRACE has supported 2 patents, 158 PhD theses, 507 publications (some in the most notable scientific journals) and 719 scientific talks (up to the 5th PRACE Call for Project Access). PRACE is also committed to providing top-class education and training for computational scientists through the PRACE Advanced Training Centres (PATC), the International HPC Summer School, and PRACE seasonal schools. Until December 2014, PRACE has provided over 200 training events with over 5000 trainees and 19686 person-days of training (attendance-based), with an upward attendance trend from both academia and industry communities. Since mid-2012, PRACE has supported 50 companies, after opening its Calls for Proposals to industrial applicants, in the role of principal investigator or research team member collaborating in an academia-led project. So far, PRACE has awarded 10 SHAPE projects from 6 different countries. PRACE has also published 16 Best Practice Guides and over 200 White Papers. Nowadays it is well established that HPC is indispensable for advances in science and technology across a wide range of scientific disciplines, such as biosciences, climate and health. Success stories and R&D outcomes of PRACE-supported projects show how joint action and European competitiveness can benefit from a cross-pollination between science and industry (including SMEs), aided by European HPC resources.	Richard Tavares, Antonella Tesoro, Alison Kennedy and Sergi Girona

Modeling and Simulation of Large-scale Complex Urban Systems (MASCUS) Session 1

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M110

Chair: Heiko Aydt

707	Cellular Automata-based Anthropogenic Heat Simulation [abstract] Abstract: Cellular automata (CA) models have been employed for several years to describe urban phenomena such as growth of human settlements, changes in land use and, more recently, dispersion of air pollutants. We propose to adapt CA to study the dispersion of anthropogenic heat emissions on the micro scale. Three-dimensional cubic CA with a constant cell size of 0.15m have been implemented. Simulations suggest an improvement in processing speed compared to conventional computational fluid dynamics (CFD) models, which are limited in scale and not yet capable of solving simulations on the local or larger scale. Instead of solving the Navier-Stokes equations, as in CFD, only temperature and heat differences for the CA are modeled. Radiation, convection and turbulence have been parameterized according to scale. This CA-based approach can be combined with an agent-based traffic simulation to analyse the effect of driving behavior and other microscopic factors on urban heat.	Michael Wagner, Vaisagh Viswanathan, Dominik Pelzer, Matthias Berger, Heiko Aydt
128	Measuring Variability of Mobility Patterns from Multiday Smart-card Data [abstract] Abstract: The large amount of available mobility data stimulates work on discovering patterns and understanding regularities. Comparatively less attention has been paid to the study of variability, which, however, has been argued in previous related work to be as important as regularity, since variability identifies diversity. In a transport network, variability exists from day to day, from person to person, and from place to place. In this paper, we present a set of measures of variability at individual and aggregated levels using multi-day smart-card data. Statistical analysis, correlation matrices and network-based clustering are applied, and the potential usage of the measured results for urban applications is discussed. We take Singapore as a case study and use one week of smart-card data for analysis. An interesting finding is that though the number of trips and mobility patterns vary from day to day, the overall spatial structure of urban movement remains the same throughout the whole week. We consider this paper a tentative step towards a generic framework for measuring regularity and variability, which contributes to the understanding of transit, social and urban dynamics.	Chen Zhong, Ed Manley, Michael Batty and Gerhard Schmitt
500	The Resilience of the Encounter Network of Commuters for a Metropolitan Public Bus System [abstract] Abstract: We analyse the structure and resilience of a massive encounter network generated from commuters who share the same bus ride on a single day. The network is created by using smartcard data that contains detailed travel information of all the commuters who utilised the public bus system during a typical weekday in the whole of Singapore. We show that the network structure is of random-exponential type with small world features rather than a scale-free network. Within one day, 99.97% of all commuters became connected approximately within 7 steps of each other. We report on how this network structure changes upon application of a threshold based on the encounter duration (TE). Among other results, we demonstrate a 50% reduction in the size of the giant cluster when TE=15mins. We then assess the dynamics of infection spreading by comparing the effect of both random and targeted node removal strategies. By assuming that the network characteristic is invariant day after day, our simulation indicates that without node removal, 99% of the commuter network became infected within 7 days of the onset of infection. While a targeted removal strategy was shown to be able to delay the onset of the maximum number of infected individuals, it was not able to isolate nodes that remained within the giant component.	Muhamad Azfar Ramli, Christopher Monterola
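As a rough illustration of the random versus targeted node removal compared in the abstract above, the sketch below measures the giant component of a toy graph after removing nodes. The degree-based targeting rule, the function names and the toy network are assumptions for illustration; the abstract does not state which targeting criterion the authors used.

```python
import random
from collections import deque


def giant_component_size(adjacency, removed=frozenset()):
    """Size of the largest connected component, ignoring removed nodes."""
    seen = set(removed)
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search over one component
            node = queue.popleft()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def targeted_removal(adjacency, count):
    """Assumed strategy: remove the `count` highest-degree nodes."""
    ranked = sorted(adjacency, key=lambda n: len(adjacency[n]), reverse=True)
    return frozenset(ranked[:count])


def random_removal(adjacency, count, seed=0):
    """Remove `count` nodes chosen uniformly at random (seeded for repeatability)."""
    return frozenset(random.Random(seed).sample(sorted(adjacency), count))


# Hypothetical toy network: node 0 is a hub; nodes 5 and 6 form a short chain.
toy = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0, 6}, 6: {5}}
intact = giant_component_size(toy)
after_targeted = giant_component_size(toy, targeted_removal(toy, 1))
after_random = giant_component_size(toy, random_removal(toy, 1))
```

In this toy case removing the single hub shrinks the giant component from 7 nodes to 2, whereas a random removal usually hits a leaf and leaves most of the network connected.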
84	Facilitating model reuse and integration in an urban energy simulation platform [abstract] Abstract: The need for more sustainable, liveable and resilient cities demands improved methods for studying urban infrastructures as integrated wholes. Progress in this direction would be aided by the ability to effectively reuse and integrate existing computational models of urban systems. Building on the concept of multi-model ecologies, this paper describes ongoing efforts to facilitate model reuse and integration in the Holistic Urban Energy Simulation (HUES) platform - an extendable simulation environment for the study of urban multi-energy systems. We describe the design and development of a semantic wiki as part of the HUES platform. The purpose of this wiki is to enable the sharing and navigation of model metadata - essential information about the models and datasets of the platform. Each model and dataset in the platform is represented in the wiki in a structured way to facilitate the identification of opportunities for model reuse and integration. As the platform grows, this will help to ensure that it develops coherently and makes efficient use of existing formalized knowledge. We present the core concepts of multi-model ecologies and semantic wikis, the current state of the platform and associated wiki, and a case study demonstrating their use and benefit.	Lynn Andrew Bollinger, Ralph Evins

Modeling and Simulation of Large-scale Complex Urban Systems (MASCUS) Session 2

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M110

Chair: Heiko Aydt

555	Reducing Computation Time with a Rolling Horizon Approach Applied to a MILP Formulation of Multiple Urban Energy Hub System [abstract] Abstract: The energy hub model is a powerful concept allowing the interactions of many energy conversion and storage systems to be optimized. Solving for the optimal configuration and operating strategy of an energy hub combining multiple energy sources for a whole year can become computationally demanding. Indeed, the effort to solve a mixed-integer linear programming (MILP) problem grows dramatically with the number of integer variables. This paper presents a rolling horizon approach applied to the optimisation of the operating strategy of an energy hub. The focus is on the computational time saving realized by applying a rolling horizon methodology to solve problems over many time periods. The choice of rolling horizon parameters is addressed, and the approach is applied to a model consisting of multiple energy hubs. This work highlights the potential to reduce the computational burden for the simulation of detailed optimal operating strategies without using typical-period representations. Results demonstrate the possibility of reducing the computational time required to solve energy optimisation problems by a factor of 15 to 100 without affecting the quality of the results.	Julien F. Marquant, Ralph Evins, Jan Carmeliet
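The rolling horizon idea in the abstract above can be sketched as a window generator: each sub-problem optimises a short look-ahead window, only the first part of its decisions is kept, and the window then moves forward. The parameter names `horizon` and `keep` and the example sizes are assumptions; the abstract does not give the parameters used, and the MILP solve itself is omitted.

```python
def rolling_horizon_windows(total_steps, horizon, keep):
    """Yield (start, optimise_until, commit_until) index triples.

    Each sub-problem optimises time steps [start, optimise_until), but only
    the decisions for [start, commit_until) are kept before rolling forward.
    """
    if not 0 < keep <= horizon:
        raise ValueError("need 0 < keep <= horizon")
    start = 0
    while start < total_steps:
        optimise_until = min(start + horizon, total_steps)
        commit_until = min(start + keep, total_steps)
        yield start, optimise_until, commit_until
        start = commit_until


# Hypothetical sizes: 10 time steps, 4-step look-ahead, keep 2 steps per solve.
windows = list(rolling_horizon_windows(total_steps=10, horizon=4, keep=2))
```

Each window contains far fewer integer variables than the full-period problem, which is where the time saving would come from; the trade-off is that decisions are made with limited foresight.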
307	Economic, Climate Change, and Air Quality Analysis of Distributed Energy Resource Systems [abstract] Abstract: This paper presents an optimisation model and cost-benefit analysis framework for the quantification of the economic, climate change, and air quality impacts of the installation of a distributed energy resource system in the area surrounding Paddington train station in London, England. A mixed integer linear programming model, called the Distributed Energy Network Optimisation (DENO) model, is employed to design the optimal energy system for the district. DENO is then integrated into a cost-benefit analysis framework that determines the resulting monetised climate change and air quality impacts of the optimal energy systems for different technology scenarios in order to determine their overall economic and environmental impacts.	Akomeno Omu, Adam Rysanek, Marc Stettler, Ruchi Choudhary
616	Towards a Design Support System for Urban Walkability [abstract] Abstract: In the paper we present an urban design support tool centered on pedestrian accessibility and walkability of places. Unlike standard decision support systems developed for the purpose of evaluating given pre-defined urban projects and designs, we address the inverse problem of having the software system itself generate hypotheses of projects and designs, given some (user-provided) objectives and constraints. Taking as a starting point a model for evaluating walkability, we construct a variant of a multi-objective genetic algorithm (specifically NSGA-II) to produce the frontier of non-dominated design alternatives that satisfy certain predefined constraints. By way of example, we briefly present an application of the system to a real urban area.	Ivan Blecic, Arnaldo Cecchini, Giuseppe A. Trunfio

Solving Problems with Uncertainties (SPU) Session 1

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M201

Chair: Vassil Alexandrov

455	An individual-centric probabilistic extension for OWL: Modelling the Uncertainness [abstract] Abstract: The theoretical benefits of semantics as well as their potential impact on IT are well known concepts, extensively discussed in the literature. As more and more systems are currently using or referring to semantic technologies, the challenging third version of the web (Semantic Web or Web 3.0) is progressively taking shape. On the other hand, apart from the relatively limited expressiveness of current concrete semantic technologies, theoretical models and research prototypes are actually overlooking a significant number of practical issues including, among others, consolidated mechanisms to manage and maintain vocabularies, shared notation systems and support for large-scale systems (Big Data). Focusing on the OWL model as the current reference technology to specify web semantics, in this paper we discuss the problem of approaching knowledge engineering exclusively according to a deterministic model, excluding a priori any kind of probabilistic semantics. Because of those limitations, most knowledge ecosystems that include probabilistic information at some level do not fit well inside OWL environments. Therefore, despite the big potential of OWL, a considerable number of applications are still using more classic data models or unnatural hybrid environments. But OWL, even with its intrinsic limitations, reflects a model flexible enough to support extensions and integrations. In this work we propose a simple statistical extension for the model that can significantly extend the expressiveness and the purpose of OWL.	Salvatore Flavio Pileggi
457	Relieving Uncertainty in Forest Fire Spread Prediction by Exploiting Multicore Architectures [abstract] Abstract: The most important aspect that affects the reliability of environmental simulations is the uncertainty on the parameter settings describing the environmental conditions, which may involve important biases between simulation and reality. To relieve such arbitrariness, a two-stage prediction method was developed, based on the adjustment of the input parameters according to the real observed evolution. This method enhances the quality of the predictions, but it is very demanding in terms of time and computational resources needed. In this work, we describe a methodology developed for response time assessment in the case of fire spread prediction, based on evolutionary computation. In addition, a parallelization of one of the most important fire spread simulators, FARSITE, was carried out to take advantage of multicore architectures. This allows us to design proper allocation policies that significantly reduce simulation time and reach successful predictions much faster. A multi-platform performance study is reported to analyze the benefits of the methodology.	Andrés Cencerrado, Tomàs Vivancos, Ana Cortés, Tomàs Margalef
723	Populations of models, Experimental Designs and coverage of parameter space by Latin Hypercube and Orthogonal Sampling [abstract] Abstract: In this paper we have used simulations to make a conjecture about the coverage of a $t$ dimensional subspace of a $d$ dimensional parameter space of size $n$ when performing $k$ trials of Latin Hypercube sampling. This takes the form $P(k,n,d,t)=1-e^{-k/n^{t-1}}$. We suggest that this coverage formula is independent of $d$ and this allows us to make connections between building Populations of Models and Experimental Designs. We also show that Orthogonal sampling is superior to Latin Hypercube sampling in terms of allowing a more uniform coverage of the $t$ dimensional subspace at the sub-block size level.	Bevan Thompson, Kevin Burrage, Pamela Burrage, Diane Donovan
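The conjectured coverage formula in the abstract above, $P(k,n,d,t)=1-e^{-k/n^{t-1}}$, can be checked numerically with a small simulation. In the sketch below, the choice of the first $t$ coordinates as the subspace, the grid interpretation of "size $n$", the seed and the parameter values are all assumptions for illustration.

```python
import math
import random


def conjectured_coverage(k, n, t):
    """Coverage formula quoted in the abstract: P = 1 - exp(-k / n**(t - 1))."""
    return 1.0 - math.exp(-k / n ** (t - 1))


def latin_hypercube(n, d, rng):
    """One Latin Hypercube trial: n points, each level used once per dimension."""
    columns = [rng.sample(range(n), n) for _ in range(d)]
    return list(zip(*columns))


def empirical_coverage(k, n, d, t, rng):
    """Fraction of the n**t cells in the first t dimensions hit by k trials."""
    covered = set()
    for _ in range(k):
        for point in latin_hypercube(n, d, rng):
            covered.add(point[:t])  # project onto the assumed t-dim subspace
    return len(covered) / n ** t


rng = random.Random(2015)
k, n, d, t = 10, 10, 3, 2
runs = [empirical_coverage(k, n, d, t, rng) for _ in range(200)]
mean_coverage = sum(runs) / len(runs)
predicted = conjectured_coverage(k, n, t)
```

The simulation never uses `d` beyond generating the points, which is consistent with the abstract's suggestion that coverage of the subspace does not depend on the full dimension.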
340	Analysis of Space-Time Structures Appearance for Non-Stationary CFD Problems [abstract] Abstract: The paper presents a combined approach to finding conditions for the appearance of space-time structures in non-stationary flows for CFD (computational fluid dynamics) problems. We consider different types of space-time structures, such as boundary layer separation, vortex zone appearance, appearance of oscillating regimes, transition from Mach reflection to regular reflection for shock waves, etc. The approach combines numerical solutions of inverse problems and parametric studies. Parallel numerical solutions are implemented. This approach is intended for fast approximate estimation of the dependence of unsteady flow structures on characteristic parameters (or determining parameters) in a certain class of problems. The numerical results are presented in the form of multidimensional data volumes. To find hidden dependencies in the volumes, multidimensional data processing and visualization methods should be applied. The approach is organized in a pipeline fashion. For certain classes of problems the approach allows the sought-for dependence to be obtained in a quasi-analytical form. The proposed approach can be considered to provide a kind of generalized numerical experiment environment. Examples of its application to a series of practical problems are given. The approach can be applied to CFD problems with ambiguities.	Alexander Bondarev, Vladimir Galaktionov

Solving Problems with Uncertainties (SPU) Session 2

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M201

Chair: Vassil Alexandrov

509	Discovering most significant news using Network Science approach [abstract] Abstract: The role of social network mass media has increased greatly in recent years. We investigate news publications on Twitter from the point of view of Network Science. We analyzed news data posted by the most popular media sources to reveal the most significant news over some period of time. Significance is a qualitative property that reflects the degree of impact of the news on society and public opinion. We define a threshold of significance and discover a number of news items that were significant for society in the period from July 2014 to January 2015.	Ilya Blokh, Vassil Alexandrov
713	Towards Understanding Uncertainty in Cloud Computing Resource Provisioning [abstract] Abstract: In spite of extensive research on uncertainty issues in fields ranging from computational biology to decision making in economics, the study of uncertainty in cloud computing systems is limited. Most works examine uncertainty phenomena in users’ perceptions of the qualities, intentions and actions of cloud providers, privacy, security and availability. But the role of uncertainty in resource and service provisioning, programming models, etc. has not yet been adequately addressed in the scientific literature. There are numerous types of uncertainties associated with cloud computing, and one should account for aspects of uncertainty when assessing efficient service provisioning. In this paper, we tackle the research question: what is the role of uncertainty in cloud computing service and resource provisioning? We review the main sources of uncertainty and fundamental approaches for scheduling under uncertainty such as reactive, stochastic, fuzzy, robust, etc. We also discuss the potential of these approaches for scheduling cloud computing activities under uncertainty, and address methods for mitigating job execution time uncertainty in resource provisioning.	Andrei Tchernykh, Uwe Schwiegelsohn, Vassil Alexandrov, El-Ghazali Talbi
507	Monte Carlo method for density reconstruction based on insufficient data [abstract] Abstract: In this work we consider the problem of reconstructing an unknown density based on a given sample. We present a method for density reconstruction which includes B-spline approximation, the least squares method and the Monte Carlo method for computing integrals. An error analysis is provided. The method is compared numerically with other statistical methods for density estimation and shows very promising results.	Aneta Karaivanova, Sofiya Ivanovska, Todor Gurov
20	Total Least Squares and Chebyshev Norm [abstract] Abstract: We investigate the total least squares problem with the Chebyshev norm instead of the traditionally used Frobenius norm. Using the Chebyshev norm is motivated by the search for robust solutions. In order to solve the problem, we make a link with interval computation and use many of the results developed there. We show that the problem is NP-hard in general, but it becomes polynomial in the case of a fixed number of regressors. This is the most important result for practice since usually we work with regression models with a low number of regression parameters (compared to the number of observations). We present not only a precise algorithm for the problem, but also a computationally cheap heuristic. We illustrate the behavior of our method in a particular probabilistic setup by a simulation study.	Milan Hladik, Michal Cerny

Mathematical Methods and Algorithms for Extreme Scale (MMAES) Session 1

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M201

Chair: Vassil Alexandrov

127	Efficient Algorithm for Computing the Ergodic Projector of Markov Multi-Chains [abstract] Abstract: This paper extends the Markov uni-chain series expansion theory to Markov multi-chains, i.e., to Markov chains having multiple ergodic classes and possible transient states. The introduced series expansion approximation (SEA) provides a controllable approximation for Markov multi-chain ergodic projectors, which may be a useful tool in large-scale network analysis. As we will illustrate by means of numerical examples, for large networks the new algorithm is faster than the power algorithm.	Joost Berkhout, Bernd Heidergott
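For orientation, the sketch below approximates the ergodic projector of a small multi-chain by taking powers of the transition matrix, which is one reading of the baseline "power algorithm" mentioned in the abstract above; it is not the SEA method of the paper. It assumes every ergodic class is aperiodic so that the powers converge, and the 4-state transition matrix is a made-up example with two ergodic classes and one transient state.

```python
def mat_mul(a, b):
    """Dense product of two square matrices stored as lists of lists."""
    size = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(size)) for j in range(size)]
            for i in range(size)]


def ergodic_projector_by_powers(p, tol=1e-12, max_squarings=30):
    """Approximate lim P^n by repeated squaring (assumes aperiodic classes)."""
    current = [row[:] for row in p]
    for _ in range(max_squarings):
        squared = mat_mul(current, current)
        change = max(abs(x - y)
                     for new_row, old_row in zip(squared, current)
                     for x, y in zip(new_row, old_row))
        current = squared
        if change < tol:  # stop early: further squaring only adds rounding error
            break
    return current


# Hypothetical chain: states 0-1 form one ergodic class, state 2 is absorbing,
# and state 3 is transient (it leaks into both classes).
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.2, 0.8, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.3, 0.0, 0.3, 0.4],
]
projector = ergodic_projector_by_powers(P)
```

Each row of the result gives the long-run distribution from that starting state; the transient row is a mixture of the two class-wise stationary distributions, weighted by the absorption probabilities.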
376	Transmathematical Basis of Infinitely Scalable Pipeline Machines [abstract] Abstract: A current Grand Challenge is to scale high-performance machines up to exascale. Here we take the theoretical approach of setting out the mathematical basis of pipeline machines that are infinitely scalable, whence any particular scale can be achieved as technology allows. We briefly discuss both hardware and software simulations of such a machine, which lead us to believe that exascale is technologically achievable now. The efficiency of von Neumann machines declines with increasing size but our pipeline machines retain constant efficiency regardless of size. These machines have perfect parallelism in the sense that every instruction of an inline program is executed, on successive data, on every clock tick. Furthermore programs with shared data effectively execute in less than a clock tick. We show that pipeline machines are faster than single or multi-core, von Neumann machines for sufficiently many program runs of a sufficiently time consuming program. Our pipeline machines exploit the totality of transreal arithmetic and the known waiting time of statically compiled programs to deliver the interesting property that they need no hardware or software exception handling.	James Anderson
420	Multilevel Communication optimal Least Squares [abstract] Abstract: Using a recently proposed communication optimal variant of TSQR, weak scalability of the least squares solver (LS) with multiple right hand sides is studied. The communication for the TSQR-based LS solver for multiple right hand sides remains optimal in the sense that no additional messages are necessary compared to TSQR. However, LS has additional communication volume and flops compared to TSQR. The additional flops and words sent for LS are derived. A PGAS model, namely the global address space programming framework (GPI), is used for inter-nodal one-sided communication. Within NUMA sockets, the C++11 threading model is used. Scalability results of the proposed method up to a few thousand cores are shown.	Pawan Kumar
406	Developing A Large Time Step, Robust, and Low Communication Multi-Moment PDE Integration Scheme for Exascale Applications [abstract] Abstract: The Boundary Averaged Multi-moment Constrained finite-Volume (BA-MCV) method is derived, explained, and evaluated for 1-D transport to assess accuracy, maximum stable time step (MSTS), oscillations for discontinuous data, and parallel communication burden. The BA-MCV scheme is altered from the original MCV scheme to compute the updates of point wise cell boundary derivatives entirely locally. Then it is altered such that boundary moments are replaced with the interface upwind value. The scheme is stable at a maximum stable CFL (MSCFL) value of one no matter how high-order the scheme is, giving significantly larger time steps than Galerkin methods, for which the MSCFL decreases nearly quadratically with increasing order. The BA-MCV method is compared against a SE method at varying order, both using the ADER-DT time discretization. BA-MCV error for a sine wave was comparable to the same order of accuracy for a SE method. The resulting large time step, multi-moment, low communication scheme is well suited for exascale architectures.	Matthew Norman

Urgent Computing - Computations for Decision Support in Critical Situations (UC) Session 1

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: V201

Chair: Alexander Boukhanovsky

728	Computational uncertainty management for coastal flood prevention system [abstract] Abstract: Multivariate and progressive uncertainty is the main factor of accuracy in simulation systems. It can be a critical issue for systems that forecast and prevent extreme events and related risks. To deal with this problem, computational uncertainty management strategies should be used. This paper aims to demonstrate an adaptation of the computational uncertainty management strategy in the framework of a system for prediction and prevention of such natural disasters as coastal floods. The main goal of the chosen strategy is to highlight the most significant ways of uncertainty propagation and to collocate blocks of action with procedures for reduction or evaluation of uncertainty in a way that catches the major part of model error. Blocks of action involve several procedures: calibration of models, data assimilation, ensemble forecasts, and various techniques for residual uncertainty evaluation (including risk evaluation). The strategy described in this paper was tested and proved based on a case study of the coastal flood prevention system in St. Petersburg.	Anna Kalyuzhnaya, Alexander Boukhanovsky
731	Computational uncertainty management for coastal flood prevention system. Part II: Diversity analysis [abstract] Abstract: Surge floods in Saint Petersburg are extreme natural phenomena of rare occurrence. Many studies have been devoted to the problems that arose during maintenance of the flood prevention facility complex in Saint Petersburg. However, many research questions connected with similar extreme events in the Baltic Sea remain open. In this work, to reconstruct surge floods of rare occurrence, two approaches were combined: statistical multidimensional extremum analysis and synthetic surge floods. A synthetic storm model that takes multidimensional probability distributions from reanalysis data was developed, and synthetic cyclone generation was proposed for its implementation.	Anna Kalyuzhnaya, Denis Nasonov, Alexander Visheratin, Alexey Dudko and Alexander Boukhanovsky
517	SIM-CITY: an e-Science framework for urban assisted decision support [abstract] Abstract: Urban areas are characterised by high population densities and the resulting complex social dynamics. For urban planners to evaluate, analyse, and predict complex urban dynamics, a lot of scenarios and a large parameter space must be explored. In urban disasters, complex situations must be assessed at short notice. We propose the concept of an assisted decision support system to aid in these situations. The system interactively runs a scenario exploration, which evaluates scenarios and optimizes for desired properties. We introduce the SIM-CITY architecture to run such interactive scenario explorations and highlight a use case for the architecture, an urban fire emergency response simulation in Bangalore.	Joris Borgdorff, Harsha Krishna, Michael H. Lees
297	Towards a general definition of Urgent Computing [abstract] Abstract: Numerical simulations of urgent events, e.g. tsunamis, storms and flash floods, must be completed within a stipulated deadline. The simulation results are needed by relevant authorities in making timely educated decisions to mitigate financial losses, manage affected areas and reduce casualties. The existing definition of urgent computing is too usage context specific and thus restricts the identification of urgent use cases and the general application of urgent computing. We aim to extend and refine the existing definition and provide a comprehensive general definition of urgent computing. This general definition will aid in the identification of urgent computing's unique challenges and thus demonstrate the need for innovative multi-disciplinary solutions to address these challenges.	Siew Hoon Leong, Dieter Kranzlmüller
375	Combining Data-driven Methods with Finite Element Analysis for Flood Early Warning Systems [abstract] Abstract: We developed a robust approach for real-time levee condition monitoring based on a combination of data-driven methods (one-side classification) and finite element analysis. It was implemented within a flood early warning system and validated on a series of full-scale levee failure experiments organised by the IJkdijk consortium in August-September 2012 in the Netherlands. Our approach has detected anomalies and predicted levee failures several days before the actual collapse. This approach was used in the UrbanFlood decision support system for routine levee quality assessment and for critical situations of a potential levee breach and inundation. In case of emergency, the system generates an alarm, warns dike managers and city authorities, and launches advanced urgent simulations of levee stability and flood dynamics, thus helping to make informed decisions on preventive measures, to evaluate the risks and to alleviate adverse effects of a flood.	A.L. Pyayt, D.V. Shevchenko, A.P. Kozionov, I.I. Mokhov, B. Lang, V.V. Krzhizhanovskaya, P.M.A. Sloot

Urgent Computing - Computations for Decision Support in Critical Situations (UC) Session 2

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: V201

Chair: Alexander Boukhanovsky

725	Evolutionary replicative data reorganization with prioritization for efficient workload processing [abstract] Abstract: Nowadays the importance of data collection, processing, and analysis is growing tremendously. Big Data technologies are in high demand in different areas, including bio-informatics, hydrometeorology, high energy physics, etc. One of the most popular computation paradigms used in large data processing frameworks is the MapReduce programming model. Today, integrated optimization mechanisms that take into account only load balance and fast, simple execution are not enough for advanced computations, and more efficient complex approaches are needed. In this paper, we suggest an improved algorithm based on categorization for data reorganization in MapReduce frameworks using replication and network aspects. Moreover, for urgent computations that require a specific approach, prioritization customization is introduced.	Denis Nasonov, Anton Spivak, Andrew Razumovskiy, Anton Myagkov
727	Multiscale agent-based simulation in large city areas: emergency evacuation use case [abstract] Abstract: Complex phenomena are increasingly attracting the interest of researchers from various branches of computational science. So far, this interest has driven the demand not only for more sophisticated autonomous models, but also for mechanisms that would associate them. This paper presents a multiscale agent-based modelling and simulation technique based on the incorporation of multiple modules. Two key principles guide such an integration: a common abstract space, in which entities of different models interact, and commonly controlled agents, abstract actors operating in the common space that can be handled by different agent-based models. The proposed approach is evaluated through a series of experiments simulating the emergency evacuation from a cinema building to the city streets, where building and street levels are reproduced in heterogeneous models.	Vladislav Karbovskii, Daniil Voloshin, Andrey Karsakov, Alexey Bezgodov, Aleksandr Zagarskikh
550	Execution management and efficient resource provisioning for flood decision support [abstract] Abstract: We present a resource provisioning and execution management solution for a flood decision support system. The system, developed within the ISMOP project, features an urgent computing scenario in which flood threat assessment for large sections of levees is requested within a specified deadline. Unlike typical decision support systems, which utilize heavyweight simulations in order to predict the possible course of an emergency, in ISMOP we employ an alternative approach based on the 'scenario identification' method. We show that this approach is a particularly good fit for the resource provisioning model of IaaS Clouds. We describe the architecture of the ISMOP decision support system, focusing on the urgent computing scenario and its formal resource provisioning model. Preliminary results of experiments performed in order to calibrate and validate the model indicate that the model fits experimental data.	Bartosz Balis, Marek Kasztelnik, Maciej Malawski, Piotr Nowakowski, Bartosz Wilk, Maciej Pawlik, Marian Bubak
726	Holistic approach to urgent computing for flood decision support [abstract] Abstract: This paper presents the concept of a holistic approach to urgent computing, which extends resource management in emergency situations from computational resources to the Data Acquisition and Preprocessing System. The layered structure of this system is presented in detail and its rearrangement in case of emergency is proposed. This process is harmonised with large scale computation using the Urgent Service Profile. The proposed approach was validated by practical work performed under the ISMOP project. Concrete examples of Urgent Service Profile definitions are discussed. Results of preliminary experiments related to energy management and data transmission optimization in case of emergency are presented.	Robert Brzoza-Woch, Marek Konieczny, Bartosz Kwolek, Piotr Nawrocki, Tomasz Szydło, Krzysztof Zieliński
327	3D simulation system to support the planning of rescue operations on damaged ships [abstract] Abstract: The paper describes a software system to simulate ship motions in a crisis situation. The scenario consists of the damaged ship subjected to wave excitation forces generated by a random sea based on a real wave spectrum. The simulation is displayed in an interactive Virtual Environment allowing the visualization of the ship motions. The numerical simulation of the sea surface and ship motions requires intensive computation to maintain real-time or even fast-forward simulations, which are the only ones of interest for these situations. Dedicated tools to analyse the ship behaviour in time are also described. The system can be useful to evaluate the responses of the ship to the current sea state, namely the amplitude, variations and tendencies of ship motions, and help the planning and coordination of rescue operations.	Jose Varela, José Miguel Rodrigues, Carlos Guedes Soares

Bridging the HPC Talent Gap with Computational Science Research Methods (BRIDGE) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M110

Chair: Nia Alexandrov

589	Computational Science Research Methods for Science Education at PG level [abstract] Abstract: The role of teaching Computational Science research methods to science students at PG level is to enhance their research profile by developing their abilities to investigate complex problems, analyse the resulting data and make adequate use of HPC environments and tools for computation and visualisation. The paper analyses the current state and proposes a program that encompasses mathematical modelling, data science, advanced algorithms development, parallel programming and visualisation tools. It also gives examples of specific scientific domains with explicitly taught and embedded Computational Science subjects.	Nia Alexandrov
717	A New Canadian Interdisciplinary PhD in Computational Sciences [abstract] Abstract: In response to growing demands of society for experts trained in computational skills applied to various domains, the School of Computer Science at the University of Guelph is creating a new approach to doctoral studies called an Interdisciplinary PhD in Computational Sciences. The program is designed to appeal to candidates with strong backgrounds in either computer science or an application discipline who are not necessarily seeking a traditional academic career. Thesis based, it features minimal course requirements and short duration, with the student’s research directed by co-advisors from computer science and the application discipline. The degree program’s rationale and special characteristics are described. Related programs in Ontario and reception of this innovative proposal at the institutional level are discussed.	William Gardner, Gary Grewal, Deborah Stacey, David Calvert, Stefan Kremer and Fangju Wang
730	I have a DRIHM: A case study in lifting computational science services up to the scientific mainstream [abstract] Abstract: While we are witnessing a transition from petascale to exascale computing, we experience, when teaching students and scientists to adopt distributed computing infrastructures for computational sciences, what Geoffrey A. Moore once coined the chasm between the visionaries in computational sciences and the early majority of scientific pragmatists. Using the EU-funded DRIHM project (Distributed Research Infrastructure for Hydro-Meteorology) as a case study, we see that innovative research infrastructures have difficulty being accepted by the scientific pragmatists: The infrastructure services are not yet "mainstream". Excellence in workforces in computational sciences, however, can only be achieved if the tools are not only available but also used. In this paper we show for DRIHM how the chasm manifests itself and how it can be crossed.	Michael Schiffers, Nils Gentschen Felde, Dieter Kranzlmüller
335	Mathematical Modelling Based Learning Strategy [abstract] Abstract: Mathematical modelling is a difficult skill to acquire and transfer. In order to succeed in transferring the ability to model the observable world, the environment in which modelling is taught should resemble as much as possible the real environment in which students will live and work. We devised a learning strategy based on modelling environmental variables in order to link weather conditions to emergencies caused by pollutants in the atmosphere of the Monterrey, Mexico, metropolitan area. We structure course topics around a single comprehensive and integrative project. The objective of the project is to create a model that will predict the behavior of existing phenomena using real data. In this case, we used data collected by weather stations. This data consists of weather information such as temperature, pressure, humidity, wind speed and the like. It also contains information about pollutants such as O3, CO2, CO, SO2, NOx, particles, etc. Students follow a procedure consisting of 4 stages. In the first stage, they analyze the data: they try to reduce dimensionality, link weather variables to contaminants and determine characteristic behaviours. In the second stage, students interpolate missing data and project component data onto a 2D map of the metro area. In the third stage, students create the mathematical model by carrying out curve fitting with the least squares technique. In the fourth stage, students solve the model by finding roots, solving systems of equations, solving differential equations or integrating. The final deliverable is to determine under which weather conditions there can be an environmental contingency that puts people's health in danger. Class topics are taught in the order necessary to carry out the project. Any knowledge required for the project but not covered by the course syllabus is delivered through team presentations with worked-out examples. Analysis of the strategy is presented as well as preliminary results.	Raul Ramirez, Nia Alexandrov, José Raúl Pérez Cázares, Carlos Barba-Jimenez
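
The curve-fitting stage described in abstract 335 (fitting a model to station data with the least squares technique) can be sketched as follows. This is a minimal illustration and not the course's actual material: the straight-line model, the function name `fit_line`, and the sample values are all assumptions, and the numbers are made up for demonstration rather than taken from any weather station.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope) minimising the sum of squared residuals.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


if __name__ == "__main__":
    # Hypothetical readings: temperature (x) against a pollutant level (y).
    temperature = [18.0, 22.0, 26.0, 30.0, 34.0]
    pollutant = [31.0, 38.5, 47.0, 52.5, 61.0]
    intercept, slope = fit_line(temperature, pollutant)
    print(f"pollutant ~ {intercept:.2f} + {slope:.2f} * temperature")
```

A fitted model of this kind is what the final stage would then solve, for example by finding the temperature at which the predicted pollutant level crosses a chosen threshold.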

Bridging the HPC Talent Gap with Computational Science Research Methods (BRIDGE) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M110

Chair: Nia Alexandrov

576	Steps Towards Bridging the HPC and Computational Science Talent Gap Based on Ontology Engineering Methods [abstract] Abstract: The paper describes ontology-based methods and a framework for the design of learning courses covering the HPC and Big Data areas, and how to include these into Computational Science training within the remit of existing courses of the Master Programme entitled “Applied Mathematics and Computer Science” (Faculty of Mechanics and Mathematics, Perm State University, Russia). It helped bring together the university and IT companies around real industry projects in the field of Big Data with active participation of master’s students. In this paper, the visual tools and ontology-based methods for a computer-supported collaborative learning environment will also be presented.	Svetlana Chuprina
715	Developing High Performance Computing Resources for Teaching Cluster and Grid Computing courses [abstract] Abstract: High-Performance Computing (HPC) and the ability to process large amounts of data are of paramount importance for UK business and economy as outlined by Rt Hon David Willetts MP at the HPC and Big Data conference in February 2014. However there is a shortage of skills and available training in HPC to prepare and expand the workforce for the HPC and Big Data research and development. Currently, HPC skills are acquired mainly by students and staff taking part in HPC-related research projects, MSc courses, and at the dedicated training centres such as Edinburgh University’s EPCC. There are few UK universities teaching the HPC, Clusters and Grid Computing courses at the undergraduate level. To address the issue of skills shortages in the HPC it is essential to provide teaching and training as part of both postgraduate and undergraduate courses. The design and development of such courses is challenging since the technologies and software in the fields of large scale distributed systems such as Cluster, Cloud and Grid computing are undergoing continuous change. The students completing the HPC courses should be proficient in these evolving technologies and equipped with practical and theoretical skills for future jobs in this fast developing area. In this paper we present our experience in developing the HPC, Cluster and Grid modules including a review of existing HPC courses offered at the UK universities. The topics covered in the modules are described, as well as the coursework project based on practical laboratory work. We conclude with an evaluation based on our experience over the last ten years in developing and delivering the HPC modules on the undergraduate courses, with suggestions for future work.	Violeta Holmes, Ibad Kureshi
524	Teaching Quantum Computing with the QuIDE Simulator [abstract] Abstract: Recently, the idea of quantum computation has become more and more popular and there are many attempts to build quantum computers. Therefore, there is a need to introduce this topic to regular students of computer science and engineering. In this paper we present a concept of a course powered by the Quantum Integrated Development Environment (QuIDE), a new quantum computer simulator that joins features of GUI based simulators with the interpreter and simulation library approaches. The idea of the course is to put together theoretical aspects with practical assignments realized on the QuIDE simulator. Such an approach enables studying a variety of topics in a way understandable for this category of students. The topics of the course included understanding the concept of quantum gates, registers and a series of algorithms: Deutsch and Bernstein-Vazirani Problems, Grover's Fast Database Search, Shor's Prime Factorization, Quantum Teleportation and Quantum Dense Coding. We describe results of the QuIDE assessment during the course; our solution scored more points in the System Usability Scale survey than the other tool previously used for that purpose. We also show that the most useful features of such a tool indicated by students are similar to the assumptions made on the simulator functionality.	Katarzyna Rycerz, Joanna Patrzyk, Bartłomiej Patrzyk, Marian Bubak
577	Using Scientific Visualization Tools to Bridge the Talent Gap [abstract] Abstract: In this paper the use of adaptive scientific visualization tools in education, including in the area of high performance computing education, is proposed in order to help students understand in depth the nature of particular scientific problems and to help them learn parallel computing approaches to solving these problems. The proposed approach may help to bridge the talent gap in natural and computational sciences, since high quality visualization can help to uncover hidden regularities in the data with which the researchers and students work and can lead to a new level of understanding of how the data can be partitioned and processed in parallel. A multiplatform client-server scientific visualization system is presented that can be easily integrated with third-party solvers from any field of science. This system can be used as a visual aid and a collaboration tool in high performance computing education.	Svetlana Chuprina, Konstantin Ryabinin

Numerical and computational developments to advance multi-scale Earth System Models (MSESM) Session 1

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M208

Chair: K.J. Evans

141	Progress in Fast, Accurate Multi-scale Climate Simulations [abstract] Abstract: We present a survey of physical and computational techniques that have the potential to contribute to the next generation of high-fidelity, multi-scale climate simulations. Examples of the climate science problems that can be investigated with more depth include the capture of remote forcings of localized hydrological extreme events, an accurate representation of cloud features over a range of spatial and temporal scales, and parallel, large ensembles of simulations to more effectively explore model sensitivities and uncertainties. Numerical techniques, such as adaptive mesh refinement, implicit time integration, and separate treatment of fast physical time scales are enabling improved accuracy and fidelity in simulation of dynamics and allow more complete representations of climate features at the global scale. At the same time, partnerships with computer science teams have focused on taking advantage of evolving computer architectures, such as many-core processors and GPUs, so that these approaches that were previously prohibitively costly have become both more efficient and scalable. In combination, progress in these three critical areas is poised to transform climate modeling in the coming decades.	William Collins, Katherine Evans, Hans Johansen, Carol Woodward, Peter Caldwell
107	Parallel Performance Optimizations on Unstructured Mesh-Based Simulations [abstract] Abstract: This paper addresses two key parallelization challenges in the unstructured mesh-based ocean modeling code, MPAS-Ocean, which uses a mesh based on Voronoi tessellations: (1) load imbalance across processes, and (2) unstructured data access patterns that inhibit intra- and inter-node performance. Our work analyzes the load imbalance due to naive partitioning of the mesh, and develops methods to generate mesh partitioning with better load balance and reduced communication. Furthermore, we present methods that minimize both inter- and intra-node data movement and maximize data reuse. The techniques include predictive ordering of data elements for higher cache efficiency, as well as communication reduction approaches. We present detailed performance data when running on thousands of cores using the Cray XC30 supercomputer and show that our optimization strategies can exceed the original performance by over 2x. Additionally, many of these solutions can be broadly applied to a wide variety of unstructured grid-based computations.	Abhinav Sarje, Sukhyun Song, Douglas Jacobsen, Kevin Huck, Jeffrey Hollingsworth, Allen Malony, Samuel Williams, Leonid Oliker
565	A Higher-Order Finite Volume Nonhydrostatic Dynamical Core with Space-Time Refinement [abstract] Abstract: We present an adaptive non-hydrostatic dynamical core based on a higher-order finite volume discretization on the cubed sphere. Adaptivity is both in space, using nested horizontal refinement; and in time, using subcycling in refined regions. The algorithm is able to maintain scalar conservation with careful flux construction at refinement boundaries, as well as conservative coarse-fine interpolation. We show results for simple tests as well as more challenging ones that highlight the benefits of refinement.	Hans Johansen

Numerical and computational developments to advance multi-scale Earth System Models (MSESM) Session 2

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M208

Chair: K.J. Evans

97	On the scalability of the Albany/FELIX first-order Stokes approximation ice sheet solver for large-scale simulations of the Greenland and Antarctic ice sheets [abstract] Abstract: We examine the scalability of the recently developed Albany/FELIX finite-element based code for the first-order Stokes momentum balance equations for ice flow [1]. We focus our analysis on the performance of two possible preconditioners for the iterative solution of the sparse linear systems, which arise from the discretization of the governing equations: (1) a preconditioner based on the incomplete LU (ILU) factorization, and (2) a recently-developed algebraic multi-level (ML) preconditioner, constructed using the idea of semi-coarsening. A strong scalability study on a realistic, high resolution Greenland ice sheet problem reveals that, for a given number of processor cores, the ML preconditioner results in faster linear solve times but the ILU preconditioner exhibits better scalability. A weak scalability study is performed on a realistic, moderate resolution Antarctic ice sheet problem, a substantial fraction of which contains floating ice shelves, making it fundamentally different from the Greenland ice sheet problem. Here, we show that as the problem size increases, the performance of the ILU preconditioner deteriorates whereas the ML preconditioner maintains scalability. This is because the linear systems are extremely ill-conditioned in the presence of floating ice shelves, and the ill-conditioning has a greater negative effect on the ILU preconditioner than on the ML preconditioner. [1] I. Kalashnikova, M. Perego, A. Salinger, R. Tuminaro, and S. Price. Albany/FELIX: A parallel, scalable and robust finite element higher-order stokes ice sheet solver built for advance analysis. Geosci. Model Develop. Discuss., 7:8079-8149, 2014.	Irina Kalashnikova, Raymond Tuminaro, Mauro Perego, Andrew Salinger, Stephen Price
145	On the Use of Finite Difference Matrix-Vector Products in Newton-Krylov Solvers for Implicit Climate Dynamics with Spectral Elements [abstract] Abstract: Efficient solutions of global climate models require effectively handling disparate length and time scales. Implicit solution approaches allow time integration of the physical system with a step size governed by accuracy of the processes of interest rather than by stability of the fastest time scales present. Implicit approaches, however, require the solution of nonlinear systems within each time step. Usually, Newton's method is applied to solve these systems. Each iteration of Newton's method, in turn, requires the solution of a linear model of the nonlinear system. This model employs the Jacobian of the problem-defining nonlinear residual, but this Jacobian can be costly to form. If a Krylov linear solver is used for the solution of the linear system, the action of the Jacobian matrix on a given vector is required. In the case of spectral element methods, the Jacobian is not calculated but only implemented through matrix-vector products. The matrix-vector multiply can also be approximated by a finite difference approximation, which may introduce inaccuracy in the overall nonlinear solver. In this paper, we review the advantages and disadvantages of finite difference approximations of these matrix-vector products for climate dynamics within the spectral element shallow water dynamical core of the Community Atmosphere Model (CAM).	Carol Woodward, David Gardner, Katherine Evans
503	Accelerating Time Integration for Climate Modeling Using GPUs [abstract] Abstract: The push towards larger and larger computational platforms has made it possible for climate simulations to resolve climate dynamics across multiple spatial and temporal scales. This direction in climate simulation has created a strong need to develop scalable time stepping methods capable of accelerating throughput on high performance computing. This work details the recent advances in the implementation of implicit time stepping of the spectral element dynamical core within the United States Department of Energy (DOE) Accelerated Climate Model for Energy (ACME) on graphics processing unit (GPU) based machines. We demonstrate how solvers in the Trilinos project are interfaced with ACME and GPU kernels to increase computational speed of the residual calculations in the implicit time stepping method for the atmosphere dynamics. We show the optimization gains and data structure reorganization that facilitate the performance improvements.	Rick Archibald, Katherine Evans, Andrew Salinger
543	A Time-Split Discontinuous Galerkin Transport Scheme for Global Atmospheric Model [abstract] Abstract: A time-split transport scheme has been developed for the high-order multiscale atmospheric model (HOMAM). The spatial discretization of HOMAM is based on the discontinuous Galerkin method, combining the 2D horizontal elements on the cubed-sphere surface and 1D vertical elements in a terrain-following height-based coordinate. The accuracy of the time-splitting scheme is tested with a set of new benchmark 3D advection problems. The split time-integrators are based on the Strang-type operator-split method. The convergence of standard error norms shows second-order accuracy with the smooth scalar field, irrespective of the particular time-integrator. The results with the split scheme are comparable with those of the established models.	Ram Nair, Lei Bao, Michael Toy
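
Abstract 145 in this session concerns approximating the Jacobian-vector product inside a Newton-Krylov solver by a finite difference of the nonlinear residual, J(u) v ≈ (F(u + εv) − F(u)) / ε. The sketch below illustrates that idea only and is not code from CAM or any solver library. The residual function, the helper names, and the heuristic for choosing ε are assumptions made for the example.

```python
import math
import sys


def residual(u):
    """Hypothetical nonlinear residual F(u), applied componentwise."""
    return [x**3 + 2.0 * x - 1.0 for x in u]


def norm2(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(a * a for a in x))


def fd_jacobian_vector_product(F, u, v, eps=None):
    """Approximate J(u) @ v by a forward difference of the residual F.

    No Jacobian matrix is formed: only two residual evaluations are needed.
    """
    norm_v = norm2(v)
    if norm_v == 0.0:
        return [0.0] * len(u)
    if eps is None:
        # ASSUMPTION: a common heuristic balancing truncation and rounding error.
        eps = math.sqrt(sys.float_info.epsilon) * (1.0 + norm2(u)) / norm_v
    f_base = F(u)
    f_pert = F([ui + eps * vi for ui, vi in zip(u, v)])
    return [(fp - fb) / eps for fp, fb in zip(f_pert, f_base)]


if __name__ == "__main__":
    u = [0.5, 1.0, -2.0]
    v = [1.0, -1.0, 0.5]
    approx = fd_jacobian_vector_product(residual, u, v)
    # For this residual the Jacobian is diagonal: dF_i/du_i = 3 u_i^2 + 2.
    exact = [(3.0 * ui**2 + 2.0) * vi for ui, vi in zip(u, v)]
    print("finite difference:", approx)
    print("analytic         :", exact)
```

The difference between the two printed vectors is the kind of inaccuracy the abstract refers to: the approximation avoids forming the Jacobian but carries an error that depends on the choice of ε.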

Numerical and computational developments to advance multi-scale Earth System Models (MSESM) Session 3

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M208

Chair: K.J. Evans

321	Analysis of ocean-atmosphere coupling algorithms: consistency and stability [abstract] Abstract: This paper is focused on the numerical and computational issues associated with ocean-atmosphere coupling. It is shown that the usual coupling methods do not provide the solution to the correct problem but to a nearby one, since they are equivalent to performing a single iteration of an iterative coupling method. The stability analysis of these ad-hoc methods is presented, and we motivate and propose the adaptation of a Schwarz domain decomposition method to ocean-atmosphere coupling to obtain a stable and consistent coupling method.	Florian Lemarie, Eric Blayo, Laurent Debreu
658	Exploring the Effects of a High-Order Vertical Coordinate in a Non-Hydrostatic Global Model [abstract] Abstract: As atmospheric models are pushed towards non-hydrostatic resolutions, there is a growing need for new numerical discretizations that are accurate, robust and effective at these scales. In this paper we describe a new arbitrary-order staggered nodal finite-element method (SNFEM) vertical discretization motivated by the flux reconstruction formulation. The SNFEM formulation generalizes traditional second-order vertical discretizations, including Lorenz and Charney-Phillips discretizations, to arbitrary order-of-accuracy while preserving desirable properties such as energy conservation. Preliminary results from application of this method to an idealized baroclinic instability are given, demonstrating the effect of improvements in order of accuracy on the structure of the instability.	Paul Ullrich, Jorge Guerra
494	High-Order / Low-Order Methods for Ocean Modeling [abstract] Abstract: We examine a High Order / Low Order (HOLO) approach for a z-level ocean model and show that the traditional semi-implicit and split-explicit methods, as well as a recent preconditioning strategy, can easily be cast in the framework of HOLO methods. The HOLO formulation admits an implicit-explicit method that is algorithmically scalable and second-order accurate, allowing timesteps much larger than the barotropic time scale. We show how HOLO approaches, in particular the implicit-explicit method, can provide a solid route for ocean simulation to heterogeneous computing and exascale environments.	Chris Newman, Geoff Womeldorff, Luis Chacon, Dana Knoll
134	Aeras: A Next Generation Global Atmosphere Model [abstract] Abstract: Sandia National Laboratories is developing a new global atmosphere model named Aeras that is performance portable and supports the quantification of uncertainties. These next-generation capabilities are enabled by building Aeras on top of Albany, a code base that supports the rapid development of scientific application codes while leveraging Sandia's foundational mathematics and computer science packages in Trilinos and Dakota. Embedded uncertainty quantification is an original design capability of Albany, and performance portability is a recent upgrade for Albany. Other required features, such as shell-type elements, spectral elements, efficient explicit and semi-implicit time-stepping, transient sensitivity analysis, and concurrent ensembles, were not components of Albany as the project began, and have been (or are being) added by the Aeras team. We present early sensitivity analysis and performance portability results for the shallow water equations.	William Spotz, Thomas Smith, Irina Demeshko, Jeffrey Fike

Poster Track (POSTER) Session 1

Time and Date: Before Lunch

Room: Solin 1st Floor

506	Numerical modelling of pollutant propagation in Lake Baikal during the spring thermal bar [abstract] Abstract: In this paper, the phenomenon of the thermal bar in Lake Baikal and the propagation of pollutants from the Selenga River are studied with a nonhydrostatic mathematical model. An unsteady flow is simulated by solving numerically a system of thermal convection equations in the Boussinesq approximation using second-order implicit difference schemes in both space and time. To calculate the velocity and pressure fields in the model, an original procedure for buoyant flows, SIMPLED, which is a modification of the well-known Patankar and Spalding's SIMPLE algorithm, has been developed. The simulation results have shown that the thermal bar plays a key role in propagation of pollution in the area of Selenga River inflow into Lake Baikal.	Bair Tsydenov, Anthony Kay, Alexander Starchenko
730	I have a DRIHM: A case study in lifting computational science services up to the scientific mainstream [abstract] Abstract: While we are witnessing a transition from petascale to exascale computing, we experience, when teaching students and scientists to adopt distributed computing infrastructures for computational sciences, what Geoffrey A. Moore once coined the chasm between the visionaries in computational sciences and the early majority of scientific pragmatists. Using the EU-funded DRIHM project (Distributed Research Infrastructure for Hydro-Meteorology) as a case study, we see that innovative research infrastructures have difficulties to be accepted by the scientific pragmatists: The infrastructure services are not yet "mainstream". Excellence in workforces in computational sciences, however, can only be achieved if the tools are not only available but also used. In this paper we show for DRIHM how the chasm exhibits and how it can be crossed.	Michael Schiffers, Nils Gentschen Felde, Dieter Kranzlmüller
523	Random Set Method Application to Flood Embankment Stability Modelling [abstract] Abstract: In this work the application of random set theory to flood embankment stability modelling is presented. The objective of this paper is to illustrate a method of uncertainty analysis in a real geotechnical problem.	Anna Pięta, Krzysztof Krawiec
260	MPJ Express Meets YARN: Towards Java HPC on Hadoop Systems [abstract] Abstract: Many organizations, including academic, research, and commercial institutions, have invested heavily in setting up High Performance Computing (HPC) facilities for running computational science applications. On the other hand, the Apache Hadoop software, after emerging in 2005, has become a popular, reliable, and scalable open-source framework for processing large-scale data (Big Data). Realizing the importance and significance of Big Data, an increasing number of organizations are investing in relatively cheaper Hadoop clusters for executing their mission critical data processing applications. An issue here is that system administrators at these sites might have to maintain two parallel facilities for running HPC and Hadoop computations. This, of course, is not ideal due to redundant maintenance work and poor economics. This paper attempts to bridge this gap by allowing HPC and Hadoop jobs to co-exist on a single hardware facility. We achieve this goal by exploiting YARN (Hadoop v2.0), which decouples the computational and resource scheduling part of the Hadoop framework from HDFS. In this context, we have developed a YARN-based reference runtime system for the MPJ Express software that allows executing parallel MPI-like Java applications on Hadoop clusters. The main contribution of this paper is to provide the Big Data community access to MPI-like programming using MPJ Express. As an aside, this work allows parallel Java applications to perform computations on data stored in the Hadoop Distributed File System (HDFS).	Hamza Zafar, Farrukh Aftab Khan, Bryan Carpenter, Aamir Shafi, Asad Waqar Malik
393	Scalable Multilevel Support Vector Machines [abstract] Abstract: Solving different types of optimization models (including parameter fitting) for support vector machines on large-scale training data is often an expensive computational task. This paper proposes a multilevel algorithmic framework that scales efficiently to very large data sets. Instead of solving the whole training set in one optimization process, the support vectors are obtained and gradually refined at multiple levels of coarseness of the data. The proposed framework includes: (a) construction of a hierarchy of large-scale data coarse representations, and (b) a local processing of updating the hyperplane throughout this hierarchy. Our multilevel framework substantially improves the computational time without losing the quality of classifiers. The algorithms are demonstrated for both regular and weighted support vector machines. Experimental results are presented for balanced and imbalanced classification problems. Quality improvement on several imbalanced data sets has been observed.	Talayeh Razzaghi, Ilya Safro
407	Arbitrarily High-Order-Accurate, Hermite WENO Limited, Boundary-Averaged Multi-Moment Constrained Finite-Volume (BA-MCV) Schemes for 1-D Transport [abstract] Abstract: This study introduces the Boundary Averaged Multi-moment Constrained finite-Volume (BA-MCV) scheme for 1-D transport with Hermite Weighted Essentially Non-Oscillatory (HWENO) limiting using the ADER Differential Transform (ADER-DT) time discretization. The BA-MCV scheme evolves a cell average using a Finite-Volume (FV) scheme, and it adds further constraints as point wise derivatives of the state at cell boundaries, which are evolved in strong form using PDE derivatives. The resulting scheme maintains a Maximum Stable CFL (MSCFL) value of one no matter how high-order the scheme is. Also, parallel communication requirements are very low and will be described. Using test cases of a function with increasing steepness, the accuracy of the BA-MCV method will be tested in a limited and non-limited context for varying levels of smoothness. Polynomial $h$-refinement convergence and exponential $p$-refinement convergence will be demonstrated. The overall ADER-DT + BA-MCV + HWENO scheme is a scalable and larger time step alternative to Galerkin methods for multi-moment fluid simulation in climate and weather applications.	Matthew Norman
434	A Formal Method for Parallel Genetic Algorithms [abstract] Abstract: We present a formal model for analyzing nontrivial properties of the behavior of parallel genetic algorithms implemented using multi-islands. The model is based on a probabilistic labeled transition system that represents the evolution of the population in each island, as well as the interaction among different islands. By studying the traces these systems can perform, the resulting model makes it possible to formally compare the behavior of different algorithms.	Natalia Lopez, Pablo Rabanal, Ismael Rodriguez, Fernando Rubio
484	Comparison of Two Diversification Methods to Solve the Quadratic Assignment Problem [abstract] Abstract: The quadratic assignment problem is one of the most studied NP-hard problems. It is known for its complexity, which makes it a good candidate for parallel design. In this paper, we propose and analyze two parallel cooperative algorithms based on hybrid iterative tabu search. The only difference between the two approaches is the diversification method. On 15 of the hardest well-known instances from the QAPLIB benchmark, our algorithms produce competitive results. This experimentation shows that our propositions can exceed or equal several leading algorithms from the literature in almost all the hardest benchmark instances.	Omar Abdelkafi, Lhassane Idoumghar, Julien Lepagnot
522	A Matlab toolbox for Kriging metamodelling [abstract] Abstract: Metamodelling offers an efficient way to imitate the behaviour of computationally expensive simulators. Kriging based metamodels are popular in approximating computation-intensive simulations of deterministic nature. Irrespective of the existence of various variants of Kriging in the literature, only a handful of Kriging implementations are publicly available and most, if not all, free libraries only provide the standard Kriging metamodel. ooDACE toolbox offers a robust, flexible and easily extendable framework where various Kriging variants are implemented in an object-oriented fashion under a single platform. This paper presents an incremental update of the ooDACE toolbox introducing an implementation of Gradient Enhanced Kriging which has been tested and validated on several engineering problems.	Selvakumar Ulaganathan, Ivo Couckuyt, Dirk Deschrijver, Eric Laermans, Tom Dhaene
607	Improving Transactional Memory Performance for Irregular Applications [abstract] Abstract: Transactional memory (TM) offers optimistic concurrency support in modern multicore architectures, helping programmers to extract parallelism in irregular applications when data dependence information is not available before runtime. In fact, recent research focuses on exploiting thread-level parallelism using TM approaches. However, the proposed techniques are of general use, valid for any type of application. This work presents ReduxSTM, a software TM system specially designed to extract maximum parallelism from irregular applications. Commit management and conflict detection were tailored to take advantage of both transaction ordering constraints, to assure correct results, and the existence of (partial) reduction patterns, a very frequent memory access pattern in irregular applications. Both facts are used to avoid unnecessary transaction aborts. A function in the 300.twolf package from SPEC CPU2000 was taken as a motivating irregular program. This code was parallelized using ReduxSTM and an ordered version of TinySTM, a state-of-the-art TM system. The experimental evaluation shows that our proposed TM system exploits more parallelism from the sequential program and obtains better performance than the other system.	Manuel Pedrero, Eladio Gutiérrez, Sergio Romero, Oscar Plata
635	Building Java Intelligent Applications Data Mining for Java Type-2 Fuzzy Inference Systems [abstract] Abstract: This paper introduces JT2FISClustering, a data mining extension for JT2FIS. JT2FIS is a Java class library for building intelligent applications. This extension is used to extract information from a data set and transform it into an Interval Type-2 Fuzzy Inference System in Java applications. Mamdani and Takagi-Sugeno Fuzzy Inference Systems can be generated using fuzzy c-means or subtractive data mining methods. We compare the outputs and performance of Matlab versus Java in order to validate the proposed extension.	Manuel Castañón-Puga, Josué-Miguel Flores-Parra, Juan Ramón Castro, Carelia Gaxiola-Pacheco, Luis Enrique Palafox-Maestre
639	The Framework for Rapid Graphics Application Development: The Multi-scale Problem Visualization. [abstract] Abstract: Interactive real-time visualization plays a significant role in the simulation research domain. Multi-scale problems are in need of high performance visualization with good quality, and the same could be said about other problem domains, e.g. big data analysis, physics simulation, etc. The state of the art shows that a universal tool for solving such problems is non-existent. Modern computer graphics requires enormous efforts to implement efficient algorithms on modern GPUs and GAPIs. In the first part of our paper we introduce a framework for rapid graphics application development and its extensions for multi-scale problem visualization. In the second part of the paper we provide a prototype of a multi-scale problem’s solution in simulation and monitoring of high-precision agent movements, starting from behavioral patterns in an airport and up to world-wide flight traffic. Finally we summarize our results and speculate about future investigations.	Alexey Bezgodov, Andrey Karsakov, Aleksandr Zagarskikh, Vladislav Karbovskii
29	A multiscale model for the feto-placental circulation in the monochorionic twin pregnancies [abstract] Abstract: We developed a mathematical model of monochorionic twin pregnancies to simulate both the normal gestation and the Twin-Twin Transfusion Syndrome (TTTS), a disease in which the interplacental anastomoses create a flow imbalance, causing one of the twins to receive too much blood and liquids, becoming hypertensive and polyhydramnios (the Recipient) and the other to become hypotensive and oligohydramnios (the Donor). This syndrome, if untreated, leads almost certainly to the death of one or both twins. We propose a compartment model to simulate the flows between the placenta and the fetuses and the accumulation of the amniotic fluid in the sacs. The aim of our work is to provide a simple but realistic model of the twins-mother system and to stress it by simulating the pathological cases and the related treatments, i.e. amnioreduction (elimination of the excess liquid in the recipient sac), laser therapy (removal of all the anastomoses) and other possible innovative therapies impacting on pressure and flow parameters.	Ilaria Stura, Pietro Gaglioti, Tullia Todros, Caterina Guiot
86	Sequential and Parallel Implementation of GRASP for the 0-1 Multidimensional Knapsack Problem [abstract] Abstract: The knapsack problem is a widely known problem in combinatorial optimization and has been the object of much research in the last decades. The problem has a great number of variants and obtaining an exact solution to any of these is not easily accomplished, which motivates the search for alternative techniques to solve the problem. Among these alternatives, metaheuristics seem to be suitable in the search for approximate solutions to the problem. In this work we propose a sequential and a parallel implementation for the multidimensional knapsack problem using the GRASP metaheuristic. The obtained results show that GRASP can lead to good quality results, even optimal in some instances, and that CUDA may be used to expand the neighborhood search and as a result may lead to improved quality results.	Bianca De Almeida Dantas, Edson Cáceres
89	Telescopic hybrid fast solver for 3D elliptic problems with point singularities [abstract] Abstract: This paper describes a telescopic solver for two dimensional h adaptive grids with point singularities. The input for the telescopic solver is an h refined two dimensional computational mesh with rectangular finite elements. The candidates for point singularities are first localized over the mesh by using a greedy algorithm. Having the candidates for point singularities, we execute either a direct solver, that performs multiple refinements towards selected point singularities and executes a parallel direct solver algorithm which has logarithmic cost with respect to refinement level. The direct solvers executed over each candidate for point singularity return local Schur complement matrices that can be merged together and submitted to iterative solver. In this paper we utilize a parallel logarithmic computational cost GPU solver or parallel multi-thread GALOIS solver as a direct solver. We use Incomplete LU Preconditioned Conjugated Gradients (ILUPCG) as an iterative solver. We also show that elimination of point singularities from the refined mesh reduces significantly the number of iterations to be performed by the ILUPCG iterative solver.	Anna Paszynska, Konrad Jopek, Krzysztof Banaś, Maciej Paszynski, Andrew Lenerth, Donald Nguyen, Keshav Pingali, Lisandro Dalcin, Victor Calo
95	Adapting map resolution to accomplish execution time constraints in wind field calculation [abstract] Abstract: Forest fires are natural hazards that every year destroy thousands of hectares around the world. Forest fire propagation prediction is a key point to fight against such hazards. Several models and simulators have been developed to predict forest fire propagation. These models require input parameters such as digital elevation map, vegetation map, and other parameters describing the vegetation and meteorological conditions. However, some meteorological parameters, such as wind speed and direction, change from one point to another one due to the effect of the topography of the terrain. Therefore, it is necessary to couple wind field models, such as WindNinja, to estimate the wind speed and direction at each point of the terrain. The output provided by the wind field simulator is used as input of the fire propagation model. Coupling wind field model and forest fire propagation model improves accuracy prediction, but increases significantly prediction time. This fact is critical since propagation prediction must be provided in advance to allow the control centers to manage firefighters in the best possible way. This work analyses WindNinja execution time, describes a WindNinja parallelisation based on map partitioning, determines the limitations of such methodology for large maps and presents an improvement based on adapting map resolution to accomplish execution time limitations.	Gemma Sanjuan, Tomas Margalef, Ana Cortes
103	Efficient BSP/CGM algorithms for the maximum subsequence sum and related problems [abstract] Abstract: Given a sequence of n numbers, with at least one positive value, the maximum subsequence sum problem consists in finding the contiguous subsequence with the largest sum or score, among all derived subsequences of the original sequence. Several scientific applications have used algorithms that solve the maximum subsequence sum. Particularly in Computational Biology, these algorithms can help in the tasks of identification of transmembrane domains and in the search for GC-content regions, a required activity in the operation of pathogenicity islands location. The sequential algorithm that solves this problem has O(n) time complexity. In this work we present BSP/CGM parallel algorithms to solve the maximum subsequence sum problem and three related problems: the maximum longest subsequence sum, the maximum shortest subsequence sum and the number of disjoint subsequences of maximum sum. To the best of our knowledge there are no parallel BSP/CGM algorithms for these related problems. Our algorithms use p processors and require O(n/p) parallel time with a constant number of communication rounds for the algorithm of the maximum subsequence sum and O(log p) communication rounds, with O(n/p) local computation per round, for the algorithms of the related problems. We implemented the algorithms on a cluster of computers using MPI and on a machine with GPU using CUDA, both with good speed-ups.	Anderson C. Lima, Edson N. Cáceres, Rodrigo G. Branco, Roussian R. A. Gaioso, Samuel B. Ferraz, Siang W. Song, Wellinton S. Martins
225	Fire Hazard Safety Optimisation for Building Environments [abstract] Abstract: This article provides a theoretical study for fire hazard safety in building environments. The working hypothesis is that the navigation costs and hazard spread are deterministically modeled and over time. Based on the dynamic navigation costs under fire hazard, the article introduces the notion of dynamic safety in a recursive manner. Then several theoretical results are proposed to calculate the dynamic safety over time and to establish that it represents the maximum amount of time to delay safely on nodes. Based on the recursive equations, an algorithm is proposed to calculate the dynamic safety and successor matrices. Finally, some experimental results are provided to illustrate the efficiency of the algorithm and to present a real case study.	Sabin Tabirca, Tatiana Tabirca, Laurence Yang
295	A Structuring Concept for Securing Modern Day Computing Systems [abstract] Abstract: Security within computing systems is ambiguous, proliferated through obscurity, a knowledgeable user, or plain luck. Presented is a novel concept for structuring computing systems to achieve a higher degree of overall system security through the compartmentalization and isolation of executed instructions for each component. Envisioned is a scalable model which focuses on lower level operations to alleviate the view of security as a binary outcome to that of a deterministic metric based on a set of independent characteristics.	Orhio Creado, Phu Dung Le, Jan Newmarch, Jeff Tan
323	Federated Big Data for resource aggregation and load balancing with DIRAC [abstract] Abstract: BigDataDIRAC is a Federated Big Data solution with a Distributed Infrastructure with Remote Agent Control (DIRAC) access point. Users have the opportunity to access multiple Big Data resources scattered in different geographical areas, such as access to grid resources. This approach opens the possibility of offering not only grid and cloud to the users, but also Big Data resources from the same DIRAC environment. We describe a system to allow access to a federation of Big Data resources, including load balancing, using DIRAC. Proof of concept is shown and load balancing performance evaluations are presented using several use cases supported by three computing centers in two countries, and with four Hadoop clusters.	Victor Fernandez, Víctor Méndez, Tomás F. Pena
324	Big Data Analytics Performance for Large Out-Of-Core Matrix Solvers on Advanced Hybrid Architectures [abstract] Abstract: This paper examines the performance of large Out-Of-Core matrices to assess the optimal Big Data system performance of advanced computer architectures, based on the performance evaluation of a large dense Lower-Upper Matrix Decomposition (LUD) employing a highly tuned, I/O managed, slab based LUD software package developed by the Lockheed Martin Corporation. We present extensive benchmark studies conducted with this package on UMBC’s Bluegrit and Bluewave clusters, and NASA-GSFC’s Discover cluster systems. Our results show that the single-node speedup achieved by Phi Coprocessors relative to the host SandyBridge CPUs is about 1.5X, which is an even smaller relative performance gain than in the studies published by F. Masci (Masci, 2013), who obtains a 2-2.5x performance gain. Surprisingly, the Westmere with the Tesla GPU scales comparably with the Sandy Bridge and the Phi Coprocessor up to 12 processes and then fails to continue to scale. Across 20 CPU nodes, SandyBridge obtains a uniform speedup of 0.5X over Westmere for problem sizes of 10K, 20K and 40K unknowns. With an Infiniband DDR, the performance of Nehalem processors is comparable to Westmere without the interconnect.	Raghavendra Rao, Milton Halem, John Dorband
352	A critical survey of data grid replication strategies based on data mining techniques [abstract] Abstract: Replication is one common way to effectively address challenges for improving the data management in data grids. It has attracted a great deal of attention from many researchers. Hence, a lot of work has been done and many strategies have been proposed. However, most of the existing replication strategies consider a single file-based granularity and do not take into account file access patterns or possible file correlations, even though file correlations are becoming an increasingly important consideration for performance enhancement in data grids. In this regard, the knowledge about file correlations can be extracted from historical and operational data using the techniques of the data mining field. Data mining techniques have proved to be a powerful tool facilitating the extraction of meaningful knowledge from large data sets. As a consequence of the convergence of data mining and data grid, mining grid data is an interesting research field which aims at analyzing grid systems with data mining techniques in order to efficiently discover new meaningful knowledge to enhance data management in data grids. More precisely, in this paper, the extracted knowledge is used to enhance replica management. Gaps in the current literature and opportunities for further research are presented. In addition, we propose a new guideline to data mining application in the context of data grid replication strategies. To the best of our knowledge, this is the first survey mainly dedicated to data grid replication strategies based on data mining techniques.	Tarek Hamrouni, Sarra Slimani, Faouzi Ben Charrada
428	Reduction of Computational Load for MOPSO [abstract] Abstract: The run time for many optimisation algorithms, particularly those that explicitly consider multiple objectives, can be impractically large when applied to real world problems. This paper reports an investigation into the behaviour of Multi-Objective Particle Swarm Optimisation (MOPSO) that seeks to reduce the number of objective function evaluations needed, without degrading solution quality. By restricting archive size and strategically reducing the trial solution population size, it has been found that the number of function evaluations can be reduced by 66.7% without significant reduction in solution quality. In fact, careful manipulation of algorithm operating parameters can even significantly improve solution quality.	Mathew Curtis, Andrew Lewis
501	The Effects of Hotspot Detection and Virtual Machine Migration Policies on Energy Consumption and Service Levels in the Cloud [abstract] Abstract: Cloud computing has received much attention among researchers lately. Managing Cloud resources efficiently necessitates effective policies that assign applications to hardware in a way that they require the least resources possible. Applications are first assigned to virtual machines which are subsequently placed on the most appropriate server host. If a server becomes overloaded, some of its virtual machines are reassigned. This process requires a hotspot detection mechanism in combination with techniques that select the virtual machine(s) to migrate. In this work we introduce two new virtual machine selection policies, Median Migration Time and Maximum Utilisation, and show that they outperform existing approaches on the criteria of minimising energy consumption, service level agreement violations and the number of migrations when combined with different hotspot detection mechanisms. We show that parametrising the hotspot detection policies correctly has a significant influence on the workload balance of the system.	S Sohrabi, I. Moser
614	Towards a Performance-realism Compromise in the Development of the Pedestrian Navigation Model [abstract] Abstract: Despite the emergence of new approaches and increasingly powerful processing resources, there are cases in the domain of pedestrian modeling that require maintaining a compromise between the computational performance and the realism of the behavior of the simulated agents. The present paper seeks to address this issue through comparative computational experiments and visual validation of the simulations using real-world data. The results show that a reasonable compromise may be reached in multi-level navigation incorporating both route planning and collision avoidance.	Daniil Voloshin, Vladislav Karbovskii, Dmitriy Rybokonenko
641	A Methodology for Designing Energy-Aware Systems for Computational Science [abstract] Abstract: Energy consumption is currently one of the main issues in large distributed systems. More specifically, the efficient management of energy without losing performance has become a hot topic in the field. Thus, the design of systems solving complex problems must take into account energy efficiency. In this paper we present a formal methodology to check the correctness, from an energy-aware point of view, of large systems, such as HPC clusters and cloud environments, dedicated to computational science. Our approach uses a simulation platform, to model and simulate computational science environments, and metamorphic testing, to check the correctness of energy consumption in these systems.	Pablo Cañizares, Alberto Núñez, Manuel Nuñez, J.Jose Pardo
528	Towards an automatic co-generator for manycores’ architecture and runtime: STHORM case-study [abstract] Abstract: The increasing design complexity of manycore architectures at the hardware and software levels requires powerful tools capable of validating every functional and non-functional property of the architecture. At the design phase, the chip architect needs to explore several parameters from the design space, and iterate on different instances of the architecture, in order to meet the defined requirements. Each new architectural instance requires the configuration and the generation of a new hardware model/simulator, its runtime, and the applications that will run on the platform, which is a very long and error-prone task. In this context, the IP-XACT standard has become widely used in the semiconductor industry to package IPs and provide a low-level SW stack to ease their integration. In this work, we present preliminary work on a methodology for automatically configuring and assembling an IP-XACT golden model and generating the corresponding manycore architecture HW model, low-level software runtime and applications. We use the STHORM manycore architecture and the HBDC application as a case study.	Charly Bechara, Karim Ben Chehida, Farhat Thabet
306	Enhancing ELM-based facial image classification by exploiting multiple facial views [abstract] Abstract: In this paper, we investigate the effectiveness of the Extreme Learning Machine (ELM) network in facial image classification. In order to enhance performance, we exploit knowledge related to the human face structure. We train a multi-view ELM network by employing automatically created facial regions of interest to this end. By jointly learning the network parameters and optimized network output combination weights, each facial region appropriately contributes to the final classification result. Experimental results on three publicly available databases show that the proposed approach outperforms facial image classification based on a single facial representation and on other facial region combination schemes.	Alexandros Iosifidis, Anastasios Tefas, Ioannis Pitas
429	Automatic Query Driven Data Modelling in Cassandra [abstract] Abstract: Non-relational databases have recently been the preferred choice when it comes to dealing with BigData challenges, but their performance is very sensitive to the chosen data organisations. We have seen differences of over 70 times in response time for the same query on different models. This brings users the need to be fully conscious of the queries they intend to serve in order to design their data model. The common practice then, is to replicate data into different models designed to fit different query requirements. In this scenario, the user is in charge of the code implementation required to keep consistency between the different data replicas. Manually replicating data in such high layers of the database results in a lot of squandered storage due to the underlying system replication mechanisms that are formerly designed for availability and reliability ends. In this paper, we propose and design a mechanism and a prototype to provide users with transparent management, where queries are matched with a well-performing model option. Additionally, we propose to do so by transforming the replication mechanism into a heterogeneous replication one, in order to avoid squandering disk space while keeping the availability and reliability features. The result is a system where, regardless of the query or model the user specifies, response time will always be that of an affine query.	Roger Hernandez, Yolanda Becerra, Jordi Torres, Eduard Ayguade
186	A clustering-based approach to static scheduling of multiple workflows with soft deadlines in heterogeneous distributed systems [abstract] Abstract: Typical patterns of using scientific workflow management systems (SWMS) include periodical executions of prebuilt workflows with precisely known estimates of tasks’ execution times. Combining such workflows into sets could sufficiently improve resulting schedules in terms of fairness and meeting users’ constraints. In this paper, we propose a clustering-based approach to static scheduling of multiple workflows with soft deadlines. This approach generalizes commonly used techniques of grouping and ordering of parts of different workflows. We introduce a new scheduling algorithm, MDW-C, for multiple workflows with soft deadlines and compare its effectiveness with task-based and workflow-based algorithms which we proposed earlier in [1]. Experiments with several types of synthetic and domain-specific test data sets showed the superiority of a mixed clustering scheme over task-based and workflow-based schemes. This was confirmed by an evaluation of proposed algorithms on a basis of the CLAVIRE workflow management platform.	Klavdiya Bochenina, Nikolay Butakov, Alexey Dukhanov, Denis Nasonov
268	Challenges and Solutions in Executing Numerical Weather Prediction in a Cloud Infrastructure [abstract] Abstract: Cloud Computing has emerged as an option to perform large-scale scientific computing. The elasticity of the cloud and its pay-as-you-go model present an interesting opportunity for applications commonly executed in clusters or supercomputers. This paper presents the challenges of migrating a numerical weather prediction (NWP) application to a cloud computing infrastructure and executing it there. We compared the execution of this High-Performance Computing (HPC) application in a local cluster and in the cloud using different instance sizes. The experiments demonstrate that processing and networking create a limiting factor, but that storing input and output datasets in the cloud presents an interesting option to share results and ease the deployment of a test-bed for a weather research platform. Results show that cloud infrastructure can be used as a viable HPC alternative for numerical weather prediction software.	Emmanuell Diaz Carreño, Eduardo Roloff, Philippe Navaux
325	Flexible Dynamic Time Warping for Time Series Classification [abstract] Abstract: Measuring the similarity or distance between two time series sequences is critical for the classification of a set of time series sequences. Given two time series sequences, X and Y, the dynamic time warping (DTW) algorithm can calculate the distance between X and Y. But the DTW algorithm may align some neighboring points in X to corresponding points which are far apart in Y. It may obtain an alignment with a higher score, but with less representative information. This paper proposes the flexible dynamic time warping (FDTW) method for measuring the similarity of two time series sequences. The FDTW algorithm adds an additional score as the reward for contiguously long one-to-one fragments. As the experimental results show, the DTW, DDTW and FDTW methods each outperform the others on some test sets. Combining the FDTW, DTW and DDTW methods into a classifier ensemble with a voting scheme yields a lower average error rate than that of each individual method.	Che-Jui Hsu, Kuo-Si Huang, Chang-Biau Yang, Yi-Pu Guo
511	Onedata - a Step Forward towards Globalization of Data Access for Computing Infrastructures [abstract] Abstract: To satisfy requirements of data globalization and high performance access in particular, we introduce the originally created onedata system which virtualizes storage systems provided by storage resource providers distributed globally. onedata introduces new data organization concepts together with providers' cooperation procedures that involve use of GlobalRegistry as a mediator. The most significant features include metadata synchronization and on-demand file transfer.	Lukasz Dutka, Michał Wrzeszcz, Tomasz Lichoń, Rafał Słota, Konrad Zemek, Krzysztof Trzepla, Łukasz Opioła, Renata Slota, Jacek Kitowski
536	Ocean forecast information system for emergency interventions [abstract] Abstract: The paper describes the computation and information system required to support fast and efficient operations in emergency situations in the marine environment. The most common cases that lead to the activation of emergency procedures are identified, and the main features of Search And Rescue (SAR) interventions are described in their evolution, together with the inputs and details that are required and the weaknesses that still exist. The improvements that can come from a more integrated information system, from the computation of the environmental conditions to the adoption of a dedicated graphical interface that provides all the necessary information in a clear and complete way, are also explained.	Roberto Vettor, Carlos Guedes Soares
682	Optimizing Performance of ROMS on Intel Xeon Phi [abstract] Abstract: ROMS (Regional Oceanic Modeling System) is an open-source ocean modeling system that is widely used by the scientific community. It uses a coarse-grained parallelization scheme which partitions the computational domain into tiles. ROMS operates on a lot of multi-dimensional arrays, which makes it an ideal candidate to gain from architectures with wide and powerful Vector Processing Units (VPU) such as Intel Xeon Phi. In this paper we present an analysis of the BENCHMARK application of ROMS and the issues affecting its performance on Xeon Phi. We then present an iterative optimization strategy for this application on Xeon Phi which results in a speed-up of over 2x compared to the baseline code in the native mode and 1.5x in symmetric mode.	Gopal Bhaskaran, Pratyush Gaurav
336	Fuzzy indication of reliability in metagenomics NGS data analysis [abstract] Abstract: NGS data processing in metagenomics studies has to deal with noisy data that can contain a large amount of reading errors which are difficult to detect and account for. This work introduces a fuzzy indicator of reliability technique to facilitate solutions to this problem. It includes modified Hamming and Levenshtein distance functions that are aimed to be used as drop-in replacements in NGS analysis procedures which rely on distances, such as phylogenetic tree construction. The distances utilise fuzzy sets of reliable bases or an equivalent fuzzy logic, potentially aggregating multiple sources of base reliability.	Milko Krachunov, Dimitar Vassilev, Maria Nisheva, Ognyan Kulev, Valeriya Simeonova, Vladimir Dimitrov
559	Pairwise genome comparison workflow in the Cloud using Galaxy [abstract] Abstract: Workflows are becoming the new paradigm in bioinformatics. In general, bioinformatics problems are solved by interconnecting several small software pieces to perform complex analyses. This demands a minimal expertise to create, enact and monitor such tool compositions. In addition, bioinformatics is immersed in the big-data territory, facing huge problems in analysing such amounts of data. We have addressed these problems by integrating a tools management platform (Galaxy) and a Cloud infrastructure, which avoids moving the big datasets between different locations and allows the dynamic scaling of the computing resources depending on the user needs. The result is a user-friendly platform that facilitates the work of the end-users while performing their experiments, installed in a Cloud environment that includes authentication, security and big-data transfer mechanisms. To demonstrate the suitability of our approach we have integrated into the infrastructure an existing pairwise and multiple genome comparison tool which involves the management of huge datasets and high computational demands.	Óscar Torreño Tirado, Michael T. Krieger, Paul Heinzlreiter, Oswaldo Trelles
583	WebGL based visualisation and analysis of stratigraphic data for the purposes of the mining industry [abstract] Abstract: In recent years the combination of databases, data and internet technologies has greatly enhanced the functionality of many systems based on spatial data, and facilitated the dissemination of such information. In this paper, we propose a web-based data visualisation and analysis system for stratigraphic data from a Polish mine, with visualisation and analysis tools which can be accessed via the Internet. WWW technologies such as active web pages and WebGL technology provide a user-friendly interface for browsing, plotting, comparing, and downloading information of interest, without the need for dedicated mining industry software.	Anna Pieta, Justyna Bała
33	Modeling and Simulation of Masticatory Muscles [abstract] Abstract: Medical simulators play an important role in helping the development of prototype prostheses, pre-surgical planning and in a better understanding of the mechanical phenomena involved in muscular activity. This article focuses on modeling and simulating the activity of the jaw muscular system. The model involves the use of three-dimensional bone models and muscle modeling based on Hill type actuators. Ligament restrictions to mandible movement were taken into account in our model. Data collected from patients were used to partially parameterize our model so that it could be used in medical applications. In addition, the simulation of muscles employed a new methodology based on insertion curves, with many lines of action for each group of muscles. A simulator was developed, which allowed real time visualization of individual muscle activation at each corresponding simulation time. The model derived trajectory was then compared to the assembled data, remaining mostly within the convex hull of the mandible motion curves captured. Furthermore, the model accurately described the desired border movements.	Eduardo Garcia, Márcio Leal, Marta Villamil
35	Fully automatic 2D hp-adaptive Finite Element Method for Non-Stationary Heat Transfer [abstract] Abstract: In this paper we present a fully automatic hp adaptive finite element method code for non-stationary two dimensional problems. The code utilizes the -scheme for time discretization and fully automatic hp adaptive finite element method discretization for numerical solution of each time step. The code is verified on the exemplary non-stationary problem of heat transfer over the L-shape domain.	Paweł Matuszyk, Marcin Sieniek, Maciej Paszyński
46	Parallelization of an Encryption Algorithm Based on a Spatiotemporal Chaotic System and a Chaotic Neural Network [abstract] Abstract: In this paper the results of parallelizing a block cipher based on a spatiotemporal chaotic system and a chaotic neural network are presented. A data dependence analysis of loops was applied in order to parallelize the algorithm. The parallelism of the algorithm is demonstrated in accordance with the OpenMP standard. The study found that the most time-consuming loops of the algorithm are suitable for parallelization. Efficiency measurements of the parallel algorithm working in ECB, CTR, CBC and CFB modes of operation are shown.	Dariusz Burak
64	Cryptanalysing the shrinking generator [abstract] Abstract: Some linear cellular automata generate exactly the same PN-sequences as those generated by maximum-length LFSRs. Hence, cellular automata can be considered as alternative generators to the maximum-length LFSRs. Moreover, some LFSR-based keystream generators can be modelled as linear structures based on cellular automata. In this work, we analyse a family of one-dimensional, linear, regular and cyclic cellular automata based on rule 102 that describe the behaviour of the shrinking generator, which was designed as a non-linear generator. This implies that the output sequence of the generator is vulnerable to a cryptanalysis that takes advantage of this linearity.	Sara D. Cardell, Amparo Fúster-Sabater
74	D-Aid - An App to Map Disasters and Manage Relief Teams and Resources [abstract] Abstract: Natural or man-made disasters cause damage to life and property. Lack of appropriate emergency management increases the physical damage and loss of life. D-Aid, the smartphone App proposed by this article, intends to help volunteers and relief teams to quickly map and aid victims of a disaster. Anyone can report an occurrence after a disaster on a web map, streamlining and decentralizing access to information. Through visualization techniques such as heat maps and Voronoi diagrams, on a map implemented in the D-Aid app and also on a web map, everyone can easily get information about the number of victims, their needs and imminent dangers after disasters.	Luana Carine Schunke, Luiz Paulo Luna de Oliveira, Mauricio Cardoso, Marta Becker Villamil
168	My Best Current Friend in a Social Network [abstract] Abstract: Due to their popularity, social networks (SNs) have been subject to different analyses. A research field in this area is the identification of several types of users and groups. To make the identification process easier, a SN is usually represented through a graph. The usual tools to analyze a graph are centrality measures, which identify the most important vertices within a graph; among them is PageRank (a measure originally designed to classify web pages). Informally, in the context of a SN, the PageRank of a user i represents the probability that another user of the SN is seeing the page of i after a considerable time of navigation in the SN. In this paper, we define a new type of user in a SN: the best current friend. Informally, the idea is to identify, among the friends of a user i, the friend k that would generate the highest decrease in the PageRank of i if k stopped being his/her friend. This may be useful to identify the users/customers whose friendship/relationship should be a priority to keep. We provide formal definitions, algorithms and some experiments for this subject. Our experiments showed that the best current friend of a user is not necessarily among those who have the highest PageRank in the SN, or among the ones who have lots of friends.	Francisco Moreno, Santiago Hernández, Edison Ospina
398	Clustering Heterogeneous Semi-Structured Social Science Datasets [abstract] Abstract: Social scientists have begun to collect large datasets that are heterogeneous and semi-structured, but the ability to analyze such data has lagged behind its collection. We design a process to map such datasets to a numerical form, apply singular value decomposition clustering, and explore the impact of individual attributes or fields by overlaying visualizations of the clusters. This provides a new path for understanding such datasets, which we illustrate with three real-world examples: the Global Terrorism Database, which records details of every terrorist attack since 1970; a Chicago police dataset, which records details of every drug-related incident over a period of approximately a month; and a dataset describing members of a Hezbollah crime/terror network within the U.S.	David Skillicorn, Christian Leuprecht
473	CFD post-processing in Unity3D [abstract] Abstract: In architecture and urban design the urban climate on a meso/micro scale is a strong design criterion for outdoor thermal comfort and building’s energy performance. Evaluating the effect of buildings on the local climate and vice versa can be done by computational fluid dynamics (CFD) methods. The results from CFD are typically visualized through post-processing software closely related to the product family of pre-processing and simulation. The built-in functions are made for engineers and lack user-friendliness for real-time exploration of results. To bridge the gap between architect and engineer we propose visualizations based on game engine technology. This paper demonstrates the implementation of CFD to Unity3D conversion and weather data visualization.	Matthias Berger, Verina Cristie
596	Helsim: a particle-in-cell simulator for highly imbalanced particle distributions [abstract] Abstract: Helsim is a 3D electro-magnetic particle-in-cell simulator used to simulate the behaviour of plasma in space. Particle-in-cell simulators track the movement of particles through space, with the particles generating and being subjected to various fields (electric, magnetic and/or gravitational). Helsim dissociates the particle data structure from the fields, allowing them to be distributed and load-balanced independently, and can simulate experiments with highly imbalanced particle distributions with ease. This paper shows weak scaling results of a highly imbalanced particle setup on up to 32 thousand cores. The results validate the basic claims of scalability for imbalanced particle distributions, but also highlight a problem with a workaround we had to implement to circumvent an OpenMPI bug we encountered.	Roel Wuyts, Tom Haber, Giovanni Lapenta
724	Efficient visualization of urban simulation data using modern GPUs [abstract] Abstract: Visualization of simulation results in major urban areas is a difficult task. Multi-scale processes and connectivity of the urban environment may require interactive visualization of dynamic scenes with lots of objects at different scales. To visualize these scenes it is not always possible to use standard GIS systems. Wide distribution of high-performance gaming graphics cards has led to the emergence of specialized frameworks, which are able to cope with such kinds of visualization. This paper presents a framework and special algorithms that take full advantage of the GPU to render the urban simulation data over a virtual globe. Experiments on the scalability of the framework have shown that it successfully handles the visualization of up to two million moving agents and up to eight million fixed points of interest on top of the virtual globe without detriment to the smoothness of the image.	Aleksandr Zagarskikh, Andrey Karsakov, Alexey Bezgodov
732	Cloud Technology for Forecasting Accuracy Evaluation of Extreme Metocean Events [abstract] Abstract: The paper describes the approach for ensemble-based simulation within the tasks of extreme metocean events forecasting as an urgent computing problem. The approach is based on the developed conceptual basis of data-flow construction for the simulation-based ensemble forecasting. It was used to develop the architecture for ensemble-based data processing based on cloud computing environment CLAVIRE with extension for urgent computing resource provisioning and scheduling. Finally the solution for ensemble water level forecasting in Baltic Sea was developed as a part of St. Petersburg flood preventing system.	Sergey Kosukhin, Sergey Kovalchuk, Alexander Boukhanovsky
320	Co-clustering based approach for Indian monsoon prediction [abstract] Abstract: Prediction of the Indian monsoon is a challenging task due to its complex dynamics and variability over the years. The skills of statistical predictors that perform well in one set of years are not as good in others. In this paper, we attempt to identify a set of predictors that have high skills for a cluster of years. A co-clustering algorithm, which extracts groups of years paired with good predictor sets for those years, is used for this purpose. A weighted ensemble of these predictors is used in the final prediction. Results on data from the past 65 years show that the approach is competitive with state-of-the-art techniques.	Moumita Saha, Pabitra Mitra
139	Agent Based Simulations for the Estimation of Sustainability Indicators [abstract] Abstract: We present a methodology to improve the estimation of several Sustainability Indicators based on the measurement of walking distance to infrastructures, combining Agent Based Simulation with Volunteer Geographic Information. Joining these two forces, we construct a more realistic and accurate distribution of the infrastructures based on knowledge created by citizens and their perceptions instead of official data sources. A Situated Multi-Agent System is in charge of simulating not only the functional disparity and sociodemographic characteristics of the population but also the geographic reality in a dynamic way. Namely, the system will analyze different geographic barriers for each collective, bringing new possibilities to improve the assessment of the needs of the population for a more sustainable development of the city. In this article we describe the methodology to carry out several sustainability indicator measurements and present the results of the proposed methodology applied to several municipalities.	Ander Pijoan, Cruz E. Borges, Iraia Oribe-Garcia, Cristina Martín, Ainhoa Alonso-Vicario
276	Bray-Curtis Metrics as Measure of Liquid State Machine Separation Ability in Function of Connections Density [abstract] Abstract: Separation ability is one of the two most important properties of Liquid State Machines used in the Liquid Computing theory. Different norms and metrics can be applied to measure the so-called distance between the states that a Liquid State Machine can exist in. Until now we have used the Euclidean distance to measure the distance between states representing different stimulations of simulated cortical microcircuits. In this paper we compare our previously used methods with an approach based on the Bray-Curtis measure of dissimilarity. A systematic analysis of efficiency, and its comparison for different numbers of simulated synapses present in the model, will be discussed to some extent.	Grzegorz Wójcik, Marcin Ważny
365	A First Step to Performance Prediction for Heterogeneous Processing on Manycores [abstract] Abstract: In order to maintain the continuous growth of the performance of computers while keeping their energy consumption under control, the microelectronic industry develops architectures capable of processing more and more tasks concurrently. Thus, the next generations of microprocessors may count hundreds of independent cores that may differ in their functions and features. As an extensive knowledge of their internals cannot be a prerequisite to their programming, and for the sake of portability, these forthcoming computers require the compilation flow to evolve and cope with heterogeneity issues. In this paper, we take a first step toward a possible solution to this challenge by exploring the results of SPMD type of parallelism and predicting the performance of the compilation results, so that our tools can guide a compiler to build an optimal partition of tasks automatically, even on heterogeneous targets. Our experimental results show that our tools predict real world performance with very good accuracy.	Nicolas Benoit, Stephane Louise
468	A decision support system for emergency flood embankment stability [abstract] Abstract: This article presents a decision support system for emergency flood embankment stability. The proposed methodology is based on analysis of data from both a flood embankment measurement network and data generated through numerical modeling. Decisions about the risk of embankment interruption are made on the basis of this analysis. The authors present both the general concept of the system and a detailed description of the system components.	Magdalena Habrat, Michał Lupa, Monika Chuchro, Andrzej Leśniak
422	A Methodology for Profiling and Partitioning Stream Programs on Many-core Architectures [abstract] Abstract: Maximizing the data throughput is a very common implementation objective for several streaming applications. Such a task is particularly challenging for implementations based on many-core and multi-core target platforms because, in general, it implies tackling several NP-complete combinatorial problems. Moreover, an efficient design space exploration requires an accurate evaluation on the basis of dataflow program execution profiling. The focus of the paper is on the methodological challenges of obtaining accurate profiling measures. Experimental results validate a many-core platform built from an array of Transport Triggered Architecture processors for exploring the partitioning search space based on the execution trace analysis.	Malgorzata Michalska, Jani Boutellier, Marco Mattavelli
590	Minimum-overlap clusterings and the sparsity of overcomplete decompositions of binary matrices. [abstract] Abstract: Given a set of $n$ binary data points, a widely used technique is to group its features into $k$ clusters: sets of features for which there is, in turn, a set of data points that has similar values in those features. In the case where $n < k$, an exact decomposition is always possible, and the question of how overlapping are the clusters is of interest. In this paper we approach the question through matrix decomposition, and relate the degree of overlap with the sparsity of one of the resulting matrices. We present i) analytical results regarding bounds on this sparsity, and ii) a heuristic to estimate the minimum amount of overlap that an exact grouping of features into $k$ clusters must have. Happily, adding new data will not alter this minimum amount of overlap. An interpretation of this amount, and its change with $k$, is given for a biological example.	Victor Mireles, Tim Conrad
736	Modeling of critical situations in the migration policy implementation [abstract] Abstract: This paper describes an approach for modeling potentially critical situations in society. Potentially critical situations are caused by a lack of compliance between current local policies and the desired goals, methods and means of implementing these policies. The modeling approach is proposed to improve the efficiency of local government management, taking into account potentially critical situations that may arise at the personal level, at the level of social groups and in society as a whole. The use of the proposed method is shown by the example of migration policies in St. Petersburg.	Sergey Mityagin, Sergey Ivanov, Alexander Boukhanovsky, Iliya Gubarev, Tihonova Olga
450	Parallelization of context-free grammar parsing on GPU using CUDA [abstract] Abstract: During the last decade, increasing interest in parallel programming can be observed. It is caused by the tendency to develop microprocessors as multicore units that can perform instructions simultaneously. A popular and widely used example of such a platform is the graphics processing unit (GPU). Its ability to perform calculations simultaneously is being investigated as a way of improving the performance of complex algorithms. Therefore, GPUs have architectures that allow programmers and software developers to use their computational power in the same way as that of a CPU. One of these architectures is the CUDA platform, developed by nVidia. The purpose of our work was to implement a parallel CYK algorithm, which is one of the most popular and effective parsing algorithms for context-free languages. Parsing is crucial for systems that work with natural, biological (like RNA), or artificial languages, e.g. interpreters of scripting languages, compilers, and systems for pattern or natural/biological language recognition. Parallelization of context-free grammar parsing on GPU was done using the CUDA platform. The paper presents a review of existing parallelizations of the CYK algorithm in the literature, describes the proposed algorithms, and discusses the experimental results obtained. We considered algorithms in which each cell of the CYK matrix is assigned to a separate thread (processor), algorithms in which each pair of cells is assigned to a thread, a version with shared memory, and finally a version with a limited number of non-terminals. The algorithms were evaluated on five artificial grammars with different numbers of terminals and non-terminals, different sizes of grammar rules, and different lengths of input sequences. A significant performance improvement (up to about 10x) compared with CPU-based computations was achieved.	Olgierd Unold and Piotr Skrzypczak
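As an illustration of the CYK algorithm named in abstract 450, the following is a minimal sequential sketch in Python. It is not the authors' CUDA implementation: the function name cyk_parse, the dictionary-based grammar encoding and the toy grammar in the usage note are assumptions made for this example. The grammar is assumed to be in Chomsky normal form. The comment on the inner loop marks the cells that are mutually independent, which is the property a one-thread-per-cell parallelization would rely on.

```python
from itertools import product


def cyk_parse(tokens, unary_rules, binary_rules, start="S"):
    """Return True if `tokens` can be derived from `start`.

    unary_rules maps a terminal to the set of non-terminals producing it;
    binary_rules maps a pair (B, C) to the set of non-terminals A with A -> B C.
    Both encodings are assumptions made for this sketch.
    """
    n = len(tokens)
    if n == 0:
        return False
    # table[i][j] holds the non-terminals deriving tokens[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][0] = set(unary_rules.get(tok, ()))
    for span in range(2, n + 1):
        # All cells of one span length depend only on shorter spans,
        # so this loop over i could be executed in parallel.
        for i in range(n - span + 1):
            for split in range(1, span):
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for b, c in product(left, right):
                    table[i][span - 1] |= binary_rules.get((b, c), set())
    return start in table[0][n - 1]
```

With a toy grammar for the language a^n b^n (S -> A B | A T, T -> S B, A -> a, B -> b), cyk_parse(list("aabb"), {"a": {"A"}, "b": {"B"}}, {("A", "B"): {"S"}, ("A", "T"): {"S"}, ("S", "B"): {"T"}}) accepts the input, while the same call on list("abb") rejects it.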

Workshop on Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: V206

Chair: Mario Cannataro

759	8th Workshop on Biomedical and Bioinformatics Challenges for Computer Science - BBC2015 [abstract] Abstract: This is the summary of the 8th Workshop on Biomedical and Bioinformatics Challenges for Computer Science - BBC2015	Stefano Beretta, Mario Cannataro, Riccardo Dondi
374	Robust Conclusions in Mass Spectrometry Analysis [abstract] Abstract: A central issue in biological data analysis is that uncertainty, resulting from different sources of variability, may change the effect of the events being investigated. Therefore, robustness analysis is a fundamental step to be considered. Robustness refers to the ability of a process to cope well with uncertainties, but the different ways to model both the processes and the uncertainties lead to many alternative conclusions in the robustness analysis. In this paper we apply a framework that makes it possible to deal with such questions for mass spectrometry data. Specifically, we provide robust decisions when testing hypotheses over a case/control population of subject measurements (i.e. proteomic profiles). To this end, we formulate (i) a reference model for the observed data (i.e., graphs), (ii) a reference method to provide decisions (i.e., tests of hypotheses over graph properties) and (iii) a reference model of variability to employ sources of uncertainties (i.e., random graphs). We apply these models to a real-case study, analyzing the mass spectrometry profiles of the most common type of Renal Cell Carcinoma, the Clear Cell variant.	Italo Zoppis, Riccardo Dondi, Massimiliano Borsani, Erica Gianazza, Clizia Chinello, Fulvio Magni, Giancarlo Mauri
612	Modeling of Imaging Mass Spectrometry Data and Testing by Permutation for Biomarkers Discovery in Tissues [abstract] Abstract: Exploration of tissue sections by imaging mass spectrometry reveals the abundance of different biomolecular ions in different sample spots, making it possible to find region-specific features. In this paper we present computational and statistical methods for the investigation of protein biomarkers, i.e. biological features related to the presence of different pathological states. The proposed complete processing pipeline includes data pre-processing, detection and quantification of peaks using Gaussian mixture modeling, and identification of specific features for different tissue regions by performing permutation tests. Application of the methodology enables detection, with high confidence, of proteins/peptides with concentration levels specific to tumor area, normal epithelium, muscle or salivary gland regions.	Michal Marczyk, Grzegorz Drazek, Monika Pietrowska, Piotr Widlak, Joanna Polanska, Andrzej Polanski
336	Fuzzy indication of reliability in metagenomics NGS data analysis [abstract] Abstract: NGS data processing in metagenomics studies has to deal with noisy data that can contain a large amount of reading errors which are difficult to detect and account for. This work introduces a fuzzy indicator of reliability technique to facilitate solutions to this problem. It includes modified Hamming and Levenshtein distance functions that are aimed to be used as drop-in replacements in NGS analysis procedures which rely on distances, such as phylogenetic tree construction. The distances utilise fuzzy sets of reliable bases or an equivalent fuzzy logic, potentially aggregating multiple sources of base reliability.	Milko Krachunov, Dimitar Vassilev, Maria Nisheva, Ognyan Kulev, Valeriya Simeonova, Vladimir Dimitrov
559	Pairwise genome comparison workflow in the Cloud using Galaxy [abstract] Abstract: Workflows are becoming the new paradigm in bioinformatics. In general, bioinformatics problems are solved by interconnecting several small software pieces to perform complex analyses. This demands a minimal expertise to create, enact and monitor such tool compositions. In addition, bioinformatics is immersed in the big-data territory, facing huge problems in analysing such amounts of data. We have addressed these problems by integrating a tools management platform (Galaxy) and a Cloud infrastructure, which avoids moving the big datasets between different locations and allows the dynamic scaling of the computing resources depending on the user needs. The result is a user-friendly platform that facilitates the work of the end-users while performing their experiments, installed in a Cloud environment that includes authentication, security and big-data transfer mechanisms. To demonstrate the suitability of our approach we have integrated into the infrastructure an existing pairwise and multiple genome comparison tool which involves the management of huge datasets and high computational demands.	Óscar Torreño Tirado, Michael T. Krieger, Paul Heinzlreiter, Oswaldo Trelles
645	Iterative Reconstruction from Few-View Projections [abstract] Abstract: In the medical imaging field, iterative methods have become a hot topic of research due to their capacity to solve the reconstruction problem from a limited number of projections. This offers a good opportunity to reduce patients' radiation exposure during data acquisition. However, due to the complexity of the data, the reconstruction process is still time consuming, especially for 3D cases, even when implemented on modern computer architectures. Reconstruction time and the high radiation dose imposed on patients are two major drawbacks in computed tomography. With the aim of resolving them effectively, we adapted the Least Squares QR (LSQR) method with a soft threshold filtering technique for few-view image reconstruction and present its numerical validation. The method is implemented using the CUDA programming model and compared to the standard SART algorithm. The numerical simulations and qualitative analysis of the reconstructed images show the reliability of the presented method.	Liubov A. Flores, Vicent Vidal, Gumersindo Verdú
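To make the "soft threshold filtering" mentioned in abstract 645 concrete, here is a minimal Python sketch of the standard soft-thresholding operator, which shrinks every coefficient toward zero by a threshold tau and zeroes those whose magnitude does not exceed it. The abstract does not state which quantity is thresholded or how the threshold is chosen, so the function name soft_threshold, the list-based interface and the value of tau are assumptions for illustration only; this is not the authors' CUDA code.

```python
def soft_threshold(values, tau):
    """Apply sign(v) * max(|v| - tau, 0) to every element of `values`.

    `tau` is a non-negative threshold; its choice here is hypothetical.
    """
    out = []
    for v in values:
        magnitude = abs(v) - tau
        if magnitude <= 0:
            # Small coefficients are treated as noise and set to zero.
            out.append(0.0)
        else:
            # Larger coefficients keep their sign but shrink by tau.
            out.append(magnitude if v > 0 else -magnitude)
    return out
```

For example, soft_threshold([3.0, -0.5, 0.2, -2.0], 1.0) keeps the two large entries, shrunk by 1.0, and zeroes the two small ones. In an iterative reconstruction, such an operator would typically be applied between solver iterations to promote sparsity.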

Workshop on Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: V206

Chair: Riccardo Dondi

319	GoD: An R-Package based on Ontologies for Prioritization of Genes with respect to Diseases. [abstract] Abstract: Omics sciences are widely used to analyze diseases at a molecular level. Usually, the result of an omics experiment is a large list of candidate genes, proteins or other molecules. Interpreting the results and filtering the candidate genes or proteins selected in an experiment is a challenge in some scenarios. This problem is particularly evident in clinical scenarios in which researchers are interested in the behaviour of a few molecules related to a specific disease. The filtering requires the use of domain-specific knowledge that is often encoded into ontologies. To support this interpretation, we implemented GoD (Gene ranking based On Diseases), an algorithm that ranks a given set of genes based on ontology annotations. The algorithm orders genes by the semantic similarity computed between the annotations of each gene and those describing the selected disease. As a proof of principle, we tested our software with the Human Phenotype Ontology (HPO), Gene Ontology (GO) and Disease Ontology (DO) using semantic similarity measures. The dedicated website is https://sites.google.com/site/geneontologyprioritization/.	Mario Cannataro, Pietro Hiram Guzzi and Marianna Milano
693	Large Scale Comparative Visualisation of Regulatory Networks with TRNDiff [abstract] Abstract: The advent of Next Generation Sequencing technologies has seen explosive growth in genomic datasets, and dense coverage of related organisms, supporting study of subtle, strain-specific variations as a determinant of function. Such data collections present fresh and complex challenges for bioinformatics, those of comparing models of complex relationships across hundreds and even thousands of sequences. Transcriptional Regulatory Network (TRN) structures document the influence of regulatory proteins called Transcription Factors (TFs) on associated Target Genes (TGs). TRNs are routinely inferred from model systems or iterative search, and analysis at these scales requires simultaneous displays of multiple networks well beyond those of existing network visualisation tools [1]. In this paper we describe TRNDiff, an open source tool supporting the comparative analysis and visualization of TRNs (and similarly structured data) from many genomes, allowing rapid identification of functional variations within species. The approach is demonstrated through a small scale multiple TRN analysis of the Fur iron-uptake system of Yersinia, suggesting a number of candidate virulence factors; and through a far larger study based on integration with the RegPrecise database (http://regprecise.lbl.gov) - a collection of hundreds of manually curated and predicted transcription factor regulons drawn from across the entire spectrum of prokaryotic organisms. The tool is presently available in stand-alone and integrated form. Information may be found at the dedicated site http://trndiff.org, which includes example data, a short tutorial and links to a working version of the stand-alone system. The integrated regulon browser is currently available at the demonstration site http://115.146.86.55/RegulonExplorer/index.html. 
Source code is freely available under a non-restrictive Apache 2.0 licence from the authors’ repository at http://bitbucket.org/biovisml.	Xin-Yi Chua, Lawrence Buckingham, James Hogan
30	Epistatic Analysis of Clarkson Disease [abstract] Abstract: Genome Wide Association Studies (GWAS) have predominantly focused on the association between single SNPs and disease. It is probable, however, that complex diseases are due to combined effects of multiple genetic variations, as opposed to single variations. Multi-SNP interactions, known as epistatic interactions, can potentially provide information about causes of complex diseases, and build on previous GWAS looking at associations between single SNPs and phenotypes. By applying epistatic analysis methods to GWAS datasets, it is possible to identify significant epistatic interactions, and map SNPs identified to genes allowing the construction of a gene network. A large number of studies have applied graph theory techniques to analyse gene networks from microarray data sets, using graph theory metrics to identify important hub genes in these networks. In this work, we present a graph theory study of SNP and gene interaction networks constructed for a Clarkson disease GWAS, as a result of applying epistatic interaction methods to identify significant epistatic interactions. This study identifies a number of genes and SNPs with potential roles for Clarkson disease that could not be found using traditional single SNP analysis, including a number located on chromosome 5q previously identified as being of interest for capillary malformation.	Alex Upton, Oswaldo Trelles, James Perkins
527	Multiple structural clustering of bromodomains of the bromo and extra terminal (BET) proteins highlights subtle differences in their structural dynamics and acetylated leucine binding pocket [abstract] Abstract: BET proteins are epigenetic readers whose deregulation results in cancer and inflammation. We show that BET proteins (BRD2, BRD3, BRD4 and BRDT) are globally similar with subtle differences in the sequences and structures of their N-terminal bromodomain. Principal component analysis and non-negative matrix factorization reveal distinct structural clusters associated with specific BET family members, experimental methods, and source organisms. Subtle variations in structural dynamics are evident in the acetylated lysine (Kac) binding pocket of BET bromodomains. Using multiple structural clustering methods, we have also identified representative structures of BET proteins, which are potentially useful for developing potential therapeutic agents.	Suryani Lukman, Zeyar Aung, Kelvin Sim
633	Parallel Tools for Simulating the Depolarization Block on a Neural Model [abstract] Abstract: Prototyping and developing computational codes for biological models from reliable, efficient and portable building blocks makes it possible to simulate real cerebral behaviours and to validate theories and experiments. A critical issue is the tuning of a model by means of several numerical simulations with the aim of reproducing real scenarios. This requires a huge amount of computational resources to assess the impact of parameters that influence the neuronal response. In this paper, we describe how parallel tools are adopted to simulate the so-called depolarization block of a CA1 pyramidal cell of the hippocampus. Here, high performance computing techniques are adopted in order to achieve a more efficient model simulation. Finally, we analyse the performance of this neural model, investigating the scalability and benefits on multi-core and on parallel and distributed architectures.	Salvatore Cuomo, Pasquale De Michele, Ardelio Galletti, Giovanni Ponti

Workshop on Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: V206

Chair: Mauro Castelli

423	Using visual analytics to support the integration of expert knowledge in the design of medical models and simulations [abstract] Abstract: Visual analytics (VA) provides an interactive way to explore vast amounts of data and find interesting patterns. This has already benefited the development of computational models, as the patterns found using VA can then become essential elements of the model. Similarly, recent advances in the use of VA for the data cleaning stage are relevant to computational modelling given the importance of having reliable data to populate and check models. In this paper, we demonstrate via case studies of medical models that VA can be very valuable at the conceptual stage, to both examine the fit of a conceptual model with the underlying data and assess possible gaps in the model. The case studies were realized using different modelling tools (e.g., system dynamics or network modelling), which emphasizes that the relevance of VA to medical modelling cuts across techniques. Finally, we discuss how the interdisciplinary nature of modelling for medical applications requires an increased support for collaboration, and we suggest several areas of research to improve the intake and experience of VA for collaborative modelling in medicine.	Philippe Giabbanelli, Piper Jackson
409	Mining Mobile Datasets to Enable the Fine-Grained Stochastic Simulation of Ebola Diffusion [abstract] Abstract: The emergence of Ebola in West Africa is of worldwide public health concern. Successful mitigation of epidemics requires coordinated, well-planned intervention strategies that are specific to the pathogen, transmission modality, population, and available resources. Modeling and simulation in the field of computational epidemiology provides predictions of expected outcomes that are used by public policy planners in setting response strategies. Developing up to date models of population structures, daily activities, and movement has proven challenging for developing countries due to limited governmental resources. Recent collaborations (in 2012 and 2014) with telecom providers have given public health researchers access to Big Data needed to build high-fidelity models. Researchers now have access to billions of anonymized, detailed call data records (CDR) of mobile devices for several West African countries. In addition to official census records, these CDR datasets provide insights into the actual population locations, densities, movement, travel patterns, and migration in hard to reach areas. These datasets allow for the construction of population, activity, and movement models. For the first time, these models provide computational support of health related decision making in these developing areas (via simulation-based studies). New models, datasets, and simulation software were produced to assist in mitigating the continuing outbreak of Ebola. Existing models of disease characteristics, propagation, and progression were updated for the current circulating strain of Ebola. 
The simulation process required the interactions of multi-scale models, including viral loads (at the cellular level), disease progression (at the individual person level), disease propagation (at the workplace and family level), societal changes in migration and travel movements (at the population level), and mitigating interventions (at the abstract governmental policy level). The predictive results from this system were validated against results from the CDC's high-level predictions.	Nicholas Vogel, Christopher Theisen, Jonathan Leidig, Jerry Scripps, Douglas Graham, Greg Wolffe
383	A Novel O(n) Numerical Scheme for ECG Signal Denoising [abstract] Abstract: High quality Electrocardiogram (ECG) data is very important because this signal is generally used for the analysis of heart diseases. Wearable sensors are widely adopted for physical activity monitoring and for the provision of healthcare services, but noise always degrades the quality of these signals. The paper describes a new algorithm for ECG signal denoising, applicable in the context of real-time health monitoring using mobile devices, where signal processing efficiency is a strict requirement. The proposed algorithm is computationally cheap because it belongs to the class of Infinite Impulse Response (IIR) noise reduction algorithms. The main contribution of the proposed scheme is that it removes the noise frequencies without computing a Fast Fourier Transform, which would require the use of specially optimized libraries. It consists of only a few lines of code and hence can easily be implemented on mobile computing devices. Moreover, the scheme allows local denoising and hence real-time visualization of the denoised signal. Experiments on real datasets have been carried out in order to test the algorithm from the accuracy and computational points of view.	Raffaele Farina, Salvatore Cuomo, Ardelio Galletti
549	Syncytial Basis for Diversity in Spike Shapes and their Propagation in Detrusor Smooth Muscle [abstract] Abstract: Syncytial tissues, such as the smooth muscle of the urinary bladder wall, are known to produce action potentials (spikes) with marked differences in their shapes and sizes. The need for this diversity is currently unknown, and neither is its origin understood. The small size of the cells, their syncytial arrangement, and the complex nature of innervation pose significant challenges for the experimental investigation of such tissues. To obtain better insight, we present here a three-dimensional electrical model of smooth muscle syncytium, developed using the compartmental modeling technique, with each cell possessing active channel mechanisms capable of producing an action potential. This enables investigation of the syncytial effect on action potential shapes and their propagation. We show how a single spike shape could undergo modulation, resulting in diverse shapes, owing to the syncytial nature of the tissue. Differences in the action potential features could impact their capacity to propagate through a syncytium. This is illustrated through comparison of two distinct action potential mechanisms. A better understanding of the origin of the various spike shapes would have significant implications in pathology, assisting in evaluating the underlying cause and directing treatment.	Shailesh Appukuttan, Keith Brain, Rohit Manchanda
200	The Potential of Machine Learning for Epileptic Seizures Prediction [abstract] Abstract: Epilepsy is one of the most common neurological diseases, affecting about 1% of the world population, of all ages, genders and origins. About one third of epileptic patients cannot be treated by medication or surgery: they suffer from refractory epilepsy and must live with their seizures throughout their lives. A seizure can happen anytime, anywhere, imposing severe constraints on the professional and social lives of these patients. The development of transportable and comfortable devices, able to capture a sufficient number of EEG scalp channels, to digitally process the signal, to extract appropriate features from the raw EEG signals, and to give these features to machine learning classifiers, is an important objective that a large research community is pursuing worldwide. The classifiers must detect the pre-ictal time (some minutes before the seizure). In this presentation the problem is presented, solutions are proposed, and results are discussed. The problem is formulated as a classification of high-dimensional datasets with four unbalanced classes. Preprocessing of raw data and classification using Artificial Neural Networks and Support Vector Machines, applied to the 275 patients of the European Epilepsy Database, show that computer science, in this case machine learning, will have an important role in the problem. For about 30% of the patients we found results with clinical relevance. Real-time experiments made with some patients, in a clinical environment and at home, will be shown (including video) and discussed. The problem is still challenging the computer science community researching medical applications. New research directions will be pointed out in the presentation.	Antonio Dourado, Cesar Teixeira and Francisco Sales
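
Abstract 383 in this session places its denoising scheme in the class of Infinite Impulse Response (IIR) algorithms with O(n) cost, but the programme does not give the recurrence itself. The sketch below therefore shows only a generic first-order IIR low-pass filter as an illustration of why such filters run in a single pass and can update a display sample by sample. The function name, the `alpha` parameter and the sample data are assumptions, and this is not the authors' scheme.

```python
def iir_lowpass(signal, alpha):
    """First-order IIR smoothing: y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    One multiply-add per sample, so the cost is O(n) and each output
    depends only on the current input and the previous output.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not signal:
        return []
    filtered = [float(signal[0])]  # seed the recurrence with the first sample
    for sample in signal[1:]:
        filtered.append(alpha * sample + (1.0 - alpha) * filtered[-1])
    return filtered


if __name__ == "__main__":
    # Hypothetical step input, used only to show the smoothing behaviour.
    print(iir_lowpass([0.0, 1.0, 1.0], 0.5))
```

Because the recurrence needs no block transform, it suits streaming use on constrained devices; a smaller `alpha` smooths more strongly at the price of a slower response.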

Agent-Based Simulations, Adaptive Algorithms and Solvers (ABS-AAS) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M104

Chair: Maciej Paszynski

754	Agent-Based Simulations, Adaptive Algorithms and Solvers [abstract] Abstract: The aim of this workshop is to integrate results of different domains of computer science, computational science and mathematics. We invite papers oriented toward simulations, either hard simulations by means of finite element or finite difference methods, or soft simulations by means of evolutionary computations, particle swarm optimization and other. The workshop is most interested in simulations performed by using agent-oriented systems or by utilizing adaptive algorithms, but simulations performed by other kind of systems are also welcome. Agent-oriented system seems to be the attractive tool useful for numerous domains of applications. Adaptive algorithms allow significant decrease of the computational cost by utilizing computational resources on most important aspect of the problem. This year following the challenges of ICCS 2015 theme "Computational Science at the Gates of Nature" we invite submissions using techniques dealing with large simulations, e.g. agents based algorithms dealing with big data, model reduction techniques for large problems, fast solvers for large three dimensional simulations, etc. To give - rather flexible - guidance in the subject, the following, more detailed, topics are suggested. These of theoretical brand, like: (a) multi-agent systems in high-performance computing, (b) efficient adaptive algorithms for big problems, (c) low computational cost adaptive solvers, (d) agent-oriented approach to adaptive algorithms, (e) model reduction techniques for large problems, (f) mathematical modeling and asymptotic analysis of large problems, (g) finite element or finite difference methods for three dimensional or non-stationary problems, (h) mathematical modeling and asymptotic analysis. 
And those with stress on application sphere: (a) agents based algorithms dealing with big data, (b) application of adaptive algorithms in large simulation, (c) simulation and large multi-agent systems, (d) application of adaptive algorithms in three dimensional finite element and finite difference simulations, (e) application of multi-agent systems in computational modeling, (f) multi-agent systems in integration of different approaches.	Maciej Paszynski, Robert Schaefer, Krzysztof Cetnarowicz, David Pardo and Victor Calo
631	Coupling Navier-Stokes and Cahn-Hilliard equations in a two-dimensional annular flow configuration [abstract] Abstract: In this work, we present a novel isogeometric analysis discretization for the Navier-Stokes-Cahn-Hilliard equation, which uses divergence-conforming spaces. Basis functions generated with this method can have higher-order continuity, and allow to directly discretize the higher-order operators present in the equation. The discretization is implemented in PetIGA-MF, a high-performance framework for discrete differential forms. We present solutions in a two-dimensional annulus, and model spinodal decomposition under shear flow.	Philippe Vignal, Adel Sarmiento, Adriano Côrtes, Lisandro Dalcin, Victor Calo
656	High-Accuracy Adaptive Modeling of the Energy Distribution of a Meniscus-Shaped Cell Culture in a Petri Dish [abstract] Abstract: Cylindrical Petri dishes embedded in a rectangular waveguide and exposed to a polarized electromagnetic wave are often used to grow cell cultures. To guarantee the success of these cultures, it is necessary to ensure that the specific absorption rate distribution is sufficiently high and uniform over the Petri dish. Accurate numerical simulations are needed to design such systems. These simulations constitute a challenge due to the strong discontinuity of electromagnetic parameters of the materials involved, the relatively low value of the field within the dish cultures compared with the rest of the domain, and the presence of the meniscus shape developed at the liquid/solid interface. The latter greatly increases the level of complexity of the model in terms of geometry and the intensity of the gradients/singularities of the field solution. Here, we employ a three-dimensional (3D) hp-adaptive finite element method using isoparametric elements to obtain highly accurate simulations. We analyse the impact of the geometrical modeling of the meniscus-shaped cell culture on the hp-adaptivity. Numerical results concerning the convergence history of the error indicate the numerical difficulties arising from the presence of a meniscus-shaped object. At the same time, the resulting energy distribution shows that considering such a meniscus shape is essential to guarantee the success of the cell culture from the biological point of view.	Ignacio Gomez-Revuelto, Luis Emilio Garcia-Castillo and David Pardo
162	Leveraging workflows and clouds for a multi-frontal solver for finite element meshes [abstract] Abstract: Scientific workflows in clouds have been successfully used for automation of large-scale computations, but so far they have been applied to loosely-coupled problems, where most workflow tasks can be processed independently in parallel and do not require a high volume of communication. The multi-frontal solver algorithm for finite element meshes can be represented as a workflow, but the fine granularity of the resulting tasks and the large communication-to-computation ratio make it hard to execute efficiently in loosely-coupled environments such as Infrastructure-as-a-Service clouds. In this paper, we hypothesize that there exists a class of meshes that can be effectively decomposed into a workflow and mapped onto a cloud infrastructure. To show that, we have developed a workflow-based multi-frontal solver using the HyperFlow workflow engine, which comprises workflow generation from the elimination tree, analysis of the workflow structure, task aggregation based on estimated computation costs, and distributed execution using a dedicated worker service that can be deployed in clouds or clusters. The results of our experiments using workflows of over 10,000 tasks indicate that after task aggregation the resulting workflows of over 100 tasks can be efficiently executed and the overheads are not prohibitive. These results lead us to conclude that our approach is feasible and gives prospects for providing a generic workflow-based solution using clouds for problems typically considered as requiring HPC infrastructure.	Bartosz Balis, Kamil Figiela, Maciej Malawski, Konrad Jopek
571	Multi-pheromone ant colony optimization for socio-cognitive simulation purposes [abstract] Abstract: We present an application of Ant Colony Optimisation (ACO) to simulate socio-cognitive features of a population. We incorporated perspective taking ability to generate three different proportions of ant colonies: Control Sample, High Altercentricity Sample, and Low Altercentricity Sample. We simulated their performances on the Travelling Salesman Problem and compared them with the classic ACO. Results show that all three 'cognitively enabled' ant colonies require less time than the classic ACO. Also, though the best solution is found by the classic ACO, the Control Sample finds almost as good a solution but much faster. This study is offered as an example to illustrate an easy way of defining inter-individual interactions based on stigmergic features of the environment.	Mateusz Sekara, Kowalski Michal, Aleksander Byrski, Bipin Indurkhya, Marek Kisiel-Dorohinicki, Dana Samson, Tom Lenaerts

Agent-Based Simulations, Adaptive Algorithms and Solvers (ABS-AAS) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M104

Chair: Piotr Gurgul

292	Quantities of Interest for Surface based Resistivity Geophysical Measurements [abstract] Abstract: The objective of traditional goal-oriented strategies is to construct an optimal mesh that minimizes the problem size needed to achieve a user-prescribed error tolerance for a given quantity of interest (QoI). Typical geophysical resistivity measurement acquisition systems can easily record electromagnetic (EM) fields. However, depending upon the application, EM fields are sometimes only loosely related to the quantity that is to be inverted (conductivity or resistivity), and therefore they become inadequate for inversion. In the present work, we study the impact of the selection of the QoI in our inverse problem. We focus on two different acquisition systems: marine controlled source electromagnetic (CSEM), and magnetotellurics (MT). For both applications, numerical results illustrate the benefits of employing an adequate QoI. Specifically, using the impedance matrix as the QoI for MT measurements provides huge computational savings, since one can replace the existing absorbing boundary conditions (BCs) by a homogeneous Dirichlet BC to truncate the computational domain, something that is not possible when considering EM fields as the QoI.	Julen Alvarez-Aramberri, Shaaban Ali Bakr, David Pardo, Helene Barucq
448	Multi-objective Hierarchic Memetic Solver for Inverse Parametric Problems [abstract] Abstract: We propose a multi-objective approach for solving challenging inverse parametric problems. The objectives are misfits for several physical descriptions of a phenomenon under consideration, whereas their domain is a common set of admissible parameters. The resulting Pareto set, or parameters close to it, constitute various alternatives of minimizing individual misfits. A special type of selection applied to the memetic solution of the multi-objective problem narrows the set of alternatives to the ones that are sufficiently coherent. The proposed strategy is exemplified by solving a real-world engineering problem consisting of the magnetotelluric measurement inversion that leads to identification of oil deposits located about 3 km under the Earth's surface, where two misfit functions are related to distinct frequencies of the electric and magnetic waves.	Ewa Gajda-Zagórska, Maciej Smołka, Robert Schaefer, David Pardo, Julen Alvarez-Aramberri
62	Towards green multi-frontal solver for adaptive finite element method [abstract] Abstract: In this paper we present the optimization of the energy consumption for the multi-frontal solver algorithm executed over two dimensional grids with point singularities. The multi-frontal solver algorithm is controlled by so-called elimination tree, defining the order of elimination of rows from particular frontal matrices, as well as order of memory transfers for Schur complement matrices. For a given mesh there are many possible elimination trees resulting in different number of floating point operations (FLOPs) of the solver or different amount of data transferred via memory transfers. In this paper we utilize the dynamic programming optimization procedure and we compare elimination trees optimized with respect to FLOPs with elimination trees optimized with respect to energy consumption.	Hassan Aboueisha, Mikhail Moshkov, Konrad Jopek, Paweł Gepner, Jacek Kitowski, Maciej Paszynski
492	Ordering of elements for the volume & neighbors algorithm constructing elimination trees for 2D and 3D h adaptive FEM [abstract] Abstract: In this paper we analyze the optimality of the volume and neighbors algorithm constructing elimination trees for three dimensional h adaptive finite element method codes. The algorithm is a greedy algorithm that constructs the elimination trees based on a bottom-up analysis of the computational mesh. We compare the results of the volume and neighbors greedy algorithm with the global dynamic programming optimization performed on a class of elimination trees. The comparison is based on the Directed Acyclic Graph (DAG) constructed for model grids. We construct DAGs for two model grids: a two dimensional grid with a point singularity and a two dimensional grid with an edge singularity. We show that the quasi-optimal trees created by the volume and neighbors algorithm for the considered grids are also captured by the dynamic programming procedure. This means that the created elimination trees are optimal in the considered class of elimination trees. We show that different orderings of elements at the input of the volume and neighbors algorithm result in different computational costs of the multi-frontal solver algorithm executed over the resulting elimination trees. Finally, we present the ordering of elements that results in optimal (in the considered class) elimination trees. The theoretical results are verified with numerical experiments performed on three dimensional grids with point, edge and face singularities.	Anna Paszynska
66	A new time integration scheme for Cahn-Hilliard equations [abstract] Abstract: In this paper we present a new integration scheme that can be applied to the solution of difficult non-stationary problems. The scheme results from linearization of the Crank-Nicolson scheme, which is unconditionally stable but requires solving a non-linear equation at each time step. We test our linearized time integration scheme on the challenging Cahn-Hilliard equations, modeling the separation of two-phase fluids. The problem is solved using a higher-order isogeometric finite element method with B-spline basis functions. We implement our linear scheme in the PETIGA framework interfaced via the PETSc toolkit. We utilize a GMRES iterative solver for the solution of a linear system at every time step. We also define a simple time adaptivity scheme, which increases the time step size when the number of GMRES iterations is less than 30. We compare our linear scheme with the simple time adaptation algorithm against the non-linear scheme with sophisticated time adaptivity on the two-dimensional Cahn-Hilliard equations. We control the stability of our simulations by monitoring the Ginzburg-Landau free energy functional. We conclude that our simple scheme with simple time adaptivity outperforms the non-linear one with advanced time adaptivity in terms of execution time, while providing a similar history of the evolution of the free energy functional.	Robert Schaefer, Maciej Smolka, Lisandro Dalcin, Maciej Paszynski
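
Abstract 66 describes a time adaptivity rule that enlarges the time step whenever GMRES converges in fewer than 30 iterations. A minimal sketch of such a rule is given below. The iteration limit of 30 comes from the abstract; the growth factor, the optional cap `dt_max` and the function name are assumptions introduced for illustration, since the abstract does not say by how much the step grows or whether it is bounded.

```python
def next_time_step(dt, gmres_iterations, iteration_limit=30,
                   growth=2.0, dt_max=None):
    """Return the time step to use for the next step of the simulation.

    The step is enlarged when the linear solver converged quickly
    (fewer than `iteration_limit` GMRES iterations); otherwise it is kept.
    `growth` and `dt_max` are illustrative assumptions, not values
    taken from the abstract.
    """
    if gmres_iterations < iteration_limit:
        dt = dt * growth
    if dt_max is not None:
        dt = min(dt, dt_max)
    return dt


if __name__ == "__main__":
    dt = 0.5
    # Hypothetical GMRES iteration counts from successive time steps.
    for iterations in [12, 45, 8]:
        dt = next_time_step(dt, iterations, dt_max=1.5)
        print(iterations, dt)
```

The appeal of a rule like this is that the iteration count is already available from the solver, so the adaptivity costs nothing extra per time step.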

Agent-Based Simulations, Adaptive Algorithms and Solvers (ABS-AAS) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M104

Chair: Aleksander Byrski

364	Object Oriented Programming for Partial Differential Equations [abstract] Abstract: After a short introduction to the mathematical modelling of the elastic dynamic problem, which shows the similarity between the governing Partial Differential Equations (PDEs) in different applications, common blocks for Finite Element approximation are identified, and an Object Oriented Programming (OOP) methodology for linear and non-linear, stationary and dynamic problems is presented. Advantages of this approach are commented and some results are shown as examples of this methodology.	Elisabete Alberdi Celaya, Juan José Anza Aguirrezabala
667	GPGPU for Difficult Black-box Problems [abstract] Abstract: Difficult black-box problems are required to be solved in many scientific and industrial areas. In this paper, efficient use of a hardware accelerator to implement dedicated solvers for such problems is discussed and studied based on an example of Golomb Ruler problem. The actual solution of the problem is shown based on evolutionary and memetic algorithms accelerated on GPGPU. The presented results prove the supremacy of GPGPU over optimized multicore CPU implementation.	Marcin Pietron, Aleksander Byrski, Marek Kisiel-Dorohinicki
558	Multi-variant Planing for Dynamic Problems with Agent-based Signal Modeling [abstract] Abstract: The problem of planning for groups of autonomous beings is gaining attention over the last few years. Real life tasks, like mobile robots coordination or urban traffic management, need robust and flexible solutions. In this paper a new approach to the problem of multi-variant planning in such systems is presented. It assumes use of simple reactive controllers by the beings, however the state observation is enriched by dynamically updated model, which contains planning results. The approach gives promising results in the considered use case, which is the Multi Robot Task Allocation problem.	Szymon Szomiński, Wojciech Turek, Małgorzata Żabińska, Krzysztof Cetnarowicz
637	Conditional Synchronization in Multi-Agent Graph-Based Knowledge Systems [abstract] Abstract: Graph transformations provide a well established method for the formal description of modifications of graph-based systems. On the other side such systems can be regarded as multi-agent ones providing a feasible mean for maintaining and manipulating large scale data. This paper deals with the problem of information exchange among agents maintaining different graph-based systems. Graph formalism applied for representing a knowledge maintained by agents is used at the same time to perform graph transformations modeling a knowledge exchange. The consistency of a knowledge represented by the set of agents is ensured by execution of some graph transformations rules by two agents in a parallel way. We suggest that complex operations (sequences of graph transformations) should be introduced instead of the formalism basing on simple unconditional operations. The approach presented in this paper is accompanied by examples concerning the problem of personal data distributed over different places (and maintained by different agents) and transmitted in such an environment. (Financial support for this study was provided from resources of National Center for Research and Development, the grant number NCBiR 0021/R/ID2/2011/01.)	Leszek Kotulski, Adam Sędziwy, Barbara Strug
442	Agent-based approach to WEB exploration process [abstract] Abstract: The paper presents the concept of an agent-based system for searching and monitoring Web pages. It is oriented toward the exploration of a limited problem area, covering a given sector of industry or economy. The agent-based (modular) structure of the system is proposed in order to ease the introduction of modifications or the extension of its functionality. Commonly used search engines do not offer such a feature. The second part of the article presents a pilot version of the WEB mining system, representing a simplified implementation of the previously presented concept. The implemented application was tested on the problem area of the foundry industry.	Andrzej Opaliński, Edward Nawarecki, Stanisława Kluska-Nawarecka

Agent-Based Simulations, Adaptive Algorithms and Solvers (ABS-AAS) Session 4

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M104

Chair: Aleksander Byrski

568	Agent-oriented Foraminifera Habitat Simulation [abstract] Abstract: An agent-oriented software solution for simulation of marine unicellular organisms called foraminifera is presented. Their simplified microhabitat interactions are described and implemented to run the model and verify its flexibility. This group of well fossilizable protists has been selected due to its excellent "in fossilio" record that should help to verify our future long-run evolutionary results. The introduced system is built using the PyAge platform and is based on easily exchangeable components that may be replaced (also at runtime). Selected experiments considering substantial and technological efficiency were conducted and the obtained results are presented and discussed.	Maciej Kazirod, Wojciech Korczynski, Elias Fernandez, Aleksander Byrski, Marek Kisiel-Dorohinicki, Paweł Topa, Jaroslaw Tyszka, Maciej Komosinski
432	Comparison of the structure of equation systems and the GPU multifrontal solver for finite difference, collocation and finite element method [abstract] Abstract: The article is an in-depth comparison of the solving process of the equation systems specific to finite difference, collocation and finite element methods. The paper considers recently developed isogeometric versions of the collocation and finite element methods, employing B-splines for the computations and ensuring C^{p-1} continuity on the borders of elements for B-splines of order p. For solving the systems, we use our GPU implementation of the state-of-the-art parallel multifrontal solver, which leverages modern GPU architectures and allows the complexity to be reduced. We analyze the structures of the linear equation systems resulting from each of the methods and how different matrix structures lead to different multifrontal solver elimination trees. We also consider the flows of the multifrontal solver depending on the originally employed method.	Pawel Lipski, Maciej Wozniak, Maciej Paszynski

Sixth Workshop on Data Mining in Earth System Science (DMESS) Session 1

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M209

Chair: Jay Larson

739	Data Mining in Earth System Science (DMESS 2015) [abstract] Abstract: Spanning many orders of magnitude in time and space scales, Earth science data are increasingly large and complex and often represent very long time series, making such data difficult to analyze, visualize, interpret, and understand. Moreover, advanced electronic data storage technologies have enabled the creation of large repositories of observational data, while modern high performance computing capacity has enabled the creation of detailed empirical and process-based models that produce copious output across all these time and space scales. The resulting “explosion” of heterogeneous, multi-disciplinary Earth science data has rendered traditional means of integration and analysis ineffective, necessitating the application of new analysis methods and the development of highly scalable software tools for synthesis, assimilation, comparison, and visualization. This workshop explores various data mining approaches to understanding Earth science processes, emphasizing the unique technological challenges associated with utilizing very large and long time series geospatial data sets. Especially encouraged are original research papers describing applications of statistical and data mining methods (including cluster analysis, empirical orthogonal functions (EOFs), genetic algorithms, neural networks, automated data assimilation, and other machine learning techniques) that support analysis and discovery in climate, water resources, geology, ecology, and environmental sciences research.	Forrest M. Hoffman, Jitendra Kumar and Jay Larson
312	Pattern-Based Regionalization of Large Geospatial Datasets Using COBIA [abstract] Abstract: Pattern-based regionalization -- spatial classification of an image into sub-regions characterized by relatively stationary patterns of pixel values -- is of significant interest for conservation and planning, as well as for academic research. A technique called complex object-based image analysis (COBIA) is particularly well suited for pattern-based regionalization of very large spatial datasets. In COBIA, an image is subdivided into a regular grid of local blocks of pixels (complex objects) at minimal computational cost. Further analysis is performed on those blocks, which represent local patterns of a pixel-based variable. The variant of COBIA presented here works on pixel-classified images, uses a histogram of co-occurrence pattern features as the block attribute, and utilizes the Jensen-Shannon divergence to measure the distance between any two local patterns. In this paper the COBIA concept is utilized for unsupervised regionalization of a land cover dataset (pixel-classified Landsat images) into landscape types -- characteristic patterns of different land covers. This exploratory technique identifies and delineates landscape types using a combination of segmentation of a grid of local patterns with clustering of the segments. A test site with 3.5 x 10^8 pixels is regionalized in just a few minutes using a standard desktop computer. The computational efficiency of the presented approach allows for carrying out regionalizations of various high resolution spatial datasets on continental or global scales.	Tomasz Stepinski, Jacek Niesterowicz, Jaroslaw Jasiewicz
720	Fidelity of Precipitation Extremes in High Resolution Global Climate Simulations [abstract] Abstract: Precipitation extremes have tangible societal impacts. Here, we assess whether current state-of-the-art global climate model simulations at high spatial resolutions capture the observed behavior of precipitation extremes in the past few decades over the continental US. We design a correlation-based regionalization framework to quantify precipitation extremes, where samples of extreme events for a grid box may also be drawn from neighboring grid boxes with statistically equal means and statistically significant temporal correlations. We model precipitation extremes with the Generalized Extreme Value (GEV) distribution fits to time series of annual maximum precipitation. Non-stationarity of extremes is captured by including a time-dependent parameter in the GEV distribution. Our analysis reveals that the high-resolution model substantially improves the simulation of stationary precipitation extreme statistics, particularly over the Northwest Pacific coastal region and the Southeast US. Observational data exhibit significant non-stationary behavior of extremes only over some parts of the Western US, with declining trends in the extremes. While the high resolution simulations improve upon the low resolution model in simulating this non-stationary behavior, the trends are statistically significant only over some of those regions.	Salil Mahajan, Katherine Evans, Marcia Branstetter, Valentine Anantharaj, Juliann Leifeld
729	On Parallel and Scalable Classification and Clustering Techniques for Earth Science Datasets [abstract] Abstract: One observation about earth science data is their massive increase in volume (e.g. higher quality measurements) or the emerging high number of dimensions (e.g. hyperspectral bands in satellite observations). Traditional data mining tools (R, Matlab, etc.) are becoming partly infeasible to use with those datasets. Parallel and scalable techniques bear the potential to overcome these limits, while our analysis revealed that a wide variety of new implementations are not all suited for data mining tasks in earth science. This contribution gives reasons by focusing on two distinct parallel and scalable data mining techniques used in High Performance Computing (HPC) environments in earth science case studies: (a) Parallel Density-based Spatial Clustering of Applications with Noise (DBSCAN) for automated outlier detection in time series data and (b) parallel classification using multi-class Support Vector Machines (SVMs) for land cover identification in multi-spectral satellite datasets. In the paper we also compare recent ‘big data stacks’ vs. traditional HPC techniques.	Markus Götz, Matthias Richerzhagen, Gabriele Cavallaro, Christian Bodenstein, Philipp Glock, Morris Riedel, Jon Atli Benediktsson
322	Completion of a sparse GLIDER database using multi-iterative Self-Organizing Maps (ITCOMP SOM) [abstract] Abstract: We present a novel approach named ITCOMP SOM that uses iterative self-organizing maps (SOM) to progressively reconstruct missing data in a highly correlated multidimensional dataset. This method was applied for the completion of a complex oceanographic dataset containing glider data from the EYE of the Levantine experiment of the EGO project. ITCOMP SOM provided reconstructed temperature and salinity profiles that are consistent with the physics of the phenomenon they sampled. A cross-validation test was performed and validated the approach, providing a root mean square error of 0.042 °C for the reconstruction of the temperature profiles and 0.008 PSU for the simultaneous reconstruction of the salinity profiles.	Anastase-Alexander Charantonis, Pierre Testor, Laurent Mortier, Fabrizio D'Ortenzio, Sylvie Thiria
698	A Feature-first Approach to Clustering for Highlighting Regions of Interest in Scientific Data [abstract] Abstract: We present a simple clustering algorithm that classifies the points of a dataset by a combination of scalar variables' values as well as spatial locations. How heavily the spatial locations impact the algorithm is a tunable parameter. With no impact the algorithm bins the data by calculating a histogram and classifies each point by a bin ID. With full impact, points are bunched together with their neighbors regardless of value. This approach is unsurprisingly very sensitive to this weighting; a sampling of possible values yields a wide variety of classifications. However, we have found that when tuned just right it is indeed possible to extract meaningful features from the resulting clustering. Furthermore, the principles behind our development of this technique are applicable both in tuning the algorithm and in selecting data regions. In this paper we will provide the details of design and implementation and demonstrate using the auto-tuned approach to extract interesting regions of real scientific data. Our target application is data derived from NASA’s Moderate Resolution Imaging Spectroradiometer (MODIS) sensors.	Robert Sisneros
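
Note on abstract 312 above: the Jensen-Shannon divergence used there to compare two local patterns can be illustrated with a short sketch. This is only a minimal example under stated assumptions, not the authors' implementation: the function name is hypothetical, the inputs are assumed to be two non-negative histograms of equal length, and base-2 logarithms are assumed so that the result lies in [0, 1].

```python
import math


def jensen_shannon_divergence(hist_p, hist_q):
    """Jensen-Shannon divergence (base-2 logs) between two histograms.

    Hypothetical helper for illustration; inputs are non-negative counts
    of equal length with a positive total.
    """
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(hist_p), sum(hist_q)
    # Normalise the counts to probability distributions.
    p = [x / total_p for x in hist_p]
    q = [x / total_q for x in hist_q]
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; bins with zero mass contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical histograms give a divergence of 0 and histograms with no overlapping bins give 1, which is what makes the measure convenient as a bounded, symmetric dissimilarity between blocks.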

DDDAS-Dynamic Data Driven Applications Systems and Large-Scale-Big-Data & Large-Scale-Big-Computing (DDDAS-LS) Session 1

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M105

Chair: Frederica Darema

758	DDDAS, a key driver for Large-Scale-Big-Data and Large-Scale-Big-Computing [abstract] Abstract: This talk will provide an overview of future directions in Big Data and Big Computing, as driven by the DDDAS paradigm. In DDDAS, the computation and instrumentation aspects of an application system are integrated in a dynamic feed-back control loop. Thus, by its inception DDDAS is a driver for application support environments where the computational platforms span the diverse range of high-end and mid-range computing and including the instrumentation platforms, stationary and mobile networked sensors, and end-user devices. Commensurately, the data involved in DDDAS environments span data associated with complex computer models of application systems to instrumentation-data – either collected from large instruments or from the multitudes of heterogeneous mobile and stationary ubiquitous sensing devices, and including end-user devices. Data from ubiquitous sensing constitute the next wave of Big Data. These collections of ubiquitous and heterogeneous sensing devices not only are sources of large volumes of heterogeneous sets of data, but also the amount of computing that is performed collectively on the multitudes of instrumentation/sensor platforms amounts to significant computational power which should be viewed in tandem with that performed in the high-end platforms. In DDDAS environments, this range of platforms - from the high-end, to the instrumentation and end-user platforms - constitute the dynamically integrated, unified platform referred to here as Large-Scale-Big-Computing (LSBC); the diverse sets of data - from high-end computing data to data from large sets of heterogeneous sensing are referred to as Large-Scale-Big-Data (LSBD). There are challenges and opportunities in supporting and exploiting Large-Scale-Big-Computing and Large-Scale-Big-Data. 
DDDAS has been applied and is creating new capabilities in many application areas spanning systems from the nanoscale to the terra-scale and the extraterra-scale, and covering a multitude of domains such as: physical, chemical, biological systems (e.g.: engineered materials, protein folding, bionetworks); engineered systems (e.g.: structural health monitoring, decision support and environment cognizant operation); surveillance, co-operative sensing, autonomic coordination, cognition; energy efficient operations; medical and health systems (e.g.: MRI imaging, seizure control); ecological and environmental systems (e.g.: earthquakes, hurricanes, tornados, wildfires, volcanic eruptions and ash transport, chemical pollution); fault tolerant critical infrastructure systems (e.g.: electric-powergrids, transportation systems); manufacturing processes planning and control; space weather and adverse atmospheric events; cybersecurity; systems software. DDDAS is not only a driver for LSBC and LSBD environments but also a methodology to efficiently manage and exploit these large-scale heterogeneous resources, aspects which will be addressed in the talk; additional examples of DDDAS-based new capabilities for such applications are provided in other papers in this workshop.	Frederica Darema
214	Dynamic Data-driven Deformable Reduced Models for Coherent Fluids [abstract] Abstract: In autonomous mapping of geophysical fluids, a DDDAS framework involves reduced models constructed offline for online use. Here we show that classical model reduction is ill-suited to deal with model errors manifest in coherent fluids as feature errors including position, scale, shape or other deformations. New fluid representations are required. We propose augmenting amplitude vector spaces by non-parametric deformation vector fields which enables the synthesis of new Principal Appearance and Geometry modes, Coherent Random Field expansions, and an Adaptive Reduced Order Model by Alignment (AROMA) framework. AROMA dynamically deforms reduced models in response to feature errors. It provides robustness and efficiency in inference by unifying perceptual and physical representations of coherent fluids that to the best of our knowledge has not hitherto been proposed.	Sai Ravela
449	Parallel solution of DDDAS variational inference problems [abstract] Abstract: Inference problems in dynamically data-driven application systems use physical measurements along with a physical model to estimate the parameters or state of a physical system. Developing parallel algorithms to solve inference problems can improve the process of estimating and predicting the physical state of a system. Solving inference problems using the variational approach requires multiple evaluations of the associated cost function and gradient, where the gradient is defined as the increase/decrease inflection point of the variable between two points. In this paper we present a scalable algorithm based on the augmented Lagrangian approach to solve the variational inference problem. The augmented Lagrangian framework facilitates parallel cost function and gradient computations. We show that the methodology is highly scalable with increasing problem size by applying it to the Lorenz-96 model.	Vishwas Hebbur Venkata Subba Rao, Adrian Sandu
540	Security and Privacy Dimensions in Next Generation DDDAS/Infosymbiotic Systems: A Position Paper [abstract] Abstract: The omnipresent pervasiveness of personal devices will expand the applicability of the DDDAS paradigm in innumerable ways. While every single smartphone or wearable device is potentially a sensor with powerful computing and data capabilities, privacy and security in the context of human participants must be addressed to leverage the infinite possibilities of dynamic data driven application systems. We propose a security and privacy preserving framework for next generation systems that harness the full power of the DDDAS paradigm while (1) ensuring provable privacy guarantees for sensitive data; (2) enabling field-level, intermediate, and central hierarchical feedback-driven analysis for both data volume mitigation and security; and (3) intrinsically addressing uncertainty caused either by measurement error or security-driven data perturbation. These thrusts will form the foundation for secure and private deployments of large scale hybrid participant-sensor DDDAS systems of the future.	Li Xiong, Vaidy Sunderam
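
Note on abstract 449 above: the Lorenz-96 model used there as a test problem is the system dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F with cyclic indices. The sketch below shows its right-hand side and one classical Runge-Kutta step. It is an illustration only: the function names, the default forcing F = 8 (a commonly used value) and the choice of integrator are assumptions, and nothing here reproduces the augmented Lagrangian algorithm of the paper.

```python
def lorenz96_rhs(x, forcing=8.0):
    """Right-hand side of the Lorenz-96 model with cyclic boundary conditions.

    `forcing` is the constant F; the default of 8.0 is an assumed, commonly
    used value, not one taken from the abstract.
    """
    n = len(x)
    # Negative indices wrap around in Python, which gives the cyclic
    # neighbours x[i-1] and x[i-2]; the modulo handles x[i+1].
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
            for i in range(n)]


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k1)], forcing)
    k3 = lorenz96_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k2)], forcing)
    k4 = lorenz96_rhs([xi + dt * k for xi, k in zip(x, k3)], forcing)
    return [xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
```

The uniform state x_i = F is an equilibrium of the model, which gives a quick sanity check for any implementation; the state dimension can be increased freely, which is why the model is popular for scalability studies.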

DDDAS-Dynamic Data Driven Applications Systems and Large-Scale-Big-Data & Large-Scale-Big-Computing (DDDAS-LS) Session 2

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M105

Chair: Frederica Darema

561	Spectral Validation of Measurements in a Vehicle Tracking DDDAS [abstract] Abstract: Vehicle tracking in adverse environments is a challenging problem because of the high number of factors constraining vehicle motion and the possibility of frequent occlusion. In such conditions, identification rates drop dramatically. Hyperspectral imaging is known to improve the robustness of target identification by recording extended data in many wavelengths. However, it is impossible to transmit data at such a high rate in real time with a conventional full hyperspectral sensor. Thus, we present a persistent ground-based target tracking system, taking advantage of a state-of-the-art, adaptive, multi-modal sensor controlled by Dynamic Data Driven Applications Systems (DDDAS) methodology. This overcomes the data challenge of hyperspectral tracking by only using spectral data as required. Spectral features are inserted in a feature matching algorithm to identify spectrally likely matches and simplify the multidimensional assignment algorithm. The sensor is tasked for spectra acquisition by the prior estimates from the Gaussian Sum Filter and the foreground mask generated by the background subtraction. Prior information matching the target features is used to tackle false negatives in the background subtraction output. The proposed feature-aided tracking system is evaluated in a challenging scene with a realistic vehicular simulation.	Burak Uzkent, Matthew J. Hoffman, Anthony Vodacek
567	Dynamic Data-Driven Application System (DDDAS) for Video Surveillance User Support [abstract] Abstract: Human-machine interaction mixed initiatives require a pragmatic coordination between different systems. Context understanding is established from the content, analysis, and guidance from query-based coordination between users and machines. Inspired by Level 5 Information Fusion ‘user refinement’, a live-video computing (LVC) structure is presented for user-based query access of a database management of information. Information access includes multimedia fusion of query-based text, images, and exploited tracks which can be utilized for context assessment, content-based information retrieval (CBIR), and situation awareness. In this paper, we explore new developments in dynamic data-driven application systems (DDDAS) of context analysis for user support. Using a common image processing data set, system-level time savings are demonstrated using a query-based approach in a context, control, and semantic-aware information fusion design.	Erik Blasch, Alex Aved
630	Multi-INT Query Language for DDDAS Designs [abstract] Abstract: Context understanding is established from the content, analysis, and guidance from query-based coordination between users and machines. In this manuscript, a live-video computing (LVC) approach is presented for access, comprehension and management of information for context assessment. Context assessment includes multimedia fusion of query-based text, images, and exploited tracks which can be utilized for image retrieval. In this paper, we explore the developments in database systems to enable context to be utilized in user-based queries for video tracking content extraction. Using a common image processing data set, we demonstrate activity analysis with context, privacy, and semantic awareness in a Dynamic Data-Driven Application System (DDDAS).	Alex Aved, Erik Blasch
683	A DDDAS Plume Monitoring System with Reduced Kalman Filter [abstract] Abstract: A new dynamic data-driven application system (DDDAS) is proposed in this article to dynamically estimate a concentration plume and to plan optimal paths for unmanned aerial vehicles (UAVs) equipped with environmental sensors. The proposed DDDAS dynamically incorporates measured data from UAVs into an environmental simulation while simultaneously steering measurement processes. The main idea is to employ a few time-evolving proper orthogonal decomposition (POD) modes to simulate a coupled linear system, and to simultaneously measure plume concentration and plume source distribution via a reduced Kalman filter. In order to maximize the information gain, UAVs are dynamically driven to hot spots chosen based on the POD modes using a greedy algorithm. We demonstrate the efficacy of the data assimilation and control strategies in a numerical simulation and a field test.	Liqian Peng, Matthew Silic, Kamran Mohseni
685	A Dynamic Data Driven Approach for Operation Planning of Microgrids [abstract] Abstract: Distributed generation resources (DGs) and their utilization in large-scale power systems are attracting more and more utilities as they are becoming more qualitatively reliable and economically viable. However, uncertainties in power generation from DGs and fluctuations in load demand must be considered when determining the optimal operation plan for a microgrid. In this context, a novel dynamic data driven approach is proposed for determining the real-time operation plan of an electric microgrid while considering its conflicting objectives. In particular, the proposed approach is equipped with three modules: 1) a database including the real-time microgrid topology data (i.e., power demand, market price for electricity, etc.) and the data for environmental factors (i.e., solar radiation, wind speed, temperature, etc.); 2) a simulation, in which operation of the microgrid is simulated with embedded rule-based scale identification procedures; 3) a multi-objective optimization module which finds the near-optimal operation plan in terms of minimum operating cost and minimum emission using a particle-filtering based algorithm. The complexity of the optimization depends on the scale of the problem identified from the simulation module. The results obtained from the optimization module are sent back to the microgrid system to enhance its operation. The experiments conducted in this study have demonstrated the power of the proposed approach in real-time assessment and control of operation in microgrids.	Xiaoran Shi, Haluk Damgacioglu, Nurcin Celik
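
Note on abstract 683 above: the reduced Kalman filter described there acts on a few POD modes, and its exact formulation is not given in the abstract. As a rough illustration of the underlying idea only, the textbook scalar Kalman measurement update is sketched below. The function name and the scalar setting are assumptions made for this example; the paper's filter is multivariate and coupled to a plume simulation.

```python
def kalman_update(estimate, variance, measurement, measurement_variance):
    """Textbook scalar Kalman measurement update (illustrative only).

    Blends a prior estimate with a noisy measurement, weighting each by
    its uncertainty, and returns the posterior estimate and variance.
    """
    # Kalman gain: how much to trust the measurement relative to the prior.
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance
```

With equal prior and measurement variances the update lands halfway between the prior and the measurement and halves the variance; a very noisy measurement leaves the prior almost unchanged.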

DDDAS-Dynamic Data Driven Applications Systems and Large-Scale-Big-Data & Large-Scale-Big-Computing (DDDAS-LS) Session 3

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M105

Chair: Frederica Darema

188	Detecting and adapting to parameter changes for reduced models of dynamic data-driven application systems [abstract] Abstract: We consider the task of dynamic capability estimation for an unmanned aerial vehicle, which is needed to provide the vehicle with the ability to dynamically and autonomously sense, plan, and act in real time. Our dynamic data-driven application systems framework employs reduced models to achieve rapid evaluation runtimes. Our reduced models must also adapt to underlying dynamic system changes, such as changes due to structural damage or degradation of the system. Our dynamic reduced models take into account changes in the underlying system by directly learning from the data provided by sensors, without requiring access to the original high-fidelity model. We present here an adaptivity indicator that detects a change in the underlying system and so allows the initiation of the dynamic reduced modeling adaptation if necessary. The adaptivity indicator monitors the error of the dynamic reduced model by comparing model predictions with sensor data, and signals a change if the error exceeds a given threshold. The indicator is demonstrated on a deflection model of a damaged plate in bending. Local damage of the plate is modeled by a change in the thickness of the plate. The numerical results show that in this example the adaptivity indicator detects all changes in the thickness and correctly initiates the adaptation of the reduced model.	Benjamin Peherstorfer, Karen Willcox
208	Multiobjective Design Optimization in the Lightweight Dataflow for DDDAS Environment (LiD4E) [abstract] Abstract: In this paper, we introduce new methods for multiobjective, system-level optimization that have been incorporated into the Lightweight Dataflow for Dynamic Data Driven Application Systems (DDDAS) Environment (LiD4E). LiD4E is a design tool for optimized implementation of dynamic, data-driven stream mining systems using high-level dataflow models of computation. More specifically, we develop in this paper new methods for integrated modeling and optimization of real-time stream mining constraints, multidimensional stream mining performance (precision and recall), and energy efficiency. Using a design methodology centered on data-driven control of and coordination between alternative dataflow subsystems for stream mining (classification modes), we develop systematic methods for exploring complex, multidimensional design spaces associated with dynamic stream mining systems, and deriving sets of Pareto-optimal system configurations that can be switched among based on data characteristics and operating constraints.	Kishan Sudusinghe, Yang Jiao, Haifa Ben Salem, Mihaela van der Schaar, Shuvra Bhattacharyya
212	FreshBreeze: A Data Flow Approach for Meeting DDDAS Challenges [abstract] Abstract: The DDDAS paradigm, unifying applications, mathematical modeling, and sensors, is now more relevant than ever with the advent of Large-Scale/Big-Data and Big-Computing. Large-Scale-Dynamic-Data (advertised as the next wave of Big Data) includes the integrated range of data from high-end systems and instruments together with the dynamic data arising from ubiquitous sensing and control in engineered, natural, and societal systems. In this paper we present Fresh Breeze, a dataflow-based execution and programming model and computer architecture and how it provides a sound basis to develop future computing systems that match the DDDAS challenges. The DDDAS' computation patterns and data storage needs are well matched by the Fresh Breeze system's codelet-based execution model and memory-chunk-based memory model, as well as the proposed ISA level architecture features to be highlighted in this paper. We have extended and improved a previous generation of Fresh Breeze simulation platform to model a Fresh Breeze processing chip comprising up to 64 processing cores with an ISA with new features to address the issues of efficient symbiotic processing, and have completed a compiler tool chain from an adapted version of the Java source language to machine-level codelets for the simulator. We have evaluated our current implementation on several standard kernels from linear algebra for which near-linear speedup versus the number of cores is achieved without manual parallelization or scale-specific performance tuning. These test kernels show effectiveness of the fine-grain task scheduling and load balancing features essential to achieving the best performance for DDDAS. It is expected that once planned support of stream computation and transaction processing is checked out, it will be possible to demonstrate superior performance for application codes of DDDAS.	Xiaoming Li, Jack Dennis, Guang Gao, Willie Lim, Haitao Wei, Chao Yang, Robert Pavel
221	Dynamic Data Driven Sensor Network Selection and Tracking [abstract] Abstract: The deployment of networks of sensors and development of pertinent information processing techniques can facilitate the requirement of situational awareness present in many defense/surveillance systems. Sensors allow the collection and distributed processing of information in a variety of environments whose structure is not known and is dynamically changing with time. A distributed dynamic data driven (DDDAS-based) framework is developed in this paper to address distributed multi-threat tracking under limited sensor resources. The acquired sensor data will be used to control the sensing part of the sensor network, and utilize only the sensing devices that acquire good quality measurements about the present targets. The DDDAS-based concept will be utilized to enable efficient sensor activation of only those parts of the network located close to a target/object. A novel combination of stochastic filtering techniques, drift homotopy and sparsity-inducing canonical correlation analysis (S-CCA) is utilized to dynamically identify the target-informative sensors and utilize them to perform improved drift-based particle filtering techniques that will allow robust, stable and accurate distributed tracking of multiple objects. Numerical tests demonstrate the effectiveness of the novel framework.	Ioannis Schizas, Vasileios Maroulas
408	A Framework for Migrating Relational Datasets to NoSQL [abstract] Abstract: In software development, migration from one Data Base Management System (DBMS) to another, especially one with distinct characteristics, is a challenge for programmers and database administrators. The changes in the application code needed to comply with the new DBMS are usually vast, making migrations infeasible. In order to tackle this problem, we present NoSQLayer, a framework capable of conveniently supporting migration from a relational (i.e., MySQL) to a NoSQL DBMS (i.e., MongoDB). This framework is presented in two parts: (1) a migration module; and (2) a mapping module. The first is a set of methods enabling seamless migration between DBMSs (i.e. MySQL to MongoDB). The latter provides a persistence layer to process database requests, being capable of translating and executing these requests in any DBMS and returning the data in a suitable format as well. Experiments show NoSQLayer to be a handy solution suitable for handling large volumes of data (e.g., Web scale) for which a traditional relational DBMS might be inadequate.	Leonardo Rocha, Fernando Vale, Élder Cirilo, Dárlinton B. F. Carvalho, Fernando Mourão
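
Note on abstract 188 above: the adaptivity indicator described there compares reduced-model predictions with sensor data and signals a change when the error exceeds a given threshold. A minimal sketch of that decision rule follows. The function name, the use of the maximum absolute error as the error measure, and the example numbers are assumptions made for illustration; the abstract does not specify which error norm the authors use.

```python
def adaptivity_indicator(predictions, sensor_data, threshold):
    """Return True if the model error exceeds `threshold`.

    Hypothetical helper: the error is taken here as the largest absolute
    difference between a prediction and the matching sensor reading.
    """
    if len(predictions) != len(sensor_data):
        raise ValueError("predictions and sensor data must align")
    error = max(abs(p - s) for p, s in zip(predictions, sensor_data))
    return error > threshold


# Example with made-up deflection values: the second reading drifts away
# from the prediction, so a change is signalled for a threshold of 0.05.
change_detected = adaptivity_indicator([0.10, 0.20, 0.30],
                                       [0.11, 0.31, 0.30],
                                       threshold=0.05)
```

A signalled change would then trigger the adaptation of the reduced model from sensor data; choosing the threshold trades off false alarms from sensor noise against delayed detection.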

DDDAS-Dynamic Data Driven Applications Systems and Large-Scale-Big-Data & Large-Scale-Big-Computing (DDDAS-LS) Session 4

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: M105

Chair: Frederica Darema

470	Bayesian Computational Sensor Networks: Small-Scale Structural Health Monitoring [abstract] Abstract: The Bayesian Computational Sensor Network methodology is applied to small-scale structural health monitoring. A mobile robot equipped with vision and ultrasound sensors maps small-scale structures for damage and localizes itself and the damage in the map. The combination of vision and ultrasound reduces the uncertainty in damage localization. Data storage and analysis exploit cloud computing mechanisms, and there is also an off-line computational model calibration component which returns information to the robot concerning updated on-board models as well as proposed sampling points. The approach is validated in a set of physical experiments.	Wenyi Wang, Anshul Joshi, Nishith Tirpankar, Philip Erickson, Michael Cline, Palani Thangaraj, Tom Henderson
482	Highly Parallel Algorithm for Large Data In Core and Out Core Triangulation in E2 and E3 [abstract] Abstract: A triangulation of points in E^2, or a tetrahedronization of points in E^3, is used in many applications. It is not necessary to fulfill the Delaunay criteria in all cases. For large data (more than 5·10^7 points), parallel methods are used for the purpose of decreasing run time. A new approach for fast, effective and highly parallel CPU and GPU triangulation, or tetrahedronization, of large data sets in E^2 or E^3, suitable for in core and out core memory processing, is proposed. Experimental results showed that the resulting triangulation/tetrahedralization is close to the Delaunay triangulation/tetrahedralization. They also demonstrate the applicability of the proposed method in applications.	Michal Smolik, Vaclav Skala
672	Resilient and Trustworthy Dynamic Data-Driven Application Systems for Crisis Environments [abstract] Abstract: Future cyber information systems are required to determine network performance including trust, resiliency, and timeliness. Using Dynamic Data-Driven Application Systems (DDDAS) concepts, we develop a method for crisis management that incorporates sensed data, performance models, theoretical analysis, and service-based software. Using constructs from security and resiliency theories, the motivating concept is Resilient-DDDAS-as-a-Cloud Service (rDaaS). Service-based approaches allow a system to react as needed to the dynamics of the situation. The Resilient Cloud Middleware supports the analysis of the data stored and retrieved in the cloud, management of processes, and coordination with the end user/application. The rDaaS concept is demonstrated with a nuclear plant emergency response example that shows the importance of DDDAS system-level performance.	Youakim Badr, Salim Hariri, Erik Blasch
216	Efficient Execution of Replicated Transportation Simulations with Uncertain Vehicle Trajectories [abstract] Abstract: Many Dynamic Data-Driven Application Systems (DDDAS) use replicated simulations to project possible future system states. In many cases there are substantial similarities among these different replications. In other cases output statistics are independent of certain simulation computations. This paper explores computational methods to exploit these properties to improve efficiency. We discuss a new algorithm to speed up the execution of replicated vehicle traffic simulations, where the output statistics of interest focus on one or more attributes such as the trajectory of a certain “target” vehicle. By focusing on correctly reproducing the behavior of the target vehicle and its interaction with other modeled entities across the different replications and modifying the event handling mechanism the execution time can be reduced on both serial and parallel machines. A speculative execution method using a tagging mechanism allows this speedup to occur without loss of accuracy in the output statistics.	Philip Pecher, Michael Hunter, Richard Fujimoto
613	Adapting Stream Processing Framework for Video Analysis [abstract] Abstract: Stream processing (SP) became relevant mainly due to inexpensive and hence ubiquitous deployment of sensors in many domains (e.g., environmental monitoring, battlefield monitoring). Other continuous data generators (web clicks, traffic data, network packets, mobile devices) have also prompted processing and analysis of these streams for applications such as traffic congestion/accidents, network intrusion detection, and personalized marketing. Image processing has been researched for several decades. Recently there has been an emphasis on video stream analysis for situation monitoring due to the ubiquitous deployment of video cameras and unmanned aerial vehicles for security and other applications. This paper elaborates on the research and development issues that need to be addressed to extend the traditional stream processing framework for video analysis, especially for situation awareness. This entails extensions to: data model, operators and language for expressing complex situations, QoS specifications and algorithms needed for their satisfaction. Specifically, this paper demonstrates the inadequacy of current data representation (e.g., relation and arrable) and querying capabilities in order to infer long-term research and development issues.	S Chakravarthy, A Aved, S Shirvani, M Annappa, E Blasch

Dynamic Data Driven Applications Systems (DDDAS) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M105

Chair: Craig Douglas

215	Ensemble Learning for Dynamic Data Assimilation [abstract] Abstract: The organization of an ensemble of initial perturbations by a nonlinear dynamical system can produce highly non-Gaussian patterns, evidence of which is clearly observed in position-amplitude-scale features of coherent fluids. The true distribution of the ensemble is unknown, in part because models are in error and imperfect. A variety of distributions have been proposed in the context of Bayesian inference, including for example, mixture and kernel models. We contend that seeking posterior modes in non-Gaussian inference is fraught with heightened sensitivity to model error and demonstrate this fact by showing that a large component of the total variance remains unaccounted for as more modes emerge. Further, we show that in the presence of bias, this unaccounted variance slows convergence and produces distributions with lower information that require extensive auxiliary clean up procedures such as resampling. These procedures are difficult in large-scale problems where ensemble members may be generated through myriad schemes. We show that by treating the estimation problem entailed as a regression machine, multiple objectives can be incorporated in inference. The relative importance of these objectives can morph over time and can be dynamically adjusted by the data. In particular, we show that both variance reduction and nonlinear modes can be targeted using a stacked cascade generalization. We demonstrate this approach by constructing a new sequential filter called the Boosted Mixture Ensemble Filter and illustrating this on a Lorenz system.	Sai Ravela
504	A Method for Estimating Volcanic Hazards [abstract] Abstract: This paper presents one approach to determining the hazard threat to a locale due to a large volcanic avalanche. The methodology employed includes large-scale numerical simulations, field data reporting the volume and runout of flow events, and a detailed statistical analysis of uncertainties in the modeling and data. The probability of a catastrophic event impacting a locale is calculated, together with an estimate of the uncertainty in that calculation. By a careful use of simulations, a hazard map for an entire region can be determined. The calculation can be turned around quickly, and the methodology can be applied to other hazard scenarios.	E Bruce Pitman and Abani Patra
55	Forecasting Volcanic Plume Hazards With Fast UQ [abstract] Abstract: This paper introduces a numerically-stable multiscale scheme to efficiently generate probabilistic hazard maps for volcanic ash transport using models of transport, dispersion and wind. The scheme relies on graph-based algorithms and low-rank approximations of the adjacency matrix of the graph. This procedure involves representing both the parameter space and physical space by a weighted graph. A combination of clustering and low rank approximation is then used to create a good approximation of the original graph. By performing a multiscale data sampling, a well-conditioned basis of a low rank Gaussian kernel matrix, is identified and used for out-of-sample extensions used in generating the hazard maps.	Ramona Stefanescu, Abani Patra, M. I Bursik, E Bruce Pitman, Peter Webley, Matthew D. Jones
45	Forest fire propagation prediction based on overlapping DDDAS forecasts [abstract] Abstract: Forest fires cause widespread devastation throughout the world every year. A good prediction of fire behavior can help in the coordination and management of human and material resources in the extinction of these emergencies. Given the high uncertainty of fire behavior and the difficulty of extracting the information required to generate accurate predictions, a system able to adapt to fire dynamics while considering the uncertainty of the data is necessary. In this work two different systems based on Dynamic Data Driven Application are applied and a new probabilistic method based on the combination of both approaches is presented. This new method uses the computational power provided by high performance computing systems to adapt to the changes in this kind of dynamic environment.	Tomás Artés, Adrián Cardil, Ana Cortés, Tomàs Margalef, Domingo Molina, Lucas Pelegrín, Joaquín Ramírez
533	Towards an Integrated Cyberinfrastructure for Scalable Data-Driven Monitoring, Dynamic Prediction and Resilience of Wildfires [abstract] Abstract: Wildfires are critical for ecosystems in many geographical regions. However, our current urbanized existence in these environments is inducing this ecological balance to evolve into a different dynamic leading to the biggest fires in history. Wildfire wind speeds and directions change in an instant, and first responders can only be effective if they take action as quickly as the conditions change. What is lacking in disaster management today is a system integration of real-time sensor networks, satellite imagery, near-real time data management tools, wildfire simulation tools, and connectivity to emergency command centers before, during and after a wildfire. As a first time example of such an integrated system, the WIFIRE project is building an end-to-end cyberinfrastructure for real-time and data-driven simulation, prediction and visualization of wildfire behavior. This paper summarizes the approach and early results of the WIFIRE project to integrate networked observations, e.g., heterogeneous satellite data and real-time remote sensor data with computational techniques in signal processing, visualization, modeling and data assimilation to provide a scalable, technological, and educational solution to monitor weather patterns to predict a wildfire’s Rate of Spread.	Ilkay Altintas, Jessica Block, Raymond de Callafon, Daniel Crawl, Charles Cowart, Amarnath Gupta, Mai H. Nguyen, Hans-Werner Braun, Jurgen Schulze, Michael Gollner, Arnaud Trouve, Larry Smarr

Dynamic Data Driven Applications Systems (DDDAS) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M105

Chair: Craig Douglas

689	Dynamic Data Driven Approach for Modeling Human Error [abstract] Abstract: Mitigating human errors is a priority in the design of complex systems, especially through the use of body area networks. This paper describes early developments of a dynamic data driven platform to predict operator error and trigger appropriate intervention before the error happens. Using a two-stage process, data was collected using several sensors (e.g. electroencephalography, pupil dilation measures, and skin conductance) during an established protocol - the Stroop test. The experimental design began with a relaxation period, 40 questions (congruent, then incongruent) without a timer, a rest period followed by another two rounds of questions, but under increased time pressure. Measures such as workload and engagement showed responses consistent with what is known for Stroop tests. Dynamic system analysis methods were then used to analyze the raw data through principal components analysis and least squares complex exponential method. The results show that this algorithm has the potential to capture mental states in a mathematical fashion, thus enabling the possibility of prediction.	Wan-Lin Hu, Janette Meyer, Zhaosen Wang, Tahira Reid, Douglas Adams, Sunil Prabnakar, Alok Chaturvedi
526	Dynamic Execution of a Business Process via Web Service Selection and Orchestration [abstract] Abstract: Dynamic execution of a business process requires the selection and composition of multiple existing services regardless of their locations, platforms, execution speeds, etc. Thus web service selection appears as a challenging and elusive task especially when the service task has to be executed based on user requirements at the runtime. This paper presents our Semantic-Based Business Process Execution Engine (SBPEE) for the dynamic execution of business processes by the orchestration of various exposed web services. SBPEE is based on our designed Project Domain Ontology (PrjOnt) that captures user specifications and SWRL rules which classify the user specification into a specific category according to the business logic and requirements of an enterprise. Based on this classification of the user project and requirements, our semantic engine selects web services from the service repository for the dynamic execution of a business process. SBPEE matches functional requirements of a web service and required QoS attributes to identify the list of pertinent candidate services to fulfil the complex business process transactions. Finally, we present our case study on Create Order business process that aims at creating an order for the customer by following various web services for its task completion.	Muhammad Fahad, Nejib Moalla, Yacine Ouzrout
719	Dynamic Data-Driven Avionics Systems: Inferring Failure Modes from Data Streams [abstract] Abstract: Dynamic Data-Driven Avionics Systems (DDDAS) embody ideas from the Dynamic Data-Driven Application Systems paradigm by creating a data-driven feedback loop that analyzes spatio-temporal data streams coming from aircraft sensors and instruments, looks for errors in the data signaling potential failure modes, and corrects for erroneous data when possible. In case of emergency, DDDAS need to provide enough information about the failure to pilots to support their decision making in real-time. We have developed the PILOTS system, which supports data-error tolerant spatio-temporal stream processing, as an initial step to realize the concept of DDDAS. In this paper, we apply the PILOTS system to actual data from the Tuninter 1153 (TU1153) flight accident in August 2005, where the installation of an incorrect fuel sensor led to a fatal accident. The underweight condition suggesting an incorrect fuel indication for TU1153 is successfully detected with 100% accuracy during cruise flight phases. Adding logical redundancy to avionics through a dynamic data-driven approach can significantly improve the safety of flight.	Shigeru Imai, Alessandro Galli, Carlos A. Varela
71	OpenDBDDAS Toolkit: Secure MapReduce and Hadoop-like Systems [abstract] Abstract: The OpenDBDDAS Toolkit is a software framework to provide support for more easily creating and expanding dynamic big data-driven application systems (DBDDAS) that are common in environmental systems, many engineering applications, disaster management, traffic management, and manufacturing. In this paper, we describe key features needed to implement a secure MapReduce and Hadoop-like system for high performance clusters that guarantees a certain level of privacy of data from other concurrent users of the system. We also provide examples of a secure MapReduce prototype and compare it to another high performance MapReduce, MR-MPI.	Craig C. Douglas, Enrico Fabiano, Mookwon Seo, Xiaoban Wu

Applications of Matrix Computational Methods in the Analysis of Modern Data (MATRIX) Session 1

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: M209

Chair: Kourosh Modarresi

761	Matrix Completion via Fast Alternating Least Squares [abstract] Abstract: We develop a new scalable method for matrix completion via nuclear-norm regularization and alternating least squares. The algorithm has an EM flavor, which dramatically reduces the computational cost per iteration at the cost of more iterations. *joint work with Rahul Mazumder, Jason Lee and Reza Zadeh.	Trevor Hastie
93	Stable Autoencoding: A Flexible Framework for Regularized Low-Rank Matrix Estimation [abstract] Abstract: Low-rank matrix estimation plays a key role in many scientific and engineering tasks, including collaborative filtering and image denoising. Low-rank procedures are often motivated by the statistical model where we observe a noisy matrix drawn from some distribution with expectation assumed to have a low-rank representation; the statistical goal is then to recover the signal from the noisy data. Given this setup, we develop a framework for low-rank matrix estimation that allows us to transform noise models into regularization schemes via a simple parametric bootstrap. Effectively, our procedure seeks an autoencoding basis for the observed matrix that is robust with respect to the specified noise model. In the simplest case, with an isotropic noise model, our procedure is equivalent to a classical singular value shrinkage estimator. For non-isotropic noise models, however, our method does not reduce to singular value shrinkage, and instead yields new estimators that perform well in experiments. Moreover, by iterating our stable autoencoding scheme, we can automatically generate low-rank estimates without specifying the target rank as a tuning parameter.	Julie Josse, Stefan Wager
349	Finding Top UI/UX Design Talent on Adobe Behance [abstract] Abstract: The Behance social network allows professionals of diverse artistic disciplines to exhibit their work and connect amongst each other. We investigate the network properties of the UX/UI designer subgraph. Considering the subgraph is motivated by the idea that professionals in the same discipline are more likely to give a realistic assessment of a colleague's work. We therefore developed a metric to assess the influence and importance of a specific member of the community based on structural properties of the subgraph and additional measures of prestige. For that purpose, we identified appreciations as a useful measure to include in a weighted PageRank algorithm, as it adds a notion of perceived quality of the work in the artist's portfolio to the ranking, which is not contained in the structural information of the graph. With this weighted PageRank, we identified locations that have a high density of influential UX/UI designers.	Susanne Halstead, Daniel Serrano, Scott Proctor
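As a rough illustration of the weighted PageRank idea mentioned in abstract 349, the sketch below ranks nodes of a small directed graph by power iteration. It is not the authors' metric: treating the number of appreciations as the edge weight is an assumption made here, and the function name, the member names and the data are all hypothetical.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a weighted directed graph.

    edges maps (source, target) -> positive weight. Using appreciation
    counts as weights is an assumption of this sketch, not the paper's method.
    """
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        # Each node passes its rank along its edges in proportion to weight.
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical data: weight = appreciations given by the source to the target.
appreciations = {
    ("ana", "bo"): 5.0,
    ("bo", "ana"): 1.0,
    ("cy", "bo"): 3.0,
    ("cy", "ana"): 1.0,
}
ranks = weighted_pagerank(appreciations)
```

In this toy graph "cy" receives no appreciations and so ends up with only the teleportation share, while "bo" outranks "ana" because it receives the larger share of cy's weight.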
753	Graphs, Matrices, and the GraphBLAS: Seven Good Reasons [abstract] Abstract: The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a number of promising solutions to these challenges. The GraphBLAS standard (istc-bigdata.org/GraphBlas) is being developed to bring the potential of matrix based graph algorithms to the broadest possible audience. The GraphBLAS mathematically defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the GraphBLAS and describes how the GraphBLAS can be used to address many of the challenges associated with analysis of graphs.	Jeremy Kepner
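To make the matrix view of graph algorithms in abstract 753 concrete, here is a minimal sketch of breadth-first search written as repeated matrix-vector products over a Boolean (OR, AND) semiring. It uses plain Python lists and does not call any GraphBLAS library; the function name and the example graph are hypothetical.

```python
def bfs_levels(adjacency, source):
    """BFS levels via repeated Boolean matrix-vector products.

    adjacency[u][v] is True when there is an edge u -> v.
    Returns a list with the level of each vertex, or -1 if unreachable.
    """
    n = len(adjacency)
    levels = [-1] * n
    frontier = [False] * n
    frontier[source] = True
    level = 0
    while any(frontier):
        # Record the level of every vertex in the current frontier.
        for v in range(n):
            if frontier[v]:
                levels[v] = level
        # Next frontier = (A^T * frontier) over the (OR, AND) semiring,
        # masked so that already visited vertices are excluded.
        frontier = [
            levels[v] == -1
            and any(frontier[u] and adjacency[u][v] for u in range(n))
            for v in range(n)
        ]
        level += 1
    return levels


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3; vertex 4 is isolated.
n_vertices = 5
graph = [[False] * n_vertices for _ in range(n_vertices)]
for u, v in [(0, 1), (0, 2), (1, 3), (2, 3)]:
    graph[u][v] = True
levels = bfs_levels(graph, 0)
```

The masked product in the loop is the step a matrix-based graph library would execute as a single sparse operation.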

Applications of Matrix Computational Methods in the Analysis of Modern Data (MATRIX) Session 2

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: M209

Chair: Kourosh Modarresi

762	Anomaly Detection and Predictive Maintenance through Large Sparse Similarity Matrices and Causal Modeling [abstract] Abstract: We use large (100k x 100k) sparse similarity matrices of time series (sensor data) for anomaly detection and failure prediction. These similarity matrices, computed using the universal information distance based on Kolmogorov Complexity, are used to perform non-parametric unsupervised clustering with non-linear boundaries without complex and slow coordinate transformations of the raw data. Changes over time in the similarity matrix allow us to observe anomalous behavior of the system and predict failure of parts. This approach is well suited for big data with little prior domain knowledge. Once we have learned the basic dependency patterns from the data, we can use this in addition to domain knowledge to build a causal model that relates outcomes to inputs through hidden variables. Given a set of observed outcomes and their associated sensor data, we can build a probabilistic model of the underlying causal events that produced both the outcomes and the data. The parameters of this probabilistic model (conditional joint probabilities) are inferred by maximizing the likelihood of the observed historical data. When such a model has been inferred, we can use it to predict future outcomes based on observations by first identifying the most likely underlying causal events that produced the observations and hence the most likely resulting outcomes. Both approaches differ from building a direct correlational model between the data and outcomes because they utilize complex representations of the state of the system -- in the first case through the similarity matrix and in the second case through domain specific modeling. As a final step, these predictions of future outcomes are fed into an over-arching stochastic optimization for optimal scheduling of maintenance activities over either short or long term time horizons. 
We’ll show real world examples from utilities, aerospace and defense and video surveillance. * It is a joint work with Alan McCord and Anand Murugan	Paul Hofmann
692	Computation of Recommender System using Localized Regularization [abstract] Abstract: Targeting and Recommendation are major topics in ecommerce. The topic is treated as “Matrix Completion” in statistics. The main point is to compute the unknown (missing) values in the matrix data. This work is based on a different view of regularization, i.e., a localized regularization technique which leads to improvement in the estimation of the missing values.	Kourosh Modarresi
695	Unsupervised Feature Extraction using Singular Value Decomposition [abstract] Abstract: Though modern data often provides a massive amount of information, much of it might be redundant or useless (noise). Thus, it is important to recognize the most informative features of the data. This helps the analysis of the data by removing the consequences of high dimensionality, in addition to providing other advantages of lower dimensional data such as lower computational cost and a less complex model.	Kourosh Modarresi
384	Quantifying complementarity among strategies for influencers' detection on Twitter [abstract] Abstract: The so-called influencer, a person with the ability to persuade people, has an important role in information diffusion in social media environments. Indeed, influencers might dictate word-of-mouth and peer recommendation, impacting tasks such as recommendation, advertising, and brand evaluation, among others. Thus, a growing number of works aim to identify influencers by exploiting distinct information. Deciding on the best strategy for each domain, however, is a complex task due to the lack of consensus among these works. This paper presents a quantitative analysis of some of the main strategies for identifying influencers, aiming to help researchers with this decision. Besides determining semantic classes of strategies, based on the characteristics they exploit, we obtained through PCA an effective meta-learning process to linearly combine distinct strategies. As main implications, we highlight a better understanding of the selected strategies and a novel way to alleviate the difficulty of deciding which strategy researchers should adopt.	Alan Neves, Ramon Viera, Fernando Mourão, Leonardo Rocha
399	Fast Kernel Matrix Computation for Big Data Clustering [abstract] Abstract: Kernel k-Means is a basis for many state of the art global clustering approaches. When the number of samples grows too big, however, it is extremely time consuming to compute the entire kernel matrix and it is impossible to store it in the memory of a single computer. The algorithm of Approximate Kernel k-Means has been proposed, which works using only a small part of the kernel matrix. The computation of the kernel matrix, even a part of it, remains a significant bottleneck of the process. Some types of kernel, however, can be computed using matrix multiplication. Modern CPU architectures and computational optimization methods allow for very fast matrix multiplication, thus those types of kernel matrices can be computed much faster than others.	Nikolaos Tsapanos, Anastasios Tefas, Nikolaos Nikolaidis, Alexandros Iosifidis, Ioannis Pitas
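Abstract 399 notes that some kernels can be computed using matrix multiplication. The sketch below shows one well-known instance, assumed here for illustration only: a Gaussian (RBF) kernel built from the identity ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x·y, so the pairwise work reduces to a single product X Y^T. The choice of kernel, the function names and the data are not taken from the paper, and an optimized BLAS routine would stand in for the pure-Python product in practice.

```python
import math


def matmul_transpose(a, b):
    """Return the product a @ b^T for matrices stored as lists of rows."""
    return [
        [sum(x * y for x, y in zip(row_a, row_b)) for row_b in b]
        for row_a in a
    ]


def rbf_kernel_matrix(x, y, gamma):
    """Gaussian kernel K[i][j] = exp(-gamma * ||x_i - y_j||^2).

    All pairwise squared distances come from one matrix product plus
    the row norms, instead of an explicit loop over point pairs.
    """
    gram = matmul_transpose(x, y)  # x_i . y_j for every pair
    x_sq = [sum(v * v for v in row) for row in x]
    y_sq = [sum(v * v for v in row) for row in y]
    return [
        [
            # Clamp at zero to guard against tiny negative rounding errors.
            math.exp(-gamma * max(x_sq[i] + y_sq[j] - 2.0 * gram[i][j], 0.0))
            for j in range(len(y))
        ]
        for i in range(len(x))
    ]


# Hypothetical samples: two points at Euclidean distance 5.
points = [[0.0, 0.0], [3.0, 4.0]]
kernel = rbf_kernel_matrix(points, points, gamma=0.1)
```

Passing only a subset of the samples as `y` yields the corresponding block of columns, which is the kind of partial kernel matrix the abstract refers to.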

Large Scale Computational Physics (LSCP) Session 1

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: V102

Chair: Fukuko YUASA

757	Workshop on Large Scale Computational Physics - LSCP [abstract] Abstract: The LSCP workshop focuses on symbolic and numerical methods and simulations, algorithms and tools (software and hardware) for developing and running large-scale computations in physical sciences. Special attention goes to parallelism, scalability and high numerical precision. System architectures are also of interest as long as they are supporting physics related calculations, such as: massively parallel systems, GPUs, many-integrated-cores, distributed (cluster, grid/cloud) computing, and hybrid systems. Topics are chosen from areas including: theoretical physics (high energy physics, nuclear physics, astrophysics, cosmology, quantum physics, accelerator physics), plasma physics, condensed matter physics, chemical physics, molecular dynamics, bio-physical system modeling, material science/engineering, nanotechnology, fluid dynamics, complex and turbulent systems, and climate modeling.	Elise de Doncker, Fukuko Yuasa
96	The Particle Accelerator Simulation Code PyORBIT [abstract] Abstract: The particle accelerator simulation code PyORBIT is presented. The structure, implementation, history, parallel and simulation capabilities, and future development of the code are discussed. The PyORBIT code is a new implementation and extension of algorithms of the original ORBIT code that was developed for the Spallation Neutron Source accelerator at the Oak Ridge National Laboratory. The PyORBIT code has a two level structure. The upper level uses the Python programming language to control the flow of intensive calculations performed by the lower level code implemented in the C++ language. The parallel capabilities are based on MPI communications. The PyORBIT is an open source code accessible to the public through the Google Open Source Projects Hosting service.	Andrei Shishlo
115	Simulations of several finite-sized objects in plasma [abstract] Abstract: Interaction of plasma with finite-sized objects is one of the central problems in the physics of plasmas. Since object charging is often nonlinear and involved, it is advisable to address this problem with numerical simulations. First-principle simulations allow studying trajectories of charged plasma particles in self-consistent force fields. One such approach is the particle-in-cell (PIC) method, where the use of a spatial grid for the force calculation significantly reduces the computational complexity. Implementing finite-sized objects in PIC simulations is often a challenging task. In this work we present simulation results and discuss the numerical representation of objects in the DiP3D code, which enables studies of several independent objects in various plasma environments.	Wojciech Miloch
196	DiamondTorre GPU implementation algorithm of the RKDG solver for fluid dynamics and its using for the numerical simulation of the bubble-shock interaction problem [abstract] Abstract: In this paper the solver based upon the RKDG method for solving three-dimensional Euler equations of gas dynamics is considered. For the numerical scheme the GPU implementation algorithm called DiamondTorre is used, which helps to improve the performance speed of calculations. The problem of the interaction of a spherical bubble with a planar shock wave is considered in the three-dimensional setting. The obtained calculations are in agreement with the known results of experiments and numerical simulations. The calculation results are obtained with the use of the PC.	Boris Korneev, Vadim Levchenko
460	Optimal Temporal Blocking for Stencil Computation [abstract] Abstract: Temporal blocking is a class of algorithms which reduces the required memory bandwidth (B/F ratio) of a given stencil computation, by “blocking” multiple time steps. In this paper, we prove that a lower limit exists for the reduction of the B/F attainable by temporal blocking, under certain conditions. We introduce the PiTCH tiling, an example of temporal blocking method that achieves the optimal B/F ratio. We estimate the performance of PiTCH tiling for various stencil applications on several modern CPUs. We show that PiTCH tiling achieves 1.5 ∼ 2 times better B/F reduction in three-dimensional applications, compared to other temporal blocking schemes. We also show that PiTCH tiling can remove the bandwidth bottleneck from most of the stencil applications considered.	Takayuki Muranushi, Junichiro Makino

Large Scale Computational Physics (LSCP) Session 2

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: V102

Chair: Fukuko YUASA

684	A Case Study of CUDA FORTRAN and OpenACC for an Atmospheric Climate Kernel [abstract] Abstract: The porting of a key kernel in the tracer advection routines of the Community Atmosphere Model - Spectral Element (CAM-SE) to Graphics Processing Units (GPUs) using OpenACC is considered in comparison to an existing CUDA FORTRAN port. The development of the OpenACC kernel for GPUs was substantially simpler than that of the CUDA port. Also, OpenACC performance was about 1.5x slower than the optimized CUDA version. Particular focus is given to compiler maturity regarding OpenACC implementation for modern Fortran, and it is found that the Cray implementation is currently more mature than the PGI implementation. Still, for the case that ran successfully on PGI, the PGI OpenACC runtime was slightly faster than Cray. The results show encouraging performance for the OpenACC implementation compared to CUDA while also exposing some issues that may need to be resolved before the implementations are suitable for porting all of CAM-SE. Most notable are that GPU shared memory should be used by future OpenACC implementations and that derived type support should be expanded.	Matthew Norman, Jeffrey Larkin, Aaron Vose and Katherine Evans
585	OpenCL vs OpenACC: lessons from development of lattice QCD simulation code [abstract] Abstract: OpenCL and OpenACC are generic frameworks for heterogeneous programming using CPU and accelerator devices such as GPUs. They have contrasting features: the former explicitly controls devices through API functions, while the latter generates such procedures guided by directives inserted by the programmer. In this paper, we apply these two frameworks to a general-purpose code set for numerical simulations of lattice QCD, a computational approach to elementary particle physics based on the Monte Carlo method. The fermion matrix inversion, which is usually the most time-consuming part of lattice QCD simulations, is off-loaded to the accelerator devices. From the viewpoint of constructing reusable components based on object-oriented programming and also tuning the code to achieve high performance, we discuss the feasibility of these frameworks through practical implementations.	Hideo Matsufuru, Sinya Aoki, Tatsumi Aoyama, Kazuyuki Kanaya, Shinji Motoki, Yusuke Namekawa, Hidekatsu Nemura, Yusuke Taniguchi, Satoru Ueda, Naoya Ukita
515	Application of GRAPE9-MPX for high precision calculation in particle physics and performance results [abstract] Abstract: There are scientific applications which require calculations with high precision, such as Feynman loop integrals and orbital integrations. These calculations also need to be accelerated. We have been developing dedicated accelerator systems which consist of processing elements (PEs) for high precision arithmetic operations and a programming interface. GRAPE9-MPX is our latest system, with multiple Field Programmable Gate Array (FPGA) boards on which our developed PEs are implemented. We present the performance results for GRAPE9-MPX extended to have up to 16 FPGA boards for quadruple/hexuple/octuple-precision with some optimization. The achieved performance for a Feynman loop integral with 12 FPGA boards is 26.5 Gflops for quadruple precision. We also give an analytical consideration for the performance results.	Hiroshi Daisaka, Naohito Nakasato, Tadashi Ishikawa, Fukuko Yuasa
734	Adaptive Integration for 3-loop Feynman Diagrams with Massless Propagators [abstract] Abstract: We apply multivariate adaptive integration to problems arising from self-energy Feynman loop diagrams with massless internal lines. Results are obtained with the ParInt integration software package, which is layered over MPI (Message Passing Interface) and incorporates advanced parallel computation techniques such as load balancing among processes that may be distributed over a network of nodes. To solve the problems numerically we introduce a parameter r in a factor of the integrand function. Some problem categories allow setting r = 0; other cases require an extrapolation as r -> 0. Furthermore we apply extrapolation with respect to the dimensional regularization parameter by setting the dimension n = 4 - 2*eps and extrapolating as eps -> 0. Timing results show near optimal parallel speedups with ParInt for the problems at hand.	Elise de Doncker, Fukuko Yuasa, Omofolakunmi Olagbemi
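
The extrapolation step mentioned in the abstract above (evaluate at a decreasing sequence of parameter values, then extrapolate to zero) can be illustrated with a generic polynomial extrapolation. This is a minimal sketch using Neville's algorithm, not ParInt code and not necessarily the extrapolation method the authors use; the function names and the stand-in model are hypothetical.

```python
def extrapolate_to_zero(xs, ys):
    """Neville's algorithm: value at x = 0 of the polynomial through (xs[i], ys[i])."""
    table = list(ys)
    n = len(xs)
    for level in range(1, n):
        for i in range(n - level):
            # Combine two lower-order interpolants, both evaluated at x = 0.
            table[i] = (xs[i] * table[i + 1] - xs[i + level] * table[i]) / (
                xs[i] - xs[i + level]
            )
    return table[0]


def model(eps):
    """Hypothetical stand-in for a quantity computed at a finite parameter eps."""
    return 2.0 + 3.0 * eps + 5.0 * eps ** 2


eps_values = [0.4, 0.2, 0.1]
limit = extrapolate_to_zero(eps_values, [model(e) for e in eps_values])
print(limit)  # close to 2.0, the constant term of the stand-in model
```

Three sample points recover the limit of a quadratic stand-in up to rounding; for a real integrand the sequence of eps values and the extrapolation order would have to be chosen to match its actual asymptotic behaviour.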

Workshop on Nonstationary Models of Pattern Recognition and Classifier Combinations (NMRPC) Session 1

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M110

Chair: Michal Woźniak

480	An algebraic approach to combining classifiers [abstract] Abstract: In distributed classification, each learner observes its environment and deduces a classifier. As a learner has only a local view of its environment, classifiers can be exchanged among the learners and integrated, or merged, to improve accuracy. However, the operation of merging is not defined for most classifiers. Furthermore, the classifiers that have to be merged may be of different types in settings such as ad-hoc networks in which several generations of sensors may be creating classifiers. We introduce decision spaces as a framework for merging possibly different classifiers. We formally study the merging operation as an algebra, and prove that it satisfies a desirable set of properties. The impact of time is discussed for the two main data mining settings. Firstly, decision spaces can naturally be used with non-stationary distributions, such as the data collected by sensor networks, as the impact of a model decays over time. Secondly, we introduce an approach for stationary distributions, such as homogeneous databases partitioned over different learners, which ensures that all models have the same impact. We also present a method using storage flexibly to achieve different types of decay for non-stationary distributions.	Philippe Giabbanelli, Joseph Peters
36	Power LBP: A novel texture operator for smiling and neutral facial display classification [abstract] Abstract: Texture operators are commonly used to describe image content for many purposes. Recently they have found application in the task of emotion recognition, especially using the local binary pattern (LBP) method. This paper introduces a novel texture operator called power LBP, which defines a new ordering schema based on absolute intensity differences. Its definition as well as its interpretation are given. The performance of the suggested solution is evaluated on the problem of smiling and neutral facial display recognition. In order to evaluate the accuracy of the power LBP operator, its discriminative capacity is compared to several members of the LBP family. Moreover, the influence of the applied classification approach is also considered, by presenting results for k-nearest neighbour, support vector machine, and template matching classifiers. Furthermore, results for several databases are compared.	Bogdan Smolka, Karolina Nurzynska
657	Incremental Weighted One-Class Classifier for Mining Stationary Data Streams [abstract] Abstract: Data streams and big data analytics are among the most popular contemporary machine learning problems. More and more often, real-life problems generate massive and continuous amounts of data. Standard classifiers cannot cope with a large volume of the training set and/or the changing nature of the environment. In this paper, we deal with the problem of continuously arriving objects that with each time interval may contribute new, useful knowledge to the pattern classification system. This is known as stationary data stream mining. One-class classification is a very useful tool for stream analysis, as it can be used for tackling outliers, noise, appearance of new classes or imbalanced data, to name a few. We propose a novel version of the incremental One-Class Support Vector Machine that assigns weights to each object according to its level of significance. This allows training more robust one-class classifiers on incremental streams. We present two schemes for estimating weights for new, incoming data and examine their usefulness on a number of benchmark datasets. We also analyze the time and memory requirements of our method. Results of experimental investigations show that our method can achieve better one-class recognition quality than algorithms used so far.	Bartosz Krawczyk and Michal Wozniak
659	Wagging for Combining Weighted One-Class Support Vector Machines [abstract] Abstract: Most machine learning problems assume that we have at our disposal objects originating from two or more classes. By learning from a representative training set, a classifier is able to estimate proper decision boundaries. However, in many real-life problems obtaining objects from some of the classes is difficult, or even impossible. In such cases, we are dealing with one-class classification, or learning in the absence of counterexamples. Such recognition systems must display a high robustness to new, unseen objects that may belong to an unknown class. That is why ensemble learning has become an attractive perspective in this field. In our work, we propose a novel one-class ensemble classifier based on wagging. A weighted version of boosting is used, and the output weights for each object are used directly in the process of training Weighted One-Class Support Vector Machines. This introduces diversity into the pool of one-class classifiers and extends the competence of the formed ensemble. Experimental analysis, carried out on a number of benchmarks and backed up with statistical analysis, shows that the proposed method can outperform state-of-the-art ensembles dedicated to one-class classification.	Bartosz Krawczyk, Michal Wozniak

International Workshop on Computational Flow and Transport: Modeling, Simulations and Algorithms (CFT) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M201

Chair: Shuyu Sun

388	Statistical Inversion of Absolute Permeability in Single Phase Darcy Flow [abstract] Abstract: In this paper, we formulate the permeability inverse problem in the Bayesian framework using total variation (TV) and $\ell_p$ regularization prior. We use the Markov Chain Monte Carlo (MCMC) method for sampling the posterior distribution to solve the ill-posed inverse problem. We present simulations to estimate the distribution for each pixel for the image reconstruction of the absolute permeability.	Thilo Strauss, Xiaolin Fan, Shuyu Sun, Taufiquar Khan
32	An enhanced velocity multipoint flux mixed finite element method for Darcy flow on non-matching hexahedral grids [abstract] Abstract: This paper proposes a new enhanced velocity method to directly construct a flux-continuous velocity approximation with multipoint flux mixed finite element method on subdomains. This gives an efficient way to perform simulations on multiblock domains with non-matching hexahedral grids. We develop a reasonable assumption on geometry, discuss implementation issues, and give several numerical results with slightly compressible single phase flow.	Benjamin Ganis, Mary Wheeler, Ivan Yotov
124	A compact numerical implementation for solving Stokes equations using matrix-vector operations [abstract] Abstract: In this work, a numerical scheme is implemented to solve the Stokes equations based on a cell-centered finite difference method over a staggered grid. In this scheme, all the difference operations have been vectorized, thereby eliminating loops. This is particularly important when using interpreted programming languages, e.g., Matlab and Python. Using this scheme, the execution time becomes significantly smaller compared with non-vectorized operations and also becomes comparable with that of languages that require no repeated interpretation, like FORTRAN, C, etc. This technique has also been applied to the Navier-Stokes equations under laminar flow conditions.	Tao Zhang, Amgad Salama, Shuyu Sun, Hua Zhong
265	Numerical Models for the Simulation of Aeroacoustic Phenomena [abstract] Abstract: In the development of a numerical model for aeroacoustic problems, two main issues arise: which level of physical approximation to adopt and which numerical scheme is the most appropriate. It is possible to consider a hierarchy of physical approximations, ranging from the wave equation, without or with convective effects, to the linearized Euler and Navier-Stokes equations, as well as a wide range of high-order numerical schemes, ranging from compact finite difference schemes to the discontinuous Galerkin method (DGM) for unstructured grids. For problems in complex geometries, significant hydrodynamic-acoustic interactions, coupling acoustic waves and vortical modes, may occur. This happens, for example, in ducts with sudden changes of area, where flow separation occurs at sharp edges with a consequent generation of vorticity due to viscous effects. To correctly model this coupling, the Navier-Stokes equations, linearized with respect to a representative mean flow, must be solved. The formulation based on the Linearized Navier-Stokes (LNS) equations is suitable to deal with problems involving such hydrodynamic-acoustic interactions. The occurrence of geometrical complexities, such as sharp edges, where acoustic energy is transferred into the vortical modes through viscous effects, requires a highly accurate numerical scheme that not only has reduced dispersive properties, to accurately model the wave propagation, but also provides a very low level of numerical dissipation on unstructured grids. The DGM is the most appropriate numerical scheme satisfying these requirements. The objective of the present work is to develop an efficient numerical solution of the LNS equations, based on a DGM on unstructured grids. To our knowledge, there is only one work dealing with the solution of the LNS for aeroacoustics where the equations are solved in the frequency domain. 
In this work we develop the method in the time domain. The non-dispersive and non-diffusive nature of acoustic waves propagating over long distances forces us to adopt highly accurate numerical methods. DGM is one of the most promising schemes due to its intrinsic stability and its capability to treat unstructured grids. Both advantages make this method well suited for problems characterized by wave propagation phenomena in complex geometries. The main disadvantage of DGM is its high computational requirements, because of the discontinuous character of the method, which adds extra nodes on the interfaces between cells with respect to a standard continuous Galerkin Method (GM). Techniques for optimizing the DGM in the case of the Navier-Stokes equations, to reduce the computational effort, are currently the object of intense research. To our knowledge, no similar effort has been made in the context of the solution of the LNS equations. The LNS equations are derived and the DGM is presented. Preliminary results for the case of the scattering of plane waves traveling in a duct with a sudden area expansion, and a comparison between LEE and LNS calculations of vortical modes, are presented.	Renzo Arina

International Workshop on Computational Flow and Transport: Modeling, Simulations and Algorithms (CFT) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M201

Chair: Shuyu Sun

56	Numerical simulation of the flow in the fuel injector in sharply inhomogeneous electric field [abstract] Abstract: The results of detailed numerical simulation of the flow in an injector including electrohydrodynamic interaction in a sharply inhomogeneous electric field formed by an electrode system close to the “needle-plane” type are presented. The aim of the simulation is to estimate the charge rate flow at the fuel injector outlet. The results were obtained using the open-source package OpenFOAM, to which the corresponding models of electrohydrodynamics were added. The parametric calculations were performed for an axisymmetric model using the RANS k-omega SST turbulence model. Due to the swirl device in the fuel injector, the flow is strongly swirling. To obtain parameters for the axisymmetric flow calculations, a 3D simulation was performed for the simplified injector model including the swirl device and without electrodes.	Alexander Smirnovsky, Vladimir Nagorny, Dmitriy Kolodyazhny, Alexander Tchernysheff
122	An algorithm for the numerical solution of the pseudo compressible Navier-Stokes equations based on the experimenting fields approach [abstract] Abstract: In this work, the experimenting fields approach is applied to the numerical solution of the Navier-Stokes equation for incompressible viscous flow. The solution is sought for both the pressure and velocity fields at the same time. Evidently, the correct velocity and pressure fields satisfy the governing equations and the boundary conditions. In this technique a set of predefined fields is introduced to the governing equations and the residues are calculated. The flow according to these fields will not satisfy the governing equations and the boundary conditions. However, the residues are used to construct the matrix of coefficients. Although constructing the global matrix of coefficients seems trivial in this setup, in other setups it can be quite involved. This technique separates the solver routine from the physics routines and therefore makes coding and debugging easier. We present a few examples that demonstrate the capability of this technique.	Amgad Salama, Shuyu Sun, Mohamed El Amin
462	Pore network modeling of drainage process in patterned porous media: a quasi-static study [abstract] Abstract: This work presents a preliminary investigation of the role of wettability conditions in the flow of a two-phase system in porous media. Since such effects have been lumped implicitly in relative permeability-saturation and capillary pressure-saturation relationships, it is quite challenging to isolate its effects explicitly in real porous media applications. However, within the framework of pore network models, it is easy to highlight the effects of wettability conditions on the transport of two-phase systems. We employ a quasi-static investigation in which the system undergoes slow movement based on slight increments of the imposed pressure. Several numerical experiments of the drainage process are conducted to displace a wetting fluid with a non-wetting one. In all these experiments the network is assigned different scenarios of various wettability patterns. The aim is to show that the drainage process is very much affected by the imposed pattern of wettability. The wettability conditions are imposed by assigning the value of the contact angle to each pore throat according to predefined patterns.	Tao Zhang, Amgad Salama, Shuyu Sun and Mohamed El Amin
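
The quasi-static drainage idea in the last abstract (a contact angle per throat, with the imposed pressure raised in small steps) can be illustrated on a toy chain of throats. The sketch below assumes cylindrical throats and the Young-Laplace entry pressure p_c = 2 σ cos θ / r; the chain geometry, the numerical values, and the function names are hypothetical and are not taken from the paper.

```python
import math


def entry_pressure(radius, contact_angle_deg, sigma):
    """Young-Laplace entry capillary pressure of a cylindrical throat (Pa)."""
    return 2.0 * sigma * math.cos(math.radians(contact_angle_deg)) / radius


def invaded_throats(throats, sigma, imposed_pressure):
    """Number of throats invaded from the inlet along a chain at a given pressure."""
    count = 0
    for radius, angle in throats:
        if imposed_pressure < entry_pressure(radius, angle, sigma):
            break  # the front stops at the first throat it cannot enter
        count += 1
    return count


sigma = 0.072  # interfacial tension in N/m (assumed value)
# (radius in m, contact angle in degrees) for each throat, inlet first
throats = [(1e-5, 0.0), (1e-5, 60.0), (5e-6, 0.0)]
for pressure in (10000.0, 15000.0, 30000.0):
    print(pressure, invaded_throats(throats, sigma, pressure))
```

Changing only the contact angle of the first throat changes the pressure at which the whole chain opens, which is the kind of wettability-pattern sensitivity the abstract refers to; a real pore network would additionally track connectivity in two or three dimensions.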

International Workshop on Computational Flow and Transport: Modeling, Simulations and Algorithms (CFT) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M201

Chair: Shuyu Sun

123	Numerical Treatment of Two-Phase Flow in Porous Media Including Specific Interfacial Area [abstract] Abstract: In this work, we present a numerical treatment of the model of two-phase flow in porous media including specific interfacial area. For numerical discretization we use the cell-centered finite difference (CCFD) method based on the shifting-matrices method which could reduce the time-consuming operations. A new iterative implicit algorithm has been developed to solve the problem under consideration. All advection and advection-like terms that appear in saturation equation and interfacial area equation are treated using upwind schemes together with the CCFD and shifting-matrices techniques. Selected simulation results such as $p_c-S_w-a_{wn}$ surface have been introduced. The simulation results have a good agreement with those in the literature using either pore network modeling or Darcy scale modeling.	Mohamed El-Amin, Redouane Meftah, Amgad Salama, Shuyu Sun
210	Chaotic states and order in the chaos of the paths of freely falling and ascending spheres [abstract] Abstract: The research extends and improves the parametric study of "Instabilities and transition of a sphere falling or ascending freely in a Newtonian fluid" of Jenny et al. (2004) with special focus on the onset of chaos and on chaotic states. The results show that the effect of density ratio responsible for two qualitatively different oblique oscillating states has a significant impact both on the onset of chaos and on the behavior of fully chaotic states. The observed difference between dense and light spheres is associated with the strength of coupling between fluid and solid degrees of freedom. While the low frequency mode of the oblique oscillating state presents specific features due to a strong solid-fluid coupling, the dynamics of the high frequency mode is shown to be driven by the same vortex shedding as the wake of a fixed sphere. The different fluid-solid coupling also determines two different ways in which chaos sets in. Two outstanding ordered regimes are evidenced and investigated in the chaotic domain. One of them, characterized by its helical trajectories, might provide a link to the experimentally evidenced, but so far numerically unexplained, vibrating regime of ascension of light spheres. For fully chaotic states, it is shown that statistical averaging converges in a satisfactory manner. Several statistical characteristics are suggested and evaluated.	Wei Zhou and Jan Dušek
288	Switching Between the NVT and NpT Ensembles Using the Reweighting and Reconstruction Scheme [abstract] Abstract: Recently, we have developed several techniques in order to accelerate Monte Carlo (MC) molecular simulations. For that purpose, two strategies were followed. In the first, new algorithms were proposed as a set of early rejection schemes performing faster than the conventional algorithm while preserving the accuracy of the method. In the second, a reweighting and reconstruction scheme was introduced that is capable of retrieving primary quantities and second derivative properties at several thermodynamic conditions from a single MC Markov chain. The latter scheme was first developed to extrapolate quantities in the NVT ensemble for structureless Lennard-Jones particles. However, it is evident that for most real-life applications the NpT ensemble is more convenient, as pressure and temperature are usually known. Therefore, in this paper we present an extension to the reweighting and reconstruction method to solve NpT problems utilizing the same Markov chains generated by the NVT ensemble simulations. Ultimately, the new approach allows elegant switching between the two ensembles for several quantities at a wide range of neighboring thermodynamic conditions.	Ahmad Kadoura, Amgad Salama, Shuyu Sun
185	Coupled modelling of a shallow water flow and pollutant transport using depth averaged turbulent model. [abstract] Abstract: The paper presents a mathematical model of a turbulent river flow based on the unsteady shallow water equations and a depth averaged turbulence model. The numerical model is based on an upwind finite volume method on a structured staggered grid. In order to get a stable numerical solution, a SIMPLE-based algorithm was used. Among well-developed models of river flow, the proposed approach stands out with its computational efficiency and high quality in describing processes in a river stream. For the main cases of pollution transport in river flows it is essential to know whether the model is appropriate to predict turbulent characteristics of the flow in the open channel. Two computational cases have been carried out to investigate and apply the established model. The first case shows the impact of confluents on the generation of turbulence in the river flow and shows that recirculation flows affect the process of pollutant dispersion in water basins. A driven cavity test case has been carried out to investigate the accuracy of the established method and its applicability to streams with a complex structure.	Alexander V. Starchenko and Vladislava V. Churuksaeva
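
Several abstracts in this session rely on upwind discretization of advection terms. As a generic illustration (not the shallow-water or two-phase solvers described above), the following sketch applies a first-order upwind finite-volume update to the 1D linear advection equation dc/dt + u dc/dx = 0 on a periodic grid. The function names and the test profile are hypothetical.

```python
def face_flux(c, u, i):
    """Upwind flux through the face between cells i and i + 1 (periodic indexing)."""
    n = len(c)
    # Take the cell value from the side the flow comes from.
    upstream = c[i % n] if u >= 0 else c[(i + 1) % n]
    return u * upstream


def upwind_step(c, u, dx, dt):
    """One explicit step: cell value changes by the net flux through its two faces."""
    n = len(c)
    return [
        c[i] - dt / dx * (face_flux(c, u, i) - face_flux(c, u, i - 1))
        for i in range(n)
    ]


profile = [0.0, 1.0, 0.0, 0.0]
# With u * dt / dx = 1 the profile is shifted by exactly one cell per step.
print(upwind_step(profile, u=1.0, dx=0.5, dt=0.5))
```

Because each face flux is added to one cell and subtracted from its neighbour, the total amount of the transported quantity is conserved; the explicit update is stable only when |u| dt / dx does not exceed 1.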

Tools for Program Development and Analysis in Computational Science (TOOLS) Session 1

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M209

Chair: Jie Tao

602	Cube v.4 : From Performance Report Explorer to Performance Analysis Tool [abstract] Abstract: Cube v.3 has been a powerful tool to examine Scalasca performance reports, but was basically unable to perform analyses on its own. With Cube v.4, we addressed several shortcomings of Cube v.3. We generalized the Cube data model, extended the list of supported data types, and allow operations with nontrivial algebras, e.g. for performance models or statistical data. Additionally, we introduced two major new features that greatly enhance the performance analysis features of Cube: Derived metrics and GUI plugins. Derived metrics can be used to create and manipulate metrics directly within the GUI, using a powerful domain-specific language called CubePL. Cube GUI plugins allow the development of novel performance analysis techniques based on Cube data without changing the source code of the Cube GUI.	Michael Knobloch, Bernd Mohr, Anke Visser, Pavel Saviankou
51	Visual MPI Performance Analysis using Event Flow Graphs [abstract] Abstract: Event flow graphs used in the context of performance monitoring combine the scalability and low overhead of profiling methods with lossless information recording of tracing tools. In other words, they capture statistics on the performance behavior of parallel applications while preserving the temporal ordering of events. Event flow graphs require significantly less storage than regular event traces and can still be used to recover the full ordered sequence of events performed by the application. In this paper we explore the usage of event flow graphs in the context of visual performance analysis. We show that graphs can be used to quickly spot performance problems, helping to better understand the behavior of an application. We demonstrate our performance analysis approach with MiniFE, a mini-application that mimics the key performance aspects of finite-element applications in High Performance Computing (HPC).	Xavier Aguilar, Karl Fürlinger, Erwin Laure
75	Glprof: A Gprof inspired, Callgraph-oriented Per-Object Disseminating Memory Access Multi-Cache Profiler [abstract] Abstract: Application analysis is facilitated through a number of program profiling tools. The tools vary in their complexity, ease of deployment, design, and profiling detail. Specifically, understanding, analyzing, and optimizing is of particular importance for scientific applications where minor changes in code paths and data-structure layout can have profound effects. Understanding how intricate data-structures are accessed and how a given memory system responds is a complex task. In this paper we describe a trace profiling tool, Glprof, specifically aimed to lessen the burden of the programmer to pin-point heavily involved data-structures during an application's run-time, and understand data-structure run-time usage. Moreover, we showcase the tool's modularity using additional cache simulation components. We elaborate on the tool's design, and features. Finally we demonstrate the application of our tool in the context of Spec benchmarks using the Glprof profiler and two concurrently running cache simulators, PPC440 and AMD Interlagos.	Tomislav Janjusic, Christos Kartsaklis
326	Graphical high level analysis of communication in distributed virtual reality applications [abstract] Abstract: Analysing distributed virtual reality applications communicating through message-passing is challenging. Their development is complex, and knowing whether something is wrong depends on the states of each process. Defects (bugs) cause software crashes, hangs, and the generation of incorrect results. To address this daunting problem we specify functional behavior models (for example, using synchronization barriers and shared variables) for these applications that ensure correctness. We also developed the GTracer tool, which compares the functional behavior models developed with the messages transmitted among processes. GTracer checks for violations of these models automatically and displays the message traffic graphically. It is a tool made for libGlass, a message library for distributed computing. We have been able to find several non-trivial defects during the tests of this tool.	Marcelo Guimarães, Bruno Gnecco, Diego Dias, José Brega, Luis Trevelin

Tools for Program Development and Analysis in Computational Science (TOOLS) Session 2

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M209

Chair: Jie Tao

368	Providing Parallel Debugging for DASH Distributed Data Structures with GDB [abstract] Abstract: The C++ DASH template library provides distributed data containers for Partitioned Global Address Space (PGAS)-like programming. Because DASH is new and under development, no debugger is capable of handling the parallel processes or accessing and modifying container elements in a convenient way. This paper describes how the DASH library has to be extended to interrupt the start-up process in order to connect a debugger with all started processes and to enable the debugger to access and modify DASH container elements. Furthermore, a GDB extension to output well formatted DASH container information is presented.	Denis Hünich, Andreas Knüpfer, José Gracia
156	Sequential Performance: Raising Awareness of the Gory Details [abstract] Abstract: The advent of multicore and manycore processors, including GPUs, in the customer market encouraged developers to focus on the extraction of parallelism. While it is true that parallelism can deliver performance boosts, parallelization is also a very complex and error-prone task. Many applications are still sequential, or dominated by sequential sections. Modern micro-architectures have become extremely complex, and they usually do a very good job at quickly executing a given sequence of instructions. When they occasionally fail, however, the penalty may be severe. Pathological behaviors often have their roots in very low-level implementation details of the micro-architecture, hardly available to the programmer. We argue that the impact of these low-level features on performance has been overlooked, often relegated to experts. We show that a few metrics can be easily defined to help assess the overall performance of an application, and quickly diagnose a problem. Finally we illustrate our claim with a simple prototype, along with several use cases.	Erven Rohou, David Guyon
544	Evolving Fortran types with inferred units-of-measure [abstract] Abstract: Dimensional analysis is a well known technique for checking the consistency of equations involving physical quantities, constituting a kind of type system. Various type systems for dimensional analysis, and its refinement to units-of-measure, have been proposed. In this paper, we detail the design and implementation of a units-of-measure system for Fortran, implemented as a pre-processor. Our system is designed to aid adding units to an existing code base: units may be polymorphic and can be inferred. Furthermore, we introduce a technique for reporting to the user a set of critical variables which should be explicitly annotated with units to get the maximum amount of unit information with the minimal number of explicit declarations. This aids adoption of our type system for existing code bases, of which there are many in computational science projects.	Dominic Orchard, Andrew Rice and Oleg Oshmyan
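
The idea that dimensional analysis acts as a kind of type system can be illustrated with a small run-time check. The sketch below is in Python rather than Fortran and checks units while the program runs, whereas the system described in the abstract is a static pre-processor with inference; the `Quantity` class and its representation of units are hypothetical.

```python
class Quantity:
    """A number tagged with unit exponents, e.g. {'m': 1, 's': -1} for m/s."""

    def __init__(self, value, units):
        self.value = value
        # Drop zero exponents so that equal units compare equal.
        self.units = {u: p for u, p in units.items() if p != 0}

    def __add__(self, other):
        # Addition is only meaningful for identical units.
        if self.units != other.units:
            raise TypeError(f"unit mismatch: {self.units} vs {other.units}")
        return Quantity(self.value + other.value, self.units)

    def __mul__(self, other):
        merged = dict(self.units)
        for u, p in other.units.items():
            merged[u] = merged.get(u, 0) + p  # exponents add under multiplication
        return Quantity(self.value * other.value, merged)

    def __truediv__(self, other):
        merged = dict(self.units)
        for u, p in other.units.items():
            merged[u] = merged.get(u, 0) - p  # exponents subtract under division
        return Quantity(self.value / other.value, merged)


distance = Quantity(100.0, {"m": 1})
duration = Quantity(20.0, {"s": 1})
speed = distance / duration
print(speed.value, speed.units)

try:
    distance + duration  # metres plus seconds is rejected
except TypeError as err:
    print("rejected:", err)
```

A static system such as the one in the abstract would report the same mismatch before the program runs and would infer the units of `speed` from those of `distance` and `duration` without explicit annotation.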

The Eleventh Workshop on Computational Finance and Business Intelligence (CFBI) Session 1

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M105

Chair: Yong Shi

353	Nonparallel hyperplanes support vector machine for multi-class classification [abstract] Abstract: In this paper, we propose a nonparallel hyperplanes classifier for multi-class classification, termed NHCMC. This method inherits the idea of the multiple birth support vector machine (MBSVM), that is, the "max" decision criterion instead of the "min" one, but it has significant advantages over MBSVM. First, the optimization problems in NHCMC can be solved efficiently by sequential minimal optimization (SMO) without needing to compute large inverse matrices before training, as SVMs usually do. Second, the kernel trick can be applied directly to NHCMC, which is an advantage over the existing MBSVM. Experimental results on many data sets show the efficiency of our method in multi-class classification accuracy.	Xuchan Ju, Yingjie Tian, Dalian Liu, Zhiquan Qi
415	Multilevel dimension reduction Monte-Carlo simulation for high-dimensional stochastic models in finance [abstract] Abstract: One-way coupling often occurs in multi-dimensional stochastic models in finance. In this paper, we develop a highly efficient Monte Carlo (MC) method for pricing European options under an N-dimensional one-way coupled model, where N is arbitrary. The method is based on a combination of (i) the powerful dimension and variance reduction technique, referred to as drMC, developed in Dang et al. (2014), that exploits this structure, and (ii) the highly effective multilevel MC (mlMC) approach developed by Giles (2008). By first applying Step (i), the dimension of the problem is reduced from N to 1, and as a result, Step (ii) is essentially an application of mlMC on a 1-dimensional problem. Numerical results show that, through a careful construction of the ml-dr estimator, the improved efficiency expected from the Milstein timestepping with first order strong convergence can be achieved. Moreover, our numerical results show that the proposed ml-drMC method is significantly more efficient than the mlMC methods currently available for multi-dimensional stochastic problems.	Duy-Minh Dang, Qifan Xu, Shangzhe Wu
671	Computational Visual Analysis of the Order Book Dynamics for Creating High-Frequency Foreign Exchange Trading Strategies. [abstract] Abstract: This paper presents a Hierarchical Hidden Markov Model used to capture the USD/COP market sentiment dynamics choosing from uptrend or downtrend latent regimes based on observed feature vector realizations calculated from transaction prices and wavelet-transformed order book volume dynamics. The HHMM learned a natural switching buy/uptrend sell/downtrend trading strategy using a training-validation framework over one month of market data. The model was tested on the following two months, and its performance was reported and compared to results obtained from randomly classified market states and a feed-forward Neural Network. This paper also separately assessed the contribution to the model’s performance of the order book information and the wavelet transformation.	Javier Sandoval, German Hernandez
636	Influence of the External Environment Behaviour on the Banking System Stability [abstract] Abstract: Much research has been dedicated to financial system stability, which plays a significant role in preventing financial crises and their consequences. However, the interaction between the banking system and its external environment, and the influence of customer behaviour on banking system stability, are poorly studied. This paper proposes an agent-based model of a banking system and its external environment. We show how customer behaviour characteristics affect banking system stability. An optimal interval for total environmental funds relative to the banking system's wealth is determined.	Valentina Y. Guleva, Alexey Dukhanov

Workshop on Teaching Computational Science (WTCS) Session 1

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: M201

Chair: Alfredo Tirado-Ramos

117	Developing a Hands-On Course Around Building and Testing High Performance Computing Clusters [abstract] Abstract: We describe a successful approach to designing and implementing a High Performance Computing (HPC) class focused on creating competency in building, configuring, programming, troubleshooting, and benchmarking HPC clusters. By coordinating with campus services, we were able to avoid any additional costs to the students or the university. Students built three twelve-unit independently-operating clusters. Working groups were formed for each cluster and they installed the operating system, created users, connected to the campus network and wrote a variety of scripts and parallel programs while documenting the process. We describe how we solved unexpected problems encountered along the way. We illustrate through pre- and post-course surveys that students gained substantial knowledge in fundamental aspects of HPC through the hands-on approach of creating their own clusters.	Karl Frinkle, Mike Morris
269	Interactively Exploring the Connection between Bidirectional Compression and Star Bicoloring [abstract] Abstract: The connection between scientific computing and graph theory is detailed for a particular problem called bidirectional compression. This scientific computing problem consists of finding a pair of seed matrices in automatic differentiation. In terms of graph theory, the problem is nothing but finding a star bicoloring of a suitably defined graph. An interactive educational module is designed and implemented to illustrate the connection between bidirectional compression and star bicoloring. The web-based module is intended to be used in the classroom to illustrate the intricate nature of this combinatorial problem.	M. Ali Rostami, Martin Buecker
651	Scientific Workflows with XMDD: A Way to Use Process Modeling in Computational Science Education [abstract] Abstract: Process models are well suited to describe in a formal but still intuitive fashion what a system should do. They can thus play a central role in problem-based computational science education with regard to qualifying students for the design and implementation of software applications for their specific needs without putting the focus on the technical part of coding. eXtreme Model Driven Design (XMDD) is a software development paradigm that explicitly focuses on the What (solving problems) rather than on the How (the technical skills of writing code). In this paper we describe how we apply an XMDD-based process modeling and execution framework for scientific workflow projects in the scope of a computer science course for students with a background in natural sciences.	Anna-Lena Lamprecht, Tiziana Margaria
152	Teaching Science Using Computationally-Based Investigations [abstract] Abstract: Wofford College has initiated a computational laboratory course, Scientific Investigations Using Computation, which satisfies one of its Bachelor of Science requirements. In the course, which one professor teaches, students explore important concepts in science and, using computational tools, implement the scientific method to gain a better understanding of the natural world. Before the first class for a topic, which usually takes one week, students read a module by the authors of this abstract. Some of the topics are the carbon cycle, global warming, disease, adaptation and mimicry, fur patterns, membranes, gas laws, chemical kinetics, and enzyme kinetics. Each module includes a discussion of the topic, quick review questions, points of inquiry for further investigation, and references. In class, students take an online quiz from the quick review questions and complete an enriching activity related to the topic. Typically, in pairs or larger groups, students are assigned points of inquiry to investigate, develop, and present for subsequent periods in the week. A topic culminates in a three-hour laboratory, where students perform experiments at computers using the agent-based modeling tool NetLogo and the spreadsheet Excel. NetLogo, which is free to download, includes numerous computational models that have levels for Interface to run the simulation and view the results, Information about the model, and Code, which the user can view and change. Laboratory guidelines by the authors lead the students through the material in a step-by-step fashion. As well as conducting experiments computationally, the students modify the code to refine the models. Thus, the class examines scientific topics using the scientific method and various resources, gains an appreciation of the utility of computational simulations, and starts to learn to program and to think algorithmically.	Angela Shiflet and George Shiflet
158	DNA and 普通話 (Mandarin): Bringing introductory programming to the Life Sciences and Digital Humanities [abstract] Abstract: The ability to write software (to script, to program, to code) is a vital skill for students and their future data-centric, multidisciplinary careers. We present a ten-year effort to teach introductory programming skills in domain-focused courses to students across divisions in our liberal arts college. By creatively working with colleagues in Biology, Statistics, and now English, we have designed, modified, and offered six iterations of two courses: “DNA” and “Computing for Poets”. Larger percentages of women have consistently enrolled in these two courses vs. the traditional first course in the major. We share our open source course materials and present here our use of a blended learning classroom that leverages the increasing quality of online video lectures and programming practice sites in an attempt to maximize faculty-student interactions in class.	Mark Leblanc, Michael Drout

Workshop on Teaching Computational Science (WTCS) Session 2

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: M201

Chair: Angela Shiflet

3	DAEL Framework: A New Adaptive E-learning Framework for Students with Dyslexia [abstract] Abstract: This paper reports on an extensive study conducted on the existing frameworks and relevant theories that lead to a better understanding of the requirements of an e-learning tool for people with dyslexia. The DAEL framework has been developed with respect to four different dimensions: presentation, hypermediality, acceptability and accessibility, and user experience. However, there has been no research on the different types of dyslexia and the dyslexic user’s viewpoint as they affect application design. Therefore, in this paper a framework is proposed which would conform to the standards of acceptability and accessibility for dyslexic students. We hypothesise that an e-learning application that adapts itself to an individual’s dyslexia type will benefit dyslexic individuals in their learning process.	Aisha Alsobhi, Nawaz Khan, Harjinder Rahanu
632	Approach to Automation of Cloud Learning Resources’ Design for Courses in Computational Science Based on eScience Resources with the Use of the CLAVIRE Platform [abstract] Abstract: This paper describes the set of methods and cloud tools used to simplify the rapid design of learning resources for courses in computational science. We have developed and added new tools to our cloud platform – CLAVIRE – to simplify and speed up the sharing of scientific executable resources, design and implementation of courses’ structure and virtual learning labs, and preparation of the text resources for the theoretical part of the course and the case studies and seminars. We have applied our approach to design a course in eScience technologies based on the sequences of application packages and cloud services developed for task solving in different application domains and integrated into the CLAVIRE platform. Our approach allows us to significantly speed up the design and implementation of learning resources, and does not reduce the value of teachers’ (experts’) participation.	Alexey Dukhanov, Tamara Trofimenko, Maria Karpova, Lev Bezborodov, Alexey Bezgodov, Anna Bilyatdinova, Anna Lutsenko
248	An Introduction Course in the Computational Modeling of Nature [abstract] Abstract: To meet the requirement of a course in computational thinking for a minor in Informational Technology, an introductory course in computational modeling of nature was developed. Influenced by the modeling course developed by Dickerson at Middlebury College, the computational science textbook by Shiflet and Shiflet, and my own work on partial differential equation models, this course contains the development of three kinds of models of phenomena in nature. These three kinds of models are agent-based models using the language of NetLogo, simple finite difference models using the system dynamics option of NetLogo, and complex finite difference models using the language of C++. The natural phenomena modeled include some standard ones (ants following pheromone trails, the interaction of sheep and wolves, erosion due to rainfall, and the spread of malaria) and some non-standard ones (the 7-day creation of the world, 3 dogs playing games, and formation of stripes and spots in the skins of animals). The emphasis of the course is on the modeling process instead of on programming with modeling as a thread. A distinguishing feature is that, because students spend significant time with three different modeling techniques, they are able to compare and critique these models.	Kathie Yerion
403	How Engineers deal with Mathematics solving Differential Equations [abstract] Abstract: Numerical methods are tools for approximating solutions to problems that may have complicated developments or cannot be solved analytically. In engineering studies, students have to face problems from other disciplines such as structural or rock mechanics, biology, chemistry or physics. Prior to solving these problems it is important to define and adopt a rational framework. Students in the fourth of five years of the bachelor’s degree in Computer Sciences or Industrial Engineering at the University of Salamanca (Spain) learn mathematics by solving real problems with the help of their acquired interdisciplinary knowledge. We have proposed to the students a term project that summarizes some of the knowledge and skills acquired during the course. In this study we describe the software and specific applications that will be used during the whole course.	Araceli Queiruga Dios, Ascensión Hernández Encinas, Angel Martin Del Rey, Jesus Martin-Vaquero, Juan José Bullón Pérez, Gerardo Rodríguez Sánchez
116	TSGL: A Thread Safe Graphics Library for Visualizing Parallelism [abstract] Abstract: Multicore processors are now the standard CPU architecture, and multithreaded parallel programs are needed to take full advantage of such CPUs. New tools are needed to help students learn how to design and build such parallel programs. In this paper, we present the thread-safe graphics library (TSGL), a new C++11 library that allows different threads to draw to a shared Canvas, which is updated in approximate real-time. Using TSGL, instructors and students can create visualizations that illustrate multithreaded behavior. We present three multithreaded applications that illustrate the use of TSGL to help students see and understand how an application is using parallelism to speed up its computation.	Joel Adams, Patrick Crain, Mark Vander Stel
26	Education in Computational Sciences [abstract] Abstract: The last two decades have witnessed an enormously rapid development of computational technologies which have undoubtedly affected all the fields of human activities, including education. In fact, computational science is one of the most evolving profile study programmes at technical universities nowadays. This article thus focuses on the description of the key content courses, curricula and degrees offered within the study programmes at the Faculty of Informatics and Management of the University of Hradec Kralove, Czech Republic. Moreover, this study characterizes teaching and learning of university computational professionals with a special focus on core competences such as an ability to identify and solve problems, knowledge of analytical methods, or operating systems, but also on active and passive knowledge of English since English as lingua franca can help these professionals together with other core competences succeed in the job market after their graduation.	Petra Poulova, Blanka Klimova

Paradigms for Control in Social Systems (PCSS) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M209

Chair: Justin Ruths

755	Overview and Introduction [abstract] Abstract: TBD	Derek Ruths
751	Jeff's Invited Talk [abstract] Abstract: TBD	Jeff Shamma

Paradigms for Control in Social Systems (PCSS) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M209

Chair: Justin Ruths

749	Sinan's Invited Talk [abstract] Abstract: TBD	Sinan Aral
752	Bruce's Invited Talk [abstract] Abstract: TBD	Bruce Desmarais

Paradigms for Control in Social Systems (PCSS) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M209

Chair: Justin Ruths

748	A Role for Network Science in Social Norms Intervention [abstract] Abstract: Social norms theory has provided a foundation for public health interventions on critical issues such as alcohol and substance use, sexual violence, and risky sexual behavior. We assert that modern social norms interventions can be better informed with the use of network science methods. Social norms can be seen as a complex contagion on a social network, and the propagation of social norms as an information diffusion process. We observe instances where the recommendations of social norms theory match up to theoretical predictions from information diffusion models, but also places where the network science viewpoint highlights aspects of intervention design not addressed by the existing theory. Information about network structure and dynamics is often not used in existing social norms interventions; we argue that these factors may be contributing to the lack of efficacy of social norms interventions delivered via online social networks. Network models of intervention also offer opportunities for better evaluation and comparison across application domains.	Clayton Davis, Julia Heiman, Filippo Menczer
750	Ali's Invited Talk [abstract] Abstract: TBD	Ali Jadbabaie
756	Closing and Wrap-up [abstract] Abstract: TBD	Justin Ruths

Multiscale Modelling and Simulation, 12th International Workshop (MSCALE) Session 1

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: V201

Chair: Valeria Krzhizhanovskaya

1	Multiscale Modelling and Simulation Workshop: 12 Years of Inspiration [abstract] Abstract: Modelling and simulation of multiscale systems constitutes a grand challenge in computational science, and is widely applied in fields ranging from the physical sciences and engineering to the life sciences and the socio-economic domain. To adequately simulate numerous intertwined processes characterized by different spatial and temporal scales (often spanning many orders of magnitude), sophisticated models and advanced computational techniques are required. Additionally, these multiscale models frequently need large scale computing capabilities as well as dedicated software and services that enable the exploitation of existing and evolving computational ecosystems. The aim of the annual Workshop on Multiscale Modelling and Simulation is to facilitate the progress in this multidisciplinary research field http://www.computationalscience.nl/MMS/. In this paper, we reflect on the 12 years of workshop history and glimpse at the latest developments presented in 2015 in Iceland, the Land of Fire and Ice. In Section 6, we invite new workshop co-organizers.	V.V. Krzhizhanovskaya, D. Groen, B. Bosak, A.G. Hoekstra
342	A Survey of Open Source Multiphysics Frameworks in Engineering [abstract] Abstract: This paper presents a systematic survey of open source multiphysics frameworks in the engineering domains. These domains share many commonalities despite the diverse application areas. A thorough search for the available frameworks with both academic and industrial origins has revealed numerous candidates. Considering key characteristics such as project size, maturity and visibility, we selected Elmer, OpenFOAM and Salome for a detailed analysis. All the public documentation for these tools has been manually collected and inspected. Based on the analysis, we built a feature model for multiphysics in engineering, which captures the commonalities and variability in the domain. We in turn validated the resulting model against two other tools: Kratos by manual inspection, and OOFEM by validation with domain experts.	Önder Babur, Vit Smilauer, Tom Verhoeff, Mark van den Brand
696	A Hybrid Multiscale Framework for Subsurface Flow and Transport Simulations [abstract] Abstract: Extensive research efforts have been invested in reducing model errors to improve the predictive ability of biogeochemical earth and environmental system simulators, with applications ranging from contaminant transport and remediation to impacts of biogeochemical elemental cycling (e.g., carbon and nitrogen) on local ecosystems and regional to global climate. While the bulk of this research has focused on improving model parameterizations in the face of observational limitations, the more challenging type of model error/uncertainty to identify and quantify is model structural error which arises from incorrect mathematical representations of (or failure to consider) important physical, chemical, or biological processes, properties, or system states in model formulations. While improved process understanding can be achieved through scientific study, such understanding is usually developed at small scales. Process-based numerical models are typically designed for a particular characteristic length and time scale. For application-relevant scales, it is generally necessary to introduce approximations and empirical parameterizations to describe complex systems or processes. This single-scale approach has been the best available to date because of limited understanding of process coupling combined with practical limitations on system characterization and computation. While computational power is increasing significantly and our understanding of biological and environmental processes at fundamental scales is accelerating, using this information to advance our knowledge of the larger system behavior requires the development of multiscale simulators. Accordingly there has been much recent interest in novel multiscale methods in which microscale and macroscale models are explicitly coupled in a single hybrid multiscale simulation. 
A limited number of hybrid multiscale simulations have been developed for biogeochemical earth systems, but they mostly utilize application-specific and sometimes ad-hoc approaches for model coupling. We are developing a generalized approach to hierarchical model coupling designed for high-performance computational systems, based on the Swift computing workflow framework. In this presentation we will describe the generalized approach and provide two use cases: 1) simulation of a mixing-controlled biogeochemical reaction coupling pore- and continuum-scale models, and 2) simulation of biogeochemical impacts of groundwater – river water interactions coupling fine- and coarse-grid model representations. This generalized framework can be customized for use with any pair of linked models (microscale and macroscale) with minimal intrusiveness to the at-scale simulators. It combines a set of python scripts with the Swift workflow environment to execute a complex multiscale simulation utilizing an approach similar to the well-known Heterogeneous Multiscale Method. User customization is facilitated through user-provided input and output file templates and processing function scripts, and execution within a high-performance computing environment is handled by Swift, such that minimal to no user modification of at-scale codes is required.	Timothy Scheibe, Xiaofan Yang, Xingyuan Chen, Glenn Hammond
467	Fluid simulations with atomistic resolution: multiscale model with account of nonlocal momentum transfer [abstract] Abstract: Nano- and microscale flow phenomena turn out to be highly non-trivial for simulation and require the use of heterogeneous modeling approaches. While the continuum Navier-Stokes equations and related boundary conditions quickly break down at those scales, various direct simulation methods and hybrid models have been applied, such as Molecular Dynamics and Dissipative Particle Dynamics. Nonetheless, a continuum model for nanoscale flow is still an unsolved problem. We present a model taking into account nonlocal momentum transfer. Instead of a bulk viscosity, an improved system of parameters of liquid properties, represented by a spatial scalar function for momentum transfer rate between neighboring volumes, is used. Our model does not require boundary conditions on the channel walls. Common nanoflow models relying on a bulk viscosity in combination with a slip boundary condition can be obtained from the model. The required model parameters can be calculated from momentum density fluctuations obtained by Molecular Dynamics simulations. Thus, our model is multiscale; however, the continuum model is applied in the whole region of the simulation. We demonstrate good agreement with nanoflow in a tube as obtained by complete Molecular Dynamics.	Andrew I. Svitenkov, Sergey A. Chivilikhin, Alfons G. Hoekstra, Alexander V. Boukhanovsky
463	An automated multiscale ensemble simulation approach for vascular blood flow [abstract] Abstract: Cerebrovascular diseases such as brain aneurysms are a primary cause of adult disability. The flow dynamics in brain arteries, both during periods of rest and increased activity, are known to be a major factor in the risk of aneurysm formation and rupture, although the precise relation is still an open field of investigation. We present an automated ensemble simulation method for modelling cerebrovascular blood flow under a range of flow regimes. By automatically constructing and performing an ensemble of multiscale simulations, where we unidirectionally couple a 1D solver with a 3D lattice-Boltzmann code, we are able to model the blood flow in a patient artery over a range of flow regimes. We apply the method to a model of a middle cerebral artery, and find that this approach helps us to fine-tune our modelling techniques, and opens up new ways to investigate cerebrovascular flow properties.	Mohamed Itani, Ulf Schiller, Sebastian Schmieschek, James Hetherington, Miguel Bernabeu, Hoskote Chandrashekar, Fergus Robertson, Peter Coveney and Derek Groen
29	A multiscale model for the feto-placental circulation in the monochorionic twin pregnancies [abstract] Abstract: We developed a mathematical model of monochorionic twin pregnancies to simulate both the normal gestation and the Twin-Twin Transfusion Syndrome (TTTS), a disease in which the interplacental anastomoses create a flow imbalance, causing one of the twins to receive too much blood and liquids, becoming hypertensive with polyhydramnios (the Recipient), and the other to become hypotensive with oligohydramnios (the Donor). This syndrome, if untreated, leads almost certainly to the death of one or both twins. We propose a compartment model to simulate the flows between the placenta and the fetuses and the accumulation of the amniotic fluid in the sacs. The aim of our work is to provide a simple but realistic model of the twins-mother system and to stress it by simulating the pathological cases and the related treatments, i.e. amnioreduction (elimination of the excess liquid in the recipient sac), laser therapy (removal of all the anastomoses) and other possible innovative therapies impacting on pressure and flow parameters.	Ilaria Stura, Pietro Gaglioti, Tullia Todros, Caterina Guiot

Multiscale Modelling and Simulation, 12th International Workshop (MSCALE) Session 2

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: V201

Chair: Valeria Krzhizhanovskaya

595	A Multiscale and Patient-Specific Computational Framework of Atherosclerosis Formation and Progression: A Case Study in the Aorta and Peripheral Arteries [abstract] Abstract: Atherosclerosis is the main cause of mortality and morbidity in the western world. Atherosclerosis is a chronic disease defined by life-long processes, with multiple actors playing a role at different biological and time scales. Patient-specific in silico models and simulations can help to better understand the mechanisms of atherosclerosis formation, potentially improving patient management. A conceptual and computational multiscale framework for the modelling of atherosclerosis formation at its early stage was created from the integration of a fluid mechanics model and a biochemical model. The fluid mechanics model describes the interaction between arterial endothelium and blood flow using an artery-specific approach. The low density lipoprotein (LDL) oxidation and consequent immune reaction leading to the chronic inflammatory process at the basis of plaque formation were described in the biochemical model. The integration of these modelling approaches led to the creation of a computational framework, an effective tool for the modelling of atherosclerosis plaque development. The model presented in this study was able to capture key features of atherogenesis such as the location of pro-atherogenic areas and to reproduce the formation of plaques detectable from in vivo observations. This framework is currently being tested at University College Hospital (UCH).	Giulia Di Tomaso, Cesar Pichardo, Obiekezie Agu, Vanessa Diaz
121	A multiscale model evaluating phenotypes variations in tumors following multiple xeno-transplantation [abstract] Abstract: Tumor growth is a very challenging issue of capital importance to address therapy and patient management. Since it is difficult to follow the cancer natural history in humans, animal models are largely investigated. In particular, xeno-transplants are often performed on previously immune-depressed mice in order to get information about both macroscopic (growth rate) and microscopic (at cellular and genomic level) features. Previous studies showed that following multiple transplants tumors grow faster, and this fact was commonly assumed to prove the occurrence of mutations whose rate depended on the transplantation passage due to some sort of genetic instability. A recent paper reports data from a very interesting experiment, where two different clones are monitored through multipassage xeno-transplant. We use these data in order to validate a two-population Gompertz model which assumes a constant mutation rate but takes into account the timing of multiple transplants.	Ilaria Stura and Caterina Guiot
83	Multiscale modeling approach for radial particle transport in large-scale simulations of the tokamak plasma edge [abstract] Abstract: A multiscale model for an improved description of radial particle transport described by the density continuity equation in large-scale plasma edge simulations for tokamak fusion devices is presented. It includes the effects of mesoscale drift-fluid dynamics on the macroscale profiles and vice versa. The realization of the multiscale model in the form of the coupled code system B2-ATTEMPT is outlined. A procedure employed to efficiently determine the averaged mesoscale terms using a nonparametric trend test, the Reverse Arrangements Test, is described. Results of stationary, self-consistent B2-ATTEMPT simulations are compared to prior simulations for experiments at the TEXTOR tokamak, making a first evaluation of the predicted magnitude of radial particle transport possible.	Felix Hasenbeck, Dirk Reiser, Philippe Ghendrih, Yannick Marandet, Patrick Tamain, Annette Möller, Detlev Reiter
130	Coupled simulations in plasma physics with the Integrated Plasma Simulator platform [abstract] Abstract: A fusion plasma is a complex object involving a wide range of physics phenomena occurring at different scales. When building multiscale or multi-physics applications, an interesting approach (formalized in the Multiscale Modelling and Simulation Framework) consists in coupling single scale components (where a scale can be either spatial, temporal or refer to a different physics or numeric model), making each single component easier to develop, validate and maintain. Such coupling has been investigated within the EFDA ITM-TF task force, by using a common data structure to interface every single scale component, and a workflow manager to pilot the simulation. When such a workflow has to run in parallel, a possible approach consists in running the simulation platform within a regular parallel allocation in a single computer. The Integrated Plasma Simulator has been based on such a principle: it runs in a single (possibly very large) allocation and handles internally the dynamic scheduling of each component considering several layers of abstraction for the parallelism. We have implemented within the IPS platform two fusion workflows with different computational needs: an acyclic (loose-coupling) chain composed of high-resolution equilibrium reconstruction and MHD stability study, and a cyclic (tight-coupling) turbulence – transport time evolution. The acyclic case involves a parameter scan where the runtime of each single case can differ significantly, whereas the cyclic case is composed of codes which have to be executed in sequence with different levels of parallelism and computational cost. This contribution presents briefly the characteristics of the IPS platform and compares them to other platforms used in the fusion community (Kepler, Muscle). 
Then implementation details are given about the wrappers, which are required to embed legacy codes coming from the ITM community into the IPS platform. Finally, targeted cyclic and acyclic workflows and their characteristics are presented as well as their performance in different configurations.	Olivier Hoenen, David Coster, Sebastian Petruczynik, Marcin Plociennik
135	Spectral Solver for Multi-Scale Plasma Physics Simulations with Dynamically Adaptive Number of Moments [abstract] Abstract: A spectral method for kinetic plasma simulations based on the expansion of the velocity distribution function in a variable number of Hermite polynomials is presented. The method is based on a set of non-linear equations that is solved to determine the coefficients of the Hermite expansion satisfying the Vlasov and Poisson equations. In this paper, we first show that this technique combines the fluid and kinetic approaches into one framework. Second, we present an adaptive strategy to increase and decrease the number of Hermite functions dynamically during the simulation. The technique is applied to the Landau damping and two-stream instability test problems. Performance results show 21% and 47% saving of total simulation time in the Landau and two-stream instability test cases, respectively.	Juris Vencels, Gian Luca Delzanno, Alec Johnson, Ivy Bo Peng, Erwin Laure, Stefano Markidis

Multiscale Modelling and Simulation, 12th International Workshop (MSCALE) Session 3

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: V201

Chair: Valeria Krzhizhanovskaya

133	Telescopic Projective Integration for Multiscale Kinetic Equations with a Specified Relaxation Profile [abstract] Abstract: We study the design of a general, fully explicit numerical method for simulating kinetic equations with an extended BGK collision model allowing for multiple relaxation times. In that case, the problem is stiff and we show that its spectrum consists of multiple separated eigenvalue clusters. Projective integration methods are explicit integration schemes that first take a few small (inner) steps with a simple, explicit method, after which the solution is extrapolated forward in time over a large (outer) time step. They are very efficient schemes, provided there are only two clusters of eigenvalues. Telescopic projective integration methods generalize the idea of projective integration methods by constructing a hierarchy of projective levels. Here, we show how telescopic projective integration methods can be used to efficiently integrate multiple relaxation time BGK models. We show that the number of projective levels only depends on the number of clusters and the size of the outer level time step only depends on the slowest time scale present in the model. Neither depends on the small-scale parameter. We analyze stability and illustrate with numerical results.	Ward Melis, Giovanni Samaey
489	Coevolution of Information Processing and Topology in Hierarchical Adaptive Random Boolean Networks [abstract] Abstract: Random Boolean networks (RBNs) are frequently employed for modelling complex systems driven by information processing, e.g. for gene regulatory networks (GRNs). Here we propose a hierarchical adaptive RBN (HARBN) as a system consisting of distinct adaptive RBNs – subnetworks – connected by a set of permanent interlinks. Information measures and the internal subnetwork topology of a HARBN coevolve and reach steady states that are specific to a given network structure. We investigate mean node information, mean edge information and mean node degree as functions of model parameters and demonstrate the HARBN's ability to describe complex hierarchical systems.	Piotr Górski, Agnieszka Czaplicka and Janusz Holyst
13	Numerical Algorithms for Solving One Type of Singular Integro-Differential Equation Containing Derivatives of the Time Delay States [abstract] Abstract: This study presents numerical algorithms for solving a class of equations that include derivatives of the unknown state at certain previous times, as well as an integro-differential term containing a weakly singular kernel. These equations are integro-differential equations of the second kind and were originally obtained from an aeroelasticity problem. One of the main contributions of this study is to propose numerical algorithms that do not involve transforming the original equation into the corresponding Volterra equation, but still enable the numerical solution of the original equation to be determined. The feasibility of the proposed numerical algorithms is demonstrated on examples by measuring the maximum errors against exact solutions at every computed node and then calculating the corresponding numerical rates of convergence.	Shihchung Chiang and Terry Herdman
209	Safer Batteries Through Coupled Multiscale Modeling [abstract] Abstract: Batteries are highly complex electrochemical systems, with performance and safety governed by coupled nonlinear electrochemical-electrical-thermal-mechanical processes over a range of spatiotemporal scales. We describe a new, open source computational environment for battery simulation known as VIBE - the Virtual Integrated Battery Environment. VIBE includes homogenized and pseudo-2D electrochemistry models such as those by Newman-Tiedemann-Gu (NTG) and Doyle-Fuller-Newman (DFN, a.k.a. DualFoil) as well as a new advanced capability known as AMPERES (Advanced MultiPhysics for Electrochemical and Renewable Energy Storage). AMPERES provides a 3D model for electrochemistry and full coupling with 3D electrical and thermal models on the same grid. VIBE/AMPERES has been used to create three-dimensional battery cell and pack models that explicitly simulate all the battery components (current collectors, electrodes, and separator). The models are used to predict battery performance under normal operations and to study thermal and mechanical response under adverse conditions.	John Turner, Srikanth Allu, Abhishek Kumar, Sergiy Kalnaus, Sreekanth Pannala, Srdjan Simunovic, Mark Berrill, Damien Lebrun-Grandie, Wael Elwasif
92	The Formation of a Magnetosphere with Implicit Particle-in-Cell Simulations [abstract] Abstract: A magnetosphere is a region of space filled with plasma around a magnetized object, shielding it from solar wind particles. The shape of a magnetosphere is determined by the microscopic interaction phenomena between the solar wind and the dipolar magnetic field of the object. To correctly describe these interactions, we need to model phenomena occurring over a large range of time and spatial scales. In fact, a magnetosphere comprises regions with different particle densities, temperatures and magnetic fields, where the characteristic time scales (plasma period, electron and ion gyro period) and spatial scales (Debye length, ion and electron skin depth) vary considerably. We simulate the formation of a magnetosphere with an implicit Particle-in-Cell code called iPIC3D. We used a dipole model to represent the magnetic field of the object, where an interplanetary magnetic field is convected by the solar wind. We carried out global Particle-in-Cell simulations of a complete magnetosphere system, including magnetopause, magnetosheath and magnetotail. In this paper we describe the new algorithms implemented in iPIC3D to address the problem of modelling multi-scale phenomena in magnetospheres. In particular, we present a new adaptive sub-cycling technique to correctly describe the motion of particles that are close to the magnetic dipole. We also implemented new boundary conditions to model the inflow and outflow of solar wind in the simulation box. Finally, we discuss the application of these new methods to modelling planetary magnetospheres.	Ivy Bo Peng, Stefano Markidis, Andris Vaivads, Juris Vencels, Giovanni Lapenta, Andrey Divin, Jorge Amaya, Erwin Laure

6th Workshop on Computational Optimization, Modelling & Simulation (COMS) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: V201

Chair: Leifur Leifsson

261	Surrogate-Based Airfoil Design with Space Mapping and Adjoint Sensitivity [abstract] Abstract: This paper presents a space mapping algorithm for airfoil shape optimization enhanced with adjoint sensitivities. The surrogate-based algorithm utilizes low-cost derivative information obtained through adjoint sensitivities to improve the space mapping matching between a high-fidelity airfoil model, evaluated through expensive CFD simulations, and its fast surrogate. Here, the airfoil surrogate model is constructed through low-fidelity CFD simulations. As a result, the design process can be performed at a low computational cost in terms of the number of high-fidelity CFD simulations. The adjoint sensitivities are also exploited to speed up the surrogate optimization process. Our method is applied to a constrained drag minimization problem in two-dimensional inviscid transonic flow. The problem is solved for several low-fidelity model termination criteria. The results show that, compared with direct gradient-based optimization with adjoint sensitivities, the proposed approach requires 49-78% less computational cost while still obtaining a comparable airfoil design.	Yonatan Tesfahunegn, Slawomir Koziel, Leifur Leifsson, Adrian Bekasiewicz
317	How to Speed up Optimization? Opposite-Center Learning and Its Application to Differential Evolution [abstract] Abstract: This paper introduces a new sampling technique called Opposite-Center Learning (OCL) intended for convergence speedup of meta-heuristic optimization algorithms. It comprises an extension of Opposition-Based Learning (OBL), a simple scheme that manages to boost numerous optimization methods by considering the opposite points of candidate solutions. In contrast to OBL, OCL has a theoretical foundation – the opposite center point is defined as the optimal choice in pair-wise sampling of the search space given a random starting point. A concise analytical background is provided. Computationally the opposite center point is approximated by a lightweight Monte Carlo scheme for arbitrary dimension. Empirical results up to dimension 20 confirm that OCL outperforms OBL and random sampling: the points generated by OCL have shorter expected distances to a uniformly distributed global optimum. To further test its practical performance, OCL is applied to differential evolution (DE). This novel scheme for continuous optimization named Opposite-Center DE (OCDE) employs OCL for population initialization and generation jumping. Numerical experiments on a set of benchmark functions for dimensions 10 and 30 reveal that OCDE on average improves the convergence rates by 38% and 27% compared to the original DE and the Opposition-based DE (ODE), respectively, while remaining fully robust. Most promising are the observations that the accelerations shown by OCDE and OCL increase with problem dimensionality.	H. Xu, C.D. Erdbrink, V.V. Krzhizhanovskaya
281	Visualizing and Improving the Robustness of Phase Retrieval Algorithms [abstract] Abstract: Coherent x-ray diffractive imaging is a novel imaging technique that utilizes phase retrieval and nonlinear optimization methods to image matter at nanometer scales. We explore how the convergence properties of a popular phase retrieval algorithm, Fienup’s HIO, behave by introducing a reduced dimensionality problem allowing us to visualize convergence to local minima and the globally optimal solution. We then introduce generalizations of HIO that improve upon the original algorithm’s ability to converge to the globally optimal solution.	Ashish Tripathi, Sven Leyffer, Todd Munson, Stefan Wild
257	Fast Optimization of Integrated Photonic Components Using Response Correction and Local Approximation Surrogates [abstract] Abstract: A methodology for a rapid design optimization of integrated photonic couplers is presented. The proposed technique exploits variable-fidelity electromagnetic (EM) simulation models, additive response correction for accommodating the discrepancies between the EM models of various fidelities, and local response surface approximations for a fine tuning of the final design. A specific example of a 1,555 nm coupler is considered with an optimum design obtained at a computational cost corresponding to about 24 high-fidelity EM simulations of the structure.	Adrian Bekasiewicz, Slawomir Koziel, Leifur Leifsson
197	Model Selection for Discriminative Restricted Boltzmann Machines Through Meta-heuristic Techniques [abstract] Abstract: Discriminative learning of Restricted Boltzmann Machines has recently been introduced as an alternative that provides a self-contained approach for both unsupervised feature learning and classification. However, one of the main problems faced by researchers interested in this approach is the proper selection of its parameters, which play an important role in its final performance. In this paper, we introduce some meta-heuristic techniques for this purpose and show that they can be more accurate than random search, which is commonly used in other works.	Joao Paulo Papa, Gustavo Rosa, Aparecido Marana, Walter Scheirer and David Cox

6th Workshop on Computational Optimization, Modelling & Simulation (COMS) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: V201

Chair: Leifur Leifsson

619	A Cooperative Coevolutionary Differential Evolution Algorithm with Adaptive Subcomponents [abstract] Abstract: The performance of cooperative coevolutionary algorithms for large-scale continuous optimization is significantly affected by the adopted decomposition of the search space. According to the literature, a typical decomposition in case of fully separable problems consists of adopting equally sized subcomponents for the whole optimization process (i.e. static decomposition). Such an approach is also often used for fully non-separable problems, together with a random-grouping strategy. More advanced methods try to determine the optimal size of subcomponents during the optimization process using reinforcement-learning techniques. However, the latter approaches are not always suitable in this case because of the non-stationary and history-dependent nature of the learning environment. This paper investigates a new Cooperative Coevolutionary algorithm, based on Differential Evolution, in which several decompositions are applied in parallel during short learning phases. The experimental results on a set of large-scale optimization problems show that the proposed method can lead to a reliable estimate of the suitability of each subcomponent size. Moreover, in some cases it outperforms the best static decomposition.	Giuseppe A. Trunfio
105	Multi-Level Job Flow Cyclic Scheduling in Grid Virtual Organizations [abstract] Abstract: Distributed environments in which users are decoupled from resource providers are generally termed utility Grids. The paper focuses on the problems of efficient job flow distribution and scheduling in virtual organizations (VOs) of utility Grids while respecting the VO stakeholders' preferences and providing dependable strategies for resource utilization. An approach based on the combination of the cyclic scheduling scheme, backfilling and several heuristic procedures is proposed and studied. Comparative simulation results are presented for different algorithms and heuristics depending on the resource domain composition and heterogeneity. The considered scheduling approaches provide different benefits depending on the VO scheduling objectives. The results justify the use of the proposed approaches in a broad range of the considered resource environment parameters.	Victor Toporkov, Anna Toporkova, Alexey Tselishchev, Dmitry Yemelyanov, Petr Potekhin
346	The Stochastic Simplex Bisection Algorithm [abstract] Abstract: We propose the stochastic simplex bisection algorithm. It randomly selects one from a set of simplexes, bisects it, and replaces it with its two offspring. The selection probability is proportional to a score indicating how promising the simplex is to bisect. We generalize intervals to simplexes, rather than to hyperboxes, as bisection then only requires evaluating the function in one new point, which is somewhat randomized. Using a set of simplexes that partition the search space yields completeness and avoids redundancy. We provide an appropriate scale- and offset-invariant score definition and add an outer loop for handling hyperboxes. Experiments show that the algorithm is capable of exploring vast numbers of local optima, over huge ranges, yet finding the global one. The ease with which it handles quadratic functions makes it ideal for non-linear regression: it is here successfully applied to logistic regression. The algorithm does well, also when the number of function evaluations is severely restricted.	Christer Samuelsson
243	Local Tuning in Nested Scheme of Global Optimization [abstract] Abstract: Numerical methods for global optimization of multidimensional multiextremal functions are considered within an approach that reduces dimensionality by means of the nested optimization scheme. This scheme reduces the initial multidimensional problem to a set of recursively connected univariate subproblems, which makes it possible to apply efficient univariate algorithms to multidimensional problems. The nested optimization scheme has served as the source of many methods for the optimization of Lipschitzian functions. However, all of them face the problem of estimating the Lipschitz constant as a parameter of the optimized function and, as a consequence, of tuning the optimization method to it. The methods proposed earlier use, as a rule, a global estimate (related to the whole search domain), whereas local Lipschitz constants in some subdomains can differ significantly from the global constant, which can slow down the optimization process considerably. To overcome this drawback, the article considers finer estimates of the a priori unknown Lipschitz constants that take into account local properties of the objective function and uses them in the nested optimization scheme. The results of the numerical experiments presented demonstrate the advantages of methods with mixed (local and global) estimates of Lipschitz constants over the use of global ones only.	Victor Gergel, Vladimir Grishagin, Ruslan Israfilov
226	Variations of Ant Colony Optimization for the solution of the structural damage identification problem [abstract] Abstract: In this work the inverse problem of identification of structural stiffness coefficients of a damped spring-mass system is tackled. The problem is solved by using different versions of Ant Colony Optimization (ACO) metaheuristic solely or coupled with the Hooke-Jeeves (HJ) local search algorithm. The evaluated versions of ACO are based on a discretization procedure to deal with the continuous domain design variables together with different pheromone evaporation and deposit strategies and also on the frequency of calling the local search algorithm. The damage estimation is evaluated using noiseless and noisy synthetic experimental data assuming a damage configuration throughout the structure. The reported results show the hybrid method as the best choice when both rank-based pheromone deposit and a new heuristic information based on the search history are used.	Carlos Eduardo Braun, Leonardo D. Chiwiacowsky, Arthur T. Gómez

6th Workshop on Computational Optimization, Modelling & Simulation (COMS) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: V201

Chair: Leifur Leifsson

256	Multi-Objective Design Optimization of Planar Yagi-Uda Antenna Using Physics-Based Surrogates and Rotational Design Space Reduction [abstract] Abstract: A procedure for low-cost multi-objective design optimization of antenna structures is discussed. The major stages of the optimization process include: (i) an initial reduction of the search space aimed at identifying its relevant subset containing the Pareto-optimal design space, (ii) construction—using sampled coarse-discretization electromagnetic (EM) simulation data—of the response surface approximation surrogate, (iii) surrogate optimization using a multi-objective evolutionary algorithm, and (iv) the Pareto front refinement. Our optimization procedure is demonstrated through the design of a planar quasi Yagi-Uda antenna. The final set of designs representing the best available trade-offs between conflicting objectives is obtained at a computational cost corresponding to about 172 evaluations of the high-fidelity EM antenna model.	Slawomir Koziel, Adrian Bekasiewicz, Leifur Leifsson
644	Agent-Based Simulation for Creating Robust Plans and Schedules [abstract] Abstract: The paper describes methods for constructing robust schedules using agent-based simulation. The measure of robustness represents the resistance of the schedule to random phenomena, and we present a method for calculating the robustness of a schedule. The procedure for creating a robust schedule combines standard solutions for planning and scheduling with computer simulation. It is described in detail and allows the creation of an executable robust schedule. Three different procedures for increasing robustness (changing the order of resource allocation, changing the plan, and increasing time reserves) are briefly explained. The presented techniques were tested using a real, detailed simulation model of an existing container terminal.	Peter Jankovič
413	Shape Optimization of Trawl-Doors Using Variable-Fidelity Models and Space Mapping [abstract] Abstract: Trawl-doors have a large influence on the fuel consumption of fishing vessels. Design and optimization of trawl-doors using computational models are key factors in minimizing the fuel consumption. This paper presents an efficient optimization algorithm for the design of trawl-door shapes using computational fluid dynamic models. The approach is iterative and uses variable-fidelity models and space mapping. The algorithm is applied to the design of a multi-element trawl-door, involving four design variables controlling the angle of attack and the slat position and orientation. The results demonstrate that a satisfactory design can be obtained at a cost of a few iterations of the algorithm. Compared with direct optimization of the high-fidelity model and local response surface surrogate models, the proposed approach requires 79% less computational time while, at the same time, improving the design significantly (over 12% increase in the lift-to-drag ratio).	Ingi Jonsson, Leifur Leifsson, Slawomir Koziel, Yonatan Tesfahunegn, Adrian Bekasiewicz
347	Optimised robust treatment plans for prostate cancer focal brachytherapy [abstract] Abstract: Focal brachytherapy is a clinical procedure that can be used to treat low-risk prostate cancer with reduced side-effects compared to conventional brachytherapy. Current practice is to manually plan the placement of radioactive seeds inside the prostate to achieve a desired treatment dose. Problems with the current practice are that the manual planning is time-consuming and high doses to the urethra and rectum cause undesirable side-effects. To address this problem, we have designed an optimisation algorithm that constructs treatment plans which achieve the desired dose while minimizing dose to organs at risk. We also show that these seed plans are robust to post-operative movement of the seeds within the prostate.	John Betts, Chris Mears, Hayley Reynolds, Guido Tack, Kevin Leo, Martin Ebert, Annette Haworth
514	Identification of Multi-inclusion Statistically Similar Representative Volume Element for Advanced High Strength Steels by Using Data Farming Approach [abstract] Abstract: A Statistically Similar Representative Volume Element (SSRVE) is used to simplify the computational domain representing the microstructure of a material in multiscale modelling. The procedure of SSRVE creation is based on an optimization loop that finds the highest similarity between the SSRVE and the original material microstructure. The objective function in this optimization is built upon computationally intensive numerical methods, including simulations of virtual material deformation, which are very time consuming. To avoid such long-lasting calculations, we propose to use the data farming approach to identify the SSRVE for Advanced High Strength Steels (AHSS) characterized by a multiphase microstructure. The optimization method is based on a nature-inspired approach which facilitates distribution and parallelization. The concept of SSRVE creation as well as the software architecture of the proposed solution are described in detail in the paper. This is followed by examples of the results obtained for the identification of SSRVE parameters for DP steels, which are widely used in the modern automotive industry. Possible directions for further development and uses are described in the conclusions.	Lukasz Rauch, Danuta Szeliga, Daniel Bachniak, Krzysztof Bzowski, Renata Słota, Maciej Pietrzyk, Jacek Kitowski

ICCS 2015 Main Track (MT) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M101

Chair: Jorge Veiga Fachal

53	Diarchy: An Optimized Management Approach for MapReduce Masters [abstract] Abstract: The MapReduce community is progressively replacing the classic Hadoop with Yarn, the second-generation Hadoop (MapReduce 2.0). This transition is being made for many reasons, but primarily because of some scalability drawbacks of the classic Hadoop. The new framework has appropriately addressed this issue and is being praised for its multi-functionality. In this paper we carry out a probabilistic analysis that emphasizes some reliability concerns of Yarn at the job master level. This is a critical point, since the failure of a job master entails the failure of all the workers managed by that master. We propose Diarchy, a novel system for the management of job masters. Its aim is to increase the reliability of Yarn, based on the sharing and backup of responsibilities between two masters working as peers. The evaluation results show that Diarchy outperforms Yarn in reliability in different setups, regardless of cluster size, type of job, or average failure rate, and suggest a positive impact of this approach compared to the traditional, single-master Hadoop architecture.	Bunjamin Memishi, María S. Pérez, Gabriel Antoniu
61	MPI-Parallel Discrete Adjoint OpenFOAM [abstract] Abstract: OpenFOAM is a powerful Open-Source (GPLv3) Computational Fluid Dynamics tool box with a rising adoption in both academia and industry due to its continuously growing set of features and the lack of license costs. Our previously developed discrete adjoint version of OpenFOAM allows us to calculate derivatives of arbitrary objectives with respect to a potentially very large number of input parameters at a relative (to a single primal flow simulation) computational cost which is independent of that number. Discrete adjoint OpenFOAM enables us to run gradient-based methods such as topology optimization efficiently. Up until recently only a serial version was available limiting both the computing performance and the amount of memory available for the solution of the problem. In this paper we describe a first parallel version of discrete adjoint OpenFOAM based on our adjoint MPI library.	Markus Towara, Michel Schanen, Uwe Naumann
98	Versioned Distributed Arrays for Resilience in Scientific Applications: Global View Resilience [abstract] Abstract: Exascale studies project reliability challenges for future HPC systems. We propose the Global View Resilience (GVR) system, a library that enables applications to add resilience in a portable, application-controlled fashion using versioned distributed arrays. We describe GVR’s interfaces to distributed arrays, versioning, and cross-layer error recovery. Using several large applications (OpenMC, preconditioned conjugate gradient (PCG) solver, ddcMD, and Chombo), we evaluate the programmer effort to add resilience. The required changes are small (<2% LOC), localized, and machine-independent, requiring no software architecture changes. We also measure the overhead of adding GVR versioning and show that generally overheads <2% are achieved. Thus, we conclude that GVR’s interfaces and implementation are flexible, portable, and create a gentle-slope path to tolerate growing error rates in future systems.	Andrew Chien, Pavan Balaji, Pete Beckman, Nan Dun, Aiman Fang, Hajime Fujita, Kamil Iskra, Zachary Rubenstein, Ziming Zheng, Robert Schreiber, Jeff Hammond, James Dinan, Ignacio Laguna, David Richards, Anshu Dubey, Brian van Straalen, Mark Hoemmen, Michael Heroux, Keita Teranishi, Andrew Siegel
106	Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC [abstract] Abstract: High throughput computing (HTC) has aided the scientific community in the analysis of vast amounts of data and computational jobs in distributed environments. To manage these large workloads, several systems have been developed to efficiently allocate and provide access to distributed resources. Many of these systems rely on estimates of job characteristics (e.g., job runtime) to characterize the workload behavior, and such estimates are hard to obtain in practice. In this work, we perform an exploratory analysis of the CMS experiment workload using the statistical recursive partitioning method and conditional inference trees to identify patterns that characterize particular behaviors of the workload. We then propose an estimation process to predict job characteristics based on the collected data. Experimental results show that our process estimates job runtime with 75% accuracy on average, and produces nearly optimal predictions for disk and memory consumption.	Rafael Ferreira Da Silva, Mats Rynge, Gideon Juve, Igor Sfiligoi, Ewa Deelman, James Letts, Frank Wuerthwein, Miron Livny
182	Performance Tuning of MapReduce Jobs Using Surrogate-Based Modeling [abstract] Abstract: Modeling workflow performance is crucial for finding optimal configuration parameters and optimizing execution times. We apply the method of surrogate-based modeling to performance tuning of MapReduce jobs. We build a surrogate model defined by a multivariate polynomial containing a variable for each parameter to be tuned. For illustrative purposes, we focus on just two parameters: the number of parallel mappers and the number of parallel reducers. We demonstrate that an accurate performance model can be built by sampling a small subset of the parameter space. We compare the accuracy and cost of building the model when using different sampling methods as well as when using different modeling approaches. We conclude that the surrogate-based approach we describe is both less expensive in terms of sampling time and more accurate than other well-known tuning methods.	Travis Johnston, Mohammad Alsulmi, Pietro Cicotti, Michela Taufer

ICCS 2015 Main Track (MT) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M101

Chair: Markus Towara

305	A Neural Network Embedded System for Real-Time Estimation of Muscle Forces [abstract] Abstract: This work documents the progress towards the implementation of an embedded solution for muscular force assessment during cycling activity. The core of the study is the adaptation of an inverse biomechanical model to a real-time paradigm. The model is well suited for real-time applications since all the optimization problems are solved through a direct neural estimator. The real-time version of the model was implemented on an embedded microcontroller platform to profile code performance and precision degradation, using different numerical techniques to balance speed and accuracy in an environment with low computational resources.	Gabriele Maria Lozito, Maurizio Schmid, Silvia Conforto, Francesco Riganti Fulginei, Daniele Bibbo
366	Towards Scalability and Data Skew Handling in GroupBy-Joins using MapReduce Model [abstract] Abstract: For over a decade, MapReduce has been the leading programming model for parallel and massive processing of large volumes of data. This has been driven by the development of many frameworks such as Spark, Pig and Hive, facilitating data analysis on large-scale systems. However, these frameworks remain vulnerable to communication costs, data skew and task imbalance problems. This can have a devastating effect on the performance and scalability of these systems, particularly when processing GroupBy-Join queries on large datasets. In this paper, we present a new GroupBy-Join algorithm that reduces communication costs considerably while avoiding data skew effects. A cost analysis of this algorithm shows that our approach is insensitive to data skew and ensures perfect balancing properties during all stages of GroupBy-Join computation, even for highly skewed data. This performance has been confirmed by a series of experiments.	Mohamad Al Hajj Hassan, Mostafa Bamha
452	MREv: an Automatic MapReduce Evaluation Tool for Big Data Workloads [abstract] Abstract: The popularity of Big Data computing models like MapReduce has caused the emergence of many frameworks oriented to High Performance Computing (HPC) systems. The suitability of each one to a particular use case depends on its design and implementation, the underlying system resources and the type of application to be run. Therefore, the appropriate selection of one of these frameworks generally involves the execution of multiple experiments in order to assess their performance, scalability and resource efficiency. This work studies the main issues of this evaluation, proposing a new MapReduce Evaluator (MREv) tool which unifies the configuration of the frameworks, eases the task of collecting results and generates resource utilization statistics. Moreover, a practical use case is described, including examples of the experimental results provided by this tool. MREv is available to download at http://mrev.des.udc.es.	Jorge Veiga, Roberto R. Expósito, Guillermo L. Taboada, Juan Tourino
604	Load-Balancing for Large Scale Situated Agent-Based Simulations [abstract] Abstract: In large scale agent-based simulations, memory and computational power requirements can increase dramatically because of high numbers of agents and interactions. To be able to simulate millions of agents, distributing the simulator on a computer network is promising, but raises some issues such as agent allocation and load-balancing between machines. In this paper, we study the best ways to automatically balance the loads between machines in large scale situations. We study the performance of two different applications with two different distribution approaches, and we show in our experimental results that some applications can automatically adapt the loads between machines and achieve higher performance in large scale simulations with one distribution approach than with the other.	Omar Rihawi, Yann Secq, Philippe Mathieu
669	Changing CPU Frequency in CoMD Proxy Application Offloaded to Intel Xeon Phi Co-processors [abstract] Abstract: Obtaining exascale performance is a challenge. Although the technology of today features hardware with very high levels of concurrency, exascale performance is primarily limited by energy consumption. This limitation has led to the use of GPUs and specialized hardware such as many integrated core (MIC) co-processors and FPGAs for computation acceleration. The Intel Xeon Phi co-processor, built upon the MIC architecture, features many low frequency, energy efficient cores. Applications, even those which do not saturate the large vector processing unit in each core, may benefit from the energy-efficient hardware and software of the Xeon Phi. This work explores the energy savings of applications which have not been optimized for the co-processor. Dynamic voltage and frequency scaling (DVFS) is often used to reduce energy consumption during portions of the execution where performance is least likely to be affected. This work investigates the impact on energy and performance when DVFS is applied to the CPU during MIC-offloaded sections (i.e., code segments to be processed on the co-processor). Experiments, conducted on the molecular dynamics proxy application CoMD, show that as much as 14% energy may be saved if two Xeon Phis are used. When DVFS is applied to the host CPU frequency, energy savings of as high as 9% are obtained in addition to the 8% saved from reducing link-cell count.	Gary Lawson, Masha Sosonkina, Yuzhong Shen

ICCS 2015 Main Track (MT) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M101

Chair: Gabriele Maria Lozito

37	Improving OpenCL programmability with the Heterogeneous Programming Library [abstract] Abstract: The use of heterogeneous devices is becoming increasingly widespread. Their main drawback is their low programmability due to the large amount of details that must be handled. Another important problem is the reduced code portability, as most of the tools to program them are vendor or device-specific. The exception to this observation is OpenCL, which largely suffers from the reduced programmability problem mentioned, particularly on the host side. The Heterogeneous Programming Library (HPL) is a recent proposal to improve this situation, as it couples portability with good programmability. While the HPL kernels must be written in a language embedded in C++, users may prefer to use OpenCL kernels for several reasons such as their growing availability or a faster development from existing codes. In this paper we extend HPL to support the execution of native OpenCL kernels and we evaluate the resulting solution in terms of performance and programmability, achieving very good results.	Moises Vinas, Basilio B. Fraguela, Zeki Bozkus, Diego Andrade
241	Efficient Particle-Mesh Spreading on GPUs [abstract] Abstract: The particle-mesh spreading operation maps a value at an arbitrary particle position to contributions at regular positions on a mesh. This operation is often used when a calculation involving irregular positions is to be performed in Fourier space. We study several approaches for particle-mesh spreading on GPUs. A central concern is the use of atomic operations. We are also concerned with the case where spreading is performed multiple times using the same particle configuration, which opens the possibility of preprocessing to accelerate the overall computation time. Experimental tests show which algorithms are best under which circumstances.	Xiangyu Guo, Xing Liu, Peng Xu, Zhihui Du, Edmond Chow
279	AMA: Asynchronous Management of Accelerators for Task-based Programming Models [abstract] Abstract: Computational science has benefited in the last years from emerging accelerators that increase the performance of scientific simulations, but using these devices hinders the programming task. This paper presents AMA: a set of optimization techniques to efficiently manage multi-accelerator systems. AMA maximizes the overlap of computation and communication in a blocking-free way. Then, we can use such spare time to do other work while waiting for device operations. Implemented on top of a task-based framework, the experimental evaluation of AMA on a quad-GPU node shows that we reach the performance of a hand-tuned native CUDA code, with the advantage of fully hiding the device management. In addition, we obtain up to more than 2x performance speed-up with respect to the original framework implementation.	Judit Planas, Rosa M. Badia, Eduard Ayguadé, Jesús Labarta
286	Adaptive Partitioning for Irregular Applications on Heterogeneous CPU-GPU Chips [abstract] Abstract: Commodity processors are comprised of several CPU cores and one integrated GPU. To fully exploit this type of architecture, one needs to automatically determine how to partition the workload between both devices. This is especially challenging for irregular workloads, where each iteration's work is data dependent and shows control and memory divergence. In this paper, we present a novel adaptive partitioning strategy specially designed for irregular applications running on heterogeneous CPU-GPU chips. The main novelty of this work is that the size of the workload assigned to the GPU and CPU adapts dynamically to maximize the GPU and CPU utilization while balancing the workload among the devices. Our experimental results on an Intel Haswell architecture using a set of irregular benchmarks show that our approach outperforms exhaustive static and adaptive state-of-the-art approaches in terms of performance and energy consumption.	Antonio Vilches, Rafael Asenjo, Angeles Navarro, Francisco Corbera, Ruben Gran, Maria Garzaran
304	Using high performance algorithms for the hybrid simulation of disease dynamics on CPU and GPU [abstract] Abstract: In the current work the authors present several approaches to the high performance simulation of human diseases propagation using hybrid two-component imitational models. The models under study were created by coupling compartmental and discrete-event submodels. The former is responsible for the simulation of the demographic processes in a population while the latter deals with a disease progression for a certain individual. The number and type of components used in a model may vary depending on the research aims and data availability. The introduced high performance approaches are based on batch random number generation, distribution of simulation runs and the calculations on graphical processor units. The emphasis was made on the possibility to use the approaches for various model types without considerable code refactoring for every particular model. The speedup gained was measured on simulation programs written in C++ and MATLAB for the models of HIV and tuberculosis spread and the models of tumor screening for the prevention of colorectal cancer. The benefits and drawbacks of the described approaches along with the future directions of their development are discussed.	Vasiliy Leonenko, Nikolai Pertsev, Marc Artzrouni

ICCS 2015 Main Track (MT) Session 4

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M101

Chair: Sascha Hell

411	Point Distribution Tensor Computation on Heterogeneous Systems [abstract] Abstract: Big data in observational and computational sciences impose increasing challenges on data analysis. In particular, data from light detection and ranging (LIDAR) measurements are questioning conventional methods of CPU-based algorithms due to their sheer size and complexity as needed for decent accuracy. These data describing terrains are natively given as big point clouds consisting of millions of independent coordinate locations from which meaningful geometrical information content needs to be extracted. The method of computing the point distribution tensor is a very promising approach, yielding good results to classify domains in a point cloud according to local neighborhood information. However, an existing KD-Tree parallel approach, provided by the VISH visualization framework, may very well take several days to deliver meaningful results on a real-world dataset. Here we present an optimized version based on uniform grids implemented in OpenCL that is able to deliver results of equal accuracy up to 24 times faster on the same hardware. The OpenCL version is also able to benefit from a heterogeneous environment and we analyzed and compared the performance on various CPU, GPU and accelerator hardware platforms. Finally, aware of the heterogeneous computing trend, we propose two low-complexity dynamic heuristics for the scheduling of independent dataset fragments in multi-device heterogeneous systems.	Ivan Grasso, Marcel Ritter, Biagio Cosenza, Werner Benger, Günter Hofstetter, Thomas Fahringer
465	Toward a multi-level parallel framework on GPU cluster with PetSC-CUDA for PDE-based Optical Flow computation [abstract] Abstract: In this work we present a multi-level parallel framework for the Optical Flow computation on a GPU cluster, equipped with a scientific computing middleware (the PETSc library). Starting from a flow-driven isotropic method, which models the optical flow problem through a parabolic partial differential equation (PDE), we have designed a parallel algorithm and its software implementation that is suitable for heterogeneous computing environments (multiprocessor, single GPU and cluster of GPUs). The proposed software has been tested on real SAR image sequences. Experiments highlight the performances obtained and a gain of about 95% with respect to the sequential implementation.	Salvatore Cuomo, Ardelio Galletti, Giulio Giunta, Livia Marcellino
472	Performance Analysis and Optimisation of Two-Sided Factorization Algorithms for Heterogeneous Platform [abstract] Abstract: Many applications, ranging from big data analytics to nanostructure designs, require the solution of large dense singular value decomposition (SVD) or eigenvalue problems. A first step in the solution methodology for these problems is the reduction of the matrix at hand to condensed form by two-sided orthogonal transformations. This step is standardly used to significantly accelerate the solution process. We present a performance analysis of the main two-sided factorizations used in these reductions: the bidiagonalization, tridiagonalization, and the upper Hessenberg factorizations on heterogeneous systems of multicore CPUs and Xeon Phi coprocessors. We derive a performance model and use it to guide the analysis and to evaluate performance. We develop optimized implementations for these methods that get up to 80% of the optimal performance bounds. Finally, we describe the heterogeneous multicore and coprocessor development considerations and the techniques that enable us to achieve these high-performance results. The work here presents the first highly optimized implementation of these main factorizations for Xeon Phi coprocessors. Compared to the LAPACK versions optimized by Intel for Xeon Phi (in MKL), we achieve up to 50% speedup.	Khairul Kabir, Azzam Haidar, Stanimire Tomov, Jack Dongarra
483	High-Speed Exhaustive 3-locus Interaction Epistasis Analysis on FPGAs [abstract] Abstract: Epistasis, the interaction between genes, has become a major topic in molecular and quantitative genetics. It is believed that these interactions play a significant role in genetic variations causing complex diseases. Several algorithms have been employed to detect pairwise interactions in genome-wide association studies (GWAS) but revealing higher order interactions remains a computationally challenging task. State of the art tools are not able to perform exhaustive search for all three-locus interactions in reasonable time even for relatively small input datasets. In this paper we present how a hardware-assisted design can solve this problem and provide fast, efficient and exhaustive third-order epistasis analysis with up-to-date FPGA technology.	Jan Christian Kässens, Lars Wienbrandt, Jorge González-Domínguez, Bertil Schmidt and Manfred Schimmler
487	Evaluating the Potential of Low Power Systems for Headphone-based Spatial Audio Applications [abstract] Abstract: Embedded architectures have been traditionally designed tailored to perform a dedicated (specialized) function, and in general feature a limited amount of processing resources as well as exhibit very low power consumption. In this line, the recent introduction of systems-on-chip (SoC) composed of low power multicore processors, combined with a small graphics accelerator (or GPU), presents a notable increment of the computational capacity while partially retaining the appealing low power consumption of embedded systems. This paper analyzes the potential of these new hardware systems to accelerate applications that integrate spatial information into an immersive audiovisual virtual environment or into video games. Concretely, our work discusses the implementation and performance evaluation of a headphone-based spatial audio application on the Jetson TK1 development kit, a board equipped with a SoC comprising a quad-core ARM processor and an NVIDIA "Kepler" GPU. Our implementations exploit the hardware parallelism of both types of architectures by carefully adapting the underlying numerical computations. The experimental results show that the accelerated application is able to move up to 300 sound sources simultaneously in real time on this platform.	Jose A. Belloch, Alberto Gonzalez, Rafael Mayo, Antonio M. Vidal, Enrique S. Quintana-Orti

ICCS 2015 Main Track (MT) Session 5

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M101

Chair: Lars Wienbrandt

488	Real-Time Sound Source Localization on an Embedded GPU Using a Spherical Microphone Array [abstract] Abstract: Spherical microphone arrays are becoming increasingly important in acoustic signal processing systems for their applications in sound field analysis, beamforming, spatial audio, etc. The positioning of target and interfering sound sources is a crucial step in many of the above applications. Therefore, 3D sound source localization is a highly relevant topic in the acoustic signal processing field. However, spherical microphone arrays are usually composed of many microphones and running signal processing localization methods in real time is an important issue. Some works have already shown the potential of Graphic Processing Units (GPUs) for developing high-end real-time signal processing systems. New embedded systems with integrated GPU accelerators providing low power consumption are becoming increasingly relevant. These novel systems play a very important role in the new era of smartphones and tablets, opening further possibilities to the design of high-performance compact processing systems. This paper presents a 3D source localization system using a spherical microphone array fully implemented on an embedded GPU. The real-time capabilities of these platforms are analyzed, providing also a performance analysis of the localization system under different acoustic conditions.	Jose A. Belloch, Maximo Cobos, Alberto Gonzalez, Enrique S. Quintana-Orti
81	The Scaled Boundary Finite Element Method for the Analysis of 3D Crack Interaction [abstract] Abstract: The Scaled Boundary Finite Element Method (SBFEM) can be applied to solve linear elliptic boundary value problems when a so-called scaling center can be defined such that every point on the boundary is visible from it. From a more practical point of view, this means that in linear elasticity, a separation of variables ansatz can be used for the displacements in a scaled boundary coordinate system. This approach allows an analytical treatment of the problem in the scaling direction. Only the boundary needs to be discretized with Finite Elements. Employment of the separation of variables ansatz in the virtual work balance yields a Cauchy-Euler differential equation system of second order which can be transformed into an eigenvalue problem and solved by standard eigenvalue solvers for nonsymmetric matrices. A further obtained linear equation system serves for enforcing the boundary conditions. If the scaling center is located directly at a singular point, elliptic boundary value problems containing singularities can be solved with high accuracy and computational efficiency. The application of the SBFEM to the linear elasticity problem of two meeting inter-fiber cracks in a composite laminate exposed to a simple homogeneous temperature decrease reveals the presence of hypersingular stresses.	Sascha Hell and Wilfried Becker
85	Algorithmic Differentiation of Numerical Methods: Second-Order Tangent Solvers for Systems of Parametrized Nonlinear Equations [abstract] Abstract: Forward mode algorithmic differentiation transforms implementations of multivariate vector functions as computer programs into first directional derivative (also: first-order tangent) code. Its reapplication yields higher directional derivative (higher-order tangent) code. Second derivatives play an important role in nonlinear programming. For example, second-order (Newton-type) nonlinear optimization methods promise faster convergence in the neighborhood of the minimum through taking into account second derivative information. Part of the objective function may be given implicitly as the solution of a system of n parameterized nonlinear equations. If the system parameters depend on the free variables of the objective, then second derivatives of the nonlinear system’s solution with respect to those parameters are required. The local computational overhead for the computation of second-order tangents of the solution vector with respect to the parameters by Algorithmic Differentiation depends on the number of iterations performed by the nonlinear solver. This dependence can be eliminated by taking a second-order symbolic approach to differentiation of the nonlinear system.	Niloofar Safiran, Johannes Lotz, Uwe Naumann
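Illustrative note on paper 85 (not taken from the paper): the statement that reapplying forward mode yields second-order tangent code can be sketched with nested dual numbers. The `Dual` class, the `second_order_tangent` helper and the test function `f` below are hypothetical names chosen for this sketch; it covers only addition and multiplication of a scalar function and does not attempt the nonlinear-solver setting the abstract addresses.

```python
class Dual:
    """Value plus one directional derivative (a first-order tangent pair)."""

    def __init__(self, val, dot=0.0):
        self.val = val
        self.dot = dot

    @staticmethod
    def lift(other):
        # Treat plain numbers as constants with zero derivative.
        return other if isinstance(other, Dual) else Dual(other, 0.0)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule applied to the tangent component.
        return Dual(self.val * other.val,
                    self.val * other.dot + self.dot * other.val)

    __rmul__ = __mul__


def second_order_tangent(f, x):
    """Apply forward mode twice by nesting Dual inside Dual.

    Returns (f(x), f'(x), f''(x)) for a scalar function built
    from + and * only.
    """
    inner = Dual(x, 1.0)                 # seeds the first tangent
    outer = Dual(inner, Dual(1.0, 0.0))  # seeds the second tangent
    y = f(outer)
    return y.val.val, y.val.dot, y.dot.dot


def f(x):
    # Hypothetical test function: f(x) = x^3 + 2x
    return x * x * x + 2 * x


value, first, second = second_order_tangent(f, 2.0)
```

For the test function above, the three returned numbers can be checked by hand against f(2), f'(2) = 3*2^2 + 2 and f''(2) = 6*2.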

ICCS 2015 Main Track (MT) Session 6

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M101

Chair: Niloofar Safiran

131	How High a Degree is High Enough for High Order Finite Elements? [abstract] Abstract: High order finite element methods can solve partial differential equations more efficiently than low order methods. But how large of a polynomial degree is beneficial? This paper addresses that question through a case study of three problems representing problems with smooth solutions, problems with steep gradients, and problems with singularities. It also contrasts h-adaptive, p-adaptive, and hp-adaptive refinement. The results indicate that for low accuracy requirements, like 1% relative error, h-adaptive refinement with relatively low order elements is sufficient, and for high accuracy requirements, p-adaptive refinement is best for smooth problems and hp-adaptive refinement with elements up to about 10th degree is best for other problems.	William Mitchell
179	Higher-Order Discrete Adjoint ODE Solver in C++ for Dynamic Optimization [abstract] Abstract: Parametric ordinary differential equations (ODE) arise in many engineering applications. We consider ODE solutions to be embedded in an overall objective function which is to be minimized, e.g. for parameter estimation. For derivative-based optimization algorithms adjoint methods should be used. In this article, we present a discrete adjoint ODE integration framework written in C++ (NIXE 2.0) combined with Algorithmic Differentiation by overloading (dco/c++). All required derivatives, i.e. Jacobians for the integration as well as gradients and Hessians for the optimization, are generated automatically. With this framework, derivatives of arbitrary order can be implemented with minimal programming effort. The practicability of this approach is demonstrated in a dynamic parameter estimation case study for a batch fermentation process using sequential method of dynamic optimization. Ipopt is used as the optimizer which requires second derivatives.	Johannes Lotz, Uwe Naumann, Alexander Mitsos, Tobias Ploch, Ralf Hannemann-Tamás
211	A novel Factorized Sparse Approximate Inverse preconditioner with supernodes [abstract] Abstract: Krylov methods preconditioned by Factorized Sparse Approximate Inverses (FSAI) are an efficient approach for the solution of symmetric positive definite linear systems on massively parallel computers. However, FSAI often suffers from a high set-up cost, especially in ill-conditioned problems. In this communication we propose a novel algorithm for the FSAI computation that makes use of the concept of supernode borrowed from sparse LU factorizations and direct methods.	Massimiliano Ferronato, Carlo Janna, Giuseppe Gambolati
343	Nonsymmetric preconditioning for conjugate gradient and steepest descent methods [abstract] Abstract: We analyze a possibility of turning off post-smoothing (relaxation) in geometric multigrid when used as a preconditioner in preconditioned conjugate gradient (PCG) linear and eigenvalue solvers for the 3D Laplacian. The geometric Semicoarsening Multigrid (SMG) method is provided by the hypre parallel software package. We solve linear systems using two variants (standard and flexible) of PCG and preconditioned steepest descent (PSD) methods. The eigenvalue problems are solved using the locally optimal block preconditioned conjugate gradient (LOBPCG) method available in hypre through BLOPEX software. We observe that turning off the post-smoothing in SMG dramatically slows down the standard PCG-SMG. For flexible PCG and LOBPCG, our numerical tests show that removing the post-smoothing results in overall 40-50 percent acceleration, due to the high costs of smoothing and relatively insignificant decrease in convergence speed. We demonstrate that PSD-SMG and flexible PCG-SMG converge similarly if SMG post-smoothing is off. A theoretical justification is provided.	Henricus Bouwmeester, Andrew Dougherty, Andrew Knyazev

ICCS 2015 Main Track (MT) Session 7

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: M101

Chair: Michal Marks

345	Dynamics with Matrices Possessing Kronecker Product Structure [abstract] Abstract: In this paper we present an application of the Alternating Direction Implicit algorithm to solving non-stationary PDEs, which yields linear computational complexity. We illustrate this approach by solving two example non-stationary three-dimensional problems using the explicit Euler time-stepping scheme: the heat equation and the linear elasticity equations for a cube.	Marcin Łoś, Maciej Woźniak, Maciej Paszyński, Lisandro Dalcin, Victor M. Calo
360	A Nonuniform Staggered Cartesian Grid Approach for Lattice-Boltzmann Method [abstract] Abstract: We propose a numerical approach based on the Lattice-Boltzmann method (LBM) for dealing with mesh refinement of Non-uniform Staggered Cartesian Grid. We explain, in detail, the strategy for mapping LBM over such geometries. The main benefit of this approach, compared to others, consists of solving all fluid units only once per time-step, and also reducing considerably the complexity of the communication and memory management between different refined levels. Also, it exhibits a better matching for parallel processors. To validate our method, we analyze several standard test scenarios, reaching satisfactory results with respect to other state-of-the-art methods. The performance evaluation proves that our approach not only exhibits a simpler and efficient scheme for dealing with mesh refinement, but also fast resolution, even in those scenarios where our approach needs to use a higher number of fluid units.	Pedro Valero-Lara, Johan Jansson
48	A Novel Cost Estimation Approach for Wood Harvesting Operations Using Symbolic Planning [abstract] Abstract: While forestry is an important economic factor, the methods commonly used to estimate potential financial gains from undertaking a harvesting operation are usually based on heuristics and experience. Those methods use an abstract view on the harvesting project at hand, focusing on a few general statistical parameters. To improve the accuracy of felling cost estimates, we propose a novel, single-tree-based cost estimation approach, which utilizes knowledge about the harvesting operation at hand to allow for a more specific and accurate estimate of felling costs. The approach utilizes well-known symbolic planning algorithms which are interfaced via the Planning Domain Definition Language (PDDL) and compile work orders. The work orders can then be used to estimate the total working time and thus the estimated cost for an individual harvesting project, as well as some additional efficiency statistics. Since a large proportion of today's harvesting operations are mechanized instead of motor manual, we focus on the planning of harvester and forwarder workflows. However, the use of these heavy forest machines carries the risk of damaging forest soil when repeatedly driving along skidding roads. Our approach readily allows for assessment of these risks.	Daniel Losch, Nils Wantia, Jürgen Roßmann
140	Genetic Algorithm using Theory of Chaos [abstract] Abstract: This paper is focused on a genetic algorithm with a chaotic crossover operator. We have performed some experiments to study the possible use of chaos in simulated evolution. A novel genetic algorithm with a chaotic optimization operation is proposed for the optimization of multimodal functions. As the basis of a new crossover operator, a simple equation involving chaos is used, specifically the logistic function. The logistic function is a simple one-parameter function of the second order that shows chaotic behavior for some values of the parameter. Generally, the solution of the logistic function has three areas of behavior: convergent, periodic and chaotic. We have supposed that the convergent behavior leads to exploitation and the chaotic behavior aids exploration. The periodic behavior is probably neutral and thus negligible. Results of our experiments confirm these expectations. The proposed genetic algorithm with a chaotic crossover operator leads to more efficient computation in comparison with the traditional genetic algorithm.	Petra Snaselova, Frantisek Zboril
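Illustrative note on paper 140 (not taken from the paper): the three behaviors of the logistic function mentioned in the abstract can be observed by iterating the standard logistic map x_{n+1} = r * x_n * (1 - x_n). The parameter values 2.8, 3.2 and 3.9, the starting point and the helper names below are assumptions made for this sketch; the abstract does not say how the map enters the crossover operator, so no crossover is shown.

```python
def logistic_orbit(r, x0=0.2, burn_in=1000, keep=8):
    """Iterate x -> r*x*(1-x), discard a transient, return the next values."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


def distinct_values(orbit, tol=1e-6):
    """Count values in the orbit that differ by more than tol."""
    seen = []
    for v in orbit:
        if all(abs(v - s) > tol for s in seen):
            seen.append(v)
    return len(seen)


convergent = logistic_orbit(2.8)  # settles on the fixed point 1 - 1/r
periodic = logistic_orbit(3.2)    # alternates between two values
chaotic = logistic_orbit(3.9)     # no short repeating pattern

for name, orbit in [("convergent", convergent),
                    ("periodic", periodic),
                    ("chaotic", chaotic)]:
    print(name, distinct_values(orbit), [round(v, 4) for v in orbit])
```

Counting distinct values after the transient is a crude classifier, adequate only to tell the three regimes apart for parameter values like these.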
271	PSO-based Distributed Algorithm for Dynamic Task Allocation in a Robotic Swarm [abstract] Abstract: Dynamic task allocation in a robotic swarm is a necessary process for proper management of the swarm. It allows the distribution of the identified tasks to be performed, among the swarm of robots, in such a way that a pre-defined proportion of execution of those tasks is achieved. In this context, there is no central unit to take care of the task allocation. So any proposed algorithm must be distributed, allowing each robot in the swarm to identify the task it must perform. This paper proposes a distributed control algorithm to implement dynamic task allocation in a swarm robotics environment. The algorithm is inspired by particle swarm optimization. In this context, each robot in the swarm must run the algorithm periodically in order to control the underlying actions and decisions. The algorithm was implemented on ELISA III real swarm robots and extensively tested. The algorithm is effective and the corresponding performance is promising.	Nadia Nedjah, Rafael Mendonça, Luiza De Macedo Mourelle

ICCS 2015 Main Track (MT) Session 8

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: M101

Chair: Nadia Nedjah

469	Expressively Modeling the Social Golfer Problem in SAT [abstract] Abstract: Constraint Satisfaction Problems allow one to expressively model problems. On the other hand, propositional satisfiability problem (SAT) solvers can handle huge SAT instances. We thus present a technique to expressively model set constraint problems and to encode them automatically into SAT instances. Our technique is expressive and less error-prone. We apply it to the Social Golfer Problem and to symmetry breaking of the problem.	Frederic Lardeux, Eric Monfroy
538	Multi-Objective Genetic Algorithm for Variable Selection in Multivariate Classification Problems: A Case Study in Verification of Biodiesel Adulteration [abstract] Abstract: This paper proposes a multi-objective genetic algorithm for the problem of variable selection in multivariate calibration. We consider the problem related to the classification of biodiesel samples to detect adulteration, using a Linear Discriminant Analysis classifier. The goal of the multi-objective algorithm is to reduce the dimensionality of the original set of variables; thus, the classification model can be less sensitive, providing a better generalization capacity. In particular, in this paper we adopted a version of the Non-dominated Sorting Genetic Algorithm (NSGA-II) and compare it to a mono-objective Genetic Algorithm (GA) in terms of sensitivity in the presence of noise. Results show that the mono-objective selects 20 variables on average and presents an error rate of 14%. On the other hand, the multi-objective selects 7 variables and has an error rate of 11%. Consequently, we show that the multi-objective formulation provides classification models with lower sensitivity to the instrumental noise when compared to the mono-objective formulation.	Lucas de Almeida Ribeiro, Anderson Da Silva Soares
653	Sitting Multiple Observers for Maximum Coverage: An Accurate Approach [abstract] Abstract: The selection of the lowest number of observers that ensures the maximum visual coverage over an area represented by a digital elevation model (DEM) is an important problem with great interest in many fields, e.g., telecommunications, environment planning, among others. However, this problem is complex and intractable when the number of points of the DEM is relatively high. This complexity is due to three issues: 1) the difficulty in determining the visibility of the territory from a point, 2) the need to know the visibility at all points of the territory and 3) the combinatorial complexity of the selection of observers. The recent progress in total-viewshed maps computation not only provides an efficient solution to the first two problems, but also opens other ways to new solutions that were unthinkable previously. This paper presents a new type of cartography, called the masked total viewshed map, and based on this algorithm, optimal solutions for both sequential and simultaneous observers location are provided.	Antonio Manuel Rodriguez Cervilla, Siham Tabik, Luis Felipe Romero Gómez
169	USING CRITERIA RECONSTRUCTION OF LOW-SAMPLING TRAJECTORIES AS A TOOL FOR ANALYTICS [abstract] Abstract: Today, a lot of applications with incorporated Geo Positional Systems (GPS) deliver huge quantities of spatio-temporal data. Trajectories followed by moving objects can be generated from this data. However, these trajectories may have silent durations, i.e., time durations when no data are available for describing the route of a moving object. As a result, the movement during silent durations must be described and the low-sampling trajectory data needs to be filled in using specialized techniques of data imputation to study and discover new knowledge based on movement. Our interest is to show opportunities of analytical tasks using a criteria-based operator over reconstructed low-sampling trajectories. Also, a simple visual analysis of the reconstructed trajectories is done to offer a simple analytic perspective of the reconstruction and how the criterion of movement can change the analysis. To the best of our knowledge, this work is the first attempt to use the different trajectory reconstruction criteria to identify the opportunities of analytical tasks over reconstructed low-sampling trajectories as a whole.	Francisco Moreno, Edison Ospina, Iván Amón Uribe
258	Using Genetic Algorithms for Maximizing Technical Efficiency in Data Envelopment Analysis [abstract] Abstract: Data Envelopment Analysis (DEA) is a non-parametric technique for estimating the technical efficiency of a set of Decision Making Units (DMUs) from a database consisting of inputs and outputs. This paper studies DEA models based on maximizing technical efficiency, which aim to determine the least distance from the evaluated DMU to the production frontier. Usually, these models have been solved through unsatisfactory methods used for combinatorial NP-hard problems. Here, the problem is approached by metaheuristic techniques and the solutions are compared with those of the methodology based on the determination of all the facets of the frontier in DEA. The use of metaheuristics provides solutions close to the optimum with low execution time.	Martin Gonzalez, Jose J. Lopez-Espin, Juan Aparicio, Domingo Gimenez, Jesus T. Pastor

ICCS 2015 Main Track (MT) Session 9

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: V101

Chair: Megan Olsen

673	The construction of complex networks from linear and nonlinear measures — Climate Networks [abstract] Abstract: During the last decade the techniques of complex network analysis have found application in climate research. The main idea consists in embedding the characteristics of climate variables, e.g., temperature, pressure or rainfall, into the topology of complex networks by appropriate linear and nonlinear measures. Applying such measures to climate time series leads to defining links between their corresponding locations in the studied region, whereas the locations are the network’s nodes. The resulting networks are then analysed using the various network analysis tools present in the literature in order to gain better insight into the processes, patterns and interactions occurring in the climate system. In this regard we present ClimNet, a complete set of software tools to construct climate networks based on a wide range of linear (cross correlation) and nonlinear (information-theoretic) measures. The presented software allows the construction of large networks’ adjacency matrices from climate time series while supporting functions to tune relationships to different time-scales by means of symbolic ordinal analysis. The provided tools have been used in the production of various original contributions in climate research. This work presents an in-depth description of the implemented statistical functions widely used to construct climate networks. Additionally, a general overview of the architecture of the developed software is provided as well as a brief analysis of application examples.	J. Ignacio Deza, Hisham Ihshaish
70	Genetic Algorithm evaluation of green search allocation policies in multilevel complex urban scenarios [abstract] Abstract: This paper investigates the relationship between the underlying complexity of urban agent-based models and the performance of optimisation algorithms. In particular, we address the problem of optimal green space allocation within a densely populated urban area. We find that a simple monocentric urban growth model may not contain enough complexity to be able to take complete advantage of advanced optimisation techniques such as Genetic Algorithms (GA) and that, in fact, simple greedy baselines can find a better policy for these simple models. We then turn to more realistic urban models and show that the performance of GA increases with model complexity and uncertainty level.	Marta Vallejo, Verena Rieser and David Corne
80	A unified and memory efficient framework for simulating mechanical behavior of carbon nanotubes [abstract] Abstract: Carbon nanotubes possess many interesting properties, which make them a promising material for a variety of applications. In this paper, we present a unified framework for the simulation of mechanical behavior of carbon nanotubes. It allows the creation, simulation and visualization of these structures, extending previous work by the research group "MISMO" at TU Darmstadt. In particular, we develop and integrate a new iterative solving procedure, employing the conjugate gradient method, that drastically reduces the memory consumption in comparison to the existing approaches. The increase in operations for the memory saving approach is partially offset by a well-scaling shared-memory parallelization. In addition, the hotspots in the code have been vectorized. Altogether, the resulting simulation framework enables the simulation of complex carbon nanotubes on commodity multicore desktop computers.	Michael Burger, Christian Bischof, Christian Schröppel, Jens Wackerfuß
129	Towards an Integrated Conceptual Design Evaluation of Mechatronic Systems: The SysDICE Approach [abstract] Abstract: Mechatronic systems play a significant role in different types of industry, especially in transportation, aerospace, automotive and manufacturing. Although their multidisciplinary nature provides enormous functionalities, it is still one of the substantial challenges which frequently impede their design process. Notably, the conceptual design phase aggregates various engineering disciplines, project and business management fields, where different methods, modeling languages and software tools are applied. Therefore, an integrated environment is required to closely engage the different domains together. This paper outlines a model-based research approach for an integrated conceptual design evaluation of mechatronic systems using SysML. In particular, the state of the art, the most important challenges and the remaining problems in this field are highlighted, and a novel solution named SysDICE is proposed, combining model-based systems engineering and artificial intelligence techniques to support efficient design.	Mohammad Chami, Jean-Michel Bruel
164	MDE in Practice for Computational Science [abstract] Abstract: Computational Science tackles complex problems by definition. These problems concern people not only in large scale, but in their day-to-day life. With the development of computing facilities, novel application areas can legitimately benefit from the existing experience in the field. Nevertheless, the lack of reusability, the growth in complexity, and the “computing-oriented” nature of the actual solutions call for several improvements. Among these, raising the level of abstraction is the one we address in this paper. As an illustration we can mention the problem of the validity of experiments, which depends on the validity of the defined programs (bugs not in the experiment and data but in the simulators/validators!). This raises the need to leverage knowledge and expertise. In the software and systems modeling community, research on domain-specific modeling languages (DSMLs) has focused over the last decade on providing technologies for developing languages and tools that allow domain experts to develop system solutions efficiently. In this vision paper, based on concrete experiments, we claim that DSMLs can bridge the gap between the (problem) space in which scientists work and the implementation (programming) space. Incorporating domain-specific concepts and high-quality development experience into DSMLs can significantly improve scientist productivity and experimentation quality.	Jean-Michel Bruel, Benoit Combemale, Ileana Ober, Helene Raynal

ICCS 2015 Main Track (MT) Session 10

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: V101

Chair: Wentong Cai

153	Co-evolution in Predator Prey through Reinforcement Learning [abstract] Abstract: In general we know that high-level species such as mammals must learn from their environment to survive. We believe that most species evolved over time by ancestors learning the best traits, which allowed them to propagate more than their less effective counterparts. In many instances, learning occurs in a competitive environment, where a species is evolving alongside its food source and/or its predator. We are unaware of work that studies co-evolution of predator and prey through simulation such that each entity learns to survive within its world, and passes that information on to its progeny, without running multiple training runs. We propose an agent-based model of predators and prey with co-evolution through feature-based Q-learning, to allow predators and prey to learn during their lifetime. We show that this learning results in a more successful species for both predator and prey. We suggest that feature-based Q-learning is more effective for this problem than traditional variations on reinforcement learning, and would improve current population dynamics simulations.	Megan Olsen and Rachel Fraczkowski
184	Adaptive Autonomous Navigation using Reactive Multi-agents System for Control Laws Merging [abstract] Abstract: This paper deals with intelligent autonomous navigation of a vehicle in a cluttered environment. We present a control architecture for safe and smooth navigation of an Unmanned Ground Vehicle (UGV). This control architecture is designed to allow the use of a single control law for different vehicle contexts (attraction to the target, obstacle avoidance, etc.). The reactive obstacle avoidance strategy is based on the limit-cycle approach. To manage the interaction between the controllers according to the context, a multi-agent system is proposed. Multi-agent systems are an efficient approach for problem solving and decision making. They can be applied to a wide range of applications thanks to their intrinsic properties such as self-organization and emergent phenomena. The merging of control laws relies on these properties to adapt the control to the environment. Different simulations in cluttered environments show the performance and efficiency of our proposal in obtaining a fully reactive and safe control strategy for the navigation of a UGV.	Baudouin Dafflon, Franck Gechter, José Vilca, Lounis Adouane
309	Quantitative Evaluation of Decision Effects in the Management of Emergency Department Problems [abstract] Abstract: Due to the complexity and crucial role of an Emergency Department (ED) in the healthcare system, the ability to more accurately represent, simulate and predict the performance of an ED will be invaluable for decision makers solving management problems. One way to meet this requirement is by modeling and simulating the emergency department. The objective of this research is to design a simulator in order to better understand the bottlenecks of ED performance and to provide the ability to predict such performance under defined conditions. An agent-based modeling approach was used to model the healthcare staff, patients and physical resources in the ED. This agent-based simulator provides the advantage of knowing the behavior of an ED system from the micro-level interactions among its components. The model was built in collaboration with healthcare staff in a typical ED and has been implemented and verified in the NetLogo modeling environment. Case studies are provided to present some capabilities of the simulator in the quantitative analysis of ED behavior and in supporting decision making. Because of the complexity of the system, high performance computing technology was used to increase the number of studied scenarios and reduce execution time.	Zhengchun Liu, Eduardo Cabrera, Manel Taboada, Francisco Epelde, Dolores Rexachs, Emilio Luque
310	Agent Based Model and Simulation of MRSA Transmission in Emergency Departments [abstract] Abstract: In healthcare environments we can find several microorganisms causing nosocomial infection, of which one of the most common and most dangerous is Methicillin-resistant Staphylococcus aureus (MRSA). Its presence can lead to serious complications for the patient. Our work uses Agent Based Modeling and Simulation techniques to build a model and simulation of MRSA contact transmission in emergency departments. The simulator allows us to build virtual scenarios with the aim of understanding the phenomenon of MRSA transmission and the potential impact of implementing different measures on propagation rates.	Cecilia Jaramillo, Manel Taboada, Francisco Epelde, Dolores Rexachs, Emilio Luque
373	Multi-level decision system for the crossroad scenario [abstract] Abstract: Among the innovations aimed at tackling transportation issues in urban areas, one of the most promising solutions is the possibility of forming virtual trains of vehicles so as to provide a new kind of transportation system. Even if this kind of solution is now widespread in the literature, some difficulties still need to be resolved. For instance, one must find solutions that make the crossing of trains possible while maintaining train composition (trains must not be split) and safety conditions. This paper proposes a multi-level decision process aimed at dealing with this issue. The proposal is based on the dynamic adaptation of train parameters, which leads to trains crossing without any of them stopping. Results obtained in simulation are compared with a classical crossing strategy.	Bofei Chen, Franck Gechter, Abderrafiaa Koukam

ICCS 2015 Main Track (MT) Session 11

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: V101

Chair: Emilio Luque

379	Towards a Cognitive Agent-Based Model for Air Conditioners Purchasing Prediction [abstract] Abstract: Climate change as a result of human activities is a problem of paramount importance. The global temperature on Earth is gradually increasing and it may lead to substantially hotter summers in the temperate belt of Europe, which in turn is likely to influence air conditioning penetration in this region. The current work is an attempt to predict air conditioning penetration in different residential areas in the UK between 2030 and 2090 using an integration of calibrated building models, future weather predictions and an agent-based model. Simulation results suggest that up to 12% of homes would install an air conditioner in 75 years’ time assuming an average purchasing ability of the households. The performed simulations provide more insight into the influence of overheating intensity, along with households’ purchasing ability and social norms, upon households’ decisions to purchase an air conditioner.	Nataliya Mogles, Alfonso Ramallo-González, Elizabeth Gabe-Thomas
481	Crowd evacuations SaaS: an ABM approach [abstract] Abstract: Crowd evacuations involve thousands of persons in closed spaces. Having knowledge about where the problematic exits will be or where the disaster may occur can be crucial in emergency planning. We implemented a simulator using Agent Based Modelling that is able to model the behaviour of people in evacuation situations, and a workflow able to run it in the cloud. The input is just a PNG image and the output is the statistical results of the simulation executed on the cloud. This provides the user with a system abstraction: only a map of the scenario is needed. Many events are held in main city squares, so to test our system we chose Siena and fit about 28,000 individuals in the centre of the square. The software has special computational requirements because the results need to be statistically reliable. Because of these needs, we use distributed computing. In this paper we show how the simulator scales efficiently on the cloud.	Albert Gutierrez-Milla, Francisco Borges, Remo Suppi, Emilio Luque
499	Differential Evolution with Sensitivity Analysis and the Powell's Method for Crowd Model Calibration [abstract] Abstract: Evolutionary algorithms (EAs) are popular and powerful approaches for model calibration. This paper proposes an enhanced EA-based model calibration method, namely differential evolution (DE) with sensitivity analysis and Powell's method (DESAP). In contrast to traditional EA-based model calibration methods, the proposed DESAP has three main features. First, an entropy-based sensitivity analysis operation is integrated so as to dynamically identify important parameters of the model as evolution progresses online. Second, Powell's method is performed periodically to fine-tune the important parameters of the best individual in the population. Finally, in each generation, the DE operators are performed on a small number of better individuals rather than all individuals in the population. These new search mechanisms are integrated into the DE framework so as to reduce the computational cost and to improve the search efficiency. To validate its effectiveness, the proposed DESAP is applied to two crowd model calibration cases. The results demonstrate that the proposed DESAP outperforms several state-of-the-art model calibration methods in terms of accuracy and efficiency.	Jinghui Zhong and Wentong Cai
525	Strip Partitioning for Ant Colony Parallel and Distributed Discrete-Event Simulation [abstract] Abstract: Data partitioning is one of the main problems in parallel and distributed simulation. Distribution of data over the architecture directly influences the efficiency of the simulation. The partitioning strategy becomes a complex problem because it depends on several factors. In an Individual-oriented Model, for example, the partitioning is related to interactions between the individual and the environment. Therefore, parallel and distributed simulation should dynamically enable the interchange of the partitioning strategy in order to choose the most appropriate one for a specific context. In this paper, we propose a strip partitioning strategy for a spatially dependent problem in Individual-oriented Model applications. This strategy avoids sharing resources and, as a result, decreases communication volume among the processes. In addition, we develop an objective function that calculates the best partitioning for a specific configuration and gives the computing cost of each partition, allowing for a computing balance through a mapping policy. The results obtained are supported by statistical analysis and experimentation with an Ant Colony application. As a main contribution, we developed a solution where the partitioning strategy can be chosen dynamically and always returns the lowest total execution time.	Francisco Borges, Albert Gutierrez-Milla, Remo Suppi, Emilio Luque
530	Model of Collaborative UAV Swarm Toward Coordination and Control Mechanisms Study [abstract] Abstract: In recent years, thanks to the low cost of deploying and maintaining an Unmanned Aerial Vehicle (UAV) system and the possibility of operating UAVs in areas inaccessible or dangerous for human pilots, UAVs have attracted much research attention in both military and civilian applications. In order to deal with more sophisticated tasks, such as searching for survival points and multiple target monitoring and tracking, the application of UAV swarms is foreseen. This requires more complex control, communication and coordination mechanisms. However, these mechanisms are difficult to test and analyze under flight dynamic conditions. These multi-UAV scenarios are by their nature well suited to be modeled and simulated as multi-agent systems. The first step in modeling a multi-agent system is to construct the agent model, namely an accurate model representing the behavior, constraints and uncertainties of UAVs. In this paper we introduce our approach to modeling a UAV as an agent in terms of multi-agent system principles. The model is constructed to satisfy the need for a simulation environment that researchers can use to evaluate and analyze swarm control mechanisms. Simulation results of a case study are provided to demonstrate one possible use of this approach.	Xueping Zhu, Zhengchun Liu, Jun Yang

ICCS 2015 Main Track (MT) Session 12

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: V101

Chair: George Kampis

569	Simulation of Alternative Fuel Markets using Integrated System-Dynamics Model of Energy System [abstract] Abstract: An integrated system-dynamics model of energy systems is employed to explore the transition process towards alternative fuel markets. The model takes into account the entire energy system including interactions among supply sectors, energy prices, infrastructure and fuel demand. The paper presents the model structure and describes the algorithm for the short-term and long-term simulation of energy markets. The integrated model is applied to the renewable-based energy system of Iceland as a case study to simulate the transition path towards an alternative fuel market over the time horizon 2015-2050. An optimistic transition scenario towards hydrogen and biofuels is investigated for the numerical results. The market simulation algorithm effectively exhibits the continual transition towards equilibrium as market prices dynamically adjust to changes in supply and demand. The application of the model has the potential to provide important policy insights as it can simulate the impact of different policy instruments on both the supply and demand sides.	Ehsan Shafiei, Brynhildur Davíðsdóttir, Jonathan Leaver, Hlynur Stefansson, Eyjólfur Ingi Ásgeirsson
586	Information Impact on Transportation Systems [abstract] Abstract: With a broader distribution of personal smart devices and with an increasing availability of advanced navigation tools, more drivers can have access to real time information regarding the traffic situation. Our research focuses on determining how using the real time information about a transportation system could influence the system itself. We developed an agent based model to simulate the effect of drivers using real time information to avoid traffic congestion. Experiments reveal that the system's performance is influenced by the number of participants that have access to real time information. We also discover that, in certain circumstances, the system performance when all participants have information is no different from, and perhaps even worse than, when no participant has access to information.	Sorina Litescu, Vaisagh Viswanathan, Michael Lees, Alois Knoll and Heiko Aydt
609	The Multi-Agent Simulation-Based Framework for Optimization of Detectors Layout in Public Crowded Places [abstract] Abstract: In this work a framework for detector layout optimization based on multi-agent simulation is proposed. Its main intention is to provide a decision support team with a tool for the automatic design of social threat detection systems for crowded public places. Containing a number of distributed detectors, such a system performs detection and identification of threat carriers. The generic detector model used in the framework allows the detection of various types of threats to be considered, e.g. infections, explosives, drugs, radiation. The underlying agent-based models provide data on social mobility, which is used along with a probability-based quality assessment model within the optimization process. The implemented multi-criteria optimization scheme is based on a genetic algorithm. For the experimental study, the framework has been applied to obtain the optimal detector layout in Pulkovo airport.	Nikolay Butakov, Denis Nasonov, Konstantin Knyazkov, Vladislav Karbovskii, Yulya Chuprova
626	Towards Ensemble Simulation of Complex Systems [abstract] Abstract: The paper presents early-stage research aimed at developing a comprehensive conceptual and technological framework for ensemble-based simulation of complex systems. The concept of a multi-layer ensemble is presented as a background for further development of the framework to cover different kinds of ensembles: ensemble of system’s state, data ensemble, and models ensemble. A formal description of a hybrid model is provided as a core concept for ensemble-based complex system simulation. The example of a water level forecasting application is used to show selected ensemble classes covered by the proposed framework.	Sergey Kovalchuk, Alexander Boukhanovsky

ICCS 2015 Main Track (MT) Session 13

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: V101

Chair: Witold Dzwinel

712	Collaborative Knowledge Fusion by Ad-Hoc Information Distribution in Crowds [abstract] Abstract: We study situations (such as a city festival) where, in the case of a phone signal outage, cell phones can communicate opportunistically (for instance, using WiFi or Bluetooth), and we want to understand and control information spreading. A particular question is how to prevent false information from spreading and how to facilitate the spreading of useful (true) information. We introduce collaborative knowledge fusion as the operation by which individual, local knowledge claims are "merged". Such fusion events are local, e.g. they happen upon the physical meetings of knowledge providers. We study and evaluate different methods for collaborative knowledge fusion and study the conditions for, and tradeoffs of, convergence to a global true knowledge state under various conditions.	George Kampis, Paul Lukowicz
220	Modeling Deflagration in Energetic Materials using the Uintah Computational Framework [abstract] Abstract: Predictive computer simulations of large-scale deflagration and detonation are dependent on the availability of robust reaction models embedded in a computational framework capable of running on massively parallel computer architectures. We have been developing such models in the Uintah Computational Framework, which is capable of scaling up to 512k cores. Our particular interest is in predicting the deflagration-to-detonation transition (DDT) for accident scenarios involving large numbers of energetic devices; the 2005 truck explosion in Spanish Fork Canyon, UT is a prototypical example. Our current reaction model adapts components from Ward, Son and Brewster to describe the effects of pressure and initial temperature on deflagration, from Berghout et al. for burning in cracks in damaged explosives, and from Souers for describing fully developed detonation. The reaction model has been subjected to extensive validation against experimental tests. Current efforts are focused on the effects of varying the computational grid elements on multiple aspects of deflagration and the transition to detonation.	Jacqueline Beckvermit, Todd Harman, Andrew Bezdjian, Charles Wight
237	Fast Equilibration of Coarse-Grained Polymeric Liquids [abstract] Abstract: The study of macromolecular systems may require large computer simulations that are too time consuming and resource intensive to execute in full atomic detail. The integral equation coarse-graining approach by Guenza and co-workers enables the exploration of longer time and spatial scales without sacrificing thermodynamic consistency, by approximating collections of atoms using analytically-derived soft-sphere potentials. Because coarse-grained (CG) characterizations evolve polymer systems far more efficiently than the corresponding united atom (UA) descriptions, we can feasibly equilibrate a CG system to a reasonable geometry, then transform back to the UA description for a more complete equilibration. Automating the transformation between the two different representations simultaneously exploits CG efficiency and UA accuracy. By iteratively mapping back and forth between CG and UA, we can quickly guide the simulation towards a configuration that would have taken many more time steps within the UA representation alone. Accomplishing this feat requires a diligent workflow for managing input/output coordinate data between the different steps, deriving the potential at runtime, and inspecting convergence. In this paper, we present a lightweight workflow environment that accomplishes such fast equilibration without user intervention. The workflow supports automated mapping between the CG and UA descriptions in an iterative, scalable, and customizable manner. We describe this technique, examine its feasibility, and analyze its correctness.	David Ozog, Jay McCarty, Grant Gossett, Allen Malony and Marina Guenza
392	Massively Parallel Simulations of Hemodynamics in the Human Vasculature [abstract] Abstract: We present a computational model of three-dimensional and unsteady hemodynamics within the primary large arteries in the human body on 1,572,864 cores of the IBM Blue Gene/Q. Models of large regions of the circulatory system are needed to study the impact of local factors on global hemodynamics and to inform next generation drug delivery mechanisms. The HARVEY code successfully addresses key challenges that can hinder the effective solution of image-based hemodynamics on contemporary supercomputers, such as limited memory capacity and bandwidth, flexible load balancing, and scalability. This work is the first demonstration of large (> 500 cm) fluid dynamics simulations of the circulatory system modeled at resolutions as high as 10 μm.	Amanda Randles, Erik W. Draeger and Peter E. Bailey
402	Parallel performance of an IB-LBM suspension simulation framework [abstract] Abstract: We present performance results from ficsion, a general purpose parallel suspension solver, employing the Immersed-Boundary lattice-Boltzmann method (IB-LBM). ficsion is built on top of the open-source LBM framework Palabos, making use of its data structures and their inherent parallelism. We describe in brief the implementation and present weak and strong scaling results for simulations of dense red blood cell suspensions. Despite their complexity, the simulations demonstrate fairly good, close-to-linear scaling in both the weak and strong scaling scenarios.	Lampros Mountrakis, Eric Lorenz, Orestis Malaspinas, Saad Alowayyed, Bastien Chopard and Alfons G. Hoekstra

ICCS 2015 Main Track (MT) Session 14

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: V101

Chair: Lampros Mountrakis

405	A New Stochastic Cellular Automata Model for Traffic Flow Simulation with Driver's Behavior Prediction [abstract] Abstract: In this work we introduce a novel, flexible and robust traffic flow cellular automata model. Our proposal includes two important stages that make it possible to consider different profiles of drivers' behaviors. We first consider the motion expectation of cars that are in front of each driver. Secondly, we define how a specific car decides to get around, considering the foreground traffic configuration. Our model uses stochastic rules for both situations, fitting the probability density function of the Beta distribution to three driver behaviors, with different Beta distribution parameters for each one.	Marcelo Zamith, Leal-Toledo Regina, Esteban Clua, Elson Toledo and Guilherme Magalhães
557	A Model Driven Approach to Water Resource Analysis based on Formal Methods and Model Transformation [abstract] Abstract: Several frameworks have been proposed in the literature to cope with critical infrastructure modelling issues, and almost all rely on simulation techniques. However, simulation alone is not enough for critical systems, where any problem may lead to a considerable loss of money and even of human lives. Formal methods are widely used to perform exhaustive analyses of these systems, but their complexity grows with system dimension and heterogeneity. In addition, experts in application domains may not be familiar with formal modelling techniques. A way to manage the complexity of analysis is the use of Model Based Transformation techniques: analysts can express their models in the way they are used to, and automatic algorithms translate the original models into analysable ones, reducing analysis complexity in a completely transparent way. In this work we describe an automatic transformation algorithm generating hybrid automata for the analysis of a natural water supply system. We use a real system located in the south of Italy as a case study.	Francesco Moscato, Flora Amato, Francesco De Paola, Crescenzo Diomaiuta, Nicola Mazzocca, Maurizio Giugni
175	An Invariant Framework for Conducting Reproducible Computational Science [abstract] Abstract: Computational reproducibility depends on being able to isolate necessary and sufficient computational artifacts and preserve them for later re-execution. Both isolation and preservation of artifacts can be challenging due to the complexity of existing software and systems and the resulting implicit dependencies, resource distribution, and shifting compatibility of systems as time progresses, all conspiring to break the reproducibility of an application. Sandboxing is a technique that has been used extensively in OS environments for isolation of computational artifacts. Several tools were proposed recently that employ sandboxing as a mechanism to ensure reproducibility. However, none of these tools preserve the sandboxed application for re-distribution to a larger scientific community, aspects that are equally crucial for ensuring reproducibility as sandboxing itself. In this paper, we describe a combined sandboxing and preservation framework, which is efficient, invariant and practical for large-scale reproducibility. We present case studies of complex high energy physics applications and show how the framework can be useful for sandboxing, preserving and distributing applications. We report on the completeness, performance, and efficiency of the framework, and suggest possible standardization approaches.	Haiyan Meng, Rupa Kommineni, Quan Pham, Robert Gardner, Tanu Malik and Douglas Thain
264	Very fast interactive visualization of large sets of high-dimensional data [abstract] Abstract: The embedding of high-dimensional data into 2D (or 3D) space is the most popular way of data visualization. Despite recent advances in developing very accurate dimensionality reduction algorithms, such as BH-SNE, Q-SNE and LoCH, their relatively high computational complexity remains an obstacle to interactive visualization of truly large sets of high-dimensional data. We show that a new clone of the multidimensional scaling method (MDS), nr-MDS, can be up to two orders of magnitude faster than the modern dimensionality reduction algorithms. We postulate its linear O(M) computational and memory complexity. At the same time, our method preserves in 2D and 3D target spaces a high separability of data, similar to that obtained by the state-of-the-art dimensionality reduction algorithms. We present the effects of applying nr-MDS to the visualization of data repositories such as 20 Newsgroups (M=18000), MNIST (M=70000) and REUTERS (M=267000).	Witold Dzwinel, Rafał Wcisło
315	Automated Requirements Extraction for Scientific Software [abstract] Abstract: Requirements engineering is crucial for software projects, but formal requirements engineering is often ignored in scientific software projects. Scientists do not often see the benefit of directing their time and effort towards documenting requirements. Additionally, there is a lack of requirements engineering knowledge amongst scientists who develop software. We aim at helping scientists to easily recover and reuse requirements without acquiring prior requirements engineering knowledge. We apply an automated approach to extract requirements for scientific software from available knowledge sources, such as user manuals and project reports. The approach employs natural language processing techniques to match defined patterns in input text. We have evaluated the approach in three different scientific domains, namely seismology, building performance and computational fluid dynamics. The evaluation results show that 78–97% of the extracted requirement candidates are correctly extracted as early requirements.	Yang Li, Emitzá Guzmán Ortega, Konstantina Tsiamoura, Florian Schneider, Bernd Bruegge

ICCS 2015 Main Track (MT) Session 15

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: V101

Chair: Dirk De Vos

387	Interactive 180° Rear Projection Public Relations [abstract] Abstract: In the globalized world, good products may not be enough to reach potential clients if creative marketing strategies are not well delineated. Public relations are also important when it comes to capturing clients' attention, making the first contact between them and the company's products while being persuasive enough to gain the client's confidence that the company has the right products to fit their needs. A virtual public relations agent is proposed, combining technology and a human-like public relations agent capable of interacting, by means of gestures and sound, with potential clients placed within 180 degrees in front of the installation. Four Microsoft Kinects were used to develop the 180-degree model for interaction, which allows recognition of gestures, sound sources and words, extraction of the face and body of the user, and tracking of users' positions (including a heat map).	Ricardo Alves, Aldric Négrier, Luís Sousa, J.M.F Rodrigues, Paulo Felizberto, Miguel Gomes, Paulo Bica
11	Identification of DNA Motif with Mutation [abstract] Abstract: The conventional way of identifying possible motif sequences in a DNA strand is to use a representative scalar weight matrix to search for good match substring alignments. However, this approach, solely based on match alignment information, is susceptible to a high number of ambiguous sites or false positives if the motif sequences are not well conserved. A significant amount of time is then required to verify these sites for the suggested motifs. Hence, in this paper, the use of mismatch alignment information in addition to match alignment information for DNA motif searching is proposed. The objective is to reduce the number of ambiguous false positives encountered in DNA motif searching, thereby making the process more efficient for biologists to use.	Jian-Jun Shu
231	A software tool for the automatic quantification of the left ventricle myocardium hyper-trabeculation degree [abstract] Abstract: Isolated left ventricular non-compaction (LVNC) is a myocardial disorder characterised by prominent ventricular trabeculations and deep recesses extending from the LV cavity to the subendocardial surface of the LV. Up to now, there is no common and stable solution in the medical community for quantifying and valuing the non-compacted cardiomyopathy. A software tool for the automatic quantification of the exact hyper-trabeculation degree in the left ventricle myocardium is designed, developed and tested. This tool is based on medical experience, but the possibility of the human appreciation error has been eliminated. The input data for this software are the cardiac images of the patients obtained by means of magnetic resonance. The output results are the percentage quantification of the trabecular zone with respect to the compacted area. This output is compared with human processing performed by medical specialists. The software proves to be a valuable tool to help diagnosis, so saving valuable diagnosis time.	Gregorio Bernabe, Javier Cuenca, Pedro E. López de Teruel, Domingo Gimenez, Josefa González-Carrillo
453	Blending Sentence Optimization Weights of Unsupervised Approaches for Extractive Speech Summarization [abstract] Abstract: This paper evaluates the performance of two unsupervised approaches, Maximum Marginal Relevance (MMR) and a concept-based global optimization framework, for speech summarization. Automatic summarization is a very useful technique that can help users browse a large amount of data. This study focuses on automatic extractive summarization of a multi-dialogue speech corpus. We propose improved methods that blend the unsupervised approaches at the sentence level. Sentence-level information is leveraged to improve the linguistic quality of the selected summaries. First, these scores are used to filter sentences for concept extraction and concept weight computation. Second, we pre-select a subset of candidate summary sentences according to their sentence weights. Last, we extend the optimization function to a joint optimization of concept and sentence weights to cover both important concepts and sentences. Our experimental results show that these methods can improve the system performance compared to the concept-based optimization baseline for both human transcripts and ASR output. The best scores are achieved by combining all three approaches, and they are significantly better than those of the baseline system.	Noraini Seman, Nursuriati Jamil
513	The CardioRisk Project: Improvement of Cardiovascular Risk Assessment [abstract] Abstract: The CardioRisk project addresses coronary artery disease (CAD), namely the management of myocardial infarction (MI) patients. The main goal is the development of personalized clinical models for cardiovascular (CV) risk assessment of acute events (e.g. death and new hospitalization), in order to stratify patients according to their care needs. This paper presents an overview of the scientific and technological issues that are under research and development. Three major scientific challenges can be identified: i) the development of fusion approaches to merge CV risk assessment tools; ii) strategies for the grouping (clustering) of patients; iii) biosignal processing techniques to achieve personalized diagnosis. At the end of the project, a set of algorithms/models must properly address these three challenges. Additionally, a clinical platform was implemented, integrating the developed models and algorithms. This platform supports a clinical observational study (100 patients) that is being carried out in Leiria Hospital Centre to validate the developed approach. Inputs from the hospital information system (demographics, biomarkers, clinical exams) are considered, as well as an ECG signal acquired with a Holter device. A real patient dataset provided by Santa Cruz Hospital, Portugal, comprising N=460 ACS-NSTEMI patients is also used to perform initial validations (individual algorithms). The CardioRisk team is composed of two research institutions, the University of Coimbra (Portugal) and Politecnico di Milano (Italy), and Leiria Hospital Centre (a Portuguese public hospital).	Simão Paredes, Teresa Rocha, Paulo de Carvalho, Jorge Henriques, Diana Mendes, Ricardo Cabete, Ramona Cabiddu, Anna Maria Bianchi and João Morais
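
Illustration for contribution 453: the abstract names Maximum Marginal Relevance (MMR) as one of its two unsupervised approaches. The sketch below shows the generic greedy MMR selection only, not the authors' blended system. The Jaccard word-overlap similarity, the function names and the weight `lam` are assumptions chosen for illustration.

```python
def jaccard(a, b):
    """Word-overlap similarity between two token sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_select(sentences, query, k, lam=0.7):
    """Greedily pick k sentence indices, trading relevance against redundancy.

    Each step picks the candidate maximising
    lam * sim(sentence, query) - (1 - lam) * max sim(sentence, already selected).
    """
    tokens = [set(s.lower().split()) for s in sentences]
    q = set(query.lower().split())
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        relevance = jaccard(tokens[i], q)
        redundancy = max((jaccard(tokens[i], tokens[j]) for j in selected), default=0.0)
        return lam * relevance - (1.0 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    utterances = [
        "the budget meeting is on monday",
        "the budget meeting is on monday morning",
        "lunch will be provided",
    ]
    print(mmr_select(utterances, "budget meeting", k=2, lam=0.5))
```

With `lam=0.5` the near-duplicate second utterance is penalised for redundancy, so the selection skips it; with `lam=1.0` the redundancy term vanishes and the ranking is by relevance alone.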

ICCS 2015 Main Track (MT) Session 16

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: V101

Chair: Jian-Jun Shu

563	Parallel metaheuristics in computational biology: an asynchronous cooperative enhanced Scatter Search method [abstract] Abstract: Metaheuristics are gaining increased attention as efficient solvers for hard global optimization problems arising in bioinformatics and computational systems biology. Scatter Search (SS) is one of the recent outstanding algorithms in that class. However, its application to very hard problems, like those considering parameter estimation in dynamic models of systems biology, still results in excessive computation times. In order to reduce the computational cost of the SS and improve its success, several research efforts have been made to propose different variants of the algorithm, including parallel approaches. This work presents an asynchronous Cooperative enhanced Scatter Search (aCeSS) based on the parallel execution of different enhanced Scatter Search threads and the cooperation between them. The main features of the proposed solution are: low overhead in the cooperation step, by means of an asynchronous protocol to exchange information between processes; a more effective cooperation step, since the exchange of information is driven by the quality of the solution obtained in each process rather than by elapsed time; optimal use of available resources, thanks to a completely distributed approach that avoids idle processes at any moment. Several challenging parameter estimation problems from the domain of computational systems biology are used to assess the efficiency of the proposal and evaluate its scalability in a parallel environment.	David R Penas, Patricia Gonzalez, Jose A. Egea, Julio R. Banga, Ramon Doallo
716	Simulating leaf growth dynamics through Metropolis-Monte Carlo based energy minimization [abstract] Abstract: Throughout their life span plants maintain the ability to generate new organs such as leaves. This is normally done in an orderly way by activating limited groups of dormant cells to divide and grow. It is currently not understood how that process is precisely regulated. We have used the VirtualLeaf framework for plant organ growth modelling to simulate the typical developmental stages of leaves of the model plant Arabidopsis thaliana. For that purpose the Hamiltonian central to the Monte-Carlo based mechanical equilibration of VirtualLeaf was modified. A basic two-dimensional model was defined starting from a rectangular grid with a dynamic phytohormone gradient that spatially instructs the cells in the growing leaf. Our results demonstrate that such a mechanism can indeed reproduce various spatio-temporal characteristics of leaf development and provides clues for further model development.	Dirk De Vos, Emil De Borger, Jan Broeckhove and Gerrit Ts Beemster
118	Clustering Acoustic Events in Environmental Recordings for Species Richness Surveys [abstract] Abstract: Environmental acoustic recordings can be used to perform avian species richness surveys, whereby a trained ornithologist can observe the species present by listening to the recording. This could be made more efficient by using computational methods for iteratively selecting the richest parts of a long recording for the human observer to listen to, a process known as “smart sampling”. This allows scaling up to much larger ecological datasets. In this paper we explore computational approaches based on information and diversity of selected samples. We propose to use an event detection algorithm to estimate the amount of information present in each sample. We further propose to cluster the detected events for a better estimate of this amount of information. Additionally, we present a time dispersal approach to estimating diversity between iteratively selected samples. Combinations of approaches were evaluated on seven one-day recordings that have been manually annotated by bird watchers. The results show that on average all the methods we have explored would allow annotators to observe more new species in fewer minutes compared to a baseline of random sampling at dawn.	Philip Eichinski, Laurianne Sitbon, Paul Roe
337	On the Effectiveness of Crowd Sourcing Avian Nesting Video Analysis at Wildlife@Home [abstract] Abstract: Wildlife@Home is a citizen science project developed to provide wildlife biologists with a way to swiftly analyze the massive quantities of data that they can amass during video surveillance studies. The project has been active for two years, with over 200 volunteers who have participated in providing observations through a web interface where they can stream video and report the occurrences of various events within that video. Wildlife@Home is currently analyzing avian nesting video from three species: Sharp-tailed Grouse (Tympanuchus phasianellus), an indicator species which plays a role in determining the effect of North Dakota's oil development on the local wildlife; Interior Least Tern (Sternula antillarum), a federally listed endangered species; and Piping Plover (Charadrius melodus), a federally listed threatened species. Video comes from 105 grouse, 61 plover and 37 tern nests from multiple nesting seasons, and consists of over 85,000 hours (13 terabytes) of 24/7 uncontrolled outdoor surveillance video. This work describes the infrastructure supporting this citizen science project, and examines the effectiveness of two different interfaces for crowd sourcing: a simpler interface where users watch short clips of video and report if an event occurred within that video, and a more involved interface where volunteers can watch entire videos and provide detailed event information including beginning and ending times for events. User observations are compared against expert observations made by wildlife biology research assistants, and are shown to be quite effective given strategies used in the project to promote accuracy and correctness.	Travis Desell, Kyle Goehner, Alicia Andes, Rebecca Eckroad, Susan Felege
594	Prediction of scaling resistance of concrete modified with high-calcium fly ash using classification methods [abstract] Abstract: The goal of the study was to apply machine learning methods to create rules for prediction of the surface scaling resistance of concrete modified with high-calcium fly ash. To determine the scaling durability, the Borås method, according to the European Standard procedure (PKN-CEN/TS 12390-9:2007), was used. The results of numeral experiments were utilized as a training set to generate rules indicating the relation between material composition and the scaling resistance. The classifier generated by the BFT algorithm from the WEKA workbench can be used as a tool for adequate classification of plain concretes and concretes modified with high-calcium fly ash as materials resistant or not resistant to surface scaling.	Michal Marks, Maria Marks
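
Illustration for contribution 716: the abstract refers to Metropolis-Monte Carlo based energy minimization. The sketch below shows the generic Metropolis acceptance rule applied to a toy one-dimensional energy. It does not reproduce the VirtualLeaf Hamiltonian or its cell mechanics; the function names, the quadratic energy and the temperature value are assumptions for illustration.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random.random):
    """Accept downhill moves always, uphill moves with probability exp(-dE/T)."""
    if delta_e <= 0.0:
        return True
    return rng() < math.exp(-delta_e / temperature)


def minimise(energy, propose, state, temperature, steps, seed=0):
    """Run a fixed number of Metropolis steps and return the final state and energy."""
    rng = random.Random(seed)
    e = energy(state)
    for _ in range(steps):
        trial = propose(state, rng)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e, temperature, rng.random):
            state, e = trial, e_trial
    return state, e


if __name__ == "__main__":
    # Toy problem (assumed): relax a single coordinate towards the minimum of x^2.
    final_x, final_e = minimise(
        energy=lambda x: x * x,
        propose=lambda x, rng: x + rng.uniform(-0.5, 0.5),
        state=5.0,
        temperature=0.01,
        steps=2000,
    )
    print(final_x, final_e)
```

A low temperature makes uphill moves rare, so the walk behaves almost like a greedy descent; raising it lets the system escape shallow local minima at the cost of noisier equilibria.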

ICCS 2015 Main Track (MT) Session 17

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: V206

Chair: Ilya Valuev

59	Swarming collapse under limited information flow between individuals [abstract] Abstract: Information exchange is critical to the execution and effectiveness of natural and artificial collective behaviors: fish schooling, birds flocking, amoebae aggregating or robots swarming. In particular, the emergence of dynamic collective responses in swarms confronted with complex environments underscores the central role played by social transmission of information. Here, the different possible origins of information flow bottlenecks are identified, and the associated effects on dynamic collective behaviors are revealed using a combination of network-, control- and information-theoretic elements applied to a group of interacting self-propelled particles (SPPs). Specifically, we consider a minimalistic agent-based model consisting of N topologically interacting SPPs moving at constant speed through a domain having periodic boundaries. Each individual agent is characterized by its direction of travel, and a canonical swarming behavior of the consensus type is examined. To account for the finiteness of the bandwidth, we consider synchronous information exchanges occurring every T = 1/(2B), where the unit interval T is the minimum time interval between condition changes of the data transmission signal. The agents move synchronously at discrete time steps T by a fixed distance upon receiving informational signals from their neighbors as per a linear update rule involving. We find a sufficient condition on the agents’ bandwidth B that guarantees the effectiveness of swarming while also highlighting the profound connection with the topology of the underlying interaction network. We also show that when decreasing B, the swarming behavior invariably vanishes following a second-order phase transition irrespective of the intrinsic noise level.	Roland Bouffanais
63	Multiscale simulation of organic electronics via massive nesting of density functional theory computational kernels [abstract] Abstract: Modelling is essential for the development of organic electronics, such as organic light emitting diodes (OLEDs), organic field-effect transistors (OFETs) and organic photovoltaics (OPV). OLEDs currently have the most applications, as they are already used in super-thin energy-efficient displays for television sets and smartphones, and in future will be used for lighting applications exploiting a world market worth tens of billions of euros. OLEDs should be further developed to increase their performance and durability, and to reduce the currently high production costs. The conventional development process is very costly and time-demanding due to the large number of possible materials which have to be synthesized for the production and characterization of prototypes. A deeper understanding of the relationship between OLED device properties and materials structure allows for high-throughput materials screening and thus a tremendous reduction of development costs. In simulations, the properties of various materials can be virtually and cost-effectively explored and compared to measurements. Based on these results, material composition, morphology and manufacturing processes can be systematically optimized. A typical OLED consists of a stack of multiple crystalline or amorphous organic layers. To compute electronic transport properties, e.g. charge mobilities, a quantum mechanical model, in particular density functional theory (DFT), is commonly employed. Recently, we performed simulations of electronic processes in OLED materials by multiscale modelling, i.e. by integrating sub-models on different length scales to investigate charge transport in thin films based on experimentally characterized semiconducting small molecules [1]. 
Here, we present a novel scale-out computational strategy for a tightly coupled multiscale model consisting of a core region with 500 molecules (5000 pairs) of charge hopping sites and an embedding region containing about 10000 electrostatically interacting molecules. The energy levels of each site depend on the local electrostatic environment, yielding a significant contribution to the energy disorder. This effect is explicitly taken into account in the quantum mechanical sub-model in a self-consistent manner, which, however, represents a considerable computational challenge. Thus the total number of DFT calculations needed is of the order of 10^5-10^6. DFT models scale mostly as N^3, where N is the number of basis functions, which is strongly related to the number of electrons. While DFT is implemented in a number of efficiently parallelized electronic structure codes, the computational scaling of a single DFT calculation applied to amorphous organic materials is naturally limited by the molecule size. After every iteration cycle, data are exchanged between all molecules contained in the self-consistency loop to update the electrostatic environment of each site. This requires that the DFT sub-model is executed employing a second-level parallelisation with a special scheduling strategy. The realisation of this model on high performance computer (HPC) systems raises several issues: i) The DFT sub-models, which are stand-alone applications (such as NWChem or TURBOMOLE), have to be spawned at run time via process forking; ii) Large amounts of input and output data have to be transferred to and from the DFT sub-models through the cluster file system. These two requirements limit the computational performance and often conflict with the usage policies of common HPC environments. In addition, sub-model scheduling and DFT data pre-/post-processing have a severe impact on the overall performance. 
To this end, we designed a DFT application programming interface (API) with different language bindings, such as Python and C++, allowing DFT sub-models to be linked to multiscale models independently of the concrete DFT implementation. In addition, we propose solutions for in-core handling of large input and output data as well as efficient scheduling algorithms. In this contribution, we will describe the architecture and outline the technical implementation of a framework for nesting DFT sub-models. We will demonstrate the use and analyse the performance of the framework for multiscale modelling of OLED materials. The framework provides an API which can be used to integrate DFT sub-models in other applications. [1] P. Friederich, F. Symalla, V. Meded, T. Neumann and W. Wenzel, “Ab Initio Treatment of Disorder Effects in Amorphous Organic Materials: Toward Parameter Free Materials Simulation”, Journal of Chemical Theory and Computation 10, 3720–3725 (2014).	Angela Poschlad, Pascal Friederich, Timo Strunk, Wolfgang Wenzel and Ivan Kondov
189	Optimization and Practical Use of Composition Based Approaches Towards Identification and Collection of Genomic Islands and Their Ontology in Prokaryotes [abstract] Abstract: Motivation: Horizontally transferred genomic islands (islands, GIs) have been referred to as important factors which contribute towards the emergence of pathogens and outbreak instances. The development of tools for the identification of such elements and for retracing their distribution patterns will help to understand how such cases arise. Sequence composition has been used to identify islands, infer their phylogeny, and determine their relative times of insertion. The collection and curation of known islands will enhance insight into island ontology and flow. Results: This paper introduces the merger of the SeqWord Genomic Islands Sniffer (SWGIS), which utilizes composition based approaches for identification of islands in bacterial genomic sequences, and the Predicted Genomic Islands (Pre_GI) database, which houses 26,744 islands found in 2,407 bacterial plasmids and chromosomes. SWGIS is a standalone program that detects genomic islands using a set of optimized parametric measures with estimates of acceptable false positive and false negative rates. Pre_GI is a novel repository that includes island ontology and flux. This study furthermore illustrates the need for parametric optimization in the prediction of islands to minimize false negative and false positive predictions. In addition, Pre_GI emphasizes the practicality of the compounded knowledge a database affords in the detection and visualization of ontological links between islands. Availability: SWGIS is freely available on the web at http://www.bi.up.ac.za/SeqWord/sniffer/index.html. Pre_GI is freely accessible at http://pregi.bi.up.ac.za/index.php.	Rian Pierneef, Oliver Bezuidt, Oleg Reva
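
Illustration for contribution 59: the abstract ties the update interval to the bandwidth through T = 1/(2B) and has agents move a fixed distance per step after a consensus-type heading update. The sketch below is one way to set this up. The abstract does not spell out the update rule, so a Vicsek-style average over the k nearest neighbours is substituted here as an assumption; noise is omitted, and all names and parameters are hypothetical.

```python
import math


def update_interval(bandwidth):
    """Exchange interval T = 1/(2B) for a channel of bandwidth B."""
    return 1.0 / (2.0 * bandwidth)


def periodic_delta(a, b, size):
    """Shortest signed displacement from a to b on a periodic axis of length size."""
    d = (b - a) % size
    return d - size if d > size / 2.0 else d


def consensus_step(positions, headings, k, speed, bandwidth, size):
    """One synchronous step: align with the k nearest neighbours, then move speed * T."""
    dt = update_interval(bandwidth)
    n = len(positions)
    new_headings = []
    for i in range(n):
        xi, yi = positions[i]

        def dist2(j):
            dx = periodic_delta(xi, positions[j][0], size)
            dy = periodic_delta(yi, positions[j][1], size)
            return dx * dx + dy * dy

        # Topological interaction: the k closest agents, whatever their distance.
        neighbours = sorted((j for j in range(n) if j != i), key=dist2)[:k]
        group = [i] + neighbours
        sx = sum(math.cos(headings[j]) for j in group)
        sy = sum(math.sin(headings[j]) for j in group)
        new_headings.append(math.atan2(sy, sx))
    new_positions = [
        ((x + speed * dt * math.cos(h)) % size, (y + speed * dt * math.sin(h)) % size)
        for (x, y), h in zip(positions, new_headings)
    ]
    return new_positions, new_headings


if __name__ == "__main__":
    pos = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
    head = [0.0, math.pi / 2, math.pi / 4]
    for _ in range(5):
        pos, head = consensus_step(pos, head, k=2, speed=1.0, bandwidth=2.0, size=10.0)
    print(head)
```

Lowering `bandwidth` lengthens the interval between exchanges, so each agent travels further on stale heading information before the next update.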

ICCS 2015 Main Track (MT) Session 18

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: V206

Chair: Roland Bouffanais

603	Path Optimization Using Nudged Elastic Band Method for Analysis of Seismic Travel Time [abstract] Abstract: A path optimization method is presented here for the analysis of travel times of seismic waves. The method is an adaptation of the nudged elastic band method to ray tracing, where the path corresponding to minimal travel time is determined. The method is based on a discrete representation of an initial path followed by iterative optimization of the discretization points so as to minimize the integrated time of propagation along the path. The gradient of the travel time with respect to the location of the discretization points is evaluated and used to find the optimal location of the points. An important aspect of the method is an estimation of the tangent to the path at each discretization point and elimination of the component of the gradient along the path during the iterative optimization. The distribution of discretization points along the path is controlled by spring forces acting only in the direction of the path tangent. The method is illustrated on two test problems, and its performance is compared with previously proposed and actively used methods in the field of seismic data inversion.	Igor Nosikov, Pavel Bessarab, Maksim Klimenko and Hannes Jonsson
642	Global optimization using saddle traversals (GOUST) [abstract] Abstract: The GOUST method relies on a fast way to identify first order saddle points on multidimensional objective function surfaces [1,2]. Given a local minimum, the method involves farming out several searches for first order saddle points, and then sliding down on the other side to discover new local minima. The system is then advanced to one of the newly discovered minima. In this way, local minima of the objective function are mapped out with a tendency to progress towards the global minimum. A practical application of this approach in global optimization of a few geothermal reservoir model parameters has recently been demonstrated [3]. The difficulty of an optimization problem, however, generally increases exponentially with the number of degrees of freedom and we investigate GOUST’s ability to search for the global minimum using a group of selected test functions. The performance of GOUST is tested as a function of the number of dimensions and compared with various other global optimization methods such as evolutionary algorithms and basin hopping. [1] 'A Dimer Method for Finding Saddle Points on High Dimensional Potential Surfaces Using Only First Derivatives', G. Henkelman and H. Jónsson, J. Chem. Phys. 111, 7010 (1999). [2] 'Comparison of methods for finding saddle points without knowledge of the final states', R. A. Olsen, G. J. Kroes, G. Henkelman, A. Arnaldsson and H. Jónsson, J. Chem. Phys. 121, 9776 (2004). [3] 'Geothermal model calibration using a global minimization algorithm based on finding saddle points as well as minima of the objective function', M. Plasencia, A. Pedersen, A. Arnaldsson, J-C. Berthet and H. Jónsson, Computers and Geosciences 65, 110 (2014).	Manuel Plasencia Gutierrez, Kusse Sukuta and Hannes Jónsson
655	Memory Efficient Finite Difference Time Domain Implementation for Large Meshes [abstract] Abstract: In this work we propose a memory-efficient (cache oblivious) implementation of the Finite Difference Time Domain algorithm. The implementation is based on a recursive space-time decomposition of the mesh update dependency graph into subtasks. The algorithm is suitable for processing large spatial meshes, since, unlike in the traditional layer-by-layer update, its efficiency (number of processed mesh cells per unit time) does not drop drastically with growing total mesh size. Additionally, our implementation allows for concurrent execution of subtasks of different sizes. Depending on the computer architecture, the scheduling may simultaneously encompass different parallelism levels such as vectorization, multithreading and MPI. Concurrent execution mechanisms are switched on (programmed) for subgraphs reaching some suitable size (rank) in the course of recursion. In this presentation we discuss the implementation and analyze the performance of the implemented FDTD algorithm for various computer architectures, including multicore systems and large clusters. We demonstrate FDTD update performance reaching up to 50% of the estimated CPU peak, which is 10-30 times higher than that of traditional FDTD solvers. We also demonstrate an almost perfect parallel scaling of the implemented solver. We discuss the effect of mesh memory layouts such as the Z-curve (Morton order), which increases locality of data, or interleaved layouts for vectorized updates.	Ilya Valuev and Andrey Zakirov
706	Coupled nuclear reactor simulation with the Virtual Environment for Reactor Applications (VERA) [abstract] Abstract: The Consortium for Advanced Simulation of Light Water Reactors (CASL) was established in July 2010 for the modeling and simulation of commercial nuclear reactors. Led by Oak Ridge National Laboratory (ORNL), CASL also includes three major universities, three industry partners, and three other U.S. National Laboratories. In order to deliver advanced simulation capabilities, CASL has developed and deployed the Virtual Environment for Reactor Applications (VERA), which integrates components for physical phenomena to enable high-fidelity analysis of conditions within nuclear reactors under a wide range of operating conditions. We report on the architecture of the system, why we refer to it as an Environment rather than a Toolkit or Framework, and numerical approaches to the coupled nonlinear simulations, and we show results produced on large HPC systems such as the 300,000-core NVIDIA GPU-accelerated Cray XK7 Titan system at Oak Ridge National Laboratory.	John Turner
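
Illustration for contribution 655: the abstract mentions the Z-curve (Morton order) as a mesh memory layout that increases data locality. The sketch below shows the standard two-dimensional bit-interleaving behind that ordering. It is a generic illustration, not the authors' FDTD code; the function names and the 16-bit index limit are assumptions.

```python
def part1by1(v):
    """Spread the low 16 bits of v so that a zero bit follows each original bit."""
    v &= 0x0000FFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v


def morton2d(ix, iy):
    """Interleave the bits of (ix, iy): x bits land on even positions, y bits on odd."""
    return part1by1(ix) | (part1by1(iy) << 1)


def z_order(nx, ny):
    """Return the cells of an nx-by-ny mesh sorted along the Z-curve."""
    cells = [(ix, iy) for iy in range(ny) for ix in range(nx)]
    return sorted(cells, key=lambda c: morton2d(c[0], c[1]))


if __name__ == "__main__":
    # The first four cells form a 2x2 block, which is the source of the locality.
    print(z_order(4, 4)[:4])
```

Cells that are close in the mesh tend to be close in the Morton index, so a block of the mesh occupies a compact range of memory rather than widely separated rows.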

Computational Optimisation in the Real World (CORW) Session 1

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: M110

Chair: Timoleon Kipouros

377	A Solution for a Real-time Stochastic Capacitated Vehicle Routing Problem with Time Windows [abstract] Abstract: Real-time distribution planning presents major difficulties when applied to large problems. Commonly, this planning is associated with the capacitated vehicle routing problem with time windows (CVRPTW), which has been studied extensively in the literature. In this paper we propose an optimization system developed to be integrated with an existing Enterprise Resource Planning (ERP) system without causing major disruption to the current distribution process of a company. The proposed system includes: a route optimization module, a module implementing the communications within and to the outside of the system, a non-relational database to provide local storage of information relevant to the optimization procedure, and a cartographic subsystem. The proposed architecture is able to deal with the dynamic problems included in the specification of the project, namely the arrival of new orders while optimization is already under way, as well as the locking and closing of routes by the system administrator. A back-office graphical interface was also implemented and some results are presented.	Pedro Cardoso, Gabriela Schütz, Andriy Mazayev, Emanuel Ey, Tiago Corrêa
25	Fast Multi-Objective Optimisation of a Micro-Fluidic Device by using Graphics Accelerators [abstract] Abstract: The development of technology that uses widely available and inexpensive hardware for real-world cases is presented in this work. This is part of a long-term approach to minimise the impact of aviation on the environment and aims to enable users from both industrial and academic backgrounds to design more optimal mixing devices. Here, a Multi-Objective Tabu Search is combined with a flow solver based on the Lattice Boltzmann Method (LBM) so as to optimise and simulate the shape and the flow of a micro-reactor, respectively. Several geometrical arrangements of a micro-reactor are proposed so as to increase the mixing capability of the device while minimising the pressure losses and to investigate related flow features. The computational engineering design process is accelerated by harnessing the high computational power of Graphics Processing Units (GPUs). The ultimate aim is to effectively harvest and harness computing cycles while performing design optimisation studies that can deliver higher quality designs of improved performance within shorter time intervals.	Christos Tsotskas, Timoleon Kipouros, Mark Savill
294	Multi-objective Optimisation of Marine Propellers [abstract] Abstract: Real-world problems usually have multiple objectives. These objective functions are often in conflict, making them highly challenging in terms of determining optimal solutions and analysing the solutions obtained. In this work Multi-objective Particle Swarm Optimisation (MOPSO) is employed to optimise the shape of marine propellers for the first time. The two objectives identified are maximising efficiency and minimising cavitation. Several experiments are undertaken to observe and analyse the impacts of structural parameters (shape and number of blades) and operating conditions (RPM) on both objectives. The paper also investigates the negative effects of uncertainties in parameters and operating conditions on efficiency and cavitation. Firstly, the results show that MOPSO is able to find a very accurate and uniformly distributed approximation of the true Pareto optimal front. The analysis of the results also shows that a propeller with 5 or 6 blades operating between 180 and 190 RPM results in the best trade-offs for efficiency and cavitation. Secondly, the simulation results show the significant negative impacts of uncertainties on both objectives.	Seyedali Mirjalili, Andrew Lewis, Seyed Ali Mohammad Mirjalili
502	Distributing Fibre Boards: A Practical Application of Heterogeneous Fleet Vehicle Routing Problem with Time Windows [abstract] Abstract: The Heterogeneous Fleet Capacitated Vehicle Routing Problem with Time Windows and Three-Dimensional Loading Constraints (3L-HFCVRPTW) combines the aspects of 3D loading, heterogeneous transport with capacity constraints and time windows for deliveries. It is the first formulation that comprises all these aspects and takes its inspiration from a practical problem of distributing daily fibre board deliveries faced by our industry partner. Given the shape of the goods to transport, the delivery vehicles are customised and their loading constraints take a specialised form. This study introduces the problem and its constraints as well as a specialised procedure for loading the boards. The loading module can be called during or after the route optimisation. In this initial work, we apply simple local search procedures to the routing problem on two data sets obtained from our industry partner and subsequently employ the loading module to place the deliveries on the vehicles. Simulated Annealing outperforms Iterated Local Search, suggesting that the routing problem is multimodal, and operators that shift deliveries between routes appear most beneficial.	S Pace, A Turky, I. Moser, A Aleti
679	Performance Comparison of Evolutionary Algorithms for Airfoil Design [abstract] Abstract: Different evolutionary algorithms, by their very nature, will have different search trajectory characteristics. Understanding these, particularly for real-world problems, gives researchers and practitioners valuable insights into potential problem domains for the various algorithms, as well as an understanding of the potential for hybridisation. In this study, we examine three evolutionary techniques, namely, multi-objective particle swarm optimisation, extremal optimisation and tabu search. The test problem is to design optimal cross-sectional areas of airfoils that maximise lift and minimise drag. The comparison analyses actual parameter values, rather than just objective function values and computational costs. It reveals that the three algorithms favoured different extents of exploration on the different parameters.	Marcus Randall, Tim Rawlins, Andrew Lewis, Timos Kipouros

Computational Optimisation in the Real World (CORW) Session 2

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: M110

Chair: Andrew Lewis

235	Public service system design by radial formulation with dividing points [abstract] Abstract: In this paper, we introduce an approximate approach to public service system design making use of a universal IP-solver. The problem consists of minimizing the total discomfort of system users, which is usually proportional to the sum of demand-weighted distances between users and the nearest source of the provided service. The presented approach is based on a radial formulation. The disutility values are estimated by upper and lower bounds given by so-called dividing points. The deployment of dividing points influences the solution accuracy. The process of dividing point deployment is based on the idea that some disutility values can be considered relevant and are expected to appear in the optimal solution. Here, we study various approaches to relevance and their impact on accuracy and computational time.	Jaroslav Janacek, Marek Kvet
439	An Improved Cellular Automata Algorithm for Wildfire Spread [abstract] Abstract: Despite being computationally more efficient than vector-based approaches, the use of raster-based techniques for simulating wildfire spread has been limited by the distortions that affect the fire shapes. This work presents a Cellular Automata (CA) approach that is able to mitigate this problem with a redefinition of the spread velocity, where the equations generally used in vector-based approaches are modified by means of a number of correction factors. A numerical optimization approach is used to find the optimal values for the correction factors. The results are compared to the ones given by two well-known Cellular Automata simulators. According to this work, the proposed approach provides better results, in terms of accuracy, at a comparable computational cost.	Tiziano Ghisu, Bachisio Arca, Grazia Pellizzaro, Pierpaolo Duce
537	I-DCOP: Train Classification Based on an Iterative Process Using Distributed Constraint Optimization [abstract] Abstract: This paper presents an iterative process based on Distributed Constraint Optimization (I-DCOP) to solve train classification problems. The input of the I-DCOP is the train classification problem modelled as a DCOP, named Optimization Model for Train Classification (OMTC). The OMTC generates a feasible schedule for a train classification problem defined by the inbound trains, the total number of outbound trains and the cars assigned to them. The expected result, named feasible schedule, leads to the correct formation of the outbound trains, based on the order criteria defined. The OMTC minimizes the schedule execution time and the total number of roll-ins (an operation executed on cars, sometimes charged by the yards). I-DCOP extends the OMTC by including the constraints of a limited number of classification tracks and their capacity. However, these constraints are included iteratively by adding domain restrictions to the OMTC. Both OMTC and I-DCOP have been evaluated using scenarios based on real yard data. OMTC has generated optimal and feasible schedules for the scenarios, optimizing the total number of roll-ins. I-DCOP solved more complex scenarios, providing sub-optimal solutions. The experiments have shown that distributed constraint optimization problems can include additional constraints based on an interactively defined domain.	Denise Maria Vecino Sato, André Pinz Borges, Peter Márton, Edson E. Scalabrin
622	An Investigation of the Performance Limits of Small, Planar Antennas Using Optimisation [abstract] Abstract: This paper presents a generalised parametrisation as well as an approach to computational optimisation for small, planar antennas. A history of previous, more limited antenna optimisation techniques is discussed and a new parametrisation introduced in this context. Validation of this new approach against previously developed structures is provided and preliminary results of the optimisation are demonstrated and discussed. For the optimisation, a binary Multi-Objective Particle Swarm Optimisation (MOPSO) is used and several methods for generating a viable initial population are introduced and discussed in the context of the practical limitations of computational simulations.	Jan Hettenhausen, Andrew Lewis, David Thiel, Morteza Shahpari

Fourth International Workshop on Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 1

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M104

Chair: Henry Tufo

532	2D Adaptivity for 3D Problems: Parallel SPE10 Reservoir Simulation on Dynamically Adaptive Prism Grids [abstract] Abstract: We present an approach for parallel simulation of 2.5D applications on fully dynamically adaptive 2D triangle grids based on space-filling curve traversal. Often, subsurface, oceanic or atmospheric flow problems in geosciences have small vertical extent or anisotropic input data. Interesting solution features, such as shockwaves, emerge mostly in horizontal directions and require little vertical capturing. sam(oa)^2 is a 2D code with fully dynamically adaptive refinement, targeted especially at low-order discretizations due to its cache-oblivious and memory-efficient design. We added support for 2.5D grids by implementing vertical columns of degrees-of-freedom, allowing full horizontal refinement and load balancing but restricted control over vertical layers. Results are shown for the SPE10 benchmark, a particularly hard two-phase flow problem in reservoir simulation with a small vertical extent. SPE10 investigates oil exploration by water injection in heterogeneous porous media. Performance of sam(oa)^2 is memory-bound for this scenario with up to 70% throughput of the STREAM benchmark and a good parallel efficiency of 85% for strong scaling on 512 cores and 91% for weak scaling on 8192 cores.	Oliver Meister and Michael Bader
446	A Pipelining Implementation for High Resolution Seismic Hazard Maps Production [abstract] Abstract: Seismic hazard maps are a significant input into emergency hazard management that play an important role in saving human lives and reducing the economic effects after earthquakes. Despite the fact that a number of software tools have been developed (McGuire, 1976, 1978; Bender and Perkins, 1982, 1987; Ordaz et al., 2013; Robinson et al. 2005, 2006; Field et al., 2003), map resolution is generally low, potentially leading to uncertainty in calculations of ground motion level and underestimation of the seismic hazard in a region. In order to generate higher resolution maps, the biggest challenge is to handle the significantly increased data processing workload. In this study, a method for improving seismic hazard map resolution is presented that employs a pipelining implementation of the existing EqHaz program suite (Assatourians and Atkinson, 2013) based on IBM InfoSphere Streams – an advanced stream computing platform. Its architecture is specifically configured for continuous analysis of massive volumes of data at high speeds and low latency. Specifically, it treats processing workload as data streams. Processing procedures are implemented as operators that are connected to form processing pipelines. To handle large processing workload, these pipelines are flexible and scalable to be deployed and run in parallel on large-scale HPC clusters to meet application performance requirements. As a result, mean hazard calculations are possible for maps with resolution up to 2,500,000 points with near-real-time processing time of approximately 5-6 minutes.	Yelena Kropivnitskaya, Jinhui Qin, Kristy F. Tiampo, Michael A. Bauer
519	Scalable multicase urban earthquake simulation method for stochastic earthquake disaster estimation [abstract] Abstract: High-resolution urban earthquake simulations are expected to be useful for improving the reliability of the estimates of damage due to future earthquakes. However, current high-resolution simulation models involve uncertainties in their inputs. An alternative is to apply stochastic analyses using multicase simulations with varying inputs. In this study, we develop a method for simulating the responses of ground and buildings to many earthquakes. By a suitable mapping of computations among computation cores, the developed program attains 97.4% size-up scalability using 320,000 processes (40,000 nodes) on the K computer. This enables the computation of more than 1,000 earthquake scenarios for 0.25 million structures in central Tokyo.	Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori, Lalith Maddegedara, Seizo Tanaka
704	Multi-GPU implementations of parallel 3D sweeping algorithms with application to geological folding [abstract] Abstract: This paper studies some of the CUDA programming challenges in connection with using multiple GPUs to carry out plane-by-plane updates in parallel 3D sweeping algorithms. In particular, attention must be paid to masking the overhead of various data movements between the GPUs. Multiple OpenMP threads on the CPU side should be combined with multiple CUDA streams per GPU to hide the data transfer cost related to the halo computation on each 2D plane. Moreover, the technique of peer-to-peer memory access can be used to reduce the impact of 3D volumetric data shuffles that have to be done between mandatory changes of the grid partitioning. We have investigated the performance improvement of 2- and 4-GPU implementations that are applicable to 3D anisotropic front propagation computations related to geological folding. In comparison with a straightforward multi-GPU implementation, the overall performance improvement due to masking of data movements on four GPUs of the Fermi architecture was 23%. The corresponding improvement obtained on four Kepler GPUs was 50%.	Ezhilmathi Krishnasamy, Mohammed Sourouri, Xing Cai

Fourth International Workshop on Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 2

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M104

Chair: Xing Cai

244	Big Data on Ice: The Forward Observer System for In-Flight Synthetic Aperture Radar Processing [abstract] Abstract: We introduce the Forward Observer system, which is designed to provide data assurance in field data acquisition while receiving significant amounts (several terabytes per flight) of Synthetic Aperture Radar data during flights over the polar regions, which provide unique requirements for developing data collection and processing systems. Under polar conditions in the field and given the difficulty and expense of collecting data, data retention is absolutely critical. Our system provides a storage and analysis cluster with software that connects to field instruments via standard protocols, replicates data to multiple stores automatically as soon as it is written, and provides pre-processing of data so that initial visualizations are available immediately after collection, where they can provide feedback to researchers in the aircraft during the flight.	Richard Knepper, Matthew Standish, Matthew Link
690	Multi-Scale Coupling Simulation of Seismic Waves and Building Vibrations using ppOpen-HPC [abstract] Abstract: In order to simulate an earthquake shock originating from the earthquake source and the damage it causes to buildings, not only the seismic wave that propagates over a wide region of several 100 km², but also the building vibrations that occur over a small region of several 10 m² must be resolved concurrently. Such a multi-scale simulation is difficult because the modeling and implementation possible within a single application are limited. To overcome these problems, a multi-scale weak-coupling simulation of seismic waves and building vibrations using the "ppOpen-HPC" libraries is conducted. ppOpen-HPC, wherein "pp" stands for "post-peta scale", is an open source infrastructure for the development and execution of optimized and reliable simulation codes on large-scale parallel computers. On the basis of our evaluation, we confirm that an acceptable result can be achieved, that the overhead cost of the coupler is negligible, and that it can work on large-scale computational resources.	Masaharu Matsumoto, Takashi Arakawa, Takeshi Kitayama, Futoshi Mori, Hiroshi Okuda, Takashi Furumura, Kengo Nakajima
621	A hybrid SWAN version for fast and efficient practical wave modelling [abstract] Abstract: In the Netherlands, wave modelling with SWAN has become a main ingredient of coastal and inland water applications. However, computational times are relatively high. Therefore we investigated the parallel efficiency of the current MPI and OpenMP versions of SWAN. The MPI version is not as efficient as the OpenMP version within a single node. Therefore, in this paper we propose a hybrid version of SWAN. It combines the efficiency of the current OpenMP version on shared memory with the capability of the current MPI version to distribute memory over nodes. We describe the numerical algorithm. With initial numerical experiments we show the potential of this hybrid version. Parallel I/O, further optimization, and behavior for larger numbers of nodes will be the subject of future research.	Menno Genseberger, John Donners
573	Numerical verification criteria for coseismic and postseismic crustal deformation analysis with large-scale high-fidelity model [abstract] Abstract: Numerical verification of postseismic crustal deformation analysis, computed using a large-scale finite element simulation, was carried out, by proposing new criteria that consider the characteristics of the target phenomenon. Specifically, pointwise displacement was used in the verification. In addition, the accuracy of the numerical solution was explicitly shown by considering the observation error of the data used for validation. The computational resource required for each analysis implies that high-performance computing techniques are necessary to obtain a verified numerical solution of crustal deformation analysis for the Japanese Islands. Such verification in crustal deformation simulations should take on greater importance in the future, since continuous improvement in the quality and quantity of crustal deformation data is expected.	Ryoichiro Agata, Tsuyoshi Ichimura, Kazuro Hirahara, Mamoru Hyodo, Takane Hori, Chihiro Hashimoto, Muneo Hori

Workshop on Computational Chemistry and its Applications (CCA) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: V102

Chair: Jerry Bernholc

600	Calculations of molecules and solids using self-interaction corrected energy functionals and unitary optimization of complex orbitals [abstract] Abstract: The Perdew-Zunger self-interaction correction to DFT energy functionals can improve the accuracy of calculated results in many respects. The long range effective potential for the electrons then has the correct -1/r dependence so Rydberg excited states of molecules and clusters of molecules can be accurately treated [1,2]. Also, localized electronic states are brought down in energy so defects in semiconductors and insulators with defect states in the band gap can be characterized [3,4]. The calculations are, however, more challenging since the energy functional is no longer unitary invariant and each step in the self-consistent procedure needs to include an inner loop where unitary optimization is carried out [5,6]. As a result, the calculations produce a set of optimal orbitals which are generally localized and correspond well to chemical intuition. It has become evident that the optimal orbitals need to be complex-valued functions [7]. If they are restricted to real-valued functions, the energy of atoms and molecules is less accurate and the structure of molecules can even be incorrect [8]. [1] 'Self-interaction corrected density functional calculations of Rydberg states of molecular clusters: N,N-dimethylisopropylamine', H. Gudmundsdóttir, Y. Zhang, P. M. Weber and H. Jónsson, J. Chem. Phys. 141, 234308 (2014). [2] 'Self-interaction corrected density functional calculations of molecular Rydberg states', H. Gudmundsdóttir, Y. Zhang, P. M. Weber and H. Jónsson, J. Chem. Phys. 139, 194102 (2013). [3] 'Simulation of Surface Processes', H. Jónsson, Proceedings of the National Academy of Sciences 108, 944 (2011). [4] 'Solar hydrogen production with semiconductor metal oxides: New directions in experiment and theory', Á. Valdés et al., Phys. Chem. Chem. Phys. 14, 49 (2012). 
[5] 'Variational, self-consistent implementation of the Perdew–Zunger self-interaction correction with complex optimal orbitals', S. Lehtola and H. Jónsson, Journal of Chemical Theory and Computation 10, 5324 (2014). [6] 'Unitary Optimization of Localized Molecular Orbitals', S. Lehtola and H. Jónsson, Journal of Chemical Theory and Computation 9, 5365 (2013). [7] 'Importance of complex orbitals in calculating the self-interaction corrected ground state of atoms', S. Klüpfel, P. J. Klüpfel and H. Jónsson, Phys. Rev. A Rapid Communication 84, 050501 (2011). [8] 'The effect of the Perdew-Zunger self-interaction correction to density functionals on the energetics of small molecules', S. Klüpfel, P. Klüpfel and H. Jónsson, J. Chem. Phys. 137, 124102 (2012).	Hannes Jónsson
629	Towards An Optimal Gradient-Dependent Energy Functional of the PZ-SIC Form [abstract] Abstract: too high atomization energy (overbinding of the molecules), the application of PZ-SIC gives a large overcorrection and leads to significant underestimation of the atomization energy. The exchange enhancement factor that is optimal for the generalized gradient approximation within the Kohn-Sham (KS) approach may not be optimal for the self-interaction corrected functional. The PBEsol functional, where the exchange enhancement factor was optimized for solids, gives poor results for molecules in KS but turns out to work better than PBE in PZ-SIC calculations. The exchange enhancement is weaker in PBEsol and the functional is closer to the local density approximation. Furthermore, the drop in the exchange enhancement factor for increasing reduced gradient in the PW91 functional gives more accurate results than the plateaued enhancement in the PBE functional. A step towards an optimal exchange enhancement factor for a gradient dependent functional of the PZ-SIC form is taken by constructing an exchange enhancement factor that mimics PBEsol for small values of the reduced gradient, and PW91 for large values. The average atomization energy is then in closer agreement with the high-level quantum chemistry calculations, but the variance is still large, the F2 molecule being a notable outlier.	Elvar Örn Jónsson, Susi Lehtola, Hannes Jónsson
686	Correlating structure and function for nanoparticle catalysts [abstract] Abstract: Metal nanoparticles of only ~100-200 atoms are synthesized using a dendrimer encapsulation technique to facilitate a direct comparison with density functional theory (DFT) calculations in terms of both structure and catalytic function. Structural characterization is done using electron microscopy, x-ray scattering, and electrochemical methods. Combining these tools with DFT calculations is found to improve the quality of the structural models. DFT is also successfully used to predict trends between structure and composition of the nanoparticles and their catalytic function for reactions including the reduction of oxygen and the oxidation of formic acid. This investigation demonstrates some remarkable properties of the nanoparticles, including facile structural rearrangements and nanoscale tuning parameters which can be used to optimize catalytic rates.	Graeme Henkelman
199	The single-center multipole expansion (SCME) model for water: development and applications [abstract] Abstract: Despite many decades of force field developments, and the proliferation of efficient first principles molecular dynamics simulation techniques, a universal microscopic model for water in its various phases has not yet been achieved. In recent years, progress in force field development has shifted from optimizing in ever greater detail the parameters of simple pair-wise additive empirical potentials to developing more advanced models that explicitly include many-body interactions through induced polarization and short-range exchange-repulsion interactions. Such models are often parametrized to reproduce as closely as possible the Born-Oppenheimer surface from highly accurate quantum chemistry calculations; the best models often outperform DFT in accuracy, yet are orders of magnitude more computationally efficient. The SCME model was recently suggested as a physically rigorous and transparent model where the dominant electrostatic interaction is described through a single-center multipole expansion up to the hexadecapole moment, and where many-body effects are treated by induced dipole and quadrupole moments. In this paper, recent improvements of SCME are presented along with selected applications. Monomer flexibility is included via an accurate potential energy surface, a dipole moment surface is used to describe the geometric component of the dipole polarizability, and several formulations of the anisotropic short-range exchange-repulsion interaction are compared. The performance of this second version of the model, SCME2, is demonstrated by comparing to experimental results and high-level quantum chemistry calculations. 
Future perspectives for applications and developments of SCME2 are presented, including an outline for how the model can be adapted to describe mixed systems of water with other small molecules and how it can be used as a polarizable solvent in QM/MM simulations.	Kjartan Thor Wikfeldt and Hannes Jonsson
8	Quantum Topology of the Charge density of Chemical Bonds. QTAIM analysis of the C-Br and O-Br bonds. [abstract] Abstract: The present study aims to explore the quantum topological features of the electron density and its Laplacian for the understudied molecular bromine species involved in ozone depletion events. The characteristics of the C-Br and O-Br bonds have been analyzed via quantum theory of atoms in molecules (QTAIM) analysis using the wave functions computed at the B3LYP/aug-cc-pVTZ level of theory. Quantum topology analysis reveals that the C-Br and O-Br bonds show depletion of charge density, indicating the increased ionic character of these bonds. Contour plots and relief maps have been analyzed for regions of valence shell charge concentrations (VSCC) and depletions (VSCD) in the ground state.	Rifaat Hilal, Saadullah Aziz, Shabaan Elrouby, Abdulrahman Alyoubi

Workshop on Computational Chemistry and its Applications (CCA) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: V102

Chair: Hannes Jonsson

608	Computational study of electrochemical CO2 reduction at transition metal electrodes [abstract] Abstract: Density functional theory calculations were used to model the electrochemical reduction of CO2 on various transition metals, in particular Cu(111) and Pt(111) surfaces. The minimum energy paths for sequential protonation by either Tafel or Heyrovsky mechanism were calculated using the nudged elastic band method for applied potentials comparable to those used in experimental studies, ranging from -0.7 V to -1.7 V. A mechanism for CO2 reduction on Cu(111) has been identified where the highest activation energy is 0.4 eV. On Pt(111) a different mechanism is found to be optimal but it involves a higher barrier, 0.7 eV. Hydrogen production is then a competing reaction with activation barrier of only 0.3 eV, while on Cu(111) hydrogen production has a barrier of 0.6 eV. These results are consistent with experimental findings where copper electrodes are found to lead to relatively high yield of CH4 while H2 forms almost exclusively at platinum electrodes. A detailed understanding of the mechanism of electrochemical reduction of CO2 to hydrocarbons can help design improved catalysts for this important reaction.	Javed Hussain, Egill Skúlason, Hannes Jónsson
147	Petascale Calculations of Electronic Structure and Electron Transport [abstract] Abstract: We describe new developments and applications of the Real Space Multigrid (RMG) electronic structure family of codes. RMG uses real-space grids, a multigrid pre-conditioner, and subspace diagonalization to solve the Kohn-Sham equations. It is designed for use on massively parallel computers and has shown excellent scalability and performance, reaching 6.5 PFLOPS on 18k Cray compute nodes with 288k CPU cores and 18k GPUs. We discuss (i) New developments in parallel subspace diagonalization, which speed up the diagonalization part of the calculations by a factor of three or more; and (ii) Linear-scaling quantum transport methodology, which enables calculations for several thousand atoms. As examples, we consider: (iii) Molecular sensors based on carbon nanotubes, with configurations based both on direct attachment (physisorption and chemisorption) and indirect functionalization via covalent and non-covalent linkers; and (iv) Electron transport in DNA and the effects of base-pair matching, solvent and counterions. All of these dramatically affect the conductivity of DNA strands, which explains the wide range of results observed experimentally. If time permits, we will also discuss (v) fully quantum simulations of solvated biomolecules, in which Kohn-Sham (KS) DFT is used to describe the biomolecule and its first solvation shells, while the orbital-free (OF) DFT is employed for the rest of the solvent. The OF part is fully O(N) and capable of handling 10^5 solvent molecules on current parallel supercomputers, while taking only ~10% of the total time. RMG is now an open source code, running on Linux, Windows and Macintosh systems. The current release of the code may be downloaded at http://sourceforge.net/projects/rmgdft/. In collaboration with E. Briggs, Y. Li, B. Tan, M. Hodak, and W. Lu.	Jerry Bernholc
239	Dynamic Structural Disorder in Supported Pt Clusters Under Operando Conditions [abstract] Abstract: Supported nanoparticle catalysts are ubiquitous in heterogeneous catalytic processes, and there is broad interest in their physical and chemical properties. However, global probes such as XAS and XPS generally reveal their ensemble characteristics, obscuring details of their fluctuating internal structure. We have previously shown [1] that a combination of theoretical and experimental techniques is needed to understand the intra-particle heterogeneity of these systems [2], and their changes under operando conditions [3]. For example, ab initio DFT/MD simulations revealed that the nanoscale structure and charge distribution are inhomogeneous and dynamically fluctuating over several time-scales, ranging from fast (200-400 fs) bond vibrations to slow fluxional bond breaking (>10 ps). In particular the anomalous behavior of the mean-square relative displacement is not static, but rather is driven by stochastic motion of the center of mass over 1-4 ps time-scales. The resulting large scale fluctuations are termed “dynamic structural disorder” (DSD) [2]. Moreover, the nanoparticles tend to exhibit a semi-melted cluster surface, which for alloy clusters can be atomically-segregated. Recent studies of CO- and H-covered Pt nanoclusters on C and SiO2 supports show a variety of spectral and structural trends as a function of temperature. DFT simulations show that adsorption drives local electronic structure changes that are responsible for the opposite energy shifts vs temperature, of the absorption edge and off-resonant emission line. Moreover, desorption results in local bond contraction, thus explaining the negative thermal expansion observed in XAS experiments. For example, upon single CO adsorption, the Pt-Pt bonds formed by coordinated Pt atoms are locally expanded by ~5%, with little change in the rest of the particle. 
Coordination also has a large effect on the net charge of the Pt atoms (Figure 1), with a net loss of charge upon adsorption. Finally, we show how high coverage inverts the charging structure of the cluster, turning the negative surface (positive interior) of the clean cluster to positive surface (negative interior) in the fully covered case. Supported by DOE grant DE-FG02-03ER15476, with computer support from DOE-NERSC. [1] F. D. Vila, J. J. Rehr, J. Kas, R. G. Nuzzo and A. I. Frenkel, Phys. Rev. B 78, 121404(R) (2008). [2] J. J. Rehr and F. D. Vila, J. Chem. Phys. 140, 134701 (2014). [3] F. D. Vila, J. J. Rehr, S. D. Kelly and S. R. Bare, J. Phys. Chem. C 117, 12446 (2013).	John Rehr, Fernando Vila and Anatoly Frenkel
359	Long time scale simulations of amorphous ice [abstract] Abstract: Amorphous ice, or amorphous solid water (ASW), is the most common form of ice in astrophysical environments and is believed to be the dominant component of comets, planetary rings and dust grains in interstellar molecular clouds. The surface of ASW catalyzes chemical reactions in interstellar space ranging from H2 to complex organic molecules, and a deeper understanding of ASW is thus crucial for better models of chemical evolution in the universe. ASW is disordered and metastable with respect to crystalline hexagonal ice, and forms when water molecules are deposited on surfaces at temperatures below 140 K. However, the structure, morphology and formation mechanisms of ASW are poorly understood and have received much attention across many disciplines. Indeed, the structure and morphology of ASW depend sensitively on how it forms, where key parameters are temperature, deposition rate and deposition angle. While atomistic simulations of ASW are extremely challenging due to slow kinetics and the long timescales involved, state-of-the-art long timescale methods provide a possible means to study the atomistic mechanisms involved on relevant timescales. Here we will discuss atomistic simulations of the growth and long timescale evolution of ASW through the use of the adaptive kinetic Monte Carlo (AKMC) technique coupled to different interaction potentials for water molecules. The influence of temperature and deposition parameters is studied in detail and compared to available experimental results. Our results elucidate the structure and formation mechanisms of ASW under astrophysical environments and provide realistic structure models that can be used in further studies of chemical reactivity of ASW surfaces.	Ramya Kormath Madam

Workshop on Computational Chemistry and its Applications (CCA) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: V102

Chair: John Rehr

404	Modelling Molecular Crystals by QM/MM [abstract] Abstract: Computational modelling of chemical systems is most easily carried out in the vacuum for single molecules. Accounting for environmental effects accurately in quantum chemical calculations, however, is often necessary for computational predictions of chemical systems to have any relevance to experiments carried out in the condensed phases. I will discuss a quantum mechanics/molecular mechanics (QM/MM) based method to account for solid-state effects on geometries and molecular properties in molecular crystals. The method in its recent black-box implementation in Chemshell can satisfactorily describe the crystal packing effects on local geometries in molecular crystals and account for the electrostatic effects that affect certain molecular properties such as transition metal NMR chemical shifts, electric field gradients, Mössbauer and other spectroscopic properties.	Ragnar Bjornsson
437	A Quaternion Method for Removing Translational and Rotational Degrees of Freedom from Transition State Search Methods [abstract] Abstract: In finite systems, such as nanoparticles and gas-phase molecules, calculations of minimum energy paths connecting initial and final states of transitions as well as searches for saddle points are complicated by the presence of external degrees of freedom, such as overall translation and rotation. A method based on quaternion algebra for removing the external degrees of freedom is presented and applied in calculations using two commonly used methods: the nudged elastic band (NEB) method for finding minimum energy paths and DIMER for minimum-mode following to find transition states. With the quaternion approach, fewer images in the NEB are needed to represent MEPs accurately. In both the NEB and DIMER calculations, the number of iterations required to reach convergence is significantly reduced.	Marko Melander
438	Drag Assisted Simulated Annealing Method for Geometry Optimization of Molecules [abstract] Abstract: One of the methods to find the global minimum of a potential energy surface of a molecular system is simulated annealing. The main idea of simulated annealing is to start the system at a high temperature and then slowly cool it down so that there is a chance for the atoms in the system to explore the different degrees of freedom and ultimately find the global minimum. Simulated annealing is traditionally used in classical Monte Carlo or in classical molecular dynamics. In molecular dynamics, one of the traditional methods was first implemented by Woodcock in 1971. In this method the velocities are scaled down after a given number of molecular dynamics steps, the system is left to explore the potential energy surface, and the velocities are scaled down again until a minimum is found. In this work we propose to use a viscous friction term, similar to the one used in Langevin dynamics, to slowly bring down the temperature of the system in a natural way. We use drag terms that depend linearly or quadratically on the velocity of the particles. These drag terms naturally bring the temperature of the system down and vanish when the system reaches equilibrium, thus imposing a natural criterion to stop the simulation. We tested the method in Lennard-Jones clusters of up to 20 atoms. 
We started the system in different initial conditions and used different values for the temperature and the drag coefficients and found the global minima of every one of the clusters. The method proved to be conceptually very simple, yet very robust, in finding the global minima.	Bilguun Woods, Paulo Acioli
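The drag idea in abstract 438 can be illustrated with a minimal sketch. This is not the authors' code: it assumes reduced Lennard-Jones units (epsilon = sigma = mass = 1), a linear drag term only, a simple semi-implicit Euler integrator, and a three-atom cluster started from rest in a distorted geometry. The names `lj_energy_forces` and `drag_anneal` and all parameter values are hypothetical choices for illustration.

```python
import math


def lj_energy_forces(pos):
    """Lennard-Jones energy and forces in reduced units (epsilon = sigma = 1)."""
    n = len(pos)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [pos[i][k] - pos[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            inv6 = 1.0 / r2 ** 3
            energy += 4.0 * (inv6 * inv6 - inv6)
            # Force on atom i is coef * d, with coef = -(dU/dr) / r.
            coef = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2
            for k in range(3):
                forces[i][k] += coef * d[k]
                forces[j][k] -= coef * d[k]
    return energy, forces


def drag_anneal(pos, vel=None, gamma=1.0, dt=0.005, tol=1e-4, max_steps=200000):
    """Integrate dv/dt = F - gamma * v until motion and forces both die out."""
    pos = [list(p) for p in pos]
    vel = [list(v) for v in vel] if vel is not None else [[0.0] * 3 for _ in pos]
    for _ in range(max_steps):
        _, forces = lj_energy_forces(pos)
        kinetic = 0.5 * sum(c * c for v in vel for c in v)
        fmax = max(abs(c) for f in forces for c in f)
        # Natural stopping criterion: the drag has removed the kinetic
        # energy and the cluster sits at a stationary point.
        if kinetic < tol * tol and fmax < tol:
            break
        for p, v, f in zip(pos, vel, forces):
            for k in range(3):
                v[k] += dt * (f[k] - gamma * v[k])  # linear drag term
                p[k] += dt * v[k]
    return pos, lj_energy_forces(pos)[0]


# Distorted trimer; it should relax to the equilateral triangle (E = -3).
start = [[0.0, 0.0, 0.0], [1.3, 0.0, 0.0], [0.5, 1.2, 0.1]]
final_pos, final_energy = drag_anneal(start)
print(round(final_energy, 4))
```

For larger clusters a single damped trajectory generally finds only a nearby local minimum, so in practice one would start from many hot initial conditions (via the `vel` argument) and vary `gamma`, as the abstract describes; a quadratic drag would replace `gamma * v[k]` with a term proportional to the speed times `v[k]`.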
597	Modeling electrochemical reactions at the solid-liquid interface using density functional calculations [abstract] Abstract: Charged interfaces are physical phenomena found in various natural systems and artificial devices within the fields of biology, chemistry and physics. In electrochemistry, this is known as the electrochemical double layer, introduced by Helmholtz over 150 years ago. At this interface, between a solid surface and the electrolyte, chemical reactions can take place in a strong electric field. In this presentation, a new computational method is introduced for creating charged interfaces and to study charge transfer reactions on the basis of periodic DFT calculations. The electrochemical double layer is taken as an example, in particular the hydrogen electrode as well as the O2, N2 and CO2 reductions. With this method the mechanism of forming hydrogen gas, water, ammonia and methane/methanol is studied. The method is quite general and could be applied to a wide variety of atomic scale transitions at charged interfaces.	Egill Skúlason
601	Transition Metal Nitride Catalysts for Electrochemical Reduction of Nitrogen to Ammonia at Ambient Conditions [abstract] Abstract: Computational screening for catalysts that are stable, active and selective towards electrochemical reduction of nitrogen to ammonia at room temperature and ambient pressure is presented for a range of transition metal nitride surfaces. Density functional theory (DFT) calculations are used to study the thermochemistry of the cathode reaction so as to construct the free energy profile and to predict the required onset potential via the Mars-van Krevelen mechanism. Stability of the surface vacancy as well as the poisoning possibility of these catalysts under operating conditions are also investigated towards catalyst engineering for sustainable ammonia formation. The most promising candidates turned out to be the (100) facets of the rocksalt structures of VN, CrN, NbN and ZrN, which should be able to form ammonia at -0.51 V, -0.76 V, -0.65 V and -0.76 V vs. SHE, respectively. Another interesting result of the current work is that for the introduced nitride candidates hydrogen evolution is no longer the competing reaction; thus, high formation yield of ammonia is expected at low onset potentials.	Younes Abghoui, Egill Skúlason

Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M208

Chair: Stephane Louise

743	Alchemy Workshop Keynote: Programming heterogeneous, manycore machines: a runtime system's perspective [abstract] Abstract: Heterogeneous manycore parallel machines, mixing multicore CPUs with manycore accelerators, provide an unprecedented amount of processing power per node. Dealing with such a large number of heterogeneous processing units -- providing a highly unbalanced computing power -- is one of the biggest challenges that developers of HPC applications have to face. To fully tap into the potential of these heterogeneous machines, pure offloading approaches, which consist in running an application on host cores while offloading part of the code on accelerators, are not sufficient. In this talk, I will go through the major software techniques that were specifically designed to harness heterogeneous architectures, focusing on runtime systems. I will discuss some of the most critical issues programmers have to consider to achieve portability of performance, and how programming languages may evolve to meet such a goal. Finally, I will give some insights about the main challenges designers of programming environments will have to face in upcoming years.	Raymond Namyst
433	On the Use of a Many-core Processor for Computational Fluid Dynamics Simulations [abstract] Abstract: The increased availability of modern embedded many-core architectures supporting floating-point operations in hardware makes them interesting targets in traditional high performance computing areas as well. In this paper, the Lattice Boltzmann Method (LBM) from the domain of Computational Fluid Dynamics (CFD) is evaluated on Adapteva’s Epiphany many-core architecture. Although the LBM implementation shows very good scalability and high floating-point efficiency in the lattice computations, current Epiphany hardware does not provide adequate amounts of either local memory or external memory bandwidth to provide a good foundation for simulation of the large problems commonly encountered in real CFD applications.	Sebastian Raase, Tomas Nordström
263	A short overview of executing Γ Chemical Reactions over the ΣC and τC Dataflow Programming Models [abstract] Abstract: Many-core processors offer top computational power while keeping the energy consumption reasonable compared to complex processors. Today, they are entering both high-performance computing systems and embedded systems. However, these processors require dedicated programming models to efficiently benefit from their massively parallel architectures. The chemical programming paradigm was introduced in the late eighties as an elegant way of formally describing distributed programs. Data are seen as molecules that can freely react thanks to operators to create new data. This paradigm has also been used within the context of grid computing and now seems to be relevant for many-core processors. Very few implementations of runtimes for chemical programming have been proposed, none of them giving serious elements on how it can be deployed onto a real architecture. In this paper, we propose to implement some parts of the chemical paradigm over the ΣC dataflow programming language, which is dedicated to many-core processors. We show how to represent molecules using agents and communication links, and to iteratively build the dataflow graph following the chemical reactions. A preliminary implementation of the chemical reaction mechanisms is provided using the τC dataflow compilation toolchain, a language close to ΣC, in order to demonstrate the relevance of the proposition.	Loïc Cudennec, Thierry Goubier
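The chemical paradigm mentioned in abstract 263, where data are molecules that react until nothing can react any more, can be illustrated with a minimal sequential sketch. It is not the ΣC or τC implementation described in the paper: the function `gamma` and its `condition` / `action` parameters are hypothetical names, and the pairwise, one-reaction-at-a-time scheduling is an assumption made for brevity (the paradigm itself leaves reaction order unspecified).

```python
import itertools


def gamma(multiset, condition, action):
    """Let pairs of molecules react until the multiset is inert."""
    pool = list(multiset)
    while True:
        for i, j in itertools.permutations(range(len(pool)), 2):
            if condition(pool[i], pool[j]):
                products = action(pool[i], pool[j])
                # Consume the two reactants and release the products.
                pool = [m for k, m in enumerate(pool) if k not in (i, j)]
                pool.extend(products)
                break
        else:
            # No pair satisfies the reaction condition: stable state.
            return pool


# Maximum of a multiset: x and y react into x whenever x >= y.
print(gamma([3, 1, 4, 1, 5], lambda x, y: x >= y, lambda x, y: [x]))

# Prime sieve: x absorbs any multiple y of itself.
primes = gamma(range(2, 20), lambda x, y: y % x == 0, lambda x, y: [x])
print(sorted(primes))
```

Because each reaction here only depends on the two molecules involved, independent pairs could in principle react concurrently, which is what makes the model attractive for many-core targets.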
435	Threaded MPI Programming Model for the Epiphany RISC Array Processor [abstract] Abstract: The Adapteva Epiphany RISC array processor offers high computational energy-efficiency and parallel scalability. However, extracting performance with a standard parallel programming model remains a great challenge. We present an effective programming model for the low-power Epiphany architecture based on the Message Passing Interface (MPI) standard. Using MPI exploits the similarities between the Epiphany architecture and a networked parallel distributed cluster. Furthermore, our approach enables codes written with MPI to execute on the RISC array processor with little modification. We present experimental results for the threaded MPI implementation of matrix-matrix multiplication and highlight the importance of fast inter-core data transfers. Our high-level programming methodology achieved an on-chip performance of 9.1 GFLOPS.	David Richie, James Ross, Song Park and Dale Shires

Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M208

Chair: Stephane Louise

529	An Empirical Evaluation of a Programming Model for Context-Dependent Real-time Streaming Applications [abstract] Abstract: We present a Programming Model for real-time streaming applications on high performance embedded multi- and many-core systems. Realistic streaming applications are highly dependent on the execution context (usually of the physical world), past learned strategies, and often real-time constraints. The proposed Programming Model encompasses real-time requirements, determinism of execution and context dependency. It is an extension of the well-known Cyclo-Static Dataflow (CSDF) model, chosen for its desirable properties (determinism and composability), with two new important data-flow filters: Select-duplicate and Transaction, which retain the main properties of CSDF graphs and also provide useful features to implement real-time computational embedded applications. We evaluate the performance of our programming model on several real-life case studies and demonstrate that our approach overcomes a range of limitations that used to be associated with CSDF models.	Xuan Khanh Do, Stephane Louise, Albert Cohen
617	A Case Study on Using a Proto-Application as a Proxy for Code Modernization [abstract] Abstract: The current HPC system architecture trend consists in the use of many-core and heterogeneous architectures. Programming and runtime approaches struggle to scale with the growing number of nodes and cores. In order to take advantage of both distributed and shared memory levels, flat MPI seems unsustainable. Hybrid parallelization strategies are required. In a previous work we have demonstrated the efficiency of the D&C approach for the hybrid parallelization of finite element method assembly on unstructured meshes. In this paper we introduce the concept of proto-application as a proxy between computer scientists and application developers. The D&C library has been entirely developed on a proto-application, extracted from an industrial application called DEFMESH, and then ported back and validated on the original application. In the meantime, we have ported the D&C library to AETHER, an industrial fluid dynamics code developed by Dassault Aviation. The results show that the speed-up validated on the proto-application can be reproduced on other full scale applications using similar computational patterns. Nevertheless, this experience draws attention to code modernization issues, such as data layout adaptation and memory management. As the D&C library uses a task based runtime, we also make a comparison between Intel® Cilk™ Plus and OpenMP.	Nathalie Möller, Eric Petit, Loïc Thébault, Quang Dinh
422	A Methodology for Profiling and Partitioning Stream Programs on Many-core Architectures [abstract] Abstract: Maximizing the data throughput is a very common implementation objective for several streaming applications. Such a task is particularly challenging for implementations based on many-core and multi-core target platforms because, in general, it implies tackling several NP-complete combinatorial problems. Moreover, an efficient design space exploration requires an accurate evaluation on the basis of dataflow program execution profiling. The focus of the paper is on the methodology challenges for obtaining accurate profiling measures. Experimental results validate a many-core platform built by an array of Transport Triggered Architecture processors for exploring the partitioning search space based on the execution trace analysis.	Malgorzata Michalska, Jani Boutellier, Marco Mattavelli
424	Execution Trace Graph Based Multi-Criteria Partitioning of Stream Programs [abstract] Abstract: One of the problems proven to be NP-hard in the field of many-core architectures is the partitioning of stream programs. In order to maximize the execution parallelism and obtain the maximal data throughput for a streaming application it is essential to find an appropriate actors assignment. The paper proposes a novel approach for finding a close-to-optimal partitioning configuration which is based on the execution trace graph of a dataflow network and its analysis. We present some aspects of dataflow programming that make the partitioning problem different in this paradigm and build the heuristic methodology on them. Our optimization criteria include: balancing the total processing workload with regards to data dependencies, actors idle time minimization and reduction of data exchanges between processing units. Finally, we validate our approach with experimental results for a video decoder design case and compare them with some state-of-the-art solutions.	Malgorzata Michalska, Simone Casale-Brunet, Endri Bezati, Marco Mattavelli
365	A First Step to Performance Prediction for Heterogeneous Processing on Manycores [abstract] Abstract: In order to maintain the continuous growth of the performance of computers while keeping their energy consumption under control, the microelectronics industry develops architectures capable of processing more and more tasks concurrently. Thus, the next generations of microprocessors may count hundreds of independent cores that may differ in their functions and features. As an extensive knowledge of their internals cannot be a prerequisite to their programming and for the sake of portability, these forthcoming computers necessitate the compilation flow to evolve and cope with heterogeneity issues. In this paper, we lay a first step toward a possible solution to this challenge by exploring the results of SPMD type of parallelism and predicting performance of the compilation results so that our tools can guide a compiler to build an optimal partition of tasks automatically, even on heterogeneous targets. Our experimental results show that our tools predict real-world performance with very good accuracy.	Nicolas Benoit, Stephane Louise

Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M208

Chair: Stephane Louise

528	Towards an automatic co-generator for manycores’ architecture and runtime: STHORM case-study [abstract] Abstract: The increasing design complexity of manycore architectures at the hardware and software levels calls for powerful tools capable of validating every functional and non-functional property of the architecture. At the design phase, the chip architect needs to explore several parameters from the design space, and iterate on different instances of the architecture, in order to meet the defined requirements. Each new architectural instance requires the configuration and the generation of a new hardware model/simulator, its runtime, and the applications that will run on the platform, which is a very long and error-prone task. In this context, the IP-XACT standard has become widely used in the semiconductor industry to package IPs and provide low level SW stack to ease their integration. In this work, we present preliminary work on a methodology for automatically configuring and assembling an IP-XACT golden model and generating the corresponding manycore architecture HW model, low-level software runtime and applications. We use the STHORM manycore architecture and the HBDC application as a case study.	Charly Bechara, Karim Ben Chehida, Farhat Thabet
249	Retargeting of the Open Community Runtime to Intel Xeon Phi [abstract] Abstract: The Open Community Runtime (OCR) is a recent effort in the search for a runtime for extreme scale parallel systems. OCR relies on the concept of a dynamically generated task graph to express the parallelism of a program. Rather than being directly used for application development, the main purpose of OCR is to become a low-level runtime for higher-level programming models and tools. Since manycore architectures like the Intel Xeon Phi are likely to play a major role in future high performance systems, we have implemented the OCR API for shared-memory machines, including the Xeon Phi. We have also implemented two benchmark applications and performed experiments to investigate the viability of the OCR as a runtime for manycores. Our experiments and a comparison with OpenMP indicate that OCR can be an efficient runtime system for current and emerging manycore systems.	Jiri Dokulil, Siegfried Benkner
14	Prefetching Challenges in Distributed Memories for CMPs [abstract] Abstract: Prefetch engines working on distributed memory systems behave independently by analyzing the memory accesses that are addressed to the attached piece of cache. They potentially generate prefetching requests targeted at any other tile on the system that depends on the computed address. This distributed behavior involves several challenges that are not present when the cache is unified. In this paper, we identify, analyze, and quantify the effects of these challenges, thus paving the way to future research on how to implement prefetching mechanisms at all levels of this kind of system with shared distributed caches.	Marti Torrents, Raul Martínez, Carlos Molina