ICCS 2015 Main Track (MT) Session 1

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: M101

Chair: Jorge Veiga Fachal

53 Diarchy: An Optimized Management Approach for MapReduce Masters [abstract]
Abstract: The MapReduce community is progressively replacing the classic Hadoop with Yarn, the second-generation Hadoop (MapReduce 2.0). This transition is being made for many reasons, but primarily because of some scalability drawbacks of the classic Hadoop. The new framework has appropriately addressed this issue and is being praised for its multi-functionality. In this paper we carry out a probabilistic analysis that emphasizes some reliability concerns of Yarn at the job master level. This is a critical point, since the failure of a job master involves the failure of all the workers managed by that master. We propose Diarchy, a novel system for the management of job masters. Its aim is to increase the reliability of Yarn, based on the sharing and backup of responsibilities between two masters working as peers. The evaluation results show that Diarchy outperforms the reliability performance of Yarn in different setups, regardless of cluster size, type of job, or average failure rate, and suggest a positive impact of this approach compared to the traditional, single-master Hadoop architecture.
Bunjamin Memishi, María S. Pérez, Gabriel Antoniu
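A minimal, illustrative sketch of the reliability intuition behind a dual-master design (not the paper's actual probabilistic model): assuming independent master failures with probability p, a job under a single master is lost with probability p, while under two peer masters it is lost only if both fail.

# Simplified, illustrative reliability comparison (not the paper's model):
# a job fails under a single master whenever that master fails, while with
# two peer masters (Diarchy-style) the job is lost only if both masters fail.
# Failure events are assumed independent here.

def single_master_failure(p_master: float) -> float:
    """Probability that a job is lost with one master."""
    return p_master

def dual_master_failure(p_master: float) -> float:
    """Probability that a job is lost with two independent peer masters."""
    return p_master ** 2

if __name__ == "__main__":
    for p in (0.01, 0.05, 0.10):
        print(f"p={p:.2f}: single={single_master_failure(p):.4f}, "
              f"dual={dual_master_failure(p):.6f}")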
61 MPI-Parallel Discrete Adjoint OpenFOAM [abstract]
Abstract: OpenFOAM is a powerful Open-Source (GPLv3) Computational Fluid Dynamics tool box with a rising adoption in both academia and industry due to its continuously growing set of features and the lack of license costs. Our previously developed discrete adjoint version of OpenFOAM allows us to calculate derivatives of arbitrary objectives with respect to a potentially very large number of input parameters at a relative (to a single primal flow simulation) computational cost which is independent of that number. Discrete adjoint OpenFOAM enables us to run gradient-based methods such as topology optimization efficiently. Up until recently only a serial version was available limiting both the computing performance and the amount of memory available for the solution of the problem. In this paper we describe a first parallel version of discrete adjoint OpenFOAM based on our adjoint MPI library.
Markus Towara, Michel Schanen, Uwe Naumann
98 Versioned Distributed Arrays for Resilience in Scientific Applications: Global View Resilience [abstract]
Abstract: Exascale studies project reliability challenges for future HPC systems. We propose the Global View Resilience (GVR) system, a library that enables applications to add resilience in a portable, application-controlled fashion using versioned distributed arrays. We describe GVR’s interfaces to distributed arrays, versioning, and cross-layer error recovery. Using several large applications (OpenMC, preconditioned conjugate gradient (PCG) solver, ddcMD, and Chombo), we evaluate the programmer effort to add resilience. The required changes are small (<2% LOC), localized, and machine-independent, requiring no software architecture changes. We also measure the overhead of adding GVR versioning and show that generally overheads <2% are achieved. Thus, we conclude that GVR’s interfaces and implementation are flexible, portable, and create a gentle-slope path to tolerate growing error rates in future systems.
Andrew Chien, Pavan Balaji, Pete Beckman, Nan Dun, Aiman Fang, Hajime Fujita, Kamil Iskra, Zachary Rubenstein, Ziming Zheng, Robert Schreiber, Jeff Hammond, James Dinan, Ignacio Laguna, David Richards, Anshu Dubey, Brian van Straalen, Mark Hoemmen, Michael Heroux, Keita Teranishi, Andrew Siegel
106 Characterizing a High Throughput Computing Workload: The Compact Muon Solenoid (CMS) Experiment at LHC [abstract]
Abstract: High throughput computing (HTC) has aided the scientific community in the analysis of vast amounts of data and computational jobs in distributed environments. To manage these large workloads, several systems have been developed to efficiently allocate and provide access to distributed resources. Many of these systems rely on job characteristics estimates (e.g., job runtime) to characterize the workload behavior, which in practice is hard to obtain. In this work, we perform an exploratory analysis of the CMS experiment workload using the statistical recursive partitioning method and conditional inference trees to identify patterns that characterize particular behaviors of the workload. We then propose an estimation process to predict job characteristics based on the collected data. Experimental results show that our process estimates job runtime with 75% accuracy on average, and produces nearly optimal predictions for disk and memory consumption.
Rafael Ferreira Da Silva, Mats Rynge, Gideon Juve, Igor Sfiligoi, Ewa Deelman, James Letts, Frank Wuerthwein, Miron Livny
182 Performance Tuning of MapReduce Jobs Using Surrogate-Based Modeling [abstract]
Abstract: Modeling workflow performance is crucial for finding optimal configuration parameters and optimizing execution times. We apply the method of surrogate-based modeling to performance tuning of MapReduce jobs. We build a surrogate model defined by a multivariate polynomial containing a variable for each parameter to be tuned. For illustrative purposes, we focus on just two parameters: the number of parallel mappers and the number of parallel reducers. We demonstrate that an accurate performance model can be built by sampling a small subset of the parameter space. We compare the accuracy and cost of building the model when using different sampling methods as well as when using different modeling approaches. We conclude that the surrogate-based approach we describe is both less expensive in terms of sampling time and more accurate than other well-known tuning methods.
Travis Johnston, Mohammad Alsulmi, Pietro Cicotti, Michela Taufer
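As a rough illustration of surrogate-based performance modeling (not the authors' code; the configurations and runtimes below are made-up values), the following sketch fits a bivariate quadratic polynomial to a handful of sampled (mappers, reducers) runtimes and then queries it for an unsampled configuration.

import numpy as np

# Illustrative sketch: fit a bivariate quadratic surrogate T(m, r) ~ runtime
# from a small sample of (mappers, reducers) configurations, then query the
# model for an unsampled configuration. All numbers are hypothetical.

def design_matrix(m, r):
    """Quadratic polynomial basis in two variables."""
    m, r = np.asarray(m, float), np.asarray(r, float)
    return np.column_stack([np.ones_like(m), m, r, m * r, m**2, r**2])

# Hypothetical sampled configurations and measured runtimes (seconds).
mappers  = np.array([2, 2, 4, 4, 8, 8, 16, 16])
reducers = np.array([1, 4, 1, 4, 2, 8,  4, 16])
runtime  = np.array([620, 540, 410, 350, 300, 240, 210, 190])

coeffs, *_ = np.linalg.lstsq(design_matrix(mappers, reducers), runtime, rcond=None)

# Predict the runtime of an unsampled configuration.
pred = design_matrix([12], [6]) @ coeffs
print(f"predicted runtime for 12 mappers / 6 reducers: {pred[0]:.1f} s")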

ICCS 2015 Main Track (MT) Session 2

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: M101

Chair: Markus Towara

305 A Neural Network Embedded System for Real-Time Estimation of Muscle Forces [abstract]
Abstract: This work documents the progress towards the implementation of an embedded solution for muscular forces assessment during cycling activity. The core of the study is the adaptation of an inverse biomechanical model to a real-time paradigm. The model is well suited for real-time applications since all the optimization problems are solved through a direct neural estimator. The real-time version of the model was implemented on an embedded microcontroller platform to profile code performance and precision degradation, using different numerical techniques to balance speed and accuracy in a low computational resources environment.
Gabriele Maria Lozito, Maurizio Schmid, Silvia Conforto, Francesco Riganti Fulginei, Daniele Bibbo
366 Towards Scalability and Data Skew Handling in GroupBy-Joins using MapReduce Model [abstract]
Abstract: For over a decade, MapReduce has been the leading programming model for parallel and massive processing of large volumes of data. This has been driven by the development of many frameworks such as Spark, Pig and Hive, facilitating data analysis on large-scale systems. However, these frameworks still remain vulnerable to communication costs, data skew and task imbalance problems. This can have a devastating effect on the performance and on the scalability of these systems, particularly when treating GroupBy-Join queries of large datasets. In this paper, we present a new GroupBy-Join algorithm that reduces communication costs considerably while avoiding data skew effects. A cost analysis of this algorithm shows that our approach is insensitive to data skew and ensures perfect balancing properties during all stages of GroupBy-Join computation, even for highly skewed data. These results have been confirmed by a series of experiments.
Mohamad Al Hajj Hassan, Mostafa Bamha
452 MREv: an Automatic MapReduce Evaluation Tool for Big Data Workloads [abstract]
Abstract: The popularity of Big Data computing models like MapReduce has caused the emergence of many frameworks oriented to High Performance Computing (HPC) systems. The suitability of each one to a particular use case depends on its design and implementation, the underlying system resources and the type of application to be run. Therefore, the appropriate selection of one of these frameworks generally involves the execution of multiple experiments in order to assess their performance, scalability and resource efficiency. This work studies the main issues of this evaluation, proposing a new MapReduce Evaluator (MREv) tool which unifies the configuration of the frameworks, eases the task of collecting results and generates resource utilization statistics. Moreover, a practical use case is described, including examples of the experimental results provided by this tool. MREv is available to download at http://mrev.des.udc.es.
Jorge Veiga, Roberto R. Expósito, Guillermo L. Taboada, Juan Tourino
604 Load-Balancing for Large Scale Situated Agent-Based Simulations [abstract]
Abstract: In large scale agent-based simulations, memory and computational power requirements can increase dramatically because of high numbers of agents and interactions. To be able to simulate millions of agents, distributing the simulator over a computer network is promising, but raises issues such as agent allocation and load-balancing between machines. In this paper, we study the best ways to automatically balance the load between machines in large scale situations. We study the performance of two different applications with two different distribution approaches, and our experimental results show that some applications can automatically adapt the load between machines and achieve higher performance in large scale simulations with one distribution approach than with the other.
Omar Rihawi, Yann Secq, Philippe Mathieu
669 Changing CPU Frequency in CoMD Proxy Application Offloaded to Intel Xeon Phi Co-processors [abstract]
Abstract: Obtaining exascale performance is a challenge. Although the technology of today features hardware with very high levels of concurrency, exascale performance is primarily limited by energy consumption. This limitation has led to the use of GPUs and specialized hardware such as many integrated core (MIC) co-processors and FPGAs for computation acceleration. The Intel Xeon Phi co-processor, built upon the MIC architecture, features many low frequency, energy efficient cores. Applications, even those which do not saturate the large vector processing unit in each core, may benefit from the energy-efficient hardware and software of the Xeon Phi. This work explores the energy savings of applications which have not been optimized for the co-processor. Dynamic voltage and frequency scaling (DVFS) is often used to reduce energy consumption during portions of the execution where performance is least likely to be affected. This work investigates the impact on energy and performance when DVFS is applied to the CPU during MIC-offloaded sections (i.e., code segments to be processed on the co-processor). Experiments, conducted on the molecular dynamics proxy application CoMD, show that as much as 14% energy may be saved if two Xeon Phis are used. When DVFS is applied to the host CPU frequency, energy savings of as high as 9% are obtained in addition to the 8% saved from reducing link-cell count.
Gary Lawson, Masha Sosonkina, Yuzhong Shen

ICCS 2015 Main Track (MT) Session 3

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: M101

Chair: Gabriele Maria Lozito

37 Improving OpenCL programmability with the Heterogeneous Programming Library [abstract]
Abstract: The use of heterogeneous devices is becoming increasingly widespread. Their main drawback is their low programmability due to the large amount of details that must be handled. Another important problem is the reduced code portability, as most of the tools to program them are vendor or device-specific. The exception to this observation is OpenCL, which largely suffers from the reduced programmability problem mentioned, particularly on the host side. The Heterogeneous Programming Library (HPL) is a recent proposal to improve this situation, as it couples portability with good programmability. While the HPL kernels must be written in a language embedded in C++, users may prefer to use OpenCL kernels for several reasons such as their growing availability or a faster development from existing codes. In this paper we extend HPL to support the execution of native OpenCL kernels and we evaluate the resulting solution in terms of performance and programmability, achieving very good results.
Moises Vinas, Basilio B. Fraguela, Zeki Bozkus, Diego Andrade
241 Efficient Particle-Mesh Spreading on GPUs [abstract]
Abstract: The particle-mesh spreading operation maps a value at an arbitrary particle position to contributions at regular positions on a mesh. This operation is often used when a calculation involving irregular positions is to be performed in Fourier space. We study several approaches for particle-mesh spreading on GPUs. A central concern is the use of atomic operations. We are also concerned with the case where spreading is performed multiple times using the same particle configuration, which opens the possibility of preprocessing to accelerate the overall computation time. Experimental tests show which algorithms are best under which circumstances.
Xiangyu Guo, Xing Liu, Peng Xu, Zhihui Du, Edmond Chow
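A minimal sketch of the spreading operation itself, assuming nearest-grid-point weights in 1D (the paper studies more general GPU variants): each particle value is scatter-added onto its nearest mesh point, and np.add.at plays the role of the GPU atomic adds discussed in the abstract.

import numpy as np

# Minimal 1D nearest-grid-point spreading sketch (not the paper's GPU code):
# each particle's value is accumulated onto its nearest mesh point. np.add.at
# performs an unordered scatter-add, the CPU analogue of GPU atomic adds.

def spread_ngp(positions, values, n_mesh, box_length):
    """Spread particle values onto a regular, periodic 1D mesh of n_mesh points."""
    h = box_length / n_mesh
    idx = np.floor(positions / h + 0.5).astype(int) % n_mesh  # nearest mesh point
    mesh = np.zeros(n_mesh)
    np.add.at(mesh, idx, values)  # atomic-style accumulation
    return mesh

rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 1.0, size=1000)
val = np.ones_like(pos)
print(spread_ngp(pos, val, n_mesh=16, box_length=1.0))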
279 AMA: Asynchronous Management of Accelerators for Task-based Programming Models [abstract]
Abstract: Computational science has benefited in the last years from emerging accelerators that increase the performance of scientific simulations, but using these devices hinders the programming task. This paper presents AMA: a set of optimization techniques to efficiently manage multi-accelerator systems. AMA maximizes the overlap of computation and communication in a blocking-free way. Then, we can use such spare time to do other work while waiting for device operations. Implemented on top of a task-based framework, the experimental evaluation of AMA on a quad-GPU node shows that we reach the performance of a hand-tuned native CUDA code, with the advantage of fully hiding the device management. In addition, we obtain a performance speed-up of more than 2x with respect to the original framework implementation.
Judit Planas, Rosa M. Badia, Eduard Ayguadé, Jesús Labarta
286 Adaptive Partitioning for Irregular Applications on Heterogeneous CPU-GPU Chips [abstract]
Abstract: Commodity processors are comprised of several CPU cores and one integrated GPU. To fully exploit this type of architectures, one needs to automatically determine how to partition the workload between both devices. This is especially challenging for irregular workloads, where each iteration's work is data dependent and shows control and memory divergence. In this paper, we present a novel adaptive partitioning strategy specially designed for irregular applications running on heterogeneous CPU-GPU chips. The main novelty of this work is that the size of the workload assigned to the GPU and CPU adapts dynamically to maximize the GPU and CPU utilization while balancing the workload among the devices. Our experimental results on an Intel Haswell architecture using a set of irregular benchmarks show that our approach outperforms exhaustive static and adaptive state-of-the-art approaches in terms of performance and energy consumption.
Antonio Vilches, Rafael Asenjo, Angeles Navarro, Francisco Corbera, Ruben Gran, Maria Garzaran
304 Using high performance algorithms for the hybrid simulation of disease dynamics on CPU and GPU [abstract]
Abstract: In the current work the authors present several approaches to the high performance simulation of human diseases propagation using hybrid two-component imitational models. The models under study were created by coupling compartmental and discrete-event submodels. The former is responsible for the simulation of the demographic processes in a population while the latter deals with a disease progression for a certain individual. The number and type of components used in a model may vary depending on the research aims and data availability. The introduced high performance approaches are based on batch random number generation, distribution of simulation runs and calculations on graphics processing units. The emphasis was made on the possibility to use the approaches for various model types without considerable code refactoring for every particular model. The speedup gained was measured on simulation programs written in C++ and MATLAB for the models of HIV and tuberculosis spread and the models of tumor screening for the prevention of colorectal cancer. The benefits and drawbacks of the described approaches along with the future directions of their development are discussed.
Vasiliy Leonenko, Nikolai Pertsev, Marc Artzrouni

ICCS 2015 Main Track (MT) Session 4

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: M101

Chair: Sascha Hell

411 Point Distribution Tensor Computation on Heterogeneous Systems [abstract]
Abstract: Big data in observational and computational sciences impose increasing challenges on data analysis. In particular, data from light detection and ranging (LIDAR) measurements are challenging conventional CPU-based algorithms due to their sheer size and the complexity needed for decent accuracy. These data describing terrains are natively given as big point clouds consisting of millions of independent coordinate locations from which meaningful geometrical information content needs to be extracted. The method of computing the point distribution tensor is a very promising approach, yielding good results to classify domains in a point cloud according to local neighborhood information. However, an existing KD-Tree parallel approach, provided by the VISH visualization framework, may very well take several days to deliver meaningful results on a real-world dataset. Here we present an optimized version based on uniform grids implemented in OpenCL that is able to deliver results of equal accuracy up to 24 times faster on the same hardware. The OpenCL version is also able to benefit from a heterogeneous environment and we analyzed and compared the performance on various CPU, GPU and accelerator hardware platforms. Finally, aware of the heterogeneous computing trend, we propose two low-complexity dynamic heuristics for the scheduling of independent dataset fragments in multi-device heterogeneous systems.
Ivan Grasso, Marcel Ritter, Biagio Cosenza, Werner Benger, Günter Hofstetter, Thomas Fahringer
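An illustrative sketch of a point distribution tensor for a single query point (the exact weighting and neighborhood definition used by the authors may differ): the tensor is the averaged outer product of offset vectors to neighbors within a radius, and its eigenvalue spectrum indicates whether the local neighborhood is point-, line- or surface-like.

import numpy as np

# Illustrative point distribution tensor (the authors' weighting may differ):
# for a query point, sum the outer products of offsets to neighbours within a
# radius. The eigenvalues classify the neighbourhood geometry.

def distribution_tensor(points, center, radius):
    offsets = points - center
    mask = np.einsum("ij,ij->i", offsets, offsets) <= radius**2
    d = offsets[mask]
    if len(d) == 0:
        return np.zeros((3, 3))
    return np.einsum("ni,nj->ij", d, d) / len(d)

rng = np.random.default_rng(1)
cloud = rng.normal(size=(10_000, 3)) * np.array([1.0, 1.0, 0.05])  # flat patch
T = distribution_tensor(cloud, center=np.zeros(3), radius=1.0)
print(np.linalg.eigvalsh(T))  # two large, one small eigenvalue => surface-like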
465 Toward a multi-level parallel framework on GPU cluster with PetSC-CUDA for PDE-based Optical Flow computation [abstract]
Abstract: In this work we present a multi-level parallel framework for Optical Flow computation on a GPU cluster, equipped with a scientific computing middleware (the PETSc library). Starting from a flow-driven isotropic method, which models the optical flow problem through a parabolic partial differential equation (PDE), we have designed a parallel algorithm and its software implementation that is suitable for heterogeneous computing environments (multiprocessor, single GPU and cluster of GPUs). The proposed software has been tested on real SAR image sequences. Experiments highlight the performance obtained and a gain of about 95% with respect to the sequential implementation.
Salvatore Cuomo, Ardelio Galletti, Giulio Giunta, Livia Marcellino
472 Performance Analysis and Optimisation of Two-Sided Factorization Algorithms for Heterogeneous Platform [abstract]
Abstract: Many applications, ranging from big data analytics to nanostructure designs, require the solution of large dense singular value decomposition (SVD) or eigenvalue problems. A first step in the solution methodology for these problems is the reduction of the matrix at hand to condensed form by two-sided orthogonal transformations. This step is standardly used to significantly accelerate the solution process. We present a performance analysis of the main two-sided factorizations used in these reductions: the bidiagonalization, tridiagonalization, and the upper Hessenberg factorizations on heterogeneous systems of multicore CPUs and Xeon Phi coprocessors. We derive a performance model and use it to guide the analysis and to evaluate performance. We develop optimized implementations for these methods that get up to 80% of the optimal performance bounds. Finally, we describe the heterogeneous multicore and coprocessor development considerations and the techniques that enable us to achieve these high-performance results. The work here presents the first highly optimized implementation of these main factorizations for Xeon Phi coprocessors. Compared to the LAPACK versions optimized by Intel for Xeon Phi (in MKL), we achieve up to 50% speedup.
Khairul Kabir, Azzam Haidar, Stanimire Tomov, Jack Dongarra
483 High-Speed Exhaustive 3-locus Interaction Epistasis Analysis on FPGAs [abstract]
Abstract: Epistasis, the interaction between genes, has become a major topic in molecular and quantitative genetics. It is believed that these interactions play a significant role in genetic variations causing complex diseases. Several algorithms have been employed to detect pairwise interactions in genome-wide association studies (GWAS) but revealing higher order interactions remains a computationally challenging task. State of the art tools are not able to perform exhaustive search for all three-locus interactions in reasonable time even for relatively small input datasets. In this paper we present how a hardware-assisted design can solve this problem and provide fast, efficient and exhaustive third-order epistasis analysis with up-to-date FPGA technology.
Jan Christian Kässens, Lars Wienbrandt, Jorge González-Domínguez, Bertil Schmidt and Manfred Schimmler
487 Evaluating the Potential of Low Power Systems for Headphone-based Spatial Audio Applications [abstract]
Abstract: Embedded architectures have been traditionally designed tailored to perform a dedicated (specialized) function, and in general feature a limited amount of processing resources as well as exhibit very low power consumption. In this line, the recent introduction of systems-on-chip (SoC) composed of low power multicore processors, combined with a small graphics accelerator (or GPU), presents a notable increment of the computational capacity while partially retaining the appealing low power consumption of embedded systems. This paper analyzes the potential of these new hardware systems to accelerate applications that integrate spatial information into an immersive audiovisual virtual environment or into video games. Concretely, our work discusses the implementation and performance evaluation of a headphone-based spatial audio application on the Jetson TK1 development kit, a board equipped with a SoC comprising a quad-core ARM processor and an NVIDIA "Kepler" GPU. Our implementations exploit the hardware parallelism of both types of architectures by carefully adapting the underlying numerical computations. The experimental results show that the accelerated application is able to move up to 300 sound sources simultaneously in real time on this platform.
Jose A. Belloch, Alberto Gonzalez, Rafael Mayo, Antonio M. Vidal, Enrique S. Quintana-Orti

ICCS 2015 Main Track (MT) Session 5

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: M101

Chair: Lars Wienbrandt

488 Real-Time Sound Source Localization on an Embedded GPU Using a Spherical Microphone Array [abstract]
Abstract: Spherical microphone arrays are becoming increasingly important in acoustic signal processing systems for their applications in sound field analysis, beamforming, spatial audio, etc. The positioning of target and interfering sound sources is a crucial step in many of the above applications. Therefore, 3D sound source localization is a highly relevant topic in the acoustic signal processing field. However, spherical microphone arrays are usually composed of many microphones and running signal processing localization methods in real time is an important issue. Some works have already shown the potential of Graphic Processing Units (GPUs) for developing high-end real-time signal processing systems. New embedded systems with integrated GPU accelerators providing low power consumption are becoming increasingly relevant. These novel systems play a very important role in the new era of smartphones and tablets, opening further possibilities to the design of high-performance compact processing systems. This paper presents a 3D source localization system using a spherical microphone array fully implemented on an embedded GPU. The real-time capabilities of these platforms are analyzed, providing also a performance analysis of the localization system under different acoustic conditions.
Jose A. Belloch, Maximo Cobos, Alberto Gonzalez, Enrique S. Quintana-Orti
81 The Scaled Boundary Finite Element Method for the Analysis of 3D Crack Interaction [abstract]
Abstract: The Scaled Boundary Finite Element Method (SBFEM) can be applied to solve linear elliptic boundary value problems when a so-called scaling center can be defined such that every point on the boundary is visible from it. From a more practical point of view, this means that in linear elasticity, a separation of variables ansatz can be used for the displacements in a scaled boundary coordinate system. This approach allows an analytical treatment of the problem in the scaling direction. Only the boundary needs to be discretized with Finite Elements. Employment of the separation of variables ansatz in the virtual work balance yields a Cauchy-Euler differential equation system of second order which can be transformed into an eigenvalue problem and solved by standard eigenvalue solvers for nonsymmetric matrices. A further obtained linear equation system serves for enforcing the boundary conditions. If the scaling center is located directly at a singular point, elliptic boundary value problems containing singularities can be solved with high accuracy and computational efficiency. The application of the SBFEM to the linear elasticity problem of two meeting inter-fiber cracks in a composite laminate exposed to a simple homogeneous temperature decrease reveals the presence of hypersingular stresses.
Sascha Hell and Wilfried Becker
85 Algorithmic Differentiation of Numerical Methods: Second-Order Tangent Solvers for Systems of Parametrized Nonlinear Equations [abstract]
Abstract: Forward mode algorithmic differentiation transforms implementations of multivariate vector functions as computer programs into first directional derivative (also: first-order tangent) code. Its reapplication yields higher directional derivative (higher-order tangent) code. Second derivatives play an important role in nonlinear programming. For example, second-order (Newtontype) nonlinear optimization methods promise faster convergence in the neighborhood of the minimum through taking into account second derivative information. Part of the objective function may be given implicitly as the solution of a system of n parameterized nonlinear equations. If the system parameters depend on the free variables of the objective, then second derivatives of the nonlinear system’s solution with respect to those parameters are required. The local computational overhead for the computation of second-order tangents of the solution vector with respect to the parameters by Algorithmic Differentiation depends on the number of iterations performed by the nonlinear solver. This dependence can be eliminated by taking a second-order symbolic approach to differentiation of the nonlinear system.
Niloofar Safiran, Johannes Lotz, Uwe Naumann

ICCS 2015 Main Track (MT) Session 6

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: M101

Chair: Niloofar Safiran

131 How High a Degree is High Enough for High Order Finite Elements? [abstract]
Abstract: High order finite element methods can solve partial differential equations more efficiently than low order methods. But how high a polynomial degree is beneficial? This paper addresses that question through a case study of three problems representing problems with smooth solutions, problems with steep gradients, and problems with singularities. It also contrasts h-adaptive, p-adaptive, and hp-adaptive refinement. The results indicate that for low accuracy requirements, like 1% relative error, h-adaptive refinement with relatively low order elements is sufficient, and for high accuracy requirements, p-adaptive refinement is best for smooth problems and hp-adaptive refinement with elements up to about 10th degree is best for other problems.
William Mitchell
179 Higher-Order Discrete Adjoint ODE Solver in C++ for Dynamic Optimization [abstract]
Abstract: Parametric ordinary differential equations (ODE) arise in many engineering applications. We consider ODE solutions to be embedded in an overall objective function which is to be minimized, e.g. for parameter estimation. For derivative-based optimization algorithms adjoint methods should be used. In this article, we present a discrete adjoint ODE integration framework written in C++ (NIXE 2.0) combined with Algorithmic Differentiation by overloading (dco/c++). All required derivatives, i.e. Jacobians for the integration as well as gradients and Hessians for the optimization, are generated automatically. With this framework, derivatives of arbitrary order can be implemented with minimal programming effort. The practicability of this approach is demonstrated in a dynamic parameter estimation case study for a batch fermentation process using the sequential method of dynamic optimization. Ipopt is used as the optimizer, which requires second derivatives.
Johannes Lotz, Uwe Naumann, Alexander Mitsos, Tobias Ploch, Ralf Hannemann-Tamás
211 A novel Factorized Sparse Approximate Inverse preconditioner with supernodes [abstract]
Abstract: Krylov methods preconditioned by Factorized Sparse Approximate Inverses (FSAI) are an efficient approach for the solution of symmetric positive definite linear systems on massively parallel computers. However, FSAI often suffers from a high set-up cost, especially in ill-conditioned problems. In this communication we propose a novel algorithm for the FSAI computation that makes use of the concept of supernode borrowed from sparse LU factorizations and direct methods.
Massimiliano Ferronato, Carlo Janna, Giuseppe Gambolati
343 Nonsymmetric preconditioning for conjugate gradient and steepest descent methods [abstract]
Abstract: We analyze a possibility of turning off post-smoothing (relaxation) in geometric multigrid when used as a preconditioner in preconditioned conjugate gradient (PCG) linear and eigenvalue solvers for the 3D Laplacian. The geometric Semicoarsening Multigrid (SMG) method is provided by the hypre parallel software package. We solve linear systems using two variants (standard and flexible) of PCG and preconditioned steepest descent (PSD) methods. The eigenvalue problems are solved using the locally optimal block preconditioned conjugate gradient (LOBPCG) method available in hypre through BLOPEX software. We observe that turning off the post-smoothing in SMG dramatically slows down the standard PCG-SMG. For flexible PCG and LOBPCG, our numerical tests show that removing the post-smoothing results in overall 40-50 percent acceleration, due to the high costs of smoothing and relatively insignificant decrease in convergence speed. We demonstrate that PSD-SMG and flexible PCG-SMG converge similarly if SMG post-smoothing is off. A theoretical justification is provided.
Henricus Bouwmeester, Andrew Dougherty, Andrew Knyazev

ICCS 2015 Main Track (MT) Session 7

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: M101

Chair: Michal Marks

345 Dynamics with Matrices Possessing Kronecker Product Structure [abstract]
Abstract: In this paper we present an application of the Alternating Direction Implicit algorithm to solving non-stationary PDEs, which allows us to obtain linear computational complexity. We illustrate this approach by solving two example non-stationary three-dimensional problems using an explicit Euler time-stepping scheme: the heat equation and the linear elasticity equations for a cube.
Marcin Łoś, Maciej Woźniak, Maciej Paszyński, Lisandro Dalcin, Victor M. Calo
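A small sketch of the Kronecker-product identity that underlies alternating-direction solvers of this kind (illustrative only, not the paper's implementation): if the system matrix factors as A ⊗ B, then solving (A ⊗ B) u = b reduces to sequences of small one-dimensional solves, which is where the linear computational cost comes from.

import numpy as np

# Sketch of the Kronecker-product trick behind alternating-direction solvers:
# if M = A kron B, then M vec(X) = vec(B X A^T), so solving M u = b reduces to
# two sequences of small one-dimensional solves instead of one large solve.

def kron_solve(A, B, b):
    """Solve (A kron B) u = b using only one-dimensional factorizations."""
    n, m = A.shape[0], B.shape[0]
    # vec() is column stacking here: vec(X)[i + j*m] = X[i, j]
    C = b.reshape(m, n, order="F")
    Y = np.linalg.solve(B, C)          # apply B^{-1} along one direction
    X = np.linalg.solve(A, Y.T).T      # apply A^{-1} along the other
    return X.reshape(-1, order="F")

rng = np.random.default_rng(2)
A = rng.normal(size=(5, 5)) + 5 * np.eye(5)
B = rng.normal(size=(4, 4)) + 4 * np.eye(4)
b = rng.normal(size=20)
u = kron_solve(A, B, b)
print(np.allclose(np.kron(A, B) @ u, b))  # True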
360 A Nonuniform Staggered Cartesian Grid Approach for Lattice-Boltzmann Method [abstract]
Abstract: We propose a numerical approach based on the Lattice-Boltzmann method (LBM) for dealing with mesh refinement on non-uniform staggered Cartesian grids. We explain, in detail, the strategy for mapping LBM over such geometries. The main benefit of this approach, compared to others, consists of solving all fluid units only once per time-step, and also reducing considerably the complexity of the communication and memory management between different refined levels. It also maps better onto parallel processors. To validate our method, we analyze several standard test scenarios, reaching satisfactory results with respect to other state-of-the-art methods. The performance evaluation shows that our approach provides not only a simpler and more efficient scheme for dealing with mesh refinement, but also fast resolution, even in those scenarios where our approach needs to use a higher number of fluid units.
Pedro Valero-Lara, Johan Jansson
48 A Novel Cost Estimation Approach for Wood Harvesting Operations Using Symbolic Planning [abstract]
Abstract: While forestry is an important economic factor, the methods commonly used to estimate potential financial gains from undertaking a harvesting operation are usually based on heuristics and experience. Those methods use an abstract view on the harvesting project at hand, focusing on a few general statistical parameters. To improve the accuracy of felling cost estimates, we propose a novel, single-tree-based cost estimation approach, which utilizes knowledge about the harvesting operation at hand to allow for a more specific and accurate estimate of felling costs. The approach utilizes well-known symbolic planning algorithms which are interfaced via the Planning Domain Definition Language (PDDL) and compile work orders. The work orders can then be used to estimate the total working time and thus the estimated cost for an individual harvesting project, as well as some additional efficiency statistics. Since a large proportion of today's harvesting operations are mechanized instead of motor manual, we focus on the planning of harvester and forwarder workflows. However, the use of these heavy forest machines carries the risk of damaging forest soil when repeatedly driving along skidding roads. Our approach readily allows for assessment of these risks.
Daniel Losch, Nils Wantia, Jürgen Roßmann
140 Genetic Algorithm using Theory of Chaos [abstract]
Abstract: This paper is focused on a genetic algorithm with a chaotic crossover operator. We have performed some experiments to study the possible use of chaos in simulated evolution. A novel genetic algorithm with a chaotic optimization operation is proposed for the optimization of multimodal functions. As the basis of the new crossover operator a simple equation involving chaos is used, specifically the logistic function. The logistic function is a simple one-parameter function of the second order that shows a chaotic behavior for some values of the parameter. Generally, solutions of the logistic function fall into three areas of behavior: convergent, periodic and chaotic. We have supposed that the convergent behavior leads to exploitation and the chaotic behavior aids exploration. The periodic behavior is probably neutral and thus a negligible one. Results of our experiments confirm these expectations. The proposed genetic algorithm with chaotic crossover operator leads to more efficient computation in comparison with the traditional genetic algorithm.
Petra Snaselova, Frantisek Zboril
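An illustrative sketch of a logistic-map-driven crossover (the paper's exact operator is not reproduced here): the logistic map x_{n+1} = r*x_n*(1 - x_n) supplies the blend weights, and for r close to 4 the sequence is chaotic, which the abstract associates with exploration.

import random

# Illustrative logistic-map-based crossover (the paper's operator may differ):
# the logistic map is iterated to produce a mixing coefficient; for r close to
# 4 the sequence is chaotic.

def logistic_stream(x0=0.3, r=3.99):
    """Infinite generator of logistic-map values in (0, 1)."""
    x = x0
    while True:
        x = r * x * (1.0 - x)
        yield x

_chaos = logistic_stream()

def chaotic_crossover(parent_a, parent_b):
    """Arithmetic crossover whose blend weight comes from the logistic map."""
    child = []
    for a, b in zip(parent_a, parent_b):
        w = next(_chaos)
        child.append(w * a + (1.0 - w) * b)
    return child

p1 = [random.uniform(-5, 5) for _ in range(4)]
p2 = [random.uniform(-5, 5) for _ in range(4)]
print(chaotic_crossover(p1, p2))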
271 PSO-based Distributed Algorithm for Dynamic Task Allocation in a Robotic Swarm [abstract]
Abstract: Dynamic task allocation in a robotic swarm is a necessary process for proper management of the swarm. It allows the distribution of the identified tasks to be performed, among the swarm of robots, in such a way that a pre-defined proportion of execution of those tasks is achieved. In this context, there is no central unit to take care of the task allocation. So any algorithm proposal must be distributed, allowing every, and each robot in the swarm to identify the task it must perform. This paper proposes a distributed control algorithm to implement dynamic task allocation in a swarm robotics environment. The algorithm is inspired by the particle swarm optimization. In this context, each robot that integrates the swarm must run the algorithm periodically in order to control the underlying actions and decisions. The algorithm was implemented on ELISA III real swarm robots and extensively tested. The algorithm is effective and the corresponding performance is promising.
Nadia Nedjah, Rafael Mendonça, Luiza De Macedo Mourelle
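For reference, a generic particle swarm optimization update of the kind that inspired the allocation algorithm (the distributed, per-robot decision logic of the paper is not reproduced here); the inertia and acceleration coefficients below are common textbook defaults, not the authors' settings.

import numpy as np

# Generic PSO minimization sketch: velocity update pulls each particle toward
# its personal best and the global best; coefficients are textbook defaults.

def pso_minimize(f, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, size=(n_particles, dim))       # positions
    v = np.zeros_like(x)                                   # velocities
    pbest = x.copy()
    pbest_val = np.apply_along_axis(f, 1, x)
    gbest = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.apply_along_axis(f, 1, x)
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = x[improved], vals[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, pbest_val.min()

best, val = pso_minimize(lambda z: np.sum(z**2), dim=2)
print(best, val)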

ICCS 2015 Main Track (MT) Session 8

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: M101

Chair: Nadia Nedjah

469 Expressively Modeling the Social Golfer Problem in SAT [abstract]
Abstract: Constraint Satisfaction Problems allow one to expressively model problems. On the other hand, propositional satisfiability problem (SAT) solvers can handle huge SAT instances. We thus present a technique to expressively model set constraint problems and to encode them automatically into SAT instances. Our technique is expressive and less error-prone. We apply it to the Social Golfer Problem and to symmetry breaking of the problem.
Frederic Lardeux, Eric Monfroy
538 Multi-Objective Genetic Algorithm for Variable Selection in Multivariate Classification Problems: A Case Study in Verification of Biodiesel Adulteration [abstract]
Abstract: This paper proposes a multi-objective genetic algorithm for the problem of variable selection in multivariate calibration. We consider the problem related to the classification of biodiesel samples to detect adulteration, using a Linear Discriminant Analysis classifier. The goal of the multi-objective algorithm is to reduce the dimensionality of the original set of variables; thus, the classification model can be less sensitive, providing a better generalization capacity. In particular, in this paper we adopted a version of the Non-dominated Sorting Genetic Algorithm (NSGA-II) and compare it to a mono-objective Genetic Algorithm (GA) in terms of sensitivity in the presence of noise. Results show that the mono-objective algorithm selects 20 variables on average and presents an error rate of 14%. On the other hand, the multi-objective algorithm selects 7 variables and has an error rate of 11%. Consequently, we show that the multi-objective formulation provides classification models with lower sensitivity to the instrumental noise when compared to the mono-objective formulation.
Lucas de Almeida Ribeiro, Anderson Da Silva Soares
653 Siting Multiple Observers for Maximum Coverage: An Accurate Approach [abstract]
Abstract: The selection of the lowest number of observers that ensures the maximum visual coverage over an area represented by a digital elevation model (DEM) is an important problem with great interest in many fields, e.g., telecommunications and environment planning, among others. However, this problem is complex and intractable when the number of points of the DEM is relatively high. This complexity is due to three issues: 1) the difficulty in determining the visibility of the territory from a point, 2) the need to know the visibility at all points of the territory and 3) the combinatorial complexity of the selection of observers. The recent progress in total-viewshed map computation not only provides an efficient solution to the first two problems, but also opens ways to new solutions that were previously unthinkable. This paper presents a new type of cartography, called the masked total viewshed map, and based on this algorithm, optimal solutions for both sequential and simultaneous observer location are provided.
Antonio Manuel Rodriguez Cervilla, Siham Tabik, Luis Felipe Romero Gómez
169 Using Criteria Reconstruction of Low-Sampling Trajectories as a Tool for Analytics [abstract]
Abstract: Today, many applications with incorporated Global Positioning System (GPS) receivers deliver huge quantities of spatio-temporal data. Trajectories followed by moving objects can be generated from these data. However, these trajectories may have silent durations, i.e., time durations when no data are available for describing the route of a moving object (MO). As a result, the movement during silent durations must be described, and the low-sampling trajectory needs to be filled in using specialized data imputation techniques to study and discover new knowledge based on movement. Our interest is to show the opportunities for analytical tasks using a criteria-based operator over reconstructed low-sampling trajectories. Also, a simple visual analysis of the reconstructed trajectories is done to offer a simple analytic perspective of the reconstruction and of how the criterion of movement can change the analysis. To the best of our knowledge, this work is the first attempt to use different trajectory reconstruction criteria to identify the opportunities for analytical tasks over reconstructed low-sampling trajectories as a whole.
Francisco Moreno, Edison Ospina, Iván Amón Uribe
258 Using Genetic Algorithms for Maximizing Technical Efficiency in Data Envelopment Analysis [abstract]
Abstract: Data Envelopment Analysis (DEA) is a non-parametric technique for estimating the technical efficiency of a set of Decision Making Units (DMUs) from a database consisting of inputs and outputs. This paper studies DEA models based on maximizing technical efficiency, which aim to determine the least distance from the evaluated DMU to the production frontier. Usually, these models have been solved through unsatisfactory methods used for combinatorial NP-hard problems. Here, the problem is approached by metaheuristic techniques and the solutions are compared with those of the methodology based on the determination of all the facets of the frontier in DEA. The use of metaheuristics provides solutions close to the optimum with low execution time.
Martin Gonzalez, Jose J. Lopez-Espin, Juan Aparicio, Domingo Gimenez, Jesus T. Pastor

ICCS 2015 Main Track (MT) Session 9

Time and Date: 10:35 - 12:15 on 1st June 2015

Room: V101

Chair: Megan Olsen

673 The construction of complex networks from linear and nonlinear measures — Climate Networks [abstract]
Abstract: During the last decade the techniques of complex network analysis have found application in climate research. The main idea consists in embedding the characteristics of climate variables, e.g., temperature, pressure or rainfall, into the topology of complex networks by appropriate linear and nonlinear measures. Applying such measures to climate time series leads to defining links between their corresponding locations on the studied region, whereas the locations are the network’s nodes. The resulting networks are then analysed using the various network analysis tools present in the literature in order to get a better insight into the processes, patterns and interactions occurring in the climate system. In this regard we present ClimNet, a complete set of software tools to construct climate networks based on a wide range of linear (cross correlation) and nonlinear (information theoretic) measures. The presented software allows the construction of large networks’ adjacency matrices from climate time series while supporting functions to tune relationships to different time-scales by means of symbolic ordinal analysis. The provided tools have been used in the production of various original contributions in climate research. This work presents an in-depth description of the implemented statistical functions widely used to construct climate networks. Additionally, a general overview of the architecture of the developed software is provided, as well as a brief analysis of application examples.
J. Ignacio Deza, Hisham Ihshaish
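A minimal sketch of the linear (cross-correlation) network construction mentioned in the abstract (ClimNet also supports nonlinear, information-theoretic measures and ordinal analysis, which are not shown here): grid locations are nodes, and two nodes are linked when the absolute Pearson correlation of their time series exceeds a threshold.

import numpy as np

# Minimal correlation-network construction: rows of `series` are the time
# series at grid locations (nodes); a link exists where |correlation| passes
# a chosen threshold. Threshold and data are illustrative.

def correlation_network(series, threshold=0.5):
    """series: array of shape (n_nodes, n_timesteps). Returns a 0/1 adjacency matrix."""
    corr = np.corrcoef(series)                 # pairwise Pearson correlations
    adj = (np.abs(corr) >= threshold).astype(int)
    np.fill_diagonal(adj, 0)                   # no self-loops
    return adj

rng = np.random.default_rng(3)
n_nodes, n_steps = 20, 500
common = rng.normal(size=n_steps)              # shared signal -> correlated nodes
data = 0.6 * common + 0.8 * rng.normal(size=(n_nodes, n_steps))
A = correlation_network(data, threshold=0.4)
print("links:", A.sum() // 2)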
70 Genetic Algorithm evaluation of green search allocation policies in multilevel complex urban scenarios [abstract]
Abstract: This paper investigates the relationship between the underlying complexity of urban agent-based models and the performance of optimisation algorithms. In particular, we address the problem of optimal green space allocation within a densely populated urban area. We find that a simple monocentric urban growth model may not contain enough complexity to be able to take complete advantage of advanced optimisation techniques such as Genetic Algorithms (GA) and that, in fact, simple greedy baselines can find a better policy for these simple models. We then turn to more realistic urban models and show that the performance of GA increases with model complexity and uncertainty level.
Marta Vallejo, Verena Rieser and David Corne
80 A unified and memory efficient framework for simulating mechanical behavior of carbon nanotubes [abstract]
Abstract: Carbon nanotubes possess many interesting properties, which make them a promising material for a variety of applications. In this paper, we present a unified framework for the simulation of mechanical behavior of carbon nanotubes. It allows the creation, simulation and visualization of these structures, extending previous work by the research group "MISMO" at TU Darmstadt. In particular, we develop and integrate a new iterative solving procedure, employing the conjugate gradient method, that drastically reduces the memory consumption in comparison to the existing approaches. The increase in operations for the memory-saving approach is partially offset by a well-scaling shared-memory parallelization. In addition, the hotspots in the code have been vectorized. Altogether, the resulting simulation framework enables the simulation of complex carbon nanotubes on commodity multicore desktop computers.
Michael Burger, Christian Bischof, Christian Schröppel, Jens Wackerfuß
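The iterative solver named in the abstract is the conjugate gradient method; the textbook, matrix-free sketch below illustrates why it is memory-friendly, using a toy tridiagonal system in place of the assembled nanotube stiffness matrix.

import numpy as np

# Textbook conjugate gradient iteration, given only a matrix-vector product.
# The toy SPD matrix below stands in for the structural stiffness system.

def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A, given matvec(x) = A x."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

n = 50
A = (np.diag(2.0 * np.ones(n))
     + np.diag(-1.0 * np.ones(n - 1), 1)
     + np.diag(-1.0 * np.ones(n - 1), -1))
b = np.ones(n)
x = conjugate_gradient(lambda v: A @ v, b)
print(np.allclose(A @ x, b, atol=1e-8))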
129 Towards an Integrated Conceptual Design Evaluation of Mechatronic Systems: The SysDICE Approach [abstract]
Abstract: Mechatronic systems play a significant role in different types of industry, especially in transportation, aerospace, automotive and manufacturing. Although their multidisciplinary nature provides enormous functionality, it remains one of the substantial challenges which frequently impede their design process. Notably, the conceptual design phase aggregates various engineering disciplines, project and business management fields, where different methods, modeling languages and software tools are applied. Therefore, an integrated environment is required to intimately engage the different domains together. This paper outlines a model-based research approach for an integrated conceptual design evaluation of mechatronic systems using SysML. In particular, the state of the art is highlighted, along with the most important challenges and remaining problems in this field, and a novel solution is proposed, named SysDICE, combining model-based systems engineering and artificial intelligence techniques to support efficient design.
Mohammad Chami, Jean-Michel Bruel
164 MDE in Practice for Computational Science [abstract]
Abstract: Computational Science tackles complex problems by definition. These problems concern people not only at large scale, but in their day-to-day life. With the development of computing facilities, novel application areas can legitimately benefit from the existing experience in the field. Nevertheless, the lack of reusability, the growing complexity, and the “computing-oriented” nature of the actual solutions call for several improvements. Among these, raising the level of abstraction is the one we address in this paper. As an illustration we can mention the problem of the validity of the experiments, which depends on the validity of the defined programs (bugs not in the experiment and data but in the simulators/validators!). This raises the need to leverage knowledge and expertise. In the software and systems modeling community, research on domain-specific modeling languages (DSMLs) has focused over the last decade on providing technologies for developing languages and tools that allow domain experts to develop system solutions efficiently. In this vision paper, based on concrete experiments, we claim that DSMLs can bridge the gap between the (problem) space in which scientists work and the implementation (programming) space. Incorporating domain-specific concepts and high-quality development experience into DSMLs can significantly improve scientist productivity and experimentation quality.
Jean-Michel Bruel, Benoit Combemale, Ileana Ober, Helene Raynal

ICCS 2015 Main Track (MT) Session 10

Time and Date: 14:30 - 16:10 on 1st June 2015

Room: V101

Chair: Wentong Cai

153 Co-evolution in Predator Prey through Reinforcement Learning [abstract]
Abstract: In general we know that high-level species such as mammals must learn from their environment to survive. We believe that most species evolved over time by ancestors learning the best traits, which allowed them to propagate more than their less effective counterparts. In many instances, learning occurs in a competitive environment, where a species is evolving alongside its food source and/or its predator. We are unaware of work that studies co-evolution of predator and prey through simulation such that each entity learns to survive within its world, and passes that information on to its progeny, without running multiple training runs. We propose an agent-based model of predators and prey with co-evolution through feature-based Q-learning, to allow predators and prey to learn during their lifetime. We show that this learning results in a more successful species for both predator and prey. We suggest that feature-based Q-learning is more effective for this problem than traditional variations on reinforcement learning, and would improve current population dynamics simulations.
Megan Olsen and Rachel Fraczkowski
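A sketch of feature-based Q-learning with linear function approximation, the general technique named in the abstract (the predator/prey features, actions and rewards shown here are hypothetical placeholders, not the authors' design).

import numpy as np

# Feature-based (linear function approximation) Q-learning: Q(s, a) is the dot
# product of per-action weights with a feature vector describing the state.
# Features, actions and rewards below are hypothetical.

class FeatureQLearner:
    def __init__(self, n_features, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.w = np.zeros((n_actions, n_features))  # one weight vector per action
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def q(self, features, action):
        return self.w[action] @ features

    def act(self, features):
        if np.random.random() < self.epsilon:            # explore
            return np.random.randint(len(self.w))
        return int(np.argmax(self.w @ features))         # exploit

    def update(self, features, action, reward, next_features):
        td_target = reward + self.gamma * np.max(self.w @ next_features)
        td_error = td_target - self.q(features, action)
        self.w[action] += self.alpha * td_error * features

# Hypothetical usage with made-up feature vectors (e.g. distance to prey, energy).
agent = FeatureQLearner(n_features=3, n_actions=4)
s, s_next = np.array([0.8, 0.1, 0.5]), np.array([0.6, 0.2, 0.5])
a = agent.act(s)
agent.update(s, a, reward=1.0, next_features=s_next)
print(agent.w)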
184 Adaptive Autonomous Navigation using Reactive Multi-agents System for Control Laws Merging [abstract]
Abstract: This paper deals with intelligent autonomous navigation of a vehicle in a cluttered environment. We present a control architecture for safe and smooth navigation of an Unmanned Ground Vehicle (UGV). This control architecture is designed to allow the use of a single control law for different vehicle contexts (attraction to the target, obstacle avoidance, etc.). The reactive obstacle avoidance strategy is based on the limit-cycle approach. To manage the interaction between the controllers according to the context, a multi-agent system is proposed. Multi-agent systems are an efficient approach for problem solving and decision making. They can be applied to a wide range of applications thanks to their intrinsic properties such as self-organization/emergent phenomena. The merging approach between control laws is based on their properties to adapt the control to the environment. Different simulations in cluttered environments show the performance and the efficiency of our proposal in obtaining a fully reactive and safe control strategy for the navigation of a UGV.
Baudouin Dafflon, Franck Gechter, José Vilca, Lounis Adouane
309 Quantitative Evaluation of Decision Effects in the Management of Emergency Department Problems [abstract]
Abstract: Due to the complexity and crucial role of an Emergency Department (ED) in the healthcare system, the ability to more accurately represent, simulate and predict the performance of an ED will be invaluable for decision makers when solving management problems. One way to realize this requirement is by modeling and simulating the emergency department; the objective of this research is to design a simulator in order to better understand the bottlenecks of ED performance and provide the ability to predict such performance under defined conditions. An agent-based modeling approach was used to model the healthcare staff, patients and physical resources in the ED. This agent-based simulator provides the advantage of knowing the behavior of an ED system from the micro-level interactions among its components. The model was built in collaboration with healthcare staff in a typical ED and has been implemented and verified in the NetLogo modeling environment. Case studies are provided to present some capabilities of the simulator in the quantitative analysis of ED behavior and in supporting decision making. Because of the complexity of the system, high performance computing technology was used to increase the number of studied scenarios and reduce execution time.
Zhengchun Liu, Eduardo Cabrera, Manel Taboada, Francisco Epelde, Dolores Rexachs, Emilio Luque
310 Agent Based Model and Simulation of MRSA Transmission in Emergency Departments [abstract]
Abstract: In healthcare environments we can find several microorganisms causing nosocomial infection, and of which one of the most common and most dangerous is Methicillin-resistant Staphylococcus Aureus. Its presence can lead to serious complications to the patient. Our work uses Agent Based Modeling and Simulation techniques to build the model and the simulation of Methicillin-resistant Staphylococcus Aureus contact transmission in emergency departments. The simulator allows us to build virtual scenarios with the aim of understanding the phenomenon of MRSA transmission and the potential impact of the implementation of different measures in propagation rates.
Cecilia Jaramillo, Manel Taboada, Francisco Epelde, Dolores Rexachs, Emilio Luque
373 Multi-level decision system for the crossroad scenario [abstract]
Abstract: Among the innovations aimed at tackling transportation issues in urban areas, one of the most promising solutions is the possibility of forming virtual trains of vehicles so as to provide a new kind of transportation system. Even though this kind of solution is now widespread in the literature, some difficulties still need to be resolved. For instance, one must find solutions to make the crossing of trains possible while maintaining train composition (trains must not be split) and safety conditions. This paper proposes a multi-level decision process aimed at dealing with this issue. The proposal is based on dynamic adaptation of train parameters, which leads to trains crossing without stopping any of them. Results obtained in simulation are compared with a classical crossing strategy.
Bofei Chen, Franck Gechter, Abderrafiaa Koukam

ICCS 2015 Main Track (MT) Session 11

Time and Date: 16:40 - 18:20 on 1st June 2015

Room: V101

Chair: Emilio Luque

379 Towards a Cognitive Agent-Based Model for Air Conditioners Purchasing Prediction [abstract]
Abstract: Climate change as a result of human activities is a problem of a paramount importance. The global temperature on Earth is gradually increasing and it may lead to substantially hotter summers in a moderate belt of Europe, which in turn is likely to influence the air conditioning penetration in this region. The current work is an attempt to predict air conditioning penetration in different residential areas in the UK between 2030-2090 using an integration of calibrated building models, future weather predictions and an agent-based model. Simulation results suggest that up to 12% of homes would install an air conditioner in 75 years’ time assuming an average purchasing ability of the households. The performed simulations provide more insight into the influence of overheating intensity along with households’ purchasing ability and social norms upon households’ decisions to purchase an air conditioner.
Nataliya Mogles, Alfonso Ramallo-González, Elizabeth Gabe-Thomas
481 Crowd evacuations SaaS: an ABM approach [abstract]
Abstract: Crowd evacuations involve thousands of persons in closed spaces. Knowing where the problematic exits will be or where a disaster may occur can be crucial in emergency planning. We implemented a simulator, using Agent Based Modelling, able to model the behaviour of people in evacuation situations, and a workflow able to run it in the cloud. The input is just a PNG image and the outputs are statistical results of the simulation executed on the cloud. This provides the user with a system abstraction: only a map of the scenario is needed. Many events are held in main city squares, so to test our system we chose Siena and fit about 28,000 individuals in the centre of the square. The software has special computational requirements because the results need to be statistically reliable. Because of these needs, we use distributed computing. In this paper we show how the simulator scales efficiently on the cloud.
Albert Gutierrez-Milla, Francisco Borges, Remo Suppi, Emilio Luque
499 Differential Evolution with Sensitivity Analysis and the Powell's Method for Crowd Model Calibration [abstract]
Abstract: Evolutionary algorithms (EAs) are popular and powerful approaches for model calibration. This paper proposes an enhanced EA-based model calibration method, namely differential evolution (DE) with sensitivity analysis and Powell's method (DESAP). In contrast to traditional EA-based model calibration methods, the proposed DESAP has three main features. First, an entropy-based sensitivity analysis operation is integrated so as to dynamically identify important parameters of the model as evolution progresses. Second, Powell's method is performed periodically to fine-tune the important parameters of the best individual in the population. Finally, in each generation, the DE operators are applied to a small number of better individuals rather than to all individuals in the population. These new search mechanisms are integrated into the DE framework so as to reduce the computational cost and to improve the search efficiency. To validate its effectiveness, the proposed DESAP is applied to two crowd model calibration cases. The results demonstrate that DESAP outperforms several state-of-the-art model calibration methods in terms of accuracy and efficiency.
Jinghui Zhong and Wentong Cai
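A minimal sketch of the kind of hybrid search described in paper 499 is given below: a plain differential evolution loop operating on the better half of the population, with periodic Powell refinement (via SciPy) of the best individual restricted to a few "important" parameters. The variance-based importance proxy, parameter values and function names are our own illustrative assumptions, not the authors' DESAP implementation.

```python
# Sketch only: DE loop with periodic Powell refinement of the best individual,
# loosely inspired by paper 499. The variance-based "importance" proxy and all
# names are our own assumptions, not the authors' method.
import numpy as np
from scipy.optimize import minimize

def de_with_powell(f, bounds, pop_size=20, gens=100, F=0.7, CR=0.9,
                   refine_every=10, n_important=2, rng=None):
    rng = np.random.default_rng(rng)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(bounds)
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([f(x) for x in pop])
    for g in range(gens):
        # DE/rand/1/bin applied only to the better half of the population,
        # echoing the idea of operating on a subset of good individuals.
        for i in np.argsort(fit)[:pop_size // 2]:
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            trial = np.where(rng.random(dim) < CR, mutant, pop[i])
            ft = f(trial)
            if ft < fit[i]:
                pop[i], fit[i] = trial, ft
        # Periodic fine-tuning: rank parameters by a crude sensitivity proxy
        # (population variance) and refine only the most important ones.
        if (g + 1) % refine_every == 0:
            best = int(np.argmin(fit))
            important = np.argsort(pop.var(axis=0))[-n_important:]
            def sub(x_sub, base=pop[best].copy(), idx=important):
                x = base.copy()
                x[idx] = x_sub
                return f(x)
            res = minimize(sub, pop[best][important], method="Powell")
            candidate = pop[best].copy()
            candidate[important] = res.x
            candidate = np.clip(candidate, lo, hi)
            f_cand = f(candidate)
            if f_cand < fit[best]:
                pop[best], fit[best] = candidate, f_cand
    best = int(np.argmin(fit))
    return pop[best], fit[best]

if __name__ == "__main__":
    sphere = lambda x: float(np.sum(x ** 2))
    x, fx = de_with_powell(sphere, bounds=[(-5, 5)] * 4, rng=0)
    print(x, fx)
```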
525 Strip Partitioning for Ant Colony Parallel and Distributed Discrete-Event Simulation [abstract]
Abstract: Data partitioning is one of the main problems in parallel and distributed simulation. The distribution of data over the architecture directly influences the efficiency of the simulation. The partitioning strategy becomes a complex problem because it depends on several factors. In an Individual-oriented Model, for example, the partitioning is related to interactions between the individual and the environment. Therefore, a parallel and distributed simulation should be able to dynamically interchange the partitioning strategy in order to choose the most appropriate one for a specific context. In this paper, we propose a strip partitioning strategy for a spatially dependent problem in Individual-oriented Model applications. This strategy avoids sharing resources and, as a result, decreases the communication volume among processes. In addition, we develop an objective function that calculates the best partitioning for a specific configuration and gives the computing cost of each partition, allowing for computational balance through a mapping policy. The results obtained are supported by statistical analysis and experimentation with an Ant Colony application. As a main contribution, we developed a solution in which the partitioning strategy can be chosen dynamically and always returns the lowest total execution time.
Francisco Borges, Albert Gutierrez-Milla, Remo Suppi, Emilio Luque
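The following toy sketch illustrates the general idea behind paper 525 of a strip partitioning driven by a per-strip computing cost: the environment's columns are split into contiguous strips whose summed workload approaches an even share. The column-load cost model and the greedy splitting rule are our own simplifications, not the authors' objective function.

```python
# Toy strip partitioning driven by a per-strip computing cost (inspired by
# paper 525). The cost model and greedy rule are our own simplifications.
import numpy as np

def strip_partition(column_load, n_strips):
    """Split columns into contiguous strips whose summed load approaches
    total_load / n_strips; return strip boundaries and per-strip costs."""
    target = column_load.sum() / n_strips
    boundaries, acc = [0], 0.0
    for j, load in enumerate(column_load):
        acc += load
        if acc >= target and len(boundaries) < n_strips:
            boundaries.append(j + 1)
            acc = 0.0
    boundaries.append(len(column_load))
    strips = [(boundaries[i], boundaries[i + 1]) for i in range(len(boundaries) - 1)]
    costs = [column_load[a:b].sum() for a, b in strips]
    return strips, costs

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic column-wise count of ants in a 200-column environment.
    column_load = rng.poisson(lam=50, size=200).astype(float)
    strips, costs = strip_partition(column_load, n_strips=4)
    print(strips, [round(c) for c in costs])
```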
530 Model of Collaborative UAV Swarm Toward Coordination and Control Mechanisms Study [abstract]
Abstract: In recent years, thanks to the low cost of deploying and maintaining an Unmanned Aerial Vehicle (UAV) system and the possibility of operating them in areas inaccessible or dangerous for human pilots, UAVs have attracted much research attention in both the military field and civilian applications. In order to deal with more sophisticated tasks, such as searching for survival points or monitoring and tracking multiple targets, the application of UAV swarms is foreseen. This requires more complex control, communication and coordination mechanisms. However, these mechanisms are difficult to test and analyze under flight dynamic conditions. These multi-UAV scenarios are by their nature well suited to be modeled and simulated as multi-agent systems. The first step in modeling a multi-agent system is to construct the agent model, namely an accurate model representing the behavior, constraints and uncertainties of UAVs. In this paper we introduce our approach to modeling a UAV as an agent in terms of multi-agent system principles. The model is constructed to satisfy the need for a simulation environment that researchers can use to evaluate and analyze swarm control mechanisms. Simulation results of a case study are provided to demonstrate one possible use of this approach.
Xueping Zhu, Zhengchun Liu, Jun Yang

ICCS 2015 Main Track (MT) Session 12

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: V101

Chair: George Kampis

569 Simulation of Alternative Fuel Markets using Integrated System-Dynamics Model of Energy System [abstract]
Abstract: An integrated system-dynamics model of energy systems is employed to explore the transition process towards alternative fuel markets. The model takes into account the entire energy system, including interactions among supply sectors, energy prices, infrastructure and fuel demand. The paper presents the model structure and describes the algorithm for the short-term and long-term simulation of energy markets. The integrated model is applied to the renewable-based energy system of Iceland as a case study to simulate the transition path towards alternative fuel markets over the time horizon 2015-2050. An optimistic transition scenario towards hydrogen and biofuels is investigated for the numerical results. The market simulation algorithm effectively exhibits the continual transition towards equilibrium as market prices dynamically adjust to changes in supply and demand. The application of the model has the potential to provide important policy insights, as it can simulate the impact of different policy instruments on both the supply and demand sides.
Ehsan Shafiei, Brynhildur Davíðsdóttir, Jonathan Leaver, Hlynur Stefansson, Eyjólfur Ingi Ásgeirsson
586 Information Impact on Transportation Systems [abstract]
Abstract: With a broader distribution of personal smart devices and with an increasing availability of advanced navigation tools, more drivers can have access to real time information regarding the traffic situation. Our research focuses on determining how using the real time information about a transportation system could influence the system itself. We developed an agent based model to simulate the effect of drivers using real time information to avoid traffic congestion. Experiments reveal that the system's performance is influenced by the number of participants that have access to real time information. We also discover that, in certain circumstances, the system performance when all participants have information is no different from, and perhaps even worse than, when no participant has access to information.
Sorina Litescu, Vaisagh Viswanathan, Michael Lees, Alois Knoll and Heiko Aydt
609 The Multi-Agent Simulation-Based Framework for Optimization of Detectors Layout in Public Crowded Places [abstract]
Abstract: In this work a framework for detector layout optimization based on multi-agent simulation is proposed. Its main intention is to provide a decision support team with a tool for the automatic design of social threat detection systems for public crowded places. Containing a number of distributed detectors, such a system performs detection and identification of threat carriers. The generic detector model used in the framework allows considering the detection of various types of threats, e.g. infections, explosives, drugs, radiation. The underlying agent-based models provide data on social mobility, which is used along with a probability-based quality assessment model within the optimization process. The implemented multi-criteria optimization scheme is based on a genetic algorithm. As an experimental study, the framework has been applied to obtain the optimal detector layout for Pulkovo airport.
Nikolay Butakov, Denis Nasonov, Konstantin Knyazkov, Vladislav Karbovskii, Yulya Chuprova
626 Towards Ensemble Simulation of Complex Systems [abstract]
Abstract: The paper presents early-stage research aimed at the development of a comprehensive conceptual and technological framework for ensemble-based simulation of complex systems. The concept of a multi-layer ensemble is presented as a background for further development of the framework to cover different kinds of ensembles: ensembles of system states, data ensembles, and model ensembles. A formal description of a hybrid model is provided as a core concept for ensemble-based complex system simulation. A water level forecasting application is used to show selected ensemble classes covered by the proposed framework.
Sergey Kovalchuk, Alexander Boukhanovsky

ICCS 2015 Main Track (MT) Session 13

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: V101

Chair: Witold Dzwinel

712 Collaborative Knowledge Fusion by Ad-Hoc Information Distribution in Crowds [abstract]
Abstract: We study situations (such as a city festival) where, in the case of a phone signal outage, cell phones can communicate opportunistically (for instance, using WiFi or Bluetooth), and we want to understand and control information spreading. A particular question is how to prevent false information from spreading and how to facilitate the spreading of useful (true) information. We introduce collaborative knowledge fusion as the operation by which individual, local knowledge claims are "merged". Such fusion events are local, i.e. they happen upon physical meetings of knowledge providers. We evaluate different methods for collaborative knowledge fusion and study the conditions for, and tradeoffs of, convergence to a globally true knowledge state under various conditions.
George Kampis, Paul Lukowicz
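As a purely illustrative companion to paper 712, the sketch below shows one possible pairwise fusion rule applied when two agents meet: claims are merged by confidence, with agreement slightly reinforcing confidence. The paper evaluates several fusion methods; none of them is reproduced here.

```python
# Illustrative pairwise "knowledge fusion" on encounters (companion to paper
# 712). The confidence-weighted merge rule is our own choice, not the paper's.
def fuse(knowledge_a, knowledge_b):
    """Each knowledge base maps a claim id to (value, confidence in [0, 1]).
    On an encounter both agents adopt, per claim, the value with the higher
    confidence, and slightly reinforce confidence when they agree."""
    merged = {}
    for claim in set(knowledge_a) | set(knowledge_b):
        va, ca = knowledge_a.get(claim, (None, 0.0))
        vb, cb = knowledge_b.get(claim, (None, 0.0))
        if va == vb and va is not None:
            merged[claim] = (va, min(1.0, 0.5 * (ca + cb) + 0.1))
        else:
            merged[claim] = (va, ca) if ca >= cb else (vb, cb)
    return merged

if __name__ == "__main__":
    a = {"exit_3_open": (True, 0.9), "stage_delayed": (False, 0.3)}
    b = {"exit_3_open": (True, 0.6), "stage_delayed": (True, 0.7)}
    shared = fuse(a, b)   # both agents leave the encounter with this state
    print(shared)
```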
220 Modeling Deflagration in Energetic Materials using the Uintah Computational Framework [abstract]
Abstract: Predictive computer simulations of large-scale deflagration and detonation are dependent on the availability of robust reaction models embedded in a computational framework capable of running on massively parallel computer architectures. We have been developing such models in the Uintah Computational Framework, which is capable of scaling up to 512k cores. Our particular interest is in predicting the deflagration-to-detonation transition (DDT) for accident scenarios involving large numbers of energetic devices; the 2005 truck explosion in Spanish Fork Canyon, UT is a prototypical example. Our current reaction model adapts components from Ward, Son and Brewster to describe the effects of pressure and initial temperature on deflagration, from Berghout et al. for burning in cracks in damaged explosives, and from Souers for describing fully developed detonation. The reaction model has been subjected to extensive validation against experimental tests. Current efforts are focused on the effects of the computational grid resolution on multiple aspects of deflagration and the transition to detonation.
Jacqueline Beckvermit, Todd Harman, Andrew Bezdjian, Charles Wight
237 Fast Equilibration of Coarse-Grained Polymeric Liquids [abstract]
Abstract: The study of macromolecular systems may require large computer simulations that are too time consuming and resource intensive to execute in full atomic detail. The integral equation coarse-graining approach by Guenza and co-workers enables the exploration of longer time and spatial scales without sacrificing thermodynamic consistency, by approximating collections of atoms using analytically-derived soft-sphere potentials. Because coarse-grained (CG) characterizations evolve polymer systems far more efficiently than the corresponding united atom (UA) descriptions, we can feasibly equilibrate a CG system to a reasonable geometry, then transform back to the UA description for a more complete equilibration. Automating the transformation between the two different representations simultaneously exploits CG efficiency and UA accuracy. By iteratively mapping back and forth between CG and UA, we can quickly guide the simulation towards a configuration that would have taken many more time steps within the UA representation alone. Accomplishing this feat requires a diligent workflow for managing input/output coordinate data between the different steps, deriving the potential at runtime, and inspecting convergence. In this paper, we present a lightweight workflow environment that accomplishes such fast equilibration without user intervention. The workflow supports automated mapping between the CG and UA descriptions in an iterative, scalable, and customizable manner. We describe this technique, examine its feasibility, and analyze its correctness.
David Ozog, Jay McCarty, Grant Gossett, Allen Malony and Marina Guenza
392 Massively Parallel Simulations of Hemodynamics in the Human Vasculature [abstract]
Abstract: We present a computational model of three-dimensional and unsteady hemodynamics within the primary large arteries in the human on 1,572,864 cores of the IBM Blue Gene/Q. Models of large regions of the circulatory system are needed to study the impact of local factors on global hemodynamics and to inform next generation drug delivery mechanisms. The HARVEY code successfully addresses key challenges that can hinder effective solution of image-based hemodynamics on contemporary supercomputers, such as limited memory capacity and bandwidth, flexible load balancing, and scalability. This work is the first demonstration of large (> 500 cm) fluid dynamics simulations of the circulatory system modeled at resolutions as high as 10 μm.
Amanda Randles, Erik W. Draeger and Peter E. Bailey
402 Parallel performance of an IB-LBM suspension simulation framework [abstract]
Abstract: We present performance results from ficsion, a general purpose parallel suspension solver employing the Immersed-Boundary lattice-Boltzmann method (IB-LBM). ficsion is built on top of the open-source LBM framework Palabos, making use of its data structures and their inherent parallelism. We briefly describe the implementation and present weak and strong scaling results for simulations of dense red blood cell suspensions. Despite its complexity, the simulations demonstrate fairly good, close-to-linear scaling in both the weak and strong scaling scenarios.
Lampros Mountrakis, Eric Lorenz, Orestis Malaspinas, Saad Alowayyed, Bastien Chopard and Alfons G. Hoekstra

ICCS 2015 Main Track (MT) Session 14

Time and Date: 16:20 - 18:00 on 2nd June 2015

Room: V101

Chair: Lampros Mountrakis

405 A New Stochastic Cellular Automata Model for Traffic Flow Simulation with Driver's Behavior Prediction [abstract]
Abstract: In this work we introduce a novel, flexible and robust traffic flow cellular automata model. Our proposal includes two important stages that make it possible to consider different profiles of driver behavior. We first consider the motion expectation of the cars that are in front of each driver. Secondly, we define how a specific car decides to get around, considering the foreground traffic configuration. Our model uses stochastic rules for both situations, adjusting the probability density function of the Beta distribution to three neighborhood driver behavior profiles, with different parameters of the Beta distribution for each one.
Marcelo Zamith, Regina Leal-Toledo, Esteban Clua, Elson Toledo and Guilherme Magalhães
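To illustrate how profile-specific Beta parameters can enter a stochastic cellular automata update of the kind described in paper 405, the sketch below draws a per-step "boldness" value from a Beta distribution and uses it in a simple acceleration/braking decision. The (alpha, beta) values and the update rule are our own assumptions, not the paper's calibrated model.

```python
# Illustrative sketch for paper 405: per-profile driver behaviour drawn from a
# Beta distribution. The (alpha, beta) values and the braking rule are our own
# assumptions, shown only to make the idea concrete.
import numpy as np

BETA_PARAMS = {            # hypothetical parameters per behaviour profile
    "cautious":   (2.0, 5.0),
    "average":    (5.0, 5.0),
    "aggressive": (5.0, 2.0),
}

def decide_speed(v, gap, v_max, profile, rng):
    """One stochastic speed update: accelerate if the foreground gap allows it,
    otherwise brake; the Beta draw modulates how boldly the driver behaves."""
    boldness = rng.beta(*BETA_PARAMS[profile])      # value in (0, 1)
    desired = min(v + 1, v_max)
    if gap > desired:                               # enough expected room ahead
        return desired if rng.random() < boldness else v
    return min(gap, v) if rng.random() < boldness else max(min(gap, v) - 1, 0)

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    for profile in BETA_PARAMS:
        v = decide_speed(v=3, gap=5, v_max=5, profile=profile, rng=rng)
        print(profile, v)
```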
557 A Model Driven Approach to Water Resource Analysis based on Formal Methods and Model Transformation [abstract]
Abstract: Several frameworks have been proposed in the literature to cope with critical infrastructure modelling issues, and almost all rely on simulation techniques. However, simulation is not enough for critical systems, where any problem may lead to considerable losses of money and even human lives. Formal methods are widely used to enact exhaustive analyses of these systems, but their complexity grows with system dimension and heterogeneity. In addition, experts in application domains may not be familiar with formal modelling techniques. A way to manage the complexity of the analysis is the use of Model Based Transformation techniques: analysts can express their models in the way they are used to, and automatic algorithms translate the original models into analysable ones, reducing analysis complexity in a completely transparent way. In this work we describe an automatic transformation algorithm generating hybrid automata for the analysis of a natural water supply system. We use a real system located in the South of Italy as a case study.
Francesco Moscato, Flora Amato, Francesco De Paola, Crescenzo Diomaiuta, Nicola Mazzocca, Maurizio Giugni
175 An Invariant Framework for Conducting Reproducible Computational Science [abstract]
Abstract: Computational reproducibility depends on being able to isolate necessary and sufficient computational artifacts and preserve them for later re-execution. Both isolation and preservation of artifacts can be challenging due to the complexity of existing software and systems and the resulting implicit dependencies, resource distribution, and shifting compatibility of systems as time progresses---all conspiring to break the reproducibility of an application. Sandboxing is a technique that has been used extensively in OS environments for isolation of computational artifacts. Several tools were proposed recently that employ sandboxing as a mechanism to ensure reproducibility. However, none of these tools preserve the sandboxed application for re-distribution to a larger scientific community---aspects that are equally crucial for ensuring reproducibility as sandboxing itself. In this paper, we describe a combined sandboxing and preservation framework, which is efficient, invariant and practical for large-scale reproducibility. We present case studies of complex high energy physics applications and show how the framework can be useful for sandboxing, preserving and distributing applications. We report on the completeness, performance, and efficiency of the framework, and suggest possible standardization approaches.
Haiyan Meng, Rupa Kommineni, Quan Pham, Robert Gardner, Tanu Malik and Douglas Thain
264 Very fast interactive visualization of large sets of high-dimensional data [abstract]
Abstract: The embedding of high-dimensional data into 2D (or 3D) space is the most popular way of data visualization. Despite recent advances in the development of very accurate dimensionality reduction algorithms, such as BH-SNE, Q-SNE and LoCH, their relatively high computational complexity still remains an obstacle for interactive visualization of truly large sets of high-dimensional data. We show that a new variant of the multidimensional scaling method (MDS) – nr-MDS – can be up to two orders of magnitude faster than the modern dimensionality reduction algorithms. We postulate its linear O(M) computational and memory complexity. Simultaneously, our method preserves in 2D and 3D target spaces the high separability of data, similar to that obtained by the state-of-the-art dimensionality reduction algorithms. We present the effects of applying nr-MDS to the visualization of data repositories such as 20 Newsgroups (M=18000), MNIST (M=70000) and REUTERS (M=267000).
Witold Dzwinel, Rafał Wcisło
315 Automated Requirements Extraction for Scientific Software [abstract]
Abstract: Requirements engineering is crucial for software projects, but formal requirements engineering is often ignored in scientific software projects. Scientists often do not see the benefit of directing their time and effort towards documenting requirements. Additionally, there is a lack of requirements engineering knowledge amongst scientists who develop software. We aim to help scientists easily recover and reuse requirements without acquiring prior requirements engineering knowledge. We apply an automated approach to extract requirements for scientific software from available knowledge sources, such as user manuals and project reports. The approach employs natural language processing techniques to match defined patterns in input text. We have evaluated the approach in three different scientific domains, namely seismology, building performance and computational fluid dynamics. The evaluation results show that 78-97% of the extracted requirement candidates are correctly extracted as early requirements.
Yang Li, Emitzá Guzmán Ortega, Konstantina Tsiamoura, Florian Schneider, Bernd Bruegge

ICCS 2015 Main Track (MT) Session 15

Time and Date: 10:15 - 11:55 on 3rd June 2015

Room: V101

Chair: Dirk De Vos

387 Interactive 180º Rear Projection Public Relations [abstract]
Abstract: In the globalized world, good products may not be enough to reach potential clients if creative marketing strategies are not well delineated. Public relations are also important when it comes to capturing clients' attention, making the first contact between them and the company's products while being persuasive enough to gain the confidence of the client that the company has the right products to fit their needs. A virtual public relations installation is proposed, combining technology and a human-like public relations agent capable of interacting, using gestures and sound, with potential clients placed within 180 degrees in front of the installation. Four Microsoft Kinects were used to develop the 180-degree interaction model, which allows the recognition of gestures, sound sources and words, extracts the face and body of the user, and tracks user positions (including a heat map).
Ricardo Alves, Aldric Négrier, Luís Sousa, J.M.F Rodrigues, Paulo Felizberto, Miguel Gomes, Paulo Bica
11 Identification of DNA Motif with Mutation [abstract]
Abstract: The conventional way of identifying possible motif sequences in a DNA strand is to use representative scalar weight matrix for searching good match substring alignments. However, this approach, solely based on match alignment information, is susceptible to a high number of ambiguous sites or false positives if the motif sequences are not well conserved. A significant amount of time is then required to verify these sites for the suggested motifs. Hence in this paper, the use of mismatch alignment information in addition to match alignment information for DNA motif searching is proposed. The objective is to reduce the number of ambiguous false positives encountered in the DNA motif searching, thereby making the process more efficient for biologists to use.
Jian-Jun Shu
231 A software tool for the automatic quantification of the left ventricle myocardium hyper-trabeculation degree [abstract]
Abstract: Isolated left ventricular non-compaction (LVNC) is a myocardial disorder characterised by prominent ventricular trabeculations and deep recesses extending from the LV cavity to the subendocardial surface of the LV. Up to now, there has been no common and stable solution in the medical community for quantifying and assessing non-compacted cardiomyopathy. A software tool for the automatic quantification of the exact hyper-trabeculation degree in the left ventricle myocardium is designed, developed and tested. This tool is based on medical experience, but the possibility of human appreciation error has been eliminated. The input data for this software are the cardiac images of the patients obtained by means of magnetic resonance. The output results are the percentage quantification of the trabecular zone with respect to the compacted area. This output is compared with human processing performed by medical specialists. The software proves to be a valuable tool to aid diagnosis, thus saving valuable diagnosis time.
Gregorio Bernabe, Javier Cuenca, Pedro E. López de Teruel, Domingo Gimenez, Josefa González-Carrillo
453 Blending Sentence Optimization Weights of Unsupervised Approaches for Extractive Speech Summarization [abstract]
Abstract: This paper evaluates the performance of two unsupervised approaches, Maximum Marginal Relevance (MMR) and a concept-based global optimization framework, for speech summarization. Automatic summarization is a very useful technique that can help users browse large amounts of data. This study focuses on automatic extractive summarization of a multi-dialogue speech corpus. We propose improved methods that blend the unsupervised approaches at the sentence level. Sentence-level information is leveraged to improve the linguistic quality of the selected summaries. First, these scores are used to filter sentences for concept extraction and concept weight computation. Second, we pre-select a subset of candidate summary sentences according to their sentence weights. Last, we extend the optimization function to a joint optimization of concept and sentence weights to cover both important concepts and important sentences. Our experimental results show that these methods can improve system performance compared to the concept-based optimization baseline for both human transcripts and ASR output. The best scores are achieved by combining all three approaches, which is significantly better than the baseline system.
Noraini Seman, Nursuriati Jamil
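Maximum Marginal Relevance, one of the two unsupervised components evaluated in paper 453, can be sketched compactly as below; the term-frequency similarity and the lambda value are illustrative, and the paper's blending with concept weights and its ASR-specific handling are not reproduced.

```python
# Minimal Maximum Marginal Relevance (MMR) sentence selection, illustrating one
# of the two unsupervised components discussed in paper 453. Similarity measure
# and lambda are illustrative choices.
from collections import Counter
import math

def cosine(c1, c2):
    num = sum(c1[t] * c2[t] for t in c1)
    den = math.sqrt(sum(v * v for v in c1.values())) * \
          math.sqrt(sum(v * v for v in c2.values()))
    return num / den if den else 0.0

def mmr_summary(sentences, query, budget=3, lam=0.7):
    vecs = [Counter(s.lower().split()) for s in sentences]
    qvec = Counter(query.lower().split())
    selected = []
    while len(selected) < budget and len(selected) < len(sentences):
        best, best_score = None, -1e9
        for i, v in enumerate(vecs):
            if i in selected:
                continue
            relevance = cosine(v, qvec)
            redundancy = max((cosine(v, vecs[j]) for j in selected), default=0.0)
            score = lam * relevance - (1 - lam) * redundancy  # MMR criterion
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return [sentences[i] for i in selected]

if __name__ == "__main__":
    sents = ["the budget meeting is moved to friday",
             "friday meeting will cover the budget and hiring",
             "lunch will be served after the meeting"]
    print(mmr_summary(sents, query="budget meeting friday", budget=2))
```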
513 The CardioRisk Project: Improvement of Cardiovascular Risk Assessment [abstract]
Abstract: The CardioRisk project addresses coronary artery disease (CAD), namely the management of myocardial infarction (MI) patients. The main goal is the development of personalized clinical models for cardiovascular (CV) risk assessment of acute events (e.g. death and new hospitalization), in order to stratify patients according to their care needs. This paper presents an overview of the scientific and technological issues that are under research and development. Three major scientific challenges can be identified: i) the development of fusion approaches to merge CV risk assessment tools; ii) strategies for the grouping (clustering) of patients; iii) biosignal processing techniques to achieve personalized diagnosis. At the end of the project, a set of algorithms/models must properly address these three challenges. Additionally, a clinical platform was implemented, integrating the developed models and algorithms. This platform supports a clinical observational study (100 patients) that is being carried out at Leiria Hospital Centre to validate the developed approach. Inputs from the hospital information system (demographics, biomarkers, clinical exams) are considered, as well as an ECG signal acquired with a Holter device. A real patient dataset provided by Santa Cruz Hospital, Portugal, comprising N=460 ACS-NSTEMI patients, is also used to perform initial validations (individual algorithms). The CardioRisk team is composed of two research institutions, the University of Coimbra (Portugal) and Politecnico di Milano (Italy), together with Leiria Hospital Centre (a Portuguese public hospital).
Simão Paredes, Teresa Rocha, Paulo de Carvalho, Jorge Henriques, Diana Mendes, Ricardo Cabete, Ramona Cabiddu, Anna Maria Bianchi and João Morais

ICCS 2015 Main Track (MT) Session 16

Time and Date: 14:10 - 15:50 on 3rd June 2015

Room: V101

Chair: Jian-Jun Shu

563 Parallel metaheuristics in computational biology: an asynchronous cooperative enhanced Scatter Search method [abstract]
Abstract: Metaheuristics are gaining increased attention as efficient solvers for hard global optimization problems arising in bioinformatics and computational systems biology. Scatter Search (SS) is one of the recent outstanding algorithms in that class. However, its application to very hard problems, like parameter estimation in dynamic models of systems biology, still results in excessive computation times. In order to reduce the computational cost of SS and improve its success rate, several research efforts have proposed different variants of the algorithm, including parallel approaches. This work presents an asynchronous Cooperative enhanced Scatter Search (aCeSS) based on the parallel execution of different enhanced Scatter Search threads and cooperation between them. The main features of the proposed solution are: low overhead in the cooperation step, by means of an asynchronous protocol to exchange information between processes; a more effective cooperation step, since the exchange of information is driven by the quality of the solution obtained in each process rather than by elapsed time; and optimal use of available resources, thanks to a completely distributed approach that avoids idle processes at any moment. Several challenging parameter estimation problems from the domain of computational systems biology are used to assess the efficiency of the proposal and evaluate its scalability in a parallel environment.
David R Penas, Patricia Gonzalez, Jose A. Egea, Julio R. Banga, Ramon Doallo
716 Simulating leaf growth dynamics through Metropolis-Monte Carlo based energy minimization [abstract]
Abstract: Throughout their life span plants maintain the ability to generate new organs such as leaves. This is normally done in an orderly way by activating limited groups of dormant cells to divide and grow. It is currently not understood how that process is precisely regulated. We have used the VirtualLeaf framework for plant organ growth modelling to simulate the typical developmental stages of leaves of the model plant Arabidopsis thaliana. For that purpose the Hamiltonian central to the Monte-Carlo based mechanical equilibration of VirtualLeaf was modified. A basic two-dimensional model was defined starting from a rectangular grid with a dynamic phytohormone gradient that spatially instructs the cells in the growing leaf. Our results demonstrate that such a mechanism can indeed reproduce various spatio-temporal characteristics of leaf development and provides clues for further model development.
Dirk De Vos, Emil De Borger, Jan Broeckhove and Gerrit Ts Beemster
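The Metropolis-based mechanical equilibration that VirtualLeaf relies on (and whose Hamiltonian paper 716 modifies) follows the generic accept/reject pattern sketched below. The stand-in Hamiltonian, combining an area-elasticity term with a morphogen-dependent target area, is our own illustration and not the authors' modified energy function.

```python
# Generic Metropolis acceptance step of the kind used for mechanical
# equilibration in VirtualLeaf (context for paper 716). The Hamiltonian below
# is a stand-in of our own, not the authors' modified Hamiltonian.
import math, random

def hamiltonian(cells, lam_area=1.0, growth_gain=0.5):
    """Sum over cells of lambda * (A - A_target(morphogen))^2."""
    h = 0.0
    for area, rest_area, morphogen in cells:
        target = rest_area * (1.0 + growth_gain * morphogen)
        h += lam_area * (area - target) ** 2
    return h

def metropolis_step(cells, propose_move, temperature=1.0, rng=random):
    """Propose a configuration change and accept it with the Metropolis rule."""
    h_old = hamiltonian(cells)
    new_cells = propose_move(cells)
    dh = hamiltonian(new_cells) - h_old
    if dh <= 0 or rng.random() < math.exp(-dh / temperature):
        return new_cells, True
    return cells, False

if __name__ == "__main__":
    random.seed(0)
    # (current area, resting area, local morphogen level) per cell -- toy values.
    cells = [(1.0, 1.0, 0.2), (0.9, 1.0, 0.8)]
    grow = lambda cs: [(a * 1.05, r, m) for a, r, m in cs]  # trial move: dilate cells
    cells, accepted = metropolis_step(cells, grow, temperature=0.1)
    print(accepted, cells)
```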
118 Clustering Acoustic Events in Environmental Recordings for Species Richness Surveys [abstract]
Abstract: Environmental acoustic recordings can be used to perform avian species richness surveys, whereby a trained ornithologist can observe the species present by listening to the recording. This could be made more efficient by using computational methods for iteratively selecting the richest parts of a long recording for the human observer to listen to, a process known as “smart sampling”. This allows scaling up to much larger ecological datasets. In this paper we explore computational approaches based on information and diversity of selected samples. We propose to use an event detection algorithm to estimate the amount of information present in each sample. We further propose to cluster the detected events for a better estimate of this amount of information. Additionally, we present a time dispersal approach to estimating diversity between iteratively selected samples. Combinations of approaches were evaluated on seven one-day recordings that have been manually annotated by bird watchers. The results show that on average all the methods we have explored would allow annotators to observe more new species in fewer minutes compared to a baseline of random sampling at dawn.
Philip Eichinski, Laurianne Sitbon, Paul Roe
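A heavily simplified version of the "smart sampling" idea from paper 118 is sketched below: one-minute samples are selected greedily by how many acoustic event clusters they would newly expose, plus a small time-dispersal bonus. The scoring weights and data format are our own choices, not the configurations evaluated in the paper.

```python
# Simplified greedy "smart sampling" loop in the spirit of paper 118. Scoring
# weights and data layout are our own illustrative choices.
def smart_sample(samples, n_select=5, dispersal_weight=0.1):
    """samples: list of (minute_of_day, set_of_event_cluster_ids)."""
    heard, chosen = set(), []
    for _ in range(min(n_select, len(samples))):
        def score(item):
            minute, clusters = item
            novelty = len(clusters - heard)           # clusters not yet heard
            dispersal = min((abs(minute - m) for m, _ in chosen), default=1440)
            return novelty + dispersal_weight * dispersal / 60.0
        best = max((s for s in samples if s not in chosen), key=score)
        chosen.append(best)
        heard |= best[1]
    return [m for m, _ in chosen]

if __name__ == "__main__":
    day = [(360, {1, 2, 3}), (365, {1, 2}), (700, {4}), (1080, {2, 5}), (1100, {5})]
    print(smart_sample(day, n_select=3))
```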
337 On the Effectiveness of Crowd Sourcing Avian Nesting Video Analysis at Wildlife@Home [abstract]
Abstract: Wildlife@Home is a citizen science project developed to provide wildlife biologists a way to swiftly analyze the massive quantities of data that they can amass during video surveillance studies. The project has been active for two years, with over 200 volunteers who have participated in providing observations through a web interface where they can stream video and report the occurrences of various events within that video. Wildlife@Home is currently analyzing avian nesting video from three species: the Sharp-tailed Grouse (Tympanuchus phasianellus), an indicator species which plays a role in determining the effect of North Dakota's oil development on the local wildlife; the Interior Least Tern (Sternula antillarum), a federally listed endangered species; and the Piping Plover (Charadrius melodus), a federally listed threatened species. Video comes from 105 grouse, 61 plover and 37 tern nests from multiple nesting seasons, and consists of over 85,000 hours (13 terabytes) of 24/7 uncontrolled outdoor surveillance video. This work describes the infrastructure supporting this citizen science project, and examines the effectiveness of two different interfaces for crowd sourcing: a simpler interface where users watch short clips of video and report if an event occurred within that video, and a more involved interface where volunteers can watch entire videos and provide detailed event information including beginning and ending times for events. User observations are compared against expert observations made by wildlife biology research assistants, and are shown to be quite effective given strategies used in the project to promote accuracy and correctness.
Travis Desell, Kyle Goehner, Alicia Andes, Rebecca Eckroad, Susan Felege
594 Prediction of scaling resistance of concrete modified with high-calcium fly ash using classification methods [abstract]
Abstract: The goal of the study was to apply machine learning methods to create rules for the prediction of the surface scaling resistance of concrete modified with high-calcium fly ash. To determine the scaling durability, the Borås method was used, according to the European Standard procedure (PKN-CEN/TS 12390-9:2007). The results of the experiments were used as a training set to generate rules indicating the relation between material composition and scaling resistance. The classifier generated by the BFT algorithm from the WEKA workbench can be used as a tool for the adequate classification of plain concretes and concretes modified with high-calcium fly ash as materials resistant or not resistant to surface scaling.
Michal Marks, Maria Marks

ICCS 2015 Main Track (MT) Session 17

Time and Date: 10:15 - 11:55 on 2nd June 2015

Room: V206

Chair: Ilya Valuev

59 Swarming collapse under limited information flow between individuals [abstract]
Abstract: Information exchange is critical to the execution and effectiveness of natural and artificial collective behaviors: fish schooling, birds flocking, amoebae aggregating or robots swarming. In particular, the emergence of dynamic collective responses in swarms confronted with complex environments underscores the central role played by social transmission of information. Here, the different possible origins of information flow bottlenecks are identified, and the associated effects on dynamic collective behaviors are revealed using a combination of network-, control- and information-theoretic elements applied to a group of interacting self-propelled particles (SPPs). Specifically, we consider a minimalistic agent-based model consisting of N topologically interacting SPPs moving at constant speed through a domain having periodic boundaries. Each individual agent is characterized by its direction of travel, and a canonical swarming behavior of the consensus type is examined. To account for the finiteness of the bandwidth, we consider synchronous information exchanges occurring every T = 1/2B, where the unit interval T is the minimum time interval between condition changes of the data transmission signal. The agents move synchronously at discrete time steps T by a fixed distance upon receiving informational signals from their neighbors, as per a linear update rule. We find a sufficient condition on the agents’ bandwidth B that guarantees the effectiveness of swarming, while also highlighting the profound connection with the topology of the underlying interaction network. We also show that when decreasing B, the swarming behavior invariably vanishes following a second-order phase transition, irrespective of the intrinsic noise level.
Roland Bouffanais
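A minimal reading of the setup described in paper 59 can be written down as follows: N self-propelled particles with headings, k-nearest-neighbour (topological) interactions, and synchronous consensus updates at intervals T = 1/(2B) set by the bandwidth. The plain heading-averaging rule and all parameter values are our own illustrative choices, not the authors' model as analyzed in the paper.

```python
# Minimal sketch of a bandwidth-limited topological SPP consensus model, in the
# spirit of paper 59. Parameter values and the heading-averaging rule are our
# own illustrative choices.
import numpy as np

def simulate(n=100, k=6, speed=0.03, bandwidth=5.0, noise=0.05,
             steps=500, box=1.0, rng=None):
    rng = np.random.default_rng(rng)
    T = 1.0 / (2.0 * bandwidth)                  # update interval set by bandwidth B
    pos = rng.uniform(0, box, size=(n, 2))
    theta = rng.uniform(-np.pi, np.pi, size=n)
    for _ in range(steps):
        # Topological neighbourhood: k nearest agents (periodic boundaries).
        d = pos[:, None, :] - pos[None, :, :]
        d -= box * np.round(d / box)
        dist = np.linalg.norm(d, axis=-1)
        nbrs = np.argsort(dist, axis=1)[:, 1:k + 1]
        # Synchronous consensus on headings via the mean neighbour direction.
        mean_sin = np.sin(theta[nbrs]).mean(axis=1)
        mean_cos = np.cos(theta[nbrs]).mean(axis=1)
        theta = np.arctan2(mean_sin, mean_cos) + noise * rng.normal(size=n)
        pos = (pos + speed * T * np.column_stack((np.cos(theta), np.sin(theta)))) % box
    # Polar order parameter: 1 means fully aligned swarm, ~0 means disordered.
    return float(np.abs(np.exp(1j * theta).mean()))

if __name__ == "__main__":
    print("order parameter:", round(simulate(rng=0), 3))
```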
63 Multiscale simulation of organic electronics via massive nesting of density functional theory computational kernels [abstract]
Abstract: Modelling is essential for the development of organic electronics, such as organic light emitting diodes (OLEDs), organic field-effect transistors (OFETs) and organic photovoltaics (OPV). OLEDs currently have the most applications, as they are already used in super-thin energy-efficient displays for television sets and smartphones, and in the future will be used for lighting applications, exploiting a world market worth tens of billions of Euros. OLEDs should be further developed to increase their performance and durability, and reduce the currently high production costs. The conventional development process is very costly and time-demanding due to the large number of possible materials which have to be synthesized for the production and characterization of prototypes. Deeper understanding of the relationship between OLED device properties and materials structure allows for high-throughput materials screening and thus a tremendous reduction of development costs. In simulations, the properties of various materials can be virtually and cost-effectively explored and compared to measurements. Based on these results, material composition, morphology and manufacturing processes can be systematically optimized. A typical OLED consists of a stack of multiple crystalline or amorphous organic layers. To compute electronic transport properties, e.g. charge mobilities, a quantum mechanical model, in particular density functional theory (DFT), is commonly employed. Recently, we performed simulations of electronic processes in OLED materials by multiscale modelling, i.e. by integrating sub-models on different length scales, to investigate charge transport in thin films based on experimentally characterized semi-conducting small molecules [1]. Here, we present a novel scale-out computational strategy for a tightly coupled multiscale model consisting of a core region with 500 molecules (5000 pairs) of charge hopping sites and an embedding region containing about 10000 electrostatically interacting molecules. The energy levels of each site depend on the local electrostatic environment, yielding a significant contribution to the energy disorder. This effect is explicitly taken into account in the quantum mechanical sub-model in a self-consistent manner, which, however, represents a considerable computational challenge. Thus the total number of DFT calculations needed is of the order of 10^5-10^6. DFT models scale mostly as N^3, where N is the number of basis functions, which is strongly related to the number of electrons. While DFT is implemented in a number of efficiently parallelized electronic structure codes, the computational scaling of a single DFT calculation applied to amorphous organic materials is naturally limited by the molecule size. After every iteration cycle, data are exchanged between all molecules of the self-consistency loop to update the electrostatic environment of each site. This requires that the DFT sub-model is executed employing a second-level parallelisation with a special scheduling strategy. The realisation of this model on high performance computing (HPC) systems has several issues: i) the DFT sub-models, which are stand-alone applications (such as NWChem or TURBOMOLE), have to be spawned at run time via process forking; ii) large amounts of input and output data have to be transferred to and from the DFT sub-models through the cluster file system.
These two requirements limit the computational performance and often conflict with the usage policies of common HPC environments. In addition, sub-model scheduling and DFT data pre-/post-processing have a severe impact on the overall performance. To address these issues, we designed a DFT application programming interface (API) with different language bindings, such as Python and C++, allowing the linking of DFT sub-models, independent of the concrete DFT implementation, to multiscale models. In addition, we propose solutions for in-core handling of large input and output data as well as efficient scheduling algorithms. In this contribution, we will describe the architecture and outline the technical implementation of a framework for nesting DFT sub-models. We will demonstrate the use and analyse the performance of the framework for multiscale modelling of OLED materials. The framework provides an API which can be used to integrate DFT sub-models in other applications. [1] P. Friederich, F. Symalla, V. Meded, T. Neumann and W. Wenzel, “Ab Initio Treatment of Disorder Effects in Amorphous Organic Materials: Toward Parameter Free Materials Simulation”, Journal of Chemical Theory and Computation 10, 3720–3725 (2014).
Angela Poschlad, Pascal Friederich, Timo Strunk, Wolfgang Wenzel and Ivan Kondov
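The sketch below gives a hypothetical flavour of the nesting API described in paper 63: many DFT sub-model evaluations are scheduled from Python with a process pool, while inputs and outputs are kept in memory and only written to per-job scratch files. The executable name, input format, parser and function names are placeholders of our own; they are not the framework's actual API nor real NWChem/TURBOMOLE invocations.

```python
# Hypothetical sketch of a DFT-nesting layer of the kind described in paper 63.
# The command line, file format, parser and callback names are placeholders,
# not the framework's API and not real NWChem/TURBOMOLE invocations.
from concurrent.futures import ProcessPoolExecutor
import subprocess, tempfile, os

DFT_COMMAND = ["my_dft_code"]          # placeholder for the real DFT executable

def parse_energy(stdout):
    return 0.0                         # placeholder: extract site energy from output

def run_dft_site(args):
    """Run one embedded-site DFT calculation and return (site_id, energy)."""
    site_id, geometry_text, point_charges = args
    with tempfile.TemporaryDirectory() as workdir:
        inp = os.path.join(workdir, "site.inp")
        with open(inp, "w") as fh:                 # in-core data -> scratch input
            fh.write(geometry_text)
            fh.writelines(f"charge {q} {x} {y} {z}\n" for q, x, y, z in point_charges)
        try:
            out = subprocess.run(DFT_COMMAND + [inp], capture_output=True, text=True)
            energy = parse_energy(out.stdout)
        except FileNotFoundError:                  # placeholder binary not installed
            energy = float("nan")
    return site_id, energy

def self_consistent_disorder(sites, charges_of, max_workers=8, cycles=3):
    """Outer self-consistency loop: recompute every site's energy level while
    the electrostatic environment is updated from the previous cycle."""
    energies = {}
    for _ in range(cycles):
        jobs = [(sid, geom, charges_of(sid, energies)) for sid, geom in sites.items()]
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            for sid, e in pool.map(run_dft_site, jobs):
                energies[sid] = e
    return energies
```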
189 Optimization and Practical Use of Composition Based Approaches Towards Identification and Collection of Genomic Islands and Their Ontology in Prokaryotes [abstract]
Abstract: Motivation: Horizontally transferred genomic islands (islands, GIs) have been referred to as important factors contributing towards the emergence of pathogens and outbreak instances. The development of tools for the identification of such elements and for retracing their distribution patterns will help to understand how such cases arise. Sequence composition has been used to identify islands, infer their phylogeny, and determine their relative times of insertion. The collection and curation of known islands will enhance insight into island ontology and flow. Results: This paper introduces the merger of the SeqWord Genomic Islands Sniffer (SWGIS), which utilizes composition-based approaches for the identification of islands in bacterial genomic sequences, and the Predicted Genomic Islands (Pre_GI) database, which houses 26,744 islands found in 2,407 bacterial plasmids and chromosomes. SWGIS is a standalone program that detects genomic islands using a set of optimized parametric measures with estimates of acceptable false positive and false negative rates. Pre_GI is a novel repository that includes island ontology and flux. This study furthermore illustrates the need for parametric optimization in the prediction of islands to minimize false negative and false positive predictions. In addition, Pre_GI emphasizes the practicality of the compounded knowledge a database affords in the detection and visualization of ontological links between islands. Availability: SWGIS is freely available on the web at http://www.bi.up.ac.za/SeqWord/sniffer/index.html. Pre_GI is freely accessible at http://pregi.bi.up.ac.za/index.php.
Rian Pierneef, Oliver Bezuidt, Oleg Reva

ICCS 2015 Main Track (MT) Session 18

Time and Date: 14:10 - 15:50 on 2nd June 2015

Room: V206

Chair: Roland Bouffanais

603 Path Optimization Using Nudged Elastic Band Method for Analysis of Seismic Travel Time [abstract]
Abstract: A path optimization method is presented here for the analysis of travel times of seismic waves. The method is an adaptation of the nudged elastic band method to ray tracing, where the path corresponding to minimal travel time is determined. The method is based on a discrete representation of an initial path, followed by iterative optimization of the discretization points so as to minimize the integrated time of propagation along the path. The gradient of the travel time with respect to the location of the discretization points is evaluated and used to find the optimal location of the points. An important aspect of the method is the estimation of the tangent to the path at each discretization point and the elimination of the component of the gradient along the path during the iterative optimization. The distribution of discretization points along the path is controlled by spring forces acting only in the direction of the path tangent. The method is illustrated on two test problems and its performance is compared with previously proposed and actively used methods in the field of seismic data inversion.
Igor Nosikov, Pavel Bessarab, Maksim Klimenko and Hannes Jonsson
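The sketch below shows a minimal nudged-elastic-band-style travel-time optimization in a toy 2-D slowness field: the path is discretized, the travel-time gradient is estimated by finite differences, its component along the path tangent is removed, and spring forces act only along the tangent. The toy velocity model and numerical choices are ours, not the implementation of paper 603.

```python
# Minimal NEB-style travel-time path optimization in a toy 2-D slowness field
# (illustration for paper 603). Finite-difference gradient, tangent estimate and
# velocity model are our own simplifications.
import numpy as np

def slowness(p):
    # Toy model: a slow (low-velocity) Gaussian blob the ray should avoid.
    return 1.0 + 2.0 * np.exp(-np.sum((p - np.array([0.5, 0.5])) ** 2) / 0.02)

def travel_time(path):
    seg = np.diff(path, axis=0)
    lengths = np.linalg.norm(seg, axis=1)
    s_mid = np.array([slowness(0.5 * (path[i] + path[i + 1]))
                      for i in range(len(path) - 1)])
    return float(np.sum(s_mid * lengths))

def neb_relax(path, n_iter=500, step=2e-3, k_spring=1.0, eps=1e-4):
    path = path.copy()
    for _ in range(n_iter):
        # Finite-difference gradient of the travel time w.r.t. interior points.
        grad = np.zeros_like(path)
        for i in range(1, len(path) - 1):
            for d in range(2):
                plus, minus = path.copy(), path.copy()
                plus[i, d] += eps
                minus[i, d] -= eps
                grad[i, d] = (travel_time(plus) - travel_time(minus)) / (2 * eps)
        for i in range(1, len(path) - 1):
            # Tangent from neighbouring images; project the gradient perpendicular
            # to the path and add a spring force acting only along the tangent.
            tau = path[i + 1] - path[i - 1]
            tau /= np.linalg.norm(tau)
            g_perp = grad[i] - np.dot(grad[i], tau) * tau
            f_spring = k_spring * (np.linalg.norm(path[i + 1] - path[i])
                                   - np.linalg.norm(path[i] - path[i - 1])) * tau
            path[i] += step * (-g_perp + f_spring)
    return path

if __name__ == "__main__":
    straight = np.linspace([0.0, 0.0], [1.0, 1.0], 21)   # initial straight-ray guess
    relaxed = neb_relax(straight)
    print("straight:", round(travel_time(straight), 3),
          "relaxed:", round(travel_time(relaxed), 3))
```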
642 Global optimization using saddle traversals (GOUST) [abstract]
Abstract: The GOUST method relies on a fast way to identify first order saddle points on multidimensional objective function surfaces [1,2]. Given a local minimum, the method involves farming out several searches for first order saddle points, and then sliding down on the other side to discover new local minima. The system is then advanced to one of the newly discovered minima. In this way, local minima of the objective function are mapped out with a tendency to progress towards the global minimum. A practical application of this approach in global optimization of a few geothermal reservoir model parameters has recently been demonstrated [3]. The difficulty of an optimization problem, however, generally increases exponentially with the number of degrees of freedom and we investigate GOUST’s ability to search for the global minimum using a group of selected test functions. The performance of GOUST is tested as a function of the number of dimensions and compared with various other global optimization methods such as evolutionary algorithms and basin hopping. [1] 'A Dimer Method for Finding Saddle Points on High Dimensional Potential Surfaces Using Only First Derivatives',G. Henkelman and H. Jónsson, J. Chem. Phys., Vol. 111, page 7010 (1999) [2] 'Comparison of methods for finding saddle points without knowledge of the final states’, R. A. Olsen, G. J. Kroes, G. Henkelman, A. Arnaldsson and H. Jónsson, J. Chem. Phys. vol. 121, 9776 (2004). [3] 'Geothermal model calibration using a global minimization algorithm based on finding saddle points as well as minima of the objective function', M. Plasencia, A. Pedersen, A. Arnaldsson, J-C. Berthet and H. Jónsson, Computers and Geosciences 65, 110 (2014)
Manuel Plasencia Gutierrez, Kusse Sukuta and Hannes Jónsson
655 Memory Efficient Finite Difference Time Domain Implementation for Large Meshes [abstract]
Abstract: In this work we propose a memory-efficient (cache-oblivious) implementation of the Finite Difference Time Domain (FDTD) algorithm. The implementation is based on a recursive space-time decomposition of the mesh update dependency graph into subtasks. The algorithm is suitable for processing large spatial meshes, since, unlike the traditional layer-by-layer update, its efficiency (number of processed mesh cells per unit time) does not drastically drop with growing total mesh size. Additionally, our implementation allows for concurrent execution of subtasks of different size. Depending on the computer architecture, the scheduling may simultaneously encompass different parallelism levels such as vectorization, multithreading and MPI. Concurrent execution mechanisms are switched on (programmed) for subgraphs reaching a suitable size (rank) in the course of recursion. In this presentation we discuss the implementation and analyze the performance of the implemented FDTD algorithm for various computer architectures, including multicore systems and large clusters. We demonstrate FDTD update performance reaching up to 50% of the estimated CPU peak, which is 10-30 times higher than that of traditional FDTD solvers. We also demonstrate an almost perfect parallel scaling of the implemented solver. We discuss the effect of mesh memory layouts such as the Z-curve (Morton order), which increases data locality, and interleaved layouts for vectorized updates.
Ilya Valuev and Andrey Zakirov
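As a small illustration of the Z-curve (Morton order) layout mentioned in paper 655, the sketch below interleaves the bits of the (i, j) cell indices so that nearby cells receive nearby memory addresses; it shows only the layout, not the recursive space-time FDTD decomposition itself.

```python
# Small illustration of Z-curve (Morton order) cell indexing, the layout
# mentioned in paper 655; not the recursive space-time FDTD decomposition.
def part1by1(n: int) -> int:
    """Spread the lower 16 bits of n so there is a zero bit between each."""
    n &= 0xFFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n

def morton2d(i: int, j: int) -> int:
    """Morton (Z-curve) index of cell (i, j): bits of i and j interleaved."""
    return (part1by1(i) << 1) | part1by1(j)

if __name__ == "__main__":
    # Cells of a 4x4 tile listed in Z-order: each 2x2 sub-block stays contiguous.
    order = sorted(((morton2d(i, j), (i, j)) for i in range(4) for j in range(4)))
    print([cell for _, cell in order])
```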
706 Coupled nuclear reactor simulation with the Virtual Environment for Reactor Applications (VERA) [abstract]
Abstract: The Consortium for Advanced Simulation of Light Water Reactors (CASL) was established in July 2010 for the modeling and simulation of commercial nuclear reactors. Led by Oak Ridge National Laboratory (ORNL), CASL also includes three major universities, three industry partners, and three other U.S. National Laboratories. In order to deliver advanced simulation capabilities, CASL has developed and deployed the Virtual Environment for Reactor Applications (VERA), which integrates components for physical phenomena to enable high-fidelity analysis of conditions within nuclear reactors under a wide range of operating conditions. We report on the architecture of the system, why we refer to it as an Environment rather than a Toolkit or Framework, numerical approaches to the coupled nonlinear simulations, and results produced on large HPC systems such as the 300,000-core NVIDIA GPU-accelerated Cray XK7 Titan system at Oak Ridge National Laboratory.
John Turner