Session7 10:15 - 11:55 on 8th June 2016

ICCS 2016 Main Track (MT) Session 7

Time and Date: 10:15 - 11:55 on 8th June 2016

Room: KonTiki Ballroom

Chair: Vassil Alexandrov

162	On Parallel Induction of Nondeterministic Finite Automata [abstract] Abstract: The paper discusses the problem of the induction of a minimal nondeterministic finite automaton (NFA) consistent with a given set of examples and counterexamples. The main contribution of the paper is the proposal of an efficient parallel algorithm transforming the NFA induction problem into a family of constraint satisfaction problems (CSP). Two original techniques for fast CSPs evaluation are proposed and discussed. The efficacy of the parallel algorithm is evaluated experimentally on selected benchmarks. The proposed algorithm solves all analyzed benchmark examples, so called Tomita languages, in the time below half a minute, which should be considered an important achievement, as compared to our previous efforts which required minutes or hours to solve some of the aforementioned benchmarks.	Tomasz Jastrzab
168	On Optimizing Aluminum Extrusion Yield [abstract] Abstract: The purpose of this research is to develop a decision making tool to select aluminum billet size as well as to plan the extrusion operation to meet the required profile length with minimum waste. The decision making scheme developed includes practical constraints and considerations in the extrusion operation which are not typically considered by the optimization community. To address the variations in aluminum density, extrusion experiments to investigate the behavior of Al 6063 are performed on various cross-sectional areas (products) with different process parameters as well as desired lengths. The range of aluminum density is determined. The range of weight per lengths of the new aluminum profile is predicted from the curve fitting scheme. Maximum density is used in the optimization tool to ensure that the extruded profile is always sufficient for the number of workpieces determined. The billet size can be reduced incrementally if the extruded profile is longer than expected. Even with modest problem sizes, solving the ILP formulation is time-consuming. The solutions for non-linear integer programming for the case of multiple billets per extrusion are not obtainable. Otherwise the optimal solutions from the ILP are compared with the solutions from heuristic algorithms developed. For online billet cutting case, it is found that the solutions from ILP usually call for the use of more billet sizes than those obtained from the heuristic algorithm. However, the solutions from ILP lead to better material utilization than those from the algorithm by 1-4%. The tradeoff between cutting and extruding more billets and 4% difference in waste should be examined by the end user. In the standard billet size case, the heuristic algorithms perform well as compared to those obtained from solving the ILP. The solutions from both sources are almost identical. Solutions obtained from both ILP and heuristic methods are equal or better than the manual calculation employed by the company. The actual implementation of the program developed is now underway. It is expected that the solution methodology developed in this research can improve material utilization up to 10% over the current approach used (manual calculation) in some cases.	Jaramporn Hassamontr and Theera Leebhaicharoen
172	Acceleration and Parallelization of ZENO/Walk-on-Spheres [abstract] Abstract: This paper describes our on-going work to accelerate ZENO, a software tool based on Monte Carlo methods (MCMs), used for computing material properties at nanoscale. ZENO employs three main algorithms: (1) Walk on Spheres (WoS), (2) interior sampling, and (3) surface sampling. We have accelerated the first two algorithms. For the sake of brevity, the paper will discuss our work on the first one only as it is the most commonly used and the acceleration techniques were similar in both cases. WoS is a Brownian motion MCM for solving a class of partial differential equations (PDEs). It provides a stochastic solution to a PDE by estimating the probability that a random walk, which started at infinity, will hit the surface of the material under consideration. WoS is highly effective when the problem's geometry is additive, as this greatly reduces the number of walk steps needed to achieve accurate results. The walks start on the surface of an enclosing sphere and can make much larger jumps than in a direct simulation of Brownian motion. Our current implementation represents the molecular structure of nanomaterials as a union of possibly overlapping spheres. The core processing bottleneck in WoS is a Computational Geometry one, as the algorithm repeatedly determines the distance from query point to the material surface in each step of the random walk. In this paper, we present results from benchmarking spatial data structures, including several open-source implementations of k-D trees, for accelerating WoS algorithmically. The paper also presents results from our multicore and cluster parallel implementation to show that it exhibits linear strong scaling with the number of cores and compute nodes; this implementation delivers up to 4 orders of magnitude speedup compared to the original FORTRAN code when run on 8 nodes (each with dual 6-core Intel Xeon CPUs) with 24 threads per node.	Derek Juba, Walid Keyrouz, Michael Mascagni, Mary Brady
358	Permutation-based Recombination Operator to Node-Depth Encoding [abstract] Abstract: The node-depth encoding is a representation for evolutionary algorithms applied to tree prob-lems. Its represents trees by storing the nodes and their depth in a proper ordered list. Theoriginal formulation of the node-depth encoding has only mutation operators as the searchmechanism. Although the representation has this restriction, it has obtained good results withlow convergence. Then, this work proposes a specic recombination operator to improve theconvergence of the node-depth encoding representation. These operators are based on recom-bination for permutation representations. An investigation into the bias and heritability of theproposed recombination operator shows that it has a bias towards stars and low heritability.The performance of node-depth encoding with the proposed operator is investigated for theoptimal communication spanning tree problem. The results are presented for benchmark in-stances in the literature. The use of the recombination operator results in a faster convergencethan with only mutation operators.	Telma de Lima, Alexandre Delbem, Roney Lopes Lima, Gustavo Post Sabin, Marcos Antônio Almeida de Oliveira
403	Assessing metaheuristics by means of random benchmarks [abstract] Abstract: Typically, the performance of swarm and evolutionary methods is assessed by comparing their results when applied to some known finite benchmarks. In general, these metaheuristics depend on many parameter values and many possible exchangeable sub-steps, which yields a huge number of possible algorithm configurations. In this paper we argue that this high setup versatility lets developers expressively tune the method, in an ad-hoc way, to the target inputs to be solved, and hence to those in the benchmark under consideration. However, this does not imply properly solving any other input not considered in the benchmark. Several subtle ways to support that tuning (which can be consciously noticed by the developer, but can also be unconscious) are presented and discussed in the paper. Besides, as a posible alternative to using known finite benchmarks, we discuss the pros and cons of using random input generators, and we illustrate how to use such generators in a specific problem, MAX-3SAT. A general protocol to support the fair development of comparisons of metaheuristics based on random input generators is presented.	Pablo Rabanal, Ismael Rodriguez, Fernando Rubio

ICCS 2016 Main Track (MT) Session 14

Time and Date: 10:15 - 11:55 on 8th June 2016

Room: Toucan

Chair: Marcin Plociennik

408	Efficient Sphere Detector Algorithm for Massive MIMO using GPU Hardware Accelerator [abstract] Abstract: To further enhance the capacity of next generation wireless communication systems, massive MIMO has recently appeared as a necessary enabling technology to achieve high performance signal processing for large-scale multiple antennas. However, massive MIMO systems inevitably generate signal processing overheads, which translate into ever-increasing rate of complexity, and therefore, such system may not maintain the inherent real-time requirement of wireless systems. We redesign the non-linear sphere decoder method to increase the performance of the system, cast most memory-bound computations into compute-bound operations to reduce the overall complexity, and maintain the real-time processing thanks to the GPU computational power. We show a comprehensive complexity and performance analysis on an unprecedented MIMO system scale, which can ease the design phase toward simulating future massive MIMO wireless systems.	Mohamed-Amine Arfaoui, Hatem Ltaief, Zouheir Rezki, Mohamed-Slim Alouini, David Keyes
126	In-situ Data Infrastructure for Scientific Unit Testing Platform [abstract] Abstract: Testing is a significant software development process for the management of software system and scientific code. However, many scientific codes have become much more complicated, which means there are extra needs to check the changes positively impact dependent modules and additional needs to verify the system constraints. The software complexity also impediments module developers and software engineers to rapidly develop and extend their code. Recently, we have developed an automatic methodology and prototype platform to facilitate scientific verification of individual functions within complex scientific codes, so that, the science module builders are able to track variables conveniently in one module or track variables changes among different modules. In this paper, we present a procedure for automatic unit testing generation. For the interest of general audience of this conference, we place emphasis on the technical details of integrating the In Situ data Infrastructure into our platform. At the end of this paper, we have included an implementation of the unit testing for ACME Land Model (ALM) to demonstrate the usefulness and correctness of the platform. We’ve also used single point check and multi point check to demonstrate the easy variable tracking capability of this platform.	Zhuo Yao, Yulu Jia, Dali Wang, Chad Steed, Scott Atchley
213	Recovering the MSS-sequence via CA [abstract] Abstract: A cryptographic sequence generator, the modified self-shrinking generator (MSSG), was recently designed as a novel version of the self-shrinking generator. Taking advantage of the cryptographic properties of the irregularly decimated generator class, the MSSG was mainly created to be used in stream cipher applications and hardware implementations. Nevertheless, in this work it is shown that the MSSG output sequence, the so-called modified self-shrunken sequence, is generated as one of the output sequences of a linear model based on Cellular Automata that use rule 60 for their computations. Thus, the linearity of these structures can be advantageous exploited to recover the complete modified self-shrunken sequence from a number of intercepted bits.	Sara D. Cardell, Amparo Fúster-Sabater
328	Accelerated graph-based non-linear denoising filters [abstract] Abstract: Denoising filters, such as bilateral, guided, and total variation filters, applied to images on general graphs may require repeated Application if noise is not small enough. We formulate two acceleration techniques of the resulted iterations: conjugate gradient method and Nesterov's acceleration. We numerically show efficiency of the accelerated nonlinear filters for image denoising and demonstrate 2-12 times speed-up, i.e., the acceleration techniques reduce the number of iterations required to reach a given peak signal-to-noise ratio (PSNR) by the above indicated factor of 2-12.	Andrew Knyazev, Alexander Malyshev

Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY) Session 1

Time and Date: 10:15 - 11:55 on 8th June 2016

Room: Macaw

Chair: Stephane Louise

546	Reinventing computing in the post-Moore era [abstract] Abstract: After 50 years of unrelenting exponential Moore's Law progress, much of the high tech industry have built in assumptions that the future will always bring cheaper, faster, better devices. The trillion dollar question today is: what happens when the music stops? In this talk I will review "how we got here" and propose what will happen to the semiconductor and software industry post-Moore. Insights are based on experiences from the crowd funded many-core Parallella computing platform and Epiphany design work in leading edge processes.	Andreas Olofsson
506	Advances in Run-Time Performance and Interoperability for the Adapteva Epiphany Coprocessor [abstract] Abstract: The energy-efficient Adapteva Epiphany architecture exhibits massive many-core scalability in a physically compact 2D array of RISC cores with a fast network-on-chip (NoC). The architecture presents many features and constraints which contribute to software design challenges for the application developer. Addressing these challenges within the software stack that supports application development is critical to improving productivity and expanding the range of applications for the architecture. We report here on advances that have been made in the COPRTHR-2 software stack targeting the Epiphany architecture that address critical issues identified in previous work. Specifically, we describe improvements that bring greater control and precision to the design of compact compiled binary programs in the context of the limited per-core local memory of the architecture. We describe a new design for run-time support that has been implemented to dramatically improve the program load and execute performance and capabilities. Finally, we describe developments that advance host-coprocessor interoperability to expand the functionality available to the application developer.	David Richie, James Ross
347	Implementing OpenSHMEM for the Adapteva Epiphany RISC Array Processor [abstract] Abstract: The energy-efficient Adapteva Epiphany architecture exhibits massive many-core scalability in a physically compact 2D array of RISC cores with a fast network-on-chip (NoC). With fully divergent cores capable of MIMD execution, the physical topology and memory-mapped capabilities of the core and network translate well to partitioned global address space (PGAS) parallel programming models. Following an investigation into the use of two-sided communication using threaded MPI, one-sided communication using SHMEM is being explored. Here we present work in progress on the development of an OpenSHMEM 1.2 implementation for the Epiphany architecture.	James Ross, David Richie
462	Pattern Based Cache Coherency Architecture for Embedded Manycores [abstract] Abstract: Modern parallel programming frameworks like OpenMP often rely on shared memory concepts to harness the processing power of parallel systems. But for embedded devices, memory coherence protocols tend to account for a sizable portion of chip's power consumption. This is why any means to lower this impact is important. Our idea for this issue is to use the fact that most of usual workloads display a regular behavior with regards to their memory accesses to prefetch the relevant memory lines in locale caches of execution cores on a manycore system. Our contributions are, on one hand the specifications of a hardware IP for prefetching memory access patterns, and on another hand, a hybrid protocol which extends the classic MESI/baseline architecture to reduce the control and coherence related traffic by at least an order of magnitude. Evaluations are done on two benchmark programs and show the potential of this approach.	Jussara Marandola, Stephane Louise, Loic Cudennec

Workshop on Computational Optimization, Modelling & Simulation (COMS) Session 4

Time and Date: 10:15 - 11:55 on 8th June 2016

Room: Cockatoo

Chair: Leifur Leifsson

362	Using Computational Fluid Dynamics (CFD) for Blast Wave Propagation under Structure [abstract] Abstract: In recent years, improvised explosive devices has been an aspect of crusades by terrorist or movements around the world. The blast wave propagation of an explosive detonation can cause disastrous damage on the buildings, vehicles and also injuries to vehicle occupants. Full scale blast tests are expensive and time consuming but by using computational based numerical simulations can virtually predict these wave propagations and minimize the need of experimental testing. Computational fluid dynamics (CFD) is a common tool to do an analysis of free-field blast wave and against structure. This paper presents two different blast analyses; free field air blast and blast loading towards a structure using ANSYS FLUENT software. A high explosive of 1 kg blast peak overpressure data from an experiment has been patched at the specific domain of the symmetry plane. The computed results were found to be in agreement with theoretical and additional experimental data. The verified free field air blast model was expanded to study the blast loading response towards a structure. It was found that developed CFD can be further used to predict the blast wave propagation subjected to the vehicle structures or buildings.	Arif S. M. Sohaimi, Risby M. S.
377	A VNS-based heuristic for solving the vehicle routing problem with time windows and vehicle preventive maintenance constraints [abstract] Abstract: We address a vehicle routing problem with time windows (VRPTW) that also contains vehicle with preventive maintenance constraints (VRPTW-PM) and propose a MIP mathematical formulation as well as a general variable neighborhood search metaheuristic (VNS) to solve with large instance the problematic situation. First we create a initial solution using Solomon heuristic then we minimize the number of used routes, and then the total travelled distance by all vehicles is minimized. Computational results show the efficiency of the proposed approach.	Amine Dhahri, Anis Mjirda, Kamel Zidi, Khaled Ghedira
384	Simulation on the shock response of vehicle occupant subjected to underbelly blast loading [abstract] Abstract: Explosion from an anti-tank mines or improvised explosive devices are recognized as one of the lethal threat towards occupants inside an armoured vehicle. The detonation of these threats creates high intensity blast waves that transmitted to the occupant through vehicle structures and seats. Minimizing the occupant’s casualty can be achieved by properly dissipating the shock waves exerted to the vehicle. It is important to distinguish the contributing factors that affect the behavior of the blast wave so that proper reduction on the shock waves can be achieved. In this paper, three factors such as occupant seating height, charge weight placement and the Hopkinson-Cranz blast scaling were studied using numerical simulations. Design of experiment (DOE) was utilized to determine the ranks and interaction between each factor from the most influential on the results to the least effecting towards the results. From the results it was found that the seating position plays a significant role in reduction the shock response towards the finite element dummy model.	Khalis Suhaimi, Risby Mohd Sohaini, Tan Kean Sheng, Victor Feizal Knight
404	A Heuristic Algorithm for Multi-Site Computation Offloading in Mobile Cloud Computing [abstract] Abstract: Due to limitation of mobile device in terms of battery life and processing power, Mobile Cloud Computing (MCC) has become an attractive choice to leverage this shortcoming as the mobile computation could be offloaded to the cloud, which is so-called \emph{mobile computation offloading}. Existing research on mobile computation offloading considers offloading a mobile computation to a single cloud. However, in the real world a computation service could be provided by multiple clouds and each computation service may have different performance and different prices. Thus, a new and interesting research problem in mobile computation offloading is how to select a computation service for each of the computation tasks of a mobile computation such that the computation time of the mobile computation, the energy consumption of the mobile device and the cost of using the computation services are minimized. This is so called multi-site computation offloading in mobile cloud computing. In this paper we formulate the multi-site computation offloading problem, propose a heuristic algorithm for the multi-site computation offloading problem and evaluate the heuristic algorithm.	Nur Idawati Md Enzai, Maolin Tang

Workshop on Nonstationary Models of Pattern Recognition and Classifier Combinations (NMRPC) Session 1

Time and Date: 10:15 - 11:55 on 8th June 2016

Room: Boardroom East

Chair: Michal Wozniak

545	Workshop on Nonstationary Models of Pattern Recognition and Classifier Combinations [abstract] Abstract: Workshop on Nonstationary Models of Pattern Recognition and Classifier Combinations	Michal Wozniak, Bartosz Krawczyk
550	Keynote speach: Computational Aspects of Data Processing and Pattern Recognition with Tensor Methods [abstract] Abstract: We live in the era of massive data processing. Computational requirements on information processing and retrieval systems are therefore enormous - not only huge amounts of data needs to be processed and classified but also the systems need to deal with data multidimensionality. However, only recently data processing methods were extended to directly deal with multidimensional N-D patterns, without their prior vectorization, thanks to application of tensors. This talk will be focused on computational aspects of data processing and pattern recognition with tensors. We will present a systematic overview of tensor algebra and tensor decomposition methods with special stress on their applications in data representation, analysis, as well as pattern recognition. In the talk we will especially emphasize practical aspects, as well as implementation issues, of the presented algorithms. Prof. Cyganek bio: Bogusław Cyganek received his M.Sc. degree in electronics in 1993, and then M.Sc. in computer science in 1996, from the AGH University of Science and Technology, Krakow, Poland. He obtained his Ph.D. degree cum laude in 2001 with a thesis on correlation of stereo images, and D.Sc. degree in 2011 with a thesis on methods and algorithms of object recognition in digital images.  During recent years dr. Bogusław Cyganek cooperated with many scientific and industrial partners such as Glasgow University Scotland UK, DLR Germany, and Surrey University UK, as well as Nisus Writer, USA, Compression Techniques, USA, Pandora Int., UK, and The Polished Group, Poland. He is an associated professor at the Department of Electronics of the AGH University of Science and Technology, Poland, currently serving as a visiting professor to the Wroclaw Technical University in the ENGINE project. His research interests include computer vision, pattern recognition, data mining, as well as development of embedded systems. He is an author or a co-author of over a hundred of conference and journal papers, as well as books with the latest “Object Detection and Recognition in Digital Images: Theory and Practice” published by Wiley in 2013. Dr. Cyganek is a senior member of the IEEE and member of IAPR and SPIE.	Bogusław Cyganek
327	Anticipative Hybrid Extreme Rotation Forest [abstract] Abstract: This paper introduces an improvement on the recently published Hybrid Extreme Rotation Forest (HERF), consisting in the anticipative determination of the the fraction of each classifier architecture included in the ensemble. We call it AHERF. Both HERF and AHERF are hetero- geneous classifier ensembles, which aim to profit from the diverse problem domain specificities of each classifier architecture in order to achieve improved generalization over a larger spec- trum of problem domains. In this paper AHERF are built from a pool of Decision Trees (DT), Extreme Learning Machines (ELM), Support Vector Machines (SVM), k-Nearest Neighbors (k-NN), Adaboost, Random Forests (RF), and Gaussian Naive Bayes (GNB) classifiers. Given a problem dataset, the process of anticipative determination of the ensemble composition is as follows: First, we estimate the performance of each classifier architecture by independent pilot cross-validation experiments on a small subsample of the data. Next, classifier architectures are ranked according to their accuracy results. A probability distribution of classifier architec- tures appearing in the ensemble is built from this ranking. Finally, the type of each individual classifier is decided by sampling this probability distribution. Computational experiments on a collection of benchmark classification problems shows improvement on the original HERF, and other state-of-the-art approaches.	Borja Ayerdi, Manuel Grana
398	Learning Decision Trees from Data Streams with Concept Drift [abstract] Abstract: This paper address the data mining task of classifying data stream with concept drift. The proposed algorithm, named Concept-adapting Evolutionary Algorithm For Decision Tree (CEVOT) does not require any knowledge of the environment in which it operates (e.g. numbers and rates of drifts). The novelty of the approach is combining tree learner and evolutionary algorithm, where the decision tree is learned incrementally and all information (knowledge) are stored in the internal structure of the trees’ population. The proposed algorithm is experimentally compared with state-of-the-art stream methods on several real live and synthetic datasets. Results proves its high performance in term of accuracy and processing time.	Dariusz Jankowski, Konrad Jackowski, Bogusław Cyganek

Urgent Computing -Computations for Decision Support in Critical Situations (UC) Session 1

Time and Date: 10:15 - 11:55 on 8th June 2016

Room: Boardroom West

Chair: Valeria Krzhizhanovskaya

518	Quality-based approach to urgent workflows scheduling [abstract] Abstract: Urgent computing capabilities for early warning systems and decision support systems are vital in situations that require execution be completed before a specified deadline. The cost of missing the deadline in such situations can be unacceptable, while providing insufficient results can mean an ineffective solution that may come at a very high cost. In order to provide a solution that is appropriate under the current conditions (i.e. available volume of computational resources, workload, and time available), a new approach is required. In this paper, we present a schema and algorithm of regulating the volume of computations within an urgent workflow to deliver a solution that is as sufficient as possible given the current conditions and deadline. To achieve these goals, we develop an approach that modifies an urgent workflow by changing its structure and the parameters of its individual tasks. Such modifications are based on introducing a notion of quality and applying quality-based models to estimate the sufficiency of solutions generated by the resulting workflow structures. Finally, a special extension of the genetic algorithm that performs quality-based scheduling of urgent workflows is described along with an experimental study to demonstrate its efficacy.	Nikolay Butakov, Denis Nasonov, Andrey Svitenkov, Anton Radice
521	Urgent information spreading multi-layer model for simulation in mobile networks [abstract] Abstract: Information spreading simulation is an important problem in scientific community and is widely studied nowadays using different techniques. Efficient users’ activity simulation for urgent scenarios is even more important, because fast and accurate reaction in such situations can save human lives. In this paper we present multi-layer agent-based network model for information spreading simulation in urgent scenarios, which allows to investigate agents’ behavior in a variety of situations. This model can be used for live city simulation in integration with other agent-based human interaction models. Experimental results demonstrate logical consistency of the proposed approach and show different cases of information spreading in the network with different social aspect.	Alexander A. Visheratin, Tamara B. Trofimenko, Ksenia D. Mukhina, Denis Nasonov, Alexander V. Boukhanovsky
523	Workflow scheduling algorithms for hard-deadline constrained cloud environments [abstract] Abstract: Cloud computational platforms today are very promising for execution of scientific applications since they provide ready to go infrastructure for almost any task. However, complex tasks, which contain a large number of interconnected applications, which are usually called workflows, require efficient tasks scheduling in order to satisfy user defined QoS, like cost or execution time (makespan). When QoS has some restrictions – limited cost or deadline – scheduling becomes even more complicated. In this paper we propose heuristic algorithm for scheduling workflows in hard-deadline constrained clouds – Levelwise Deadline Distributed Linewise Scheduling (LDD-LS) – which, in combination with implementation of IC-PCP algorithm, is used for initialization of proposed metaheuristic algorithm – Cloud Deadline Coevolutional Genetic Algorithm (CDCGA). Experiments show high efficiency of CDCGA, which makes it potentially applicable for scheduling in cloud environments.	Alexander A Visheratin, Mikhail A Melnik, Denis Nasonov
528	Toolbox for Visual Explorative Analysis of Complex Temporal Multiscale Contact Networks Dynamics in Healthcare [abstract] Abstract: Public healthcare can be cast as a complex systems and network analysis is one of the methodological approaches that are commonly used to study these types of systems. In this paper we describe a multi-scale and multi-level interpretation of complex networks in public healthcare. Our contribution is to provide a toolbox for visualization and visual data-driven analysis of complex multiscale temporal contact networks that allows to simulate various dynamic processes using user-defined models. An example of explorative analysis of a dataset from real clinical data obtained from the Federal Almazov North-West Medical Research Centre in Saint Petersburg is described.	Andrey Karsakov, Alexander Moiseev, Ksenia Mukhina, Irina Ankudinova, Mariia Ignatieva, Evgeniy Krotov, Vladislav Karbovskii, Sergey V. Kovalchuk, Aleksandra O. Konradi