
ICCS 2019 Main Track (MT) Session 6

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 1.5

Chair: Andrew Lewis

18 Towards Unknown Traffic Identification Using Deep Auto-Encoder and Constrained Clustering [abstract]
Abstract: Nowadays, network traffic identification, a fundamental technique in the field of cybersecurity, suffers from a critical problem, namely “unknown traffic”. Unknown traffic refers to network traffic generated by previously unknown applications (i.e., zero-day applications) in a pre-constructed traffic classification system. The ability to divide mixed unknown traffic into multiple clusters, each of which contains, as far as possible, the traffic of only one application, is the key to solving this problem. In this paper, we propose DePCK, a framework designed to improve clustering purity. There are two main innovations in our framework: (i) it learns to extract bottleneck features from traffic statistical characteristics via a deep auto-encoder; (ii) it uses flow correlation to guide the process of pairwise constrained k-means. To verify the effectiveness of our framework, we conduct comparative experiments on two real-world datasets. The experimental results show that the clustering purity of DePCK can exceed 94.81% on the ISP-data and 91.48% on the WIDE-data, outperforming the state-of-the-art methods RTC and k-means with log data.
Shuyuan Zhao, Yafei Sang and Yongzheng Zhang
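The pairwise constrained k-means mentioned in the abstract clusters points while honoring must-link constraints (here, flows known to be correlated). A minimal generic sketch of that idea in Python — not the authors' DePCK code; the union-find grouping and group-wise assignment strategy are illustrative assumptions:

```python
import numpy as np

def constrained_kmeans(X, k, must_link, n_iter=20, seed=0):
    """Toy pairwise-constrained k-means: points joined by must-link
    constraints are always assigned, as a group, to the centroid
    nearest to the group's mean."""
    rng = np.random.default_rng(seed)

    # Merge must-link pairs into groups with union-find.
    parent = list(range(len(X)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for a, b in must_link:
        parent[find(a)] = find(b)
    members = {}
    for i in range(len(X)):
        members.setdefault(find(i), []).append(i)
    groups = list(members.values())

    # Initialize centroids from k distinct group means (needs len(groups) >= k).
    idx = rng.choice(len(groups), size=k, replace=False)
    centroids = np.array([X[groups[i]].mean(axis=0) for i in idx])

    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        for g in groups:
            # Assign the whole group to the centroid nearest its mean.
            d = ((X[g].mean(axis=0) - centroids) ** 2).sum(axis=1)
            labels[g] = d.argmin()
        for c in range(k):
            if (labels == c).any():
                centroids[c] = X[labels == c].mean(axis=0)
    return labels
```

Enforcing the constraints at the group level is what raises clustering purity: correlated flows can never be split across clusters.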
41 How to compose product pages to enhance the new users’ interest in the item catalog? [abstract]
Abstract: Converting first-time users into recurring ones is key to the success of Web-based applications. This problem is known as Pure Cold-Start, and it refers to the capability of Recommender Systems (RSs) to provide useful recommendations to users without historical data. Traditionally, RSs assume that non-personalized recommendation can mitigate this problem. However, many users are not interested in consuming only biased items, such as popular or best-rated ones. We therefore introduce two new approaches, inspired by user coverage maximization, to deal with this problem. These coverage-based RSs reached a high number of distinct first-time users. We then propose composing the product page by mixing complementary non-personalized RSs. An online study conducted with 204 real users confirmed that we should diversify the RSs used to win over first-time users.
Nícollas Silva, Diego Carvalho, Fernando Mourão, Adriano Pereira and Leonardo Rocha
91 Rumor Detection on Social Media: A Multi-View Model using Self-Attention Mechanism [abstract]
Abstract: With the unprecedented prevalence of social media, rumor detection has become increasingly important, since it can prevent misinformation from spreading among the public. Traditional approaches extract features from the source tweet, the replies, the user profiles, and the propagation path of a rumor event. However, these approaches do not take the sentiment view of the users into account. The conflicting affirmative or denial stances of users can provide crucial clues for rumor detection. Besides, existing work attaches the same importance to all the words in the source tweet, but in fact these words are not equally informative. To address these problems, we propose a simple but effective multi-view deep learning model designed to excavate the stances of users and assign weights to different words. Experimental results on a social-media-based dataset reveal that the proposed multi-view model is useful, achieving state-of-the-art accuracy in automatic rumor detection. Our three-view model achieves 95.6% accuracy, and our four-view model using BERT as a view also improves detection accuracy.
Yue Geng, Zheng Lin, Peng Fu, Weiping Wang and Dan Meng
156 EmoMix: Building An Emotion Lexicon for Compound Emotion Analysis [abstract]
Abstract: Building a high-quality emotion lexicon is regarded as the foundation of research on emotion analysis. Existing methods have focused on the study of primary categories (i.e., anger, disgust, fear, happiness, sadness, and surprise). However, many emotions expressed in texts are difficult to map to primary emotions, which poses a great challenge for emotion annotation in big data analysis. For instance, "despair" is a combination of "fear" and "sadness," and thus it is difficult to assign to either of them. To address this problem, we propose an automatic method for building an emotion lexicon based on the psychological theory of compound emotion. The method maps emotional words into an emotion space and annotates different emotion classes through a cascade clustering algorithm. Our experimental results show that our method outperforms the state-of-the-art methods in both word- and sentence-level primary classification performance, and it also offers some insights into compound emotion analysis.
Ran Li, Zheng Lin, Peng Fu, Weiping Wang and Gang Shi
183 Long Term Implications of Climate Change on Crop Planning [abstract]
Abstract: The effects of climate change have been much speculated on in the past few years. Consequently, there has been intense interest in one of its key issues, food security into the future. This is particularly so given population increase, urban encroachment on arable land, and the degradation of the land itself. Recently, work has been done on predicting precipitation and temperature for the next few decades, as well as on developing optimisation models for crop planning. Combining these together, this paper examines the effects of climate change on a large food-producing region in Australia, the Murrumbidgee Irrigation Area. For time periods between 1991 and 2071, for dry, average and wet years, an analysis is made of the way that crop mixes will need to change to adapt to the effects of climate change. It is found that sustainable crop choices will change into the future, particularly for crops that require large amounts of water, such as cotton.
Andrew Lewis, Marcus Randall, Sean Elliott and James Montgomery

ICCS 2019 Main Track (MT) Session 14

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 1.3

Chair: Nadia Nedjah

278 Function and pattern extrapolation with product-unit networks [abstract]
Abstract: Neural networks are a popular method for function approximation and data classification and have recently drawn much attention because of the success of deep-learning strategies. Artificial neural networks are built from elementary units that generate a piecewise, often almost linear approximation of the function or pattern. To improve the extrapolation of nonlinear functions and patterns beyond the training domain, we propose to augment the fundamental algebraic structure of neural networks by a product unit that computes the product of its inputs raised to the power of their weights, namely $\prod_{i} x_i^{w_i}$. Linearly combining their outputs in a weighted sum allows representing most nonlinear functions known in calculus, including roots, fractions and approximations of power series. We train the network using gradient descent. The enhanced extrapolation capabilities of the network are demonstrated by comparing the results for a function and pattern extrapolation task with those obtained with the nonlinear support vector machine (SVM) and a standard neural network (standard NN). Convergence behavior of stochastic gradient descent is discussed and the feasibility of the approach is demonstrated in a real-world application in image segmentation.
Babette Dellen, Uwe Jaekel and Marcell Wolnitza
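The product unit described in the abstract computes $\prod_{i} x_i^{w_i}$, and a weighted sum of such units can represent roots and fractions directly. A minimal numerical illustration of that algebraic idea — a sketch only, not the authors' trained network:

```python
import numpy as np

def product_unit(x, w):
    """Compute prod_i x_i**w_i for positive inputs x with exponent weights w."""
    return np.prod(np.power(x, w))

x = np.array([4.0, 2.0])

# Exponents [1, -1] realize the fraction x0/x1.
ratio = product_unit(x, np.array([1.0, -1.0]))

# A weighted sum of product units represents e.g. x0*x1 + sqrt(x0):
units = [(1.0, np.array([1.0, 1.0])),   # coefficient, exponents -> x0*x1
         (1.0, np.array([0.5, 0.0]))]   # -> sqrt(x0)
y = sum(c * product_unit(x, w) for c, w in units)  # 4*2 + sqrt(4) = 10
```

Because the exponents appear as weights, gradient descent can learn such nonlinear terms directly, which is what enables the extrapolation behavior the abstract describes.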
333 Fast and Scalable Outlier Detection with Metric Access Methods [abstract]
Abstract: It is well-known that the theoretical models existing for outlier detection make assumptions that may not reflect the true nature of outliers in every real application. With that in mind, this paper describes an empirical study performed on unsupervised outlier detection using 8 algorithms from the state-of-the-art and 8 datasets that refer to a variety of real-world tasks of high impact, like spotting cyberattacks, clinical pathologies and abnormalities in nature. We present the lowdown on the results obtained, pointing out the strengths and weaknesses of each technique from the application specialist’s point of view, which is a shift from the designer-based point of view that is commonly considered. Interestingly, many of the techniques had unfeasibly high runtime requirements or failed to spot what the specialists consider to be outliers in their own data. To tackle this issue, we propose MetricABOD: a novel ABOD-based algorithm that makes the analysis up to thousands of times faster, while being on average 12% more accurate than the most accurate related work. This improvement is essential to enable outlier detection in many real-world applications for which the existing methods lead to unexpected results or unfeasible runtime requirements. Finally, we studied two real collections of text data to show that MetricABOD also works for adimensional, purely metric data.
Altamir Gomes Bispo Junior and Robson Leonardo Ferreira Cordeiro
384 Deep Learning Based LSTM and SeqToSeq Models to Detect Monsoon Spells of India [abstract]
Abstract: Monsoon spells are important climatic phenomena modulating the quality and quantity of monsoon over a year. India being an agricultural country, identification of monsoon spells is extremely important for planning agricultural policies that follow the phases of the monsoon to attain maximum productivity. Detecting monsoon spells involves analyzing and predicting monsoon at the daily level, which is more challenging because daily variability is higher than that of monsoon over a month or a year. In this article, deep-learning-based long short-term memory and sequence-to-sequence models are utilized to classify monsoon days, which are finally assembled to detect the spells. Dry and wet days are classified with precisions of 0.95 and 0.87, respectively. Break spells are observed to be forecast with higher accuracy than active spells. Additionally, the sequence-to-sequence model is noted to perform better than the long short-term memory model. The proposed models also outperform traditional classification models for monsoon spell detection.
V. Saicharan, Moumita Saha, Pabitra Mitra and Ravi S. Nanjundiah
507 Data Analysis for Atomic Shapes in Nuclear Science [abstract]
Abstract: One of the overarching questions in the field of nuclear science is how simple phenomena emerge from complex systems. A nucleus is composed of both protons and neutrons, and while many assume the atomic nucleus adopts a spherical shape, the nuclear shape is, in fact, quite variable. Nuclear physicists seek to understand the shape of the atomic nucleus by probing specific transitions between nuclear energy states, which occur at high energy on short timescales. This is achieved by detecting a unique experimental signature in the time-series data recorded in experiments conducted at the National Superconducting Cyclotron Laboratory. The current method involves fitting each sample in the dataset to a given parameterized model function. However, this procedure is computationally expensive due to the nature of the nonlinear curve-fitting problem. Since the data is skewed towards non-unique signatures, we offer a way to filter out the majority of the uninteresting samples from the dataset using machine learning methods. By doing so, we decrease the computational cost of detecting the unique experimental signatures in the time-series data. We also present a way to generate synthetic training data by estimating the distribution of the underlying parameters of the model function with Kernel Density Estimation. The new workflow, which leverages machine-learned classifiers trained on the synthetic data, is shown to significantly outperform the current procedures on actual datasets.
Mehmet Kaymak, Hasan Metin Aktulga, Ron Fox and Sean Liddick

Workshop on Teaching Computational Science (WTCS) Session 3

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 0.3

Chair: Angela Shiflet

43 Resource for Undergraduate Research Projects in Mathematical and Computational Biology [abstract]
Abstract: A substantial hurdle faced by undergraduate mathematics faculty and students wishing to embark on collaborative research projects is not knowing quite where to begin. This is true of any field in mathematics, including the more applied areas of mathematical and computational biology, which is the focus of a new volume of the FURM (Foundations for Undergraduate Research in Mathematics) book series published by Birkhauser. Topics that might afford productive inroads for new student researchers are often not obvious, even to experts, and finding unanswered questions that are well-suited to student projects is time-consuming. The volume, which is the topic of this talk, aims to reduce the challenges in starting faculty-student collaborations by presenting self-contained, undergraduate-accessible articles, each of which provides directions for new research in mathematical and computational biology, enough background to get started, and recommendations for further reading. The content spans the breadth of mathematical and computational biology, including many topics that are appropriate for student-faculty exploration and are not normally addressed by the undergraduate curriculum (e.g., Hidden Markov Models). Each article in this collection has been written with an eye toward generating new research collaborations between undergraduates and faculty. As such, each article presents background material sufficient for preparing readers to tackle specific open problems, which the material also includes. Moreover, authors have carefully cultivated lists of references intended to launch productive ongoing investigations for readers wanting to delve more deeply into a given field. The intended audience is broad: undergraduate mathematics faculty, with a particular emphasis on faculty interested in (but not necessarily experienced in) mathematical and computational biology, and students with sophomore- to junior-level coursework as a background.
Undergraduate faculty wishing to direct research will benefit from the many project ideas suggested by the authors, as will faculty simply wishing to expand their own research repertoire in a new direction. Undergraduate mathematics students will appreciate the accessible, yet rigorous treatment of topics previously relegated to the graduate curriculum, some of these with fairly minimal prerequisite assumptions (e.g., calculus and linear algebra). The primary intended audience is undergraduate students, typically in STEM majors, with an interest in pursuing undergraduate research. A secondary audience is applied mathematics and computer science faculty interested in mentoring undergraduate research but unsure of how to get started. A further potential source of interest is among math faculty who are interested in learning a new area of mathematics. This talk will present information about the volume and some of the included material, such as the following: “Using Neural Networks to Identify Bird Species from Birdsong Samples,” “Using Regularized Singularities to Model Stokes Flow: A Study of Fluid Dynamics Induced by Metachronal Ciliary Waves,” “Network Structure and Stochastic Dynamics of Biological Systems,” “Simulating Bacterial Growth, Competition, and Resistance with Agent-Based Models and Laboratory Experiments,” “Phase Sensitivity in Ecological Oscillators,” “What Are the Chances? – Hidden Markov Models,” and “A Tour of the Basic Reproductive Number and the Next Generation of Researchers.”
Hannah Highlander, Carrie Eaton, Alex Capaldi, Angela Shiflet and George Shiflet
151 Numerical Analysis project in ODEs for undergraduate students [abstract]
Abstract: Designing good projects involving programming in numerical analysis for large groups of students with different backgrounds is a challenging task. The assignment has to be manageable for the average student, but to additionally inspire the better students it is preferable that it has some depth and leads them to think about the subject. We describe a project that was assigned to the students of an introductory Numerical Analysis course at the University of Iceland. The assignment is to numerically compute the length of solution trajectories of a system of ordinary differential equations with a stable equilibrium point. While not difficult to do, the results are somewhat surprising and got the better students interested in what was happening. We describe the project, its solution using Matlab, and the underlying mathematics in some detail.
Sigurdur Hafstein
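The core computation described above fits in a few lines. The abstract mentions a Matlab solution; a Python sketch of the same idea is shown below, where the stable linear system, step size, and integration horizon are illustrative assumptions rather than the course's actual assignment:

```python
import numpy as np

def f(x):
    """A linear system with a stable equilibrium at the origin (stable spiral,
    chosen only as a hypothetical example)."""
    A = np.array([[-0.1, 1.0], [-1.0, -0.1]])
    return A @ x

def trajectory_length(x0, h=0.01, t_end=100.0):
    """Integrate with classical RK4 and accumulate the Euclidean length of
    the polygonal approximation of the solution trajectory."""
    x, length = np.asarray(x0, dtype=float), 0.0
    for _ in range(int(t_end / h)):
        k1 = f(x)
        k2 = f(x + 0.5 * h * k1)
        k3 = f(x + 0.5 * h * k2)
        k4 = f(x + h * k3)
        x_new = x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        length += np.linalg.norm(x_new - x)
        x = x_new
    return length
```

For this example the length can be checked analytically: the spiral's radius decays as $e^{-0.1t}$ while the speed is $\sqrt{1.01}\,\lVert x\rVert$, so the total length from $(1,0)$ is $\sqrt{1.01}/0.1 \approx 10.05$.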
514 Computational Modeling at Rose-Hulman Institute of Technology [abstract]
Abstract: In 2008 Rose-Hulman Institute of Technology in Terre Haute, Indiana, USA introduced a new Computational Science minor built around two key junior-level courses: Introduction to Computational Science and a follow-on, Computational Modeling. The latter course adopted an innovative approach to teaching students to work effectively in teams to develop, implement, test, and refine nontrivial computational models and simulations in a lab-like setting, then use them to investigate a scientific phenomenon, with an emphasis on models based primarily around systems of ordinary differential equations solved in Matlab, but also including models that were discrete, stochastic, and so on. Student teams write several reports during the term, spending one to two weeks per project. This course sequence has led to a textbook by two of the faculty involved in the program (to be published in Spring 2019) which covers material for both courses. In 2012 the minor was expanded to an undergraduate major in Computational Science that requires additional coursework including a full course in parallel computing, separate courses in analytical and numerical (finite differences or finite elements, at the student’s choice) partial differential equations, and other areas. But the Computational Modeling course—popular enough that it is frequently offered off-schedule as an independent study for small groups of students—also served as the inspiration for the more recent Bioinformatics course that is part of the new undergraduate Biomathematics major, which also requires students to take the Introduction to Computational Science course. This course uses a similar approach but with a narrower focus; less emphasis is placed on learning to create and utilize simulations in teams, but lengthier and more detailed reports are expected.
Again, there is a strong focus on performing computational science, that is, on using these models to perform computational experiments and to address scientific questions in a detailed and convincing manner. Meanwhile, an even more recent Data Science minor—expected to be expanded to a major in the near future—incorporates aspects of these same courses, and new courses continue to be created along these lines; for example, I will be offering a trial Computational Data Science course for the first time during Spring quarter of 2019. A prototype second course in computational modeling, with the main course as a prerequisite, will be given a soft trial the same term. In this talk I will briefly discuss the structure of the curricula of these majors and minors, emphasizing the Computational Science major and minor, then focus on how computational modeling skills are developed in Rose-Hulman’s Computational Science and Biomathematics students via the Computational Modeling and Bioinformatics courses: Philosophy, syllabi, materials, effective use of classroom time in a lab-like atmosphere, and so on.
Jeffery Leader
102 Growing an inclusive scientific computing community at Boise State University [abstract]
Abstract: In this work we describe the results of campus-wide efforts to grow a campus computing community over a four-year period, strongly leveraging The Carpentries pedagogy. We discuss (1) Development of a required introductory programming course within a materials science curriculum, (2) Impact of regular Software Carpentry training events, and (3) Outcomes of an interdisciplinary Vertically Integrated Projects course entitled "Computing Across Campus" aimed at supporting student researchers. We find that pedagogical approaches focused on lowering cognitive load are effective for efficiently training competent computational science practitioners as evidenced by student research outputs (posters, papers, talks, allocations, awards). We also find that creating computing demand from students requires strategic infrastructure planning and describe obstacles, challenges, and solutions that arose over this four-year period.
Eric Jankowski

Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 1

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 0.4

Chair: Mario Cannataro

38 Parallelization of an algorithm for automatic classification of medical data [abstract]
Abstract: In this paper, we present the optimization and parallelization of a state-of-the-art algorithm for automatic classification, aiming to perform real-time classification of clinical data. The parallelization has been carried out so that the algorithm can be used in real time on standard computers or on high-performance computing servers. The fastest versions have been obtained by carrying out most of the computations on Graphics Processing Units (GPUs). The resulting algorithms have been tested on a case of automatic classification of electroencephalographic signals from patients.
Victor M. Garcia-Molla, Addisson Salazar, Gonzalo Safont, Antonio M. Vidal and Luis Vergara
146 Comparing Deep and Machine Learning approaches in bioinformatics: a miRNA-target prediction case study [abstract]
Abstract: MicroRNAs (miRNAs) are small non-coding RNAs with a key role in post-transcriptional gene expression regulation, thanks to their ability to bind the target mRNA through the complementary base-pairing mechanism. Given their role, it is important to identify their targets, and, to this purpose, different tools have been proposed to solve this problem. However, their results can be very different, so the community is now moving toward the deployment of integration tools, which should be able to perform better than the individual ones. As Machine and Deep Learning algorithms are now enjoying widespread popularity, we developed different classifiers from both areas to verify their ability to recognize possible miRNA-mRNA interactions, and we evaluated their performance, showing the potential and the limits that those algorithms have in this field.
Mauro Castelli, Stefano Beretta, Valentina Giansanti and Ivan Merelli

Multiscale Modelling and Simulation (MMS) Session 3

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 0.5

Chair: Derek Groen

392 Regional superparameterization of the OpenIFS atmosphere model by nesting 3D LES models [abstract]
Abstract: We present a superparameterization of the ECMWF global weather forecasting model OpenIFS with a local, cloud-resolving model. Superparameterization is a multiscale modeling approach used in atmospheric science in which conventional parameterizations of small-scale processes are replaced by local high-resolution models that resolve these processes. Here, we use the Dutch Atmospheric Large Eddy Simulation model (DALES) as the local model. Within a selected region, our setup nests DALES instances within model columns of the global model OpenIFS. This is done so that the global model parameterizations of boundary layer turbulence, cloud physics and convection processes are replaced with tendencies derived from the vertical profiles of the local model. The local models are in turn forced towards the corresponding vertical profiles of the global model, making the model coupling bidirectional. We consistently combine the sequential physics scheme of OpenIFS with the Grabowski superparameterization scheme and achieve concurrent execution of the independent DALES models on separate CPUs. The superparameterized region can be chosen to match the available compute resources, and we have implemented mean-state acceleration to speed up the LES time stepping. The coupling of the components has been implemented in a Python software layer using the OMUSE multi-scale physics framework. As a result, our setup yields a cloud-resolving weather model that displays emergent mesoscale cloud organization and has the potential to improve the representation of clouds and convection processes in OpenIFS. It allows us to study the interaction of boundary layer physics with the large scale dynamics, to assess cloud and convection parameterization in the ECMWF model, and eventually to improve our understanding of cloud feedback in climate models. 
[Regional superparameterization in a Global Circulation Model using Large Eddy Simulations, Fredrik Jansson, Gijs van den Oord, Inti Pelupessy, Johanna H. Grönqvist, A. Pier Siebesma, Daan Crommelin, Under review (2018)]
Gijs van den Oord, Fredrik Jansson, Inti Pelupessy, Maria Chertova, Pier Siebesma and Daan Crommelin
396 MaMiCo: Parallel Noise Reduction for Multi-Instance Molecular-Continuum Flow Simulation [abstract]
Abstract: Transient molecular-continuum coupled flow simulations often suffer from high thermal noise, created by fluctuating hydrodynamics within the molecular dynamics (MD) simulation. Multi-instance MD computations are an approach to extract smooth flow field quantities on rather short time scales, but they require a huge amount of computational resources. Filtering particle data using signal processing methods to reduce numerical noise can significantly reduce the number of instances necessary. This leads to improved stability and reduced computational cost in the molecular-continuum setting. We extend the Macro-Micro-Coupling tool (MaMiCo) - a software framework that couples arbitrary continuum and MD solvers - by a new parallel interface for universal MD data analytics and post-processing, especially for noise reduction. It is designed modularly and compatible with multi-instance sampling. We present a Proper Orthogonal Decomposition (POD) implementation of the interface, capable of massively parallel noise filtering. The resulting coupled simulation is validated using a three-dimensional Couette flow scenario. We quantify the denoising, conduct performance benchmarks and scaling tests on a supercomputing platform. We thus demonstrate that the new interface enables massively parallel data analytics and post-processing in conjunction with any MD solver coupled to MaMiCo.
Piet Jarmatz and Philipp Neumann
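The Proper Orthogonal Decomposition filter mentioned above can be sketched, in serial form, as a truncated SVD of the snapshot matrix: small singular values mostly carry thermal noise and are discarded. This is a generic illustration of the technique, not MaMiCo's parallel implementation:

```python
import numpy as np

def pod_denoise(snapshots, r):
    """Filter noisy flow-field snapshots (one column per time sample) by
    projecting onto the r leading POD modes of the mean-subtracted data."""
    mean = snapshots.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(snapshots - mean, full_matrices=False)
    # Reconstruct from the r dominant modes only.
    return mean + U[:, :r] @ (s[:r, None] * Vt[:r, :])
```

The rank `r` trades smoothing against fidelity: a low rank suppresses more fluctuation but may also remove genuine transient dynamics, which is why the number of retained modes matters in the coupled setting.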
303 A Multiscale Model of Atherosclerotic Plaque Development: toward a Coupling between an Agent-Based Model and CFD Simulations [abstract]
Abstract: Computational models have been widely used to predict the efficacy of surgical interventions in response to Peripheral Occlusive Diseases. However, most of them lack a multiscale description of the development of the disease, which we hypothesize is the key to developing an effective predictive model. Accordingly, in this work we present a multiscale computational framework that simulates the generation of atherosclerotic arterial occlusions. Starting from a healthy artery in homeostatic conditions, the perturbation of specific cellular and extracellular dynamics leads to the development of the pathology, with the final output being a diseased artery. The presented model was developed on an idealized portion of a Superficial Femoral Artery (SFA), where an Agent-Based Model (ABM), locally replicating the plaque development, was coupled to Computational Fluid Dynamics (CFD) simulations that define the Wall Shear Stress (WSS) profile at the lumen interface. The ABM was qualitatively validated on histological images, and a preliminary analysis of the coupling method was conducted. Once the coupling method is optimized, the presented model can serve as a predictive platform to improve the outcome of surgical interventions such as angioplasty and stent deployment.
Anna Corti, Stefano Casarin, Claudio Chiastra, Monika Colombo, Francesco Migliavacca and Marc Garbey
215 Mesoscopic simulation of droplet coalescence in fibrous porous media [abstract]
Abstract: Flow phenomena in porous media are relevant in many industrial applications including fabric filters, gas diffusion membranes, and biomedical implants. For instance, nonwoven membranes can be used as filtration media with tailored permeability range and controllable pore size distribution. However, predicting the structure-property relations that arise from specific porous microstructures remains a challenging task. Theoretical approaches have been limited to simple geometries and can often only predict the general trend of experimental data. Computer simulations are a cost-effective way of validating semi-empirical relations and predicting the precise relations between macroscopic transport properties and microscopic pore structure. To this end, multiscale simulation techniques have proven particularly successful in solving numerically the coupled partial differential equations for the complex boundary conditions in porous media. In this talk, I will present simulations of multiphase flow in fibrous porous media based on a multiphase lattice Boltzmann model for water droplets in oil. We study the effect of fibrous structures and their surface properties on the coalescence behavior of water droplets. We will discuss how the insights can be used to design optimized materials for diesel fuel filters and other filtration devices.
Fang Wang and Ulf D. Schiller
382 Computational Analysis of Pulsed Radiofrequency Ablation in Treating Chronic Pain [abstract]
Abstract: In this paper, a parametric study has been conducted to evaluate the effects of the frequency and duration of short burst pulses during pulsed radiofrequency ablation (RFA) in treating chronic pain. Affecting the brain and nervous system, this condition remains one of the major challenges in neuroscience and clinical practice. A two-dimensional axisymmetric RFA model has been developed in which a single-needle radiofrequency electrode has been inserted. A finite-element-based coupled thermo-electric analysis has been carried out utilizing the simplified Maxwell’s equations and the Pennes bioheat transfer equation to compute the electric field and temperature distributions within the computational domain. Comparative studies have been carried out between continuous and pulsed RFA to highlight the significance of pulsed RFA in chronic pain treatment. The frequencies and durations of the short burst RF pulses have been varied from 1 Hz to 10 Hz and from 10 ms to 50 ms, respectively; such values are most commonly applied in clinical practice for the mitigation of chronic pain. By reporting critical output characteristics, such as the temperature distributions for different frequencies and durations of the RF pulses, this computational study aims to provide clinicians with first-hand, accurate quantitative information on the possible consequences of varying these characteristics during the pulsed RFA procedure. The results demonstrate that the efficacy of pulsed RFA is significantly dependent on the duration and frequency of the RF pulses.
Sundeep Singh and Roderick Melnik
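The two governing equations named in the abstract take, in their commonly used textbook forms, the shape below; the symbols are the standard ones and are quoted here as general reference, not taken from the paper itself:

```latex
% Quasi-static electric field (simplified Maxwell's equations):
\nabla \cdot \left( \sigma \nabla V \right) = 0

% Pennes bioheat transfer equation, with Joule heating as the RF source term:
\rho c \frac{\partial T}{\partial t} =
    \nabla \cdot \left( k \nabla T \right)
    + \rho_b c_b \omega_b \left( T_b - T \right)
    + Q_m
    + \sigma \lvert \nabla V \rvert^{2}
```

Here $\sigma$ is the electrical conductivity and $V$ the voltage; $\rho$, $c$, $k$ are tissue density, specific heat, and thermal conductivity; $\rho_b$, $c_b$, $\omega_b$, $T_b$ describe blood perfusion; and $Q_m$ is metabolic heat generation. The term $\sigma\lvert\nabla V\rvert^{2}$ is what couples the electric and thermal problems in such analyses.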

Solving Problems with Uncertainties (SPU) Session 1

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 0.6

Chair: Vassil Alexandrov

14 Path-Finding with a Full-Vectorized GPU Implementation of Evolutionary Algorithms in an Online Crowd Model Simulation Framework [abstract]
Abstract: This article introduces a path-finding method based on evolutionary algorithms. It extends current work on this problem by providing a path-finding algorithm and a parallel (GPU-based) implementation of it. The article describes both the GPU implementation of fully vectorized genetic algorithms and a path-finding method for large maps based on dynamic tiling. The approach is able to serve a large number of agents due to its performance and can handle dynamic obstacles in maps of arbitrary size. The experiments show the proposed approach outperforms traditional path-finding algorithms, such as breadth-first search, Dijkstra’s algorithm, and A*. The conclusions discuss further possible improvements to the proposed approach, such as applying multi-objective algorithms to represent full crowd models.
Anton Aguilar-Rivera
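As a rough illustration of what "full vectorization" of a genetic algorithm means, the sketch below (a minimal NumPy stand-in for the paper's GPU kernels; all names, operators and parameters are our own, not the authors') evaluates fitness, selection, crossover and mutation for an entire population of candidate paths at once, with no per-individual Python loop:

```python
import numpy as np

rng = np.random.default_rng(0)

POP, WAYPOINTS = 64, 8
start, goal = np.array([0.0, 0.0]), np.array([9.0, 9.0])

# Population tensor: (POP, WAYPOINTS, 2) intermediate 2D waypoints per path.
pop = rng.uniform(0, 10, size=(POP, WAYPOINTS, 2))

def fitness(pop):
    """Vectorized path length start -> waypoints -> goal for all individuals."""
    full = np.concatenate([np.broadcast_to(start, (len(pop), 1, 2)),
                           pop,
                           np.broadcast_to(goal, (len(pop), 1, 2))], axis=1)
    seg = np.diff(full, axis=1)                      # (POP, WAYPOINTS+1, 2)
    return np.linalg.norm(seg, axis=2).sum(axis=1)   # (POP,)

def evolve(pop, n_gen=50, mut_sigma=0.3):
    for _ in range(n_gen):
        f = fitness(pop)
        # Tournament selection, vectorized over the whole population.
        a, b = rng.integers(0, len(pop), (2, len(pop)))
        parents = np.where((f[a] < f[b])[:, None, None], pop[a], pop[b])
        # One-point crossover along the waypoint axis.
        cut = rng.integers(1, WAYPOINTS, len(pop))
        mask = np.arange(WAYPOINTS)[None, :] < cut[:, None]
        children = np.where(mask[:, :, None], parents, parents[::-1])
        # Gaussian mutation of every waypoint at once.
        children = children + rng.normal(0, mut_sigma, children.shape)
        pop = children
    return pop

best = fitness(evolve(pop)).min()
```

On a GPU, each of these array operations maps to one kernel launch over the population, which is the source of the speedup over per-agent scalar code.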
44 Analysing the trade-off between computational performance and representation richness in ontology-based systems [abstract]
Abstract: As the result of the intense research activity of the past decade, Semantic Web technology has achieved notable popularity and maturity. This technology is leading the evolution of the Web via interoperability by providing structured metadata. Because rich data models have been adopted on a large scale to support the representation of complex relationships among concepts and automatic reasoning, the computational performance of ontology-based systems can vary significantly. A number of critical factors should be considered when evaluating such performance. In this paper, we provide an empirical framework that yields an extensive analysis of the computational performance of ontology-based systems. The analysis can be seen as a decision tool in managing the constraints of representational requirements versus reasoning performance. Our approach adopts synthetic ontologies characterised by an increasing level of complexity, up to OWL 2 DL. The benefits and the limitations of this approach are discussed in the paper.
Salvatore Flavio Pileggi, Fabian Peña, Maria Del Pilar Villamil and Ghassan Beydoun
92 Assessing uncertainties of unrepresented heterogeneity in soil hydrology using data assimilation [abstract]
Abstract: Soil hydrology is a discipline of environmental physics exhibiting considerable model errors in all its processes. Soil water movement is a key ecosystem process, with a crucial role in services like water buffering, fresh water retention, and climate regulation. The soil hydraulic properties as well as the multi-scale soil architecture are hardly ever known with sufficient accuracy. In interplay with a highly non-linear process described by the Richards equation, this yields significant prediction uncertainties. Data assimilation is a recent approach to cope with the challenges of quantitative soil hydrology. The ensemble Kalman filter (EnKF) is a method that allows handling model errors for non-linear processes. This enables estimation of the system state and trajectory, soil hydraulic parameters, and small-scale soil heterogeneities at measurement locations. Uncertainties in all estimated compartments can be incorporated and quantified. However, as measurements are typically scarce, estimation of high-resolution heterogeneity fields remains challenging. Relevant spatial scales for soil water movement range from less than a meter to kilometers. Accurately representing soil heterogeneities in models at all scales is exceptionally difficult. We investigate this issue on the small scale, where we model a two-dimensional domain with prescribed heterogeneity and conduct synthetic observations in typical measurement configurations. The EnKF is applied to estimate a one-dimensional soil profile including heterogeneities. We assess the capability of the method to cope with the effects of unrepresented heterogeneity by analyzing the discrepancy between the synthetic 2D and the estimated 1D representation.
Lukas Riedel, Hannes Helmut Bauser and Kurt Roth
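The highly non-linear process mentioned in the abstract is governed by the Richards equation; in its mixed form it reads:

```latex
\frac{\partial \theta(h_m)}{\partial t}
  = \nabla \cdot \left[ K(\theta) \left( \nabla h_m + \hat{e}_z \right) \right],
```

where $\theta$ is the volumetric water content, $h_m$ the matric head, $K(\theta)$ the hydraulic conductivity, and $\hat{e}_z$ the unit vector accounting for gravity (sign conventions vary between texts).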
119 A Framework for Distributed Approximation of Moments with Higher-Order Derivatives through Automatic Differentiation [abstract]
Abstract: We present a framework for the distributed approximation of moments, enabling an online evaluation of the uncertainty in a dynamical system. The first and second moments, mean and variance, are computed with up to third-order Taylor series expansion. The derivatives required for the expansion are generated by automatic differentiation and propagated through an implicit time stepper. The computational kernels are the accumulation of the derivatives (Jacobian, Hessian, third-order tensor) and of the covariance matrix. We apply distributed parallelism to the Hessian or third-order tensor, and the user merely has to provide a function for the differential equation, thus achieving ease of use similar to Monte Carlo-based methods. We demonstrate our approach with benchmarks on Theta, a KNL-based system at the Argonne Leadership Computing Facility.
Michel Schanen, Daniel Adrian Maldonado and Mihai Anitescu
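The second-order part of such an expansion can be sketched concretely: for x ~ N(mu, Sigma) and a scalar model f, E[f] ≈ f(mu) + ½ tr(H Sigma) and Var[f] ≈ ∇f Sigma ∇fᵀ + ½ tr((H Sigma)²). The sketch below substitutes finite differences for the automatic differentiation used in the paper, and all names are illustrative:

```python
import numpy as np

def grad_hess(f, x, eps=1e-5):
    """Numerical gradient and Hessian of scalar f at x (a stand-in for
    the automatic differentiation used in the paper)."""
    n = len(x)
    g = np.zeros(n)
    H = np.zeros((n, n))
    for i in range(n):
        e_i = np.zeros(n); e_i[i] = eps
        g[i] = (f(x + e_i) - f(x - e_i)) / (2 * eps)
        for j in range(n):
            e_j = np.zeros(n); e_j[j] = eps
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps**2)
    return g, H

def taylor_moments(f, mu, Sigma):
    """Second-order Taylor approximation of E[f(x)] and Var[f(x)]
    for x ~ N(mu, Sigma)."""
    g, H = grad_hess(f, mu)
    HS = H @ Sigma
    mean = f(mu) + 0.5 * np.trace(HS)
    var = g @ Sigma @ g + 0.5 * np.trace(HS @ HS)
    return mean, var

# Quadratic test case, where the second-order expansion is exact:
# f(x) = x0^2 + x1 with x ~ N(0, I) gives E[f] = 1 and Var[f] = 3.
f = lambda x: x[0]**2 + x[1]
mean, var = taylor_moments(f, np.zeros(2), np.eye(2))
```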
191 IPIES for Uncertainly Defined Shape of Boundary, Boundary Conditions and Other Parameters in Elasticity Problems [abstract]
Abstract: The main purpose of this paper is modelling and solving boundary value problems while simultaneously considering uncertainty in all input data: the shape of the boundary, the boundary conditions and other parameters. The strategy is presented on problems described by the Navier-Lamé equations, so the uncertainty of parameters here means uncertainty in the Poisson ratio and Young's modulus. For solving uncertainly defined problems we use the interval parametric integral equations system method (IPIES), in which we propose a modification of directed interval arithmetic for modelling and solving such problems. We consider examples of uncertainly defined 2D elasticity problems and present boundary value problems with linear as well as curvilinear boundary shapes (modelled using NURBS curves). We verify the obtained interval solutions by comparison with precisely defined (without uncertainty) analytical solutions. Additionally, to obtain the errors of such solutions, we use the total differential method. We also analyze the influence of input data uncertainty on the interval solutions.
Marta Kapturczak and Eugeniusz Zieniuk
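Interval methods such as IPIES propagate parameter uncertainty as [lo, hi] bounds through every arithmetic operation. The sketch below shows plain classical interval arithmetic, not the modified directed interval arithmetic proposed in the paper, applied to an illustrative Lamé parameter computation (all numeric values are purely illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        p = (self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi)
        return Interval(min(p), max(p))

# Uncertain material parameters as intervals: Young's modulus E (+/- 2%)
# and Poisson ratio nu.
E  = Interval(200e9 * 0.98, 200e9 * 1.02)
nu = Interval(0.29, 0.31)

# Lamé parameter mu = E / (2 * (1 + nu)), evaluated with interval bounds
# (division handled explicitly since 1 + nu is strictly positive).
one_plus_nu = Interval(1 + nu.lo, 1 + nu.hi)
mu = Interval(E.lo / (2 * one_plus_nu.hi), E.hi / (2 * one_plus_nu.lo))
```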

Computational Optimization, Modelling and Simulation (COMS) Session 2

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 1.4

Chair: Xin-She Yang

149 Fully-Asynchronous Cache-Efficient Simulation of Detailed Neural Networks [abstract]
Abstract: Modern asynchronous runtime systems allow the re-thinking of large-scale scientific applications. With the example of a simulator of morphologically detailed neural networks, we show how detaching from the commonly used bulk-synchronous parallel (BSP) execution allows for increased prefetching capabilities, better cache locality, and an overlap of computation and communication, consequently leading to a lower time to solution. Our strategy removes the collective synchronization of the ODEs’ coupling information and takes advantage of the pairwise time dependency between equations, leading to a fully-asynchronous, exhaustive yet not speculative stepping model. Combined with fully linear data structures, communication reduction at the compute-node level, and an earliest-equation-steps-first scheduler, we achieve an acceleration at the cache level that reduces communication and time to solution by maximizing the number of timesteps taken per neuron at each iteration. Our methods were implemented on the core kernel of the NEURON scientific application. Asynchronicity and a distributed memory space are provided by the HPX runtime system for the ParalleX execution model. Benchmark results demonstrate a superlinear speedup that leads to a reduced runtime compared to bulk-synchronous execution, yielding a speedup between 25% and 65% across different compute architectures, and on the order of 15% to 40% for distributed executions.
Bruno Magalhaes, Thomas Sterling, Michael Hines and Felix Schuermann
441 Application of the model with non-Gaussian linear scalar filters to determine life expectancy, taking into account the cause of death [abstract]
Abstract: It is widely known that the worldwide development of civilization diseases (especially in the second half of the twentieth century) is the cause of the increase in mortality from causes other than natural death. In Poland, the most common causes of death, for both women and men, include cancer and cardiovascular disease. The aim of the article is to propose a method of modelling the life expectancy index based on a non-Gaussian linear scalar filter model built on death rates after eliminating one or both of the above causes of death. The obtained results indicate that their elimination may be expected to extend life expectancy by several or more years, depending on the cause of death and gender.
Piotr Sliwka
353 Improving ODE integration on graphics processing units by reducing thread divergence [abstract]
Abstract: Ordinary differential equations are widely used for the mathematical modeling of complex systems in biology and statistics. Since the analysis of such models needs to be performed using numerical integration, many applications can be gravely limited by the computational cost. This paper presents a general-purpose integrator that runs massively parallel on graphics processing units. By minimizing thread divergence and bundling similar tasks using linear regression, execution time can be reduced by 40-80% compared to a naive GPU implementation. Compared to a 36-core CPU implementation, a 150-fold runtime improvement is measured.
Thomas Kovac, Tom Haber, Frank Van Reeth and Niel Hens
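The bundling idea can be sketched minimally (NumPy on the CPU as a stand-in for the GPU kernels; the regression-based bundling itself is omitted): once similar ODE instances are grouped, a single RK4 step is applied to the whole batch with identical instructions per element, which is precisely what avoids thread divergence:

```python
import numpy as np

def rk4_batch(f, y0, t0, t1, n_steps):
    """Classic RK4 advanced over a whole batch of ODE instances at once;
    every instance executes the same instruction sequence, so on a GPU
    no thread in the bundle diverges."""
    h = (t1 - t0) / n_steps
    y, t = y0.copy(), t0
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return y

# A batch of exponential-decay problems y' = -lam * y, one decay rate
# per "thread" (values illustrative).
lam = np.linspace(0.5, 2.0, 1024)
y0 = np.ones_like(lam)
y = rk4_batch(lambda t, y: -lam * y, y0, 0.0, 1.0, 100)
# y[i] approximates exp(-lam[i]) at t = 1.
```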
143 Data Compression for Optimization of Molecular Dynamics System: Preserving Basins of Attraction [abstract]
Abstract: Understanding the evolution of atomistic systems is essential in various fields such as materials science, biology, and chemistry. The gold standard for these calculations is molecular dynamics, which simulates the dynamical interaction between pairs of molecules. The main challenge of such simulation is the numerical complexity, given a vast number of atoms over a long time scale. Furthermore, such systems often contain exponentially many optimal states, and the simulation tends to get trapped in local configurations. Recent developments leverage the existing temporal evolution of the system to improve the stability and scalability of the method; however, they suffer from large data storage requirements. To efficiently compress the data while retaining the basins of attraction, we have developed a framework to determine the acceptable level of compression for an optimization method by application of a Kantorovich-type theorem, using binary digit rounding as our compression technique. Choosing the Lennard-Jones potential function as a model problem, we present a method for determining the local Lipschitz constant of the Hessian with low computational cost, thus allowing the use of our technique in real-time computation.
Michael Retzlaff, Todd Munson and Zichao Di
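Binary-digit rounding, the compression technique named in the abstract, simply zeroes low-order mantissa bits of each stored float. A minimal sketch with the Lennard-Jones pair potential (parameters and helper names are illustrative, not the authors' code):

```python
import numpy as np

def lennard_jones(r, eps=1.0, sigma=1.0):
    """Lennard-Jones pair potential V(r) = 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 * sr6 - sr6)

def round_mantissa(x, keep_bits):
    """Binary-digit rounding: keep only `keep_bits` of the 52 mantissa bits
    of each float64 value, truncating the rest toward zero."""
    bits = np.asarray(x, dtype=np.float64).view(np.uint64)
    mask = np.uint64(~((1 << (52 - keep_bits)) - 1) & 0xFFFFFFFFFFFFFFFF)
    return (bits & mask).view(np.float64)

r = np.linspace(0.9, 2.5, 100)
v = lennard_jones(r)
v8 = round_mantissa(v, 8)   # keep 8 mantissa bits per value
# Relative truncation error is bounded by 2**-keep_bits.
rel_err = np.abs(v8 - v) / np.abs(v)
```

Storing only the kept bits shrinks the state while keeping each value inside a small, known relative neighborhood of the original, which is what a Kantorovich-type argument about preserved basins of attraction can work with.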
519 An algorithm to perform hydraulic tomography based on a mixture model [abstract]
Abstract: Hydraulic Tomography (HT) has become one of the most sophisticated methods to characterize aquifer heterogeneity, and in some experiments it has proved to be an accurate technique; to achieve this accuracy, however, several pumping/injection tests must be performed, with enough measurements at each test. Moreover, during the solution of the inverse problem the groundwater flow equation is solved numerically many times, so the computational time can be very large, especially when a 3D or transient model is used. In this work we present a new approach to model aquifer heterogeneity based on a Gaussian Mixture Model. The proposed approach improves the computation time and accuracy of the HT experiment, and it also addresses problems involved in the inverse problem, such as the effect of noisy data, the need for many pumping/injection tests, and the lack of resolution when the distribution of the aquifer conductivity does not correspond to a Gaussian distribution. In synthetic experiments this approach achieved one fifth of the error in the estimation of the conductivity field compared to one of the most widely used inversion methods for HT (SSLE/VSAFT), in one fourth of the computation time. In a steady-state sandbox experiment, it detected the main conductivity layers in one fourth of the computational time of VSAFT, including a layer that VSAFT only detected when a transient model was used to perform the HT.
Carlos Minutti, Walter Illman and Susana Gomez

Smart Systems: Bringing Together Computer Vision, Sensor Networks and Machine Learning (SmartSys) Session 3

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 2.26

Chair: João Rodrigues

157 Towards Low-Cost Indoor Localisation Using a Multi-camera System [abstract]
Abstract: Indoor localisation is a fundamental problem in robotics, which has been the subject of several research works over the last few years. Indeed, while solutions based on the fusion of inertial and global navigation satellite system (GNSS) measurements have proved their efficiency in outdoor environments, indoor localisation remains an open research problem. Although commercial motion-tracking systems can offer very accurate position estimation, their high cost cannot be afforded by all research laboratories. This paper presents an indoor localisation solution based on a multi-camera setup. The proposed system relies on low-cost sensors, which makes it very affordable compared to commercial motion-tracking systems. We show through the conducted experiments that the proposed approach, although cheap, can provide real-time position measurements with an error of less than 2 cm up to a distance of 2 m.
Oualid Araar, Bouhired Saadi, Sami Moussiou and Ali Laggoune
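A multi-camera setup of this kind typically recovers 3D position by triangulating the same target observed from two or more calibrated cameras. A minimal linear (DLT) triangulation sketch with entirely synthetic camera matrices (not the paper's calibration or pipeline):

```python
import numpy as np

def triangulate(P1, P2, u1, u2):
    """Linear (DLT) triangulation of one 3D point from two pixel
    observations u1, u2 and 3x4 camera projection matrices P1, P2."""
    A = np.vstack([
        u1[0] * P1[2] - P1[0],
        u1[1] * P1[2] - P1[1],
        u2[0] * P2[2] - P2[0],
        u2[1] * P2[2] - P2[1],
    ])
    # The 3D point is the null vector of A (last right singular vector).
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

# Two synthetic cameras: identity intrinsics, second camera shifted
# 1 m along x (illustrative values).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.3, -0.2, 4.0])
project = lambda P, X: (P @ np.append(X, 1.0))[:2] / (P @ np.append(X, 1.0))[2]
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With noiseless observations the DLT solution is exact; with real pixel noise, more cameras and a non-linear refinement step reduce the error.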
169 A New Shape Descriptor and Segmentation Algorithm for Automated Classifying of Multiple-Morphological Filamentous Algae [abstract]
Abstract: In our previous work on an automated microalgae classification system we proposed a multi-resolution image segmentation that copes well with unclear boundaries of algae bodies and noisy backgrounds, since image segmentation is the most important preprocessing step in object classification and recognition. The previously proposed approach was able to classify twelve genera of microalgae successfully; however, when we extended it to work with new genera of filamentous algae, new challenging problems were encountered. These difficulties arise from the variety of forms of filamentous algae, which complicates both the image segmentation and classification processes, resulting in substantial degradation of classification accuracy. Thus, in this work we propose a modified version of our multi-resolution segmentation algorithm, combining two segmentation algorithms in such a way that the strengths of each complement the other's weaknesses. We also propose a new skeleton-based shape descriptor to alleviate the ambiguity caused by the multiple morphologies of filamentous forms of algae in the classification process. The effectiveness of the two proposed approaches is evaluated on five genera of filamentous microalgae, with SMO used as the classifier. An experimental classification accuracy of 91.30% demonstrates a significant improvement by our proposed approaches.
Saowanee Iamsiri, Nuttha Sanevas, Chakrit Watcharopas and Pakaket Wattuya
393 Application of hierarchical clustering for object tracking with a Dynamic Vision Sensor [abstract]
Abstract: Monitoring public space with imaging sensors to perform object or person tracking is often associated with privacy concerns. We present a Dynamic Vision Sensor (DVS) based approach to achieve this tracking that does not require the creation of conventional grey-scale or colour images. These Dynamic Vision Sensors produce an event stream of information which only includes the changes in the scene. The presented tracking approach considers the scenario of fixed-mounted sensors. The method is based on clustering events and tracing the resulting cluster centers to accomplish the object tracking. We show the usability of this approach with a first proof-of-concept test.
Tobias Bolten, Regina Pohle-Fröhlich and Klaus D. Tönnies
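The clustering-and-center-tracing step can be illustrated with a small sketch. The paper uses hierarchical clustering; below, single-linkage clustering (a hierarchical method cut at a fixed distance threshold) groups synthetic 2D event coordinates and reports cluster centers, which would then be traced over time; all parameters are illustrative:

```python
import numpy as np

def cluster_events(xy, radius):
    """Single-linkage clustering of 2D event coordinates: events closer
    than `radius` end up in the same cluster (union-find over the
    pairwise distance graph)."""
    n = len(xy)
    parent = list(range(n))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    for i, j in zip(*np.nonzero(d < radius)):
        if i < j:
            parent[find(i)] = find(j)
    labels = np.array([find(i) for i in range(n)])
    # Cluster centers serve as the tracked object positions.
    centers = {l: xy[labels == l].mean(axis=0) for l in np.unique(labels)}
    return labels, centers

# Two synthetic event bursts, e.g. two moving objects seen by the DVS.
rng = np.random.default_rng(0)
a = rng.normal([10, 10], 0.5, size=(50, 2))
b = rng.normal([40, 25], 0.5, size=(50, 2))
labels, centers = cluster_events(np.vstack([a, b]), radius=3.0)
```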
473 Binarization of Degraded Document Images with Generalized Gaussian Distribution [abstract]
Abstract: One of the most crucial preprocessing steps for document images subjected to further text recognition is their binarization, which significantly influences the obtained OCR results. Since for degraded images, particularly historical documents, classical global and local thresholding methods may be inappropriate, their binarization remains a challenging task. In this paper a novel approach based on the Generalized Gaussian Distribution is presented for this purpose. Assuming the presence of distortions in historical document images that may be modelled using a Gaussian noise distribution, a significant similarity of their histograms to those obtained for binary images corrupted by Gaussian noise may be observed. Therefore, by extracting the parameters of the Generalized Gaussian Distribution, the distortions may be modelled and removed, enhancing the quality of the input data for further thresholding and text recognition. Due to the relatively long processing time, its reduction using the Monte Carlo method is proposed as well. The proposed algorithm has been verified using the well-known DIBCO datasets, leading to very promising binarization results.
Robert Krupiński, Piotr Lech, Mateusz Tecław and Krzysztof Okarma
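The Generalized Gaussian Distribution referred to above has density proportional to exp(-(|x-mu|/alpha)^beta), with beta = 2 recovering the normal distribution and beta = 1 the Laplacian. A common way to extract the shape parameter beta is moment matching via Mallat's ratio (E|x|)^2 / E[x^2]; the sketch below (illustrative, not the authors' estimator) fits beta this way:

```python
import math
import numpy as np

def ggd_pdf(x, mu, alpha, beta):
    """Generalized Gaussian density with location mu, scale alpha,
    shape beta (beta = 2 gives the normal distribution)."""
    c = beta / (2 * alpha * math.gamma(1 / beta))
    return c * np.exp(-(np.abs(x - mu) / alpha) ** beta)

def estimate_shape(x):
    """Moment-matching shape estimate: find beta whose theoretical
    Mallat ratio Gamma(2/b)^2 / (Gamma(1/b)*Gamma(3/b)) matches the
    sample ratio (E|x|)^2 / E[x^2]."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    r = np.mean(np.abs(x)) ** 2 / np.mean(x ** 2)
    betas = np.linspace(0.3, 5.0, 4701)
    ratios = np.array([math.gamma(2 / b) ** 2
                       / (math.gamma(1 / b) * math.gamma(3 / b))
                       for b in betas])
    return betas[np.argmin(np.abs(ratios - r))]

rng = np.random.default_rng(1)
beta_normal = estimate_shape(rng.normal(size=200000))   # true shape: 2
```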
256 Nonlinear dimensionality reduction in texture classification: is manifold learning better than PCA? [abstract]
Abstract: This paper presents a comparative analysis of algorithms belonging to manifold learning and linear dimensionality reduction. We first combine classical texture image descriptors, namely Gray-Level Co-occurrence Matrix features, Haralick features, Histogram of Oriented Gradients features and Local Binary Patterns, to characterize and discriminate textures. For patches extracted from several texture images, we perform a concatenation of the image descriptors. Dimensionality reduction is then achieved using four algorithms, namely PCA, LLE, ISOMAP and Laplacian Eigenmaps. The resulting learned features are used to train four different classifiers: k-nearest neighbors, naive Bayes, decision tree and multilayer perceptron. Finally, a non-parametric statistical hypothesis test, the Wilcoxon signed-rank test, is used to determine whether or not manifold learning algorithms perform better than PCA. Computational experiments were conducted using the Outex and Salzburg datasets, and the obtained results show that among the twelve comparisons carried out, PCA presented better results than ISOMAP, LLE and Laplacian Eigenmaps in three comparisons. The remaining nine comparisons did not present significant differences, indicating that in the presence of huge collections of texture images (bigger databases) the combination of image feature descriptors, or patches extracted directly from raw image data, with manifold learning techniques is potentially able to improve texture classification.
Cedrick Bamba and Alexandre Levada
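The linear baseline in this comparison, PCA, can be sketched in a few lines via SVD of the centered descriptor matrix; the manifold methods (LLE, ISOMAP, Laplacian Eigenmaps) would replace this linear projection with a neighborhood-graph embedding. Data and dimensions below are synthetic stand-ins for the concatenated texture descriptors:

```python
import numpy as np

def pca(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T, Vt[:n_components]

# Toy "texture descriptors": 200 samples of 50 concatenated features
# that actually lie near a 3-dimensional linear subspace plus noise.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))
W = rng.normal(size=(3, 50))
X = Z @ W + 0.01 * rng.normal(size=(200, 50))

Y, components = pca(X, 3)
# Y carries almost all the variance of X in only 3 dimensions.
```

When the descriptors lie near a *non-linear* manifold instead of a linear subspace, this projection loses structure, which is exactly the situation the paper's statistical comparison probes.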