Session 7: 13:25 - 15:05 on 14th June 2017

ICCS 2017 Main Track (MT) Session 7

Time and Date: 13:25 - 15:05 on 14th June 2017

Room: HG F 30

Chair: Ming Xu

424 Efficient Simulation of Financial Stress Testing Scenarios with Suppes-Bayes Causal Networks [abstract]
Abstract: The most recent financial upheavals have cast doubt on the adequacy of some of the conventional quantitative risk management strategies, such as VaR (Value at Risk), in many common situations. Consequently, there has been an increasing need for verisimilar financial stress testing, namely simulating and analyzing financial portfolios in extreme, albeit rare, scenarios. Unlike conventional risk management, which exploits statistical correlations among financial instruments, here we focus our analysis on the notion of probabilistic causation, which is embodied by Suppes-Bayes Causal Networks (SBCNs). SBCNs are probabilistic graphical models with many attractive features in terms of more accurate causal analysis for generating financial stress scenarios. In this paper, we present a novel approach for conducting stress testing of financial portfolios based on SBCNs in combination with classical machine learning classification tools. The resulting method is shown to be capable of correctly discovering the causal relationships among the financial factors that affect the portfolios and, thus, of simulating stress-testing scenarios with higher accuracy and lower computational complexity than conventional Monte Carlo simulations.
Gelin Gao, Bud Mishra and Daniele Ramazzotti
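
As an illustrative aside (not the authors' code), the probability-raising condition at the heart of Suppes' probabilistic causation, namely P(e|c) > P(e|not-c) with the cause observed before the effect, can be sketched in a few lines of Python; the toy shock/loss data below are invented for demonstration:

    import numpy as np

    def prima_facie(cause, effect):
        """Suppes' probability-raising test P(e|c) > P(e|~c) for two
        aligned binary event series (cause observed before effect)."""
        cause, effect = np.asarray(cause, bool), np.asarray(effect, bool)
        return effect[cause].mean() > effect[~cause].mean()

    # Toy example: a market shock that tends to precede portfolio losses
    rng = np.random.default_rng(0)
    shock = rng.random(1000) < 0.3
    loss = (shock & (rng.random(1000) < 0.8)) | (rng.random(1000) < 0.1)
    print(prima_facie(shock, loss))  # True: the shock raises loss probability
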
531 Simultaneous Prediction of Wind Speed and Direction by Evolutionary Fuzzy Rule Forest [abstract]
Abstract: An accurate estimate of wind speed and direction is important for many application domains, including weather prediction, smart grids, and traffic management. These two environmental variables depend on a number of factors and are linked together. Evolutionary fuzzy rules, based on fuzzy information retrieval and genetic programming, have been used to solve a variety of real-world regression and classification tasks. They were, however, limited to estimating only one variable per model. In this work, we introduce an extended version of this predictor that facilitates the artificial evolution of forests of fuzzy rules. In this way, multiple variables can be predicted by a single model that is able to capture complex relations between input and output variables. The usefulness of the proposed concept is demonstrated by the evolution of forests of fuzzy rules for simultaneous wind speed and direction prediction.
Pavel Kromer and Jan Platos
557 Performance Improvement of Stencil Computations for Multi-core Architectures based on Machine Learning [abstract]
Abstract: Stencil computations are the basis for solving many problems related to Partial Differential Equations (PDEs). Obtaining the best performance with such numerical kernels is a major issue, as many critical parameters (architectural features, compiler flags, memory policies, multithreading strategies) must be finely tuned. In this context, auto-tuning methods have been used extensively in recent years to improve overall performance. However, the complexity of current architectures and the large number of optimizations to consider reduce the efficiency of this approach. This paper focuses on the use of Machine Learning to predict the performance of PDE kernels on multicore architectures. Low-level hardware counters (e.g. cache misses and TLB misses) from a limited number of executions are used to build our predictive model. We consider two different kernels (7-point Jacobi and a seismic equation) to demonstrate the effectiveness of our approach. Our results show that the performance can be predicted, and the best input configuration for stencil problems can be obtained, from simulations of hardware counters and performance measurements.
Victor Martinez, Fabrice Dupros, Márcio Castro and Philippe Navaux
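
A minimal NumPy sketch of a 7-point Jacobi sweep, the first of the two kernels studied (the grid size, source term, and iteration count below are arbitrary choices, not the paper's setup):

    import numpy as np

    def jacobi_7pt(u, f, h2, iters):
        """7-point Jacobi: each interior point becomes the average of its
        six neighbours minus the scaled source term."""
        for _ in range(iters):
            u_new = u.copy()
            u_new[1:-1, 1:-1, 1:-1] = (
                u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1] +
                u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1] +
                u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:] -
                h2 * f[1:-1, 1:-1, 1:-1]) / 6.0
            u = u_new
        return u

    u = jacobi_7pt(np.zeros((64, 64, 64)), np.ones((64, 64, 64)),
                   h2=(1.0 / 63) ** 2, iters=10)
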
321 Distributed training strategies for a computer vision deep learning algorithm on a distributed GPU cluster [abstract]
Abstract: Deep learning algorithms base their success on building high learning capacity models with millions of parameters that are tuned in a data-driven fashion. These models are trained by processing millions of examples, so that the development of more accurate algorithms is usually limited by the throughput of the computing devices on which they are trained. In this work, we explore how the training of a state-of-the-art neural network for computer vision can be parallelized on a distributed GPU cluster. The effect of distributing the training process is addressed from two different points of view. First, the scalability of the task and its performance in the distributed setting are analyzed. Second, the impact of distributed training methods on the final accuracy of the models is studied.
Víctor Campos, Francesc Sastre, Maurici Yagües, Míriam Bellver, Xavier Giró-I-Nieto and Jordi Torres
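
The heart of synchronous data-parallel training, where every worker computes a gradient on its own data shard and the gradients are averaged before a common update, can be sketched without any GPU machinery; this toy linear-regression example is only a stand-in for the paper's distributed setup:

    import numpy as np

    def data_parallel_step(w, grads, lr):
        """Average the per-worker gradients (the all-reduce) and apply
        the same SGD update on every replica."""
        return w - lr * np.mean(grads, axis=0)

    rng = np.random.default_rng(1)
    X = rng.normal(size=(4, 32, 10))      # 4 "GPUs", 32 samples each
    y = rng.normal(size=(4, 32))
    w = np.zeros(10)
    for _ in range(100):
        grads = [2 * Xi.T @ (Xi @ w - yi) / len(yi) for Xi, yi in zip(X, y)]
        w = data_parallel_step(w, grads, lr=0.05)
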

ICCS 2017 Main Track (MT) Session 14

Time and Date: 13:25 - 15:05 on 14th June 2017

Room: HG D 1.1

Chair: Jose A. Belloch

530 A Multithreaded Algorithm for Sparse Cholesky Factorization [abstract]
Abstract: We present a multithreaded method for supernodal sparse Cholesky factorization on a hybrid platform consisting of a multicore CPU and a GPU. Our algorithm can exploit concurrency at different levels of the elimination tree by using multiple threads on both the CPU and the GPU. By factorizing multiple matrices in a batch, our algorithm achieves better performance than previous implementations. Our experimental results, on a platform consisting of an Intel multicore processor and an Nvidia GPU, indicate a significant improvement in performance over the single-threaded supernodal algorithm.
Meng Tang, Mohamed Gadou and Sanjay Ranka
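
A sketch of the elimination-tree concurrency the abstract mentions: nodes at the same level of the tree are mutually independent, so each level can be dispatched as a batch of concurrent tasks (the small tree and placeholder factorization below are assumptions for illustration):

    from collections import defaultdict
    from concurrent.futures import ThreadPoolExecutor

    def level_sets(parent):
        """Group elimination-tree nodes into levels; a node's level is one
        more than the maximum level of its children, so every level is a
        batch of independent nodes (parent[i] == -1 marks a root)."""
        level = [0] * len(parent)
        for child, p in enumerate(parent):   # in an etree, parent > child
            if p != -1:
                level[p] = max(level[p], level[child] + 1)
        sets = defaultdict(list)
        for node, lv in enumerate(level):
            sets[lv].append(node)
        return [sets[lv] for lv in sorted(sets)]

    def factorize(node):                     # placeholder for supernode work
        return node

    parent = [2, 2, 4, 4, -1]                # a tiny elimination tree
    with ThreadPoolExecutor() as pool:
        for batch in level_sets(parent):     # leaves first, each batch parallel
            list(pool.map(factorize, batch))
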
550 Utilizing Intel Advanced Vector Extensions for Monte Carlo Simulation based Value at Risk Computation [abstract]
Abstract: Value at Risk (VaR) is a statistical method of predicting the market risk associated with financial portfolios. There are numerous statistical models which forecast VaR, and among those, Monte Carlo Simulation is a commonly used technique with high accuracy, though it is computationally intensive. Calculating VaR in real time is becoming a need of short-term traders in present-day markets, and adapting the Monte Carlo method of VaR computation to real-time calculation poses a challenge due to the computational complexity of the simulation step. The simulation process consists of a set of independent tasks, so their sequential execution creates a performance bottleneck. By parallelizing these tasks, the time taken to calculate the VaR for a portfolio can be reduced significantly. To address this issue, we utilize the Advanced Vector Extensions (AVX) technology to parallelize the simulation process. We compared the performance of the AVX-based solution against the sequential approach as well as against a multithreaded solution and a GPU-based solution. The results showed that the AVX approach outperformed the GPU approach for up to an iteration count of 200,000. Since such a number of iterations is generally not required to obtain a sufficiently accurate VaR measure, it makes sense both computationally and economically to utilize AVX for the Monte Carlo method of VaR computation.
Nipuna Liyanage, Pubudu Fernando, Dilini Mampitiya Arachchi, Dilip Karunathilaka and Amal Perera
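
The data parallelism of the simulation step can be illustrated with NumPy's vectorized batching, the same pattern the AVX lanes exploit; the toy portfolio below is invented:

    import numpy as np

    def mc_var(mu, cov, weights, n_sims=100_000, alpha=0.99, seed=0):
        """Monte Carlo VaR: simulate all correlated one-day returns in one
        vectorized batch and read VaR off the loss distribution."""
        rng = np.random.default_rng(seed)
        returns = rng.multivariate_normal(mu, cov, size=n_sims)
        losses = -(returns @ weights)
        return np.quantile(losses, alpha)

    mu = np.array([5e-4, 3e-4, 4e-4])
    cov = np.array([[4e-4, 1e-4, 5e-5],
                    [1e-4, 3e-4, 8e-5],
                    [5e-5, 8e-5, 2e-4]])
    print(mc_var(mu, cov, weights=np.array([0.5, 0.3, 0.2])))
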
564 Sparse Local Linear Embedding [abstract]
Abstract: The Locally Linear Embedding (LLE) algorithm has proven useful for determining structure-preserving, dimension-reducing mappings of data on manifolds. We propose a modification to the LLE optimization problem that serves to minimize the number of neighbors required for the representation of each data point. The algorithm is shown to be robust over wide ranges of the sparsity parameter, producing an average number of nearest neighbors that is consistent with the best-performing parameter selection for LLE. Given that the number of non-zero weights may be substantially reduced in comparison to LLE, Sparse LLE can be applied to larger data sets. We provide three numerical examples, including a color image, the standard Swiss roll, and a gene expression data set, to illustrate the behavior of the method in comparison to LLE. The resulting algorithm produces comparatively sparse representations that preserve the neighborhood geometry of the data in the spirit of LLE.
Lori Ziegelmeier, Michael Kirby and Chris Peterson
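
For orientation, the classical LLE weight computation that Sparse LLE modifies looks roughly as follows; the regularization constant and toy data are assumptions, and the paper's sparsity penalty is not shown:

    import numpy as np

    def lle_weights(X, i, neighbours, reg=1e-3):
        """Classical LLE weights for point i: solve the local Gram system
        over its neighbours and normalize the weights to sum to one."""
        Z = X[neighbours] - X[i]                 # centre the neighbourhood
        G = Z @ Z.T
        G += reg * np.trace(G) * np.eye(len(neighbours))
        w = np.linalg.solve(G, np.ones(len(neighbours)))
        return w / w.sum()

    X = np.random.default_rng(2).normal(size=(100, 3))
    print(lle_weights(X, i=0, neighbours=[5, 17, 42, 63]))
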
148 Efficient iterative methods for multi-frequency wave propagation problems: A comparison study [abstract]
Abstract: In this paper we present a comparison study for three different iterative Krylov methods that we have recently developed for the simultaneous numerical solution of wave propagation problems at multiple frequencies. The three approaches have in common that they require the application of a single shift-and-invert preconditioner at a suitable 'seed' frequency. The focus of the present work, however, lies on the performance of the respective iterative method. We conclude with numerical examples that provide guidance concerning the suitability of the three methods.
Manuel Baumann and Martin B. van Gijzen
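
A sketch of the ingredient the three methods share, a single shift-and-invert factorization at a seed frequency reused as a preconditioner for every target frequency, using a 1-D toy operator and SciPy's GMRES (the frequencies are arbitrary):

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    n = 500                                    # 1-D Laplacian as a stand-in
    K = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csc") * n**2
    M = sp.identity(n, format="csc")
    b = np.ones(n)

    sigma = 2.0                                # 'seed' frequency
    P = spla.splu((K - sigma**2 * M).tocsc())  # factor once ...
    prec = spla.LinearOperator((n, n), matvec=P.solve)

    for w in [1.5, 2.0, 2.5]:                  # ... reuse at every frequency
        A = (K - w**2 * M).tocsc()
        x, info = spla.gmres(A, b, M=prec)
        print(w, info, np.linalg.norm(A @ x - b))
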
437 Lyapunov Function computation for systems with multiple equilibria [abstract]
Abstract: Recently a method was presented to compute Lyapunov functions for nonlinear systems with multiple local attractors. This method was shown to succeed in delivering algorithmically a Lyapunov function giving qualitative information on the system's dynamics, including lower bounds on the attractors' basins of attraction. We suggest a simpler and faster algorithm to compute such a Lyapunov function if the attractors in question are exponentially stable equilibrium points. Just as in the earlier publication one can apply the algorithm and expect to obtain partial information on the system dynamics if the assumptions on the system at hand are only partially fulfilled. We give four examples of our method applied to different dynamical systems from the literature.
Sigurdur Hafstein and Johann Bjornsson
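
For the local building block, an exponentially stable equilibrium admits a quadratic Lyapunov function V(x) = x^T P x obtained from the linearization via the Lyapunov equation; a minimal SciPy sketch with a made-up stable matrix (this is the classical local ingredient, not the authors' algorithm):

    import numpy as np
    from scipy.linalg import solve_continuous_lyapunov

    A = np.array([[-1.0, 2.0],
                  [0.0, -3.0]])              # stable linearization at an equilibrium
    Q = np.eye(2)
    P = solve_continuous_lyapunov(A.T, -Q)   # solves A^T P + P A = -Q
    print(np.linalg.eigvalsh(P))             # all positive: V(x) = x^T P x works
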

Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 2

Time and Date: 13:25 - 15:05 on 14th June 2017

Room: HG D 1.2

Chair: Mario Cannataro

485 Accelerating the Diffusion-Weighted Imaging Biomarker in the clinical practice: Comparative study [abstract]
Abstract: Diffusion-Weighted Imaging (DWI) methods (the ADC and IVIM models) extract meaningful information about the microscopic motion of water in human tissues from MRIs. This is a non-invasive method which plays an important role in the diagnosis of ischemic strokes, high-grade gliomas, and tumors. At the La Fe Polytechnic and University Hospital, the aforementioned DWI methods are used in clinical practice, with Matlab as the development tool for its out-of-the-box performance and fast prototyping. However, each image may require hours to compute due to the Matlab environment and its interpreted functions, which limits the use of these methods in clinical practice. In this paper we present three compiled versions to which different parallel paradigms based on multicore (OpenMP) and GPU (CUDA) programming are applied. These implementations reduce the computation time to less than one minute, easing their use in daily clinical practice at a low acquisition cost.
Ferran Borreguero Torro, J Damian Segrelles Quilis, Ignacio Blanquer Espert, Angel Alberich Bayarri and Luis Martí Bonmatí
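
The ADC model itself is a per-voxel closed form, S_b = S_0 exp(-b * ADC), which is what makes the OpenMP/CUDA parallelization natural; a hedged NumPy sketch on synthetic volumes:

    import numpy as np

    def adc_map(S0, Sb, b):
        """Voxel-wise ADC from the mono-exponential DWI model:
        ADC = -ln(S_b / S_0) / b."""
        with np.errstate(divide="ignore", invalid="ignore"):
            adc = -np.log(Sb / S0) / b
        return np.nan_to_num(adc)            # mask voxels with no signal

    rng = np.random.default_rng(3)
    S0 = rng.uniform(100, 1000, size=(128, 128, 30))
    b = 1000.0                               # s/mm^2
    Sb = S0 * np.exp(-b * 1e-3)              # synthetic tissue, ADC = 1e-3 mm^2/s
    print(adc_map(S0, Sb, b).mean())         # recovers ~1e-3
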
311 Combining Grid Computing and Docker Containers for the Study and Parametrization of CT Image Reconstruction Methods [abstract]
Abstract: Computed tomography (CT) is one of the most widely used methods in Medical Imaging. Despite its relevance in the diagnosis of diseases with a high impact on our society (such as cancer), it is one of the most potentially harmful modalities, since CT requires a high X-ray dose to be delivered to the patient. Solving the CT image reconstruction problem iteratively in order to approximate the solution allows working with only a subset of the input data required by direct methods. This directly implies a reduction of the radiation received by the patient and a strong reduction of the potential morbidity. We therefore aim to study the feasibility of such methods for actual application, with the purpose of concluding whether they are accurate and can obtain good-quality images with a lower X-ray dose. This paper discusses the use of containers within a Grid Computing platform to perform a thorough study of all the possible configurations and parameters of various methods being developed to reconstruct CT images iteratively, which could lead to finding the optimal configuration of the parameters. The work compares two approaches for managing the software dependencies of the code: storing the software libraries on a Storage Element, and using containers for executing the job.
Mónica Chillarón Pérez, Vicente Vidal Gimeno, J. Damià Segrelles Quilis, Ignacio Blanquer Espert and Gumersindo Verdú Martín
280 Investigation of the visual attention role in clinical bioethics decision-making using machine learning algorithms [abstract]
Abstract: This study proposes the use of a computational approach based on machine learning (ML) algorithms to build predictive models from eye tracking data. Our intention is to provide results that may support the study of the decision-making process in clinical bioethics, focusing in this work on cases of euthanasia. The data used in the approach were collected from 75 students of an undergraduate nursing course using an eye tracker. The available data were processed through feature selection methods and were later used to create models capable of predicting the euthanasia decision through ML algorithms. Statistical experiments showed that the predictive model resulting from the multilayer perceptron (MLP) algorithm led to the best performance compared with the other tested algorithms, presenting an accuracy of 90.7% and a mean area under the ROC curve of 0.90. Interesting knowledge (patterns and rules) about the studied bioethical decision-making was extracted by running simulations with the MLP models and inspecting the obtained decision-tree rules. The good performance of the obtained MLP predictive model demonstrates that the proposed investigation approach may be used to test scientific hypotheses relating visual attention and decision-making.
Daniel L. Fernandes, Rodrigo Siqueira-Batista, Andréia P. Gomes, Camila R. Souza, Israel T. Da Costa, Felippe Da S. L. Cardoso, João V. De Assis, Gustavo H. L. Caetano and Fabio R. Cerqueira
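
A rough stand-in for the modeling pipeline, with synthetic features in place of the real gaze data (the hyperparameters are guesses, not the paper's):

    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # 75 "subjects" with 20 gaze-derived features and a binary decision
    X, y = make_classification(n_samples=75, n_features=20, random_state=0)

    model = make_pipeline(StandardScaler(),
                          MLPClassifier(hidden_layer_sizes=(32,),
                                        max_iter=2000, random_state=0))
    print(cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean())
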
206 Emotion recognition using facial expressions [abstract]
Abstract: This article presents the results of recognizing seven emotional states (neutral, joy, sadness, surprise, anger, fear, disgust) from facial expressions. Coefficients describing elements of facial expressions, registered for six subjects, were used as features. The features were calculated from a three-dimensional face model. Classification of the features was performed using a k-NN classifier and an MLP neural network.
Pawel Tarnowski, Marcin Kolodziej, Remigiusz Rak and Andrzej Majkowski
479 Vocal signal analysis in patients affected by Multiple Sclerosis [abstract]
Abstract: Multiple Sclerosis (MS) is one of the most common neurodegenerative disorders, with specific manifestations that include impaired speech (also known as dysarthria). Speech evaluation plays a crucial role in diagnosis and follow-up, since the identification of anomalous patterns in the vocal signal can support physicians in diagnosing and monitoring these neurological diseases. In this contribution, we present a method for voice analysis of neurologically impaired patients affected by MS, aiming at early detection, differential diagnosis, and monitoring of disease progression. The method integrates two well-known methodologies, acoustic analysis and vowel metrics, to better characterize pathological voices in comparison with healthy ones and to support health structures in MS diagnosis in clinical practice. Specifically, the method acquires and analyzes vocal signals, performing feature extraction and identifying patterns useful for associating impaired speech with this neurological disease. As a result, the method furnishes patterns that could be valid indicators for physicians in monitoring patients affected by MS. Moreover, the procedure is suited to early diagnosis, which is critical for improving and prolonging the patient's quality of life.
Patrizia Vizza, Domenico Mirarchi, Giuseppe Tradigo, Maria Redavide, Roberto Bruno Bossio and Pierangelo Veltri

Computational Finance and Business Intelligence (CFBI) Session 2

Time and Date: 13:25 - 15:05 on 14th June 2017

Room: HG D 7.1

Chair: Yong Shi

180 Pension Fund Asset Allocation: A Mean-Variance Model with CVaR Constraints [abstract]
Abstract: In this paper, we first review some important aspects of asset allocation for typical large Social Security Reserve Funds (SSRFs) around the world. We then present a mean-variance model with CVaR constraints as the asset allocation methodology. Considering the real circumstances in China, we apply the model to pension fund asset allocation. The empirical results show that, to maintain the purchasing power of the pension fund, a certain proportion should be invested in stocks as well as in direct equity investments. We also find that the time horizon significantly influences the asset allocation of the pension fund: with a longer time horizon, larger allocations to stocks and equity investments help the pension fund achieve better performance.
Yibing Chen, Xiaolei Sun and Jianping Li
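
For reference, the CVaR used in the constraints is the mean loss beyond the VaR quantile; a small sketch on simulated P&L:

    import numpy as np

    def cvar(losses, alpha=0.95):
        """CVaR (expected shortfall): average loss in the worst
        (1 - alpha) tail, the quantity the model caps."""
        var = np.quantile(losses, alpha)
        return losses[losses >= var].mean()

    rng = np.random.default_rng(4)
    losses = -rng.normal(5e-4, 0.01, size=100_000)   # simulated daily P&L
    print(cvar(losses, alpha=0.95))
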
304 Short-term Electricity Price Forecasting with Empirical Mode Decomposition based Ensemble Kernel Machines [abstract]
Abstract: Short-term electricity price forecasting is a critical issue for the operation of both electricity markets and power systems. An ensemble method composed of Empirical Mode Decomposition (EMD), Kernel Ridge Regression (KRR) and Support Vector Regression (SVR) is presented in this paper. The electricity price signal is first decomposed into several intrinsic mode functions (IMFs) by EMD; KRR is then used to model each extracted IMF and predict its tendency. Finally, the prediction results of all IMFs are combined by an SVR to obtain an aggregated output for the electricity price. Electricity price datasets from the Australian Energy Market Operator (AEMO) are used to test the effectiveness of the proposed EMD-KRR-SVR approach. Simulation results demonstrate the attractiveness of the proposed method in terms of both accuracy and efficiency.
Xueheng Qiu, Ponnuthurai Suganthan and Gehan Amaratunga
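
A compressed sketch of the EMD-KRR-SVR pipeline on a synthetic price series. It assumes the third-party PyEMD package (the import name and callable API are assumptions) and is fitted in-sample, so it is illustrative only:

    import numpy as np
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.svm import SVR
    from PyEMD import EMD                        # assumed third-party package

    def lags(series, p=24):
        """Turn a 1-D series into (lag-window, next-value) pairs."""
        X = np.array([series[i:i + p] for i in range(len(series) - p)])
        return X, series[p:]

    rng = np.random.default_rng(8)
    t = np.arange(2000)
    price = np.sin(2 * np.pi * t / 48) + 0.1 * rng.normal(size=t.size)

    imfs = EMD()(price)                          # 1. decompose into IMFs
    preds = []
    for imf in imfs:                             # 2. one KRR model per IMF
        X, y = lags(imf)
        preds.append(KernelRidge(alpha=1.0).fit(X, y).predict(X))
    stack = np.column_stack(preds)
    _, target = lags(price)
    combiner = SVR().fit(stack, target)          # 3. SVR aggregates the IMFs
    print(combiner.predict(stack[-1:]))
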
534 Russian Interbank Network Reconstruction via Metaheuristic Algorithm [abstract]
Abstract: We propose an application of a metaheuristic algorithm to interbank market reconstruction. A simulated annealing algorithm is considered and applied to the Russian interbank market. We consider a network of the 504 largest Russian banks, to be compared with the corresponding empirical results obtained by Leonidov & Rumyantsev. The topological properties of the graph to be fitted were the average in- and out-degree, the density, and the average clustering coefficient. The proposed network reconstruction algorithm is compared with the maximum entropy, minimum density, and low density methods. The results show the efficiency of the approach.
Valentina Y. Guleva, Vyacheslav Povazhnyuk, Klavdiya Bochenina and Alexander Boukhanovsky
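
A toy version of the reconstruction idea: simulated annealing over single edge flips, driven by the mismatch to a target topological property (here only the density; the paper also fits degrees and the clustering coefficient):

    import numpy as np

    def anneal_density(n, target, steps=20_000, t0=1.0, seed=5):
        """Flip directed edges at random, accepting worse states with a
        probability that shrinks as the temperature cools."""
        rng = np.random.default_rng(seed)
        A = np.zeros((n, n), dtype=int)
        cost = abs(A.mean() - target)
        for step in range(steps):
            i, j = rng.integers(n, size=2)
            if i == j:
                continue
            A[i, j] ^= 1                         # propose an edge flip
            new = abs(A.mean() - target)
            t = t0 * (1 - step / steps) + 1e-9
            if new > cost and rng.random() > np.exp((cost - new) / t):
                A[i, j] ^= 1                     # reject: undo the flip
            else:
                cost = new
        return A

    print(anneal_density(n=50, target=0.02).mean())
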
58 Identification of failing banks using Clustering with self-organising neural networks [abstract]
Abstract: This paper presents experimental results of cluster analysis using self-organising neural networks for identifying failing banks. The paper first describes the major reasons for, and the likelihood of, bank failures. It then demonstrates an application of a self-organising neural network and presents the results of the study. The findings demonstrate that a self-organising neural network is a powerful tool for identifying potentially failing banks. Finally, the paper discusses some limitations of cluster analysis related to understanding the exact meaning of each cluster.
Michael Negnevitsky
570 Clustering algorithms for Risk-Adjusted Portfolio Construction [abstract]
Abstract: This paper presents the performance of seven portfolios created using clustering analysis techniques to sort assets into categories, followed by classical optimization inside every cluster to select the best assets within each category. The proposed clustering algorithms are tested by constructing portfolios and measuring their performance over a two-month dataset of 1-minute asset returns from a sample of 175 assets of the Russell 1000® index. A three-week sliding window is used for model calibration, leaving an out-of-sample period of five weeks for testing. Model calibration is done weekly, and three rebalancing periods are tested: every 1, 2 and 4 hours. The results show that all clustering algorithms produce more stable portfolios with similar volatility; in this sense, the volatilities of the portfolios generated by the clustering algorithms are smaller when compared with the portfolio obtained using classical Mean-Variance Optimization (MVO) over the whole dataset. Hierarchical clustering algorithms achieve the best financial performance, obtaining an adequate trade-off between accumulated financial returns and the risk-adjusted Omega ratio during the out-of-sample testing period.
Diego León, Arbey Aragón, Javier Sandoval, Germán Hernández, Andrés Arévalo and Jaime Niño
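
The first stage, grouping assets by the similarity of their return series before running per-cluster optimization, might look like this (synthetic returns; the correlation distance and linkage are common choices, not necessarily the paper's):

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    def cluster_assets(returns, n_clusters=5):
        """Hierarchical clustering of assets on correlation distance."""
        corr = np.corrcoef(returns.T)
        dist = np.sqrt(0.5 * (1.0 - corr))       # correlation distance
        cond = dist[np.triu_indices_from(dist, k=1)]
        return fcluster(linkage(cond, method="average"),
                        n_clusters, criterion="maxclust")

    rng = np.random.default_rng(6)
    returns = rng.normal(size=(500, 20))         # 500 periods x 20 assets
    print(cluster_assets(returns))               # then run MVO inside each cluster
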
167 Study of the periodicity in Euro-US Dollar exchange rates using local alignment and random matrixes [abstract]
Abstract: The purpose of this study was to detect latent periodicity in the presence of deletions or insertions in the analyzed data, when the points of deletion or insertion are unknown. A mathematical method was developed to search for periodicity in numerical series using dynamic programming and random matrices. The developed method was applied to search for periodicity in the Euro/Dollar (Eu/$) exchange rate. Period lengths of 24 and 25 hours were found. The reasons for the existence of this periodicity in financial time series are discussed. The results can find application in computer systems for forecasting exchange rates.
Eugene Korotkov and Maria Korotkova

Data-Driven Computational Sciences (DDCS) Session 4

Time and Date: 13:25 - 15:05 on 14th June 2017

Room: HG D 7.2

Chair: Craig Douglas

381 Improving Performance of Multiclass Classification by Inducing Class Hierarchies [abstract]
Abstract: In recent decades, one issue that has received a lot of attention in classification problems is how to obtain better classifications. This problem becomes even more complicated when the number of classes is high. In this multiclass scenario, it is assumed that the class labels are independent of each other, and thus most techniques and methods proposed to improve the performance of classifiers rely on this assumption. An alternative way to address the multiclass problem is to hierarchically distribute the classes into a collection of multiclass subproblems, reducing the number of classes involved in each local subproblem. In this paper, we propose a new method for inducing a class hierarchy from the confusion matrix of a multiclass classifier. We then use the class hierarchy to learn a tree-like hierarchy of classifiers for solving the original multiclass problem, in a similar way to how the top-down hierarchical classification approach works for hierarchical domains. We experimentally evaluate the proposal on a collection of multiclass datasets, showing that, in general, the generated hierarchies outperform not only the original (flat) classification but also hierarchical approaches based on other ways of constructing the class hierarchy.
Daniel Andrés Silva Palacios, Cèsar Ferri and Maria José Ramírez Quintana
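
One plausible reading of the construction, merging the classes a flat classifier confuses most, can be sketched directly from the confusion matrix (the symmetrization and linkage below are assumptions, not necessarily the paper's exact procedure):

    import numpy as np
    from scipy.cluster.hierarchy import linkage

    def class_hierarchy(conf):
        """Turn confusions into similarities, then cluster: classes that
        are often mistaken for each other merge low in the tree."""
        C = conf / conf.sum(axis=1, keepdims=True)   # row-normalize
        S = (C + C.T) / 2                            # symmetrize confusions
        np.fill_diagonal(S, 0.0)
        D = S.max() - S                              # similarity to distance
        np.fill_diagonal(D, 0.0)
        return linkage(D[np.triu_indices_from(D, k=1)], method="average")

    conf = np.array([[50,  8,  1,  0],
                     [ 9, 48,  2,  1],
                     [ 0,  1, 55,  5],
                     [ 1,  0,  6, 52]])
    print(class_hierarchy(conf))                     # {0,1} and {2,3} pair up first
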
574 The Impact of Large-Data Transfers in Shared WANs: An Empirical Study [abstract]
Abstract: Computational science, especially in the era of Big Data, sometimes requires large data files to be transferred over high bandwidth-delay-product (BDP) wide-area networks (WANs). Experimental data (e.g., LHC, SKA), analytics logs, and filesystem backups are regularly transferred between research centres and between private and public clouds. Fortunately, a variety of tools (e.g., GridFTP, UDT, PDS) have been developed to transfer bulk data across WANs with high performance. However, large-data transfer tools are known to adversely affect other network applications on shared networks; many of the tools explicitly ignore TCP fairness to achieve high performance, and users have experienced high latencies and low bandwidth when a large-data transfer is underway. But there have been few empirical studies that quantify the impact of these tools. As an extension of our previous work using synthetic background traffic, we perform an empirical analysis of how the bulk-data transfer tools perform when competing with a non-synthetic, application-based workload (e.g., Network File System). Conversely, we characterize and quantify the impact of bulk-data transfers on the application-based traffic. For example, we show that the RTT latency for other applications can increase from about 130 ms to about 230 ms for the non-bulk-data users of a shared network.
Hamidreza Anvari and Paul Lu

Mathematical Methods and Algorithms for Extreme Scale (MATH-EX) Session 1

Time and Date: 13:25 - 15:05 on 14th June 2017

Room: HG D 3.2

Chair: Vassil Alexandrov

356 Variable-Size Batched Gauss-Huard for Block-Jacobi Preconditioning [abstract]
Abstract: In this work we present new kernels for the generation and application of block-Jacobi preconditioners that accelerate the iterative solution of sparse linear systems on graphics processing units (GPUs). Our approach departs from the conventional LU factorization and decomposes the diagonal blocks of the matrix using the Gauss-Huard method. When enhanced with column pivoting, this method is as stable as LU with partial/row pivoting. Due to extensive use of GPU registers and integration of implicit pivoting, our variable size batched Gauss-Huard implementation outperforms the batched version of LU factorization. In addition, the application kernel combines the conventional two-stage triangular solve procedure, consisting of a backward solve followed by a forward solve, into a single stage that performs both operations simultaneously.
Hartwig Anzt, Jack Dongarra, Goran Flegar, Enrique S. Quintana-Orti and Andres E. Tomas
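
For intuition, the single-stage application described above is the hallmark of Gauss-Jordan-style elimination, of which Gauss-Huard is an arithmetically cheaper column-oriented variant (the paper uses column pivoting; the dense NumPy sketch below uses row pivoting for brevity):

    import numpy as np

    def gauss_jordan_solve(A, b):
        """Reduce [A | b] all the way to the identity, so applying the
        'factorization' to a right-hand side is one stage, with no
        separate forward and backward triangular solves."""
        M = np.hstack([A.astype(float), b.astype(float).reshape(-1, 1)])
        n = len(b)
        for k in range(n):
            p = k + np.argmax(np.abs(M[k:, k]))  # pivoting
            M[[k, p]] = M[[p, k]]
            M[k] /= M[k, k]
            for i in range(n):                   # eliminate above AND below
                if i != k:
                    M[i] -= M[i, k] * M[k]
        return M[:, -1]

    A = np.array([[4.0, 1.0], [2.0, 3.0]])
    b = np.array([1.0, 2.0])
    print(gauss_jordan_solve(A, b), np.linalg.solve(A, b))
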
367 Parallel Modularity Clustering [abstract]
Abstract: In this paper we develop a parallel approach for computing the modularity clustering often used to identify and analyse communities in social networks. We show that modularity can be approximated by looking at the largest eigenpairs of the weighted graph adjacency matrix that has been perturbed by a rank one update. Also, we generalize this formulation to identify multiple clusters at once. We develop a fast parallel implementation for it that takes advantage of the Lanczos eigenvalue solver and k-means algorithm on the GPU. Finally, we highlight the performance and quality of our approach versus existing state-of-the-art techniques.
Alexandre Fender, Nahid Emad, Serge Petiton and Maxim Naumov
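
The rank-one structure the abstract refers to comes from the modularity matrix B = A - k k^T / (2m); a minimal bisection by the sign of its leading eigenvector (the paper's GPU Lanczos and k-means pipeline is not reproduced here):

    import numpy as np

    def modularity_split(A):
        """Split a graph in two by the sign pattern of the dominant
        eigenvector of the modularity matrix."""
        k = A.sum(axis=1)
        B = A - np.outer(k, k) / k.sum()
        _, vecs = np.linalg.eigh(B)
        return vecs[:, -1] >= 0                  # community membership

    A = np.zeros((6, 6))                         # two triangles joined by one edge
    for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
        A[i, j] = A[j, i] = 1.0
    print(modularity_split(A))                   # the two triangles separate
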
405 Parallel Monte Carlo on Intel MIC Architecture [abstract]
Abstract: The trade-off between the cost-efficiency of powerful computational accelerators and the increasing energy needed to perform numerical tasks can be tackled by implementing algorithms on the Intel Many Integrated Core (MIC) architecture. The best performance of the algorithms requires appropriate optimization and parallelization approaches throughout the whole design process. Monte Carlo and quasi-Monte Carlo methods can exploit a huge number of computational cores. In this paper we present advances in our studies of the performance of quasi-Monte Carlo algorithms for solving multidimensional integrals on the Intel MIC architecture, compared with the performance of Monte Carlo methods. The fast implementations are due to the high parallelism in the operations on the many coordinates of the sequences, which the Intel MIC architecture makes possible. These implementations are easy to integrate and demonstrate high performance in terms of timing and computational speed.
Emanouil Atanassov, Todor Gurov, Sofiya Ivanovska and Aneta Karaivanova
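
A small illustration of the plain Monte Carlo versus quasi-Monte Carlo comparison on a multidimensional integral (the integrand, dimension, and sample count are arbitrary choices):

    import numpy as np
    from scipy.special import erf
    from scipy.stats import qmc

    def f(x):                                    # smooth integrand on [0,1]^d
        return np.exp(-np.sum(x**2, axis=1))

    d, n = 6, 2**14
    mc = f(np.random.default_rng(7).random((n, d))).mean()
    qmc_est = f(qmc.Sobol(d, scramble=True, seed=7).random(n)).mean()
    exact = (np.sqrt(np.pi) / 2 * erf(1.0)) ** d
    print(abs(mc - exact), abs(qmc_est - exact)) # QMC error is typically smaller
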