Sessions: 10:35 - 12:15 on 6th June 2016

ICCS 2016 Main Track (MT) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: KonTiki Ballroom

Chair: David Abramson

19 Performance Analysis and Optimization of a Hybrid Seismic Imaging Application [abstract]
Abstract: Applications to process seismic data are computationally expensive and, therefore, employ scalable parallel systems to produce timely results. Here we describe our experiences of using performance analysis tools to gain insight into an MPI+OpenMP code developed by Shell that performs Reverse Time Migration on a cluster to produce models of the subsurface. Tuning MPI+OpenMP programs for modern platforms is difficult, and, therefore, assistance is required from performance analysis tools. These tools provided us with insights into the effectiveness of the domain decomposition strategy, the use of threaded parallelism, and functional unit utilization in individual cores. By applying insights obtained from Rice University's HPCToolkit and hardware performance counters, we were able to improve the performance of Shell's prototype distributed-memory Reverse Time Migration code by roughly 30 percent.
Sri Raj Paul, Mauricio Araya-Polo, John Mellor-Crummey, Detlef Hohl
33 Portable Application-level Checkpointing for Hybrid MPI-OpenMP Applications [abstract]
Abstract: As parallel machines increase their number of processors, so does the failure rate of the global system; thus, long-running applications need to make use of fault tolerance techniques to ensure successful completion of their execution. Most current HPC systems are built as clusters of multicores. The hybrid MPI-OpenMP paradigm provides numerous benefits on these systems. This paper presents a checkpointing solution for hybrid MPI-OpenMP applications, in which checkpoint consistency is guaranteed by an intra-node coordination protocol, while no inter-node coordination is needed. The proposal reduces network utilization and storage resources in order to optimize the I/O cost of fault tolerance, while minimizing the checkpointing overhead. Moreover, the portability of the solution and the dynamic parallelism provided by OpenMP enable restarting the applications on machines with different architectures, operating systems and/or numbers of cores, adapting the number of running OpenMP threads for the best exploitation of the available resources. An extensive evaluation using hybrid MPI-OpenMP applications from the ASC Sequoia Benchmark Codes and the NERSC-8/Trinity benchmarks is presented, showing the effectiveness and efficiency of the approach.
Nuria Losada, María J. Martín, Gabriel Rodríguez, Patricia González
38 Checkpointing of Parallel MPI Applications using MPI One-sided API with Support for Byte-addressable Non-volatile RAM [abstract]
Abstract: The increasing size of computational clusters results in an increasing probability of failures, which in turn requires application checkpointing in order to survive those failures. Traditional checkpointing requires data to be copied from application memory into a persistent storage medium, which increases application execution time as it is usually done in a separate step. In this paper we propose to use emerging byte-addressable non-volatile RAM (NVRAM) as a persistent storage medium and analyze various methods of making consistent checkpoints with support of the MPI one-sided API in order to minimize checkpointing overhead. We test our solution on two applications: the HPCCG benchmark and the PageRank algorithm. Our experiments showed that NVRAM-based checkpointing performs much better than the traditional disk-based approach. We also simulated different possible latencies and bandwidths of future NVRAM, and our experiments showed that only bandwidth had a visible impact on application execution time.
Piotr Dorożyński, Pawel Czarnul, Artur Malinowski, Krzysztof Czuryło, Łukasz Dorau, Maciej Maciejewski, Paweł Skowron
57 Acceleration of Tear Film Map Definition on Multicore Systems [abstract]
Abstract: Dry eye syndrome is a public health problem, and one of the most common conditions seen by eye care specialists. Among the clinical tests for its diagnosis, the evaluation of the interference patterns observed in the tear film lipid layer is often employed. In this sense, tear film maps illustrate the spatial distribution of the patterns over the whole tear film and provide useful information to practitioners. However, the creation of a single map usually takes tens of minutes. Medical experts currently demand applications with lower response time in order to provide a faster diagnosis for their patients. In this work, we explore different parallel approaches to accelerate the definition of the tear film map by exploiting the power of today's ubiquitous multicore systems. They can be executed on any multicore system without special software or hardware requirements. The experimental evaluation determines the best approach (on-demand with dynamic seed distribution) and proves that it can significantly decrease the runtime. For instance, the average runtime of our experiments with 50 real-world images on a system with AMD Opteron processors is reduced from more than 20 minutes to one minute and 12 seconds.
Jorge González-Domínguez, Beatriz Remeseiro, María J. Martín
99 Modeling and Implementation of an Asynchronous Approach to Integrating HPC and Big Data Analysis [abstract]
Abstract: With the emergence of exascale computing and big data analytics, many important scientific applications require the integration of computationally intensive modeling and simulation with data-intensive analysis to accelerate scientific discovery. In this paper, we create an analytical model to steer the optimization of the end-to-end time-to-solution for the integrated computation and data analysis. We also design and develop an intelligent data broker to efficiently intertwine the computation stage and the analysis stage to practically achieve the optimal time-to-solution predicted by the analytical model. We perform experiments on both synthetic applications and real-world computational fluid dynamics (CFD) applications. The experiments show that the analytic model exhibits an average relative error of less than 10%, and the applications’ performance can be improved by up to 131% for the synthetic programs and by up to 78% for the real-world CFD application.
Yuankun Fu, Fengguang Song, Luoding Zhu

ICCS 2016 Main Track (MT) Session 8

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Toucan

Chair: Jack Dongarra

12 Identifying the Sport Activity of GPS Tracks [abstract]
Abstract: The wide propagation of devices, such as mobile phones, that include a global positioning system (GPS) sensor has popularised the storing of geographic information for different kinds of activities, many of them recreational, such as sport. Extracting and learning knowledge from GPS data can provide useful geographic information that can be used for the design of novel applications. In this paper we address the problem of identifying the sport from a GPS track recorded during a sport session. For that purpose, we store 8500 GPS tracks from ten different kinds of sports. We extract twelve features that are able to represent the activity recorded in a GPS track. From these features several models are induced by diverse machine learning classification techniques. We study the problem from two different perspectives: flat classification, i.e., models classify the track into one of the ten possible sport types; and hierarchical classification, i.e., given the high number of classes and the structure of the problem, we induce a hierarchy over the classes and address the task as a hierarchical classification problem. For this second framework, we analyse three different approaches. According to our results, multiclassifier systems based on decision trees obtain the best performance in both scenarios.
Cesar Ferri Ramírez
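A minimal, self-contained sketch of the flat-classification setting described in the abstract above: a decision-tree classifier trained on per-track feature vectors. This is not the authors' code; the twelve features and ten sport labels mirror the abstract, but the data here is a synthetic placeholder.

```python
# Sketch only: flat classification of GPS-track feature vectors with a
# decision tree, trained on synthetic placeholder data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n_tracks, n_features = 1000, 12               # the abstract extracts 12 features per track
X = rng.normal(size=(n_tracks, n_features))   # placeholder feature matrix
y = rng.integers(0, 10, size=n_tracks)        # ten sport classes (placeholder labels)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X_tr, y_tr)
print("flat-classification accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```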
13 Wind-sensitive interpolation of urban air pollution forecasts [abstract]
Abstract: People living in urban areas are exposed to outdoor air pollution. Air contamination is linked to numerous premature and pre-natal deaths each year. Urban air pollution is estimated to cost approximately 2% of GDP in developed countries and 5% in developing countries. Some works reckon that vehicle emissions produce over 90% of air pollution in cities in these countries. This paper presents some results in predicting and interpolating real-time urban air pollution forecasts for the city of Valencia in Spain. Although many cities provide air quality data, in many cases this information is presented with significant delays (three hours for the city of Valencia) and is limited to the areas where the measurement stations are located. We compare several regression models able to predict the levels of four different pollutants (NO, NO2, SO2, O3) at six different locations in the city. Wind strength and direction are key factors in the propagation of pollutants around the city, so we study different techniques to incorporate them into the regression models. Finally, we also analyse how to interpolate forecasts across the whole city. Here, we propose an interpolation method that takes wind direction into account and compare it with well-known interpolation methods. Using these contamination estimates, we are able to generate a real-time pollution map of the city of Valencia.
Lidia Contreras-Ochando, Cesar Ferri
66 Optimal Customer Targeting for Sustainable Demand Response in Smart Grids [abstract]
Abstract: Demand Response (DR) is a widely used technique to minimize the peak to average consumption ratio during high demand periods. We consider the DR problem of achieving a given curtailment target for a set of consumers equipped with a set of discrete curtailment strategies over a given duration. An effective DR scheduling algorithm should minimize the curtailment error (the difference between the targeted and achieved curtailment values) to minimize costs to the utility provider and maintain system reliability. The availability of smart meters with fine-grained customer control capability can be leveraged to offer customers a dynamic range of curtailment strategies that are feasible for small durations within the overall DR event. Both the availability and achievable curtailment values of these strategies can vary dynamically through the DR event, and thus the problem of achieving a target curtailment over the entire DR interval can be modeled as a dynamic strategy selection problem over multiple discrete sub-intervals. We argue that DR curtailment error minimizing algorithms should not be oblivious to customer curtailment behavior during sub-intervals, as (expensive) demand peaks can be concentrated in a few sub-intervals while consumption is heavily curtailed during others in order to achieve the given target, which makes such solutions expensive for the utility. Thus in this paper, we formally develop the notion of Sustainable DR (SDR) as a solution that attempts to distribute the curtailment evenly across sub-intervals in the DR event. We formulate the SDR problem as an Integer Linear Program and provide a very fast $\sqrt{2}$-factor approximation algorithm. We then propose a Polynomial Time Approximation Scheme (PTAS) for approximating the SDR curtailment error to within an arbitrarily small factor of the optimal. We then develop a novel ILP formulation that solves the SDR problem while explicitly accounting for customer strategy switching overhead as a constraint. We perform experiments using real data acquired from the University of Southern California's smart grid and show that our sustainable DR model achieves results with a very low absolute error, in the range of 0.001-0.05 kWh.
Sanmukh R. Kuppannagari, Rajgopal Kannan, Charalampos Chelmis, Arash S Tehrani, Viktor K Prasanna
366 Influence of Charging Behaviour given Charging Station Placement at Existing Petrol Stations and Residential Car Park Locations in Singapore [abstract]
Abstract: Electric Vehicles (EVs) are set to play a crucial role in making transportation systems more sustainable. However, charging infrastructure needs to be built up before EV adoption can increase. A crucial factor that is ignored in most existing studies of optimal charging station (CS) deployment is the role played by the charging behaviour of drivers. In this study, through an agent-based traffic simulation, we analyse the impact of different driver charging behaviours under the assumption that CSs are placed at existing petrol stations and residential car park locations in Singapore. Three models are implemented: a simple model with a charging threshold and two more sophisticated models in which the driver takes the current trip distance and existing CS locations into account. We analyse the performance of these three charging behaviours with respect to a number of different measures. Results suggest that charging behaviours do indeed have a significant impact on the simulation outcome. We also discover that the sensitivity of model parameters in each charging behaviour is an important factor to consider, as variations in model parameters can lead to significantly different results.
Ran Bi, Jiajian Xiao, Vaisagh Viswanathan, Alois Knoll
222 Crack Detection in Earth Dam and Levee Passive Seismic Data Using Support Vector Machines [abstract]
Abstract: We investigate techniques for earth dam and levee health monitoring and automatic detection of anomalous events in passive seismic data. We have developed a novel data-driven workflow that uses machine learning and geophysical data collected from sensors located on the surface of the levee to identify internal erosion events. In this paper, we describe our research experiments with binary and one-class Support Vector Machines (SVMs). We used experimental data from a laboratory earth embankment (80% normal and 20% anomalies) and extracted nine spectral features from decomposed segments of the time series data. The two-class SVM with 10-fold cross validation achieved over 97% accuracy. Experiments with the one-class SVM use the top two features selected by the ReliefF algorithm and our results show that we can successfully separate normal from anomalous data observations with over 83% accuracy.
Wendy Fisher, Tracy Camp, Valeria Krzhizhanovskaya
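As a rough illustration of the one-class SVM anomaly detection described in the abstract above, the sketch below trains scikit-learn's OneClassSVM on placeholder "normal" feature vectors and flags outliers. It is a sketch under stated assumptions, not the authors' workflow: the real pipeline extracts spectral features from decomposed segments of the passive seismic time series.

```python
# Sketch only: one-class SVM trained on "normal" two-dimensional feature
# vectors (standing in for the top-2 ReliefF-selected spectral features).
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
normal = rng.normal(0.0, 1.0, size=(800, 2))     # placeholder "normal" observations
anomalous = rng.normal(3.0, 1.0, size=(200, 2))  # placeholder anomalous observations

model = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(normal)
pred = model.predict(np.vstack([normal, anomalous]))  # +1 = normal, -1 = anomaly
print("observations flagged as anomalous:", int(np.sum(pred == -1)))
```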

Agent-based simulations, adaptive algorithms and solvers (ABS-AAS) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Macaw

Chair: Maciej Paszynski

544 Agent-Based Simulations, Adaptive Algorithms and Solvers - Preface [abstract]
Abstract: The aim of this workshop is to integrate results from different domains of computer science, computational science and mathematics. We invite papers oriented toward simulations, either hard simulations by means of finite element or finite difference methods, or soft simulations by means of evolutionary computation, particle swarm optimization and others. The workshop is most interested in simulations performed using agent-oriented systems or adaptive algorithms, but simulations performed with other kinds of systems are also welcome. Agent-oriented systems are attractive tools for numerous application domains. Adaptive algorithms allow a significant decrease in computational cost by concentrating computational resources on the most important aspects of the problem.
Maciej Paszynski, Robert Schaefer, Krzysztof Cetnarowicz, David Pardo and Victor Calo
365 A Discontinuous Petrov-Galerkin Formulation Based on Broken H-Laplacian Trial and Test Spaces [abstract]
Abstract: We present a new discontinuous Petrov-Galerkin (DPG) method for continuous finite element (FE) approximations of linear boundary value problems of second order partial differential equations by using discontinuous, yet optimal, weight functions. The DPG method is based on the derivation of equivalent integral formulations by starting with element-by-element local residual functionals involving partial derivatives and Laplacian terms. Continuity of the normal inter-element fluxes across the element boundaries is subsequently weakly enforced in H^{-1/2} by applying Green's identity or integration by parts. Correspondingly, the resulting integral formulations are posed in broken solution spaces in the sense that they are in H-Laplacian on each element (broken) yet globally only in H1 (continuous). The test functions have the same regularity locally on the element level but are in L2 globally and therefore discontinuous. As in recent DPG approaches [2], we introduce test functions that result in stable FE approximations with best approximation properties in terms of the energy norm that is induced by the bilinear form of the integral formulation. Since the bilinear form involves broken elementwise functionals, the best approximation property is here established in a broken Hilbert norm. Remarkably, the local contributions of test functions can be numerically solved on each element with high numerical accuracy and do not require the solution of global variational statements, as observed in recent DPG methods (e.g. see [1,2]). Optimal asymptotic convergence rates are obtained in the H1 norm and a broken H-Laplacian type norm. In L2, optimal convergence is achieved for p greater than or equal to 2. We present 1D numerical verifications for the solution of second order reaction-diffusion as well as convection-diffusion problems. [1] L. Demkowicz and J. Gopalakrishnan, "Analysis of the DPG Method for the Poisson Equation", SIAM Journal on Numerical Analysis, Vol. 49, No. 5, pp. 1788-1809, 2011. [2] L. Demkowicz and J. Gopalakrishnan, "A class of discontinuous Petrov-Galerkin methods. II. Optimal test functions", Numerical Methods for Partial Differential Equations, Vol. 27, No. 1, pp. 70-105, 2011.
Albert Romkes and Victor Calo
286 A Priori Fourier Analysis for 2.5D Finite Elements Simulations of Logging-While-Drilling (LWD) Resistivity Measurements [abstract]
Abstract: Triaxial induction measurements provided by LWD tools generate crucial petrophysical data to determine several quantities of interest around the drilled formation under exploration, such as a map of resistivities. However, the corresponding forward modeling requires the simulation of a large-scale three-dimensional computational problem for each tool position. When the material properties are assumed to be homogeneous in one spatial direction, the problem dimensionality can be reduced to a so-called 2.5-dimensional (2.5D) formulation. In this paper, we propose an a priori adaptive algorithm for properly selecting and interpolating Fourier modes in 2.5D simulations in order to speed up computer simulations. The proposed method first considers an adequate range of Fourier modes, and then determines a subset of those that need to be estimated via solution of a Partial Differential Equation (PDE), while the remaining ones are simply interpolated on a logarithmic scale, without the need to solve any additional PDE. Numerical results validate our selection of Fourier modes, delivering superb results in real simulations when solving a PDE for only a very limited number of Fourier modes (below 50%).
Ángel Rodríguez-Rozas, David Pardo
486 Hybridization of isogeometric finite element method and evolutionary multi-agent system as a tool-set for multiobjective optimization of liquid fossil fuel reserves exploitation with minimizing groundwater contamination [abstract]
Abstract: In this paper we consider an approach to the problem of extracting liquid fossil fuels that respects not only economic aspects but also the impact on the natural environment. We model the process of extracting oil/gas by pumping a chemical fluid into the formation, using an IGA-FEM solver, as a non-stationary flow of a non-linear fluid in heterogeneous media. The problem of extracting liquid fossil fuels is defined as a multiobjective one with two contradictory objectives: maximizing the amount of oil/gas extracted and minimizing the contamination of the groundwater. The goal of the paper is to check the performance of a hybridized solver for the multiobjective liquid fossil fuel extraction problem (LFFEP) that integrates population-based heuristics (i.e., an evolutionary multi-agent system and the NSGA-II algorithm for approaching the Pareto frontier) with the isogeometric finite element method (IGA-FEM). The results of computational experiments illustrate how the considered techniques work for a particular test scenario.
Leszek Siwik, Marcin Los, Aleksander Byrski, Marek Kisiel-Dorohinicki

Advances in High-Performance Computational Earth Sciences: Applications and Frameworks (IHPCES) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Cockatoo

Chair: K. Nakajima

454 Progress towards nonhydrostatic adaptive mesh dynamics for multiscale climate models (Invited) [abstract]
Abstract: Many of the atmospheric phenomena with the greatest potential impact in future warmer climates are inherently multiscale. Such meteorological systems include hurricanes and tropical cyclones, atmospheric rivers, and other types of hydrometeorological extremes. These phenomena are challenging to simulate in conventional climate models due to the relatively coarse uniform model resolutions relative to the native nonhydrostatic scales of the phenomenological dynamics. To enable studies of these systems with sufficient local resolution for the multiscale dynamics yet with sufficient speed for climate-change studies, we have adapted existing adaptive mesh dynamics packages for global atmospheric modeling. In this talk, we present an adaptive, conservative finite volume approach for moist non-hydrostatic atmospheric dynamics. The approach is based on the compressible Euler equations on 3D thin spherical shells, where the radial direction is treated implicitly (using a fourth-order Runge-Kutta IMEX scheme) to eliminate time step constraints from vertical acoustic waves. Refinement is performed only in the horizontal directions. The spatial discretization is the equiangular cubed-sphere mapping, with a fourth-order accurate discretization to compute flux averages on faces. By using both space- and time-adaptive mesh refinement, the solver allocates computational effort only where greater accuracy is needed. The resulting method is demonstrated to be highly accurate for model problems, and robust at solution discontinuities and stable for large aspect ratios. We present comparisons using a simplified physics package for dycore comparisons of moist physics. Bio: William D. Collins is an internationally recognized expert in climate modeling and climate change science. His personal research concerns the interactions among greenhouse gases and aerosols, the coupled climate system, and global environmental change. At Lawrence Berkeley National Laboratory (LBNL), Dr. Collins serves as the Director for the Climate and Ecological Sciences Division. At the University of California, Berkeley, he teaches in the Department of Earth and Planetary Science and directs the new multi-campus Climate Readiness Institute (CRI). Dr. Collins's role in launching the Department of Energy's Accelerated Climate Model for Energy (ACME) program was awarded the U.S. Department of Energy Secretary's Achievement Award on May 7, 2015. He is also a Fellow of the American Association for the Advancement of Science (AAAS). He was a Lead Author on the Fourth Assessment of the Intergovernmental Panel on Climate Change (IPCC), for which the IPCC was awarded the 2007 Nobel Peace Prize, and was also a Lead Author on the recent Fifth Assessment. Before joining Berkeley and Berkeley Lab, Dr. Collins was a senior scientist at the National Center for Atmospheric Research (NCAR) and served as Chair of the Scientific Steering Committee for the DOE/NSF Community Climate System Model project. Dr. Collins received his undergraduate degree in physics from Princeton University and earned an M.S. and Ph.D. in astrophysics from the University of Chicago.
William Collins, Hans Johansen, Travis O'Brien, Jeff Johnson, Elijah Goodfriend and Noel Keen
276 Towards characterizing the variability of statistically consistent Community Earth System Model simulations [abstract]
Abstract: Large, complex codes such as earth system models are in a constant state of development, requiring frequent software quality assurance. The recently developed Community Earth System Model (CESM) Ensemble Consistency Test (CESM-ECT) provides an objective measure of statistical consistency for new CESM simulation runs, which has greatly facilitated error detection and rapid feedback for model users and developers. CESM-ECT determines consistency based on an ensemble of simulations that represent the same earth system model. Its statistical distribution embodies the natural variability of the model. Clearly the composition of the employed ensemble is critical to CESM-ECT's effectiveness. In this work we examine whether the composition of the CESM-ECT ensemble is adequate for characterizing the variability of a consistent climate. To this end, we introduce minimal code changes into CESM that should pass the CESM-ECT, and we evaluate the composition of the CESM-ECT ensemble in this context. We suggest an improved ensemble composition that better captures the accepted variability induced by code changes, compiler changes, and optimizations, thus more precisely facilitating the detection of errors in the CESM hardware or software stack as well as enabling more in-depth code optimization and the adoption of new technologies.
Daniel Milroy, Allison Baker, Dorit Hammerling, John Dennis, Sheri Mickelson, Elizabeth Jessup
318 A New Approach to Ocean Eddy Detection, Tracking, and Event Visualization -Application to the Northwest Pacific Ocean- [abstract]
Abstract: High-resolution ocean general circulation models have advanced the numerical study of ocean eddies. To gain an understanding of ocean eddies from the large volume of data produced by simulations, visualizing just the distribution of eddies at each time step is insufficient; time-variations in eddy events and phenomena must also be considered. However, existing methods cannot accurately detect and track eddy events such as amalgamation and bifurcation. In this study, we propose a new approach for eddy detection, tracking, and event visualization based on an eddy classification system. The proposed method detects streams and currents in addition to eddies, and it classifies detected eddies into several categories using the additional stream and current information. By tracking how the classified eddies vary over time, it is possible to detect events such as eddy amalgamation and bifurcation as well as the interaction between eddies and ocean currents. We visualize the detected eddies and events in a time series of images (or animation), enabling us to gain an intuitive understanding of a region of interest hidden in a high-resolution data set.
Daisuke Matsuoka, Fumiaki Araki, Yumi Inoue, Hideharu Sasaki
285 SC-ESAP: A Parallel Application Platform for Earth System Model [abstract]
Abstract: An earth system model is among the most complicated pieces of computer simulation software ever developed; it is the basis for understanding and predicting climate change and an important tool to support climate-related decisions. CAS-ESM, the Chinese Academy of Sciences Earth System Model, is developed by the Institute of Atmospheric Physics (IAP) and its collaborators. This system contains the complete components of the climate system and the ecological environment system, including a global atmospheric general circulation model (AGCM), a global oceanic general circulation model (OGCM), an ice model, a land model, an atmospheric chemistry model, a dynamic global vegetation model (DGVM), an ocean biogeochemistry model (OBM) and a regional climate model (RCM), among others. Since CAS-ESM is complex and is designed as a scalable and pluggable system, a parallel software platform (SC-ESSP) is needed. SC-ESSP will be developed as an open software platform running on Chinese earth system numerical simulation facilities for different developers and users, which requires that the component models be standard and unified, and that the platform be pluggable, high-performance and easy to use. To achieve this goal, based on the Community Earth System Model (CESM) platform, a parallel software application platform named SC-ESAP is designed for CAS-ESM, mainly including compile and run scripts, standard and unified component models, a 3-D coupler component, a coupler interface creator and some parallelization and optimization work. In the longer term, a component framework, SC-ESMF, will be developed based on the SC-Tangram framework.
Jinrong Jiang, Tianyi Wang, Xuebin Chi, Huiqun Hao, Yuzhu Wang, Yiran Chen, He Zhang

Workshop on Computational and Algorithmic Finance (WCAF) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Boardroom East

Chair: A. Itkin and J. Toivanen

136 Reduced Order Models for Pricing American Options under Stochastic Volatility and Jump-Diffusion Models [abstract]
Abstract: American options can be priced by solving linear complementarity problems (LCPs) with parabolic partial (integro-)differential operators under stochastic volatility and jump-diffusion models such as the Heston, Merton, and Bates models. These operators are discretized using finite difference methods, leading to a so-called full order model (FOM). Here reduced order models (ROMs) are derived employing proper orthogonal decomposition (POD) and non-negative matrix factorization (NNMF) in order to make pricing much faster within a given range of model parameter variation. The numerical experiments demonstrate orders of magnitude faster pricing with ROMs.
Maciej Balajewicz, Jari Toivanen
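The core of the POD-based model reduction mentioned in the abstract above can be sketched in a few lines: collect snapshots of full-order solutions, extract a low-dimensional basis from their SVD, and project the operator onto that basis. The snapshot matrix and operator below are synthetic placeholders, not the paper's pricing model or its NNMF variant.

```python
# Sketch only: build a POD basis from snapshots and solve a projected system.
import numpy as np

rng = np.random.default_rng(2)
n, n_snap, r = 1000, 40, 10                  # FOM size, number of snapshots, ROM size
snapshots = rng.normal(size=(n, n_snap))     # placeholder snapshot matrix

U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
V = U[:, :r]                                 # POD basis: first r left singular vectors

A = np.diag(np.linspace(1.0, 2.0, n))        # placeholder full-order operator
b = rng.normal(size=n)
A_r = V.T @ A @ V                            # reduced operator (r x r)
b_r = V.T @ b
x_rom = V @ np.linalg.solve(A_r, b_r)        # lift the reduced solution back to full space
print(x_rom.shape)
```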
237 Implicit Predictor Corrector method for Pricing American Option under Regime Switching with Jumps [abstract]
Abstract: We develop and analyze a second-order implicit predictor-corrector scheme based on the exponential time differencing (ETD) method for pricing American put options under a multi-state regime-switching economy with jump-diffusion models. Our approach formulates the problem of American option pricing as a set of coupled partial integro-differential equations (PIDEs), which we solve using a primitive tri-diagonal linear system, while we treat the complexity of the dense jump probability generator and the nonlinear regime-switching terms explicitly in time. We define both the differential and integral terms of the PIDE on the same domain, and discretize the spatial derivatives using a non-uniform mesh. The American option constraint is enforced by using a scaled penalty method approach to establish a conservative bound for the penalty parameter. We also provide a detailed treatment of the consistency, stability, and convergence of the proposed method, and analytically study the impact of the jump intensity, penalty and non-uniformity parameters on convergence and solution accuracy. The dynamic properties of the non-uniform mesh and the ETD approach are utilized to calibrate suitable values for the penalty and non-uniform grid parameters. Superiority of the proposed scheme over recently published methods is demonstrated through numerical examples that discuss the efficiency, accuracy and reliability of the approach.
Abdul Khaliq, Mohammad Rasras and Mohammad Yousuf
235 Model Impact on Prices of American Options [abstract]
Abstract: Different dividend assumptions consistent with the prices of European options can lead to very different prices for American options. In this paper we study the impact of continuous versus discrete and cash versus proportional dividend assumptions on the prices of European and American options and discuss the consequences for calibration and the pricing of exotic instruments.
Alexey Polishchuk
137 Fixing Risk Neutral Risk Measures [abstract]
Abstract: As per regulations and common risk management practice, the credit risk of a portfolio is managed via its potential future exposures (PFEs), expected exposures (EEs), and related measures, the expected positive exposure (EPE), effective expected exposure (EEE) and the effective expected positive exposure (EEPE). Notably, firms use these exposures to set economic and regulatory capital levels. Their values have a big impact on the capital that firms need to hold to manage their risks. Due to the growth of CVA computations, and the similarity of CVA computations to exposure computations, firms find it expedient to compute these exposures under the risk neutral measure. Here we show that exposures computed under the risk neutral measure are essentially arbitrary. They depend on the choice of numeraire, and can be manipulated by choosing a different numeraire. The numeraire can even be chosen in such a way as to pass backtests. Even when restricting attention to commonly used numeraires, exposures can vary by a factor of two or more. As such, it is critical that these calculations be done under the real world measure, not the risk neutral measure. To help rectify the situation, we show how to exploit measure changes to efficiently compute real world exposures in a risk neutral framework, even when there is no change of measure from the risk neutral measure to the real world measure. We also develop a canonical risk neutral measure that can be used as an alternative approach to risk calculations.
Harvey Stein
336 Efficient CVA Computation by Risk Factor Decomposition [abstract]
Abstract: According to Basel III, financial institutions have to charge a Credit Valuation Adjustment (CVA) to account for a possible counterparty default. Calculating this measure is one of the big challenges in risk management. In earlier studies, future distributions of derivative values have been simulated by a combination of finite difference methods for the option valuation and Monte Carlo methods for the state space sampling of the underlying, from which the portfolio exposure and its quantiles can be estimated. By solving a forward Kolmogorov PDE for the future underlying distribution instead of Monte Carlo simulation, we hope to achieve efficiency gains and better accuracy especially in the tails of future exposures. Together with the backward Kolmogorov equation, the expected exposure and quantiles can then directly be obtained without the need for an extra Monte Carlo simulation. We studied the applicability of PCA and ANOVA-based dimension reduction in the context of a portfolio of risk factors. Typically, for these portfolios, a huge number of derivatives are traded on a relatively small number of risk factors. By solving a PDE for one risk factor, it is possible to value all derivatives traded on this single factor over time. However, if we want to solve a PDE for multiple risk factors, one has to deal with the curse of dimensionality. Between these risk factors, the correlation is often high, and therefore PCA and ANOVA are promising techniques for dimension reduction and can enable us to compute the exposure profiles for higher dimensional portfolios. We compute lower dimensional approximations where only one factor is taken stochastic and all other factors follow a deterministic term structure. Next, we correct this low dimensional approximation by two dimensional approximations. We also look into the effect of taking higher (three) dimensional corrections. In our results, our method is able to compute Exposures (EE, EPE and ENE) and Quantiles for a real portfolio driven by 10 different risk factors. This portfolio consists of Cross-Currency Swaps, Interest rate swaps and FX call or put options. The risk factors are: stochastic FX rates, stochastic volatility and stochastic domestic and foreign interest rates. The method is accurate and fast when compared to a full-scale Monte Carlo implementation.
Kees de Graaf, Drona Kandhai and Christoph Reisinger
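For reference, the one-dimensional forward Kolmogorov (Fokker-Planck) equation referred to in the abstract above, for a diffusion with drift $\mu(x,t)$ and volatility $\sigma(x,t)$; the paper works with its multi-factor analogue for the portfolio's risk factors.

```latex
\frac{\partial p(x,t)}{\partial t}
  = -\frac{\partial}{\partial x}\bigl[\mu(x,t)\,p(x,t)\bigr]
    + \frac{1}{2}\,\frac{\partial^{2}}{\partial x^{2}}\bigl[\sigma^{2}(x,t)\,p(x,t)\bigr]
```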

International Workshop on Computational Flow and Transport: Modeling, Simulations and Algorithms (CFT) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Boardroom West

Chair: Shuyu Sun

83 Uncertainty Quantification of Parameters in Stochastic BVPs Utilizing Stochastic Basis Representation and a Multi-Scale Domain Decomposition Method [abstract]
Abstract: Quantifying uncertainty effects of coefficients that exhibit heterogeneity at multiple scales is among many outstanding challenges in subsurface flow models. Typically, the coefficients are modeled as functions of random variables governed by certain statistics. To quantify their uncertainty in the form of statistics (e.g., average fluid pressure or concentration) Monte-Carlo methods have been used. In a separate direction, multiscale numerical methods have been developed to efficiently capture spatial heterogeneity that otherwise would be intractable with standard numerical techniques. Since heterogeneity of individual realizations can differ drastically, a direct use of multiscale methods in Monte-Carlo simulations is problematic. Furthermore, Monte-Carlo methods are known to be very expensive as a lot of samples are required to adequately characterize the random component of the solution. In this study, we utilize a stochastic representation method that exploits the solution structure of the random process in order to construct a problem dependent stochastic basis. Using this stochastic basis representation a set of coupled yet deterministic equations is constructed. To reduce the computational cost of solving the coupled system, we develop a multiscale domain decomposition method utilizing Robin transmission conditions. In the proposed method, enrichment of the solution space can be performed at multiple levels that offer a balance between computational cost, and accuracy of the approximate solution.
Victor Ginting, Prosper Torsu, Bradley McCaskill
139 Locally Conservative B-spline Finite Element Methods for Two-Point Boundary Value Problems [abstract]
Abstract: The standard nodal Lagrangian based continuous Galerkin finite element method (FEM) and control volume finite element method (CVFEM) are well known techniques for solving partial differential equations. Both of these methods have a common shortcoming in that the first derivative of the approximate solution of both methods is discontinuous. Further shortcomings of nodal Lagrangian bases arise when considering time dependent problems. For instance, increasing the degree of the basis in an effort to improve the accuracy of the approximate solution prohibits the use of common techniques such as mass matrix lumping. We introduce a $\mu^{\mathrm{th}}$ degree clamped basis-spline (B-spline) based analog of both the control volume finite element method and the continuous Galerkin finite element method in conjunction with a post processing technique which shall impose local conservation. The advantage of these techniques is that the B-spline basis is not only non-negative for any order $\mu$, and thus lends itself to mass matrix lumping for higher order basis functions, but also, for $\mu>2$, each basis function is smooth on the domain. We implement both the B-spline based CVFEM and FEM techniques as well as the post processing technique as they pertain to solving various two-point boundary value problems. A comparison of the convergence rates and properties of the error associated with satisfying local conservation is presented.
Russell Johnson, Victor Ginting
177 An Accelerated Iterative Linear Solver with GPUs for CFD Calculations of Unstructured Grids [abstract]
Abstract: Computational Fluid Dynamics (CFD) utilizes numerical solutions of Partial Differential Equations (PDEs) on discretized volumes. These sets of discretized volumes, or grids, can often contain tens of millions, or billions, of volumes. The analysis of these large unstructured grids can take weeks to months to complete even on large computer clusters. For CFD solvers utilizing the Finite Volume Method (FVM) with implicit time stepping or a segregated pressure solver, a large portion of the computation time is spent solving a large linear system with a sparse coefficient matrix. In an effort to improve the performance of these CFD codes, in effect decreasing the time to solution of engineering problems, a conjugate gradient solver for a Finite Volume Method solver on Graphics Processing Units (GPUs) was implemented to solve a model Poisson equation. Utilizing the improved memory throughput of NVIDIA's Tesla K20 GPU, a 2.5 times improvement was observed compared to a parallel CPU implementation on all 10 cores of an Intel Xeon E5-2670 v2. The parallel CPU implementation was constructed using the open source CFD toolbox OpenFOAM.
Justin Williams, Christian Sarofeen, Matthew Conley, Hua Shan
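A minimal NumPy sketch of the (unpreconditioned) conjugate gradient iteration the abstract above refers to. The paper's implementation runs this same recurrence on a GPU against the sparse FVM coefficient matrix; the dense toy problem here is only a placeholder.

```python
# Sketch only: unpreconditioned conjugate gradient for a symmetric
# positive-definite system A x = b.
import numpy as np

def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

# Tiny SPD test problem standing in for the FVM pressure matrix.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))
```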
203 DarcyLite: A Matlab Toolbox for Darcy Flow Computation [abstract]
Abstract: DarcyLite is a Matlab toolbox developed for numerical simulations of flow and transport in porous media in two dimensions. This paper focuses on the finite element methods and the corresponding code modules for solving the Darcy equation. Specifically, four major types of finite element solvers are presented: the continuous Galerkin (CG), the discontinuous Galerkin (DG), the weak Galerkin (WG), and the mixed finite element methods (MFEM). We further discuss the main design ideas and implementation strategies in DarcyLite. Numerical examples are included to demonstrate the usage and performance of this toolbox.
Jiangguo Liu, Farrah Sadre-Marandi, Zhuoran Wang
214 A Semi-Discrete SUPG Method for Contaminant Transport in Shallow Water Models [abstract]
Abstract: In the present paper, a finite element model is developed based on a semi-discrete Streamline Upwind Petrov-Galerkin method to solve the fully-coupled two-dimensional shallow water and contaminant transport equations on a non-flat bed. The algorithm is applied on fixed computational meshes. Linear triangular elements are used to decompose the computational domain and a second-order backward differentiation implicit method is used for the time integration. The resulting nonlinear system is solved using a Newton-type method where the linear system is solved at each step using the Generalized Minimal Residual method. In order to examine the accuracy and robustness of the present scheme, numerical results are verified by different test cases.
Faranak Behzadi, James Newman

Workshop on Teaching Computational Science (WTCS) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Rousseau West

Chair: Alfredo Tirado-Ramos

219 Enhancing Computational Science Curriculum at Liberal Arts Institutions: A Case Study in the Context of Cybersecurity [abstract]
Abstract: Computational science curriculum developments and enhancements in liberal arts colleges can face unique challenges compared with larger institutions. We present a case study of computational science curriculum improvement at a medium sized liberal arts university in the context of cybersecurity. Three approaches, namely a cybersecurity minor, content infusion into existing courses, and a public forum are proposed to enrich the current computational science curriculum with cybersecurity contents.
Paul Cao, Iyad Ajwa
191 Teaching Data Science [abstract]
Abstract: We describe an introductory data science course, entitled Introduction to Data Science, offered at the University of Illinois at Urbana-Champaign. The course introduced general programming concepts by using the Python programming language with an emphasis on data preparation, processing, and presentation. The course had no prerequisites, and students were not expected to have any programming experience. This introductory course was designed to cover a wide range of topics, from the nature of data, to storage, to visualization, to probability and statistical analysis, to cloud and high performance computing, without becoming overly focused on any one subject. We conclude this article with a discussion of lessons learned and our plans to develop new data science courses.
Robert Brunner, Edward Kim
422 Little Susie: a PXE installation of openSUSE on a Little Fe [abstract]
Abstract: Little Fe is a six-node Beowulf cluster made from Mini-ITX motherboards. It is designed to be a low-cost portable parallel computer for educational purposes. Bishop's Theoretical Molecular Biology Lab at Louisiana Tech has reconfigured a Little Fe to model the lab's openSUSE-based network. Our Little Susie boots each of its diskless nodes with the same openSUSE operating system installed on the lab's workstations. All nodes utilize a common home directory that is physically attached only to the head node. Thus Little Susie allows students to practice using, maintaining and administering a computer network that has all of the features and tools of the lab's research resources without compromising lab workstations. In theory, our Preboot Execution Environment (PXE) solution supports installation of any live Linux distribution on the Little Fe, creating a family of Littles: Little Susie, Little Debbie, Little Hat, Little Mints. The advantage of this approach over Little Fe's Bootable Cluster CD (BCCD) operating system is that each node of Little Susie has a complete Linux distribution installed. Little Susie can thus function as six independent Linux workstations or as a Beowulf parallel computer. This approach allows instructors to set up a computational science teaching lab "on the fly" as follows: the instructor sets up a PXE server (the head node); students PXE-boot their laptops at the beginning of class to obtain identically configured workstations for the lesson of the day; after saving the day's work to the instructor's hard drive, students restore their laptops to their native state by simply rebooting. Instructions for setting up a Little Susie and a parallel molecular dynamics simulation with NAMD/VMD will be presented.
Tom Bishop and Anthony Agee
226 The Scientific Programming Integrated Degree Program - A Pioneering Approach to join Theory and Practice [abstract]
Abstract: While already established in other disciplines, integrated degree programs have become more popular in computer science and mathematical education in Germany as well over the last few years. These programs combine a theoretical education and a vocational training. The bachelor degree course "Scientific Programming", offered at FH Aachen University of Applied Sciences, is such an integrated degree program. It consists of 50% mathematics and 50% computer science. It incorporates the MATSE (MAthematical and Technical Software dEveloper) vocational training in cooperation with research facilities and IT companies located in and nearby Aachen, Jülich and Cologne. This paper presents the general concept behind integrated degree programs in Germany and the Scientific Programming educational program in particular. A key distinguishing feature of this concept is the continuous combination of theoretical education at university level with practical work experience at a company. In this fashion, students end up being very well positioned for the labor market, and companies educate knowledgeable staff familiar with their products and processes. Additionally students are able to earn two degrees in three years, which is a rare approach for computer science programs in Germany. Therefore, Scientific Programming offers an important contribution towards reducing the shortage in advanced software development and engineering on the German labor market.
Bastian Küppers, Thomas Dondorf, Benno Willemsen, Hans Joachim Pflug, Claudia Vonhasselt, Benedikt Magrean, Matthias S. Müller, Christian Bischof
234 Teaching computational modeling in the data science era [abstract]
Abstract: Integrating data and models is an important and still challenging goal in science. Computational modeling has been taught for decades and regularly revised, for example in the 2000s, when it became more inclusive of data mining. As we are now in the 'data science' era, we have the occasion (and often the incentive) to teach computational modeling and data science in an integrative manner. In this paper, we reviewed the content of courses and programs on computational modeling and/or data science. From this review and our teaching experience, we formed a set of design principles for an integrative course. We independently implemented these principles in two public research universities, in Canada and the US, for a course targeting graduate students and upper-division undergraduates. We discuss and contrast these implementations, and suggest ways in which the teaching of computational science can continue to be revised going forward.
Philippe Giabbanelli, Vijay Mago

Workshop on Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Rousseau East

Chair: Angela Shiflet

195 CFD investigation of human tidal breathing through human airway geometry [abstract]
Abstract: This study compares the effect of the extra-thoracic airways on the flow field through the lower airways by carrying out computational fluid dynamics (CFD) simulations of the airflow through the human respiratory tract. In order to facilitate this comparison, two geometries were utilized. The first was a realistic nine-generation lower airway geometry derived from computed tomography (CT) images, while the second included an additional component, i.e., an idealized extra-thoracic airway (ETA) coupled with the same nine-generation CT model. Another aspect of this study focused on the impact of breathing transience on the flow field. Consequently, simulations were carried out for transient breathing in addition to peak inspiration and expiration. Physiologically-appropriate regional ventilation for two different flow rates was induced at the distal boundaries by imposing appropriate lobar specific flow rates. The scope of these simulations was limited to the modeling of tidal breathing at rest. The typical breathing rates for these cases range from 7.5 to 15 breaths per minute with a tidal volume of 0.5L. For comparison, the flow rates for constant inspiration/expiration were selected to be identical to the peak flow rates during the transient breathing. Significant differences were observed from comparing the peak inspiration and expiration with transient breathing in the entire airway geometry. Differences were also observed for the lower airway geometry. These differences point to the fact that simulations that utilize constant inspiration or expiration may not be an appropriate approach to gain better insight into the flow patterns present in the human respiratory system. Consequently, particle trajectories derived from these flow fields might be misleading in their applicability to the human respiratory system.
Jamasp Azarnoosh, Kidambi Sreenivas, Abdollah Arabshahi
468 Partitioning of arterial tree for parallel decomposition of hemodynamic calculations [abstract]
Abstract: Modeling of fluid mechanics for the vascular system is of great value as a source of knowledge about development, progression, and treatment of cardiovascular disease. Full three-dimensional simulation of blood flow in the whole human body is a hard computational problem. We discuss parallel decomposition of blood flow simulation as a graph partitioning problem. The detailed model of full human arterial tree and some simpler geometries are discussed. The effectiveness of coarse-graining as well as pure spectral approaches is studied. Published data can be useful for development of parallel hemodynamic applications as well as for estimation of their effectiveness and scalability.
Andrew Svitenkov, Pavel Zun, Oleg Rekin, Alfons Hoekstra
265 Generating a 3D Normative Infant Cranial Model [abstract]
Abstract: We describe an algorithm to generate a normative infant cranial model from the input of 3D meshes that are extracted from CT scans of normal infant skulls. We generate a correspondence map between meshes based on a registration algorithm. Then we apply our averaging algorithm to construct the normative model. The goal of this normal model is to assist an objective evaluating system to analyze the efficacy of plastic surgeries.
Binhang Yuan, Ron Goldman, Eric Wang, Olushola Olorunnipa, David Khechoyan
480 Targeting deep brain regions in transcranial electrical neuromodulation using the reciprocity principle [abstract]
Abstract: Targeting deep regions in the brain is a key challenge in noninvasive transcranial electrical neuromodulation. We explore this problem by means of computer simulations within a detailed seven-tissue finite element head model (2 million tetrahedra) constructed from high resolution MRI and CT volumes. We solve the forward electrical stimulation and EEG problems governed by the quasi-static Poisson equation numerically using the first-order Finite Element Method (FEM) with the Galerkin approach. Given a dense EEG-electrode layout and the locations of regions of interest inside the brain, we compute optimal current injection patterns based on the reciprocity principle in EEG and compare the results with optimization based on the Least Squares (LS) or Linearly Constrained Minimum Variance (LCMV) algorithms. It is found that the reciprocity algorithms show performance comparable to the LCMV and LS solutions for deep brain targets, which are generally computationally more expensive to obtain.
Mariano Fernandez-Corazza, Sergei Turovets, Phan Luu, Erik Anderson and Don Tucker
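A compact paraphrase (ours, not the authors' exact formulation) of the targeting idea in the abstract above: by reciprocity, the electrode pair that would record the largest EEG signal from a dipole at the target is also the pair whose injected current produces the strongest field along that dipole direction at the target. The reciprocity-based injection pattern therefore solves, for target location $\mathbf{r}_0$ and desired stimulation direction $\hat{\mathbf{p}}$,

```latex
\max_{(A,B)}\; \hat{\mathbf{p}} \cdot \mathbf{E}_{AB}(\mathbf{r}_0)
\quad \text{subject to a fixed total injected current},
```

where $\mathbf{E}_{AB}(\mathbf{r}_0)$ is the electric field at the target computed by the FEM forward solver for unit current injected through electrodes $A$ and $B$.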
84 Supermodeling in simulation of melanoma progression [abstract]
Abstract: Supermodeling is an interesting and non-standard concept used recently for the simulation of complex and chaotic systems such as climate and weather dynamics. It consists of coupling many imperfect models to create a single supermodel. We discuss here the supermodeling strategy in the context of tumor growth. To check its adaptive flexibility we have developed a basic, but still computationally complex, modeling framework of melanoma growth. The supermodel of melanoma consists of a few coupled sub-models, which differ in the values of a parameter responsible for the interactions between tumor cells and the extracellular matrix. We demonstrate that, due to the synchronization of sub-models, the supermodel is able to simulate qualitatively different modes of cancer growth than those observed for a single model. These scenarios correspond to the basic types of melanoma. This property makes the supermodel flexible enough to be fitted to real data. On the basis of preliminary simulation results, we discuss the prospects of the supermodeling strategy as a promising coupling factor between formal and data-based models of tumors.
Witold Dzwinel, Adrian Klusek, Oleg Vasilyev

Applications of Matrix Computational Methods in the Analysis of Modern Data (MATRIX) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Rousseau Center

Chair: Kouroush Modarresi

74 Fast and accurate finite-difference method solving multicomponent Smoluchowski coagulation equation with source and sink terms [abstract]
Abstract: In this work we present a novel numerical method for solving the multicomponent Smoluchowski coagulation equation. The new method is based on the application of fast linear algebra algorithms and fast arithmetic in the tensor-train (TT) format to accelerate the well-known, highly accurate second-order Runge-Kutta scheme. After applying the proposed algorithmic optimizations we obtain a dramatic speedup of the classical methodology without loss of accuracy. We test our solver on a problem with source and sink terms and find that the TT-ranks of the numerical solution do not grow tremendously even when these physical effects are added to the basic Smoluchowski coagulation model.
Alexander Smirnov, Sergey Matveev, Dmitry Zheltkov, Eugene Tyrtyshnikov
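For reference, the single-component Smoluchowski coagulation equation with a source term $S(v)$ and a sink term $\sigma(v)\,n(v,t)$, where $K$ is the coagulation kernel; the paper treats the multicomponent analogue, whose discretization is what the tensor-train arithmetic accelerates.

```latex
\frac{\partial n(v,t)}{\partial t}
  = \frac{1}{2}\int_{0}^{v} K(v-u,u)\,n(v-u,t)\,n(u,t)\,\mathrm{d}u
  - n(v,t)\int_{0}^{\infty} K(v,u)\,n(u,t)\,\mathrm{d}u
  + S(v) - \sigma(v)\,n(v,t)
```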
95 A Riemannian Limited-Memory BFGS Algorithm for Computing the Matrix Geometric Mean [abstract]
Abstract: Various optimization algorithms have been proposed to compute the Karcher mean (namely the Riemannian center of mass in the sense of the affine-invariant metric) of a collection of symmetric positive-definite matrices. Here we propose to handle this computational task with a recently developed limited-memory Riemannian BFGS method using an implementation tailored to the symmetric positive-definite Karcher mean problem. We also demonstrate empirically that the method is best suited for large-scale problems in terms of computation time and robustness when comparing to the existing state-of-the-art algorithms.
Xinru Yuan, Wen Huang, Pierre-Antoine Absil, Kyle Gallivan
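For context, the sketch below implements the classical fixed-point (Riemannian gradient) iteration for the Karcher mean of symmetric positive-definite matrices under the affine-invariant metric. The paper replaces this simple iteration with a limited-memory Riemannian BFGS method, which is not shown here; the snippet only illustrates the objective being optimized.

```python
# Sketch only: fixed-point iteration for the Karcher mean of SPD matrices.
import numpy as np
from scipy.linalg import expm, logm, sqrtm, inv

def karcher_mean(mats, tol=1e-10, max_iter=100, step=1.0):
    X = sum(mats) / len(mats)              # arithmetic mean as the starting point
    for _ in range(max_iter):
        Xh = sqrtm(X).real                 # sqrtm/logm may carry a negligible imaginary part
        Xh_inv = inv(Xh)
        # Riemannian gradient direction in the tangent space at X
        S = sum(logm(Xh_inv @ A @ Xh_inv).real for A in mats) / len(mats)
        X = Xh @ expm(step * S) @ Xh
        if np.linalg.norm(S) < tol:
            break
    return X

# Three random SPD matrices as a toy input.
rng = np.random.default_rng(3)
mats = []
for _ in range(3):
    B = rng.normal(size=(4, 4))
    mats.append(B @ B.T + 4 * np.eye(4))
print(karcher_mean(mats))
```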
256 GPU optimization for data analysis of Mario Schenberg spherical detector [abstract]
Abstract: The gravitational wave (GW) detectors Advanced LIGO and Advanced Virgo are acquiring the potential for recording unprecedented astronomical data on astrophysical events. The Mario Schenberg detector (MSD) is a smaller-scale experiment that could participate in this search. Previously, we developed a first data analysis pipeline (DAP) to transform the detector's signal into relevant GW information. This pipeline was extremely simplified in order to be executed at low latency. In order to improve the analysis methods while keeping a low execution time, we propose three different parallel approaches using GPU/CUDA. We implemented the parallel models using cuBLAS library functions and enhanced their capability with asynchronous processes in CUDA streams. Our novel model surpasses the serial implementation within the data analysis pipeline, running 21% faster than the traditional model. This first result is part of a more comprehensive approach in which all DAP modules that can be parallelized are being re-written in GPGPU/CUDA, and then tested and validated within the MSD context.
Eduardo C. Vasconcellos, Esteban W. G. Clua, Reinaldo R. Rosa, João G. F. M. Gazolla, Nuno César Da R. Ferreira, Victor Carlquist, Carlos F. Da Silva Costa

Environmental Computing Applications - State of the Art (ECASA) Session 1

Time and Date: 10:35 - 12:15 on 6th June 2016

Room: Plumeria Suite

Chair: M. Heikkurinen

552 Introduction to Environmental Computing [abstract]
Abstract: TBC
Dieter Kranzlmüller
556 Scientific Workflows for Environmental Computing [abstract]
Abstract: Environmental computing often involves diverse data sets, large and complex simulations, user interaction, optimisation, high performance computing, scientific visualisation and complex orchestration. Scientific workflows are an ideal platform for incorporating all these aspects of environmental computing. They provide a common framework that both specifies and documents complex applications, and also provides an execution platform. In this talk I will describe how Nimrod/OK achieves this goal. Nimrod/OK is based on the long-running Nimrod tool set and the Kepler scientific workflow engine. It incorporates a novel user interaction tool called WorkWays, which combines Kepler and science gateways. It also includes non-linear optimisation algorithms that allow complex environmental problems to be solved. I will demonstrate Nimrod/OK and WorkWays with a number of environmental applications involving wildfire simulations and ecological planning.
David Abramson
558 Automating Real-time Seismic Analysis Through Streaming and High Throughput Workflows [abstract]
Abstract: In order to support the computational and data needs of today's science, new knowledge must be gained on how to deliver the growing capabilities of the national cyberinfrastructures, and more recently commercial clouds, to the scientist's desktop in an accessible, reliable, and scalable way. In over a decade of working with domain scientists, the Pegasus workflow management system has been used by researchers to model seismic wave propagation, to discover new celestial objects, to study RNA critical to human brain development, and to investigate other important research questions. Recently, the Pegasus and dispel4py teams have collaborated to enable automated processing of real-time seismic interferometry and earthquake "repeater" analysis using data collected from the IRIS database. The proposed integrated solution empowers real-time stream-based workflows to run seamlessly on different distributed infrastructures (or in the wide area), where data is automatically managed by a task-oriented workflow system, which orchestrates the distributed execution. We have demonstrated the feasibility of this approach by using Docker containers to deploy the workflow management systems and two different computing infrastructures: an Apache Storm cluster for real-time processing, and an MPI-based cluster for shared-memory computing. Stream-based execution is managed by dispel4py, while the data movement between the clusters and the workflow engine (submit host) is managed by Pegasus.
Rafael Ferreira Da Silva