### Chair: Xin-She Yang

 437 Comparison of Constraint-Handling Techniques for Metaheuristic Optimization [abstract]Abstract: Most engineering design problems have highly nonlinear constraints and the proper handling of such constraints can be important to ensure solution quality. There are many different ways of handling constraints and different algorithms for optimization problems, which makes it difficult to choose for users. This paper compare six different constraint-handling techniques such as penalty methods, barrier functions, $\epsilon$-constrained method, feasibility criteria and stochastic ranking. The pressure vessel design problem is solved by the flower pollination algorithm, and results show that stochastic ranking and $\epsilon$-constrained method are most effective for this type of design optimization. Xing-Shi He, Qin-Wei Fan and Xin-She Yang 231 Dynamic Partitioning of Evolving Graph Streams using Nature-inspired Heuristics [abstract]Abstract: Detecting communities of interconnected nodes is a frequently addressed problem in situation that be modeled as a graph. A common practical example is this arising from Social Networks. Anyway, detecting an optimal partition in a network is an extremely complex and highly time-consuming task. This way, the development and application of meta-heuristic solvers emerges as a promising alternative for dealing with these problems. The research presented in this paper deals with the optimal partitioning of graph instances, in the special cases in which connections among nodes change dynamically along the time horizon. This specific case of networks is less addressed in the literature than its counterparts. For efficiently solving such problem, we have modeled and implements a set of meta-heuristic solvers, all of them inspired by different processes and phenomena observed in Nature. Concretely, considered approaches are Water Cycle Algorithm, Bat Algorithm, Firefly Algorithm and Particle Swarm Optimization. All these methods have been adapted for properly dealing with this discrete and dynamic problem, using a reformulated expression for the well-known modularity formula as fitness function. A thorough experimentation has been carried out over a set of 12 synthetically generated dynamic graph instances, with the main goal of concluding which of the aforementioned solvers is the most appropriate one to deal with this challenging problem. Statistical tests have been conducted with the obtained results for rigorously concluding the Bat Algorithm and Firefly Algorithm outperform the rest of methods in terms of Normalized Mutual Information with respect to the true partition of the graph. Eneko Osaba, Miren Nekane Bilbao, Andres Iglesias, Javier Del Ser, Akemi Galvez-Tomida, Iztok Jr. Fister and Iztok Fister 317 Bat Algorithm for Kernel Computation in Fractal Image Reconstruction [abstract]Abstract: Computer reconstruction of digital images is an important problem in many areas such as image processing, computer vision, medical imaging, sensor systems, robotics, and many others. A very popular approach in that regard is the use of different kernels for various morphological image processing operations such as dilation, erosion, blurring, sharpening, and so on. In this paper, we extend this idea to the reconstruction of digital fractal images. Our proposal is based on a new affine kernel particularly tailored for fractal images. The kernel computes the difference between the source and the reconstructed fractal images, leading to a difficult nonlinear constrained continuous optimization problem, solved by using a powerful nature-inspired metaheuristics for global optimization called the bat algorithm. An illustrative example is used to analyze the performance of this approach. Our experiments show that the method performs quite well but there is also room for further improvement. We conclude that this approach is promising and that it could be a very useful technique for efficient fractal image reconstruction. Akemi Galvez-Tomida, Eneko Osaba, Javier Del Ser and Andres Iglesias Prieto 107 Heuristic Rules for Coordinated Resources Allocation and Optimization in Distributed Computing [abstract]Abstract: In this paper, we consider heuristic rules for resources utilization optimization in distributed computing environments. Existing modern job-flow execution mechanics impose many restrictions for the resources allocation procedures. Grid, cloud and hybrid computing services operate in heterogeneous and usually geographically distributed computing environments. Emerging virtual organizations and incorporated economic models allow users and resource owners to compete for suitable allocations based on market principles and fair scheduling policies. Subject to these features a set of heuristic rules for coordinated compact scheduling are proposed to select resources depending on how they fit a particular job execution and requirements. Dedicated simulation experiment studies integral job flow characteristics optimization when these rules are applied to conservative backfilling scheduling procedure. Victor Toporkov and Dmitry Yemelyanov 37 Nonsmooth Newton’s Method: Some Structure Exploitation [abstract]Abstract: We investigate real asymmetric linear systems arising in the search direction generation in a nonsmooth Newton’s method. This applies to constrained optimisation problems via reformulation of the necessary conditions into an equivalent nonlinear and nonsmooth system of equations. We propose a strategy to exploit the problem structure. First, based on the sub-blocks of the original matrix, some variables are selected and ruled out for a posteriori recovering; then, a smaller and symmetric linear system is generated; eventually, from the solution of the latter, the remaining variables are obtained. We prove the method is applicable if the original linear system is well-posed. We propose and discuss different selection strategies. Finally, numerical examples are presented to compare this method with the direct approach without exploitation, for full and sparse matrices, in a wide range of problem size. Alberto De Marchi and Matthias Gerdts

### Chair: Xin-She Yang

 149 Fully-Asynchronous Cache-Efficient Simulation of Detailed Neural Networks [abstract]Abstract: Modern asynchronous runtime systems allow the re-thinking of large-scale scientific applications. With the example of a simulator of morphologically detailed neural networks, we show how detaching from the commonly used bulk-synchronous parallel (BSP) execution allows for the increase of prefetching capabilities, better cache locality, and a overlap of computation and communication, consequently leading to a lower time to solution. Our strategy removes the operation of collective synchronization of ODEs’ coupling information, and takes advantage of the pairwise time dependency between equations, leading to a fully-asynchronous exhaustive yet not speculative stepping model. Combined with fully linear data structures, communication reduce at compute node level, and an earliest equation steps first scheduler, we perform an acceleration at the cache level that reduces communication and time to solution by maximizing the number of timesteps taken per neuron at each iteration. Our methods were implemented on the core kernel of the NEURON scientific application. Asynchronicity and distributed memory space are provided by the HPX runtime system for the ParalleX execution model. Benchmark results demonstrate a superlinear speedup that leads to a reduced runtime compared to the bulk synchronous execution, yielding a speedup between 25% to 65% across different compute architectures, and in the order of 15% to 40% for distributed executions. Bruno Magalhaes, Thomas Sterling, Michael Hines and Felix Schuermann 441 Application of the model with a non-Gaussian linear scalar filters to determine life expectancy, taking into account the cause of death [abstract]Abstract: It is widely known that the worldwide development of civilization diseases (especially in the second half of the twentieth century) is the cause of the increase in mortality not caused by death from natural causes. In Poland, the most common causes of death, both for women and men, include cancer and cardiovascular disease. The aim of the article is to propose a method of modeling the life expectancy index based on the non-Gaussian linear scalar filter model stand on death rates after eliminating one or both of the above causes of death. The obtained results indicate that their elimination may be expected to extend life expectancy by several or more years depending on the cause of death and gender. Piotr Sliwka 353 Improving ODE integration on graphics processing units by reducing thread divergence [abstract]Abstract: Ordinary differential equations are widely used for the mathematical modeling of complex systems in biology and statistics. Since the analysis of such models needs to be performed using numerical integration, many applications can be gravely limited by the computational cost. This paper present a general-purpose integrator that runs massively parallel on graphics processing units. By minimizing thread divergence and bundling similar tasks using linear regression, execution time can be reduced by 40-80% when compared to a naive GPU implementation. Compared to a 36-core CPU implementation, a 150 fold runtime improvement is measured. Thomas Kovac, Tom Haber, Frank Van Reeth and Niel Hens 143 Data Compression for Optimization of Molecular Dynamics System: Preserving Basins of Attraction [abstract]Abstract: Understanding the evolution of atomistic systems is essential in various fields such as materials science, biology, and chemistry. The gold standard for these calculations is molecular dynamics, which simulates the dynamical interaction between pairs of molecules. The main challenge of such simulation is the numerical complexity, given a vast number of atoms over a long time scale. Furthermore, such systems often contain exponentially many optimal states, and the simulation tends to get trapped in local configurations. Recent developments leverage the existing temporal evolution of the system to improve the stability and scalability of the method; however, they suffer from large data storage requirements. To efficiently compress the data while retaining the basins of attraction, we have developed a framework to determine the acceptable level of compression for an optimization method by application of a Kantorovich-type theorem, using binary digit rounding as our compression technique. Choosing the Lennard-Jones potential function as a model problem, we present a method for determining the local Lipschitz constant of the Hessian with low computational cost, thus allowing the use of our technique in real-time computation. Michael Retzlaff, Todd Munson and Zichao Di 519 An algorithm to perform hydraulic tomography based on a mixture model [abstract]Abstract: Hydraulic Tomography (HT) has become one of the most sophisticated methods to characterize aquifer heterogeneity and in some experiments it has proved to be an accurate technique, but in order to achieve this goal, it is needed to perform several pumping/injection tests and to have enough measurements at each test. Also, during the solution of the inverse problem, the groundwater flow equation is solved numerically many times and thus the computational time can be very large, specially when a 3D or a transient models is used. In this work we present a new approach to model the aquifer heterogeneity based in a Gaussian Mixture Model, the proposed approach improves computation time and accuracy of the HT experiment and also it tries address the problems involved in the inverse problem, as is the effect of noisy data, the need of many pumping/injection tests and the lack of resolution when the distribution of the aquifer conductivity does not correspond to a Gaussian distribution. In synthetic experiments this approach was able to achieve one fifth of the error in the estimation of the conductivity field than one of the most used inversion methods for HT (SSLE/VSAFT), in one fourth of the computation time. In a steady-state sandbox experiment, it detected the main layers of conductivity in one fourth of the computational time than VSAFT, including a layer that was only detected with VSAFT when a transient model was used to perform the HT. Carlos Minutti, Walter Illman and Susana Gomez

### Chair: Xin-She Yang

 269 Rapid Multi-Band Patch Antenna Yield Estimation Using Polynomial Chaos-Kriging [abstract]Abstract: Yield estimation of antenna systems is important to check their robustness with respect to the uncertain sources. Since the Monte Carlo sampling-based real physics simulation model evaluations are computationally intensive, this work proposes the polynomial chaos-Kriging (PC-Kriging) metamodeling technique for fast yield estimation. PC-Kriging integrates the polynomial chaos expansion (PCE) as the trend function of Kriging metamodel since the PCE is good at cap-turing the function tendency and Kriging is good at matching the observations at training points. The PC-Kriging is demonstrated with an analytical case and a multi-band patch antenna case and compared with direct PCE and Kriging meta-models. In the analytical case, PC-Kriging reduces the computational cost by around 42% compared with PCE and over 94% compared with Kriging. In the antenna case, PC-Kriging reduces the computational cost by over 60% compared with Kriging and over 90% compared with PCE. In both cases, the savings are obtained without compromising the accuracy. Xiaosong Du, Leifur Leifsson and Slawomir Koziel 24 Accelerating Limited-Memory Quasi-Newton Convergence for Large-Scale Optimization [abstract]Abstract: Quasi-Newton methods are popular gradient-based optimization methods that can achieve rapid convergence using only first-order derivatives. However, the choice of the initial Hessian matrix upon which quasi-Newton updates are applied is an important factor that can significantly affect the performance of the method. This fact is especially true for limited-memory variants, which are widely used for large-scale problems where only a small number of updates are applied in order to minimize the memory footprint. In this paper, we introduce both a scalar and a sparse diagonal Hessian initialization framework, and we investigate its effect on the restricted Broyden-class of quasi-Newton methods. Our implementation in PETSc/TAO allows us to switch between different Broyden class methods and Hessian initializations at runtime, enabling us to quickly perform parameter studies and identify the best choices. The results indicate that a sparse Hessian initialization based on the diagonalization of the BFGS formula significantly improves the base BFGS methods and that other parameter combinations in the Broyden class may offer competitive performance. Alp Dener and Todd Munson 88 Reduced-Cost Design Optimization of High-Frequency Structures Using Adaptive Jacobian Updates [abstract]Abstract: Electromagnetic (EM) analysis is the primary tool utilized in the design of high-frequency structures. In the vast majority of cases, simpler models (e.g., equivalent networks or analytical ones) are either not available or lack accuracy: they can only be used to yield initial designs that need to be further tuned. Consequently, EM-driven adjustment of geometry and/or material parameters of microwave and antenna components is a necessary design stage. This, however, is a computationally expensive process, not only because of the considerable computational cost of high-fidelity EM analysis but also due to a typically large number of parameters that need to be adjusted. In particular, conventional numerical optimization routines (both local and global) may be prohibitively expensive. In this paper, a reduced-cost trust-region-based gradient search algorithm is proposed for optimization of high-frequency components. Our methodology is based on a smart management of the system Jacobian enhancement which combines omission of (finite-differentiation-based) sensitivity updates for variables that exhibit small (relative) relocation in the directions of the corresponding coordinate system axes and selective utilization of a rank-one Broyden updating formula. Parameter selection for Broyden-based updating depends on the alignment between the direction of the latest design relocation and respective search space basis vectors. The proposed technique is demonstrated using an ultra-wideband antenna and a miniaturized coupler. In both cases, a significant reduction of the number of EM simulations involved in the optimization process is achieved as compared to the benchmark algorithm. At the same time, degradation of the design quality is minor. Slawomir Koziel, Anna Pietrenko-Dabrowska and Leifur Leifsson 53 An Algorithm for Selecting Measurements with High Information Content Regarding Parameter Identification [abstract]Abstract: Reducing the measurement effort that is made for identification of parameters is an important task in some fields of technology. This work focuses on calibration of functions running on the electronic control unit (ECU), where measurements are the main expense factor. An algorithm for information content analysis of recorded measurement data is introduced that places the calibration engineer in the position to shorten future test runs. The analysis is based upon parameter sensitivities and utilizes the Fisher-information matrix to determine the value of certain measurement portions with respect to parameter identification. By means of a simple DC motor model the algorithm's working principle is illustrated. The first use on a real ECU function achieves a measurement time reduction of 67% while a second use case opens up new features for the calibration of connected cars. Christian Potthast 461 Optimizing parallel performance of the cell based blood flow simulation software HemoCell [abstract]Abstract: Large scale cell based blood flow simulations are expensive, both in time and resource requirements. HemoCell can perform such simulations on high performance computing resources by dividing the simulation domain into multiple blocks. This division has a performance impact caused by the necessary communication between the blocks. In this paper we implement an efficient algorithm for computing the mechanical model for HemoCell together with an improved communication structure. The result is an up to $4$ times performance increase for blood flow simulations performed with HemoCell. Victor Azizi Tarksalooyeh, Gábor Závodszky and Alfons Hoekstra