Session1 11:20 - 13:00 on 10th June 2014

Main Track (MT) Session 1

Time and Date: 11:20 - 13:00 on 10th June 2014

Room: Kuranda

Chair: J. Betts

12 SparseHC: a memory-efficient online hierarchical clustering algorithm [abstract]
Abstract: Computing a hierarchical clustering of objects from a pairwise distance matrix is an important algorithmic kernel in computational science. Since the storage of this matrix requires quadratic space in terms of the number objects, the design of memory-efficient approaches is of high importance to research. In this paper, we address this problem by presenting a memory-efficient online hierarchical clustering algorithm called SparseHC. SparseHC scans a sorted (possibly sparse) distance matrix chunk-by-chunk and a dendrogram is built by merging cluster pairs as and when the distance between them is determined to be the smallest among all remaining cluster pairs. The key insight used is that for finding the cluster pair with the smallest distance, it is not necessary to wait for completing the computation of all cluster pair distances. Partial information can be used to determine a lower bound on cluster pair distances that are used for cluster distance comparison. Our experimental results show that SparseHC achieves a linear empirical memory complexity, which is a significant improvement compared to existing algorithms.
Thuy Diem Nguyen, Bertil Schmidt, Chee Keong Kwoh
131 Tuning Basic Linear Algebra Routines for Hybrid CPU+GPU Platforms [abstract]
Abstract: The introduction of auto-tuning techniques in linear algebra routines using hybrid combinations of multiple CPU and GPU computing resources is analyzed. Basic models of the execution time and} information obtained during the installation of the routines are used to optimize the execution time with a balanced assignation of the work to the computing components in the system. The study is carried out with a basic kernel (matrix-matrix multiplication) and a higher level routine (LU factorization) using GPUs and the host multicore processor. Satisfactory results are obtained, with experimental execution times close to the lowest experimentally achievable.
Gregorio Bernabé, Javier Cuenca, Domingo Gimenez, Luis-Pedro García
143 A portable OpenCL Lattice Boltzmann code for multi- and many-core processor architectures [abstract]
Abstract: The architecture of high performance computing systems is becoming more and more heterogeneous, as accelerators play an increasingly important role alongside traditional CPUs. Programming heterogeneous systems efficiently is a complex task, that often requires the use of specific programming environments. Programming frameworks supporting codes portable across different high performance architectures have recently appeared, but one must carefully assess the relative costs of portability versus computing efficiency, and find a reasonable tradeoff point. In this paper we address precisely this issue, using as test-bench a Lattice Boltzmann code implemented in OpenCL. We analyze its performance on several different state-of-the-art processors: NVIDIA GPUs and Intel Xeon-Phi many-core accelerators, as well as more traditional Ivy Bridge and Opteron multi-core commodity CPUs. We also compare with results obtained with codes specifically optimized for each of these systems. Our work shows that a properly structured OpenCL code runs on many different systems reaching performance levels close to those obtained by architecture-tuned CUDA or C codes.
Enrico Calore, Sebastiano F. Schifano, Raffaele Tripiccione
146 Accelerating Solid-Fluid Interaction using Lattice-Boltzmann and Immersed Boundary Coupled Simulations on Heterogeneous Platforms [abstract]
Abstract: We propose a numerical approach based on the Lattice-Boltzmann (LBM) and Immersed Boundary (IB) methods to tackle the problem of the interaction of solids with an incompressible fluid flow. The proposed method uses a Cartesian uniform grid that incorporates both the fluid and the solid domain. This is a very optimum and novel method to solve this problem and is a growing research topic in Computational Fluid Dynamics. We explain in detail the parallelization of the whole method on both GPUs and an heterogeneous GPU-Multicore platform and describe different optimizations, focusing on memory management and CPU-GPU communication. Our performance evaluation consists of a series of numerical experiments that simulate situations of industrial and research interest. Based on these tests, we have shown that the baseline LBM implementation achieves satisfactory results on GPUs. Unfortunately, when coupling LBM and IB methods on GPUs, the overheads of IB degrade the overall performance. As an alternative we have explored an heterogeneous implementation that is able to hide such overheads and allows us to exploit both Multicore and GPU resources in a cooperative way.
Pedro Valero-Lara, Alfredo Pinelli, Manuel Prieto-Matías
110 Spatio-temporal Sequential Pattern Mining for Tourism Sciences [abstract]
Abstract: Flickr presents an abundance of geotagged photos for data mining. Particularly, we propose the concept of extracting spatio-temporal meta data from Flickr photos, combining a collection of such photos together results in a spatio-temporal entity movement trail, a \textit{trajectory} describing an individual's movements. Using these spatio-temporal Flickr photographer trajectories we aim to extract valuable tourist information about where people are going, what time they are going there, and where they are likely to go next. In order to achieve this goal we present our novel spatio-temporal trajectory RoI mining and SPM framework. It is different from previous work because it forms RoIs taking into consideration both space and time simultaneously, and thus we reason producing higher-quality RoIs and thus by extension higher-quality sequential patterns too. We test our framework's ability to uncover interesting patterns for the tourism sciences industry by performing experiments using a large dataset of Queensland photo taker movements for the year 2012. Experimental results validate the usefulness of our approach at finding new, information rich spatio-temporal tourist patterns from this dataset, especially in comparison the 2D approaches shown in the literature.
Bermingham Luke, Ickjai Lee

Main Track (MT) Session 8

Time and Date: 11:20 - 13:00 on 10th June 2014

Room: Tully I

Chair: Michela Taufer

115 The influence of network topology on reverse-engineering of gene-regulatory networks [abstract]
Abstract: Modeling and simulation of gene-regulatory networks (GRNs) has become an important aspect of modern computational biology investigations into gene regulation. A key challenge in this area is the automated inference (reverse-engineering) of dynamic, mechanistic GRN models from gene expression time-course data. Common mathematical formalisms used to represent such models capture both the relative weight or strength of a regulator gene and the type of the regulator (activator, repressor) with a single model parameter. The goal of this study is to quantify the role this parameter plays in terms of the computational performance of the reverse-engineering process and the predictive power of the inferred GRN models. We carried out three sets of computational experiments on a GRN system consisting of 22 genes. While more comprehensive studies of this kind are ultimately required, this computational study demonstrates that models with similar training (reverse-engineering) error that have been inferred under varying degrees of a priori known topology information, exhibit considerably different predictive performance. This study was performed with a newly developed multiscale modeling and simulation tool called MultiGrain/MAPPER.
Alexandru Mizeranschi, Noel Kennedy, Paul Thompson, Huiru Zheng, Werner Dubitzky
188 Maximizing the Cumulative Influence through a Social Network when Repeat Activation Exists [abstract]
Abstract: We study the problem of employing social networks for propagate influence when repeat activation is involved. While influence maximization has been extensively studied as the fundamental solution, it neglects the reality that a user may purchase a product/service repeatedly, incurring cumulative sales of the product/service. In this paper, we explore a new problem of cumulative influence maximization that brings the influence maximization a step closer to real-world viral marketing applications. In our problem setting, repeat activation exists and we aim to find a set of initial users, through which the maximal cumulative influence can be stimulated in a given time period. To describe the repeat activation behavior, we adopt the voter model to reflect the variation of activations over time. Under the voter model, we formulate the maximization problem and present an effective algorithm. We test and compare our method with heuristic algorithms on real-world data sets. Experimental results demonstrate the utility of the proposed method.
Chuan Zhou, Peng Zhang, Wenyu Zang, Li Guo
320 Mining Large-scale Knowledge about Events from Web Text [abstract]
Abstract: This paper addresses the problem of automatic acquisition of semantic relations between events. Since most of the previous researches rely on annotated corpus, the main challenge is the need for more generic methods to identify related event pairs and to extract event-arguments (particularly the predicate, subject and object). Motivated by this background, we develop a three-phased approach that acquires causality from the Web. Firstly, we use explicit connective markers (such as “because”) as linguistic cues to discover causal related events. Then, we extract the event-arguments based on local dependency parse trees of event expressions. In the final phase, we propose a statistical model to measure the potential causal relations. The present results of our empirical evaluation on a large-scale Chinese Web corpus have shown that (a) the use of local dependency tree extensively improves both the accuracy and recall of event-arguments extraction task; (b) our measure which is an improvement on PMI has a better performance.
Yanan Cao, Peng Zhang, Jing Guo, Li Guo
200 Discovering Multiple Diffusion Source Nodes in Social Networks [abstract]
Abstract: Social networks have greatly amplified spread of information across different communities. However, we recently observe that various malicious information, such as computer virus and rumors, are broadly spread via social networks. To restrict these malicious information, it is critical to develop effective method to discover the diffusion source nodes in social networks. Many pioneer works have explored the source node identification problem, but they all based on an ideal assumption that there is only a single source node, neglecting the fact that malicious information are often diffused from multiple sources to intentionally avoid network audit. In this paper, we present a multi-source locating method based on a given snapshot of partially and sparsely observed infected nodes in the network. Specifically, we first present a reverse propagation method to detect recovered and unobserved infected nodes in the network, and then we use community cluster algorithms to change the multi-source locating problem into a bunch of single source locating problems. At the last step, we identify the nodes having the largest likelihood estimations as the source node on the infected clusters. Experiments on three different types of complex networks show the performance of the proposed method.
Wenyu Zang, Peng Zhang, Chuan Zhou, Li Guo
293 The Origin of Control in Complex Networks [abstract]
Abstract: Recent work at the borders of network science and control theory have begun to investigate the control of complex systems by studying their underlying network representations. A majority of the work in this nascent field has looked at the number of controls required in order to fully control a network. In this talk I will present research that provides a ready breakdown of this number into categories that are both easy to observe in real world networks as well as instructive in understanding the underlying functional reasons for why the controls must exist. This breakdown is able to shed light on several observations made in the previous literature regarding controllability of networks. This decomposition produces a mechanism to cluster networks into classes that are consistent with their large scale architecture and purpose. Finally, we observe that synthetic models of formation generate networks with control breakdowns substantially different from what is observed in real world networks.
Justin Ruths

Dynamic Data Driven Application Systems (DDDAS) Session 1

Time and Date: 11:20 - 13:00 on 10th June 2014

Room: Tully II

Chair: Craig Douglas

427 DDDAS – Bridging from the Exa-Scale to the Sensor-Scale [abstract]
Abstract: This talk will provide an overview of new opportunities created by DDDAS (Dynamic Data Driven Applications Systems) and engendering a new vision for Exa-Scale computing and Big Data. Exa-Scale is considered the next frontier of high-end computational power and Big-Data seen as the the next generation of data-intensive. The presentation will discuss new opportunities that exist through DDDAS in synergism with a vision of additional dimensions to the Exa-Scale and Big Data, namely considering that the next wave of Big Data and Big Computing will result not only from the Exa-Scale frontiers but also from the emerging trend of “ubiquitous sensing” - ubiquitous instrumentation of systems by multitudes of distributed and heterogeneous collections of sets of sensors and controllers. Undeniably, achieving and exploiting Exa-Scale will enable larger scale simulations and complex “systems of systems” modeling, which will produce large sets of computed data contributing to the Big Data deluge, and adding to the data avalanche created by large scientific and engineering instruments. The emerging trend of large-scale, ubiquitous instrumentation through multitudes of sensors and controllers creates another dimension to computing and to data, whereby data and computations for processing and analyzing such data will be performed in combinations of collections of sensor and higher performance platforms – including the Exa-Scale. DDDAS provides a driver for such environments and an opportunity for new and advanced capabilities. The DDDAS paradigm, by its definition of dynamically integrating in a feed-back control loop the computational model with the instrumentation aspects of an application system, premises a unified computational-instrumentation platform supporting DDDAS environments. In general this unified computational-instrumentation platform will consist of a range of systems such as high-end (petascale, exascale), mid-range and personal computers and mobile devices, and instrumentation platforms such as large instruments or collections of sensors and controllers, such networks of large numbers of heterogeneous sensors and controllers. Otherwise stated, in DDDAS the computational and data environments of a given application span a range of platforms from the high-end computing to the data collection instruments - from the exa-scale to sensor-scale. Consequently, DDDAS environments present these kinds of unprecedented levels of computational resource heterogeneity and dynamicity which require new systems software to support the dynamic and adaptive runtime requirements of such environments. In addition to the role of DDDAS in unifying these two extremes of computing and data, there are also technological drivers that lead us to consider the extremes and the range of scales together. Thus far, conquering the exascale has been considered as having “unique” challenges in terms power efficiency requirements at the multicore unit level, dynamic management of the multitudes of such resources for optimized performance, fault tolerance and resilience, to new application algorithms. However, ubiquitous instrumentation environments comprising of sensors (and controllers) have corresponding requirements in terms of power efficiencies, fault tolerance, application algorithms dealing with sparse and incomplete data, etc. Moreover, it is quite possible that the same kinds of multicores that will populate exascale platforms will also be the building blocks of sensors and controllers. In fact, it is likely that these sensors and controllers – these new “killer micros” – they will drive the technologies at the device and chip levels. Leveraging common technologies for the range of platforms from the Exa-cale to the Sensor-Scale, not only is driven by the underlying technologies, but is also driven by the trends in the application requirements. Commonality in the building blocks (e.g. at the chip and multicore levels) across the range and the extremes of the computational and instrumentation platforms will simplify the challenges of supporting DDDAS environments. Such considerations create new opportunities for synergistically advancing and expediting advances in the two extreme scales of computing. The talk will address such challenges and opportunities in the context of projects pursuing capability advances through DDDAS such as those presented in the 2014 ICCCS/DDDAS Workshop and elsewhere.
Frederica Darema
287 Control of Artificial Swarms with DDDAS [abstract]
Abstract: A framework for incorporating a swarm intelligent system with the Dynamic Data Driven Application System (DDDAS) is presented. Swarm intelligent systems, or artificial swarms, self-organize into useful emergent structures that are capable of solving complex problems, but are difficult to control and predict. The DDDAS concept utilizes repeated simulations of an executing application to improve analytic and predictive capability by creating a synergistic feedback loop. Incorporating DDDAS with swarm applications can significantly improve control of the swarm. An overview of the DDDAS framework for swarm control is presented, and then demonstrated with an example swarm application.
Robert Mccune, Greg Madey
114 Multifidelity DDDAS Methods with Application to a Self-Aware Aerospace Vehicle [abstract]
Abstract: A self-aware aerospace vehicle can dynamically adapt the way it performs missions by gathering information about itself and its surroundings and responding intelligently. We consider the specific challenge of an unmanned aerial vehicle that can dynamically and autonomously sense its structural state and re-plan its mission according to its estimated current structural health. The challenge is to achieve each of these tasks in real time---executing online models and exploiting dynamic data streams---while also accounting for uncertainty. Our approach combines information from physics-based models, simulated offline to build a scenario library, together with dynamic sensor data in order to estimate current flight capability. Our physics-based models analyze the system at both the local panel level and the global vehicle level.
Doug Allaire, David Kordonowy, Marc Lecerf, Laura Mainini, Karen Willcox
198 Model Based Design Environment for Data-Driven Embedded Signal Processing Systems [abstract]
Abstract: In this paper, we investigate new design methods for data-driven digital signal processing (DSP) systems that are targeted to resource- and energy-constrained embedded environments, such as UAVs, mobile communication platforms and wireless sensor networks. Signal processing applications, such as keyword matching, speaker identification, and face recognition, are of great importance in such environments. Due to critical application constraints on energy consumption, real-time performance, computational resources, and core application accuracy, the design spaces for such applications are highly complex. Thus, conventional static methods for configuring and executing such embedded DSP systems are severely limited in the degree to which processing tasks can adapt to current operating conditions and mission requirements. We address this limitation by developing a novel design framework for multi-mode, data driven signal processing systems, where different application modes with complementary trade-offs are selected, configured, executed, and switched dynamically, in a data-driven manner. We demonstrate the utility of our proposed new design methods on an energy-constrained, multi-mode face detection application.
Kishan Sudusinghe, Inkeun Cho, Mihaela van der Schaar, Shuvra Bhattacharyya
46 A Dynamic Data Driven Application System for Vehicle Tracking [abstract]
Abstract: Tracking the movement of vehicles in urban environments using fixed position sensors, mobile sensors, and crowd-sourced data is a challenging but important problem in applications such as law enforcement and defense. A dynamic data driven application system (DDDAS) is described to track a vehicle’s movements by repeatedly identifying the vehicle under investigation from live image and video data, predict probable future locations of the vehicle, and reposition sensors or retarget requests for information, in order to reacquire the vehicle under surveillance. An overview of the system is described that includes image processing algorithms to detect and recapture the vehicle from live image data, a computational framework to predict probable vehicle locations at future points in time, and an information and power aware data distribution system to disseminate data and requests for information. A prototype of the envisioned system is described that is under development in the midtown area of Atlanta, Georgia in the United States.
Richard Fujimoto, Angshuman Guin, Michael Hunter, Haesun Park, Ramakrishnan Kannan, Gaurav Kanitkar, Michael Milholen, Sabra Neal, Philip Pecher

Agent Based Simulations, Adaptive Algorithms and Solvers (ABS-AA-S) Session 1

Time and Date: 11:20 - 13:00 on 10th June 2014

Room: Tully III

Chair: Maciej Paszynski

Abstract: \begin{document} We have developed fast implementations of B-spline/NURBS based finite element solvers, written using PETSc. PETSc is frequently used in software packages to leverage its optimized and parallel implementation of solvers, however we also are using PETSc data structures to assemble the linear systems. These structures in PETSC (called DA’s) were originally intended for the parallel assembly of linear systems resulting from finite differences. We have reworked this structure for linear systems resulting from isogeometric analysis based on tensor product spline spaces. The result of which is the PetIGA framework for solving problems using isogeometric analysis which is scalable and greatly simplified over previous solvers. Our infrastructure has also allowed us to develop scalable solvers for a variety of problems. We have chosen to pursue nonlinear time dependent problems~\cite{PetIGAp, PetIGAc}, such as: \begin{itemize} \item Cahn-Hilliard \item Navier-Stokes-Korteweg \item Variational Multiscale for Navier-Stokes \item Diffusive Wave Approximation to Shallow Water Equations \item Phase-Field Crystal (PFC) equation and its time integration \item Divergence-conforming B-spline modeling of nanoparticle suspensions \end{itemize} We also have solvers for an assortment of linear problems: Poisson, elasticity, Helmholtz, thin shells, advection-diffusion, and diffusion-reaction. All solvers are written to be inherently parallel and run on anything from a laptop to a supercomputer such as Shaheen, KAUST’s IBM-BlueGeneP supercomputer. In this presentation we will focus on new time integration techniques for phase-field modeling which are energy stable and allow for stable linearizations of the underlying non-linear model~\cite{PFC}. \begin{thebibliography}{99} \setlength{\parskip}{0pt} \bibitem{PetIGAp} N. Collier, L. Dalcin, and V.M. Calo, ``PetIGA: High-Performance Isogeometric Analysis,'' submitted, 2013. \bibitem{PetIGAc} L. Dalcin and N. Collier, ``PetIGA: A framework for high performance Isogeometric Analysis,'', 2013 \bibitem{PFC} P. Vignal, L. Dalcin, D.L. Brown, N. Collier, and V.M. Calo, ``Energy-stable time-discretizations for the phase-field crystal equation,'' in preparation, 2014. \end{thebibliography}
Victor Calo, Nathan Collier, Lisandro Dalcin and Philippe Vignal
44 Graph grammar based multi-thread multi-frontal direct solver with Galois scheduler [abstract]
Abstract: In this paper, we present a multi-frontal solver algorithm for the adaptive finite element method expressed by graph grammar productions. The graph grammar productions construct first the binary elimination tree, and then process frontal matrices stored in distributed manner in nodes of the elimination tree. The solver is specialized for a class of one, two and three dimensional h refined meshes whose elimination tree has a regular structure. In particular, this class contains all one dimensional grids, two and three dimensional grids refined towards point singularities, two dimensional grids refined in an anisotropic way towards edge singularity as well as three dimensional grids refined in an anisotropic way towards edge or face singularities. In all these cases, the structure of the elimination tree and the structure of the frontal matrices are similar. The solver is implemented within the Galois environment, which allows parallel execution of graph grammar productions. We also compare the performance of the Galois implementation of our graph grammar based solver with the MUMPS solver
Damian Goik, Konrad Jopek, Maciej Paszynski, Andrew Lenharth, Donald Nguyen, Keshav Pingali
154 Automatically Adapted Perfectly Matched Layers for Problems with High Contrast Materials Properties [abstract]
Abstract: For the simulation of wave propagation problems, it is necessary to truncate the computational domain. Perfectly Matched Layers are often employed for that purpose, especially in high contrast layered materials where absorbing boundary conditions are difficult to design. In here, we define a Perfectly Matched Layer that automatically adjusts its parameters without any user interaction. The user only has to indicate the desired decay in the surrounding layer. With this Perfectly Matched Layer, we show that even in the most complex scenarios where the material contrast properties are as high as sixteen orders of magnitude, we do not introduce numerical reflections when truncating the domain, thus, obtaining accurate solutions.
Julen Alvarez-Aramberri, David Pardo, Helene Barucq, Elisabete Alberdi Celaya
127 A Linear Complexity Direct Solver for H-adaptive Grids With Point Singularities [abstract]
Abstract: In this paper we present a theoretical proof of linear computational cost and complexity for a recently developed direct solver driven by hypergraph grammar productions. The solver is specialized for computational meshes with point singularities in two and three dimensions. Linear complexity is achieved due to utilizing the special structure of such grids. We describe the algorithm and estimate the exact computational cost on an example of a two-dimensional mesh containing a point singularity. We extend this reasoning to the three dimensional meshes. Numerical results fully support our theoretical estimates.
Piotr Gurgul
436 Towards a new software tool for conservation planning [abstract]
Abstract: In a dynamic world, the process of prioritizing where to invest limited conservation resources is extremely complex. It needs to incorporate information on features (species, or landforms), planning units, ongoing or predicted future threats, and the costs and effectiveness of potential conservation actions. Extended research has been conducted on the spatial and temporal conservation prioritization using software tools such as Marxan, C-Plan, and Zonation to aid managers in their decision-making process. However, these tools are limited in various ways in addressing the full complexity of day-to-day management decisions. Some tools fail to consider variation in: land values in space and time; multiple threats and their spatio-temporal variations; multiple conservation actions applied to individual areas; the feasibility, effectiveness, and varying costs of actions; and the dynamic nature of biodiversity responses in space and time. Optimizing such a multi-dimensional system is a large challenge in complexity mathematics. What is needed is a new software tool that builds on current approaches, but allows for more realistic scenarios as described above, developed and parameterised in close collaboration with managers. This includes the modification of existing tools and the creation of new algorithms. The new software will be trialled in conservation planning exercises for islands in north-western Western Australia and the Great Barrier Reef. The current approaches mostly exploit simulated annealing as it was proven the fastest and sufficiently efficient for problems which do not need the best solution. The new software, however, intends to include sub-models on threats, costs, and contribution of action on individual islands. We are examining the option of use constraint programming to incorporate these sub-models into the decision process, with desirable time resolution.
Jana Brotankova, Bob Pressey, Ian Craigie, Steve Hall, Amelia Wenger

7th Workshop on Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 1

Time and Date: 11:20 - 13:00 on 10th June 2014

Room: Bluewater I

Chair: Giuseppe A. Trunfio

294 Mining Association Rules from Gene Ontology and Protein Networks: Promises and Challenges. [abstract]
Abstract: The accumulation of raw experimental data about genes and proteins has been accompanied by the accumulation of functional information organized and stored into knowledge bases and ontologies. The assembly, organization and analysis of this data has given a considerable impulse to research. Usually biological knowledge is encoded by using annotation terms, i.e. terms describing for instance function or localization of genes and proteins. Such annotations are often organized into ontologies, that offer a formal framework to organize in a systematic way biological knowledge. For instance, Gene Ontology (GO) provides a set of annotations (namely GO Terms) of biological aspects. Consequently, for each biological concept, i.e. gene or protein a list of annotating terms is available. Each annotation may be derived using different methods, and an Evidence Code (EC) takes into account of this process. For instance electronically inferred annotations are distinguished from manual ones. Mining annotation data may thus extract biologically meaningful knowledge. For instance the analysis of these annotated data using association rules may evidence the co-occurrence of annotation helping for instance the classification of proteins starting from the annotation. Nevertheless, the use of frequent itemset mining is less popular with respect to other techniques, such as statistical based methods or semantic similarities. Here we give a short survey of these methods discussing possible future directions of research. We considered in particular the impact of the nature of annotation on association rule performances by discussing two case studies on protein complexes and protein families. As evidenced on this preliminary study the presence of electronic annotation has not a positive impact on the quality of association rules suggesting the possibility to introduce novel algorithm that are aware of evidence codes.
Pietro Hiram Guzzi, Marianna Milano, Mario Cannataro
53 Automated Microalgae Image Classification [abstract]
Abstract: In this paper we present a new method for automated recognition of 12 microalgae that are most commonly found in water resources of Thailand. In order to handle some difficulties encountered in our problem such as unclear algae boundary and noisy background, we proposed a new method for segmenting algae bodies from an image background and proposed a new method for computing texture descriptors from a blurry texture object. Feature combination approach is applied to handle a variation of algae shapes of the same genus. Sequential Minimal Optimization (SMO) is used as a classifier. An experimental result of 97.22% classification accuracy demonstrates an effectiveness of our proposed method.
Sansoen Promdaen, Pakaket Wattuya, Nuttha Sanevas
192 A Clustering Based Method Accelerating Gene Regulatory Network Reconstruction [abstract]
Abstract: One important direction of Systems Biology is to infer Gene Regulatory Networks and many methods have been developed recently, but they cannot be applied effectively in full scale data. In this work we propose a framework based on clustering to handle the large dimensionality of the data, aiming to improve accuracy of inferred network while reducing time complexity. We explored the efficiency of this framework employing the newly proposed metric Maximal Information Coefficient (MIC), which showed superior performance in comparison to other well established methods. Utilizing both benchmark and real life datasets, we showed that our method is able to deliver accurate results in fractions of time required by other state of the art methods. Our method provides as output interactions among groups of highly correlated genes, which in an application on an aging experiment were able to reveal aging related pathways.
Georgios Dimitrakopoulos, Ioannis Maraziotis, Kyriakos Sgarbas, Anastasios Bezerianos
208 Large Scale Read Classification for Next Generation Sequencing [abstract]
Abstract: Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and a pressing need for rapid identification as a prelude to annotation and further analysis. NGS data consists of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on significant attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.
James Hogan, Timothy Peut

Workshop on Computational Chemistry and Its Applications (CCA) Session 1

Time and Date: 11:20 - 13:00 on 10th June 2014

Room: Bluewater II

Chair: Ponnadurai Ramasami

112 Computer-aided design of stereocontrol agents for radical polymerization [abstract]
Abstract: Controlling the stereochemistry of a polymer is highly desirable as this can affect its physical properties, such as its crystallinity, melting point, solubility and mechanical strength. Stereoregular polymers are normally prepared using expensive transition metal catalysts, which typically require demanding reaction conditions, and extending stereochemical-control to free radical polymerization has been a long sought goal. For monomers containing carbonyl groups certain Lewis acids have been shown to be capable of manipulating the stereochemistry, presumably via coordination to the polymer and/or monomer side chains so as to constrain their relative orientations during the propagation step. However, specific mechanistic details have yet to be clarified, and the degree of stereocontrol remains poor. To address these problems, we have been using computational chemistry, supported by polymerization experiments, to study the mechanism of stereocontrol in a variety of free-radical polymerization processes, and to predict the effect of solvents and novel control agents on polymer tacticity. Interestingly we have discovered that many Lewis acids do selectively coordinate to the terminal and penultimate radical side chains in a manner that should, in principle, facilitate control. However, this coordination is driven by the resulting stabilization of the propagating radical, which, ironically, deactivates it toward propagation. At the same time a less energetically favourable, non-controlling coordination mode involving the monomer side chain catalyzes propagation reaction and this provides the dominant reaction path. On this basis we suggest that simultaneous coordination to the monomer and propagating radical using bridging ligands or mixed Lewis acids may provide the way forward.
Michelle Coote, Benjamin Noble and Leesa Smith
38 Correlation between Franck-Condon Factors and Average Internulcear Separations for Diatomics Using the Fourier Grid Hamiltonian Method [abstract]
Abstract: The Fourier Grid Hamiltonian (FGH) Method is used to compute the vibrational eigenvalues and eigenfunctions of bound states of diatomic molecules. For these computations, the Hulburt and Hirschfelder (HH) potential model for diatomics is used. These potential energy functions are used for constructing and diagonalizing the molecular Hamiltonians. The vibrational wave functions for the ground and the excited states are used to calculate the Franck-Condon factors (FCFs), r-Centroids and average internuclear separations which play a significant role in determining the intensity of the bands in electronic transitions. The results of FCFs and r-Centroids for diatomic molecules such as H2, N2, CO, I2 and HF using the FGH method are compared with other methods. The FGH method provides an efficient and accurate alternative to calculate FCFs and other parameters that depend on the vibrational wavefunctions of the ground and exited electronic states. The Franck-Condon profiles indicate a strong correlation between the values of Franck-Condon factors and the mean internuclear separations for the corresponding transitions.
Mayank Kumar Dixit, Abhishek Jain, Bhalachandra Laxmanrao Tembe
122 Using hyperheuristics to improve the determination of the kinetic constants of a chemical reaction in heterogeneous phase [abstract]
Abstract: The reaction in the human stomach when neutralizing acid with an antacid tablet is simulated and the evolution over time of the concentration of all chemical species present in the reaction medium is obtained. The values of the kinetic parameters of the chemical reaction can be determined by integrating the equation of the reaction rate. This is a classical optimization problem that can be approached with metaheuristic methods. The use of a parallel, parameterized scheme for metaheuristics facilitates the development of metaheuristics and their application. The unified scheme can also be used to implement hyperheuristics on top of parameterized metaheuristics, so selecting appropriate values for the metaheuristic parameters, and consequently the metaheuristic itself. The hyperheuristic approach provides satisfactory values for the metaheuristic parameters and, consequently, satisfactory metaheuristics for the problem of determining the kinetic constants.
José Matías Cutillas Lozano, Domingo Gimenez
267 Speeding up Monte Carlo Molecular Simulation by a Non-Conservative Early Rejection Scheme [abstract]
Abstract: Molecular simulation describes fluid systems in detailed fashion. In general, they are more accurate and representative than equations of state. However, they require much more computational efforts. Several techniques have been developed in order to speed up Monte Carlo (MC) molecular simulations while preserving their precision. In particular, early rejection schemes are capable of reducing computational cost by reaching the rejection decision for the undesired MC trials at early stages. In this work, the introduced scheme is based on the fact that the energy due to interaction between any couple of Lennard-Jones (LJ) sites cannot be lower than a certain minimum energy that can be easily computed. It is called “non-conservative” as it generates slightly different Markov chains than the ones generated by the conventional algorithms. Nonetheless, the numerical experiments conducted show that these modifications are not significant, and both the proposed and the conventional methods converge to the same ensemble averages. In this study, the non-conservative scheme is first introduced and then compared to the conservative and bond formation early rejection schemes. The method was tested for LJ particles in canonical ensemble at several thermodynamic conditions. Results showed a relation between the thermodynamic conditions and the percentage of the CPU time saved. In principle, more CPU time was saved at conditions with high rejection rates for the MC trials. The non-conservative early rejection scheme was successful in saving more than 45 % of the CPU time needed by the conventional algorithms in canonical ensemble. Finally, this work presents an efficient early rejection method to accelerate MC molecular simulations which is easily extendable to other ensembles and complex molecules.
Ahmad Kadoura, Amgad Salama and Shuyu Sun

Mathematical Methods and Algorithms for Extreme Scale (MMAES) Session 1

Time and Date: 11:20 - 13:00 on 10th June 2014

Room: Mossman

Chair: Vassil Alexandrov

227 Fast Iterative Method in solving Eikonal equations : a multi-level parallel approach [abstract]
Abstract: The fast marching method is widely used to solve the eikonal equation. By introducing a new way of managing propagation interfaces which avoid the use of expensive data structures, the fast iterative method reveals to be a faster variant with a higher parallel potential compared to the fast marching method. We investigate in this paper a multi-level parallel approach for the Fast Iterative Method which is well fitted for today heterogenous and hierarchical architectures. We show experiment results which focus on the fine-grained parallel level of the algorithm and we give a performance analysis.
Florian Dang, Nahid Emad
304 A Parallel Implementation of Singular Value Decomposition for Video-on-Demand Services Design Using Principal Component Analysis [abstract]
Abstract: We have developed a mathematical model for video on demand server design based on principal component analysis. Singular value decomposition on the video correlation matrix is used to perform the PCA. The challenge is to counter the computational complexity, which grows proportionally to n^3, where n is the number of video streams. We present a solution from high performance computing, which splits the problem up and computes it in parallel on a distributed memory system.
Raul Ramirez-Velarde, Martin Roderus, Carlos Barba-Jimenez, Raul Perez-Cazares
213 Boosting FDTD Performance by Compiler- Generated Recursive Space-Time Update Schemes [abstract]
Abstract: Traditional implementations of the explicit Finite-Difference Time-Domain (FDTD) solvers use layer by layer time update which requires reload of the whole mesh data into memory (and synchronization between processes for parallel solvers) at every time step. This type of update is cache inefficient and renders the task to be memory bound by the slowest bandwidth in the computer memory hierarchy. Depending on the task size and computer architecture, this can be cache, shared memory, interprocessor communication or disk I/O bandwidth. For large scale calculations, especially when mesh data size exceeds the total available processor memory, the traditional FDTD simulation becomes prohibitively inefficient. There exist alternative approaches to implement FDTD solvers that explore the whole time-space problem and utilize additional data locality due to the time dimension. It was shown that it is possible to reach the theoretical peak rate of algorithm efficiency for any given task size and compute architecture, even for problems with very large computational grids (10^12 Yee cells) [1]. Efficient usage of the whole computer memory bandwidth hierarchy when implementing numerical algorithms is crucial for extreme scale computing since one may expect this hierarchy be rather diverse in the (future) exascale computers. In this work we present a systematic way of implementing a space-time update scheme which is based on a recursive decomposition of N+1 dimensional space-time data dependency graph of the whole calculation into smaller subgraphs. We compose a Locally Recursive Nonlocally Asynchronous (LRnLA) update algorithm [1]: each subgraph is split locally into similar subgraphs while processing of independent subgraphs may be performed concurrently. For explicit stencils the dependency graph is locally governed by the stencil slope in coordinate-time plane. We use primitive triangular up- and down-pointed geometric shapes to represent the projections of the dependency graph on any coordinate-time plane and develop universal mnemonic rules to process the shapes for arbitrary space dimension. In our implementation these shapes and rules are encoded into C++ template class hierarchy by using boost fusion functionality [2], thus the update algorithm is generated mainly at compile time by a standard C++ compiler. This keeps the implementation highly portable and minimizes the overhead for run-time analysis of the data dependency graph. Termination of the recurrence (smallest subgraph) performs the actual FDTD stencil operation corresponding to a time update of a single mesh cell. The resulting FDTD algorithm is cache oblivious [3] and also allows for concurrent execution of subtasks of different size. Concurrent task scheduling is governed by the dependencies between subgraphs which become known in course of the shape decomposition. Depending on the computer architecture the scheduling may simultaneously take into account different parallel execution levels such as MPI and multithreading. Concurrent execution mechanisms are switched on (programmed) for subgraphs reaching some suitable size (rank) in course of recursion. In this presentation we discuss the implementation and analyze the performance of the implemented 3D FDTD algorithm for various computer architectures, including multicore systems and large clusters (up to 9000 cores). We demonstrate the FDTD update performance reaching up to 75% of the estimated CPU peak which is 10-30 times higher than that of the traditional FDTD solvers. We also demonstrate an almost perfect parallel scaling of the implemented solver. We discuss the effect of mesh memory layouts such as Z-curve (Morton order) increasing locality of data or interleaved layouts for vectorized updates. The implementation of the algorithm for GPU is discussed with some preliminary results. [1] Zakirov A V and Levchenko V D 2009 PIERS proceedings, Moscow, Russia 580--584 [2] [3] H. Prokop. Cache-Oblivious Algorithms. Master’s thesis, MIT. 1999.
Ilya Valuev and Andrey Zakirov
409 Challenges of Big Data Mining [abstract]
Abstract: At present, Big Data becomes reality that no one can ignore. Big Data is our environment whenever we need to make a decision. Big Data is a buzz word that makes everyone understands how important it is. Big Data shows a big opportunity for academia, industry and government. Big Data then is a big challenge for all parties. This talk will discuss some fundamental issues of Big Data problems, such as data heterogeneity vs. decision heterogeneity, data stream research and data-driven decision management. Furthermore, this talk will provide a number of real-life Bid Data Applications. In the conclusion, the talk suggests a number of open research problems in Data Science, which is a growing field beyond Big Data.
Yong Shi
410 Scalable Stochastic and Hybrid Methods and Algorithms for Extreme Scale Computing [abstract]
Abstract: Novel mathematics and mathematical modelling approaches together with scalable algorithms are needed to enable key applications at extreme-scale. This is especially true as HPC systems continue to scale up in compute node and processor core count. At the moment computational scientists are at the critical point/threshold of novel mathematics development as well as large-scale algorithm development and re-design and implementation that will affect most of the application areas. Thus the paper will focus on the mathematical and algorithmic challenges and approaches towards exascale and beyond and in particular on stochastic and hybrid methods that in turn lead to scalable scientific algorithms with minimal or no global communication, hiding network and memory latency, have very high computation/communication overlap, have no synchronization points.
Vassil Alexandrov