Main Track (MT) Session 10

Time and Date: 11:00 - 12:40 on 11th June 2014

Room: Tully I

Chair: S. Smanchat

18 A Workflow Application for Parallel Processing of Big Data from an Internet Portal [abstract]
Abstract: The paper presents a workflow application for efficient parallel processing of data downloaded from an Internet portal. The workflow partitions input files into subdirectories, which are further split for parallel processing by services installed on distinct computer nodes. This way, analysis of the first subdirectories to become ready can start quickly; it is handled by services implemented as parallel multithreaded applications that use the multiple cores of modern CPUs. The goal is to assess achievable speed-ups and to determine which factors influence scalability and to what degree. Data processing services were implemented to assess the context (positive or negative) in which a given keyword appears in a document. The testbed application used these services to determine how a particular brand was perceived by authors of articles and by readers in comments on a specific Internet portal focused on new technologies. Execution times and speed-ups are presented for data sets of various sizes, along with a discussion of how factors such as load imbalance and memory/disk bottlenecks limit performance. (A simplified sketch of this partition-and-score pattern follows this entry.)
Pawel Czarnul
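
The following self-contained C++ sketch illustrates the partition-and-score pattern described in the abstract above. It is not the paper's implementation: the sentiment word lists, the round-robin split across std::async workers, and all identifiers (scoreDocument, ContextScore) are illustrative assumptions, and threads within one process stand in for the services that the paper deploys on distinct nodes.

#include <algorithm>
#include <future>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

struct ContextScore { int positive = 0; int negative = 0; };

// Score one document: count sentiment words in a small window around
// each occurrence of the keyword. Word lists are toy assumptions.
ContextScore scoreDocument(const std::string& text, const std::string& keyword) {
    static const std::vector<std::string> pos = {"great", "reliable", "fast"};
    static const std::vector<std::string> neg = {"poor", "slow", "broken"};
    ContextScore s;
    std::istringstream in(text);
    std::vector<std::string> words((std::istream_iterator<std::string>(in)),
                                   std::istream_iterator<std::string>());
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (words[i] != keyword) continue;
        std::size_t lo = i >= 3 ? i - 3 : 0;            // +/-3 word context window
        std::size_t hi = std::min(words.size(), i + 4);
        for (std::size_t j = lo; j < hi; ++j) {
            if (std::find(pos.begin(), pos.end(), words[j]) != pos.end()) ++s.positive;
            if (std::find(neg.begin(), neg.end(), words[j]) != neg.end()) ++s.negative;
        }
    }
    return s;
}

int main() {
    std::vector<std::string> docs = {"the brand is great and fast",
                                     "support from the brand was poor"};
    const std::string keyword = "brand";
    const std::size_t batches = 2;   // stands in for the per-node services

    // Split the document set round-robin and score each batch in parallel.
    std::vector<std::future<ContextScore>> jobs;
    for (std::size_t b = 0; b < batches; ++b)
        jobs.push_back(std::async(std::launch::async, [&, b] {
            ContextScore acc;
            for (std::size_t i = b; i < docs.size(); i += batches) {
                ContextScore s = scoreDocument(docs[i], keyword);
                acc.positive += s.positive;
                acc.negative += s.negative;
            }
            return acc;
        }));

    ContextScore total;
    for (auto& j : jobs) {
        ContextScore s = j.get();
        total.positive += s.positive;
        total.negative += s.negative;
    }
    std::cout << "positive=" << total.positive << " negative=" << total.negative << "\n";
}

The property mirrored here is the one the abstract emphasizes: each batch can be scored as soon as it becomes available, independently of the others.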
273 A comparative study of scheduling algorithms for the multiple deadline-constrained workflows in heterogeneous computing systems with time windows [abstract]
Abstract: Scheduling tasks with precedence constraints on a set of resources with different performances is a well-known NP-complete problem, and a number of effective heuristics have been proposed to solve it. If the start time and the deadline of each workflow are known (for example, when a workflow is triggered by periodic data coming from sensors and must complete before the next data acquisition), the problem of scheduling multiple deadline-constrained workflows arises. Taking into account that resource providers may grant only restricted access to their computational capabilities, we consider the case where resources are only partially available for workflow execution. To address this problem, we study the scheduling of deadline-constrained scientific workflows in a non-dedicated heterogeneous environment. In this paper, we introduce three scheduling algorithms for mapping the tasks of multiple workflows with different deadlines onto a static set of resources with known free time windows. Simulation experiments show that scheduling strategies based on the proposed staged scheme give better results than a merge-based approach that considers all workflows at once. (A toy deadline-first placement sketch follows this entry.)
Klavdiya Bochenina
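
As a rough illustration of scheduling onto partially available resources, the sketch below takes tasks in deadline order (a staged, deadline-first scheme) and places each into the earliest free time window that fits. It is a deliberately simplified stand-in, not one of the paper's three algorithms: precedence constraints are omitted and all structures (Window, Resource, Task) are assumptions.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

struct Window { double start, end; };          // free interval on a resource
struct Resource { std::string name; std::vector<Window> windows; };
struct Task { std::string name; double duration, deadline; };

// Place the task in the earliest window that fits before its deadline.
bool place(Task& t, std::vector<Resource>& res) {
    for (auto& r : res)
        for (auto& w : r.windows)
            if (w.end - w.start >= t.duration && w.start + t.duration <= t.deadline) {
                std::cout << t.name << " -> " << r.name << " at t=" << w.start << "\n";
                w.start += t.duration;         // shrink the consumed window
                return true;
            }
    return false;
}

int main() {
    std::vector<Resource> res = {
        {"node-A", {{0, 4}, {6, 10}}},         // two free windows
        {"node-B", {{2, 8}}}};
    std::vector<Task> tasks = {
        {"wf1.t1", 3, 5}, {"wf2.t1", 4, 9}, {"wf1.t2", 2, 10}};

    // Stage tasks by deadline: the workflow due first is scheduled first.
    std::sort(tasks.begin(), tasks.end(),
              [](const Task& a, const Task& b) { return a.deadline < b.deadline; });
    for (auto& t : tasks)
        if (!place(t, res)) std::cout << t.name << " misses its deadline\n";
}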
292 Fault-Tolerant Workflow Scheduling Using Spot Instances on Clouds [abstract]
Abstract: Scientific workflows are used to model high-throughput computation and complex large-scale data analysis applications. In recent years, Cloud computing has been rapidly evolving into the target platform for such applications among researchers. Furthermore, Cloud providers have pioneered new pricing models that allow users to provision resources and use them efficiently, with significant cost reductions. In this paper, we propose a scheduling algorithm that schedules tasks on Cloud resources using two different pricing models (spot and on-demand instances) to reduce the cost of execution whilst meeting the workflow deadline. The proposed algorithm is fault-tolerant against the premature termination of spot instances and is also robust against performance variations of Cloud resources. Experimental results demonstrate that our heuristic reduces execution cost by up to 70% compared to using only on-demand instances. (A toy spot/on-demand fallback sketch follows this entry.)
Deepak Poola, Kotagiri Ramamohanarao, Rajkumar Buyya
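
The general spot/on-demand fallback idea can be sketched as follows. This is a hedged toy model, not the paper's heuristic: the prices, the fixed failure probability, and the slack-based safety test are all illustrative assumptions, and the robustness to performance variation that the abstract mentions is ignored here.

#include <cstdlib>
#include <iostream>

struct Task { double runtime; double deadlineSlack; };      // hours

constexpr double SPOT_PRICE = 0.03, ONDEMAND_PRICE = 0.10;  // $/hour (assumed)

// Spot is attempted only while a failed try would still leave time to
// redo the work on an on-demand instance before the deadline.
bool spotIsSafe(const Task& t) { return t.deadlineSlack >= t.runtime; }

// Toy failure model: a spot instance survives the task with probability 0.7.
bool runOnSpot(const Task&) { return (std::rand() % 10) < 7; }

double execute(Task t) {
    double cost = 0;
    while (spotIsSafe(t)) {
        cost += t.runtime * SPOT_PRICE;
        if (runOnSpot(t)) return cost;         // finished cheaply on spot
        t.deadlineSlack -= t.runtime;          // lost time to the termination
        std::cout << "spot terminated, slack left: " << t.deadlineSlack << "\n";
    }
    // Not enough slack for another gamble: finish on an on-demand instance.
    return cost + t.runtime * ONDEMAND_PRICE;
}

int main() {
    std::srand(42);
    Task t{2.0 /*h runtime*/, 5.0 /*h slack*/};
    std::cout << "total cost: $" << execute(t) << "\n";
}

The design point this mirrors is that deadline slack is the budget that makes unreliable spot capacity safe to use; once it is exhausted, the scheduler must pay the on-demand premium.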
308 On Resource Efficiency of Workflow Schedules [abstract]
Abstract: This paper presents the Maximum Effective Reduction (MER) algorithm, which optimizes the resource efficiency of a workflow schedule generated by any particular scheduling algorithm. MER trades a minimal increase in makespan for a maximal reduction in resource usage by consolidating tasks, exploiting resource inefficiency in the original workflow schedule. Our evaluation shows that the rate of resource usage reduction far outweighs that of the increase in makespan: the number of resources used is halved on average while incurring an increase in makespan of less than 10%. (A simplified consolidation sketch follows this entry.)
Young Choon Lee, Albert Y. Zomaya, Hyuck Han
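
A much-simplified flavor of schedule consolidation can be shown with greedy interval partitioning: tasks keep their original start times and are packed onto as few resources as possible, so idle gaps in the input schedule translate directly into fewer resources. MER itself goes further, additionally trading a bounded makespan increase for more reduction; the code below is an illustrative assumption, not the published algorithm.

#include <algorithm>
#include <iostream>
#include <vector>

struct Slot { double start, end; int resource = -1; };  // one scheduled task

int main() {
    // A toy input schedule that originally used 4 resources (one per task).
    std::vector<Slot> tasks = {{0, 3}, {1, 2}, {3, 5}, {2, 6}};

    // Greedy interval partitioning: earliest-start first, reuse a resource
    // whenever its last task has already finished.
    std::sort(tasks.begin(), tasks.end(),
              [](const Slot& a, const Slot& b) { return a.start < b.start; });
    std::vector<double> freeAt;                // finish time per open resource
    for (auto& t : tasks) {
        bool reused = false;
        for (std::size_t r = 0; r < freeAt.size(); ++r)
            if (freeAt[r] <= t.start) {        // resource r is idle again
                freeAt[r] = t.end;
                t.resource = static_cast<int>(r);
                reused = true;
                break;
            }
        if (!reused) {                         // no idle resource: open a new one
            freeAt.push_back(t.end);
            t.resource = static_cast<int>(freeAt.size()) - 1;
        }
    }
    std::cout << "resources used: " << freeAt.size() << " (was 4)\n";
    for (const auto& t : tasks)
        std::cout << "[" << t.start << "," << t.end << ") on r" << t.resource << "\n";
}

On this toy input the four single-task resources consolidate to two, with no makespan change at all; allowing start times to shift, as MER does, can reduce the count further.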
346 GridMD: a Lightweight Portable C++ Library for Workflow Management [abstract]
Abstract: In this contribution we present the current state of the open-source GridMD workflow library (http://gridmd.sourceforge.net). The library was originally designed for programmers of distributed Molecular Dynamics (MD) simulations; nowadays it serves as a universal tool for creating and managing general workflows from a compact client application. GridMD is a programming tool aimed at developers of distributed software that uses local or remote compute capabilities to perform loosely coupled computational tasks.

Unlike other workflow systems and platforms, GridMD is not integrated with heavy infrastructure such as Grid systems, web portals, user and resource management systems, or databases. It is a very lightweight tool that accesses and operates on a remote site using delegated user credentials. For starting compute jobs, the library supports the Globus Grid environment, cluster queuing managers such as PBS (Torque) and SLURM, and Unix/Windows command shells. All job-starting mechanisms may be used either locally or remotely via the integrated SSH protocol. Working with different queues, starting parallel (MPI) jobs, and changing job parameters are supported generically by the API. Jobs are started and monitored in a "passive" way that does not require any special task management agents to be running, or even installed, on the remote system.

Workflow execution is monitored by an application (a task manager performing GridMD API calls) running on a client machine. Data transfer between compute resources, and between the client machine and a compute resource, is performed by exchanging files over GridFTP or SSH channels. The task manager is able to checkpoint and restart the workflow and to recover from different types of errors without recalculating the whole workflow. The task manager itself can easily be terminated and restarted on the client machine, or transferred to another client, without breaking workflow execution.

Apart from separate tasks such as command series or application launches, a GridMD workflow may also manage integrated tasks described by code compiled as part of the task manager. Moreover, integrated tasks may change the workflow dynamically by adding jobs or dependencies to the existing workflow graph. This dynamic management of the workflow graph is an essential feature of GridMD and gives the programmer great flexibility in designing distributed scenarios. GridMD also provides a set of useful workflow skeletons for standard distributed scenarios such as Pipe, Fork, Parameter Sweep, and Loop (implemented as a dynamic workflow).

In the talk we will discuss the architecture and special features of GridMD. We will also briefly describe recent applications of GridMD as the basis of a distributed job manager, for example in the multiscale OLED simulation platform of the EU-Russia IM3OLED project. (A minimal sketch of a dynamically extensible workflow graph follows this entry.)
Ilya Valuev, Igor Morozov
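
GridMD's actual API is documented at http://gridmd.sourceforge.net; the sketch below deliberately does not reproduce it. It only illustrates the core idea the abstract highlights, a client-side workflow graph whose nodes can append further nodes while the workflow runs, using assumed types (Workflow, Node) and with dependency tracking elided.

#include <functional>
#include <iostream>
#include <queue>
#include <string>
#include <utility>

struct Workflow {
    struct Node { std::string name; std::function<void(Workflow&)> work; };
    std::queue<Node> ready;                    // dependencies elided for brevity

    void add(std::string name, std::function<void(Workflow&)> work) {
        ready.push({std::move(name), std::move(work)});
    }
    void run() {
        while (!ready.empty()) {
            Node n = std::move(ready.front());
            ready.pop();
            std::cout << "running " << n.name << "\n";
            n.work(*this);                     // a node may add new nodes here
        }
    }
};

int main() {
    Workflow wf;
    // A parameter-sweep-like node that expands the graph at run time,
    // loosely analogous to GridMD's dynamic workflow skeletons.
    wf.add("sweep", [](Workflow& w) {
        for (int p = 0; p < 3; ++p)
            w.add("sim(p=" + std::to_string(p) + ")", [](Workflow&) {});
    });
    wf.run();
}

Because the graph is extended from inside a running node, the full set of jobs never has to be known up front, which is the flexibility the abstract attributes to GridMD's dynamic workflow management.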