ICCS 2016 Main Track (MT) Session 11

Time and Date: 10:15 - 11:55 on 7th June 2016

Room: Toucan

Chair: Raymond de Callafon

43 An Evaluation of Data Stream Processing Systems for Data Driven Applications
Abstract: Real-time data stream processing technologies play an important role in enabling time-critical decision making in many applications. This paper evaluates the performance of platforms that are capable of processing streaming data. Candidate technologies include Storm, Samza, and Spark Streaming. To form the recommendation, a prototype pipeline is designed and implemented in each of the platforms using data collected from sensors used in monitoring heavy-haul railway systems. The paper describes the findings from testing and evaluating each candidate platform using both quantitative and qualitative metrics.
Jonathan Samosir, Maria Indrawan-Santiago, Pari Delir Haghighi
122 Improving Multivariate Data Streams Clustering
Abstract: Clustering data streams is an important task in data mining research. Recently, some algorithms have been proposed to cluster data streams as a whole, but only a few of them deal with multivariate data streams. Even so, these algorithms merely aggregate the attributes without touching upon the correlation among them. In order to overcome this issue, we propose a new framework to cluster multivariate data streams based on their evolving behavior over time, exploring the correlations among their attributes by computing the fractal dimension. Experimental results with climate data streams show that the clusters' quality and compactness can be improved compared to the competing method, leading to the conclusion that correlations among attributes cannot be put aside. In fact, the clusters' compactness is 7 to 25 times better using our method. Our framework also proves to be a useful tool to assist meteorologists in understanding climate behavior over a period of time.
Christian Bones, Luciana Romani, Elaine de Sousa
465 Network Services and Their Compositions for Network Science Applications
Abstract: Network science is moving more and more toward computing dynamics on networks (so-called contagion processes), in addition to computing structural network features (e.g., key players and the like) and other parameters. Generalized contagion processes impose additional data storage and processing demands that include more generic and versatile manipulations of networked data that can be highly attributed. In this work, we describe a new network services and workflow system called MARS that supports structural network analyses and generalized network dynamics analyses. It is accessible through the internet and can serve multiple simultaneous users and software applications. In addition to managing various types of digital objects, MARS provides services that enable applications (and UIs) to add, interrogate, query, analyze, and process data. We focus on several network services and workflows of MARS in this paper. We also provide a case study using a web-based application that MARS supports, and several performance evaluations of scalability and workloads. We find that MARS efficiently processes networks of hundreds of millions of edges from many hundreds of simultaneous users.
Sherif Abdelhamid, Chris Kuhlman, Madhav Marathe, S. S. Ravi