ICCS 2019 Main Track (MT) Session 14

Time and Date: 16:30 - 18:10 on 13th June 2019

Room: 1.3

Chair: To be announced

272 Autism Screening using Deep Embedding Representation
Abstract: Autism spectrum disorder (ASD) is a developmental disorder that affects communication and behavior. Early diagnosis of neurodevelopmental disorders can improve treatment and significantly decrease the associated healthcare costs, which reveals an urgent need for the development of ASD screening. However, the data used for ASD screening is heterogeneous and multi-source, so existing ASD screening tools are expensive, time-intensive, and sometimes fall short in predictive accuracy. In this paper, we apply novel feature engineering and feature encoding techniques, along with a deep learning classifier, to ASD screening. The proposed algorithm combines a robust deep learning classifier with a deep embedding representation for categorical variables to diagnose ASD based on behavioral features and individual characteristics. The proposed algorithm is effective compared with baselines, achieving 99% sensitivity and 99% specificity. The results suggest that deep embedding representation learning is a reliable method for ASD screening.
Haishuai Wang, Li Li, Lianhua Chi and Ziping Zhao
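The embedding of categorical variables mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the questionnaire answers, the function names, and the embedding size are hypothetical, and the table is only randomly initialised here, whereas in a real model its rows would be learned together with the classifier.

```python
import random

def build_vocab(values):
    """Map each distinct category to an integer index (sorted for determinism)."""
    return {v: i for i, v in enumerate(sorted(set(values)))}

def init_embedding(num_categories, dim, rng):
    """Create a table with one small random vector per category.
    In a trained model these rows would be updated by gradient descent."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(num_categories)]

def embed(value, vocab, table):
    """Look up the dense vector that represents a categorical value."""
    return table[vocab[value]]

rng = random.Random(0)
# Hypothetical answers to one screening question (not taken from the paper).
answers = ["never", "sometimes", "often", "sometimes"]
vocab = build_vocab(answers)
table = init_embedding(len(vocab), dim=4, rng=rng)
vectors = [embed(a, vocab, table) for a in answers]
```

Identical categories share one vector, so the classifier receives a dense, fixed-length input instead of a sparse one-hot code.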
278 Function and pattern extrapolation with product-unit networks
Abstract: Neural networks are a popular method for function approximation and data classification and have recently drawn much attention because of the success of deep-learning strategies. Artificial neural networks are built from elementary units that generate a piecewise, often almost linear approximation of the function or pattern. To improve the extrapolation of nonlinear functions and patterns beyond the training domain, we propose to augment the fundamental algebraic structure of neural networks by a product unit that computes the product of its inputs raised to the power of their weights, namely $\prod_{i} x_i^{w_i}$. Linearly combining their outputs in a weighted sum allows representing most nonlinear functions known in calculus, including roots, fractions and approximations of power series. We train the network using gradient descent. The enhanced extrapolation capabilities of the network are demonstrated by comparing the results for a function and pattern extrapolation task with those obtained with the nonlinear support vector machine (SVM) and a standard neural network (standard NN). Convergence behavior of stochastic gradient descent is discussed and the feasibility of the approach is demonstrated in a real-world application in image segmentation.
Babette Dellen, Uwe Jaekel and Marcell Wolnitza
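As a small illustration of the product unit $\prod_{i} x_i^{w_i}$ described in the abstract above, the following sketch evaluates a weighted sum of product units. The function names and the example weights are assumptions made for illustration, the inputs are assumed to be positive, and no training step is shown.

```python
import math

def product_unit(x, w):
    """Compute prod_i x_i ** w_i (inputs assumed positive)."""
    return math.prod(xi ** wi for xi, wi in zip(x, w))

def product_unit_network(x, weights, coeffs):
    """Linearly combine the outputs of several product units."""
    return sum(c * product_unit(x, w) for c, w in zip(coeffs, weights))

# Hypothetical weights representing f(x0, x1) = sqrt(x0) / x1 + 2 * x0 * x1.
weights = [[0.5, -1.0], [1.0, 1.0]]
coeffs = [1.0, 2.0]
y = product_unit_network([4.0, 2.0], weights, coeffs)  # 2/2 + 2*4*2 = 17.0
```

Fractional and negative exponents give roots and fractions, which is why such units can represent functions that a piecewise-linear network only approximates locally.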
333 Fast and Scalable Outlier Detection with Metric Access Methods
Abstract: It is well known that the existing theoretical models for outlier detection make assumptions that may not reflect the true nature of outliers in every real application. With that in mind, this paper describes an empirical study of unsupervised outlier detection using 8 state-of-the-art algorithms and 8 datasets that cover a variety of high-impact real-world tasks, such as spotting cyberattacks, clinical pathologies and abnormalities in nature. We present the lowdown on the results obtained, pointing out the strengths and weaknesses of each technique from the application specialist’s point of view, which is a shift from the commonly considered designer-based point of view. Interestingly, many of the techniques had infeasibly high runtime requirements or failed to spot what the specialists consider to be outliers in their own data. To tackle this issue, we propose MetricABOD: a novel ABOD-based algorithm that makes the analysis up to thousands of times faster while being on average 12% more accurate than the most accurate related work. This improvement is essential to enable outlier detection in many real-world applications for which the existing methods lead to unexpected results or infeasible runtime requirements. Finally, we studied two real collections of text data to show that MetricABOD also works for adimensional, purely metric data.
Altamir Gomes Bispo Junior and Robson Leonardo Ferreira Cordeiro
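For readers unfamiliar with ABOD (angle-based outlier detection), on which the abstract above builds, here is a minimal sketch of the classic idea in a simplified, unweighted form. It is not MetricABOD and uses no metric access method; the function names and sample points are hypothetical, and all points are assumed to be distinct.

```python
from itertools import combinations

def angle_variance(p, others):
    """Variance of the distance-scaled angles that p forms with all pairs
    of other points. A small variance suggests that p is an outlier."""
    vals = []
    for a, b in combinations(others, 2):
        pa = [ai - pi for ai, pi in zip(a, p)]
        pb = [bi - pi for bi, pi in zip(b, p)]
        dot = sum(u * v for u, v in zip(pa, pb))
        na = sum(u * u for u in pa)  # squared length of pa
        nb = sum(v * v for v in pb)  # squared length of pb
        vals.append(dot / (na * nb))
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def abod_scores(points):
    """Naive O(n^3) scoring of every point against all the others."""
    return [angle_variance(p, points[:i] + points[i + 1:])
            for i, p in enumerate(points)]

# Hypothetical data: a small cluster plus one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
          (0.5, 0.5), (10.0, 10.0)]
scores = abod_scores(points)
outlier = min(range(len(points)), key=scores.__getitem__)
```

The cubic cost of this naive version hints at why runtime is a concern on large datasets.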
384 Deep Learning Based LSTM and SeqToSeq Models to Detect Monsoon Spells of India
Abstract: Monsoon spells are important climatic phenomena that modulate the quality and quantity of the monsoon over a year. Because India is an agricultural country, identifying monsoon spells is extremely important for planning agricultural policies that follow the phases of the monsoon to attain maximum productivity. Detecting monsoon spells involves analyzing and predicting the monsoon at a daily level, which makes it more challenging because daily variability is higher than the variability of the monsoon over a month or a year. In this article, deep-learning-based long short-term memory and sequence-to-sequence models are used to classify monsoon days, which are then assembled to detect the spells. Dry and wet days are classified with precision of 0.95 and 0.87, respectively. Break spells are observed to be forecast with higher accuracy than active spells. Additionally, the sequence-to-sequence model is noted to outperform the long short-term memory model. The proposed models also outperform traditional classification models for monsoon spell detection.
V. Saicharan, Moumita Saha, Pabitra Mitra and Ravi S. Nanjundiah
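The final step in the abstract above, assembling classified days into spells, can be sketched as grouping runs of consecutive days with the same label. This is only an assumed post-processing step: the `min_len` threshold, the label names, and the function name are hypothetical, and the paper may define spells differently.

```python
from itertools import groupby

def assemble_spells(day_labels, min_len=3):
    """Group consecutive days with the same label into spells.
    Returns (label, first_day_index, last_day_index) for each run that
    lasts at least min_len days (min_len is an assumed threshold)."""
    spells = []
    start = 0
    for label, run in groupby(day_labels):
        length = len(list(run))
        if length >= min_len:
            spells.append((label, start, start + length - 1))
        start += length
    return spells

# Hypothetical per-day classifier output.
labels = ["wet", "wet", "wet", "dry", "wet", "dry", "dry", "dry", "dry"]
spells = assemble_spells(labels)
```

Runs shorter than the threshold, such as the single dry day at index 3, are not reported as spells.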
507 Data Analysis for Atomic Shapes in Nuclear Science
Abstract: One of the overarching questions in the field of nuclear science is how simple phenomena emerge from complex systems. A nucleus is composed of both protons and neutrons, and while many assume the atomic nucleus adopts a spherical shape, the nuclear shape is, in fact, quite variable. Nuclear physicists seek to understand the shape of the atomic nucleus by probing specific transitions between nuclear energy states, which occur at high energy and on short timescales. This is achieved by detecting a unique experimental signature in the time-series data recorded in experiments conducted at the National Superconducting Cyclotron Laboratory. The current method involves fitting each sample in the dataset to a given parameterized model function. However, this procedure is computationally expensive due to the nature of the nonlinear curve fitting problem. Since the data is skewed towards non-unique signatures, we offer a way to filter out the majority of the uninteresting samples from the dataset using machine learning methods. By doing so, we decrease the computational cost of detecting the unique experimental signatures in the time-series data. We also present a way to generate synthetic training data by estimating the distribution of the underlying parameters of the model function with Kernel Density Estimation. The new workflow, which leverages machine-learned classifiers trained on the synthetic data, is shown to significantly outperform the current procedures on actual datasets.
Mehmet Kaymak, Hasan Metin Aktulga, Ron Fox and Sean Liddick
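The idea in the abstract above of drawing synthetic parameters from a kernel density estimate can be sketched for a single parameter. This is a minimal sketch under stated assumptions: a Gaussian kernel, a normal-reference rule-of-thumb bandwidth, and made-up fitted values. The function names are hypothetical and the authors' model function is not reproduced.

```python
import random
import statistics

def rule_of_thumb_bandwidth(values):
    """Normal-reference bandwidth: 1.06 * sigma * n^(-1/5)."""
    n = len(values)
    return 1.06 * statistics.stdev(values) * n ** (-1 / 5)

def sample_kde(values, n_samples, rng):
    """Draw samples from a Gaussian KDE fitted to values: pick an observed
    value uniformly at random, then add Gaussian noise with the bandwidth
    as its standard deviation."""
    h = rule_of_thumb_bandwidth(values)
    return [rng.gauss(rng.choice(values), h) for _ in range(n_samples)]

rng = random.Random(42)
# Hypothetical fitted values of one model parameter (not from the paper).
fitted = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.05, 0.95]
synthetic = sample_kde(fitted, 1000, rng)
```

Synthetic parameter values drawn this way could then be passed to the model function to produce labelled training traces.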