Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 1

Time and Date: 13:35 - 15:15 on 11th June 2018

Room: M7

Chair: Rodrigo Weber dos Santos

39 Combining Data Mining Techniques to Enhance Cardiac Arrhythmia Detection [abstract]
Abstract: Detection of Cardiac Arrhythmia (CA) is performed through clinical analysis of a patient's electrocardiogram (ECG) to prevent cardiovascular diseases. Machine learning algorithms, particularly those for automatic classification, have been presented as promising tools to aid CA diagnosis. However, these algorithms suffer from two traditional classification problems: (1) an excessive number of numerical attributes generated from the decomposition of an ECG; and (2) the number of patients diagnosed with CAs is much lower than the number classified as “normal”, leading to highly unbalanced datasets. In this paper, we combine, in a coordinated way, several data mining techniques, such as clustering, feature selection, oversampling strategies and automatic classification algorithms, to create more efficient classification models for identifying the disease. In our evaluations, using a traditional dataset provided by the UCI repository, we significantly improved the effectiveness of the Random Forest classification algorithm, achieving an accuracy of over 88%, higher than the best value previously reported in the literature.
Christian Reis, Alan Cardoso, Thiago Silveira, Diego Dias, Elisa Albergaria, Renato Ferreira and Leonardo Rocha
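
To illustrate the class-imbalance problem the abstract raises, the sketch below balances a dataset by random oversampling of the minority class. This is a minimal stand-in for the oversampling strategies the authors combine (e.g. SMOTE-style methods), not their actual pipeline; the function name and interface are hypothetical.

```python
import random

def random_oversample(X, y, minority_label, seed=0):
    """Duplicate minority-class samples (with replacement) until the
    two classes are the same size -- the simplest oversampling strategy."""
    rng = random.Random(seed)
    minority = [(x, l) for x, l in zip(X, y) if l == minority_label]
    majority = [(x, l) for x, l in zip(X, y) if l != minority_label]
    # Draw enough extra minority samples to match the majority count.
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    combined = majority + minority + extra
    rng.shuffle(combined)
    Xb = [x for x, _ in combined]
    yb = [l for _, l in combined]
    return Xb, yb
```

The balanced dataset can then be fed to any classifier (the paper uses Random Forest); more elaborate strategies such as SMOTE synthesize new minority points instead of duplicating existing ones.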
238 CT medical imaging reconstruction using direct algebraic methods with few projections [abstract]
Abstract: In the field of CT medical image reconstruction, there are two approaches to reconstructing the images: analytical methods and algebraic methods, the latter of which can be divided into iterative and direct. Although analytical methods are the most widely used, owing to their low computational cost and good reconstruction quality, they do not allow reducing the number of views taken and thus the radiation dose absorbed by the patient. In this paper, we present two direct algebraic approaches for CT reconstruction: performing a sparse QR (SPQR) factorization of the system matrix, or carrying out a singular value decomposition (SVD). We compare the results obtained in terms of image quality and computational time, and analyze the memory requirements for each case.
Mónica Chillarón, Vicente Vidal, Gumersindo Verdú and Josep Arnal
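
At its core, SVD-based direct reconstruction solves the (typically ill-conditioned) linear system relating the image to its projections via a pseudoinverse built from the SVD. A minimal sketch with NumPy, illustrative only — the truncation tolerance and function name are assumptions, not the authors' implementation:

```python
import numpy as np

def svd_reconstruct(A, b, tol=1e-10):
    """Solve A x = b via the SVD pseudoinverse, truncating tiny
    singular values to stabilize an ill-conditioned system."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Invert only singular values above a relative threshold.
    s_inv = np.where(s > tol * s.max(), 1.0 / s, 0.0)
    return Vt.T @ (s_inv * (U.T @ b))
```

For real CT system matrices, which are large and sparse, the paper's alternative — a sparse QR (SPQR) factorization — avoids forming the dense factors the SVD requires.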
355 On blood viscosity and its correlation with biological parameters [abstract]
Abstract: In recent years, interest in blood viscosity has increased significantly in different biomedical areas. Blood viscosity, a measure of the resistance of blood flow related to its thickness and stickiness, is one of the main biophysical properties of blood. Many factors affect blood viscosity, in both physiological and pathological conditions. The aim of this study is to estimate blood viscosity by using a regression equation for viscosity based on hematocrit and total plasma proteins, which makes it possible to observe the main factors that can influence blood viscosity. The main contribution concerns the correlation between viscosity values and other important biological parameters, such as cholesterol. This correlation is supported by statistical tests and suggests that viscosity could be a main risk factor in cardiovascular diseases. Moreover, it is the only biological measure correlated with the other cardiovascular risk factors. The results obtained are consistent with values obtained through standard viscosity measurement with a viscometer.
Patrizia Vizza, Giuseppe Tradigo, Marianna Parrilla, Pietro Hiram Guzzi, Agostino Gnasso and Pierangelo Veltri
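
The regression-based viscosity estimate described above can be sketched as an ordinary least-squares fit on the two predictors named in the abstract, hematocrit and total plasma proteins. The linear form, function name, and any coefficient values are illustrative assumptions, not the authors' published equation:

```python
import numpy as np

def fit_viscosity_model(hct, tpp, viscosity):
    """Fit viscosity ~ b0 + b1*hct + b2*tpp by ordinary least squares.
    hct: hematocrit values, tpp: total plasma protein values."""
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones_like(hct), hct, tpp])
    beta, *_ = np.linalg.lstsq(X, viscosity, rcond=None)
    return beta
```

Once fitted, the coefficients can be used to estimate viscosity for patients with known hematocrit and plasma protein measurements, and the estimates correlated against other risk factors such as cholesterol.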

Biomedical and Bioinformatics Challenges for Computer Science (BBC) Session 2

Time and Date: 15:45 - 17:25 on 11th June 2018

Room: M7

Chair: Rodrigo Weber dos Santos

383 Development of Octree-Based High-Quality Mesh Generation Method for Biomedical Simulation [abstract]
Abstract: This paper proposes a robust, high-quality finite element mesh generation method that can model problems with complex geometries and multiple materials and is suitable for use in biomedical simulation. A previous octree-based method can robustly generate a high-quality mesh for complex geometries and multiple materials, but only by allowing geometric approximation. In this study, a robust mesh optimization method is developed that combines smoothing and topology optimization in order to correct geometries while guaranteeing element quality. The validity of the developed mesh optimization method is verified through performance measurements on a sphere mesh and an application to an HTO tibia mesh.
Keisuke Katsushima, Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori and Lalith Maddegedara
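
The smoothing half of such a mesh optimization can be illustrated by simple Laplacian smoothing, which relaxes each free node toward the centroid of its neighbors while boundary nodes stay fixed. This is a generic textbook sketch, not the authors' method, which additionally combines topology optimization and quality guarantees:

```python
def laplacian_smooth(points, neighbors, fixed, iters=10, alpha=0.5):
    """Laplacian smoothing in 2D: each free node moves a fraction
    alpha toward the centroid of its neighbor nodes per iteration.
    points: list of (x, y); neighbors: adjacency lists; fixed: node ids."""
    pts = [list(p) for p in points]
    for _ in range(iters):
        new = [list(p) for p in pts]
        for i, nbrs in enumerate(neighbors):
            if i in fixed or not nbrs:
                continue  # keep boundary/fixed nodes in place
            cx = sum(pts[j][0] for j in nbrs) / len(nbrs)
            cy = sum(pts[j][1] for j in nbrs) / len(nbrs)
            new[i][0] = (1 - alpha) * pts[i][0] + alpha * cx
            new[i][1] = (1 - alpha) * pts[i][1] + alpha * cy
        pts = new
    return pts
```

Plain Laplacian smoothing can invert elements near concave boundaries, which is one reason methods like the paper's pair it with topology changes and element-quality checks.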
258 1,000x Faster than PLINK: Genome-Wide Epistasis Detection with Logistic Regression Using Combined FPGA and GPU Accelerators [abstract]
Abstract: Logistic regression, as implemented in PLINK, is a powerful and commonly used framework for assessing gene-gene (GxG) interactions. However, fitting regression models for each pair of markers in a genome-wide dataset is a computationally intensive task. Performing billions of tests with PLINK takes days if not weeks, which is why pre-filtering techniques and fast epistasis screenings are applied to reduce the computational burden. Here, we demonstrate that employing a combination of a Xilinx UltraScale KU115 FPGA and an Nvidia Tesla P100 GPU leads to runtimes of only minutes for logistic regression GxG tests at a genome-wide level. In particular, a dataset with 53,000 samples genotyped at 130,000 SNPs was analyzed in 8 minutes, resulting in a speedup of more than 1,000× when compared to PLINK v1.9 using 32 threads on a server-grade computing platform. Furthermore, on-the-fly calculation of test statistics, p-values and LD-scores in double precision makes commonly used pre-filtering strategies obsolete.
Lars Wienbrandt, Jan Christian Kässens, Matthias Hübenthal and David Ellinghaus
280 Combining Molecular Dynamics Simulations and Informatics to Model Nucleosomes and Chromatin [abstract]
Abstract: Nucleosomes are the fundamental building blocks of chromatin, the biomaterial that houses the genome in all higher organisms. A nucleosome consists of 145-147 base pairs of double-stranded DNA wrapped approximately 1.7 times around eight histones. There are almost 100 atomic-resolution structures of the nucleosome available from the Protein Data Bank. Collectively they explore histone mutations, species variations, drug binding and ionic effects, but only three sequences of DNA. Given the four-letter code (A, C, G, T) for DNA, there are on the order of 4^147 ~ 10^88 possible DNA sequences that can form a nucleosome, so exhaustive studies are not possible. Fortunately, next-generation sequencing enables researchers to identify a single nucleosome of interest, and today's supercomputing resources enable simulation ensembles representing different realizations of the nucleosome to be accumulated overnight as a means of investigating its structure and dynamics. Here we present a workflow that integrates molecular simulation and genome browsing to manage such efforts. The workflow is used to study nucleosome positioning in atomic detail and its relation to chromatin folding in coarse-grained detail. The exchange of data between physical and informatics models is bidirectional, which allows cross-validation of simulation and experiment and the discovery of structure-function relationships. All simulation and analysis data from the studies are available on the TMB-iBIOMES server: http://dna.engr.latech.edu/ibiomes.html.
Ran Sun, Zilong Li and Thomas Bishop
169 A Stochastic Model to Simulate the Spread of Leprosy in Juiz de Fora [abstract]
Abstract: Leprosy, also known as Hansen's disease, is an infectious disease whose main etiological agent is Mycobacterium leprae. The disease mainly affects the skin and peripheral nerves and can cause physical disabilities. For this reason, it represents a global public health concern, especially in Brazil, where more than twenty-five thousand new cases were reported in 2016. This work aims to simulate the spread of leprosy in a Brazilian city, Juiz de Fora, using the SIR model and considering some of its pathological aspects. SIR models divide the studied population into compartments with respect to the disease, in which the S, I and R compartments refer to the groups of susceptible, infected and recovered individuals, respectively. The model was solved computationally by a stochastic approach using the Gillespie algorithm. The results obtained by the model were then validated against the public health records database of Juiz de Fora.
Vinícius Clemente Varella, Aline Mota Freitas Matos, Henrique Couto Teixeira, Angélica Da Conceição Oliveira Coelho, Rodrigo Santos and Marcelo Lobosco
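
The stochastic approach described above can be sketched with Gillespie's direct method: at each step, draw an exponential waiting time from the total event rate, then pick an infection or recovery event in proportion to its rate. This is a generic SIR implementation; the rate expressions and all parameter values are illustrative, not the calibrated model of the paper:

```python
import math
import random

def gillespie_sir(S, I, R, beta, gamma, t_max, seed=0):
    """One stochastic realization of the SIR model via Gillespie's
    direct method. Event rates: infection beta*S*I/N, recovery gamma*I."""
    rng = random.Random(seed)
    N = S + I + R
    t = 0.0
    trajectory = [(t, S, I, R)]
    while t < t_max and I > 0:
        a_inf = beta * S * I / N   # rate of S -> I events
        a_rec = gamma * I          # rate of I -> R events
        a_tot = a_inf + a_rec
        t += rng.expovariate(a_tot)  # exponential waiting time to next event
        if rng.random() * a_tot < a_inf:
            S, I = S - 1, I + 1      # infection event
        else:
            I, R = I - 1, R + 1      # recovery event
        trajectory.append((t, S, I, R))
    return trajectory
```

Averaging many such realizations (different seeds) approximates the mean epidemic curve, while the spread across realizations captures the stochastic variability that a deterministic ODE version of SIR cannot.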