Tiziana Di Matteo: Network Filtering for Big Data
Tiziana Di Matteo - King’s College London, UK
Session chair: Peter Sloot

Abstract: In this lecture I will present network-theoretic tools for filtering information in large-scale datasets, and I will show that they are powerful tools for studying complex datasets. In particular, I will introduce correlation-based information filtering networks and the Planar Maximally Filtered Graph (PMFG), and I will show that applications to financial datasets can meaningfully identify industrial activities and structural market changes. It has been shown that, by making use of the 3-clique structure of the PMFG, a clustering can be extracted that allows dimensionality reduction while keeping both local information and global hierarchy, in a deterministic manner and without the use of any prior information. However, the algorithm proposed so far to construct the PMFG is numerically costly, with O(N^3) computational complexity, and cannot be applied to large-scale data. There is therefore scope to search for novel algorithms that can provide such a reduction to planar filtered graphs in a numerically efficient way. I will introduce a new algorithm, the TMFG (Triangulated Maximally Filtered Graph), that efficiently extracts a planar subgraph which optimizes an objective function. The method is scalable to very large datasets and can take advantage of parallel and GPU computing. The method is adaptable, allowing online updating and learning with continuous insertion and deletion of new data as well as changes in the strength of the similarity measure. Finally, I will show that filtered graphs are also valuable tools for risk management and portfolio optimization, and that they allow the construction of sparse probabilistic models of financial systems that can be used for forecasting, stress testing and risk allocation.
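To make the idea of extracting a planar subgraph that optimizes an objective function more concrete, the following is a minimal Python sketch of a greedy triangulated filter in the spirit of the TMFG. It is an illustration under stated assumptions, not the lecture's implementation: the function name `tmfg_sketch`, the seeding rule (the four nodes with the largest total similarity) and the objective (total edge weight) are choices made for this example. It also recomputes all gains at every step, so it does not reproduce the efficiency, parallelism or online updating described in the abstract.

```python
import numpy as np

def tmfg_sketch(W):
    """Greedy planar filtering of a symmetric similarity matrix W (N x N, N >= 4).

    Hypothetical helper for illustration only; returns a sorted list of edges (i, j), i < j.
    """
    S = np.array(W, dtype=float)
    n = S.shape[0]
    if n < 4:
        raise ValueError("need at least 4 nodes")
    np.fill_diagonal(S, 0.0)

    # Assumed seeding rule: the four nodes with the largest total similarity
    # form the initial tetrahedron (4 nodes, 6 edges, 4 triangular faces).
    a, b, c, d = (int(i) for i in np.argsort(S.sum(axis=1))[-4:])
    edges = {frozenset(p) for p in [(a, b), (a, c), (a, d), (b, c), (b, d), (c, d)]}
    faces = [(a, b, c), (a, b, d), (a, c, d), (b, c, d)]
    remaining = set(range(n)) - {a, b, c, d}

    while remaining:
        # Naive search for the (vertex, face) pair with the largest weight gain.
        best = None
        for v in remaining:
            for fi, (x, y, z) in enumerate(faces):
                gain = S[v, x] + S[v, y] + S[v, z]
                if best is None or gain > best[0]:
                    best = (gain, v, fi)
        _, v, fi = best

        # Insert v inside the chosen face: the face is replaced by three new
        # faces and three edges are added, which preserves planarity.
        x, y, z = faces.pop(fi)
        edges |= {frozenset((v, x)), frozenset((v, y)), frozenset((v, z))}
        faces += [(v, x, y), (v, x, z), (v, y, z)]
        remaining.remove(v)

    return sorted(tuple(sorted(e)) for e in edges)
```

Because every insertion adds one vertex and three edges to a triangulation, the result always has 3N - 6 edges, the maximum for a planar graph on N nodes. A typical input for this sketch would be the absolute value of a correlation matrix, for example `tmfg_sketch(np.abs(np.corrcoef(returns, rowvar=False)))`, where `returns` is a hypothetical array of observations by assets.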