Applications of Matrix Computational Methods in the Analysis of Modern Data (MATRIX) Session 2

Time and Date: 14:30 - 16:10 on 6th June 2016

Room: Rousseau Center

Chair: Kourosh Modarresi

467	Algorithmic Approach for Learning a Comprehensive View of Online Users Abstract: Online users may use many different channels, devices and venues in the course of any online experience. To personalize services such as web design, ads, web content and shopping for every user, we need to be able to recognize users regardless of the devices, channels and venues they are using. This, in turn, requires building a comprehensive view of each user that includes all of their behavioral characteristics, which are spread across these different venues. That is not possible without all of the user's behavior-related data, which requires the capacity to connect the user across devices and channels so that all of their behavior falls under a single view. This work is a major attempt at doing this using only users' behavioral data while protecting the user’s privacy.	Kourosh Modarresi
473	Recommendation System Based on Complete Personalization Abstract: Current recommender systems are very inefficient. Many metrics are used to measure the effectiveness of recommender systems, often including “conversion rate” and “click-through rate”. Recently, these rates have been in the low single digits (less than 10%). In other words, more than 90% of the time, the model that the targeting system is based on produces noise. We believe that the main cause of such unsatisfactory outcomes is the modeling problem, much of which is exemplified by treating users and items as members of clusters (segments). In this work, we consider full personalization of recommendation systems, aiming to personalize users and items simultaneously.	Kourosh Modarresi
520	Learning Vector-Space Representations of Items for Recommendations using Word Embedding Models Abstract: We present a method of generating item recommendations by learning item feature vector embeddings. Our work is analogous to approaches like Word2Vec or GloVe, which are used to generate good vector representations of words in a natural language corpus. We treat the items that a user interacted with as analogous to words, and the sequence of items interacted with in a session as a sentence. Our embedding generates semantically related clusters, and the resulting item vectors can be used to compute item similarity, which in turn can drive product recommendations. Our method also allows the feature vectors to be used in other machine learning systems. We validate our method on the MovieLens dataset.	Balaji Krishnamurthy, Nikaash Puri
530	Improved Mahout Decision Tree Builders Abstract: The default decision tree builder in Mahout 0.9 has severe implementation problems that cause it to build small, weak decision trees, which limits its usefulness in production situations when the features are strictly numerical. In this talk I will describe a simple, more powerful decision tree builder that systematically produces regression models with much better AUCs without sacrificing performance. The new builder also creates models of relatively compact size (about 30-50 KB in the tested data sets), compared to the large (500 KB to 2 MB) models generated from a fixed version of the original decision tree builder. I will describe the problem with the Mahout decision tree builder, the simple replacement, and how both work. I will then compare model size, build times, and AUC performance on several historic data sets from Adobe Target, drawn from customers in different industries, to show that the improvement is very general.	John Kucera
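
As an illustration of the idea in talk 520 (items treated as words, sessions as sentences, item vectors compared for similarity), here is a minimal sketch. It is not the authors' method: instead of training a Word2Vec- or GloVe-style model, it swaps in plain co-occurrence count vectors compared by cosine similarity, which only hints at what learned embeddings capture. The function names, the `window` parameter, and the toy sessions are all hypothetical and are not taken from MovieLens.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sessions, window=2):
    """Build one sparse count vector per item.

    Each session is a list of item IDs (the "sentence"). An item's vector
    counts how often every other item appears within `window` positions
    of it in the same session.
    """
    vectors = defaultdict(Counter)
    for session in sessions:
        for i, item in enumerate(session):
            start = max(0, i - window)
            stop = min(len(session), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[item][session[j]] += 1
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(item, vectors, top_n=3):
    """Rank the other items by cosine similarity to `item`."""
    scores = [
        (other, cosine(vectors[item], vectors[other]))
        for other in vectors
        if other != item
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical toy sessions, invented for illustration only.
sessions = [
    ["alien", "aliens", "predator"],
    ["alien", "predator", "terminator"],
    ["toy_story", "shrek", "up"],
    ["shrek", "toy_story", "up"],
]
vectors = cooccurrence_vectors(sessions)
print(most_similar("alien", vectors))
```

Items that share session context score higher than items that never appear in similar sessions. A learned embedding would replace `cooccurrence_vectors` with a trained model while keeping the same similarity lookup.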