Applications of Matrix Methods in Artificial Intelligence and Machine Learning (AMAIML) Session 1

Time and Date: 14:40 - 16:20 on 12th June 2019

Room: 0.3

Chair: Kourosh Modarresi

410 Biclustering via Mixtures of Regression Models [abstract]
Abstract: Biclustering of observations and variables is of interest in many scientific disciplines; for a single data matrix it is commonly handled through the singular value decomposition (SVD). Here we deal with two sets of variables: a response set and a predictor set. We model the joint relationship via regression models and then apply the SVD to the coefficient matrix. Sparseness is introduced via the Group Lasso; the approach discussed here is quite general and is illustrated with an example from finance.
Raja Velu, Zhaoque Zhou and Chyng Wen Tee
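A minimal sketch of the pipeline the abstract describes: fit a multivariate regression linking the predictor set to the response set, then take the SVD of the coefficient matrix so that the leading singular vectors tie a group of predictors to a group of responses. Here scikit-learn's MultiTaskLasso stands in for the Group Lasso penalty (it imposes row-wise group sparsity on the coefficients); the exact estimator and the 0.2 loading threshold are assumptions, not the paper's choices.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 15))               # predictor set
B_true = np.zeros((15, 8))
B_true[:5, :3] = rng.normal(size=(5, 3))     # one planted bicluster
Y = X @ B_true + 0.1 * rng.normal(size=(200, 8))  # response set

# Group-sparse estimate of the coefficient matrix (stand-in for Group Lasso).
model = MultiTaskLasso(alpha=0.05).fit(X, Y)
B_hat = model.coef_.T                        # shape: (n_predictors, n_responses)

# SVD of the coefficient matrix; a leading singular-vector pair links a group
# of predictors to a group of responses, i.e. a bicluster.
U, s, Vt = np.linalg.svd(B_hat, full_matrices=False)
pred_group = np.flatnonzero(np.abs(U[:, 0]) > 0.2)
resp_group = np.flatnonzero(np.abs(Vt[0]) > 0.2)
print("bicluster predictors:", pred_group, "responses:", resp_group)
```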
524 An Evaluation Metric for Content Providing Models, Recommender Systems and Online Campaigns [abstract]
Abstract: Creating an optimal digital experience for users requires providing them with desirable content and delivering that content at the right time, as the user's experience and interaction unfold. Multiple metrics and variables may determine the success of a "user digital experience", including accuracy, computational cost and others. Many of these variables may contradict one another (as explained later in this submission), and their importance may depend on the specific application the digital experience optimization is pursuing. To deal with this intertwined, possibly contradictory and confusing set of metrics, this work introduces a generalized index entailing all metrics and variables that may be significant in defining a successful "digital experience design model". Besides its generality, as it may include any metric that marketers or scientists consider important, this new index allows them to assign different weights to the corresponding metrics, since the significance of a specific metric may depend on the specific application. The index is flexible and can be adjusted as the objective of "user digital experience optimization" changes. Here, we use "recommendation" as equivalent to "content providing" throughout the submission. One well-known use of recommender systems is in providing content such as products, ads, goods, network connections, services, and so on. Recommender systems have other broad applications, and, in general, many problems in AI and machine learning can easily be converted to an equivalent "recommender system" problem. This feature increases the significance of recommender systems as an application of AI and machine learning. The introduction of the internet has brought a new dimension to the ways businesses sell their products and interact with their customers. The ubiquity of the web, and consequently of web applications, is soaring, and as a result much of commerce and customer experience takes place online. Many companies offer their products exclusively or predominantly online, and at the same time many present and potential customers spend much of their time online; businesses therefore try to use efficient models to interact with online users and engage them in various desired initiatives. This interaction is crucial for businesses that hope to see desired outcomes such as purchases, conversions of any type, page views, longer time spent on the business's pages, and so on. Recommender systems are one of the main tools for achieving these outcomes. The basic idea of a recommender system is to estimate the probability of a desired action by a specific user; knowing this probability, one can decide which initiatives to take to maximize the desirable outcomes of the online user's actions. These initiatives could include promotional initiatives (sending coupons, cash, …) or communication with the customer through all available media such as mail, email, online ads, etc. The main goal of a recommendation or targeting model is to increase outcomes such as "conversion rate", "length of stay on sites", "number of views" and so on. There are many other direct or indirect metrics influenced by recommender systems. Examples include an increase in sales of other products that were not the direct goal of the recommendations, an increased chance of the customer returning to the site, an increase in brand awareness, and a better chance of retargeting the same user at a later time. The Model, overview: first, we demonstrate the problem we want to address, using many models, data sets and multiple metrics. Then, we propose our unified, generalized metric to address the problems observed when using multiple separate metrics. We thus use several models and multiple data sets to evaluate our approach: first, we evaluate the performance of the different models on all data sets using state-of-the-art performance metrics; then we observe the difficulties of any evaluation based on these metrics. Because different performance metrics often lead to contradictory conclusions, it is hard to decide which model performs best (and thus which model to use for the targeting campaign in mind). Therefore, we create a performance index that produces a single, unifying metric for evaluating a targeting model.
Kourosh Modarresi and Jamie Diner
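A hedged sketch of the kind of generalized index the abstract proposes: each candidate model is scored on several metrics, the metrics are normalized so they are comparable, cost-type metrics are flipped so larger is always better, and user-chosen weights produce one unified score. The helper name generalized_index, the min-max normalization, the weighted-sum combination and the example metrics are illustrative assumptions, not the authors' exact formula.

```python
import numpy as np

def generalized_index(metrics, weights, higher_is_better):
    """metrics: dict model -> raw metric values (same metric order everywhere)."""
    names = list(metrics)
    M = np.array([metrics[m] for m in names], dtype=float)
    # Min-max normalize each metric column to [0, 1] across models.
    lo, hi = M.min(axis=0), M.max(axis=0)
    Z = (M - lo) / np.where(hi > lo, hi - lo, 1.0)
    # Flip cost-type metrics (e.g. computational cost) so larger is better.
    mask = ~np.asarray(higher_is_better)
    Z[:, mask] = 1.0 - Z[:, mask]
    w = np.asarray(weights, dtype=float)
    scores = Z @ (w / w.sum())           # single unified score per model
    return dict(zip(names, scores))

# Example: accuracy (maximize) vs. latency in ms (minimize), weighted 2:1.
scores = generalized_index(
    {"model_a": [0.91, 120.0], "model_b": [0.88, 20.0]},
    weights=[2.0, 1.0],
    higher_is_better=[True, False],
)
print(scores)
```

Changing the weights reorders the models, which is exactly the application-dependent flexibility the abstract argues for.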
162 Tolerance Near Sets and tNM Application in City Images [abstract]
Abstract: Tolerance Near Set theory is a formal basis for the observation, comparison and classification of objects, and the tolerance Nearness Measure (tNM) is a normalized value that indicates how similar two images are. This paper presents an application of an algorithm that compares images based on the tNM value, so that the similarities between images are verified with respect to characteristics such as gray levels and texture attributes extracted using the Gray Level Co-occurrence Matrix (GLCM). Images of the centers of selected cities around the world are compared using tNM and classified.
Deivid Silva, José Saito and Daniel Caio De Lima
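A simplified, numpy-only sketch in the spirit of the abstract: each image is split into blocks, each block is reduced to a feature (here just its mean gray level, standing in for the paper's GLCM texture attributes), and blocks from both images are grouped into tolerance classes, i.e. features within eps of each other; classes drawing evenly from both images push the score toward 1. This is an illustrative simplification of tNM, not the authors' implementation.

```python
import numpy as np

def block_features(img, k=16):
    # Split the image into k x k blocks and describe each by its mean gray level.
    h, w = (img.shape[0] // k) * k, (img.shape[1] // k) * k
    blocks = img[:h, :w].reshape(h // k, k, w // k, k).swapaxes(1, 2)
    return blocks.reshape(-1, k * k).mean(axis=1)

def tnm(img_a, img_b, eps=8.0):
    fa, fb = block_features(img_a), block_features(img_b)
    feats = np.concatenate([fa, fb])
    labels = np.concatenate([np.zeros(len(fa)), np.ones(len(fb))])
    score, weight = 0.0, 0.0
    for f in feats:                          # tolerance class around each feature
        in_class = np.abs(feats - f) <= eps
        na = np.sum(in_class & (labels == 0))
        nb = np.sum(in_class & (labels == 1))
        size = na + nb
        score += size * min(na, nb) / max(na, nb)  # balanced classes score high
        weight += size
    return score / weight                    # 1.0 means maximally similar

rng = np.random.default_rng(1)
a = rng.integers(0, 256, size=(128, 128)).astype(float)
print("self-similarity:", round(tnm(a, a), 3))   # 1.0 by construction
```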
363 Meta-Graph based Attention-aware Recommendation over Heterogeneous Information Networks [abstract]
Abstract: Heterogeneous information networks (HINs), which involve diverse types of data, have been widely used in recommender systems. However, most existing HIN-based recommendation methods treat different latent features equally and model various feature interactions in the same way, so that the rich semantic information cannot be fully utilized. To comprehensively exploit the heterogeneous information for recommendation, in this paper we propose Meta-Graph based Attention-aware Recommendation (MGAR) over HINs. First, MGAR utilizes rich meta-graph based latent features to guide the heterogeneous information fusion for recommendation. Specifically, to discriminate the importance of latent features generated by different meta-graphs, we propose an attention-based feature enhancement model. The model enables useful and useless features to contribute differently to the prediction, thus improving the performance of the recommendation. Furthermore, to holistically exploit the different interrelations of features, we propose a hierarchical feature interaction method consisting of three layers of second-order interactions that mine the underlying correlations between users and items. Extensive experiments show that MGAR outperforms state-of-the-art recommendation methods in terms of RMSE on Yelp and Amazon Electronics.
Feifei Dai, Xiaoyan Gu, Bo Li, Jinchao Zhang, Mingda Qian and Weiping Wang
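An illustrative PyTorch sketch of the attention-based feature enhancement step: each meta-graph yields a latent feature vector, and a small scoring network produces attention weights that decide how much each meta-graph's features contribute to the fused representation. The class name, layer sizes and two-layer scorer are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class MetaGraphAttention(nn.Module):
    def __init__(self, dim, hidden=32):
        super().__init__()
        # Scores one latent feature vector per meta-graph.
        self.score = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, h):                     # h: (batch, n_metagraphs, dim)
        a = torch.softmax(self.score(h).squeeze(-1), dim=1)  # (batch, n_mg)
        fused = (a.unsqueeze(-1) * h).sum(dim=1)             # (batch, dim)
        return fused, a                       # fused feature + attention weights

h = torch.randn(4, 6, 64)    # 6 meta-graph latent features per user-item pair
fused, attn = MetaGraphAttention(64)(h)
print(fused.shape, attn.shape)   # torch.Size([4, 64]) torch.Size([4, 6])
```

The attention weights make the contribution of each meta-graph inspectable, which is what lets useful and useless features contribute differently to the prediction.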

Applications of Matrix Methods in Artificial Intelligence and Machine Learning (AMAIML) Session 2

Time and Date: 16:50 - 18:30 on 12th June 2019

Room: 0.3

Chair: Kourosh Modarresi

521 Determining Adaptive Loss Functions and Algorithms for Predictive Models [abstract]
Abstract: We consider the problem of training models to predict sequential processes. We use two econometric datasets to demonstrate how different losses and learning algorithms alter the predictive power for a variety of state-of-the-art models. We investigate how the choice of loss function impacts model training and find that no single algorithm or loss function results in optimal predictive performance. For small datasets, neural models prove especially sensitive to training parameters, including choice of loss function and pre-processing steps. We find that a recursively-applied artificial neural network trained under L1 loss performs best under many different metrics on a national retail sales dataset, whereas a differenced autoregressive model trained under L1 loss performs best under a variety of metrics on an e-commerce dataset. We note that different training metrics and processing steps result in appreciably different performance across all model classes and argue for an adaptive approach to model fitting.
Kourosh Modarresi and Michael Burkhart
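A small sketch of the kind of experiment the abstract describes: the same lagged (autoregressive) design fit under L2 loss versus L1 loss, then compared on held-out error under more than one metric. The synthetic heavy-tailed series and the lag order are placeholders for the paper's econometric data; QuantileRegressor at the median serves as an L1-loss linear fit (requires scikit-learn >= 1.0).

```python
import numpy as np
from sklearn.linear_model import LinearRegression, QuantileRegressor

rng = np.random.default_rng(0)
y = np.cumsum(rng.standard_t(df=3, size=400))   # heavy-tailed random walk

def lagged(y, p=5):
    # Build a lagged design matrix: predict y[t] from the previous p values.
    X = np.column_stack([y[i:len(y) - p + i] for i in range(p)])
    return X, y[p:]

X, t = lagged(y)
Xtr, Xte, ttr, tte = X[:300], X[300:], t[:300], t[300:]

l2 = LinearRegression().fit(Xtr, ttr)                          # squared loss
l1 = QuantileRegressor(quantile=0.5, alpha=0.0).fit(Xtr, ttr)  # L1 (median) loss

for name, m in [("L2", l2), ("L1", l1)]:
    e = m.predict(Xte) - tte
    print(name, "MAE:", np.abs(e).mean().round(3),
          "RMSE:", np.sqrt((e ** 2).mean()).round(3))
```

On heavy-tailed data the L1 fit typically wins on MAE while the L2 fit wins on RMSE, which is the metric-dependence the abstract highlights.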
522 Adaptive Objective Functions and Distance Metrics for Recommendation Systems [abstract]
Abstract: We describe, develop, and implement different models for the standard matrix completion problem from the field of recommendation systems. We benchmark these models against the publicly available Netflix Prize challenge dataset, consisting of ratings on a 1-5 scale for (user, movie) pairs. We used the 99 million training examples to develop individual models, built ensembles on a separate validation set of 1 million examples, and tested both individual models and ensembles on a held-out set of over 400,000 examples. While the original competition concentrated only on RMSE, we experiment with different objective functions for model training, ensemble construction, and model/ensemble testing. Our best-performing estimators were (1) a linear ensemble of base models trained using linear regression (see ensemble e1, RMSE: 0.912) and (2) a neural network that aggregated predictions from individual models (see ensemble e4, RMSE: 0.912). Many of the constituent models in our ensembles had yet to be developed at the time the Netflix competition concluded in 2009. To our knowledge, not much research has been done to establish best practices for combining these models into ensembles. We consider this problem, with a particular emphasis on the role that the choice of objective function plays in ensemble construction. For a full list of learned models and ensembles, see Tables 1 and 2.
Kourosh Modarresi and Michael Burkhart
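A hedged sketch of the linear-ensemble idea (analogous to ensemble e1 in the abstract): base-model predictions on a validation set become the features of a linear regression that learns the blending weights, and the blend is then scored on a held-out test set. The three toy base predictors are stand-ins for the paper's actual recommenders.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
true = rng.uniform(1, 5, size=2000)            # held-out true ratings
# Columns = predictions of three hypothetical base models on the same pairs.
P = np.column_stack([
    true + rng.normal(0, 0.9, 2000),
    true + rng.normal(0, 1.1, 2000),
    0.7 * true + 1.0 + rng.normal(0, 0.8, 2000),   # biased but informative
])
Pval, Ptest = P[:1000], P[1000:]
yval, ytest = true[:1000], true[1000:]

blend = LinearRegression().fit(Pval, yval)     # learn blending weights
rmse = lambda p, y: np.sqrt(np.mean((np.clip(p, 1, 5) - y) ** 2))
for i in range(3):
    print(f"base {i} RMSE:", round(rmse(Ptest[:, i], ytest), 3))
print("ensemble RMSE:", round(rmse(blend.predict(Ptest), ytest), 3))
```

Fitting the blend on a separate validation split, as the abstract describes, keeps the ensemble weights from overfitting the base models' training data.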
60 An Early Warning Method for Basic Commodities Price Spike Based on Artificial Neural Networks Prediction [abstract]
Abstract: Price spikes in basic commodities are a serious problem for food security and can have wide-ranging effects, even social unrest. Their occurrence should be anticipated early enough, because the government needs sufficient time to form anticipatory policies and proactive actions to overcome the problem. According to Indonesian food law, the government should develop an integrated information system on food security that includes an early warning function. This study proposes an early warning method based on a Multi-Layer Perceptron predictive model with Multiple Input Multiple Output (MIMO). The warning status is determined from the coefficient of variation of the obtained price predictions relative to the government's reference price. A great deal of attention was paid to tuning the model parameters to obtain the most accurate prediction. Model selection was conducted by time-series k-fold cross-validation with the mean squared error criterion. The predictive model gives good performance, with the average normalized root mean squared error over the sample commodities ranging from 9.909% to 18.046%. Importantly, the method is promising for modelling basic commodity prices and may help the government predict price spikes and determine further anticipatory policies.
Amelec Viloria
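A sketch of the warning-status step the abstract describes: predicted prices are compared with the government's reference price, and the coefficient of variation of the predictions around that reference decides the alert level. The thresholds, status labels, commodity and price values below are illustrative assumptions; the paper defines its own cut-offs.

```python
import numpy as np

def warning_status(predicted, reference, thresholds=(0.05, 0.10)):
    """predicted: price forecasts over the horizon; reference: official price."""
    deviation = predicted - reference
    # Coefficient of variation of the forecasts around the reference price.
    cv = np.sqrt(np.mean(deviation ** 2)) / reference
    if cv < thresholds[0]:
        return "SAFE"
    return "WATCH" if cv < thresholds[1] else "SPIKE ALERT"

pred = np.array([10500.0, 10900.0, 11600.0, 12400.0])  # e.g. rice prices (IDR/kg)
print(warning_status(pred, reference=10000.0))          # -> "SPIKE ALERT"
```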
22 Predicting Heart Attack through Explainable Artificial Intelligence [abstract]
Abstract: This paper reports a novel classification technique implementing a genetic algorithm (GA) based trained ANFIS to diagnose heart disease. The performance of the proposed system was investigated with evaluation functions including sensitivity, specificity, precision, accuracy, and the Root Mean Squared Error (RMSE) between the desired and predicted outputs. The suggested model was shown to be reliable, achieving high values of the evaluation functions. In addition, a novel technique was proposed that automatically provides explainability graphs based on the predicted results for the patients. The reliability and explainability of the system were the main aims of this paper and were demonstrated against several criteria. Additionally, the importance of the different symptoms and features in the diagnosis of heart disease was investigated by defining an importance evaluation function, and it was shown that some features play a key role in the prediction of heart disease.
Mehrdad Aghamohammadi, Manvi Madan, Jung Ki Hong and Ian Watson
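The evaluation functions named in the abstract (sensitivity, specificity, precision, accuracy and RMSE) can all be computed from a binary confusion matrix plus the predicted probabilities; the sketch below is generic and independent of the GA-trained ANFIS classifier itself, and the toy labels are placeholders.

```python
import numpy as np

def evaluate(y_true, y_pred_prob, threshold=0.5):
    y_pred = (y_pred_prob >= threshold).astype(int)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "sensitivity": tp / (tp + fn),        # true positive rate
        "specificity": tn / (tn + fp),        # true negative rate
        "precision":   tp / (tp + fp),
        "accuracy":    (tp + tn) / len(y_true),
        "rmse":        float(np.sqrt(np.mean((y_true - y_pred_prob) ** 2))),
    }

y = np.array([1, 0, 1, 1, 0, 0, 1, 0])                 # true diagnoses
p = np.array([0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3])  # predicted probabilities
print(evaluate(y, p))
```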