Workshop on Computational Finance and Business Intelligence (CFBI) Session 1

Time and Date: 16:20 - 18:00 on 11th June 2014

Room: Tully II

Chair: ?

100 Twin Support Vector Machine in Linear Programs
Abstract: This paper proposes a new algorithm, termed LPTWSVM, for the binary classification problem. It seeks two nonparallel hyperplanes and improves on TWSVM. We improve the recently proposed ITSVM and develop Generalized ITSVM. A linear function is chosen in the objective function of Generalized ITSVM, which leads to the primal problems of LPTWSVM. Compared with TWSVM, a 1-norm regularization term is introduced into the objective function to implement structural risk minimization, and the quadratic programming problems are replaced by linear programming problems, which can be solved quickly and easily. We therefore do not need to compute large inverse matrices or use any optimization trick in solving our linear programs, and the dual problems are unnecessary. The kernel function can be introduced directly in the nonlinear case, which overcomes a serious drawback of TWSVM. Numerical experiments verify that LPTWSVM is very effective.
Dewei Li, Yingjie Tian
240 Determining the time window threshold to identify user sessions of stakeholders of a commercial bank portal
Abstract: In this paper, we focus on finding a suitable value of the time threshold, which is then used in time-based user session identification. To determine its value, we used the Length variable, which represents the time a user spent on a particular site. We compared two values of the time threshold with experimental methods of user session identification based on the structure of the web: Reference Length and H-ref. Comparing the usefulness of the rules extracted using all four methods, we showed that the time threshold calculated from the quartile range is the most appropriate method for identifying sessions for web usage mining.
Jozef Kapusta, Michal Munk, Peter Svec, Anna Pilkova
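
As a rough illustration of the time-based session identification described in the abstract above, the following Python sketch splits one visitor's request timestamps into sessions whenever the gap between consecutive requests exceeds a threshold. The abstract does not give the formula for the quartile-based threshold, so the rule used here (Q3 + 1.5 × IQR of the Length values) is an assumption, and the names `quartile_threshold` and `split_sessions` are hypothetical.

```python
from statistics import quantiles


def quartile_threshold(lengths, k=1.5):
    """Upper cut-off for time spent on a page, in the same unit as `lengths`.

    Assumption: the threshold is Q3 + k * IQR; the abstract only says the
    threshold is "calculated from the quartile range".
    """
    q1, _, q3 = quantiles(lengths, n=4)
    return q3 + k * (q3 - q1)


def split_sessions(timestamps, threshold):
    """Group one visitor's request times into sessions.

    A new session starts whenever the gap to the previous request
    is larger than `threshold`.
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= threshold:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions
```

In use, `lengths` would hold the observed Length values from the log, and the resulting threshold would be passed to `split_sessions` for each visitor separately.
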
183 Historical Claims Data Based Hybrid Predictive Models for Hospitalization
Abstract: Over $30 billion is wasted on unnecessary hospitalization each year, so a better quantitative way is needed to find and identify patients who are most likely to be hospitalized and then provide them with the utmost care. As a starting point, the objective of this paper was to develop a predictive model of how many days patients may spend in the hospital in the next year, based on the patients' historical claims dataset provided by the Heritage Health Prize Competition. The proposed predictive model applies an ensemble of binary classification and regression techniques. The model is evaluated on a testing dataset in terms of the Root-Mean-Square Error (RMSE). The best RMSE score was 0.474, and the corresponding prediction accuracy of 81.9% was reasonably high. It is therefore reasonable to conclude that predictive models have the potential to predict hospitalization and improve patients' quality of life.
Chengcheng Liu, Yong Shi
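
For reference, the RMSE named in the abstract above can be computed as in the short Python sketch below. The abstract does not say whether the day counts were transformed before scoring, so this uses the plain textbook definition; the function name `rmse` and the sample values in the usage note are illustrative only and are not taken from the paper.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

For example, `rmse([0, 0], [3, 4])` evaluates the square root of (9 + 16) / 2, and identical inputs give 0.
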