Computational Finance and Business Intelligence (CFBI) Session 1

Time and Date: 13:35 - 15:15 on 11th June 2018

Room: M8

Chair: Yong Shi

121	Deep Learning and Wavelets for High-Frequency Price Forecasting Abstract: This paper presents improvements in financial time series prediction using a Deep Neural Network (DNN) in conjunction with a Discrete Wavelet Transform (DWT). Compared with three alternatives, including ARIMA and other deep learning topologies, our model performs better. All experiments were conducted on High Frequency Data (HFD). Because the DWT decomposes signals in terms of both frequency and time, we expect this transformation to better represent the streaking behavior of high-frequency data. The input consists of 27 variables: the last 3 one-minute pseudo-log-returns and the last 3 one-minute compressed tick-by-tick wavelet vectors. Each vector is the result of compressing the tick-by-tick transactions inside a particular minute using a DWT with length 8. The DNN predicts the next one-minute pseudo-log-return, which can be transformed into the next predicted one-minute average price. For testing, we use tick-by-tick data of 19 companies in the Dow Jones Industrial Average Index (DJIA), from January 2015 to July 2017. The proposed DNN achieves a Directional Accuracy (DA) ranging from 64% to 72%.	Andrés Arévalo, Jaime Nino, Diego León, German Hernandez and Javier Sandoval
131	Kernel Extreme Learning Machine for Learning from Label Proportions Abstract: As far as we know, Inverse Extreme Learning Machine (IELM) is the first work extending ELM to the learning from label proportions (LLP) problem. Because it is based on the extreme learning machine (ELM), it is fast and achieves classification accuracy competitive with existing LLP methods. Kernel extreme learning machine (KELM) generalizes basic ELM to a kernel-based framework. It not only removes the need to set the number of hidden layer nodes manually, as basic ELM requires, but also offers better generalization ability and stability than basic ELM. However, there is no research based on KELM for LLP. In this paper, we apply KELM and propose a novel method, LLP-KELM, for LLP. Its classification accuracy is greatly improved compared with IELM. Extensive numerical experiments validate the effectiveness of our method.	Hao Yuan, Bo Wang and Lingfeng Niu
135	Extreme Market Prediction for Trading Signal with Deep Recurrent Neural Network Abstract: Recurrent neural networks are a type of deep learning unit well suited to extracting features from sequential samples. They have been extensively applied to forecasting univariate financial time series; however, their application to high-frequency multivariate sequences has rarely been considered. This paper addresses a classification problem in which recurrent units are extended to a deep architecture to extract features from multivariate market data at 1-minute frequency, and extreme market movements are subsequently predicted to produce trading signals. Our results demonstrate the ability of the deep recurrent architecture to capture the relationship between the historical behavior and future movement of high-frequency samples. The deep RNN is compared with other models, including SVM, random forest, and logistic regression, using CSI300 1-minute data over the test period. The results demonstrate that the deep RNN's capability to generate trading signals based on extreme movement prediction supports more efficient market decision making and enhances profitability.	Zhichen Lu, Wen Long and Ying Guo
181	Multi-view Multi-task Support Vector Machine Abstract: Multi-view Multi-task (MVMT) learning, a novel learning paradigm, can be used in a wide range of applications such as pattern recognition and natural language processing. Researchers have therefore proposed several methods from different perspectives, including graph models, regularization techniques, and feature learning. SVMs are acknowledged as powerful tools in machine learning. However, there is no SVM-based method for MVMT learning. To build an effective MVMT learner, we extend the PSVM-2V model, an SVM-based learner for multi-view learning (MVL), to the multi-task framework. Experiments demonstrate the effectiveness of the proposed method.	Jiashuai Zhang, Yiwei He and Jingjing Tang
225	Research on Stock Price Forecast Based on News Sentiment Analysis --A Case Study of Alibaba Abstract: Based on media news about Alibaba and an improved L&M dictionary, this study transforms unstructured text into structured news sentiment through dictionary matching. Using data on Alibaba’s opening price, closing price, maximum price, minimum price, and volume from the Thomson Reuters database, we build a fifth-order VAR model with lags. The AR test indicates that the VAR model is stable. Furthermore, the results of Granger causality tests, impulse response functions, and variance decomposition show that the VAR model successfully forecasts the variables dopen, dmax and dmin. Moreover, news sentiment contributes to the prediction of all three variables. Finally, MAPE reveals that dopen, dmax and dmin can be used in out-of-sample forecasting. Taking the dopen sequence as an example, we document how to predict the movement and rise of the opening price using the value and slope of dopen.	Lingling Zhang, Saiji Fu and Bochen Li
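
Abstract 121 above mentions converting a predicted one-minute pseudo-log-return into a predicted average price and reports Directional Accuracy (DA). The abstract does not define either quantity, so the following is only a minimal sketch under two assumptions: that the pseudo-log-return is the logarithm of the ratio of consecutive one-minute average prices, and that DA is the fraction of minutes in which the predicted and actual returns share the same sign. The function names are hypothetical and are not taken from the paper.

```python
import math


def next_average_price(current_avg_price, predicted_log_return):
    """Invert r = log(p_next / p_current) to recover p_next.

    Assumption: the pseudo-log-return is the log ratio of consecutive
    one-minute average prices (not stated in the abstract).
    """
    return current_avg_price * math.exp(predicted_log_return)


def _sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def directional_accuracy(actual_returns, predicted_returns):
    """Fraction of samples whose predicted and actual returns share a sign.

    Assumption: this is one common definition of directional accuracy;
    the paper may treat zero returns or ties differently.
    """
    if len(actual_returns) != len(predicted_returns) or not actual_returns:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1
        for actual, predicted in zip(actual_returns, predicted_returns)
        if _sign(actual) == _sign(predicted)
    )
    return hits / len(actual_returns)
```

Under these assumptions, a predicted return of zero leaves the average price unchanged, and a DA of 0.5 corresponds to guessing the direction correctly half of the time.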

Computational Finance and Business Intelligence (CFBI) Session 2

Time and Date: 15:45 - 17:25 on 11th June 2018

Room: M8

Chair: Yong Shi

305	Parallel Harris Corner Detection on Heterogeneous Architecture Abstract: Corner detection is a fundamental step in many image processing applications, including image enhancement, object detection, and pattern recognition. In recent years, the quality and number of images have increased, and applications mainly process videos or image streams. With the popularity of embedded devices, real-time processing on limited computing resources is an essential problem in high-performance computing. In this paper, we study a parallel method for Harris corner detection and implement it on a heterogeneous architecture using OpenCL. We also adopt several optimization strategies on the many-core processor. Experimental results show that our parallelization and optimization methods greatly improve the performance of the Harris algorithm on limited computing resources.	Yiwei He, Yue Ma and Dalian Liu
307	A New Method for Structured Learning with Privileged Information Abstract: In this paper, we present a new method, JKSE+, for structured learning. Compared with classical methods such as SSVM and CRFs, the optimization problem in JKSE+ is a convex quadratic problem and can be easily solved because it is based on JKSE. JKSE+ incorporates privileged information into JKSE, which improves its performance. We apply JKSE+ to object detection, a typical structured learning problem. Experimental results show that JKSE+ performs better than JKSE.	Shiding Sun and Chunhua Zhang
312	An Effective Model between Mobile Phone Usage and P2P Default Behavior Abstract: P2P online lending platforms have developed rapidly. However, these platforms may suffer serious losses caused by borrowers' default behavior. In this paper, we present an effective default behavior prediction model to reduce default risk in P2P lending. The proposed model uses mobile phone usage data, which are generated by widely used mobile phones. We extract features from five aspects: consumption, social network, mobility, socioeconomic status, and individual attributes. Based on these features, we propose a joint decision model that judges default risk by combining Random Forests with Light Gradient Boosting Machine. Validated on a real-world dataset collected by a mobile carrier and a P2P lending company in China, the proposed model not only demonstrates satisfactory performance on the evaluation metrics but also outperforms existing methods in this area. These results suggest that the proposed model is feasible and has the potential to be adopted in real-world P2P online lending platforms.	Huan Liu, Lin Ma, Xi Zhao and Jianhua Zou
340	A Novel Data Mining Approach towards Human Resource Performance Appraisal Abstract: Performance appraisal has always been a very important research field in human resource management. A reasonable performance appraisal plan lays a solid foundation for the development of an enterprise. Traditional performance appraisal programs are mostly labor-based and have difficulty examining employee results fairly. Furthermore, as globalization and technology advance, enterprises face fast-changing strategic goals and increasing cross-functional tasks, which raise new challenges for performance appraisal. To address these issues, this paper sets up a data mining-based performance appraisal framework to comprehensively assess employees' working ability and job competency. This framework has been successfully applied, providing a reliable basis for human resources management.	Pei Quan, Ying Liu and Yong Shi
341	Word Similarity Fails in Multiple Sense Word Embedding Abstract: Word representation is a foundational research area in natural language processing, one that is full of challenges compared with other fields such as image and speech processing. It embeds words into a dense low-dimensional vector space and can learn syntax and semantics at the same time. However, this representation yields only a single vector per word, regardless of whether the word is polysemous. To solve this problem, sense information is added in multiple-sense language models to learn alternative vectors for each word. However, word similarity, the most popular evaluation method for single-sense language models, does not perform as well in the multiple-sense setting, because word similarity based on cosine distance does not match annotated similarity scores. In this paper, we analyze similarity algorithms and find an obvious gap between cosine distance and benchmark datasets, because the negative interval in cosine space does not correspond to the manual score space and cosine similarity does not cover the semantic relatedness contained in the datasets. Based on this, we propose a new similarity method based on mean squared error, and experiments show that our new evaluation algorithm provides a better method for word vector similarity evaluation.	Yong Shi, Yuanchun Zheng, Kun Guo, Wei Li and Luyao Zhu
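
Abstract 341 above observes that the negative interval of cosine space has no counterpart in manually annotated similarity scores. The sketch below only illustrates that range mismatch; it does not reproduce the authors' proposed mean-squared-error method, which the abstract does not describe. The function name is hypothetical, and the remark that annotated scores are non-negative is an assumption about typical benchmark datasets, which the abstract does not name.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors.

    The result lies in [-1, 1]; values below zero arise whenever the
    vectors point in broadly opposite directions.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_u * norm_v)


# Toy two-dimensional "embeddings" chosen purely for illustration.
same_direction = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

If human annotators score word pairs on a non-negative scale (an assumption here), the value obtained for the opposite vectors has no direct counterpart among the annotated scores, which is the kind of gap the abstract describes.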