We are proud to announce that a research paper developed under the Deep Hybrid DataCloud project has been accepted for inclusion in the 22nd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES2018), to be held on 3-5 September 2018 in Belgrade, Serbia. The paper will be published online by Elsevier Science in the open-access Procedia Computer Science series.

Title: A multivariate fuzzy time series resource forecast model for clouds using LSTM and data correlation analysis

Authors: Nhuan Tran (a), Thang Nguyen (a), Binh Minh Nguyen (a), Giang Nguyen (b)

(a) School of Information and Communication Technology, Hanoi University of Science and Technology, Hanoi, Vietnam

(b) Institute of Informatics, Slovak Academy of Sciences, Bratislava, Slovakia

Abstract

Today, almost all clouds offer auto-scaling only through resource usage thresholds defined by users. Meanwhile, prediction-based auto-scaling still suffers from inaccurate forecasts in practical operation, even though such functions deal only with univariate monitoring data. So far, there have been very few efforts to process multiple metrics simultaneously to predict resource utilization. The motivation for multivariate processing is that there may be correlations among metrics, which have to be examined in order to make the model more applicable in practice. In this paper, we build a novel forecast model for proactive cloud auto-scaling systems by combining several mechanisms. In the data preprocessing phase, we apply a fuzzification technique to reduce the fluctuation of monitoring data. We then evaluate the correlations between different metrics to select suitable data types as inputs for the prediction model. In addition, a long short-term memory (LSTM) neural network is employed to predict resource consumption with multivariate time series data processed at the same time. We therefore call our model multivariate fuzzy LSTM (MF-LSTM). The proposed system is tested with Google trace data to demonstrate its efficiency and feasibility when applied to clouds.
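As a rough illustration of the preprocessing ideas mentioned in the abstract, the sketch below shows one possible way to smooth a metric by fuzzification and to pick correlated metrics as model inputs. It is not the paper's implementation: the equal-width interval scheme, the use of Pearson correlation, the threshold value, and the names `fuzzify`, `pearson`, and `select_inputs` are all assumptions made for this example, and the LSTM predictor itself is left out.

```python
import math

def fuzzify(series, n_intervals):
    """Replace each value by the midpoint of its equal-width interval.

    Assumption: equal-width partitioning; the paper may use another scheme.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        return list(series)  # constant series: nothing to smooth
    width = (hi - lo) / n_intervals
    out = []
    for v in series:
        # The maximum value falls on the upper edge, so clamp it
        # into the last interval.
        idx = min(int((v - lo) / width), n_intervals - 1)
        out.append(lo + (idx + 0.5) * width)
    return out

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov / math.sqrt(var_x * var_y)

def select_inputs(metrics, target, threshold):
    """Names of metrics whose |correlation| with the target meets the threshold."""
    return [
        name
        for name, values in metrics.items()
        if name != target and abs(pearson(values, metrics[target])) >= threshold
    ]

# Made-up monitoring samples, for illustration only (not Google trace data).
metrics = {
    "cpu": [0.10, 0.20, 0.30, 0.40, 0.50],
    "mem": [0.15, 0.22, 0.33, 0.41, 0.52],
    "disk": [0.50, 0.10, 0.40, 0.10, 0.50],
}
selected = select_inputs(metrics, target="cpu", threshold=0.5)
smoothed = fuzzify([0, 2, 4, 6, 8], n_intervals=4)
```

In this toy data, memory usage tracks CPU usage closely while disk usage does not, so only `mem` would be kept as an extra input alongside `cpu`. In a full pipeline the fuzzified, selected series would then be fed to the multivariate predictor.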