Traffic forecasting study case of an application server (cluster, servers x4) using Machine Learning. The project consists of three parts, all presented in this README file:
- Data Preprocessing: Preprocessing of the input data (different levels of aggregation)
- Data Statistics: Statistical analysis of the input and preprocessed data (seasonality, trends)
- Traffic forecast using Machine Learning algorithms (time series forecasting)
The input data file contains traffic data per minute, on four application servers which they consist a single cluster. The traffic is disctributed to these four application servers in a load balanced way. The format of the input data is:
date host requests
15/03/26 14:00 as-01 316
15/03/26 14:00 as-02 285
15/03/26 14:00 as-03 306
15/03/26 14:00 as-04 286
15/03/26 14:01 as-01 268
15/03/26 14:01 as-02 303
15/03/26 14:01 as-03 266
15/03/26 14:01 as-04 290
...
The available data are from 2015-03-26 14:00:00 till 2020-04-03 19:59:00.
runForecast.py: Main script for the Traffic Forecast part. Usepython runForecast.py -hfor available options.utils.py: Utilities script for the Traffic Forecast part.dataFactory.py: Main script for the Data Preprocessing part.dataStatistics.py: Main script for the Data Statistics part.trafficForecast.py: Interface for the Traffic Forecast part.model.py: Abstract class for each model implementation.dnn.py: Deep Neural Network implementation.rnn.py: Recurrent Neural Network implementation. Not added to the repository yet.lstm.py: Long Short Term Memory Neural Network implementation. Not added to the repository yet.