Sales forecasting is a real-life business forecasting task of particular importance to retail stores. It involves predicting future sales from historical data, with the goal of projecting future revenue over a specific period of time. Inaccurate business forecasts can result in actual or opportunity losses. The dataset used in this challenge is real-life data from one of South Africa's large retail stores. The retail store wants to forecast sales for the next 70 days at two levels of time granularity: per day and per week. The data used in this competition is multi-dimensional, with the dimensions date, department and store, over which the forecasts should be made.
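The two levels of granularity are related by simple aggregation. The following is a minimal sketch, assuming weeks run Monday to Sunday (the challenge does not state how weeks are defined) and that daily sales are held in a hypothetical mapping from date to sales amount:

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily_sales):
    """Sum daily sales into weekly totals keyed by the Monday of each week.

    daily_sales: dict mapping datetime.date -> sales amount (hypothetical layout).
    Assumes Monday-to-Sunday weeks; adjust if the competition defines weeks differently.
    """
    totals = defaultdict(float)
    for day, amount in daily_sales.items():
        # date.weekday() returns 0 for Monday, so this steps back to the week's Monday.
        week_start = day - timedelta(days=day.weekday())
        totals[week_start] += amount
    return dict(sorted(totals.items()))
```

When the 70-day horizon starts on a Monday it covers exactly 10 such weeks; otherwise the first and last weeks are partial.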
The forecasts should be as follows:
In addition to traditional time series forecasting methods, you are challenged to use machine learning techniques for this task. The classical approach for each level of time granularity would be to fit (i) d time series models, one for each of the d departments; (ii) s time series models, one for each of the s stores; and (iii) d*s models, one for each department-store pair. The problem with this approach is that the individual models do not learn from each other. It is reasonable to hypothesize that some patterns are shared across stores and across departments, and that these patterns can be leveraged to produce better sales forecasts. Whilst this challenge does not rule out the classical approach, participants are encouraged to adopt this hypothesis and to build time series models that share patterns across departments and across stores to improve forecasting accuracy.
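As a rough illustration of the shared-pattern hypothesis, the sketch below fits a single lag-1 autoregression on values pooled from every department-store series, after scaling each series by its own mean. The function names, the scaling choice and the AR(1) form are all assumptions made for illustration, not a method prescribed by the organizers; a competitive entry would likely use richer features and models.

```python
from statistics import mean


def fit_pooled_ar1(series_by_key):
    """Fit one model y_t = a + b * y_{t-1} on scaled values pooled across all series.

    series_by_key: dict mapping a (department, store) key -> list of sales values.
    Each series is divided by its own mean so that large and small series are comparable.
    """
    xs, ys = [], []
    for values in series_by_key.values():
        scale = mean(values) or 1.0  # fall back to 1.0 for an all-zero series
        scaled = [v / scale for v in values]
        xs.extend(scaled[:-1])  # previous-step values
        ys.extend(scaled[1:])   # next-step values
    if not xs:
        raise ValueError("need at least one series with two or more observations")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0  # ordinary least squares slope
    a = y_bar - b * x_bar          # ordinary least squares intercept
    return a, b


def forecast_series(values, a, b, horizon):
    """Roll the shared model forward for one series and undo the scaling."""
    scale = mean(values) or 1.0
    last = values[-1] / scale
    forecasts = []
    for _ in range(horizon):
        last = a + b * last
        forecasts.append(last * scale)
    return forecasts
```

The coefficients a and b are estimated once from all series, so a short or noisy series still benefits from the behaviour observed in the others, while the per-series scale keeps each forecast at that series' own level.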
Participants will be given training data to be used to train the forecasting models. Owing to the nature of time series data, no testing data will be released to the participants. The participants are expected to make predictions for the next 70 trading days and submit their forecasts.
If successful, your work will continue to advance the theory and practice of time series forecasting in retail sales data.
The winner(s) will present their solution during the SACAIR 2022 Conference. The conference will be hosted at the Stellenbosch Institute for Advanced Studies in Stellenbosch, South Africa, from 5 to 9 December 2022.
Contact the organizers via: sacairunconference@gmail.com
Submissions will be ranked according to the Root Mean Squared Error (RMSE) on the held-out test set. Owing to the nature of time series data, no testing data will be released to the participants. Participants are expected to submit three CSV files with the following forecasts for the next 70 trading days:
To obtain the overall score for the competition, a weighted RMSE will be used. The weights will be derived from the Rand value of the sales.
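The scoring described above can be sketched as follows. The RMSE function is the standard definition; the weighting scheme is an assumption, since the exact formula is not given here. This sketch takes a weighted average of per-series RMSE values, with hypothetical weights equal to each series' Rand sales value.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def weighted_rmse(rmse_by_series, rand_value_by_series):
    """Weighted average of per-series RMSE values.

    ASSUMPTION: weights are proportional to each series' Rand sales value;
    the organizers' actual weighting may differ.
    """
    total_weight = sum(rand_value_by_series[key] for key in rmse_by_series)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(
        rmse_by_series[key] * rand_value_by_series[key] for key in rmse_by_series
    )
    return weighted_sum / total_weight
```

Under this assumed scheme, errors on series with a high Rand value of sales dominate the overall score, so accuracy on the largest departments and stores matters most.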
26-08-2022: Start of the competition. Sample dataset released.
01-09-2022: Training data released. No test data will be released. Participants are expected to make their predictions for the next 70 trading days. The predictions will be evaluated automatically against the held-out observed sales for the next 70 days.
22-09-2022: Entry deadline
14-11-2022: Competition ends
1st Prize: R 5000.00 Takealot Voucher
2nd Prize: R 3000.00 Takealot Voucher
3rd Prize: R 1000.00 Takealot Voucher
Start: Sept. 1, 2022, noon
Description: Development phase: Submit your prediction for the next 70 trading days.
End: Nov. 14, 2022, 11:59 p.m.