Case Study: Time Series Forecasting

This case study was completed for the Forecasting module in the first year of the Graduate Diploma in Computer and Information Science. The analysis was implemented in R, drawing on its statistical and time series capabilities.

The study was structured into four sections, each applying forecasting techniques to a real-world dataset:

Brazil’s Global Economic Indicators: Forecasting a time series derived from Brazil’s economic indicators for the next five years using methods such as drift, mean, and naive approaches.
Queensland’s Trade Turnover: Exploring seasonal patterns in trade turnover in Queensland, Australia, and comparing forecasts using seasonal naive and STL decomposition techniques.
New Zealand’s Quarterly Unemployment Rates: Generating two-year forecasts for unemployment rates in New Zealand and comparing the performance of methods like mean, naive, seasonal naive, and drift.
Regression Modelling: Applying a linear regression model with seasonal dummies to quarterly unemployment data, generating forecasts, and assessing their accuracy against methods used in the third section.
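
The regression approach in the fourth section can be sketched in base R as follows. This is a minimal illustration, not the study's actual code: the series is synthetic, and the names `y`, `train`, `future`, and `quarter_effect` are assumptions introduced here. It fits a linear trend plus quarterly dummies with `lm()` and forecasts eight quarters (two years) ahead.

```r
# Hypothetical quarterly series (synthetic numbers, not the study's data)
set.seed(1)
n <- 40                                   # 10 years of quarterly observations
quarter_effect <- c(0.4, -0.2, -0.5, 0.3) # assumed seasonal pattern
y <- ts(5 + 0.02 * seq_len(n) +
          rep(quarter_effect, length.out = n) +
          rnorm(n, sd = 0.1),
        start = c(2010, 1), frequency = 4)

# Regressors: a linear time index and a quarter factor.
# lm() expands the factor into three dummy variables (Q1 is the baseline).
train <- data.frame(
  y       = as.numeric(y),
  trend   = seq_len(n),
  quarter = factor(cycle(y))
)
fit <- lm(y ~ trend + quarter, data = train)

# Forecast the next eight quarters; the series ends in Q4, so the
# first forecast quarter is Q1.
h <- 8
future <- data.frame(
  trend   = n + seq_len(h),
  quarter = factor((n + seq_len(h) - 1) %% 4 + 1,
                   levels = levels(train$quarter))
)
fc <- predict(fit, newdata = future)
print(round(fc, 3))
```

Using a factor for the quarter lets `lm()` build the dummy variables itself, which avoids the dummy-variable trap of including all four indicators alongside an intercept.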

Objective

The primary objective was to evaluate the performance of various forecasting techniques across diverse datasets, using RMSE (Root Mean Squared Error) and MAPE (Mean Absolute Percentage Error) as the accuracy metrics. By comparing these metrics, the study aimed to identify the trade-offs between simplicity and accuracy in forecasting models and to understand how model complexity affects predictive performance.
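
For reference, the two metrics can be written as short base R functions. This is a sketch using the standard definitions; the function names `rmse` and `mape` and the example vectors are illustrative assumptions, not taken from the study.

```r
# Root Mean Squared Error: square root of the average squared error
rmse <- function(actual, forecast) {
  sqrt(mean((actual - forecast)^2))
}

# Mean Absolute Percentage Error, in percent.
# Undefined when any actual value is zero.
mape <- function(actual, forecast) {
  100 * mean(abs((actual - forecast) / actual))
}

# Made-up values for illustration only
actual   <- c(100, 110, 120, 130)
forecast <- c(110, 100, 130, 120)

rmse(actual, forecast)  # 10
mape(actual, forecast)  # about 8.78
```

RMSE is expressed in the units of the series and penalises large errors more heavily, while MAPE is scale-free, which makes it easier to compare across datasets.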

Tools and Techniques Used

Programming Language: R
Forecasting Methods: Drift, Mean, Naive, Seasonal Naive, STL Decomposition
Regression Models: Linear Trend with Seasonal Dummies, Multiple Linear Regression
Evaluation Metrics: RMSE, MAPE
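
The four benchmark methods listed above are simple enough to compute by hand. The sketch below does so in base R on a made-up quarterly series; the variable names and data are assumptions for illustration. The `forecast` package offers equivalents (`meanf()`, `naive()`, `snaive()`, and `rwf()` with `drift = TRUE`), though the source does not state which package the study used.

```r
# Hypothetical quarterly series (made-up values, not the study's data)
y <- ts(c(5.1, 4.8, 4.6, 5.0, 5.3, 5.0, 4.7, 5.2),
        start = c(2020, 1), frequency = 4)
h <- 4              # forecast horizon in quarters
n <- length(y)
m <- frequency(y)   # seasonal period (4 for quarterly data)

# Mean: every forecast equals the historical average
mean_fc <- rep(mean(y), h)

# Naive: every forecast equals the last observation
naive_fc <- rep(y[n], h)

# Seasonal naive: each forecast equals the last observed value
# from the same season
snaive_fc <- y[n - m + ((seq_len(h) - 1) %% m) + 1]

# Drift: extend the straight line through the first and last observations
drift_fc <- y[n] + seq_len(h) * (y[n] - y[1]) / (n - 1)

print(data.frame(h = seq_len(h), mean_fc, naive_fc, snaive_fc, drift_fc))
```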

Conclusion

Analysing the forecasting accuracy metrics (RMSE and MAPE) for the models yielded the following insights:

Mean Model: Achieved the lowest RMSE (20.8830) and MAPE (19.8824), making it the most accurate of the three on these metrics. However, it forecasts a single constant (the historical average), so it cannot represent trend or seasonality.
Linear Trend & Seasonal Dummies Model: Showed the highest RMSE (42.7920) and MAPE (41.5249), making it the least accurate of the three on these metrics, even though it models trend and seasonality explicitly.
Multiple Linear Regression Model: Produced RMSE and MAPE values between those of the other two models, a middle ground between the simplest and the most structured approach.

In conclusion, the Mean model was the most accurate on RMSE and MAPE, but it does not account for the data's underlying dynamics. The Linear Trend and Seasonal Dummies model and the Multiple Linear Regression model had higher errors, yet they describe trend and seasonal variation explicitly, which makes them useful for understanding the structure of the series. The choice of model should therefore weigh forecast accuracy against how much of that structure the forecast needs to capture.
