Ensemble Machine Learning, Deep Learning, and Time Series Forecasting: Improving Prediction Accuracy for Hourly Concentrations of Ambient Air Pollutants

Valentino Petrić, Hussain Hussain, Kristina Časni, Milana Vuckovic, Andreas Schopper, Željka Ujević Andrijić, Simonas Kecorius, Leizel Madueno, Roman Kern, Mario Lovrić

Research output: Contribution to journal › Article › peer-review

Abstract

This study aims to improve the generalisation capabilities of machine learning models for hourly air pollutant concentrations in scenarios where access to high-quality data is limited. Several techniques were applied to this problem: Prophet, random forest, and three deep learning architectures (long short-term memory networks, convolutional neural networks, and multilayer perceptrons). A hybrid model of random forest and Prophet was also tested. Its role was to combine the forecasting strengths of Prophet with the predictive power of random forest to better capture complex temporal patterns in the data. In testing, the hybrid model demonstrated improved generalisation, achieving statistically significant improvements in R² for hourly concentrations of NO (improving by 26%), NO2 (improving by 18%), PM10 (with changes ranging from an 8% decline to a 35% improvement), and O3 (with R² coefficients ranging from 0.83 to 0.87) at five sites in Graz, Austria. Surface atmospheric variables from the ERA5-Land datasets, used as model features, showed high post hoc feature importance in the best (hybrid) models per pollutant and site. Furthermore, error analysis was performed to better understand the conditions under which these models might fail. Although the models were expected to degrade as the test period (March 2019 to March 2020) extended further from the training data, they remained sufficiently stable for long-term prediction and can thus be used to forecast and predict air pollution.
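The abstract reports model quality as R² and as percentage changes in R². As a reading aid, the sketch below shows how R² and a relative change between two models could be computed. It rests on assumptions: the abstract does not state whether its percentages are relative or absolute changes, the function names `r2_score` and `relative_change_pct` are hypothetical, and the numbers are made up for illustration, not taken from the study.

```python
from statistics import fmean


def r2_score(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_obs = fmean(observed)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when observations are constant")
    return 1.0 - ss_res / ss_tot


def relative_change_pct(r2_baseline, r2_candidate):
    """Relative change of the candidate R2 over the baseline R2, in percent.

    Assumption: a relative (not absolute) change is meant.
    """
    return 100.0 * (r2_candidate - r2_baseline) / abs(r2_baseline)


# Made-up hourly concentrations (illustrative only, not data from the study).
observed = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0]
baseline = [11.0, 11.0, 13.0, 12.0, 11.0, 12.0]  # hypothetical single model
hybrid = [10.5, 12.0, 14.0, 11.0, 9.5, 13.5]     # hypothetical hybrid model

r2_base = r2_score(observed, baseline)
r2_hyb = r2_score(observed, hybrid)
print(f"baseline R2 = {r2_base:.3f}, hybrid R2 = {r2_hyb:.3f}")
print(f"relative change = {relative_change_pct(r2_base, r2_hyb):+.1f}%")
```

Under this reading, a baseline R² of 0.50 and a hybrid R² of 0.63 would correspond to a 26% improvement; under an absolute reading, the same figures would be a gain of 0.13.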
Original language: English
Article number: 230317
Journal: Aerosol and Air Quality Research
Volume: 24
Issue number: 12
Publication status: Published - Dec 2024

Keywords

  • Air pollution
  • CNN
  • LSTM
  • Ozone
  • Prophet
  • Random forests

ASJC Scopus subject areas

  • Environmental Chemistry
  • Pollution
