Which model to choose? Performance comparison of statistical and machine learning models in predicting PM2.5 from high-resolution satellite aerosol optical depth
Atmospheric Environment
By Padmavati Kulkarni, V. Sreekanth, Adithi R. Upadhya, Hrishikesh Chandra Gautam in Machine Learning, Regression, MODIS-MAIAC
May 18, 2022
Abstract
The mathematical solution for estimating surface fine particulate matter (PM2.5) from columnar aerosol optical depth (AOD) includes complex variables and rests on several assumptions. Hence, researchers tend to use training-based models to predict PM2.5 from AOD. Here, we integrated regulatory composite PM2.5 measurements, high-resolution satellite AOD, reanalysis meteorological parameters, and a few other auxiliary parameters to train ten different regression models. The performance of these (seven statistical and three machine learning) models was evaluated and inter-compared to identify the best-performing model. The accuracy of the model-predicted PM2.5 was quantified using the coefficient of determination (R²), mean absolute bias (MAB), normalized root mean square error (NRMSE), and other relevant regression coefficients. The models' performance on unseen data was investigated using 10-fold cross-validation (CV) and leave-one-station-out CV (LOOCV). For this exercise, we considered the case of NCT-Delhi because of: (i) the availability of dense regulatory PM2.5 measurements, (ii) the possibility of understanding model performance over a large range of PM2.5 (the daily mean PM2.5 values ranged between 4 and 492 µg/m³ during the study period), and (iii) the scope for better understanding the influence of extreme meteorological conditions (e.g., the ambient surface temperature varies between 5 and 40 °C during a calendar year) on the AOD-PM2.5 relationship. All the models were trained using data collected during 2019 (a non-COVID year). Among the models investigated, the machine learning (ML) models performed better than the statistical models, with R², MAB, and NRMSE values for the CV exercises ranging between 0.88 and 0.93, 14.1 and 18.2 µg/m³, and 0.18 and 0.23, respectively. The generalizability of the results obtained in this study is also discussed.
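The evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several choices here are assumptions: MAB is taken as the mean of the absolute differences between predicted and observed values, NRMSE is taken as the RMSE divided by the mean of the observations (the abstract does not state the normalization), and the function names are hypothetical. The two generators produce index splits for a shuffled k-fold CV and for a leave-one-station-out CV, where all samples from one monitoring station are held out together.

```python
import random
from collections import defaultdict
from math import sqrt


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_absolute_bias(observed, predicted):
    """Assumed definition: mean of |predicted - observed|."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def nrmse(observed, predicted):
    """Assumed definition: RMSE normalized by the mean of the observations."""
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    return sqrt(mse) / (sum(observed) / len(observed))


def k_fold(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def leave_one_station_out(station_ids):
    """Yield (station, train_indices, test_indices), one split per station."""
    by_station = defaultdict(list)
    for idx, station in enumerate(station_ids):
        by_station[station].append(idx)
    for station, test_idx in by_station.items():
        # Train on every sample that does not belong to the held-out station.
        train_idx = [i for i, s in enumerate(station_ids) if s != station]
        yield station, train_idx, test_idx
```

In use, a model would be refit on each training split and the three metrics computed on the pooled or per-fold held-out predictions. The station-wise split tests how well a model transfers to a location it has never seen, which a random 10-fold split does not, because samples from the same station can appear in both the training and test folds.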