Thunderstorm forecasting with GAMs

September 15, 2018
By

(This article was first published on Achim Zeileis, and kindly contributed to R-bloggers)

Boosted binary generalized additive models (GAMs) with stability selection and corresponding MCMC-based credibility intervals are discussed in a new MWR paper as a probabilistic forecasting method for the occurrence of thunderstorms.

Citation

Thorsten Simon, Peter Fabsic, Georg J. Mayr, Nikolaus Umlauf, Achim Zeileis (2018). “Probabilistic Forecasting of Thunderstorms in the Eastern Alps.” Monthly Weather Review. 146(9), 2999-3009. doi:10.1175/MWR-D-17-0366.1

Abstract

A probabilistic forecasting method to predict thunderstorms in the European eastern Alps is developed. A statistical model links lightning occurrence from the ground-based Austrian Lightning Detection and Information System (ALDIS) detection network to a large set of direct and derived variables from a numerical weather prediction (NWP) system. The NWP system is the high-resolution run (HRES) of the European Centre for Medium-Range Weather Forecasts (ECMWF) with a grid spacing of 16 km. The statistical model is a generalized additive model (GAM) framework, which is estimated by Markov chain Monte Carlo (MCMC) simulation. Gradient boosting with stability selection serves as a tool for selecting a stable set of potentially nonlinear terms. Three grids from 64 x 64 to 16 x 16 km2 and five forecast horizons from 5 days to 1 day ahead are investigated to predict thunderstorms during afternoons (1200–1800 UTC). Frequently selected covariates for the nonlinear terms are variants of convective precipitation, convective potential available energy, relative humidity, and temperature in the midlayers of the troposphere, among others. All models, even for a lead time of 5 days, outperform a forecast based on climatology in an out-of-sample comparison. An example case illustrates that coarse spatial patterns are already successfully forecast 5 days ahead.

Software

https://CRAN.R-project.org/package=bamlss

Case study

Predicting thunderstorms in complex terrain (like the Austrian Alps) is a challenging task since one of the main forecasting tools, NWP systems, cannot fully resolve convective processes or circulations and exchange processes over complex topography. However, using a boosted binary GAM based on a broad range of NWP outputs useful forecasts can be obtained up to 5 days ahead. As an illustration, lightning activity for the afternoon of 2015-07-22 is shown in the top-left panel below, indicating thunderstorms in many areas in the west but not the east. While the corresponding baseline climatology (top middle) has a low probability of thunderstorms for the entire region, the NWP-based probabilistic forecasts (bottom row) highlight increased probabilities already 5 days ahead, becoming much more clear cut when moving to 3 days and 1 day ahead.

observed and forecasted occurrence of thunderstorms on 2015-07-22

More precisely, the probability of thunderstorms is predicted based on a binary logit GAM that allows for potentially nonlinear smooth effects in all NWP variables considered. It selects the relevant variables by gradient boosting coupled with stability selection. Effects and 95% credible intervals of the model for day 1 are estimated via MCMC sampling and shown below (on the logit scale). The number in the bottom-right corner of each panel indicates the absolute range of the effect. The x-axes are cropped at the 1% and 99% quantiles of the respective covariate to enhance graphical representation.

stability-selected effects of the boosted binary logit GAM

(Note: As the data cannot be shared freely, the customary replication materials unfortunately cannot be provided.)

To leave a comment for the author, please follow the link and comment on their blog: Achim Zeileis.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)