Numerical weather models form the basis of all our forecasts. These complex computer programs are constantly improving, and so are our forecasts. Weather models are extremely useful, but they are not perfect.
Even our high-resolution models cannot fully represent the complex terrain of the Alps. Instead, a simplified representation of the topography must be used. As a result, small-scale processes relevant to our weather cannot be fully resolved by the model. This leads to systematic errors.
In addition, weather models provide a wealth of information about future weather, including multiple plausible outcomes to estimate the uncertainty of the forecast. To deal with this flood of information, forecasts from different weather models need to be combined and harmonized.
To automatically combine different weather models into a single, reliable, high-quality consensus forecast, we use statistical and machine learning methods. This is known as statistical postprocessing.
Postprocessing corrects and combines forecasts from weather models by comparing past forecasts with observations. For the different meteorological parameters such as wind, cloudiness, and precipitation, we employ different statistical and machine learning models. These differ in the selection of inputs from the weather models and in the distributional assumptions made about the forecast values. The models also vary in complexity, ranging from linear methods (the well-established ensemble model output statistics, or EMOS, approach) to novel deep-learning approaches using neural networks (currently used for the postprocessing of wind).
Common to all our postprocessing is that we seek not only to minimize systematic errors, but also to adjust the variability of the plausible forecast outcomes so that it better reflects the true uncertainty of the forecast, as shown in Figure 2. This results in probabilistic forecasts that are more reliable and thus more useful.
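To make this more concrete, the sketch below shows a strongly simplified, EMOS-style correction: a linear bias correction of the ensemble mean plus a rescaling of the ensemble spread. The function names and the moment-based fitting are assumptions chosen for illustration; the operational models are more elaborate and are typically fitted by minimizing a probabilistic score.

```python
import numpy as np

# Minimal, illustrative EMOS-style correction (hypothetical helper names).
# Training data: past ensemble means/spreads and the matching observations.
def fit_emos(ens_mean, ens_spread, obs):
    """Fit obs ~ N(a + b * ens_mean, d * ens_spread^2) by simple moment matching."""
    A = np.column_stack([np.ones_like(ens_mean), ens_mean])
    a, b = np.linalg.lstsq(A, obs, rcond=None)[0]    # bias correction of the mean
    resid_var = np.var(obs - (a + b * ens_mean))     # variability not explained by the corrected mean
    d = resid_var / np.mean(ens_spread ** 2)         # rescaling factor for the ensemble spread
    return a, b, d

def apply_emos(params, ens_mean, ens_spread):
    a, b, d = params
    mu = a + b * ens_mean                            # corrected forecast mean
    sigma = np.sqrt(d) * ens_spread                  # adjusted forecast uncertainty
    return mu, sigma
```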
Finally, postprocessing is also used to combine information from different weather models. This is done in an automated and objective way, such that the relative contribution of each weather model changes depending on forecast lead time (how far into the future we are forecasting), location, season, and many other aspects. In general, however, our high-resolution models are given more weight early in the forecast, while the global, coarse-resolution model is the only input used for the forecasts more than 5 days ahead.
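A minimal sketch of such lead-time-dependent weighting is shown below. The linear weight transition and the 120-hour cut-off are assumptions chosen only to mirror the behaviour described above (the high-resolution model dominating early on, the global model alone beyond 5 days); the operational weights also depend on location, season, and other factors, and are learned from data rather than prescribed.

```python
def blend_forecasts(hires_value, global_value, lead_time_h):
    """Blend a high-resolution and a global forecast with a weight that decays with lead time.

    Beyond 120 h (5 days) only the global model contributes. Purely illustrative.
    """
    w_hires = max(0.0, 1.0 - lead_time_h / 120.0)
    return w_hires * hires_value + (1.0 - w_hires) * global_value
```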
This multi-model approach results in postprocessed forecasts that are less jumpy. That is, when the forecast from a single weather model changes radically from one forecast initialization to the next, the postprocessed forecasts are affected far less. Also, systematic errors vary more smoothly over time than in the legacy system, where forecast errors depended strongly on the single weather model used (see Figure 2).
First and foremost, postprocessing uses statistical methods to correct the forecast from the weather models. As such, the forecasts are not radically different from the forecasts from the weather models used as input to the postprocessing. In particular, postprocessing does not invent new forecasts and cannot prevent forecasts from being significantly off in individual cases. What postprocessing does do, however, is reduce the systematic error on average.
Part of this reduction in forecast error can be attributed to the higher apparent resolution of the postprocessed forecasts. By using additional information such as high-resolution elevation data, forecasts can be adjusted for altitude, exposure, and other effects that shape local conditions. This makes the postprocessed forecasts more localized (Figure 3). Care must be taken to strike a balance between highly localized forecasts, which are optimal at one point but representative only of its immediate surroundings, and forecasts for larger areas, which capture fewer local specifics but are easier to interpret.
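For temperature, one simple example of such an altitude adjustment is a lapse-rate correction from the smoothed model-grid elevation to the true elevation of the site, sketched below. The constant lapse rate is a textbook assumption used only for illustration; the actual adjustments are learned statistically and cover more effects than altitude alone.

```python
LAPSE_RATE_K_PER_M = 0.0065  # assumed standard lapse rate (~6.5 K per 1000 m), illustrative only

def adjust_temperature_for_elevation(t_model_c, model_elev_m, site_elev_m):
    """Shift a model temperature from the model-grid elevation to the real site elevation."""
    return t_model_c - LAPSE_RATE_K_PER_M * (site_elev_m - model_elev_m)
```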
Postprocessed forecasts are generally more uncertain than the forecasts from individual weather models (see Figure 2). Postprocessing accounts for several additional sources of uncertainty that are not reflected in the ensemble forecasts of the weather models. Forecasts from a single ensemble system are much more similar to each other than to a realization from another weather model; combining multiple models in the postprocessing allows this additional source of uncertainty to be reflected in the postprocessed forecasts. Postprocessing also accounts for additional variability present in the observations that is not represented in the weather models. This additional variability is often due to local effects and can add further uncertainty to the forecast.
Thunderstorms account for much of our summer rainfall. Forecasting thunderstorms, however, is difficult because the timing and location of these storms are often impossible to predict several hours in advance. If we were to provide a single best forecast, it would often miss the mark in these inherently unpredictable situations. Instead, our forecast models and postprocessed forecasts produce many plausible realizations to quantify the uncertainty in the forecast. This is called ensemble or probabilistic forecasting.
In our example, we can use the many realizations to estimate the probability of rainfall. On a typical summer day, the majority of forecast realizations issued a few hours ahead show no precipitation for the afternoon, but individual ensemble members, in which a thunderstorm passes over the location of interest, may predict intense precipitation. It is therefore important to focus not only on the most likely outcome (no rain), but also on less likely outcomes with high impact (a thunderstorm with heavy rain). This illustrates the benefit of probabilistic forecasting. On the other hand, we must learn to deal with uncertain forecasts in order to benefit from the additional information.
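Concretely, the probability of rain can be estimated as the fraction of ensemble members exceeding a threshold, as sketched below with made-up member values.

```python
import numpy as np

# Hypothetical afternoon rainfall amounts (mm) from ten ensemble members.
members_mm = np.array([0, 0, 0, 0, 0, 0, 0, 0.2, 5.0, 18.0])

p_rain = np.mean(members_mm > 0.1)     # probability of measurable rain
p_heavy = np.mean(members_mm >= 10.0)  # probability of intense, thunderstorm-like rain
print(f"P(rain) = {p_rain:.0%}, P(heavy rain) = {p_heavy:.0%}")  # 30% and 10%
```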
Despite their uncertainty, we can evaluate the quality of probabilistic forecasts in much the same way as we evaluate a single (deterministic) forecast. After observing the actual weather, we calculate the forecast error. This error depends on how far off the forecast was, but also on how certain it was. A low-uncertainty forecast that is on target will have a lower forecast error than either a high-uncertainty forecast that is on target or a low-uncertainty forecast that is off target.
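One widely used score with exactly this behaviour is the continuous ranked probability score (CRPS). The sketch below computes it for a small ensemble; the score used operationally is not named in the text above, so the CRPS and the numbers here are purely illustrative.

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of an ensemble forecast: mean distance to the observation minus half the ensemble spread."""
    members = np.asarray(members, dtype=float)
    dist_to_obs = np.mean(np.abs(members - obs))
    spread = np.mean(np.abs(members[:, None] - members[None, :]))
    return dist_to_obs - 0.5 * spread

# Made-up temperature forecasts (degrees C) for an observed value of 12 C:
print(crps_ensemble([11.8, 12.0, 12.2], 12.0))  # certain and on target   -> smallest error
print(crps_ensemble([8.0, 12.0, 16.0], 12.0))   # uncertain but on target -> larger error
print(crps_ensemble([14.8, 15.0, 15.2], 12.0))  # certain but off target  -> largest error
```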
We can also assess whether the forecast uncertainty, or range, of probabilistic forecasts is generally appropriate. For example, we can check whether it actually rained on 60% of the occasions for which we forecast a 60% chance of rain. We find that, on average, the uncertainty of the postprocessed forecasts is reasonable. Forecasts from individual weather models, on the other hand, are often overconfident: in the example above, it may rain in only 40% of the cases for which the weather model's forecast probability was 60%.
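In its simplest form, such a reliability check can look like the sketch below: among all past cases where a probability close to 60% was forecast, count how often it actually rained. The array names are placeholders for a verification data set and are assumptions for illustration.

```python
import numpy as np

def observed_frequency(forecast_prob, it_rained, target=0.6, tol=0.05):
    """Observed rain frequency among cases where roughly `target` probability was forecast."""
    forecast_prob = np.asarray(forecast_prob, dtype=float)
    it_rained = np.asarray(it_rained, dtype=bool)
    selected = np.abs(forecast_prob - target) <= tol
    return it_rained[selected].mean() if selected.any() else float("nan")

# A reliable forecast system returns a value close to 0.6;
# an overconfident one returns a clearly lower value (e.g. ~0.4 in the example above).
```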