For those puzzling over the various hurricane computer forecast models to figure out which one to believe, the best answer is: Don’t believe any of them. Put your trust in the National Hurricane Center, or NHC, forecast.
It’s always been the case that a particular forecast model may outperform the official NHC forecast in some situations. However, the 2022 NHC Forecast Verification Report reiterates a longstanding truth: overall, it is very difficult for any one model to consistently beat the NHC forecasts for track and for intensity.
Track forecasts: New levels of accuracy
During the 2022 Atlantic hurricane season, NHC track forecasts had accuracies notably better than the five-year average. New records for track accuracy were set at time frames of 1, 1.5, 2, 2.5, 4, and 5 days.
Over the past 30 years, one- to three-day track forecast errors have been reduced by about 75%; over the past 20 years, four-day and five-day track forecast errors by 50 – 60%. Those numbers amount to an extraordinary accomplishment, one undoubtedly leading to huge savings in lives, damage, and emotional angst. The improvement in track forecast accuracy has slowed down in recent years, however, suggesting that forecasts may be nearing their limit in accuracy because of the chaotic nature of the atmosphere.
The average track error of 52.5 nautical miles for NHC’s two-day forecasts beat the goal of 55 nm set for 2022 as part of the Government Performance and Results Act of 1993.
Best track model in 2022: a mixed bag
As usual, the official NHC track forecasts for Atlantic storms in 2022 were tough to beat. Except for HMON at 60 and 72 hours, none of the individual models outperformed the official forecast at any time period when skill was measured against a “no-skill” model called CLIPER5 (Figure 2). The CLIPER5 model (whose name combines the words “climatology” and “persistence” to show the nature of the forecasts it makes) is tough to outperform at short lead times, since a hurricane will tend to keep moving in the same direction and at the same speed as at its initial point (this is called persistence). For that reason, the skill curve in Figure 2 shows relatively low skill for NHC forecasts out to one day; skill increases for forecasts between one and three days, when persistence tends not to be a good forecast (hurricanes generally don’t move in a straight line at a constant speed for days on end). Beyond three days, NHC forecast skill starts to drop off, as the CLIPER5 model starts weighting its forecasts using climatology, which becomes tougher to beat at long ranges.
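For readers who want to see what “skill relative to a no-skill model” means in numbers, here is a minimal sketch. It assumes the usual definition of skill as the percent improvement of a forecast's mean error over a baseline's mean error. The function name and the error values are made up for illustration and are not taken from the NHC report.

```python
def skill_percent(forecast_error_nm, baseline_error_nm):
    """Percent improvement of a forecast over a no-skill baseline.

    Positive values mean the forecast beat the baseline; negative values
    mean the baseline (for example, a CLIPER5-style forecast) did better.
    """
    if baseline_error_nm <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_error_nm - forecast_error_nm) / baseline_error_nm


# Hypothetical mean track errors in nautical miles at one lead time.
baseline_error = 200.0  # illustrative baseline error, not a real value
forecast_error = 50.0   # illustrative forecast error, not a real value

print(skill_percent(forecast_error, baseline_error))  # 75.0
```

Under this definition, a forecast that merely matches the baseline scores 0%, which is why skill looks modest at short lead times, when persistence alone is already a good forecast.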
Overall, there was no single best-performing track model in 2022, according to NHC. Rather, there were different top performers at various time ranges: the GFS at short lead times (12 to 24 hours), HMON at middle lead times (2 to 4 days), and the GEFS (GFS ensemble) at long range (5 days). The Euro model was competitive across the range of time frames, and it was the best single model for track at 36 hours.
Here is a list of some of the top hurricane forecast models used by NHC:
Euro: The European Centre for Medium-Range Weather Forecasts (ECMWF) global forecast model
GFS: The National Oceanic and Atmospheric Administration (NOAA) Global Forecast System model
UKMET: The United Kingdom Met Office’s global forecast model
HAFS: Hurricane Analysis and Forecast System (newly added in 2023; see below)
HMON: Hurricanes in a Multi-scale Ocean-coupled Non-hydrostatic regional model, initialized using GFS data
HWRF: Hurricane Weather Research and Forecasting regional model, initialized using GFS data
COAMPS: COAMPS-TC regional model, initialized using GFS data
NHC intensity forecasts: Best yet in the short range, still challenged at longer range
Though intensity forecasts have not improved as dramatically as track forecasts over the past 30 years, there has been a notable decrease since around 2010 in intensity errors. Official NHC intensity forecast errors in the Atlantic in 2022 were 11-24% smaller than the five-year average for time periods from 12 to 72 hours, and records for intensity accuracy were set in 2022 for forecasts from 12 to 60 hours out into the future. However, the intensity errors were substantially higher than the five-year average at 96 and 120 hours (see Figure 3).
Mean intensity forecast errors in 2022 (expressed in maximum sustained winds) were about 7 mph at 24 hours and increased to about 24 mph for five-day forecasts. The official forecasts had little bias through 60 hours but were biased too low for 3-, 4-, and 5-day forecasts.
Best intensity model in 2022: HWRF
In 2022, the official NHC intensity forecast outperformed all models at 12 and 24 hours, while the IVCN and HCCA consensus models outcompeted the official forecast at time frames beyond 24 hours. The HWRF performed as well as or better than the official forecast or any model blend at time frames from 60 to 120 hours, a switch from 2021, when HMON was the best single performer among the intensity models.
Over the past few years, the five top intensity models have typically been the regional/dynamical models HWRF, HMON, and COAMPS-TC (which subdivide the atmosphere into a 3-D grid around the storm and solve the atmospheric equations of fluid flow at each point on the grid), and the statistics-based LGEM and DSHP models (DSHP is the SHIPS model with inland decay of a storm factored in).
Two of the top-performing global dynamical models for hurricane track, the European (ECMWF) and GFS models, are typically not considered by NHC forecasters when making intensity forecasts. These models have traditionally made poor intensity forecasts, and this was the case again in 2022 for the Euro, as shown by the pale blue line near the bottom of Figure 4. However, the GFS model (dark blue line in Fig. 4) was surprisingly competitive in 2022, holding its own with the better intensity models and actually outcompeting the official NHC intensity forecasts at 96 and 120 hours. It will be interesting to see if this improvement—perhaps reflecting recent model updates—is a one-year fluke or is sustained in 2023. In addition, a major upgrade to the Euro that was implemented on June 23 could result in better intensity forecasts for that model as well, including its ensemble products. For example, the horizontal resolution of the medium-range Euro ensemble members has been improved from 18 km to 9 km.
New on the scene: the HAFS model
For the 2023 season, NHC is bringing the new Hurricane Analysis and Forecast System (HAFS) model into the fold of its model guidance. HAFS, which became fully operational on June 27, is now the preferred option within the National Weather Service for high-resolution track and intensity forecasts, similar to the guidance long provided by HMON and HWRF. (These two models are still being run this season, but in “legacy” mode, so the underlying code will no longer be updated.) Three years of testing (2020-2022) showed improvements of up to 10% in both track and intensity for HAFS versus HWRF.
HAFS is the hurricane-oriented element of the NWS Unified Modeling System, which uses a common dynamical core that’s designed to help streamline the agency’s key modeling efforts. Also part of this unified system is the current GFS model, which will provide input to the higher-resolution HAFS.
Two versions of HAFS are being run, both out to 126 hours and with maximum resolutions of 2 km in and around tropical cyclones:
- HAFS-A (for all global oceanic basins)
- HAFS-B (only for those basins monitored by NHC, including the North Atlantic and the Eastern and Central Pacific)
Sources of free model data
About ensemble models
Ensemble model runs are available for most of the top global models. An ensemble model is created by taking the forecast from the high-resolution version of a model like the GFS or European, then running multiple versions of the model with slightly different initial conditions to generate an ensemble of potential forecasts that suggest uncertainties that may exist. These ensemble members are run at a lower resolution to save computer time. The European model has 51 ensemble members, and the GFS has 31. The 0Z GFS run (called GEFS) goes out to Day 35 (note: there is approximately a 24-hour delay for Days 17-35 to be recorded). Note that Days 17-35 ensemble forecasts should be taken with a large grain of salt for now but may still be useful for tracking long-term or seasonal shifts.
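The idea of an ensemble can be illustrated with a toy example. The sketch below is not how the GFS or European ensembles are built. It uses a stand-in “model” that moves a storm in a straight line, and the starting position, heading, perturbation sizes, and function names are all assumptions chosen for illustration. It shows only the basic recipe: perturb the initial conditions, run the same model for each member, and measure how far apart the members end up.

```python
import math
import random
import statistics


def toy_track(lat0, lon0, heading_deg, speed_deg_per_step, steps):
    """Stand-in for a forecast model: straight-line motion from the initial state.

    Heading is measured clockwise from north, so cos() gives the northward
    component and sin() gives the eastward component.
    """
    heading = math.radians(heading_deg)
    return [
        (lat0 + k * speed_deg_per_step * math.cos(heading),
         lon0 + k * speed_deg_per_step * math.sin(heading))
        for k in range(steps + 1)
    ]


def run_ensemble(n_members, steps, seed=0):
    """Run the toy model n_members times from slightly different initial states."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    members = []
    for _ in range(n_members):
        members.append(toy_track(
            lat0=20.0 + rng.gauss(0, 0.1),          # hypothetical position noise
            lon0=-60.0 + rng.gauss(0, 0.1),
            heading_deg=315.0 + rng.gauss(0, 5.0),  # hypothetical heading noise
            speed_deg_per_step=0.5,
            steps=steps,
        ))
    return members


def latitude_spread(members, step):
    """Standard deviation of member latitudes at one forecast step."""
    return statistics.pstdev(track[step][0] for track in members)


members = run_ensemble(n_members=31, steps=10)
print("spread at start:", latitude_spread(members, 0))
print("spread at end:  ", latitude_spread(members, 10))
```

In this toy setup the spread grows with lead time because small heading differences compound, which is a simplified picture of why ensemble spread is read as a measure of forecast uncertainty.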
Ensembles are especially useful for setups such as weak steering flow, where the varied starting conditions across a model ensemble may shed light on important features that the observing grid hasn’t yet captured directly. When the spread in a model ensemble decreases as a storm evolves, it’s a good sign that the forecast from that operational model is becoming more reliable. Keep in mind that one model’s ensemble tracks can sometimes be in tight agreement while another model’s ensemble is in tight agreement on a completely different solution. In such a case, it’s often the different physics within each model that are driving the difference, which makes it especially important to watch how the consensus model output evolves (the consensus is the average of the forecasts from three or more separate models, such as the GFS, European, and UKMET models).
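A consensus forecast of this kind can be sketched as a simple average of model positions at each lead time. The positions below are invented for illustration, the function name is hypothetical, and plain averaging of latitude and longitude is an assumption that works only away from the date line. Operational consensus aids may weight or correct the member models differently.

```python
def consensus_track(model_tracks):
    """Average the (lat, lon) positions of several models at each lead time.

    model_tracks maps a model name to a list of (lat, lon) positions, one per
    lead time. Only lead times available from every model are averaged.
    """
    tracks = list(model_tracks.values())
    if not tracks:
        raise ValueError("at least one model track is required")
    n_times = min(len(track) for track in tracks)
    n_models = len(tracks)
    return [
        (sum(track[i][0] for track in tracks) / n_models,
         sum(track[i][1] for track in tracks) / n_models)
        for i in range(n_times)
    ]


# Hypothetical positions at two lead times; these are not real forecasts.
tracks = {
    "GFS":   [(20.0, -60.0), (21.0, -61.5)],
    "Euro":  [(20.0, -60.0), (21.5, -62.0)],
    "UKMET": [(20.0, -60.0), (22.0, -62.5)],
}

print(consensus_track(tracks))  # [(20.0, -60.0), (21.5, -62.0)]
```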
Tropical cyclone genesis forecasts
NHC has long issued a Tropical Weather Outlook four times per day, offering two-day and five-day forecasts of tropical cyclone genesis. The five-day forecasts have been expanded this year to cover seven days. For the Atlantic in 2022, these forecasts were pretty reliable for five-day genesis forecasts of 10 – 70%. For example, when NHC gave a 50% chance a tropical cyclone would form within five days, one actually did form about 53% of the time.
However, NHC’s genesis forecasts were too conservative at the upper end of the distribution. Ninety-five percent of the Atlantic storms to which NHC gave an 80% chance of development did in fact develop, and 100% of the systems to which NHC gave a 90% chance of development actually developed.
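The reliability check described above amounts to grouping forecasts by their stated probability and comparing each group with how often genesis actually happened. Here is a minimal sketch of that bookkeeping. The sample data and the function name are invented for illustration and are not NHC's verification code or results.

```python
from collections import defaultdict


def reliability_table(forecasts):
    """Map each forecast probability (percent) to the observed frequency (percent).

    forecasts is an iterable of (probability_percent, formed) pairs, where
    formed is True if a tropical cyclone developed within the forecast window.
    """
    counts = defaultdict(lambda: [0, 0])  # probability -> [n_forecasts, n_formed]
    for probability_percent, formed in forecasts:
        counts[probability_percent][0] += 1
        counts[probability_percent][1] += int(formed)
    return {p: 100.0 * n_formed / n for p, (n, n_formed) in sorted(counts.items())}


# Made-up sample: four 50% forecasts (two verified) and two 90% forecasts (both verified).
sample = [
    (50, True), (50, False), (50, True), (50, False),
    (90, True), (90, True),
]

print(reliability_table(sample))  # {50: 50.0, 90: 100.0}
```

A perfectly reliable set of forecasts would show an observed frequency equal to each stated probability. An observed frequency above the stated probability, as in the 90% row of this made-up sample, is what “too conservative” means.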
A 2016 study by a group of scientists led by Florida State’s Daniel Halperin, though now seven years old, is worth noting: it found that four models can make decent forecasts out to five days in advance of the genesis of new tropical cyclones in the Atlantic. The model with the highest success ratio (rewarding correct genesis forecasts combined with the fewest false alarms) was the European, followed by the UKMET, GFS, and Canadian models.
The scientists authoring that study found that skill declined markedly for forecasts beyond two days into the future, and skill was lowest for small tropical cyclones. The European model had the lowest probability of correctly making a genesis forecast (near 20%) but had the fewest false alarms. The GFS correctly made genesis forecasts 20 – 25% of the time but had more false alarms. The Canadian model had the best chance of making a correct genesis forecast but also had the highest number of false alarms. The take-home message: A genesis forecast from the Canadian model suggests something may be afoot, but don’t bet on it until the European model comes on board. In general, when two or more models make the same genesis forecast, the odds of the event actually occurring increase considerably, the study authors found.
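The trade-off between catching genesis events and avoiding false alarms can be made concrete with two standard verification scores. The sketch below assumes the textbook definitions: probability of detection is hits divided by hits plus misses, and success ratio is hits divided by hits plus false alarms. It assumes, without confirmation from the article, that the study's measures match these definitions. The counts and the function name are made up for illustration.

```python
def genesis_scores(hits, misses, false_alarms):
    """Return (probability_of_detection, success_ratio) from event counts.

    hits:         genesis was forecast and it occurred
    misses:       genesis occurred but was not forecast
    false_alarms: genesis was forecast but did not occur
    """
    if hits + misses == 0 or hits + false_alarms == 0:
        raise ValueError("need at least one observed event and one forecast event")
    probability_of_detection = hits / (hits + misses)
    success_ratio = hits / (hits + false_alarms)
    return probability_of_detection, success_ratio


# Made-up counts for a cautious model: it rarely forecasts genesis,
# so it misses many events but raises few false alarms.
pod, success_ratio = genesis_scores(hits=20, misses=80, false_alarms=20)
print(pod, success_ratio)  # 0.2 0.5
```

A model can score low on detection and still have a high success ratio, which fits the article's description of the European model: it makes few genesis forecasts, but the ones it makes tend to verify.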
Sources of tropical cyclone genesis forecasts
Website visitors can comment on “Eye on the Storm” posts (see below). Please read our Comments Policy prior to posting. (See all EOTS posts here. Sign up to receive notices of new postings here.)