Lessons in predictability – The Post-Groundhog Day 2009 non-event and the March 2009 “Megastorm”


 

These events represent cases where forecast guidance, as provided by numerical weather prediction (NWP) models and ensemble prediction systems (EPS), showed uncertainty. There was considerable evidence of misinterpretation of the NWP and EPS data, and of problems in the way the potential for the February ’09 storm was conveyed to the public [1].  Several days before the event, the potential February ’09 storm had been given names such as “The Megastorm”, “Groundhogzilla” and “Big Daddy Storm”, and was being compared to the “Superstorm” of March 1993. The actual impact of the February ’09 storm was much less than what was advertised [2], with 3-6” of snow and isolated reports of up to 8” from central and eastern PA through NJ and Long Island, NY (Fig. 1).  In contrast, the March ’09 storm had a much more widespread impact in terms of observed snowfall (Fig. 2), but was rated a 1 out of 5 on the NESIS Scale, the lowest ranking possible, hardly a “Megastorm”.

 

In addition to individual models from different forecast centers, NCEP’s Global Ensemble Forecast System (GEFS) showed that the two storms had contrasting degrees of uncertainty, with low confidence in the forecast details of the February ’09 storm and higher confidence in the March ’09 storm.  Ensemble display techniques will be used to illustrate this point.

 

These storms relied heavily on the merging of multiple upper troughs, which past versions of operational forecast models had difficulty resolving.  The well-known east coast “Surprise Snowstorm” of January 2000, and two storms whose impact along the east coast was greatly overstated, namely the “Millennium Storm” of 30 December 2000 and the March 2001 storm, are other prime examples of older versions of operational forecast models not resolving upper trough interactions effectively. The memory of these storms from 8-9 years earlier may have affected the perceptions and expectations of forecasters, who did not want another “surprise storm” to occur.

 

Great commentary, explanations and lessons resulting from the differing perceptions among forecast sources prior to the February 2009 storm are available from the Capital Weather Gang (Washington D.C.) and from Stu Ostro of The Weather Channel.  An AccuWeather explanation about computer model accuracy after the February 2009 storm is also available.

------

[1] Washington Post Capital Weather Gang articles on 4 February and 6 February 2009 summed up the issues and noted some of the names assigned to this storm.

[2] NWS Hazardous Weather Outlooks (HWO) and Area Forecast Discussions (AFD) across the northeast and mid-Atlantic U.S. highlighted varying degrees of a significant storm.

 

 

a) b)

 

Figure 1.  Snowfall totals on 3-4 February 2009 for a) New York and New England and b) Delaware, Pennsylvania and New Jersey.

 

 

Figure 2.  Snowfall totals and NESIS scale for the 1-3 March 2009 “Megastorm”.

 

Overview of forecast guidance prior to the 3-4 February 2009 storm

 

The set of forecast guidance that began the promotion of a major east coast snowstorm, with comparisons to the March 1993 “Superstorm”, was the 12Z 29 January guidance.  Successive sets of guidance every 12 hours from 00Z 30 January through 12Z 1 February showed significant eastward trends in the surface low pressure center and its associated impacts.  A closer look at the guidance showed that confidence in a high impact storm should not have been as high as was promoted by the meteorological community and broadcast media.  Ensemble guidance from the MREF, the ECMWF and a “Poor Man’s Ensemble” of the GFS, ECMWF, GGEM and UKMET showed significant spread and a high degree of uncertainty in the mean surface low pressure forecast in the 12Z 29 January and all successive guidance.  Lagged Average Forecasts (LAFs) for the GFS, ECMWF and GGEM also showed considerable spread and high uncertainty with regard to the track of the surface low pressure.  There are many ways of looking at the models and determining uncertainty.  Let’s begin by looking at set forecast times for multiple runs of different models and the GFS Ensemble.

 

Loops of model data from AWIPS D2D

 

a) b) c)

 

Figure 3.  Loops of 500 hPa heights from 12Z 29 January ECMWF through 00Z 2 February ECMWF for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of 500 hPa features.

 

a) b) c)

 

Figure 4.  Loops of 500 hPa heights from 12Z 29 January GFS through 00Z 2 February GFS for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of 500 hPa features.

 

a) b) c)

 

Figure 5.  Loops of 500 hPa heights from 12Z 31 January NAM through 00Z 2 February NAM for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of 500 hPa features.

 

 

 

a) b) c)

 

Figure 6.  Loops of 500 hPa heights from 12Z 29 January GFS Ensemble through 00Z 1 February GFS Ensemble for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of 500 hPa features.

 

a) b) c)

 

Figure 7.  Loops of MSLP from 12Z 29 January ECMWF, GFS and NAM through 00Z 3 February ECMWF for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  It is recommended that the loop be stopped and each image analyzed individually, as the ECMWF is shown first (green contours), then combined with GFS output (orange contours), and then the NAM is added (blue contours). Each successive image is the next model run, progressing every 12 hours, but valid at the times listed in a, b and c.  Note the significant differences in each model and the changes from run to run, showing the high level of uncertainty in the evolution of MSLP.

a) b) c)

Figure 8. Loops of 850 hPa and 925 hPa temperatures from 00Z 30 January through 12Z 2 February ECMWF for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  It is recommended that the loop be stopped and each image analyzed individually. Each successive image is the next model run, progressing every 12 hours, but valid at the times listed in a, b and c.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of thermal gradients.

a) b) c)

Figure 9. Loops of 850 hPa and 925 hPa temperatures from 12Z 29 January through 00Z 2 February GFS for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  It is recommended that the loop be stopped and each image analyzed individually. Each successive image is the next model run, progressing every 12 hours, but valid at the times listed in a, b and c.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of thermal gradients.

a) b) c)

Figure 10. Loops of 850 hPa temperatures from 12Z 29 January and 00Z 31 January GFS Ensemble for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  It is recommended that the loop be stopped and each image analyzed individually. Each successive image is the next model run, but valid at the times listed in a, b and c.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of thermal gradients.

a) b) c)

Figure 11. Loops of 850 hPa and 925 hPa temperatures from 12Z 31 January through 00Z 2 February NAM for a) 12Z 4 February, b) 00Z 5 February and c) 12Z 5 February.  It is recommended that the loop be stopped and each image analyzed individually. Each successive image is the next model run, progressing every 12 hours, but valid at the times listed in a, b and c.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of thermal gradients.

a) b)

Figure 12. Loops of accumulated liquid equivalent precipitation from 12Z 29 January through 12Z 31 January for a) ECMWF, b) GFS.  It is recommended that the loop be stopped and each image analyzed individually. Each successive image is the next model run, progressing every 12 hours.  Note the significant changes from run to run, showing the high level of uncertainty in the evolution of precipitation forecasts.

 

Internet graphics of ensemble derived fields from EYEWALL web site

 

500 hPa Heights

 

MREF

a)b)c)d)

e)f)g)

 

Figure 13.  Spaghetti plot and mean MREF 500 hPa heights and anomalies.  Note the anomalously low 500 hPa heights in the southeastern U.S. in each model run. It is also important to note that each successive model run showed the mean 500 hPa trough slightly farther east and/or more positively tilted, implying a storm track farther offshore of the eastern U.S.

 

850 hPa Winds

 

MREF

 

a)b)c)d)e)

f)g)h)i)

 

SREF

a)b)c)

 

Figure 14.  Wind barbs and anomalies for 850 hPa U and V winds.  Note the anomalies of magnitude 3 SD or more in the U and V wind components in the 12Z 29 January MREF, implying a strong boundary layer jet and thermal forcing as well as moisture advection off the Atlantic Ocean into the interior northeastern U.S.  Also note that successive runs of the MREF showed much weaker winds.  The MREF and SREF did show sporadic runs where U wind anomalies were around -3 SD, but these were the exception rather than the rule.  In general, after 12Z 29 January, the potential for a big snowstorm and associated extreme 850 hPa winds was greatly diminished.
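The anomalies quoted in this caption are departures from climatology expressed in standard deviations (SD). A minimal Python sketch of that calculation follows. It assumes a climatological mean and standard deviation are available on the same grid as the forecast; the function name and all sample values are hypothetical and are not taken from the MREF or SREF data.

```python
import numpy as np

def standardized_anomaly(forecast, climo_mean, climo_std):
    """Return (forecast - climatological mean) / climatological SD, element-wise."""
    forecast = np.asarray(forecast, dtype=float)
    climo_mean = np.asarray(climo_mean, dtype=float)
    climo_std = np.asarray(climo_std, dtype=float)
    if np.any(climo_std <= 0):
        raise ValueError("climatological SD must be positive")
    return (forecast - climo_mean) / climo_std

# Hypothetical 850 hPa U-wind values (m/s) at three grid points.
u_forecast = np.array([-20.0, -5.0, 4.0])
u_climo_mean = np.array([4.0, 3.0, 4.0])
u_climo_std = np.array([8.0, 8.0, 8.0])

u_anom = standardized_anomaly(u_forecast, u_climo_mean, u_climo_std)

# Flag points where the anomaly magnitude is 3 SD or more.
extreme = np.abs(u_anom) >= 3.0
```

In this made-up sample only the first grid point, with a strong easterly component, reaches the 3 SD magnitude.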

 

850 hPa Temperatures

 

MREF

 

Figure 15.  Spaghetti plot and mean of 850 hPa temperatures and anomalies.  Note the anomalously cold air in the northern Gulf Coast region and the thermal ridge and tight gradient into New England.

 

Precipitable Water (PWAT)

 

MREF

a)b)c)

d) e)

 

SREF

 

Figure 16.  Spaghetti plot and mean Precipitable Water (PWAT) and anomalies.  Note the anomalously high PWAT values into New England (3-4 SD above normal) during the earlier model runs, then the trend east and north with successive model runs.  This implied that the predicted storm track was being displaced farther east and offshore with each model run.

 

Probability of 1.00” liquid equivalent in 24 hours ending 00Z 4 February

 

MREF

a)b)c)

d)e)

SREF

 

Figure 17.  Probability of 1.00” liquid equivalent precipitation ending 00Z 4 February from the MREF and SREF.  Note how the probability for 1.00” liquid equivalent rapidly shifted east in successive model runs after the initial 12Z 29 January model run.
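Probabilities like these are commonly derived as the fraction of ensemble members that meet or exceed a threshold at each grid point. The sketch below illustrates that idea only; it assumes equally weighted members, uses hypothetical values, and is not necessarily how the MREF and SREF graphics were produced.

```python
import numpy as np

def exceedance_probability(member_values, threshold):
    """Percent of members (axis 0) at or above the threshold at each grid point."""
    members = np.asarray(member_values, dtype=float)
    return 100.0 * np.mean(members >= threshold, axis=0)

# Hypothetical 24-hour liquid equivalent (inches): 5 members x 2 grid points.
qpf = np.array([
    [1.20, 0.10],
    [1.05, 0.30],
    [0.80, 0.55],
    [1.00, 0.20],
    [0.40, 0.00],
])

prob_1in = exceedance_probability(qpf, 1.00)   # probability of 1.00" or more
prob_half = exceedance_probability(qpf, 0.50)  # probability of 0.50" or more
```

Comparing the same calculation across successive runs is one way to see a probability axis shift east, as described in the caption.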

 

Probability of 0.50” liquid equivalent in 24 hours ending 00Z 4 February

 

MREF

a)b)c)

 

d)e)f)

 

SREF

a)b)c)

 

Figure 18.  Probability of 0.50” liquid equivalent precipitation ending 00Z 4 February from the MREF and SREF.  Note how the probability for 0.50” liquid equivalent rapidly shifted east as early as the 00Z 31 January model run, and then in successive MREF and SREF model runs.

 

Probability of 0.20” liquid equivalent in 24 hours ending 00Z 4 February

 

SREF

 

 

Figure 19.  Probability of 0.20” liquid equivalent precipitation ending 00Z 4 February from the SREF.  Note how the probability for 0.20” liquid equivalent was confined to southern New York and southern New England.

 

 

Plume Diagrams

 

MREF

Albany, NY

a)b)c)d)

e)f)g)h)

 

Binghamton, NY

 

 

Islip, NY

a)b) c)d)

 

SREF

Albany, NY

a)b)c)d)

Islip, NY

a)b)c)

 

Figure 20.  Plume diagrams for Albany, NY (ALB), Binghamton, NY (BGM) and Islip, NY (ISP).  Note the clustering above 1.00” at both ALB and BGM in the 12Z 29 January MREF, and the mix of rain and snow indicated.  Note also that 3 members suggested 0.60” or less, which represents approximately 14% of the members.  The clustering and the maximum values in the spread steadily decrease with each MREF model run until the clustering reduces to 0.20” or less.  The spread in the MREF at Islip actually decreased, increasing confidence in precipitation amounts on Long Island.  The signals in the SREF were more subtle, with the spreads remaining fairly consistent at ALB and ISP, but the clustering at ALB gradually shifted to smaller values, and the clustering at ISP gradually shifted to larger values.

 

Analysis of predictability

 

 

Figure 21.  MSLP from the GFS valid at 12Z 3 February, initialized at a) 12Z 29 January, b) 18Z 29 January, c) 00Z 30 January and d) 06Z 30 January.  Note that the predicted low pressure center is considerably farther east with each model run, which would result in a high level of uncertainty and a potentially considerable difference in sensible weather across the northeastern U.S.

 

 

Figure 22.  MSLP from the GFS valid at 12Z 3 February, initialized at a) 12Z 29 January, b) 00Z 30 January, c) 12Z 30 January and d) 00Z 31 January.  Note that the predicted low pressure center is significantly farther east with each model run, which would result in a high level of uncertainty and a potentially significant difference in sensible weather across the northeastern U.S.

 

 

Figure 23.  MSLP from the GFS valid at 12Z 3 February, initialized at a) 12Z 29 January, b) 12Z 30 January, c) 12Z 31 January and d) 12Z 1 February.  Note that the predicted low pressure center is significantly farther east with each model run, which would result in a high level of uncertainty and a potentially significant difference in sensible weather across the northeastern U.S.
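One way to put a number on the eastward trend described in Figures 21-23 is to locate the MSLP minimum in each run and difference the longitudes. The following is a minimal sketch under stated assumptions: the grid, the idealized Gaussian lows and the helper names are all hypothetical, and a simple grid-point minimum stands in for a proper cyclone tracker.

```python
import numpy as np

def low_center(mslp, lats, lons):
    """Return (lat, lon, pressure) of the grid-point minimum of a 2-D MSLP field."""
    mslp = np.asarray(mslp, dtype=float)
    j, i = np.unravel_index(np.argmin(mslp), mslp.shape)
    return float(lats[j]), float(lons[i]), float(mslp[j, i])

def eastward_shifts(fields, lats, lons):
    """Longitude change (degrees, positive = east) of the low between successive runs."""
    lon_track = np.array([low_center(f, lats, lons)[1] for f in fields])
    return np.diff(lon_track)

# Hypothetical 1-degree grid; longitudes are negative west of Greenwich.
lats = np.arange(35.0, 46.0, 1.0)
lons = np.arange(-85.0, -64.0, 1.0)
lon2d, lat2d = np.meshgrid(lons, lats)

def synthetic_low(center_lat, center_lon, depth=20.0):
    """Idealized low: 1012 hPa background minus a Gaussian depression."""
    r2 = (lat2d - center_lat) ** 2 + (lon2d - center_lon) ** 2
    return 1012.0 - depth * np.exp(-r2 / 18.0)

# Three synthetic runs valid at the same time, each with the low farther east.
runs = [
    synthetic_low(41.0, -78.0),
    synthetic_low(40.0, -74.0),
    synthetic_low(39.0, -70.0),
]
shifts = eastward_shifts(runs, lats, lons)
```

A sequence of shifts that stays positive from run to run is the numerical counterpart of the steady eastward trend noted in the captions.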

 

 

Figure 24.  Mean MSLP and spread from the 12Z 29 January GFS valid 12Z 3 February.  Note the greatest spread, and greatest uncertainty over the northeastern U.S.

 

 

 

Figure 25.  MSLP from the ECMWF valid at 12Z 3 February, initialized at a) 12Z 29 January, b) 00Z 30 January, c) 12Z 30 January and d) 00Z 31 January.  Note that the predicted low pressure center is significantly farther east with each model run, similar to the GFS, which would result in a high level of uncertainty and a potentially significant difference in sensible weather across the northeastern U.S.

 

 

 

Figure 26.  MSLP from the ECMWF valid at 00Z 4 February, initialized at a) 12Z 29 January, b) 00Z 30 January, c) 12Z 30 January and d) 00Z 31 January.  Note that the predicted low pressure center is significantly farther east with each model run, similar to the GFS, which would result in a high level of uncertainty and a potentially significant difference in sensible weather across the northeastern U.S.

 

 

Figure 27.  Mean MSLP and spread from the 12Z 29 January ECMWF valid 12Z 3 February.  Note the greatest spread, and greatest uncertainty over the interior eastern U.S.

 

 

Figure 28.  MSLP from the GGEM valid at 12Z 3 February, initialized at a) 12Z 29 January, b) 00Z 30 January, c) 12Z 30 January and d) 00Z 31 January.  Note that the predicted low pressure center is significantly farther east with each model run, similar to the GFS and ECMWF, which would result in a successively higher level of uncertainty and a potentially significant difference in sensible weather across the northeastern U.S.

 

 

Figure 29.  Mean MSLP and spread from the 12Z 29 January GGEM valid 12Z 3 February.  Note the greatest spread, and greatest uncertainty over the northeastern U.S.

 

 

Figure 30.  MSLP valid at 12Z 3 February from the 12Z 29 January a) ECMWF, b) GGEM, c) GFS and d) 18Z 29 January GFS.  Note the predicted low pressure center is significantly different in each model run, which would result in a higher level of uncertainty and a significant difference in sensible weather across the northeastern U.S.

 

 

Figure 31.  Mean MSLP and spread from a “Poor Man’s Ensemble” of the ECMWF, GFS, GGEM and UKMET initialized at 12Z 29 January and valid 12Z 3 February.  Note the greatest spread, and greatest uncertainty, over the northeastern U.S.

 

 

Analysis of predictability through experimental flip-flop tool (Courtesy of developer Mike Bodner of the Hydrometeorological Prediction Center)

 

Calculation for 500 hPa Flip Flop tool – results in units of decameters:

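$$\text{flip flop} = \sqrt{\left(\text{cycle}_{-12\,\text{hr}} - \text{cycle}_{-24\,\text{hr}}\right) \times \left(\text{cycle}_{\text{current}} - \text{cycle}_{-12\,\text{hr}}\right)}$$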

 

Note in the figures below that the greatest magnitude of flip flop occurred in the ECMWF for each run of the guidance.  Also note the greater magnitudes in both the GFS and ECMWF at the later forecast hours of each run of the guidance, especially at 108-132 hours, when some forecast sources were promoting high confidence in a big east coast storm.  Finally, notice that the greatest GFS spread, or flip flop, occurred in the 132 hour forecast from the 12Z 29 January cycle, and for the ECMWF, in the 108 hour and 120 hour forecasts from the 00Z 30 January cycle.
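The flip flop calculation above can be sketched in Python as follows. This is an illustration under stated assumptions and not the HPC tool itself. In particular, the product under the square root is negative whenever the two run-to-run changes have opposite signs, and the source does not say how that case is handled; the sketch assumes the square root is taken of the absolute value, and it separately reports where the trend reversed. The function name and sample heights are hypothetical.

```python
import numpy as np

def flip_flop(cycle_current, cycle_minus_12, cycle_minus_24):
    """Flip flop magnitude (dam) and a mask of where the run-to-run trend reversed.

    Inputs are 500 hPa heights (dam) from three successive cycles, all valid
    at the same time. ASSUMPTION: the square root is applied to the absolute
    value of the product, since the product is negative when the two
    successive changes have opposite signs.
    """
    older_change = np.asarray(cycle_minus_12, float) - np.asarray(cycle_minus_24, float)
    newer_change = np.asarray(cycle_current, float) - np.asarray(cycle_minus_12, float)
    product = older_change * newer_change
    magnitude = np.sqrt(np.abs(product))
    reversed_trend = product < 0
    return magnitude, reversed_trend

# Hypothetical heights (dam) at three grid points.
h_minus_24 = np.array([540.0, 552.0, 560.0])
h_minus_12 = np.array([544.0, 548.0, 560.0])
h_current = np.array([535.0, 544.0, 566.0])

magnitude, reversed_trend = flip_flop(h_current, h_minus_12, h_minus_24)
```

At the first point the heights rose 4 dam and then fell 9 dam, giving a magnitude of 6 dam with a reversal; at the third point one of the two changes is zero, so the magnitude is zero.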

 

 

120 hour GFS and ECMWF mean and flip flop (color shaded), initialized 29 January 12Z and valid 12Z 3 February

132 hour GFS and ECMWF mean and flip flop (color shaded), initialized 29 January 12Z and valid 00Z 4 February

  

 

108 hour GFS and ECMWF mean and flip flop (color shaded), initialized 30 January 00Z and valid 12Z 3 February

120 hour GFS and ECMWF mean and flip flop (color shaded), initialized 30 January 00Z and valid 00Z 4 February

 

  

 

96 hour GFS and ECMWF mean and flip flop (color shaded), initialized 30 January 12Z and valid 12Z 3 February

108 hour GFS and ECMWF mean and flip flop (color shaded), initialized 30 January 12Z and valid 00Z 4 February

 

Conclusions from the February 2009 storm

 

Bullet points from NWA presentation:

 

         No two storms are exactly alike, so citing analogs 2 or more days prior to an event is at the very least dangerous

         Consult ensemble mean and spread guidance

         Consult ensemble probabilities for various liquid equivalent precipitation values, along with plumes – let numerical probabilities guide you to “chance”, “likely” and “definite”

         Look for run-to-run consistency in 00Z and 12Z guidance/ensembles, for at least 2 consecutive runs before increasing forecast confidence to “scenario likely”

         Run-to-run trends are EXTREMELY IMPORTANT, as are ensemble spreads (spaghetti plots), especially if spreads are large and if shifts in storm tracks are noted

         Communicate sources of uncertainty and ranges of possibilities, especially 2 or more days prior to the event

         Avoid specific snow, sleet and ice amounts 2 or more days prior to an event

         Avoid emotionally charged language including but not limited to “Blizzard”, “Crippling”, “Mega”, “Super”, “Colossal”, “Historic”, or anything with the suffix “zilla”, especially ≥3 days prior to an event

         Routinely study past events, including the rarely studied non-events (forecast storms that do not occur)

         Do broadcast meteorologists need to negotiate agreements with company/station management as to proper communication of uncertainties?

         Broadcast/internet media hype affects the ENTIRE forecasting community

         Increased phone requests to all information sources

         Inconsistencies between information sources

         Ultimately the CREDIBILITY of the ENTIRE forecasting community can be affected
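The bullet on letting numerical probabilities guide wording can be illustrated with a small lookup. The sketch below is only that: the break points of 30, 60 and 80 percent and the label for the lowest category are assumptions chosen for illustration, and the presentation does not specify them.

```python
def wording_from_probability(prob_percent):
    """Map an ensemble probability (0-100) to a forecast wording category.

    ASSUMPTION: the break points (30, 60, 80) and the lowest label are
    illustrative only; they do not come from the presentation.
    """
    if not 0 <= prob_percent <= 100:
        raise ValueError("probability must be between 0 and 100")
    if prob_percent >= 80:
        return "definite"
    if prob_percent >= 60:
        return "likely"
    if prob_percent >= 30:
        return "chance"
    return "slight chance or none"

labels = [wording_from_probability(p) for p in (10, 45, 65, 90)]
```

Tying the wording to a fixed table of this kind keeps the language consistent from forecast to forecast, whatever thresholds an office actually adopts.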

 

Overview of guidance for the 1-2 March 2009 storm

 

The general run-to-run consistency for the track and strength of this storm was quite different from that of the 3-4 February 2009 storm.  Still, sources of guidance were not indicating a “Historical” storm, but a long-track Miller “A” storm originating from the northern Gulf Coast States.  Some sources promoted the storm as a “Megastorm”, possibly because of its Gulf Coast origins, on the assumption that forecast guidance was indicating a system that was too weak and was underestimating the moisture associated with it.  In the end, all sources of forecast guidance were relatively accurate.  Thunder snow in the southern Appalachians and a band of 5-10” of snow from north Georgia through the Carolinas, Virginia, the DELMARVA, the NYC metro area and southern New England constituted a rare event, but the societal impact was only modestly noteworthy, with a NESIS impact value of 1, the lowest possible rank.  The designation of “Megastorm” will be left to one’s own interpretation.

 

 

 

Figure 32.  Forecast guidance from the 12Z 27 February run of a) the GFS Ensemble and b) the GFS, NAM80 and ECMWF overlaid, valid at 00Z 3 March.

 

 

 

Figure 33.  Forecast guidance from the 12Z 1 March run of (left) the GFS Ensemble and (right) the GFS, NAM80 and ECMWF overlaid, valid at 00Z 2 March.

 

  

 

Figure 34.  (left) 00Z 1 March GEFS 24 hour probability of 1.00 inch liquid equivalent valid 00Z 2 March-00Z 3 March, (center) 03Z 1 March SREF 12 hour probability of 0.50 inch liquid equivalent valid 06Z 2 March-18Z 2 March and (right) 03Z 1 March SREF 24 hour probability of 1.00 inch liquid equivalent valid 00Z 2 March-00Z 3 March.

 

 

 

Figure 35.  MSAS analyses at 04Z 2 March of (left) MSLP and 3 hour pressure change, (right) wind barbs and dew points.

 

 

 

Figure 36.  (left) Water vapor satellite imagery at 0130 UTC 2 March with lightning overlaid (note the lightning in north Georgia and SC) and (right) precipitation at 1258 UTC 2 March associated with the weakening upper deformation zone.

 

   

 

Figure 37.  Sequence of radar imagery from left to right showing a gravity wave that tracked from the DELMARVA to offshore waters south of Long Island.

 

Analysis of predictability through mean and spread of the GEFS and GFS/ECMWF Poor Man’s ensemble (4 runs of the GFS and ECMWF) (Courtesy of developer Mike Bodner of the Hydrometeorological Prediction Center)

 

  

 

84 hours, initialized 26 February 12Z, valid 00Z 2 March

96 hours, initialized 26 February 12Z, valid 12Z 2 March

108 hours, initialized 26 February 12Z, valid 00Z 3 March

 

Analysis of predictability through experimental flip-flop tool (Courtesy of developer Mike Bodner of the Hydrometeorological Prediction Center)

 

Note the GFS showed less flip flopping than the ECMWF, similar to the February storm.  Also note the greater magnitude of flip flopping at later forecast hours, further supporting the fact that even in seemingly high confidence events, there is still considerable spread that precludes high confidence at 84 hours and beyond.

 

  

 

84 hour GFS and ECMWF mean and flip flop (color shaded), initialized 26 February 12Z and valid 00Z 2 March

96 hour GFS and ECMWF mean and flip flop (color shaded), initialized 26 February 12Z and valid 12Z 2 March

108 hour GFS and ECMWF mean and flip flop (color shaded), initialized 26 February 12Z and valid 00Z 3 March

 

 

 

Final thoughts

 

Despite significant differences between forecasts from different modeling centers, inconsistencies between successive runs of the same model, and considerable uncertainty in more traditional EPS data, the February ’09 storm threat was overstated by numerous sources of weather forecasts.  Conversely, the March ’09 storm brought heavy snow to much of the east coast of the United States (Fig. 2), but produced amounts of mainly 4-10”, with scattered amounts just over 12” in New Jersey and Long Island, NY, suggesting the label of “Megastorm” was overstated.

 

It would appear that many forecasters did not examine or understand the uncertainty associated with the potential February ’09 storm. Meteorologists should understand and employ uncertainty information in all weather forecasts. Uncertainty information is readily obtained using LAFs, Poor Man’s Ensemble (PME) techniques, and specific ensemble prediction systems.  Deliberate learning (Colvin 2008) on how to use EPS data is a potential means of reducing such overstated forecasts in the future.

 

The true LAF from the GFS and GEM quantified the differences. The large spread on the west side of the cyclone in all of the LAFs was due to the forecasts shifting eastward with time. Better methods of ensembling are available to forecast uncertainty; however, these data illustrate how each modeling system is sensitive to small changes in initial conditions.
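A lagged average forecast of this kind can be sketched as the mean and standard deviation of forecasts from successive initialization times that are all valid at the same time. The example below is a minimal illustration: it assumes equal weighting of every run, and the function name and pressure values are hypothetical.

```python
import numpy as np

def lagged_average_forecast(fields):
    """Equal-weight mean and spread (SD) of forecasts valid at the same time.

    fields: sequence of 2-D arrays, one per initialization time.
    """
    stack = np.stack([np.asarray(f, dtype=float) for f in fields], axis=0)
    mean = stack.mean(axis=0)
    spread = stack.std(axis=0)
    return mean, spread

# Hypothetical MSLP (hPa) on a 1 x 3 strip (west, center, east) from four runs.
# The low fills in the west and deepens in the center as it trends east.
lagged_runs = [
    np.array([[992.0, 1004.0, 1010.0]]),
    np.array([[998.0, 1000.0, 1010.0]]),
    np.array([[1004.0, 996.0, 1010.0]]),
    np.array([[1010.0, 992.0, 1010.0]]),
]

laf_mean, laf_spread = lagged_average_forecast(lagged_runs)
```

In this synthetic strip the largest spread sits at the western point, where the earliest runs placed the low and the later runs did not, which mirrors the pattern described for the west side of the cyclone.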

 

In the February ’09 case, the potential for a high impact storm affecting a large and densely populated region made this a more highly publicized example of forecast misinterpretation and miscommunication.  The sensitivity of forecasts to subtle changes in initial conditions is an irresolvable issue in weather forecasting.

 

Manjoo (2008) discussed how “truthiness” has permeated much of the media. Selective exposure to points and counterpoints leads people to seek the truth within the confines of their comfort zones. The media therefore focuses on marketing the truth through the comfort zones of a desired audience. The attention these storms gained on the basis of a few inconsistent forecasts suggests that meteorology may be marketed to those who seek information about weather within their comfort zones.

 

These two storms raise questions about individual forecast biases, human emotion in the forecast process, and a lack of knowledge about predictability. It is likely that all of these factors played a role in these events.