Rare Events

In the weather business, most forecasts are pretty routine. Predictions for sky cover, precipitation probabilities, and temperature have become very accurate, and when these routine predictions are wrong, they usually cause little more than a minor inconvenience. What people really need to know about is the high-impact events that cause numerous casualties and major disruptions to everyday life. As it turns out, the deadliest and most impactful events are also the rarest, which creates a number of practical challenges for a weather forecaster.

To really illustrate this point, here are some rough statistics about thunderstorms and severe thunderstorms in the United States:

- The United States sees roughly 100,000 thunderstorms per year.
- About 10% of those (roughly 10,000) go on to become severe.
- Only about 10 thunderstorms per year, roughly 1 in 10,000, produce a violent (EF4+) tornado.

As it turns out, the majority of tornado fatalities (about 60%) are caused by EF4+ tornadoes. In other words, a majority of all severe weather fatalities (not counting flood deaths) are caused by just 0.01% of all thunderstorms in the United States, or 1 in 10,000. Think about that for a minute: roughly 10 of the 100,000 thunderstorms that occur in the United States each year are responsible for most severe weather fatalities.

This is why the Storm Prediction Center (SPC) has reworked its public messaging. In accordance with the National Weather Service's new impact-based messaging mission, the SPC has implemented a 5-tier outlook system that more directly conveys the expected impacts. It ranges from the common "marginal risk", covering the lower echelon of strong to severe thunderstorms, to the rare "high risk", reserved for days when violent (EF4+), long-track tornadoes appear likely. Those are the top 1% of tornadoes, the ones that cause catastrophic damage and mass casualty events.

Nowadays, the SPC averages about one high risk day in the United States per year, but prior to 2015 high risks were more common (about 2-3 per year, depending on the decade). Why have high risk days become less common? The SPC has raised the standard for issuing a high risk because it wants to (1) keep the false alarm rate low and (2) more effectively highlight the days when an exceptionally volatile atmosphere, one that demands great respect, is in place. Even in tornado-prone sections of the country, several years often pass between high risks.

The main challenge with predicting these rare events is the small sample size. Even though no two days have exactly the same weather, common events come with hundreds to thousands of historical cases that followed similar patterns and produced similar outcomes. That makes it easy to place an upcoming common weather event into historical context: "we've seen this particular pattern 100 times before, and almost every time, this is what happened" (such a reference case is known as an "analog"). With rare events, you don't have that. Even though major (rare) tornado events share common elements, you may have, at best, a handful of historical events that bear any resemblance at all. And within the "rare events" you have the "extreme events", such as the only two super outbreaks on record (April 3rd, 1974 and April 27th, 2011). Those two super outbreaks had substantially different patterns, yet both produced essentially the same outcome.

If and when the next super outbreak occurs, can it actually be predicted? We only have two events to study, and the more recent one undoubtedly has far more data than the older one. It's entirely possible that the next super outbreak will be missed because it involves a pattern that hasn't been seen before. More likely, though, a pattern that looks like a potential super outbreak will materialize, and the event will turn out bad, but below expectations.

In order to get a "super outbreak" with numerous EF4+ tornadoes in a 24-hour timeframe, you need an extremely delicate balance of multiple ingredients, all coinciding with one another. These ingredients include:

- Instability: the energy available to storms, usually measured by CAPE
- Wind shear: both deep-layer bulk shear and low-level helicity
- Moisture: usually measured by the surface dewpoint
- Lift: a mechanism to initiate discrete storms

Let's examine some parameters from the April 27th, 2011 super outbreak to gain some perspective. The gold standard for representing atmospheric instability (energy) is CAPE (convective available potential energy). CAPE values above 1000 J/kg usually signal potential for severe thunderstorms; values above 2000 J/kg usually signal potential for significant severe thunderstorms; and values above 3000 J/kg usually indicate a very energetic, volatile atmosphere. On April 27th, 2011, CAPE values exceeded 4000 J/kg in many areas (values that high are almost never seen in the Deep South).
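
If you like to think in code, here is a tiny Python sketch of those rules of thumb. The thresholds come straight from the paragraph above; the category labels are just mine.

```python
def cape_category(cape_j_per_kg: float) -> str:
    """Map a CAPE value (J/kg) to the rough categories described in the text.

    These are rules of thumb only; real forecasting weighs CAPE alongside
    shear, moisture, and lift.
    """
    if cape_j_per_kg >= 3000:
        return "very energetic / volatile"
    if cape_j_per_kg >= 2000:
        return "significant severe potential"
    if cape_j_per_kg >= 1000:
        return "severe potential"
    return "limited instability"

# An April 27th, 2011-style value:
print(cape_category(4300))  # -> "very energetic / volatile"
```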

There are a number of parameters that represent wind shear, but we'll focus on just two of them for simplicity's sake. The first is "bulk shear", which essentially represents the change in the wind (speed and direction, taken as a vector) between two altitudes. To get a profile favorable for organized supercells (which are required for violent tornadoes), the surface-6 km bulk shear usually has to be at least 35 knots. On April 27th, the surface-6 km bulk shear reached around 70 knots.
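
For the curious, bulk shear is easy to compute if you have the wind components at the two levels: it's the magnitude of the vector difference. Here's a minimal Python sketch; the surface and 6 km winds below are hypothetical, chosen only to illustrate the calculation.

```python
import math

def bulk_shear_kt(u_sfc, v_sfc, u_6km, v_6km):
    """Surface-6 km bulk shear: magnitude of the vector difference between
    the 6 km wind and the surface wind. Inputs and output are in knots."""
    du = u_6km - u_sfc
    dv = v_6km - v_sfc
    return math.hypot(du, dv)

# Hypothetical winds: 15 kt southerly at the surface, 70 kt southwesterly at 6 km.
sfc = (0.0, 15.0)                                                   # (u, v) in knots
aloft = (70 * math.cos(math.radians(45)), 70 * math.sin(math.radians(45)))
print(round(bulk_shear_kt(*sfc, *aloft), 1))                        # ~60 kt of deep-layer shear
```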

When directly assessing tornado potential, a wind shear metric known as "helicity" is also used. Without getting too technical, it accounts for both the change in wind speed and the change in wind direction with altitude. Values above 100 m²/s² usually indicate potential for weak tornadoes, and values above 200 m²/s² usually indicate potential for strong (EF2+) tornadoes. On April 27th, these values exceeded 500 m²/s².
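
Helicity (more precisely, storm-relative helicity) is usually computed by integrating along the wind profile. Here's a rough Python sketch of the standard discrete approximation; the wind profile and storm motion below are made-up numbers, purely for illustration.

```python
def storm_relative_helicity(u, v, cx, cy):
    """Approximate storm-relative helicity (m^2/s^2) for a layer.

    u, v: lists of wind components (m/s) at increasing heights through the
    layer of interest (e.g., 0-1 km or 0-3 km).
    cx, cy: assumed storm motion components (m/s).
    Uses the usual discrete sum over hodograph segments.
    """
    srh = 0.0
    for n in range(len(u) - 1):
        srh += (u[n + 1] - cx) * (v[n] - cy) - (u[n] - cx) * (v[n + 1] - cy)
    return srh

# Hypothetical curved low-level hodograph (m/s) with storm motion (5, 2):
u = [2.0, 8.0, 14.0, 18.0]
v = [6.0, 12.0, 14.0, 13.0]
print(round(storm_relative_helicity(u, v, cx=5.0, cy=2.0)))  # a positive SRH of roughly 150
```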

Near the surface, the dewpoint is often used to represent moisture content. For most tornado events, a dewpoint of at least 65 °F is needed; on April 27th, dewpoints were between 70 °F and 75 °F, which is extremely moist. The final ingredient, lift, was present in nearly perfect measure as well, as numerous discrete (isolated) supercells formed in this environment.

So, there you have it. Not only were these ingredients in place, they were in great abundance. No wonder that day saw over 200 tornadoes, 15 of them rated EF4 or stronger. In other words, more than a year's worth of violent (EF4+) tornadoes occurred in the span of about 12 hours. In fact, this outbreak was so extreme that it completely overshadowed another major tornado event that had struck the same region only about two weeks earlier (April 15th, 2011).

Alright, so if we see this combination of parameters again, a super outbreak will occur, right? Wrong. As it turns out, this combination of ingredients is more common than previously thought, and there have been several recorded "false alarms" in which the indications all pointed to a super outbreak that never materialized.

Probably the most infamous example is the May 20th, 2019 event that affected parts of Oklahoma and Texas. Let's take a look at the same four numerical measurements and see how they compare:

Parameter                  April 27th, 2011    May 20th, 2019
CAPE                       4000 J/kg           5000 J/kg
Surface-6 km Bulk Shear    70 knots            65 knots
Helicity                   450 - 525 m²/s²     400 - 550 m²/s²
Dewpoint                   73 °F               72 °F


There were tornadoes on May 20th, 2019, but there weren't as many as expected, and the strongest was rated EF3. Even though every computer model showed numerous supercells forming all over Texas and Oklahoma that day, the lift was too weak to initiate and sustain that many storms. Only a handful of supercells formed, and no violent (EF4+) tornadoes were recorded.

A better (and more recent) example would be May 6th, 2024. Here is how that day compared with April 27th, 2011:

Parameter                  April 27th, 2011    May 6th, 2024
CAPE                       4000 J/kg           4000 J/kg
Surface-6 km Bulk Shear    70 knots            76 knots
Helicity                   450 - 525 m²/s²     400 - 650 m²/s²
Dewpoint                   73 °F               70 °F


So, what happened on May 6th? In this case, no one really knows. About 8 or 9 supercells formed in Kansas and Oklahoma. Only one of them produced an EF4+ tornado; the others produced almost nothing in the way of tornadoes. On paper, every supercell that day should have produced a big tornado, but that's not what happened. All of the ingredients were in place for an extreme tornado event, yet only one storm actually made full use of them.

If I were to offer my professional opinion on what's going on here, I believe these exceptionally volatile setups are simply not predictable in any practical sense. I realize that's an unsettling conclusion, because we really do need to know when the next super outbreak will occur, but I'm not convinced that will be possible without sounding a number of false alarms. To illustrate why I believe this, let me relate it to a more general concept.

In engineering and the sciences, almost everything has what's called a "critical threshold". Once the critical threshold is exceeded, the system you're looking at no longer behaves predictably; in other words, it becomes chaotic and practically impossible to predict accurately. I believe that the combination of ingredients needed for a super outbreak carries the atmosphere beyond that critical threshold, and a number of observations support this conclusion.1

Once CAPE exceeds a certain amount (the exact amount is up for debate), storms begin to behave erratically. This is one of the reasons the May 31st, 2013 El Reno tornado killed 4 storm chasers: the tornado was not moving in a straight line or at a constant speed the way most tornadoes do. That particular tornado covered every cardinal direction and had forward speeds ranging from 5 mph to 55 mph.

There is another well-known instance of uniquely erratic storm behavior. The F5 tornado that hit Jarrell, TX in 1997 formed in an environment that, in theory, should never support violent tornadoes because the wind shear was too weak. However, the CAPE values were extremely high (over 6000 J/kg), and it's believed the extreme energy somehow compensated for the relative absence of wind shear. Even so, that does not explain why storms forming in similar circumstances have failed to produce violent tornadoes. In fact, the vast majority of storms that form in a "high CAPE/low shear" environment produce only rain, lightning, wind, and hail.

Something similar happened on June 13th, 2012 in the Dallas/Fort Worth metroplex. CAPE values exceeded 4000 J/kg, but the wind shear should never have supported the development of supercells. Yet, somehow, two ordinary storms that every meteorologist expected to be short-lived and forgettable mutated into monstrous supercells that dropped destructive hail on the metroplex. Again, similar atmospheric environments have been observed on numerous occasions, but the same result hasn't really been replicated.2

There's also a physical explanation for why this might be the case. When the atmosphere holds extreme amounts of energy, any small or seemingly trivial detail can be amplified very rapidly. If there's a small imperfection in the wind pattern or near the storm's updraft, that imperfection can be amplified until it completely wrecks the storm's organization, something that would not happen if the energy levels were lower. The reverse can also be true: a small feature could enhance the storm's organization if sufficiently amplified, yet have no effect at all in a lower-energy environment. The problem is, such subtle details cannot be reliably observed or forecast.

If one of the preconditions for forecasting a major tornado outbreak is "high or extreme CAPE", that raises the question: has the Storm Prediction Center raised the bar so high that it's now trying to do the impossible? Although that question cannot be definitively answered right now, there is reason to believe the answer could be "yes".

The SPC has issued 9 high risks since the new 5-tier outlook system was implemented, and those days averaged about 47 tornadoes each. The 9 most recent high risks issued under the old 3-tier outlook system (excluding April 27th, 2011 and high risks issued for derechos) averaged about 61 tornadoes each.

If you look at the maximum EF-scale rating on high risk days, the 5-tier average is 2.9 and the 3-tier average is 3.9. And all of this despite the fact that there are now more potential tornado observers and more buildings for tornadoes to hit.

In fact, I would argue the SPC's uncharacteristically poor performance in the spring of 2024 can be at least partially explained by the unusually high CAPE values seen throughout that entire spring. The operational tools haven't changed much since 2020, and neither has the staffing. The SPC did, after all, correctly predict a major tornado outbreak on March 31st, 2023 (a day with CAPE values largely under 4000 J/kg, yet still high enough to support significant tornadoes). It would seem that the high/extreme CAPE setups of 2024 produced a large number of severe weather events that are practically impossible to predict accurately with any meaningful lead time.

If we're really entering a new era where these high/extreme CAPE environments will become the norm and not the exception, that is going to pose a major forecasting and public messaging challenge. In chaotic systems, you need to know what the exact state of the system is in order to predict it. If you're off by a seemingly trivial amount at any time or location, the outcome is going to change significantly. It is simply not possible to know the exact state of the entire atmosphere, because we can't observe every cubic centimeter of the atmosphere at all times. Even if we could, our instruments are not perfect, so we would still be dealing with a chaotic system.
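
To see just how unforgiving a chaotic system is about small errors in the initial state, here's a quick Python sketch using the classic Lorenz-63 toy model (a textbook demonstration, not a real weather model). Two runs start with states that differ by one part in a million and end up nowhere near each other.

```python
# Minimal sketch of sensitive dependence on initial conditions (Lorenz-63).
def lorenz_step(x, y, z, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz-63 system one small Euler step."""
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return x + dx * dt, y + dy * dt, z + dz * dt

a = (1.0, 1.0, 1.0)          # the "true" initial state
b = (1.000001, 1.0, 1.0)     # the same state, perturbed by one part in a million

for step in range(5001):
    if step % 1000 == 0:
        sep = sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
        print(f"t = {step * 0.01:4.0f}  separation = {sep:.6f}")
    a = lorenz_step(*a)
    b = lorenz_step(*b)

# The tiny initial difference grows until the two trajectories bear no
# resemblance to each other -- the practical limit on predictability.
```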

Even worse, this would mean that the exceptionally rare forecast messages reserved for hypothetical, unprecedented (but physically possible) weather events are useless. Few people know this, but the SPC can issue a 60% tornado probability. Usually, the highest tornado probability it will issue is 30%; a 45% tornado probability exists for more extreme threats and has been issued only a handful of times. As fate would have it, the two most recent 45% tornado probability days, both issued under the new 5-tier outlook system, failed to verify. Could this also be evidence that these extreme events just can't be reliably predicted? If so, the next issuance of a 60% tornado probability might turn into a major public messaging nightmare if the outcome falls below the mark.

I realize I've been focusing primarily on tornado events, but major straight-line wind events (derechos) suffer from the same problem. In fact, derechos feature some of the worst forecast performance of all severe weather events. Since the majority of derechos occur during the summer, when CAPE values are at their highest, this could be further evidence that high/extreme CAPE setups are not predictable. The SPC has issued at least 4 high risks specifically for derechos, and none of them verified. Meanwhile, a few documented extreme derechos (e.g. the August 2020 derecho) arguably warranted a high risk because of how much damage they caused, yet no high risk was issued.

This also extends beyond severe thunderstorms. Some of the worst flash flooding events in recent history were not correctly predicted in advance. Guess what kind of atmosphere a lot of these flash flooding events occurred in? High/extreme CAPE environments during the summer! From my own experience, trying to predict the exact behavior of summer thunderstorms is essentially impossible, and it just so happens that CAPE reaches its highest values during the summer (values over 9000 J/kg are not unheard of in the summer months).

It should also be noted that the true rarity of "rare weather events" will never be known. While it is true that the two recorded super outbreaks occurred about 40 years apart, that does not mean the next super outbreak will occur in the 2050s. It might occur in the 2040s, it might occur in the 2060s, or it could occur next year. Just because the "return period" is 40 years doesn't mean the event occurs exactly once every 40 years, and a simple experiment shows why.

Once a day, find a random coin in your home and flip it. Assuming the coin flip is truly random, the probability of seeing "heads" is 50%, so doing this once a day means getting "heads" has a return period of 2 days (1 ÷ 0.50 = 2). That is, on average, you should see "heads" on half the days you do this. However, if you do this experiment, you'll find that the outcomes do not follow a steady pattern at all. You'll probably have 2 consecutive days where you get "tails" instead. You might even have a stretch where you record 5 or 6 "heads" in a 7 day timeframe. The only way to accurately measure the true probability of an event is to witness the event an infinite number of times. Since our time on this planet is limited (not infinite), we have to approximate the probability after running a very large number of trials (i.e. a number that is so large that it is "approximately infinite"). How many trials are required? Unfortunately, the answer to this question is far from simple.
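
If you'd rather let a computer do the flipping, here's a short Python sketch of the same experiment. The seed is arbitrary (so the run is repeatable), and the point is simply that the gaps between "heads" are ragged even though the return period is exactly 2 days.

```python
import random

# One coin flip per day: heads has a 2-day return period, but the gaps
# between heads are anything but regular.
random.seed(42)  # arbitrary seed so the run is repeatable
days = 30
flips = [random.random() < 0.5 for _ in range(days)]

gaps, last_heads = [], None
for day, heads in enumerate(flips, start=1):
    if heads:
        if last_heads is not None:
            gaps.append(day - last_heads)
        last_heads = day

print("heads on", sum(flips), "of", days, "days")
print("gaps between heads:", gaps)  # expect a ragged mix of 1s, 2s, 4s, ...
```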

For the coin example, you would probably need at least 50 trials to get a probability estimate near 50%, and realistically more than 100. In other words, the number of trials needs to be much greater than the return period itself (here the return period is 2 days, so 50 trials is 50 days, or 25 times the return period).

If we assume the return period for a super outbreak event is 40 years, then 25 × 40 years = 1000 years. Do we have 1000 years of weather data? Not even close! At best, we have about 150 years of reasonably reliable data, but the quality of that data has changed over the years. Another complicating factor is that Earth's climate is always changing. Earth's climate 1000 years from now is going to be very different than the climate today. So, for all we know, Earth's climate might become more favorable for super outbreaks in the future or it might become less favorable. When you're figuring out the true probability of an event, you have to assume that the system you're examining is static (i.e. not changing, like the coin example). Earth's atmosphere is far from static and thus violates this assumption, so the true probability of any weather event will change over time.

There's also reason to believe that even bigger tornado outbreaks (a "mega outbreak", if you will) are theoretically possible; they're just so unlikely that one hasn't been observed yet. How do you measure the likelihood of something that is theoretically possible but has never been observed? You can't. The best you can do is make an educated guess based on the likelihood of similar, more common events, and the answer you get is just that: a guess.

This is how scientists estimate what a "1 in 1000 year flood" or a "1 in 500 year flood" is: they estimate the probability of extreme flooding events using the "common" flooding events as the basis for their inference. If you truly wanted to verify the probability of a "1000 year flood", you would probably need 25,000 years of data, and you would have to assume Earth's climate doesn't change over that span. It's simply an impossible task. So keep that in mind the next time you hear the term "1000 year flood" or "generational tornado outbreak": the figure is an estimate, and the true probability of such an event is unknown.
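
For what it's worth, one common tool for this kind of extrapolation (not something the paragraph above spells out) is extreme value analysis: fit a distribution to the annual maxima you do have, then read off the level associated with a given return period. Here's a rough Python sketch using purely synthetic data; the numbers mean nothing, but the huge leap from 80 years of data to a 1000-year estimate is exactly the caveat being made.

```python
import numpy as np
from scipy.stats import genextreme

# Fit a Generalized Extreme Value distribution to synthetic "annual maximum
# rainfall" data, then extrapolate to long return periods.
rng = np.random.default_rng(0)
annual_max_rainfall = rng.gumbel(loc=6.0, scale=2.0, size=80)  # 80 fake "years", inches

shape, loc, scale = genextreme.fit(annual_max_rainfall)
for T in (10, 100, 1000):
    # The T-year level is exceeded with probability 1/T in any given year.
    level = genextreme.isf(1.0 / T, shape, loc=loc, scale=scale)
    print(f"estimated {T}-year event: {level:.1f} inches")

# The 1000-year estimate extrapolates far beyond the 80 years of data,
# so its uncertainty is enormous.
```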

In some cases, you need a sample size that is massively larger than the return period to gauge the true probability accurately. The best example of this is the "probability of precipitation". While this figure varies with geographic location (which, by the way, is another factor that complicates probability calculations), the national average daily precipitation probability in the United States is approximately 20%. Therefore, the return period on "measurable precipitation" is 5 days (1 ÷ 0.20 = 5). Do you see rain or snow every 5th day throughout the calendar year? Of course not. There are "dry seasons" and "wet seasons" that strongly influence the likelihood of precipitation. In most places, the winter and spring months see the most precipitation while the summer and autumn months see the least. If you take a 125-day sample (5 × 25 = 125), you'll get a very different estimate of precipitation likelihood depending on which months you capture.

What about sampling an entire year (365 or 366 days)? You will get a more accurate result, but what if that year is drier or wetter than normal? To truly pin down the likelihood, you would probably need at least 10 years' worth of observations. At that point, what's the ratio of sample size to return period? 3650 ÷ 5 = 730, which is much higher than the factor of 25 suggested above.
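
Here's a small Python sketch of that sampling problem, using a made-up site whose daily rain chance swings with the seasons but averages about 20%. A 125-day sample gives wildly different answers depending on where it lands in the year, while a 10-year sample settles near the truth.

```python
import math
import random

# A fictional site whose daily precipitation chance oscillates between roughly
# 10% and 30% over the year, averaging about 20%. All numbers are invented.
random.seed(1)

def rain_probability(day_of_year):
    return 0.20 + 0.10 * math.cos(2 * math.pi * day_of_year / 365.0)

def estimate(start_day, n_days):
    """Fraction of simulated days with rain over a sample of n_days."""
    wet = sum(random.random() < rain_probability(start_day + d) for d in range(n_days))
    return wet / n_days

print("125-day sample starting in the wet season:", round(estimate(0, 125), 3))
print("125-day sample starting in the dry season:", round(estimate(180, 125), 3))
print("10-year sample:", round(estimate(0, 3650), 3))
```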

It's entirely possible that the frequency of rare weather events is influenced by similar cycles, making them more common in some centuries or millennia and less common in others. Good luck getting 730,000 years' worth of data to verify that something is a "1 in a 1000 year event".

Once you get into the realm of truly unprecedented events, forecasting the impacts becomes extremely challenging. The best example of this is Hurricane Harvey: 60" of rain in 5 days had never been witnessed in southeast Texas before. Since this was an unprecedented event, the true impacts of that rainfall were unknown, especially considering how much urbanization had taken place since the last event of even remotely comparable magnitude (Tropical Storm Allison). That same growth is also one of the main reasons the February 2021 Arctic outbreak was so catastrophic. The event itself wasn't unprecedented (a more severe Arctic outbreak occurred in 1899), but the impacts were unprecedented because of the rapid population growth Texas has seen.

There are lessons to be learned on both sides of this problem. If you're a weather forecaster and you see what looks like a potentially unprecedented or "super" event, you need to be extremely careful about how you present that to your end users (usually the general public). If you predict a super event and overshoot the mark too often, you'll come across as deranged and untrustworthy.

If you're the end user of a forecast, you'll need to accept the fact that there will be false alarms when a "super event" is being predicted, but that should NOT discourage you from taking the predictions seriously. If a professional meteorologist is predicting a potential super event, there's a really good reason for it. However, such predictions should be rare (roughly once every 5 to 10 years at most). If a weather forecaster is predicting unprecedented or "super" events every year, you'll definitely want to seek out additional opinions to corroborate or refute such claims.

Footnotes

1 A great everyday example of this concept is electricity. At low voltages, electricity flows predictably through conductors (e.g. wires). At extreme voltages (like a lightning bolt), electricity no longer behaves in any predictable way, and the traditional rules of circuitry go out the window.

2 Another good example of this unpredictable behavior was May 24th, 2011. Similar conditions were present in Oklahoma and Texas that day, yet Oklahoma saw several violent, long-track tornadoes while Texas saw mostly weak ones.

References

National Weather Service Thunderstorm Hazards

NOAA: Storm Prediction Center SVRGIS Page

NOAA: Storm Prediction Center Severe Weather Summary Archive