TOP500 Meanderings: Supercomputers for Weather Forecasting Have Come a Long Way

March 1, 2017

By: Michael Feldman

Behind the scenes of practically every weather forecast we encounter today are some of the most powerful supercomputers on the planet. An armchair analysis of the world's top systems reveals some interesting aspects about the HPC technologies and machinery being used to generate these forecasts.

Weather prediction is one of the more critical high performance computing applications we have. It affects both commerce and the lives of individuals, and does so on a daily basis. According to the National Oceanic and Atmospheric Administration (NOAA), in 2016, weather events caused more than a billion dollars of financial losses and the deaths of 138 people in the United States alone. And that represented a relatively uneventful weather year for the country. In China, historic flooding in 2016 killed 237 people, destroyed nearly 150,000 homes, and caused at least $22 billion in losses. In 2015, a heatwave in India and Pakistan caused an estimated 3,650 deaths, while a European heatwave later that year led to 1,250 fatalities. Perhaps the costliest weather disaster was Hurricane Katrina in 2005, which killed more than 1,800 people in the southern US and cost insurance companies more than $40 billion. Obviously, being able to better predict when and where such events will occur can do much to mitigate the damage, saving money and, more importantly, human lives.

As of November 2016, there were 23 supercomputers on the TOP500 list dedicated to meteorological forecasts. That doesn’t include a rather large number of systems housed at national labs that devote a portion of their resources to supporting climate and weather research. In fact, some countries without an established meteorological infrastructure rely on these general-purpose research machines to perform this work. But the dedicated supercomputers for running numerical weather models in the US, Asia and Europe form the backbone of most forecasting for the world’s leading economies.

And even though those 23 systems make up less than five percent of the supercomputers on the TOP500 list, they account for more than seven percent of the list’s total performance: more than 47 Linpack petaflops. Currently the most powerful weather forecasting computer is the UK Met Office system, a Cray XC40 that delivers 6.8 petaflops and sits at number 11 on the TOP500. The second most powerful is Cheyenne, the recently installed SGI ICE XA system at the National Center for Atmospheric Research (NCAR). It’s ranked number 20 on the latest list, delivering 4.8 petaflops.
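For readers who want to check the arithmetic behind those shares, here is a minimal Python sketch. It uses only the figures quoted above; since the article gives "more than 47" petaflops and "more than seven percent," the constants below are treated as round lower bounds, and the variable names are purely illustrative.

```python
# Figures quoted in the article for the November 2016 TOP500 list.
LIST_SIZE = 500              # systems on the list
WEATHER_SYSTEMS = 23         # dedicated weather forecasting systems
WEATHER_PFLOPS = 47.0        # aggregate Linpack petaflops ("more than 47")
PERF_SHARE_PCT = 7.0         # share of total performance ("more than seven percent")

# Share of the list by system count.
count_share_pct = 100.0 * WEATHER_SYSTEMS / LIST_SIZE

# Average Linpack performance of a dedicated weather system.
avg_pflops = WEATHER_PFLOPS / WEATHER_SYSTEMS

# How much larger the performance share is than the count share.
# A value above 1 means weather systems are bigger than the typical list entry.
size_ratio = PERF_SHARE_PCT / count_share_pct

print(f"Share of systems:      {count_share_pct:.1f}%")
print(f"Average per system:    {avg_pflops:.2f} petaflops")
print(f"Performance vs. count: {size_ratio:.2f}x")
```

In other words, a dedicated weather machine is, on these numbers, roughly one and a half times as powerful as the average TOP500 entry.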

The Cray XC40 is currently the most popular supercomputer for weather agencies worldwide, claiming eight of the 23 systems in this application category. Installations exist in the US (NOAA), the UK (Met Office, ECMWF), Germany (Deutscher Wetterdienst), and Korea (Meteorological Administration). Another large XC40 machine for weather forecasting, but not yet ranked in the TOP500, has been deployed at the Bureau of Meteorology in Australia.

The remaining systems consist mostly of a few IBM (now Lenovo) iDataPlex clusters and Atos Bull DLC blade systems. There was also a single Fujitsu PRIMEHPC FX100 system at the Meteorological Institute in Japan, and an older IBM Flex System p460 at the China Meteorological Administration. The latter two were the only non-x86-based weather machines in operation, employing the SPARC64 XIfx CPU and the Power7 processor, respectively.

Hardware aside, the forecast model can often make the difference between success and failure. The now-famous 2012 forecast by the European Centre for Medium-Range Weather Forecasts (ECMWF), which accurately predicted Hurricane Sandy’s landfall while the Global Forecast System (GFS) used in the US had the storm heading out to sea, serves as a reminder that the software models are critical to success. In any case, having enough flops to run more simulations (and crunch more data) to reduce forecasting uncertainty gave ECMWF an extra edge in 2012.

One might naturally assume these supercomputers would be all tricked out with the latest accelerators. But with the exception of the Discover supercomputer at the NASA Center for Climate Simulation, none of the TOP500 machines make use of accelerators or other manycore processors. (Discover, a climate modeling platform rather than a weather forecasting system per se, uses some of the older Xeon Phi 5110P coprocessors.) The Swiss Federal Office of Meteorology and Climatology, MeteoSwiss, has a non-TOP500 Cray CS-Storm cluster that uses NVIDIA GPUs to accelerate its daily forecasting using the COSMO model. COSMO is also used on smaller clusters for weather forecasting in Germany, Italy, Greece, Poland, Romania and Russia.

By and large though, the top production systems deployed for weather forecasting lack accelerators. Some components of the popular Weather Research and Forecasting (WRF) model code have been ported to NVIDIA GPUs and Intel Xeon Phi processors. Likewise, the Nonhydrostatic Icosahedral Model (NIM), which is used by NOAA, now enjoys GPU support. For the time being, it looks like these accelerated codes have not made the jump from research into large-scale production.

The idea behind getting more performance into these supercomputers is to be able to provide quicker turn-around time for the atmospheric and ocean simulations so that severe weather events can be predicted well ahead of time. More performance also equates to better fidelity, with grid sizes reduced to 1 to 2 km on the more granular models. That translates to more accurate forecasts, so localized phenomena such as tornadoes, hailstorms, and intense downpours can be predicted at more useful scales.
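To get a feel for why finer grids demand so much more compute, consider the rough Python sketch below. It rests on a common rule-of-thumb assumption that is not taken from this article: the number of horizontal grid cells grows with the square of the refinement, and the time step has to shrink in proportion to the grid spacing, so total cost grows roughly with the cube. Vertical levels are held fixed, the function name is hypothetical, and the 4 km starting point is chosen only for illustration.

```python
def relative_cost(old_spacing_km: float, new_spacing_km: float,
                  shrink_time_step: bool = True) -> float:
    """Estimate how much more compute a finer horizontal grid needs.

    Assumed rule of thumb (not from the article):
      - horizontal cells scale as (old / new) ** 2
      - if the time step shrinks with the spacing, add one more factor
        of (old / new), giving (old / new) ** 3 overall
    """
    refinement = old_spacing_km / new_spacing_km
    exponent = 3 if shrink_time_step else 2
    return refinement ** exponent


# Hypothetical example: going from a 4 km grid to the 2 km and 1 km
# grids mentioned in the article.
for target_km in (2.0, 1.0):
    factor = relative_cost(4.0, target_km)
    print(f"4 km -> {target_km:g} km: about {factor:g}x the compute")
```

Under that assumption, each halving of the grid spacing costs about eight times as much compute for the same forecast window, which is why resolution gains tend to track new machine installations.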

Even without performance acceleration, the 2016-era machines are light-years ahead of what was available ten years ago. In 2006, there were a similar number of TOP500 weather forecasting systems (26), but the most powerful one of the day was a 16-teraflop Cray X1E supercomputer. The aggregate performance for all the weather machines in 2006 was just 200 teraflops. To put that in perspective, today, every supercomputer dedicated to weather modeling delivers more performance than that total. In fact, the current top three weather systems deliver more performance than the entire TOP500 list in 2006.
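The scale of that ten-year jump is easy to quantify. The short Python sketch below uses only the figures quoted in this article (16 and 200 teraflops in 2006; 6.8 and "more than 47" petaflops in 2016) and treats the 2016 numbers as round lower bounds. The annualized growth rate is a derived estimate, not a figure from the TOP500 list.

```python
TFLOPS_PER_PFLOPS = 1000.0

# 2006 figures, in teraflops.
top_2006_tflops = 16.0          # most powerful weather system (Cray X1E)
aggregate_2006_tflops = 200.0   # all TOP500 weather systems combined

# 2016 figures, in petaflops.
top_2016_pflops = 6.8           # UK Met Office Cray XC40
aggregate_2016_pflops = 47.0    # all TOP500 weather systems ("more than 47")

YEARS = 10


def growth_factor(new_pflops: float, old_tflops: float) -> float:
    """Ratio of new to old performance after converting petaflops to teraflops."""
    return new_pflops * TFLOPS_PER_PFLOPS / old_tflops


top_growth = growth_factor(top_2016_pflops, top_2006_tflops)
aggregate_growth = growth_factor(aggregate_2016_pflops, aggregate_2006_tflops)

# Compound annual growth implied by the aggregate figures.
annual_growth = aggregate_growth ** (1.0 / YEARS)

print(f"Top system growth:  about {top_growth:.0f}x")
print(f"Aggregate growth:   about {aggregate_growth:.0f}x")
print(f"Implied yearly gain: about {100 * (annual_growth - 1):.0f}%")
```

On these numbers, the top weather machine grew by a factor of several hundred in a decade, and the aggregate by more than two hundred.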

The weather prediction machinery has changed as well. In 2006, the most popular forecasting supercomputers were IBM’s eServer pSeries clusters, with a scattering of other machines from SGI, Cray, and Hitachi. The move from the IBM Power-based supercomputers to today’s Cray systems powered by Intel Xeon processors parallels the industry-wide shift from RISC CPUs like Power to x86 processors over that time period.

In the future, we are likely to see many of these systems employing manycore processors: GPUs, Xeon Phi processors, and perhaps some of the high core-count chips being developed in China. The economic and societal motivations to get increasingly accurate forecasts will inevitably drive the architectures down this path. The other likely development is the integration of machine learning-based forecasting with the more traditional numerical simulations. That will impact not only the hardware, but also the ways in which agencies gather and dispense data. But that’s a story for another day.