ORNL Begins Construction of Summit Supercomputer

Oak Ridge National Laboratory has begun to install Summit, the IBM-NVIDIA-powered system that is likely to become the most powerful supercomputer in the world when completed.

The news comes courtesy of Oak Ridge Today, which reported that the first cabinets for Summit arrived last Monday (July 31). According to ORNL spokesperson Morgan McCorkle, once the crates are unpacked, staff will begin installing the internal computational and networking components and hooking them into the power and cooling infrastructure at the Oak Ridge Leadership Computing Facility (OLCF).

Installation is expected to take six months or more, with the system expected to become generally available to scientific users by January 2019. However, select application developers at the Department of Energy and a handful of universities will get a crack at it well before that. McCorkle told TOP500 News that the pre-production Summit will be available via the Center for Accelerated Application Readiness, an early-access program designed to allow developers to port and optimize grand challenge codes for Summit’s new CPU-GPU architecture.

All of that suggests that the system may not be up and running until well into 2018, and will not turn up in the TOP500 list until next June. At that point, absent another surprise from China, it still has an excellent chance of unseating the current supercomputing champ, TaihuLight. That system has a peak performance of 125.4 petaflops and a Linpack result of 93 petaflops. Later this year, China is expected to deploy Tianhe-2a, a supercomputer projected to deliver around 100 petaflops, although, as we reported back in January, that number could rise in concert with China’s ambition to own the number one spot.

Officially, Summit is expected to be 5 to 10 times as powerful as Titan, ORNL’s current top system. Titan is currently ranked as number four on the TOP500, with a Linpack mark of 17.6 petaflops (from 27.1 peak petaflops). Given that Summit will be composed of approximately 4,600 nodes, each containing six 7.5-teraflop NVIDIA V100 GPUs and two IBM Power9 CPUs, its aggregate peak performance should be well north of 200 petaflops. The GPUs alone provide this level of performance.
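The peak estimate above is simple multiplication, and a short Python sketch makes it easy to check. It uses only the figures quoted in this article (the node count is approximate, so the result is too) and ignores whatever the Power9 CPUs contribute. The function and variable names are illustrative only.

```python
# Back-of-the-envelope check of the GPU peak figure quoted in the article.
# All inputs are the article's numbers; the node count is approximate.
NODES = 4_600            # approximate Summit node count
GPUS_PER_NODE = 6        # V100 GPUs per node
TFLOPS_PER_GPU = 7.5     # peak teraflops per V100, as quoted
TITAN_PEAK_PF = 27.1     # Titan's peak petaflops, as quoted


def gpu_peak_petaflops(nodes, gpus_per_node, tflops_per_gpu):
    """Aggregate GPU peak in petaflops (1 petaflop = 1,000 teraflops)."""
    return nodes * gpus_per_node * tflops_per_gpu / 1_000


summit_gpu_peak_pf = gpu_peak_petaflops(NODES, GPUS_PER_NODE, TFLOPS_PER_GPU)
ratio_to_titan = summit_gpu_peak_pf / TITAN_PEAK_PF

print(f"Summit GPU peak: {summit_gpu_peak_pf:.1f} petaflops")
print(f"Ratio to Titan peak: {ratio_to_titan:.1f}x")
```

Under these assumptions the GPUs come to about 207 petaflops, a bit under eight times Titan's peak, which sits inside the official 5-to-10x range.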

Another possibility is that ORNL will run Linpack on a partially completed Summit in October or November, which at that point may be large enough to recapture the top supercomputing spot for the US. A possible glitch is that IBM has not officially launched its Power9 processor, and is not expected to do so until early 2018. But some number of chips will certainly be available before that, and, in fact, it’s unlikely that IBM would be shipping crates of Power9 servers to Oak Ridge without their CPUs.

Regardless of who is at the top of the supercomputing heap, Summit will be a unique resource for the DOE and its research community. Besides providing unprecedented amounts of computational capacity for traditional HPC applications, it will offer the largest platform in the world for deep learning workloads. Assuming the system is configured as advertised, it will deliver something like 3.3 exaflops of deep learning performance (mixed 16/32-bit precision math). That’s thanks to the Tensor Cores in the V100 GPUs, which were specifically bred to accelerate the type of matrix operations involved in this kind of software. As a result, Summit will be an exceptional resource for testing the limits of the neural networks used for deep learning.
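The deep learning figure can be sanity-checked the same way. The sketch below works backward from the article's 3.3 exaflops to the per-GPU rate it implies, then compares that with an assumed 120 teraflops of peak Tensor Core throughput per V100. That 120-teraflop value is NVIDIA's launch-time figure and does not appear in this article, so treat it as an outside assumption. Names are illustrative.

```python
# Rough check of the "3.3 exaflops of deep learning performance" estimate.
NODES = 4_600                  # approximate node count, from the article
GPUS_PER_NODE = 6              # V100 GPUs per node, from the article
DL_EXAFLOPS_QUOTED = 3.3       # mixed-precision figure quoted in the article

# ASSUMPTION (not from the article): NVIDIA's launch-time peak
# Tensor Core throughput per V100, in teraflops.
ASSUMED_TENSOR_TFLOPS_PER_GPU = 120

total_gpus = NODES * GPUS_PER_NODE

# Per-GPU rate implied by the article's total (1 exaflop = 1,000,000 teraflops).
implied_tflops_per_gpu = DL_EXAFLOPS_QUOTED * 1_000_000 / total_gpus

# Total implied by the assumed per-GPU Tensor Core peak.
estimated_dl_exaflops = total_gpus * ASSUMED_TENSOR_TFLOPS_PER_GPU / 1_000_000

print(f"GPUs in system: {total_gpus}")
print(f"Implied per-GPU rate: {implied_tflops_per_gpu:.1f} teraflops")
print(f"Estimate at assumed peak: {estimated_dl_exaflops:.2f} exaflops")
```

The implied per-GPU rate lands just under 120 teraflops, so the quoted 3.3 exaflops is consistent with every GPU running at its assumed Tensor Core peak. It is a theoretical ceiling, not a measured result.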

Summit is also the last stop on the way to exascale, at least for the gang at Oak Ridge. Given the cadence of supercomputer upgrades at the DOE, the next big deployment at ORNL will almost certainly be an exascale machine – perhaps the first in the US. Whether that turns out to be a future implementation of Summit’s CPU-GPU architecture, or something else entirely, remains to be seen.

Image source: Oak Ridge Leadership Computing Facility (OLCF), distributed under Creative Commons license


Baajie Zhuu 1 year, 11 months ago

The Chinese are apparently taking a very different approach, at least commercially.

With the underhanded ban on exports of high-end chips to China, the Shenway SW-26010 chip was brought to the fore at least 2 years early, and surprised everyone with its performance. With the SW-52010 (512 computing cores) well on its way to deployment, the National Supercomputing Center (China) announced this month (August 2017) that they will start delivering commercial versions of SW-26010-based supermicros. The entry unit has 2 of these AI-tuned multi-threaded chips, delivering a respectable 6 teraflops (about the performance of an American supercomputer in 2000). And there is no upper limit: you can theoretically order a custom configuration all the way up to a TaihuLight-sized machine (it WILL cost you a bundle though).

The trend is clear: the Chinese are again executing a China Price + China Speed strategy to become a big supercomputing player, trading supercomputing cycles to attract research to China. The offer is generous: low-cost, low-energy supercomputer cycles, plus research staff at one third the cost in the West.
