The introduction of NVIDIA’s Volta GPU architecture is being keenly anticipated by the supercomputing community. As we reported last July, when rumors of an earlier-than-anticipated Volta release were bouncing around the internet, a 2017 launch of the next-generation Tesla GPUs seemed all but certain. The latest speculation is that these first Volta parts will be based on a new 12nm FinFET process recently devised by the Taiwan Semiconductor Manufacturing Company (TSMC).
Before we go any further, it should be noted that the rumor of a shrink to 12nm is rather thinly sourced, resting on a post in a Beyond3D forum, although the gang at WCCFtech has been speculating about a 12nm Volta since at least October of last year. For what it’s worth, TSMC’s 12nm technology has been characterized as a refinement of its 16nm process, the same one used to manufacture NVIDIA’s current Pascal GPUs. TSMC’s next major process node is 10nm, which is supposed to be ready early this year, at least for less muscular chips than high-end GPUs.
The 10nm technology would certainly have been the process node NVIDIA wished to use for its upcoming Voltas, inasmuch as TSMC is promising a 20 percent performance improvement for that technology compared to its 16nm process. The 12nm node would probably achieve only about half of that. In any case, one would assume that future renditions of Volta products will be manufactured on 10nm technology, or perhaps even 7nm if the architecture has an extended lifetime.
Of course, Volta isn’t all about silicon shrinkage. The new architecture is also in line for a reworked design with regard to its streaming multiprocessor (SM), the computational engine that powers all NVIDIA’s GPUs. The SM update is supposed to deliver better performance and power efficiency than its Pascal predecessor, irrespective of transistor size. And according to at least one report, the design difference between Pascal and Volta will be much more significant than the one between Maxwell and Pascal.
Admittedly, this is all speculation. The reason we feel the need to discuss this now is that Volta will be the architecture NVIDIA relies on for the next two years to fend off the upcoming manycore competition from both Intel and AMD. In the case of Intel, the Volta Tesla GPU will be the rival to the future “Knights Hill” processor, which should be ready to go by 2018, as well as the deep learning-optimized “Knights Mill” Xeon Phi processor, which is supposed to be available later this year. Meanwhile, AMD will be fielding its Vega GPUs throughout 2017, which will include the new Radeon Instinct line for deep learning, and most likely an upgraded FirePro GPGPU as well. Unsurprisingly, both Intel and AMD will be manufacturing these chips with smaller transistors to help them in the performance department.
A more immediate concern is that the Volta GPU is going to be the computational heart of Summit and Sierra, two of the upcoming pre-exascale supercomputers that the US Department of Energy is deploying under the agency’s CORAL (Collaboration of Oak Ridge, Argonne, and Lawrence Livermore) program. As far as we know, both systems are on track to be installed before the end of 2017 and to go into production in early 2018. Although the host processor of these two systems is the Power9 CPU, about 90 percent of their floating point performance is likely to be provided by the Volta coprocessors. Therefore, the performance of Summit and Sierra will be determined primarily by the capabilities of the Volta silicon.
The Summit supercomputer, in particular, will receive a great deal of scrutiny, since it is expected to deliver somewhere between 150 and 300 peak petaflops of performance when deployed at Oak Ridge National Lab toward the end of the year. That could be enough for the US to recapture the number one spot on the TOP500 list for the first time since 2012, assuming it gets installed in time for a Linpack run before November 2017. That also assumes that China doesn’t come up with a system even bigger than Summit in the interim. As we reported last week, the Tianhe-2A system is now overdue for its deployment, and is likely to be installed in 2017 with a peak capacity well north of 100 petaflops.
Much more of the Volta story should unfold in early May, during NVIDIA’s GPU Technology Conference (GTC), where the new architecture is expected to be introduced. We might even get a peek at what comes after Volta at GTC. But let’s not get ahead of ourselves.