AMD Makes EPYC Return to Cray Supercomputers

April 18, 2018

By: Michael Feldman

For the first time in five years, Cray will be offering supercomputers equipped with AMD chips. On Wednesday the company announced it has added support for EPYC 7000 processors in its CS500 product line.



Up until today, you could only order a CS500 with one of three compute blades using either Intel Xeon processors, Intel Xeon Phi processors, or a combo of Intel Xeon and NVIDIA Tesla processors. Now customers will be able to purchase a fourth option with the AMD EPYC 7000 CPUs. The last time you could buy AMD-based HPC machinery from Cray was when the company was selling Opteron-powered XE6 and XK7 supercomputers back in 2013.

Today, the CS500 is Cray’s flagship HPC cluster platform, offering x86-based rack-mounted servers using industry-standard hardware. Unlike the XC line, which uses the proprietary Aries network, the CS500 employs Mellanox InfiniBand or Intel Omni-Path as the system interconnect. For all practical purposes, that limits the system size to something over 11 thousand nodes and about 40 petaflops.

Cray’s new EPYC server will be housed in a 2U chassis comprising four dual-socket nodes. Each node supports two PCIe Gen3 x16 slots, which can yield 200 Gbps worth of InfiniBand or Omni-Path if a dual-rail option is desired. Local SATA disk and SSD attachments are supported as well.
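The 200 Gbps figure can be sanity-checked with some quick arithmetic. The sketch below is only an illustration: it assumes each rail is a 100 Gbps adapter (the announcement does not name specific adapters) and uses the standard PCIe Gen3 signaling rate of 8 GT/s per lane with 128b/130b encoding. The function and variable names are made up for this example.

```python
# Rough check of the dual-rail figure, under the assumptions stated above.

PCIE_GEN3_GT_PER_LANE = 8.0        # giga-transfers per second, per lane
PCIE_GEN3_ENCODING = 128 / 130     # 128b/130b line-encoding efficiency


def pcie_gen3_gbps(lanes: int) -> float:
    """Raw one-direction PCIe Gen3 bandwidth in Gbps for a given lane count."""
    return lanes * PCIE_GEN3_GT_PER_LANE * PCIE_GEN3_ENCODING


# Assumption: one 100 Gbps adapter per x16 slot.
RAIL_GBPS = 100

slot_gbps = pcie_gen3_gbps(16)     # roughly 126 Gbps per x16 slot
dual_rail_gbps = 2 * RAIL_GBPS     # two slots, one rail each

print(f"PCIe Gen3 x16 slot: {slot_gbps:.1f} Gbps")
print(f"Dual-rail fabric:   {dual_rail_gbps} Gbps")
```

Under these assumptions, a single x16 slot has enough headroom for one 100 Gbps rail but not two, which is why the dual-rail configuration needs both slots.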

The rationale for including an EPYC option in the product line was not explicitly spelled out in Cray’s announcement. Fred Kohout, the company’s senior vice president of products and chief marketing officer, issued a statement saying the addition was “emblematic of Cray’s commitment to the community to deliver a comprehensive line of high-density systems with an optimized programing environment to deliver the required performance and scalability.”

Of course, with that rather broad logic you could justify supporting practically any top-tier chip. In any case, Cray would probably rather not offend Intel by trumpeting any advantages of the AMD silicon, but when pressed a bit harder by TOP500 News, Chris Lindahl, Cray’s Product Marketing Director, told us that the new EPYC nodes would be “great for customers with applications that are memory bandwidth bound.” He said, for example, that they would be a good choice for users running computational fluid dynamics (CFD) codes.

One of the highlights of the EPYC 7000 chip, and one of the main reasons it can outrun a Xeon CPU on some HPC workloads, is its support for eight DDR4 channels per socket. (The latest Xeon Skylake processors only support six channels.) AMD says if all 16 DIMM slots on a dual-socket EPYC node are filled with 2666MHz memory, the theoretical memory bandwidth is 341 GB/second. According to the company, even with marginally slower 2400MHz memory, you can still achieve 307 GB/second. That’s a good deal faster than what a Xeon-based system can deliver.
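AMD’s figures follow from multiplying sockets, channels, transfer rate, and channel width. The sketch below reproduces them under a few assumptions that are not spelled out in the announcement: each DDR4 channel is 64 bits (8 bytes) wide, “2666MHz” means 2666 million transfers per second, and the Xeon comparison uses the same memory speed. The function name is made up for this example.

```python
# Theoretical peak memory bandwidth for a dual-socket node.
# Assumptions: 8-byte-wide DDR4 channels; memory "MHz" treated as MT/s.

def peak_bandwidth_gb_s(sockets: int,
                        channels_per_socket: int,
                        mega_transfers_per_s: int,
                        bytes_per_transfer: int = 8) -> float:
    """Return theoretical peak DRAM bandwidth in GB/s (decimal gigabytes)."""
    mb_per_s = (sockets * channels_per_socket
                * mega_transfers_per_s * bytes_per_transfer)
    return mb_per_s / 1000


epyc_2666 = peak_bandwidth_gb_s(2, 8, 2666)   # eight channels per socket
epyc_2400 = peak_bandwidth_gb_s(2, 8, 2400)
xeon_2666 = peak_bandwidth_gb_s(2, 6, 2666)   # six channels per socket (assumed same DIMM speed)

print(f"EPYC, DDR4-2666: {epyc_2666:.0f} GB/s")
print(f"EPYC, DDR4-2400: {epyc_2400:.0f} GB/s")
print(f"Xeon, DDR4-2666: {xeon_2666:.0f} GB/s")
print(f"EPYC/Xeon ratio: {epyc_2666 / xeon_2666:.2f}x")
```

At equal memory speed the ratio is simply eight channels to six, or about a one-third advantage on paper. These are peak numbers only; sustained bandwidth in a real application will be lower on both platforms.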

Of course, memory subsystem performance is more complicated than just adding up the peak channel speeds; things like cache behavior and memory controller logic also enter into the mix. Plus, the data access behavior of a real application muddies the waters even further. Nonetheless, the EPYC 7000 processors have tended to come out on top on a number of memory-bound HPC applications, including the aforementioned CFD category.

Last summer, HPC consultant Joshua Mora ran the well-known CFD code Fluent on two comparably equipped systems, one based on EPYC 7000 and the other on Xeon Skylake. Based on his video demonstration, which was posted by AMD, the EPYC system ran the CFD code about 78 percent faster than the Xeon system. Testing by others appears to confirm the advantage.

Not every HPC code, or even every memory-bound code, will behave the same way, but given the increasing prevalence of data-demanding applications across the HPC landscape, there are undoubtedly cases where these EPYC nodes will offer significantly better application performance than the competition. That said, Xeon Skylake processors still excel at pure floating point performance, especially when their AVX-512 vector processing capability is utilized effectively. And even integer performance tends to be faster on the high-clock-speed Xeon parts. All of which is fine. Cray customers with a combination of compute-bound and memory-bound workloads could presumably buy a mix of Xeon and EPYC servers matched to their particular application needs.

Apparently, there is no CS500 configuration as of yet that offers AMD CPUs with any sort of coprocessor, whether GPUs or FPGAs. The CS500 is not the platform of choice for acceleration, though. According to the spec sheet, the best you can do here is a single NVIDIA Kepler-era K40 GPU per node. Customers who want state-of-the-art acceleration in a cluster platform would naturally opt for the CS-Storm product.

Cray is not disclosing any customers at this point, although apparently they do exist. “We have a strong pipeline for this stage of a product introduction,” Lindahl told us. If that’s the case, one could imagine that an EPYC option for the company’s XC supercomputer platform may not be far off, although Cray didn’t offer any guidance regarding that possibility. We’ll see.

In the meantime, customers can purchase EPYC-powered CS500 clusters. They are expected to be generally available this summer.