By: Dairsie Latimer, Technical Advisor, Red Oak Consulting
Here’s a rapid-fire run-through of things we expect to see on the show floor and hear discussed – some in whispers – over the course of next week and beyond.
30 Years of SC
SC18 marks the 30th birthday of the preeminent HPC conference and gathering of those for whom high performance computing is the day job. I’m a relative latecomer to SC, having started with Reno in 2007, although I’ve been working in HPC since 2005. I’ve attended all but two since then, and it’s the conference everyone orients their professional calendar around. I know a few people who’ve been to every show so far, which shows remarkable stamina.
SC is always a whirlwind of meetings, punctuated by mad dashes between hotel briefing suites, but there are the occasional lulls where one can catch up with friends old and new over a beer or a coffee and chew the fat over the latest industry happenings.
You should also definitely stop by the Cray-1 supercomputer on display and grab a selfie. If you really want to see how close to the haemorrhaging edge of computer technology HPC traditionally is, then this is the perfect example.
Race to Exascale
There have been a range of pre-exascale machine announcements – and maybe a few new entries on the TOP500 list due this week – but probably the most interesting so far is the news that Cray has won the NERSC-9 (Perlmutter) bid with a combination of its Shasta next-generation supercomputing architecture and AMD EPYC CPUs allied with NVIDIA GPUs, the exact flavours of which will align with the expected delivery in late 2020. The all-flash scratch storage will be provided by a Cray ClusterStor system.
The system will also use Cray’s latest interconnect, called Slingshot, which is spiritually closer to Ethernet than anything else. Given the leg up that Aries gave the current XC range, Slingshot promises to be an interesting development, and I wouldn’t bet against it being one of the killer features of this generation of Cray machines.
Speaking of interconnects
With Slingshot being unveiled and shown on the exhibition floor, and with news that Mellanox may be for sale (Barclays Bank has been retained to advise) and Xilinx has emerged as the frontrunner to buy the firm, there will certainly be talk about the future of HPC interconnects. Mellanox has done a lot right over the last few years and certainly offers the next best interconnect on the block for HPC. Intel’s Omni-Path has had its share of detractors, and the roadmap seems to have suffered slippage along with everything else recently. While Bull’s BXI promised much, it has yet to be seen in the wild and has had a protracted arrival into the market.
It seems the gloves are most definitely off at AMD with the release of some notable headline features for its second generation of EPYC CPUs (Zen 2). AMD claims 4x the peak FP performance per socket over Zen 1 based CPUs, delivered by up to double the number of cores combined with double the throughput from the AVX units, coupled with a revised I/O subsystem (supporting PCIe 4.0 and lower-latency links between memory controllers, which will hopefully make up for greater memory contention between cores). If so, LINPACK performance should be somewhere close to Intel’s stopgap CPU announcement of Cascade Lake-AP. Speaking of which, it will be interesting to see what the actual TDPs on Cascade Lake-AP are, with figures north of 250W being thrown around; we may all need to sit down in a dark room when the pricing is announced.
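A quick back-of-envelope sketch shows where a "4x per socket" claim could come from: doubling the core count and doubling the SIMD throughput multiply together. The core counts, clocks and datapath widths below are illustrative assumptions for the sake of the arithmetic, not confirmed specifications of either part.

```python
# Sketch: peak FP64 FLOPS per socket from first principles.
# All figures below are illustrative assumptions, not vendor data.

def peak_fp64_gflops(cores, ghz, simd_width_bits, fma_units):
    """Peak FP64 GFLOPS = cores * clock * FP64 lanes * 2 (FMA = mul+add) * units."""
    lanes = simd_width_bits // 64          # FP64 lanes per SIMD unit
    return cores * ghz * lanes * 2 * fma_units

# Assumed Zen 1 style socket: 32 cores, 128-bit effective FP datapath
zen1 = peak_fp64_gflops(cores=32, ghz=2.0, simd_width_bits=128, fma_units=2)

# Assumed Zen 2 style socket: double the cores, double the AVX throughput
zen2 = peak_fp64_gflops(cores=64, ghz=2.0, simd_width_bits=256, fma_units=2)

print(zen1, zen2, zen2 / zen1)  # doubling both factors gives a 4x ratio
```

The point is simply that the headline multiplier is the product of the two doublings; the absolute numbers depend entirely on the clocks and unit counts assumed.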
We of course also have the non-x86 ISAs providing further CPU diversity, with Arm and IBM’s Power the current challengers. Each has its own set of advantages and disadvantages, but all are set to compete with Intel on price and performance per watt and to disrupt the current hegemony in HPC.
Continued convergence of HPC, data analytics and ML/AI
The continued convergence of HPC and data analytics and machine learning techniques has normalised the concept of heterogeneous computing to such an extent that it is no longer just the top 10 machines on the TOP500 list that routinely deploy large numbers of accelerators. We are now seeing the product maturity that comes from investing in a platform for close to ten years from NVIDIA, and they are certainly setting themselves up as the vendor to beat outside Intel and AMD.
There are so many AI/ML/DL accelerator companies coming out of stealth that I’ve almost lost track, and I’m sure a few will be making some interesting announcements around partnerships and perhaps even some deals at the show. While we shouldn’t forget that FPGAs, with their undoubted flexibility, will have a role to play in the Wild West that is ML/DL research, it’s still not clear if they will see widespread deployment in an HPC context – as opposed to an edge computing environment.
We’re still coming down from the peak of inflated expectations and starting the descent into the trough of disillusionment on Gartner’s hype cycle for all things AI, but real use cases are starting to be trialled on the more traditional HPC installations, as opposed to cloud vendors and retail companies. Certainly, several recent procurements and HPC announcements (think Post-K) have emphasised the fact that we are already deploying exaops of performance, in the reduced-precision sense. This is reflected in the fact that all the latest ISAs are adding support for reduced-precision arithmetic.
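The "exaops in the reduced-precision sense" framing is worth unpacking with a toy calculation: a machine whose double-precision peak sits in the hundreds of petaflops can cross the exaop line once you count half-precision or tensor operations. Both figures below are made-up round numbers for illustration, not any system's specifications.

```python
# Sketch: how a pre-exascale machine can already claim "exaops" at
# reduced precision. Figures are illustrative assumptions only.

fp64_peak_pflops = 200.0   # assumed FP64 peak, in petaflops
fp16_speedup = 8.0         # assumed FP16/tensor throughput multiplier

fp16_peak_pflops = fp64_peak_pflops * fp16_speedup
fp16_peak_exaops = fp16_peak_pflops / 1000.0
print(fp16_peak_exaops)    # 1.6 "exaops", while FP64 peak is 0.2 exaflops
```

Which is why the same system can honestly be described as both pre-exascale (FP64) and exaop-class (reduced precision).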
We’re also seeing a change in emphasis for storage as the I/O patterns for ML/DL start to impact system workloads. As always, benchmarking and balancing the budget for the greater good will take centre stage here. Storage, be it on-premises or in the cloud, is always a contentious topic. There are a few new entrants who may be lucky enough to ride the recent changes in NAND-based storage prices to some success against the established vendors.
IBM and Red Hat
This probably has a bigger impact on enterprise than on HPC, but given the mood music about how the tie-up will enhance IBM’s ability to deliver hybrid cloud, it will be interesting to see how this one plays out. It looks like a culturally incompatible deal, and one wonders how much institutional indigestion it will cause IBM – and whether the system shock to Red Hat will be terminal. Less charitable commentators might note that this is probably Ginni Rometty’s chance at legacy building, given IBM’s recent financial performance and older product set.
Cloud computing/hybrid cloud
OK, we’ve been saying this for a while, but it seems the message is finally out: prepare for the hybrid cloud environment today. If you don’t start soon, you will find yourself with ground to make up in the future. With recent announcements in the IaaS and HPCaaS space, cloud vendors are gradually closing the gap with on-premises HPC, both technically and on pricing, but even now it’s certainly not a one-size-fits-all argument.
Simply put, being able to take advantage of hybrid cloud by pushing suitable workloads into more elastic, on-demand infrastructure makes good business sense for many users of HPC. There will always be circumstances, workloads and use cases that mean retaining on-premises HPC capability is necessary, but the argument that cloud will continue to be unsuitable for a wide range of HPC tasks is starting to look a little thin.
Quantum computing
With the US ($1.25B) and EU (€1.0B) announcing large-scale investment, and China making similar commitments, the race is most definitely on to define the quantum computing future.
Although IBM, Intel, Google, Microsoft, D-Wave, and a handful of other proponents all have well-advanced programmes in developing practical quantum computing, there are still a number of barriers to wide-scale adoption that would enable this technology to displace more traditional computational methods. While quantum computing has been a coming technology for at least a decade, we probably have at least that long again before we see any dramatic challenges to the current computing paradigm.
For the uninitiated, there will be a range of talks and tutorials on quantum computing at the conference.
Above all …
Have fun at SC18! It’s certainly shaping up to be one of the most interesting yet. Here’s to another exciting 30 years!
About the author
Dairsie has a somewhat eclectic background, having worked in a variety of roles on supplier side and client side across the commercial and public sectors as both a consultant and a software engineer. Following an early career in computer graphics, micro-architecture design and full stack software development, he has over twelve years' specialist experience in the HPC sector, ranging from developing low-level libraries and software for novel computing architectures to porting complex HPC applications to a range of accelerators. Dairsie joined Red Oak Consulting @redoakHPC in 2010, bringing his wealth of experience to both the business and customers.