By: Michael Feldman
Cray is going to build what looks to be the world’s first ARM-based supercomputer. The system, known as “Isambard,” will be the basis of a new UK-based HPC service that will offer the machine as a platform to support scientific research and to evaluate ARM technologies for high performance computing. Installation of Isambard is scheduled to begin in March, and the system is expected to be up and running before the end of the year.
Prof Simon McIntosh-Smith, leader of the project and Professor of High Performance Computing at the University of Bristol, made a presentation about the upcoming system at the Mont-Blanc ARM event taking place at the Barcelona Supercomputing Centre (BSC) this week. “I think this is really exciting for a number of reasons,” McIntosh-Smith told TOP500 News. “It’s one of, if not the first serious, large(ish)-scale ARMv8 64-bit production machines. And it’s the first time Cray has explicitly announced an ARMv8 product meant for more than just prototyping.”
Whether this actually turns into a commercial offering remains to be seen. Today’s announcement in Barcelona did not coincide with a product announcement on the other side of the world at Cray headquarters. According to McIntosh-Smith though, Isambard will be based on Cray’s CS400 platform, an existing product the company offers to provide mid-sized x86/InfiniBand clusters. They are currently outfitted with Intel Xeons, along with optional Xeon Phi processors or NVIDIA GPUs as accelerators. If Cray intends to add an ARM CPU option to the CS400 product, that announcement has been deferred to another day.
Product or not, Isambard looks to be a formidable machine – probably on the order of tens of teraflops. Isambard will include over 10,000 64-bit ARMv8 cores, in addition to a smattering of x86 CPUs, Intel Knights Landing Xeon Phi processors, and NVIDIA P100 GPUs. The project’s rationale for this architectural diversity is to compare application performance across a range of processors on the same machine. From Cray’s perspective, such diversity fits neatly into its vision of a heterogeneous computing future. “Scientists have a growing choice of potential computer architectures to choose from, including new 64-bit ARM CPUs, graphics processors, and manycore CPUs from Intel,” said McIntosh-Smith.
It was not revealed what type of ARM processor or SoC would be used, but given that Cray was working on Cavium-based HPC systems as far back as 2014, it’s a good bet that Isambard will be outfitted with the ThunderX or ThunderX2 chips. The latter is the second-generation ARM server SoC under development at Cavium, which is supposed to be generally available sometime this year. ThunderX2 also happens to be the same processor that the future Mont-Blanc prototype will be using. That system, which was revealed yesterday at the same event in Barcelona, will be constructed by Bull, and is intended to be used strictly as a proof-of-concept machine for the purpose of developing ARM-based exascale technology.
By contrast, Isambard will be a production machine, at least to some degree. It will be used to run a new national “Tier 2 service” by the Great Western 4 (GW4) consortium, which comprises universities in Bristol, Bath, Cardiff and Exeter. The consortium’s mission is to strengthen the regional economy via scientific research with industry partners. Procurement of the system is the result of a £3 million award from the Engineering and Physical Sciences Research Council (EPSRC), the UK’s principal agency for funding technology and engineering R&D in the public sector. An additional £1.7 million will be allocated to operate the system over its projected three-year lifetime.
The UK’s Met Office is also a partner in the effort, since it wants to evaluate Isambard’s ability to run its own weather and climate simulations. The rationale here is to see if these compute-heavy workloads can be supported on a more energy-efficient platform. These workloads are currently being run on its in-house 8-petaflop (peak) Cray XC40 supercomputer powered by x86-based Intel CPUs, specifically the 18-core Xeon E5-2695 v4 processors. Like many public agencies with petascale supercomputing infrastructure, the Met Office is looking to reduce the considerable cost involved in running and cooling these large beasts. Comparing Isambard with its Cray XC40 will be made easier by the fact that the Office will be hosting the ARM-based machine on behalf of the GW4 consortium.
The GW4 machine will also be compared against Archer, the UK’s primary academic research supercomputer run by EPSRC. Archer is a Cray XC30 system, again powered by Intel Xeon CPUs. However, unlike the Met Office supercomputer, Archer runs a wide variety of scientific workloads in areas such as CFD, materials science, molecular dynamics, quantum chemistry, and earth science.
“We chose about 10,000 cores [for Isambard] because most Archer science runs use no more than 8,000 to 9,000 cores,” explained McIntosh-Smith. “So with a system this size we’ll get a good idea how well ARMv8 could perform for real UK science jobs, and by extension, as the CPU in a future UK national HPC service.”
Testing the waters for a much wider deployment of the technology seems to be the real driver here. McIntosh-Smith thinks it will be interesting to see if and when more of these ARM-powered HPC systems start to show up. “I don’t expect a huge flood yet because most of the community has no hard data on how well HPC-optimized ARMv8 CPUs might perform for their codes,” he explained. “My intention is that Isambard can help start to address this lack of rigorous data. If Isambard performs well, it may be the start of more ARMv8 based systems being used in production.”