MareNostrum Supercomputer Adds Power9/V100 Racks

June 13, 2018

By: Michael Feldman

The Barcelona Supercomputing Center (BSC) has installed three new racks of IBM servers in its MareNostrum 4 supercomputer, increasing the capacity of the system by 1.5 petaflops.

The IBM servers are similar to the ones that power the recently launched Summit supercomputer at Oak Ridge National Lab (ORNL) in the US, which are also outfitted with Power9 CPUs and NVIDIA V100 GPUs. And like Summit, the Power9/V100 sub-cluster is expected to get a lot of machine learning and data analytics workloads thrown its way. Mateo Valero, director at Barcelona Supercomputing Center, says they have “expectations that IBM Power Systems will help BSC accelerate MareNostrum’s ability to advance research in personalized medicine, deep learning, and AI applications.”

The additional IBM servers are not quite as computationally dense as the ones in Summit, however, since they have only four V100 GPUs per node, compared to six in the ORNL supercomputer. In addition, the MareNostrum servers are equipped with 20-core Power9 CPUs, while Summit uses a 22-core version. For memory and storage, the three new racks are outfitted with 30.7 terabytes of DRAM, 207 terabytes of SSDs, and 345.6 terabytes of NVMe-attached flash. The sub-cluster is glued together with Mellanox EDR InfiniBand.

The extra Power9/V100 hardware adds nearly 1.5 petaflops to the fourth-generation MareNostrum, which will bring the machine up to 12.6 peak petaflops. The system’s original 11.15-petaflop general-purpose partition consists of 48 racks of Lenovo SD530 servers, powered exclusively by Intel “Skylake” Xeon CPUs. BSC expects the majority of its users will rely on this base partition for their production work.

When completed, MareNostrum is expected to deliver 13.7 peak petaflops. The additional petaflop or so is supposed to be split between an additional cluster equipped with Fujitsu’s upcoming vector-enhanced ARM SVE processor and another cluster based on Intel’s Knights Hill Xeon Phi. But since the chipmaker abandoned Knights Hill, it’s uncertain if BSC will wait for whatever processor Intel comes up with for the DOE’s 2021 Aurora exascale supercomputer or just double up on the ARM SVE partition.
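For readers who want to check the arithmetic behind these figures, here is a rough tally in Python. It is only a sketch: the variable names are illustrative, and it assumes the "nearly 1.5" petaflops of the new racks can be rounded up to 1.5, so the results are approximate.

```python
# Peak petaflops figures quoted in the article.
SKYLAKE_PARTITION = 11.15  # general-purpose Lenovo SD530 / Intel Xeon partition
POWER9_V100_RACKS = 1.5    # assumption: "nearly 1.5" rounded up to 1.5
PLANNED_TOTAL = 13.7       # expected peak once every cluster is installed

# Peak of the machine after the three IBM racks are added.
current_peak = SKYLAKE_PARTITION + POWER9_V100_RACKS

# Capacity still to come from the two remaining clusters.
remaining = PLANNED_TOTAL - current_peak

print(f"current peak: about {current_peak:.2f} petaflops")
print(f"still to be installed: about {remaining:.2f} petaflops")
```

Under that rounding assumption, the current peak comes out a shade above the quoted 12.6 petaflops, and the remainder is roughly one petaflop, in line with "the additional petaflop or so."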

The goal of all this computational diversity is to give BSC an idea of which technologies are most useful to its mission of supporting scientific research. More specifically, it will enable the center to “evaluate which applications will get a better cost/performance ratio in each architecture and its suitability for future iterations of MareNostrum.” BSC has not revealed when it plans to deploy its first exascale supercomputer, but given the timelines of the EU’s various exascale initiatives, MareNostrum 4’s successor is not likely to be up and running until 2022 or 2023, at the earliest.

Image: IBM Power9/V100 server in MareNostrum rack. Source: Barcelona Supercomputing Center.