Hitting 6.18 petaflops on the Linpack benchmark, the new system is Germany’s highest-performing supercomputer. On the latest TOP500 list announced last month, it sits at number 23, between the MareNostrum supercomputer at the Barcelona Supercomputing Center and the Pleiades cluster at NASA’s Ames Research Center.
JUWELS, which stands for Jülich Wizard for European Leadership Science, will take the reins from JUQUEEN, the Jülich machine that was retired in May. However, unlike JUQUEEN, which was a monolithic architecture based on IBM Blue Gene/Q technology, JUWELS uses a novel multi-module design that was developed under the EU-funded DEEP and DEEP-ER projects.
“A computer like JUWELS is not an off-the-shelf solution,” says JSC director Professor Thomas Lippert. “But as one of the largest German research centres, we are in a position to work together with our partners Atos from France and ParTec in Germany to develop the next generation of supercomputers ourselves. For us, modular supercomputing is the key to a forward-looking, affordable, and energy-efficient technology, which will facilitate the realization of forthcoming exascale systems.”
As of now, JUWELS consists only of the “Cluster” module, in this case an Atos Bull Sequana X1000 system powered by 24-core Xeon “Skylake” processors from Intel. The 2,575-node machine is hooked together with Mellanox EDR InfiniBand. Although the Intel processors provide the majority of the computational horsepower in JUWELS, there are 48 accelerated nodes, each with four NVIDIA V100 GPUs, as well as four visualization nodes, each with a single NVIDIA P100 GPU.
JSC says the entire system tops out at 12 peak petaflops. The TOP500 entry reflects only 9.9 peak petaflops, which suggests the GPU nodes and perhaps the large memory nodes were omitted from the Linpack run. Nonetheless, JUWELS has about twice the theoretical performance of JUQUEEN, but draws only 60 percent of its predecessor’s power.
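The gap between the 12-petaflop system peak and the 9.9-petaflop TOP500 figure can be roughly sanity-checked from the node count. A minimal back-of-the-envelope sketch, assuming dual-socket nodes, a 2.7 GHz nominal clock, and 32 double-precision flops per core per cycle (two AVX-512 FMA units) — assumptions not stated in the article:

```python
# Rough theoretical-peak estimate for the JUWELS Cluster module.
# Assumptions (not from the article): dual-socket nodes, 2.7 GHz clock,
# 32 double-precision flops per core per cycle (two AVX-512 FMA units).
NODES = 2575
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 24
FLOPS_PER_CORE_PER_CYCLE = 32
CLOCK_HZ = 2.7e9

peak_flops = (NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
              * FLOPS_PER_CORE_PER_CYCLE * CLOCK_HZ)
peak_petaflops = peak_flops / 1e15
print(f"{peak_petaflops:.1f} petaflops")  # ~10.7 PF under these assumptions
```

Under those assumed clock and vector-width figures, the CPU nodes alone land in the same ballpark as the 9.9-petaflop TOP500 entry, consistent with the GPU nodes having been left out of the run.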
The next module, known as the “Booster,” is scheduled to be installed next year, adding a few more petaflops to the total. More important than the flop count, the Booster is designed for massively parallel algorithms that run more efficiently on a manycore platform. As such, it is intended to complement the Cluster module, which is suitable for more run-of-the-mill HPC codes that rely on a conventional scaled-out architecture.
At one point, the Booster module seemed destined to employ the “Knights Hill” Xeon Phi as the basis for its computational power, but since Intel has dumped that processor from its roadmap, JUWELS will have to turn to other technology. As of now, the only description JSC is supplying for this module is that it will “comprise a large number of ultra-energy-efficient processor cores connected to each other via a particularly powerful and fast network.”
The software glue for the modules is being supplied by Munich-based ParTec, whose ParaStation tool will guide the application code to the most appropriate hardware. “The Booster works a bit like a turbocharger,” explains ParTec CEO Bernhard Frohwitter. “Complex parts of the code that cannot be processed efficiently on a large number of processors are executed on the Cluster. Program parts that can be efficiently processed in parallel can be exported dynamically with our ParaStation software to the Booster module.”
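Frohwitter’s turbocharger analogy can be illustrated with a toy dispatcher. This is a conceptual sketch only, not ParaStation’s actual interface; the `Task` type and `scales_well` flag are hypothetical:

```python
# Toy illustration of the Cluster/Booster split described above.
# Conceptual sketch only -- not ParaStation's actual API.
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    scales_well: bool  # can it use many slow cores efficiently?

def dispatch(tasks):
    """Route each task to the module it suits best."""
    placement = {}
    for t in tasks:
        # Massively parallel parts go to the manycore Booster;
        # complex, hard-to-parallelize parts stay on the Cluster.
        placement[t.name] = "booster" if t.scales_well else "cluster"
    return placement

jobs = [Task("mesh_setup", False), Task("stencil_sweep", True)]
print(dispatch(jobs))  # {'mesh_setup': 'cluster', 'stencil_sweep': 'booster'}
```

The real software makes this placement decision dynamically at runtime rather than from a static flag, but the division of labor is the same.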
Even in its unfinished state, the JUWELS cluster is already in high demand. According to JSC, 87 projects have already been allocated computing time on the machine, which is fully booked for the next few months. Targeted applications run the gamut of science and engineering domains, but with a particular focus on brain research. In fact, because of JSC’s leadership in this area, JUWELS is intended to be one of the key computational resources for the EU’s Human Brain Project for the next few years.
If you’re interested in more detailed information on JUWELS, it has its very own webpage here.