Jülich Preps for Five-Petaflop Supercomputer Booster

May 1, 2017

By: Michael Feldman

The Jülich Supercomputing Center is gearing up to deploy a 5-petaflop “Booster” system to augment its existing 1.7-petaflop JURECA cluster. The supercomputer combo will be the first Cluster-Booster platform modeled after the EU’s DEEP and DEEP-ER research projects.

“This will be the first-ever demonstration in a production environment of the Cluster-Booster concept, pioneered in DEEP and DEEP-ER at prototype-level, and a considerable step towards the implementation of JSC’s modular supercomputing concept,” explains Prof. Thomas Lippert, Director of the Jülich Supercomputing Centre.

The idea is to construct a system composed of different types of architectures, each of which is naturally suited to different HPC workloads. In this case, the JURECA cluster component is powered by typical high-end multicore Xeon processors, which are suited to codes that need fast cores but exhibit relatively little concurrency. The booster module, on the other hand, uses slower manycore processors, in this instance the Xeon Phi, which are useful for applications that can be parallelized more easily across a greater number of cores. To manage the different resources, system software is used to assign the application software to the optimal hardware platform.
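To make the trade-off concrete, here is a toy model based on Amdahl's law, using the per-node core counts and clock rates reported later in this article. It is only a sketch: it assumes per-core speed scales with clock rate alone, ignoring vector width, memory bandwidth, and interconnect effects. The names `runtime` and `preferred_module` are hypothetical, and nothing here reflects how the actual system software makes its placement decisions.

```python
def runtime(parallel_fraction, cores, clock_ghz):
    """Relative time for one unit of work under Amdahl's law.

    Assumption: per-core speed is proportional to clock rate only.
    """
    serial = (1.0 - parallel_fraction) / clock_ghz
    parallel = parallel_fraction / (cores * clock_ghz)
    return serial + parallel


# Per-node figures taken from the article.
CLUSTER_NODE = {"cores": 24, "clock_ghz": 2.5}  # two 12-core Xeon E5-2680 v3
BOOSTER_NODE = {"cores": 68, "clock_ghz": 1.4}  # one Xeon Phi 7250F


def preferred_module(parallel_fraction):
    """Return which node type finishes first in this toy model."""
    t_cluster = runtime(parallel_fraction, **CLUSTER_NODE)
    t_booster = runtime(parallel_fraction, **BOOSTER_NODE)
    return "cluster" if t_cluster <= t_booster else "booster"


for p in (0.50, 0.90, 0.99, 1.00):
    print(f"parallel fraction {p:.2f}: {preferred_module(p)}")
```

In this simplified model, the manycore node only pulls ahead once nearly all of the work can run in parallel, which matches the intuition that the Booster targets highly scalable codes while the cluster handles the rest.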

The existing JURECA cluster, which was built by T-Platforms, is a fairly conventional HPC system. It’s based on dual-socket servers powered by Intel Xeon E5-2680 v3 (12 cores, 2.5 GHz) processors; some of the servers are equipped with two NVIDIA K80 GPU coprocessors. The system interconnect is EDR InfiniBand. JURECA has been in operation since November 2015, patiently waiting for its other half to arrive.

The JURECA Booster will be built by Dell and will use the company’s PowerEdge C6320P servers. Each server will be powered by an Intel Xeon Phi 7250F (68 cores, 1.4 GHz) device, which includes the on-package Omni-Path interface -- the F designation indicates the integrated fabric. Including the fabric in the Xeon Phi package increases power draw by 15 watts compared to the non-fabric version. At just over 3 teraflops per Xeon Phi processor, the system will need around 1,640 of the devices to attain the desired five petaflops of peak performance.
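The device count follows from simple arithmetic, sketched below. The core count, clock rate, and five-petaflop target come from the article; the figure of 32 double-precision operations per core per cycle is an assumption added here (two 512-bit vector units, eight doubles each, with fused multiply-add counted as two operations). The function names are illustrative.

```python
import math

CORES = 68               # from the article
CLOCK_GHZ = 1.4          # from the article
FLOPS_PER_CYCLE = 32     # assumption: 2 vector units x 8 doubles x 2 (FMA)
TARGET_PFLOPS = 5.0      # from the article


def peak_tflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak of one processor, in teraflops."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


def devices_needed(target_pflops, per_device_tflops):
    """Smallest whole number of processors that reaches the target peak."""
    return math.ceil(target_pflops * 1000.0 / per_device_tflops)


per_chip = peak_tflops(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
count = devices_needed(TARGET_PFLOPS, per_chip)
print(f"{per_chip:.4f} teraflops per processor, {count} processors for the target")
```

Under these assumptions each processor peaks at roughly 3.05 teraflops, so 1,640 of them deliver just under five petaflops and a couple more would cross the line, consistent with the "around 1,640" estimate.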

The JURECA Booster will be connected to the JURECA cluster via a “novel high-speed bridging mechanism,” which will hook the cluster’s InfiniBand interconnect to the Booster’s Omni-Path fabric. To the application, the cluster and Booster will appear as a single system. ParTec’s ParaStation ClusterSuite will be used to distribute application software across the different components as needed.

JURECA is currently ranked number 69 on the TOP500 list. If Jülich decides to turn in a Linpack run on the Cluster-Booster combo, the system will likely move into the top 20. No deployment date was provided for the Booster addition.

JURECA cluster image: Forschungszentrum Jülich