New MareNostrum Supercomputer Reflects Processor Choices Confronting HPC Users

The fourth-generation MareNostrum supercomputer is up and running at the Barcelona Supercomputing Center (BSC), or at least the first phase of it is. When completed, it will contain the most interesting medley of processors of any supercomputer in existence. We asked Sergi Girona, Director of Operations at BSC, to describe the makeup of the new system and explain the rationale for building such a diverse machine.


Source: Barcelona Supercomputing Center


TOP500 News: Given that all previous MareNostrum systems were based on a single processor platform, what was the strategy behind making MareNostrum 4 a sort of showcase system that encompasses so many different chip technologies -- Intel Xeon, POWER9, NVIDIA GPUs, Fujitsu ARM, Intel Knights Landing and Knights Hill? In other words, what changed from the previous deployments?

Sergi Girona: First of all, the new MareNostrum 4 supercomputer will allow us to perform 13,400 trillion operations per second, meaning that it is 12 times more powerful than MareNostrum 3. Of this capacity, 11.15 petaflops – those of the general computing cluster – will be used to expand BSC’s day-to-day work and undertake research in house. Most of this computing capacity will also be available to Spanish and European scientists through the peer-reviewed access networks, the Spanish Supercomputing Network and PRACE.
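The headline figures in this answer can be cross-checked with a little arithmetic. The Python sketch below is illustrative only: it converts 13,400 trillion operations per second into petaflops, computes the share contributed by the 11.15-petaflop general-purpose cluster, and backs out the MareNostrum 3 figure implied by the "12 times" claim. The function names are hypothetical, and the sketch assumes that "trillion" means 10^12 and that "12 times" is an exact ratio, neither of which the interview states.

```python
# Illustrative arithmetic only; function names are hypothetical.
TRILLION = 10**12   # assumption: short-scale trillion
PETA = 10**15       # 1 petaflop = 10^15 operations per second


def to_petaflops(ops_per_second: float) -> float:
    """Convert operations per second to petaflops."""
    return ops_per_second / PETA


def general_purpose_share(general_pf: float, total_pf: float) -> float:
    """Fraction of total capacity provided by the general-purpose cluster."""
    return general_pf / total_pf


def implied_predecessor_pf(total_pf: float, speedup: float) -> float:
    """Predecessor performance implied by an 'N times more powerful' claim."""
    return total_pf / speedup


if __name__ == "__main__":
    total_pf = to_petaflops(13_400 * TRILLION)
    print(f"Total: {total_pf:.2f} petaflops")
    print(f"General-purpose share: {general_purpose_share(11.15, total_pf):.1%}")
    print(f"Implied MareNostrum 3: {implied_predecessor_pf(total_pf, 12):.2f} petaflops")
```

Under these assumptions, the general-purpose cluster accounts for a little over four-fifths of the total, and the implied MareNostrum 3 figure comes out slightly above 1 petaflop.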

This general-purpose block consists of 48 racks with 3,456 nodes, 165,888 Intel Xeon V5 processor cores and a total memory of 390 terabytes, with 2 GB per core in most nodes and 8 GB per core in 216 nodes. One of the most interesting aspects of this computer is that, although it is 10 times more powerful than its predecessor, its energy consumption will only increase by around 30 percent, to roughly 1.3 MW.
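As a rough sanity check on these figures, the following Python sketch derives the cores per node, the total memory implied by the per-core figures, and the implied gain in performance per watt. It treats the 165,888 figure as a count of cores, assumes decimal units (1 TB = 1,000 GB), assumes every node has the same number of cores, and reads "10 times" and "30 percent" as exact ratios. None of these assumptions is stated in the interview, and the function names are hypothetical.

```python
# Figures quoted in the interview for the general-purpose block.
NODES = 3_456
CORES = 165_888               # assumption: this figure counts cores
HIGH_MEM_NODES = 216          # nodes with 8 GB per core
GB_PER_CORE_STANDARD = 2
GB_PER_CORE_HIGH = 8


def cores_per_node(cores: int = CORES, nodes: int = NODES) -> int:
    """Cores per node, assuming all nodes are identical."""
    return cores // nodes


def total_memory_tb() -> float:
    """Total memory in decimal terabytes implied by the per-core figures."""
    per_node = cores_per_node()
    standard_nodes = NODES - HIGH_MEM_NODES
    total_gb = per_node * (
        standard_nodes * GB_PER_CORE_STANDARD
        + HIGH_MEM_NODES * GB_PER_CORE_HIGH
    )
    return total_gb / 1_000


def perf_per_watt_gain(speedup: float = 10.0, power_increase: float = 0.30) -> float:
    """Ratio of new to old performance per watt."""
    return speedup / (1.0 + power_increase)


if __name__ == "__main__":
    print(f"Cores per node: {cores_per_node()}")
    print(f"Implied total memory: {total_memory_tb():.1f} TB")
    print(f"Performance-per-watt gain: {perf_per_watt_gain():.1f}x")
```

With these inputs the sketch gives 48 cores per node and about 394 TB of memory, in line with the quoted 390 terabytes, and an efficiency gain of a little under eightfold.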

The increased computing power provided by this cluster for BSC’s daily work will allow us to address strategic challenges for the center, such as those associated with personalized medicine. We have major projects in this field that aim to ensure that advances made through genomic research are transferred to patient treatments. This requires tremendous cooperation between different stakeholders in the fields of research, medicine and government, as well as significant investment in computational power and technologies adapted to these requirements.

By contrast, the block of emerging technologies in the supercomputer has a very different purpose: to evaluate technologies currently under development, find out which are the best options for future upgrades of the supercomputer, and so achieve the highest possible performance when the time comes.

TOP500 News: What is the timeline for getting these various sub-clusters deployed?

Girona: Due to the terms of the non-disclosure agreement, I am unable to reveal the exact dates for incorporating these emerging technologies. What I can say is that they will be incorporated into the MN4 architecture as soon as they are available. In the meantime, we are working with previous versions.

Once the emerging technology block has been installed in its final version, there will be a cluster consisting of IBM POWER9 and NVIDIA GPU processors, with computational power of over 1.5 petaflops. These processors and GPUs are the same as those that IBM and NVIDIA will use for the Summit and Sierra supercomputers, which the United States Department of Energy has ordered for the Oak Ridge and Lawrence Livermore national laboratories.

Another cluster will be made up of Intel Knights Hill processors and will have a computational power of more than 0.5 petaflops. These processors are the same as those to be used in the Theta and Aurora supercomputers, also contracted by the US Department of Energy, in this case for the Argonne National Laboratory.

The third cluster will be composed of 64-bit ARMv8 processors in a prototype machine whose computational power exceeds 0.5 petaflops. These are the same processors that will be used in Japan's leading-edge Post-K supercomputer.

None of these three technologies is available yet. However, following the specifications set out in the purchase agreement, we already have clusters built with previous versions of these technologies: one with POWER8 and NVIDIA GPUs, and another with Intel Knights Landing processors.

TOP500 News: Are there specific groups or researchers interested in using the specific architectures you are deploying, especially with regard to ARM, Knights Hill, and the Power-GPU platform?

Girona: There is no doubt that this wide range of latest-generation technologies is very attractive for researchers undertaking pioneering research in the field of computer architecture and computer science in general. This is the great thing about BSC having more than 350 researchers, who can experiment with this type of technology, influence the development of future technologies and study how to transfer applications to these new systems.

Computer design is experiencing a clear trend towards the creation of specific architectures for specific users, a trend which is growing increasingly common with respect to hardware-software co-design. It follows that knowing how different systems behave with different applications is of particular relevance when creating platforms which are faster, more efficient and more effective for every type of application and user.

Another strategic challenge for BSC is to ensure we play an important part within the European Union’s plans to create European computing technology. The ability to experiment with different cutting-edge technologies gives us a more comprehensive picture to help us take on this challenge.






Neil McNaughton 1 month, 3 weeks ago

"its energy consumption will only increase by around 30 percent, to roughly 1.3 MW per year"

should read ...

"its energy consumption will only increase by around 30 percent, to roughly 1.3 MW"



Dan Pope 1 month, 3 weeks ago

"13,400 billion operations per second" should surely read "13,400 trillion operations per second". Otherwise this supercomputer is only 8 times faster than my laptop's GPU. ;)


Michael Feldman 1 month, 3 weeks ago

Thanks for catching that! Text updated.

