Highlights - November 2016

Highlights from the Top 10

Sunway TaihuLight, a system developed by China’s National Research Center of Parallel Computer Engineering & Technology (NRCPC) and installed at the National Supercomputing Center in Wuxi, in China's Jiangsu province, maintains the lead as the No. 1 system with 93 petaflop/s (Pflop/s) on the Linpack benchmark.

Tianhe-2 (Milky Way-2), a system developed by China’s National University of Defense Technology (NUDT) and deployed at the National Supercomputer Center in Guangzhou, China, is now the No. 2 system with 33.86 Pflop/s on the Linpack benchmark. Tianhe-2 previously held the No. 1 spot on the TOP500 list for 3 years (6 lists).
Titan, a Cray XK7 system installed at the Department of Energy’s (DOE) Oak Ridge National Laboratory, is now the No. 3 system. It achieved 17.59 Pflop/s on the Linpack benchmark using 261,632 of its NVIDIA K20x accelerator cores.
Sequoia, an IBM BlueGene/Q system installed at DOE’s Lawrence Livermore National Laboratory, is now the No. 4 system. It was first delivered in 2011 and has achieved 17.17 Pflop/s on the Linpack benchmark using 1,572,864 cores.
Cori, a Cray XC40 supercomputer comprising 1,630 Intel Xeon "Haswell" processor nodes and 9,300 Intel Xeon Phi 7250 ("Knights Landing") nodes, is a new entry in the TOP10 at No. 5 with 14.01 Pflop/s on the Linpack benchmark using 622,336 cores.
Oakforest-PACS, a Fujitsu PRIMERGY CX1640 M1 system installed at the Joint Center for Advanced High Performance Computing in Japan and powered by Intel Xeon Phi 7250 nodes and Intel Omni-Path interconnect technology, is the second new system in the TOP10 at No. 6 with 13.55 Pflop/s on the Linpack benchmark using 558,144 cores.
Fujitsu’s K computer, installed at the RIKEN Advanced Institute for Computational Science (AICS) in Kobe, Japan, is the No. 7 system with 10.51 Pflop/s on the Linpack benchmark using 705,024 SPARC64 processing cores.
At No. 8 is Piz Daint, a Cray XC50 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland, and the most powerful system in Europe. Piz Daint maintains its position in the TOP10 after an upgrade from a Cray XC30. It achieved 9.77 Pflop/s on the Linpack benchmark using NVIDIA Tesla P100 accelerators. The system has a total of 206,720 cores.
Mira, a BlueGene/Q system installed at DOE’s Argonne National Laboratory, is No. 9 with 8.59 Pflop/s on the Linpack benchmark using 786,432 cores.
Trinity, a Cray XC40 system installed at DOE/NNSA/LANL/SNL, is now No. 10 with 8.1 Pflop/s and 301,056 cores.
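
As a rough illustration of how concentrated the list is, the Linpack figures quoted above can be summed and compared with the combined performance of all 500 systems (672.2 Pflop/s, as reported in the general highlights). This is only a sketch: the values are the rounded numbers from this page, and the names `top10_rmax` and `top10_share` are made up for the example.

```python
# Linpack performance in Pflop/s, as quoted (and rounded) in the list above.
top10_rmax = {
    "Sunway TaihuLight": 93.0,
    "Tianhe-2": 33.86,
    "Titan": 17.59,
    "Sequoia": 17.17,
    "Cori": 14.01,
    "Oakforest-PACS": 13.55,
    "K computer": 10.51,
    "Piz Daint": 9.77,
    "Mira": 8.59,
    "Trinity": 8.1,
}

# Combined performance of all 500 systems, from the general highlights.
LIST_TOTAL_PFLOPS = 672.2


def top10_share(rmax, list_total):
    """Return (combined Pflop/s, fraction of the whole list)."""
    combined = sum(rmax.values())
    return combined, combined / list_total


combined, share = top10_share(top10_rmax, LIST_TOTAL_PFLOPS)
print(f"Top 10 combined: {combined:.2f} Pflop/s ({share:.1%} of the list)")
```

With these rounded inputs, the ten systems add up to roughly a third of the installed performance of the entire list.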

Highlights from the Overall List

The number of systems installed in China increased to 171, compared to 168 on the last list. China now shares the No. 1 spot in number of installed systems with the USA after one year at the top spot.
China and the USA are neck-and-neck in the performance category, with the USA holding 33.9 percent of the overall installed performance and China second with 33.3 percent.
The number of systems installed in the USA made a slight recovery and is now at 171 systems, up from 165 in the previous list.
The overall list-by-list growth rate of performance continues to recover after historically low values over the past 4 years.
The growth of the average performance of all systems in the list has also slowed since 2013 and has dropped to about 55 percent per year.
There are 117 systems with performance greater than a Pflop/s on the list, up from 95 six months ago.
In the Top 10, the No. 2 system Tianhe-2, the No. 5 system Cori, and the No. 6 system Oakforest-PACS use Intel Xeon Phi processors to speed up their computational rate. The No. 3 system Titan and the No. 8 system Piz Daint use NVIDIA GPUs to accelerate computation.
A total of 86 systems on the list are using accelerator/co-processor technology, down from 93 in June 2016. Sixty (60) of these use NVIDIA chips, 21 use Intel Xeon Phi technology (as co-processors), one uses ATI Radeon, and one uses PEZY technology. Three systems use a combination of NVIDIA and Intel Xeon Phi accelerators/co-processors. Ten systems now use Xeon Phi as the main processing unit.
The average number of accelerator cores for these 86 systems is 80,277 cores/system.
Intel continues to provide the processors for the largest share (92.4 percent) of TOP500 systems.
Processors with six or more cores are used in 98.2 percent of the systems, eight or more cores in 88 percent, and ten or more cores in 72.2 percent.
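
The accelerator breakdown above can be cross-checked with a few lines of arithmetic. The sketch below assumes the five categories are mutually exclusive (which is how the counts add up to 86) and uses the rounded average of 80,277 accelerator cores per system, so the core total is only approximate. The variable names are illustrative.

```python
LIST_SIZE = 500           # systems on the TOP500 list
AVG_ACCEL_CORES = 80_277  # average accelerator cores per accelerated system

# Accelerated systems by technology, as quoted above.
accelerated_systems = {
    "NVIDIA": 60,
    "Intel Xeon Phi (co-processor)": 21,
    "ATI Radeon": 1,
    "PEZY": 1,
    "NVIDIA + Intel Xeon Phi": 3,
}

total_accelerated = sum(accelerated_systems.values())
share_of_list = total_accelerated / LIST_SIZE
approx_accel_cores = total_accelerated * AVG_ACCEL_CORES

print(f"Accelerated systems: {total_accelerated} ({share_of_list:.1%} of the list)")
print(f"Approximate accelerator cores in total: {approx_accel_cores:,}")
```

The categories sum to the reported 86 systems, or 17.2 percent of the list. The ten systems that use Xeon Phi as the main processing unit are counted separately in the text and are not part of this total.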

General highlights from the TOP500 since the 47th edition

The entry level to the list moved up to the 349.3 Tflop/s mark on the Linpack benchmark, compared to 285.9 Tflop/s six months ago.
The last system on the newest list would have been listed at position 388 in the previous TOP500. This represents a slight recovery when compared to the last list.
Total combined performance of all 500 systems has grown to 672.2 Pflop/s, compared to 566.8 Pflop/s six months ago and 420 Pflop/s one year ago. This increase in installed performance also exhibits a noticeable slowdown in growth compared to the previous long-term trend.
The entry point for the TOP100 increased in six months to 1.07 Pflop/s, up from 958 Tflop/s.
The average concurrency level in the TOP500 is 87,990 cores per system, up from 81,995 six months ago and 58,596 one year ago.
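
The six-month and one-year changes quoted above can be turned into growth rates with a small helper. This is a minimal sketch using the rounded figures from this page; `growth` and `annualized` are illustrative names, and compounding a single six-month step over a year is a simplification, not how the TOP500 authors derive their long-term trend.

```python
def growth(new, old):
    """Relative growth from old to new, as a fraction (0.25 means +25 percent)."""
    return new / old - 1.0


def annualized(six_month_growth):
    """Compound one six-month growth rate over two list editions (one year)."""
    return (1.0 + six_month_growth) ** 2 - 1.0


# Figures quoted in the bullets above.
total_6m = growth(672.2, 566.8)      # combined performance, Pflop/s
total_1y = growth(672.2, 420.0)      # combined performance vs. one year ago
entry_6m = growth(349.3, 285.9)      # entry level to the list, Tflop/s
top100_6m = growth(1070.0, 958.0)    # TOP100 entry point, in Tflop/s
cores_6m = growth(87_990, 81_995)    # average cores per system

print(f"Total performance, six months: {total_6m:+.1%}")
print(f"Total performance, one year:   {total_1y:+.1%}")
print(f"Entry level, six months:       {entry_6m:+.1%}")
print(f"TOP100 entry, six months:      {top100_6m:+.1%}")
print(f"Average cores, six months:     {cores_6m:+.1%}")
print(f"Six-month total, annualized:   {annualized(total_6m):+.0%}")
```

With these inputs, total performance grew by roughly 19 percent over six months and about 60 percent over the full year, so the most recent half-year was the slower of the two.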

Vendor Trends

A total of 462 systems (92.4 percent) are now using Intel processors, slightly up from 91 percent six months ago.
IBM Power processors are now used in 22 systems, down from 23 systems six months ago.
The AMD Opteron family is used in 7 systems, down from 13 systems on the previous list.
InfiniBand technology is now found on 187 systems, down from 205 systems, and is now the second most-used internal system interconnect technology. Gigabit Ethernet is now at 206 systems, down from 218 systems; its continued lead is in large part thanks to 177 systems now using 10G interfaces.
Intel Omni-Path technology, which made its first appearance six months ago with 8 systems, is now at 28 systems and is used in the No. 6 system, Oakforest-PACS.
HPE has the lead in systems and now has 112 systems (22.4 percent), down from 127 systems six months ago. In addition, HPE is picking up another 28 systems installed by the former SGI. HPE is followed by Lenovo with 92 systems. Cray now has 56 systems, down from 69 systems six months ago. IBM is now fifth in the systems category with 33 systems. No new IBM systems were introduced on this list.

Performance Trends

Cray continues to be the clear leader in the TOP500 list in performance and has a considerable lead with a 21.3 percent share of installed total performance (up from 19.9 percent).
Thanks to the Sunway TaihuLight system, NRCPC takes the second spot with 13.8 percent of the total performance.
HPE is third with 9.8 percent, down from 12.9 percent six months ago, but will pick up another 6 percent from SGI-based systems.
IBM and Lenovo are tied for fourth place with an 8.8 percent share each.
Thanks to Tianhe-2 and Tianhe-1A, NUDT contributes 5.8 percent of the total performance of the list, down from 9.2 percent.
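
To put the percentage shares above into absolute terms, they can be multiplied by the combined performance of the list (672.2 Pflop/s, from the general highlights). The sketch below uses the rounded shares from this page, so the results are approximate; `share_to_pflops` is an illustrative helper.

```python
# Combined performance of all 500 systems, from the general highlights.
LIST_TOTAL_PFLOPS = 672.2

# Vendor shares of installed performance in percent, as quoted above.
vendor_share_percent = {
    "Cray": 21.3,
    "NRCPC": 13.8,
    "HPE": 9.8,
    "IBM": 8.8,
    "Lenovo": 8.8,
    "NUDT": 5.8,
}


def share_to_pflops(percent, list_total):
    """Convert a percentage share of the list into Pflop/s."""
    return list_total * percent / 100.0


vendor_pflops = {
    vendor: share_to_pflops(pct, LIST_TOTAL_PFLOPS)
    for vendor, pct in vendor_share_percent.items()
}

for vendor, pflops in vendor_pflops.items():
    print(f"{vendor:<7} {pflops:6.1f} Pflop/s")
```

As a sanity check, the NRCPC share works out to about 92.8 Pflop/s, which matches the 93 Pflop/s reported for Sunway TaihuLight once the rounding of the share is taken into account.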

Geographical Observations

The U.S., the leading consumer of HPC systems since the inception of the TOP500 lists, is now tied with China at 171 of the 500 systems. The European share (105 systems, up from 104 in the last list) is now lower than the dominant Asian share of 213 systems, which is down from 219 in June 2016.
Dominant countries in Asia are China with 171 systems and Japan with 27 systems.
In Europe, Germany is the clear leader with 31 systems, followed by France with 20 and the UK with 13 systems.

Green500

The data collection and curation of the Green500 project has been integrated with the TOP500 project. This allows submissions of all data through a single webpage at http://top500.org/submit
The most energy-efficient system and #1 on the Green500 is DGX SATURNV, an NVIDIA DGX-1 system with NVIDIA Tesla P100 accelerators installed at NVIDIA, which achieves 9.46 GFlops/Watt.
#2 is the Piz Daint system at the Swiss National Supercomputing Centre (CSCS), a Cray XC50 with NVIDIA Tesla P100 accelerators, at 7.45 GFlops/Watt.
#3 is Shoubu, a PEZY Computing / Exascaler ZettaScaler-1.6 System at the Advanced Center for Computing and Communication, RIKEN, Japan at 6.67 GFlops/Watt.
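
The GFlops/Watt figures above relate Linpack performance to power draw. As a hedged illustration, the sketch below estimates the power implied for Piz Daint, the one system for which this page gives both numbers (9.77 Pflop/s and 7.45 GFlops/Watt). It assumes the efficiency figure was measured on the same Linpack run, which this page does not state; `implied_power_mw` is an illustrative name.

```python
def implied_power_mw(rmax_pflops, gflops_per_watt):
    """Estimate power draw in megawatts from performance and efficiency.

    Assumes the efficiency figure refers to the same Linpack run as rmax_pflops.
    """
    gflops = rmax_pflops * 1e6        # 1 Pflop/s = 1,000,000 Gflop/s
    watts = gflops / gflops_per_watt  # Gflop/s divided by Gflop/s per watt
    return watts / 1e6                # watts to megawatts


piz_daint_mw = implied_power_mw(9.77, 7.45)
print(f"Piz Daint implied power: {piz_daint_mw:.2f} MW")
```

Under that assumption, the result is a little over 1.3 MW.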

About the TOP500 List

The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be onto something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.

The TOP500 list is compiled by Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; Jack Dongarra of the University of Tennessee, Knoxville; and Martin Meuer of ISC Group, Germany.