Tianhe-2 (Milky Way-2), a system developed by China’s National University of Defense Technology (NUDT) and deployed at the National Supercomputer Center in Guangzhou, China, remains the No. 1 system with 33.86 petaflop/s (Pflop/s) on the Linpack benchmark. The system currently has 16,000 nodes, each with two Intel Xeon Ivy Bridge processors and three Xeon Phi processors, for a combined total of 3,120,000 computing cores. It features a number of Chinese-developed components, including the TH Express-2 interconnect network, front-end processors, operating system and software tools. Tianhe-2 uses the Kylin Linux operating system. Its power consumption while running Linpack was 17.8 MW.
Highlights from the Top 10
Titan, a Cray XK7 system installed at the Department of Energy’s (DOE) Oak Ridge National Laboratory, remains the No. 2 system. It achieved 17.59 Pflop/s on the Linpack benchmark using 261,632 of its NVIDIA K20x accelerator cores. Titan is one of the most energy-efficient systems on the list, consuming a total of 8.21 MW and delivering 2.143 Gflops/W.
Sequoia, an IBM BlueGene/Q system installed at DOE’s Lawrence Livermore National Laboratory, is again the No. 3 system. It was first delivered in 2011 and has achieved 17.17 Pflop/s on the Linpack benchmark using 1,572,864 cores.
Fujitsu’s K computer installed at the RIKEN Advanced Institute for Computational Science (AICS) in Kobe, Japan, is the No. 4 system with 10.51 Pflop/s on the Linpack benchmark using 705,024 SPARC64 processing cores.
Mira, a BlueGene/Q system installed at DOE’s Argonne National Laboratory, is No. 5 with 8.59 Pflop/s on the Linpack benchmark using 786,432 cores.
At No. 6 is Piz Daint, a Cray XC30 system installed at the Swiss National Supercomputing Centre (CSCS) in Lugano, Switzerland, and the most powerful system in Europe. Piz Daint achieved 6.27 Pflop/s on the Linpack benchmark using 73,808 NVIDIA K20x accelerator cores. Piz Daint is also the most energy-efficient system in the TOP10, consuming a total of 2.33 MW and delivering 2.7 Gflops/W.
Shaheen II, a Cray XC40 system installed at King Abdullah University of Science and Technology (KAUST), is the only new system in the TOP 10, at No. 7 with 5.536 Pflop/s on the Linpack benchmark using 196,608 Intel Xeon E5-2698v3 cores.
Stampede, a Dell PowerEdge C8220 system installed at the Texas Advanced Computing Center of the University of Texas, Austin, is now at No. 8. It also uses Intel Xeon Phi processors (previously known as MIC) to achieve its 5.17 Pflop/s.
The second system in Europe is at No. 9. It is also a BlueGene/Q system, called JUQUEEN, installed at the Forschungszentrum Juelich in Germany, and is listed with 5.01 Pflop/s.
No. 10 is taken by Vulcan, another IBM BlueGene/Q system at Lawrence Livermore National Laboratory. It was temporarily combined with the No. 3 system but is now operated independently. It achieved 4.29 Pflop/s.
The #7 system is the only Top 10 system installed in 2015. The #1 system is the only system installed in 2013. The remaining eight systems were installed in 2012 or 2011. This age of the population in the Top 10 is unprecedented.
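The energy-efficiency figures quoted above for Titan and Piz Daint follow directly from the Linpack performance and the power draw. The following is a minimal sketch of that arithmetic, using only numbers stated in the text; the function and variable names are illustrative assumptions, and small differences from the quoted values come from rounding of the published inputs.

```python
# Linpack performance (Pflop/s) and power draw (MW) as quoted in the text.
systems = {
    "Tianhe-2": (33.86, 17.8),
    "Titan": (17.59, 8.21),
    "Piz Daint": (6.27, 2.33),
}


def gflops_per_watt(pflops: float, megawatts: float) -> float:
    """Convert Pflop/s and MW to Gflops/W.

    1 Pflop/s = 1e6 Gflop/s and 1 MW = 1e6 W, so the scale factors cancel.
    """
    return (pflops * 1e6) / (megawatts * 1e6)


# Hypothetical helper table: efficiency per system, in Gflops/W.
efficiency = {name: gflops_per_watt(p, mw) for name, (p, mw) in systems.items()}

for name, value in efficiency.items():
    print(f"{name}: {value:.2f} Gflops/W")
```

By the same calculation, Tianhe-2 comes out at roughly 1.9 Gflops/W, which is lower than either of the two GPU-accelerated systems.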
Highlights from the Overall List
The overall list-by-list growth rate of performance has remained at historically low values for the last two years.
The performance of the last system on the list (#500) has systematically lagged behind historical trends for the last six years and is now clearly on a different growth trajectory than before. From 1994 to 2008 it grew by 90% per year. Since 2008 it has grown by only 55% per year.
The growth of the average performance of all systems in the list has slowed as well, but it has lagged behind historical averages only for the last two lists. This average is noticeably influenced by the very large systems at the top of the list. Installations of very large systems up to June 2013 counteracted the reduced growth rate at the bottom of the list. This offers an indication that the market for the very largest systems might behave differently from the market for mid-sized and smaller supercomputers.
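To put the two growth rates quoted for the #500 system in perspective, they can be converted into doubling times and into cumulative growth over a fixed period. This is a rough sketch that assumes constant compound annual growth, which is a simplification of the list data; the function names are illustrative.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed to double at a constant annual growth rate (0.90 = 90% per year)."""
    return math.log(2) / math.log(1 + annual_growth)


def growth_factor(annual_growth: float, years: float) -> float:
    """Cumulative growth factor after `years` of constant compound growth."""
    return (1 + annual_growth) ** years


# Annual growth rates of the #500 system as quoted in the text.
rates = {"1994-2008": 0.90, "since 2008": 0.55}

for period, rate in rates.items():
    print(
        f"{period}: doubles every {doubling_time_years(rate):.2f} years, "
        f"grows {growth_factor(rate, 6):.1f}x in six years"
    )
```

Under this assumption, performance doubles in about 1.1 years at 90% per year but takes about 1.6 years at 55% per year, and the gap compounds quickly: over six years the factor is roughly 47 versus roughly 14.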
There are 68 systems with performance greater than a Pflop/s on the list, up from 50 six months ago.
In the Top 10, the No. 1 system, Tianhe-2, and the No. 8 system, Stampede, use Intel Xeon Phi processors to speed up their computational rate. The No. 2 system, Titan, and the No. 6 system, Piz Daint, use NVIDIA GPUs to accelerate computation.
A total of 90 systems on the list are using accelerator/co-processor technology, up from 75 in November 2014. Fifty-two (52) of these use NVIDIA chips, four use ATI Radeon, and there are now 35 systems with Intel MIC technology (Xeon Phi). Four systems use a combination of NVIDIA and Intel Xeon Phi accelerators/co-processors.
The average number of accelerator cores for these 90 systems is 76,372 cores/system.
Intel continues to provide the processors for the largest share (86.6 percent) of TOP500 systems.
Ninety-eight percent of the systems use processors with six or more cores, 88.2 percent use eight or more cores, and 47.4 percent use ten or more cores.
IBM’s BlueGene/Q is still the most popular system in the TOP 10 with four entries including the No. 3, 5, 9 and 10 systems.
The number of systems installed in the USA stays close to previous lists at 233.
The number of systems installed in China has fallen to 37, compared to 61 on the last list. China is now at the No. 3 position as a user of HPC, after the U.S. and Japan. China, however, still holds the No. 2 position in performance share, thanks to the Tianhe-2 system.
General highlights from the TOP500 since the June 2014 edition
The entry level to the list moved up to the 165.1 Tflop/s mark on the Linpack benchmark, compared to 153.3 Tflop/s six months ago.
The last system on the newest list would have been listed at position 419 in the previous TOP500. This represents the second lowest turnover rate in the list in two decades and a slight recovery when compared to the last list.
Total combined performance of all 500 systems has grown to 363 Pflop/s, compared to 309 Pflop/s six months ago and 274 Pflop/s one year ago. This increase in installed performance also exhibits a noticeable slowdown in growth compared to the previous long-term trend.
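The slowdown is easier to see when the totals quoted above are turned into relative changes. The sketch below uses only the figures stated in the text; the helper name `percent_change` is an illustrative assumption.

```python
def percent_change(new: float, old: float) -> float:
    """Relative change from `old` to `new`, in percent."""
    return 100.0 * (new - old) / old


# Combined performance of all 500 systems, in Pflop/s (from the text).
total_now, total_six_months_ago, total_one_year_ago = 363.0, 309.0, 274.0
# Entry level to the list, in Tflop/s (from the text).
entry_now, entry_six_months_ago = 165.1, 153.3

total_half_year = percent_change(total_now, total_six_months_ago)
total_full_year = percent_change(total_now, total_one_year_ago)
entry_half_year = percent_change(entry_now, entry_six_months_ago)

print(f"Total performance, six months: {total_half_year:.1f}%")
print(f"Total performance, one year:   {total_full_year:.1f}%")
print(f"Entry level, six months:       {entry_half_year:.1f}%")
```

On these numbers, total installed performance grew by about 17.5 percent in six months and about 32.5 percent over the year, while the entry level rose by only about 7.7 percent in six months.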
The entry point for the TOP100 increased in six months to 715 Tflop/s from 496 Tflop/s.
The average concurrency level in the TOP500 is 50,495 cores per system, up from 46,288 six months ago and 43,301 one year ago.
A total of 431 systems (86.2 percent) are now using Intel processors, slightly up from 85.8 percent six months ago.
The share of IBM Power processors is stable at 38 systems (8 percent).
The AMD Opteron family is used in 22 systems (4.4 percent), down from 5.2 percent on the previous list.
InfiniBand technology is now found on 257 systems, up from 225 systems, and is the most-used internal system interconnect technology. Gigabit Ethernet has fallen to 147 systems, down from 187 systems, in large part thanks to 84 systems now using 10G interfaces.
IBM and Hewlett-Packard continue to sell the bulk of the systems at all performance levels of the TOP500.
Following its acquisition of IBM’s x86 business last year, Lenovo emerges as a new vendor with 3 new systems. Some systems that were previously listed as IBM are now labeled as either IBM/Lenovo (17 systems) or Lenovo/IBM (3 systems).
HP has the lead in systems and now has 178 systems (35.6 percent) compared to IBM with 111 systems (22.2 percent). HP had 179 systems six months ago, and IBM had 153 systems six months ago. In the system category, Cray remains third with 14.2 percent (71 systems).
Cray emerges in this list as the clear leader in performance, with a considerable lead and a 24 percent share of installed total performance (up from 18.2 percent).
IBM takes the second spot with 23 percent share, down from 26 percent six months ago.
Thanks to Tianhe-2 and Tianhe-1A, NUDT contributes 10.9 percent of the total performance of the list, down from 12.7 percent.
HP is third with 14.2 percent, down from 15.6 percent six months ago.
The U.S. is clearly the leading consumer of HPC systems with 233 of the 500 systems (231 in November 2014), although its share dropped close to its all-time low in the last list. The European share (141 systems, compared to 130 last time) has surpassed the Asian share (107 systems, down from 120 last time).
Dominant countries in Asia are Japan with 39 systems (up from 32) and China with 37 systems (down from 61).
In Europe, the UK, France, and Germany are in the same range with 31, 27, and 37 systems, respectively.
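The regional system counts above translate into shares of the list by dividing by the 500 entries. A minimal sketch using only the counts quoted in the text follows; the variable names are illustrative, and the "other" bucket is simply whatever the three named regions do not cover.

```python
TOTAL_SYSTEMS = 500

# System counts per region as quoted in the text.
region_counts = {"U.S.": 233, "Europe": 141, "Asia": 107}

# Share of the list per region, in percent.
shares = {region: 100 * count / TOTAL_SYSTEMS for region, count in region_counts.items()}

# Systems not accounted for by the three regions named in the text.
other_systems = TOTAL_SYSTEMS - sum(region_counts.values())

for region, share in shares.items():
    print(f"{region}: {region_counts[region]} systems, {share:.1f}% of the list")
print(f"Other regions: {other_systems} systems")
```

This puts the U.S. at 46.6 percent of the list, Europe at 28.2 percent and Asia at 21.4 percent, with 19 systems installed elsewhere.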
About the TOP500 List
The first version of what became today’s TOP500 list started as an exercise for a small conference in Germany in June 1993. Out of curiosity, the authors decided to revisit the list in November 1993 to see how things had changed. About that time they realized they might be on to something and decided to continue compiling the list, which is now a much-anticipated, much-watched and much-debated twice-yearly event.
The TOP500 list is compiled by Erich Strohmaier and Horst Simon of Lawrence Berkeley National Laboratory; Jack Dongarra of the University of Tennessee, Knoxville; and Martin Meuer of ISC Group, Germany.