Highlights - June 2009

TOP Highlights

  • The Roadrunner system, which broke the petaflop/s barrier one year ago, held on to its No. 1 spot. It is still one of the most energy-efficient systems on the TOP500.

  • The most powerful system outside the U.S. is an IBM BlueGene/P system at the German Forschungszentrum Juelich (FZJ) at No. 3.  FZJ has a second system in the TOP10, a mixed Bull and Sun system at No. 10.

  • Intel dominates the high-end processor market with 79.8 percent of all systems and 87.7 percent of quad-core based systems.

  • Intel Core i7 (Nehalem) makes its first appearance in the list with 33 systems.

  • Quad-core processors are used in 76.6 percent of the systems. Their use accelerates performance growth at all levels.

  • Other notable systems are:

    • An IBM BlueGene/P system at the King Abdullah University of Science and Technology (KAUST) in Saudi Arabia at No. 14.
    • The Chinese-built Dawning 5000A at the Shanghai Supercomputer Center at No. 15. It is the largest system that can be operated with Windows HPC 2008.
  • Hewlett-Packard kept a narrow lead over IBM in market share by total systems, but IBM stays ahead in overall installed performance.

  • Cray’s XT system series is very popular with big customers, with 10 systems in the TOP50 (20 percent).

Power consumption of supercomputers

  • TOP500 now tracks the actual power consumption of supercomputers in a consistent fashion.

  • The most energy-efficient supercomputers are based on:

    • IBM QS22 Cell processor blades (up to 536 Mflop/Watt),
    • A GRAPE-DR custom accelerator system (429 Mflop/Watt),
    • IBM BlueGene/P systems (up to 372 Mflop/Watt)
  • Intel quad-core blades are catching up fast:

    • Nehalem-based systems (up to 273 Mflop/Watt),
    • Harpertown-based systems (up to 265 Mflop/Watt).
  • Cray’s XT4/5 systems with AMD quad-cores are in the same region (up to 232 Mflop/Watt).

  • These quad-core systems are already ahead of BlueGene/L (up to 210 Mflop/Watt).

  • Average power consumption of a TOP10 system is 2.45 MWatt (unchanged), and average power efficiency is 280 Mflop/Watt, up from 228 Mflop/Watt six months ago.

  • Only 17 systems on the list are confirmed to use more than 1 MWatt of power.

  • Average power consumption of a TOP500 system is 386 kWatt, and average power efficiency is 150 Mflop/Watt.
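
To put the efficiency figures above in perspective, the following minimal sketch converts each quoted Mflop/Watt value into the power that would be needed to sustain 1 Pflop/s on Linpack at that efficiency. The function name `implied_power_mw` and the 1 Pflop/s target are illustrative assumptions, not part of the TOP500 methodology, and the "up to" figures are best-case values for each system family.

```python
# Efficiency figures (Mflop/s per Watt) as quoted in the list above.
EFFICIENCY_MFLOPS_PER_WATT = {
    "IBM QS22 Cell blades": 536,
    "GRAPE-DR": 429,
    "IBM BlueGene/P": 372,
    "Intel Nehalem": 273,
    "Intel Harpertown": 265,
    "Cray XT4/5 (AMD quad-core)": 232,
    "IBM BlueGene/L": 210,
    "TOP500 average": 150,
}


def implied_power_mw(target_tflops: float, mflops_per_watt: float) -> float:
    """Power in MW needed to sustain target_tflops at the given efficiency.

    Hypothetical helper for illustration only.
    """
    watts = target_tflops * 1e6 / mflops_per_watt  # 1 Tflop/s = 1e6 Mflop/s
    return watts / 1e6  # Watt -> MWatt


if __name__ == "__main__":
    for name, eff in EFFICIENCY_MFLOPS_PER_WATT.items():
        power = implied_power_mw(1000.0, eff)  # 1000 Tflop/s = 1 Pflop/s
        print(f"{name:28s} {power:5.2f} MW per Pflop/s")
```

At the list-wide average of 150 Mflop/Watt, a sustained petaflop/s would draw several times the power it would at the efficiency of the best Cell-based blades, which is why the efficiency ranking matters as systems grow.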

Highlights from the Top 10:

  • The Roadrunner system at DOE’s Los Alamos National Laboratory (LANL) was built by IBM and in June 2008 was the first system ever to break the petaflop/s Linpack barrier.  Roadrunner is still holding on to its number 1 spot with 1.105 petaflop/s.  Roadrunner is based on the IBM QS22 blades which are built with advanced versions of the processor in the Sony PlayStation 3.  These nodes are connected with a commodity InfiniBand network.

  • Last November, Roadrunner was almost surpassed by the second petaflop/s system ever, the Jaguar system installed at the DOE’s Oak Ridge National Laboratory. Jaguar reached 1.059 petaflop/s shortly after its installation, but due to its heavy workload no further measurements were possible and it remains at No. 2.  It is an XT5 system manufactured by Cray.

  • The TOP10 features four new systems.

  • In the TOP10 only the No. 3 and 10 systems are installed outside the U.S. – in this case in Germany.

  • The No. 3 system, called JUGENE, is a new IBM BlueGene/P system installed at the Forschungszentrum Juelich (FZJ) in Germany.  It achieved 825.5 Teraflop/s on the Linpack benchmark and has a theoretical peak performance of just above 1 Petaflop/s.

  • The No. 6 system, called Kraken, is a new Cray XT5 system installed at the National Institute for Computational Sciences at the University of Tennessee, with a Linpack performance of 463.3 Tflop/s. It is the largest system at a university.

  • At No. 9 is another new IBM BlueGene/P system, called Dawn. It is installed at DOE’s Lawrence Livermore National Laboratory and achieved 415.7 Tflop/s.

  • The No. 10 system is the second system at the Forschungszentrum Juelich (FZJ) in Germany. Called JUROPA, it is built from Bull Novascale and Sun SunBlade x6048 servers and achieved 274.8 TFlop/s.

General highlights from the Top 500 since the last edition:

  • Quad-core processor based systems have taken over the TOP500 quite rapidly. Already 383 systems are using them. 102 systems are using dual-core processors, and only four systems still use single-core processors. Four systems already use IBM’s advanced version of the Sony PlayStation 3 processor with 9 cores, and two systems at Cray are using the new six-core Istanbul AMD Opteron processors.  The Linpack benchmark can utilize multi-core processors very well, which led to above-average performance increases across the whole list.

  • The entry level to the list moved up to the 17.1 Tflop/s mark on the Linpack benchmark, compared to 12.64 Tflop/s six months ago.

  • The last system on the newest list would have been listed at position 274 in the previous TOP500 just six months ago. This turnover rate is again just above average, after the TOP500 recorded the highest turnover in its history one year ago.

  • Total combined performance of all 500 systems has grown to 22.6 Pflop/s, compared to 16.95 Pflop/s six months ago and 11.7 Pflop/s one year ago.

  • The entry point for the top 100 increased in six months from 27.37 Tflop/s to 39.58 Tflop/s.

  • The average concurrency level in the TOP500 is 8,210 cores per system, up from 6,240 six months ago and 4,850 one year ago.
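
The before-and-after figures in the bullets above can be turned into growth rates with a few lines of arithmetic. The sketch below does only that; the helper name `growth_percent` is an illustrative choice, and all input values are taken directly from the bullets above.

```python
def growth_percent(old: float, new: float) -> float:
    """Percentage increase from old to new."""
    return (new / old - 1.0) * 100.0


# (metric, value six months ago, value now), as quoted in the bullets above.
SIX_MONTH_CHANGES = [
    ("Entry level, TOP500 (Tflop/s)", 12.64, 17.1),
    ("Entry level, top 100 (Tflop/s)", 27.37, 39.58),
    ("Total performance (Pflop/s)", 16.95, 22.6),
    ("Average cores per system", 6240, 8210),
]

if __name__ == "__main__":
    for label, before, now in SIX_MONTH_CHANGES:
        print(f"{label:32s} +{growth_percent(before, now):.1f}%")
    # Year-over-year growth in total performance (11.7 -> 22.6 Pflop/s).
    yearly = growth_percent(11.7, 22.6)
    print(f"{'Total performance, one year':32s} +{yearly:.1f}%")
```

By this calculation, total performance grew by roughly a third in six months and nearly doubled over the year, while the average core count grew somewhat more slowly than performance over the same six months.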

 

  • A total of 399 systems (79.8 percent) are now using Intel processors. This is slightly up from six months ago (379 systems, 75.8 percent). Intel continues to provide the processors for the largest share of TOP500 systems.

  • The IBM Power processors are the second most commonly used processor family with 55 systems (11 percent), down from 60.

  • They are followed by the AMD Opteron family with 43 systems (8.6 percent), down from 59.

  • Multi-core processors are the dominant chip architecture. The most impressive growth was in the number of systems using the Intel Harpertown, Clovertown, and Gainestown quad-core chips, which grew from 253 last June to 287 in November and now to 336 with the addition of the Gainestown processor to Intel’s quad-core lineup.

  • The majority of the remaining systems use dual-core processors.

  • 410 systems are labeled as clusters, making this the most common architecture in the TOP500 with a stable share of 82 percent.

  • Gigabit Ethernet is still the most-used internal system interconnect technology (282 systems), due to its widespread use at industrial customers, followed by InfiniBand technology with 148 systems.

 

  • IBM and Hewlett-Packard continue to sell the bulk of systems at all performance levels of the TOP500.

  • HP kept a narrow lead in systems with 212 systems (42.4 percent) over IBM with 188 systems (37.6 percent).  HP had 209 systems (41.8 percent) six months ago, compared to IBM with 188 systems (37.6 percent).

  • IBM remains the clear leader in the TOP500 list in performance with 39.4 percent of installed total performance (up from 38 percent), compared to HP with 25.1 percent (up from 24.7 percent).

  • In the system category, Cray, SGI, and Dell follow with 4.0 percent, 4.0 percent and 3.4 percent respectively.

  • In the performance category, the manufacturers with more than 5 percent are: Cray (13.7 percent of performance) and SGI (6.7 percent), each of which benefits from large systems in the TOP10.

  • HP (191) and IBM (117) together sold 308 of the 313 systems at commercial and industrial customers and have had this important market segment clearly cornered for some time now.
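
The gap between HP's lead in system count and IBM's lead in installed performance can be made concrete by estimating the average Linpack performance per system for each vendor. This is a rough sketch: it combines the system counts and performance shares quoted above with the 22.6 Pflop/s list-wide total reported in the general highlights, and because the shares are rounded percentages the results are approximate. The helper name `avg_tflops_per_system` is illustrative.

```python
TOTAL_PFLOPS = 22.6  # combined Linpack performance of all 500 systems

# vendor -> (number of systems, share of installed performance in percent)
VENDORS = {
    "HP": (212, 25.1),
    "IBM": (188, 39.4),
}


def avg_tflops_per_system(systems: int, perf_share_percent: float) -> float:
    """Approximate average Linpack Tflop/s per system for one vendor."""
    vendor_tflops = TOTAL_PFLOPS * 1000.0 * perf_share_percent / 100.0
    return vendor_tflops / systems


if __name__ == "__main__":
    for vendor, (count, share) in VENDORS.items():
        avg = avg_tflops_per_system(count, share)
        print(f"{vendor:4s} {count:3d} systems, ~{avg:.1f} Tflop/s per system")
```

Under these assumptions the average IBM system on the list delivers well over one and a half times the Linpack performance of the average HP system, which is consistent with IBM's strong presence among the largest installations.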

 

  • The U.S. is clearly the leading consumer of HPC systems with 291 of the 500 systems (unchanged). The European share (145 systems – down from 151) is settling down after having risen for some time, but is still substantially larger than the Asian share (49 systems – up from 47).

  • Dominant countries in Asia are China with 21 systems (up from 16), Japan with 15 systems (down from 18), and India with 6 systems (down from 8).

  • In Europe, the UK remains No. 1 with 44 systems (45 six months ago). Germany is still in the No. 2 spot with 29 systems (24 six months ago).

Highlights from the Top 50:

  • The entry level into the TOP50 is at 75.7 Tflop/s.

  • The U.S. has a lower percentage of systems (42 percent) in the TOP50 than in the TOP500 (58.4 percent).

  • The dominant architectures are custom-built massively parallel systems (MPPs) with 64 percent, ahead of commodity clusters with 36 percent.

  • IBM leads the TOP50 with 34 percent of systems and 43 percent of performance.

  • No. 2 is Cray with a stable share of 20 percent of systems and 25 percent of performance.

  • SGI is third with 12 percent of systems and 9.6 percent of performance.

  • HP has 4 percent of systems and 4.0 percent of performance.

  • 62 percent of systems are installed at research labs and 28 percent at universities.

  • There is only a single system using Gigabit Ethernet in the TOP50.

  • Cray’s XT is the most-used system family with 10 systems (20 percent), followed by IBM’s BlueGene with 9 systems (18 percent).

  • Intel processors are used in 28 percent of systems, behind AMD processors (36 percent) and IBM’s Power processors (32 percent).

  • The average concurrency level is 40,871 cores per system – up from 30,490 cores per system six months ago and 24,400 one year ago.

All changes are from November 2008 to June 2009.

The TOP500 list is compiled by Hans Meuer of the University of Mannheim, Germany; Erich Strohmaier and Horst Simon of NERSC/Lawrence Berkeley National Laboratory; and Jack Dongarra of the University of Tennessee, Knoxville.