Columbia
| System Name | Columbia |
| Site | NASA/Ames Research Center/NAS |
| System Family | SGI Altix |
| System Model | SGI Altix 3700 |
| Computer | SGI Altix 1.5 GHz, Voltaire Infiniband |
| Vendor | SGI |
| URL | http://www.nas.nasa.gov/About/... |
| Application area | Aerospace |
| Installation Year | 2004 |
| Operating System | SuSE Linux Enterprise Server 9 |
| Interconnect | Numalink/Infiniband |
| Processor | Intel IA-64 Itanium2 Montecito Dual Core 1600 MHz (6.4 GFlops) |
NASA's Columbia supercomputer is a 10,240-processor system composed of twenty 512-processor nodes, twelve of which are SGI® Altix™ 3700 nodes, and eight of which are SGI® Altix™ 3700 Bx2 nodes. Each node is a shared-memory, single-system-image (SSI) environment, running a Linux®-based operating system. Four of the Bx2 nodes are linked to form a 2048-processor shared-memory environment (2048-PE). In addition to supporting NASA's largest computations, the 2048-PE also supplies the computational resources for NASA's National Leadership Computing System (NLCS) initiative.
Each processor in the 2048-PE is an Intel® Itanium® 2, running at 1.6 gigahertz (GHz), with 9 megabytes (MB) of level-3 cache (the "Madison 9M" processor), and a peak performance of 6.4 gigaflops (Gflop/s). The 2048-PE has a total of 4 terabytes (TB) of shared memory, or 2 gigabytes (GB) per processor. One other Bx2 node is equipped with these same processors. The remaining 15 nodes have Itanium® 2 processors running at 1.5 GHz, with 6 MB of level-3 cache, and a peak performance of 6 Gflop/s. All nodes have 2 GB of shared memory per processor.
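The per-processor and memory figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not an official specification: the constant and function names are illustrative, and it assumes binary terabytes (1 TB = 1024 GB), which is what makes 2048 processors at 2 GB each come out to the quoted 4 TB.

```python
# Figures quoted in the text; the constant and function names are illustrative.
PROCESSORS_PER_NODE = 512
GB_PER_PROCESSOR = 2
GB_PER_TB = 1024  # assumption: binary terabytes, which matches the quoted 4 TB


def flops_per_cycle(peak_gflops: float, clock_ghz: float) -> float:
    """Floating-point operations per clock cycle implied by peak and clock."""
    return peak_gflops / clock_ghz


def shared_memory_tb(nodes: int) -> float:
    """Shared memory, in TB, across the given number of 512-processor nodes."""
    return nodes * PROCESSORS_PER_NODE * GB_PER_PROCESSOR / GB_PER_TB


if __name__ == "__main__":
    print(flops_per_cycle(6.4, 1.6))  # Madison 9M processors at 1.6 GHz
    print(flops_per_cycle(6.0, 1.5))  # remaining processors at 1.5 GHz
    print(shared_memory_tb(1))        # a single node
    print(shared_memory_tb(4))        # the four-node 2048-PE
    print(shared_memory_tb(20))       # all twenty nodes
```

Both processor types work out to four floating-point operations per cycle, and the memory totals come to 1 TB per node and 4 TB for the 2048-PE, in line with the text. The 20 TB system-wide total is derived here from the per-processor figure rather than quoted from the source.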
Within each node of Columbia, processors are connected by the SGI® NumaLink™ fabric. Nodes are connected to each other by Voltaire® InfiniBand® fabric, as well as 10- and 1-gigabit Ethernet. (Note that the four Bx2 nodes in the 2048-PE use NumaLink™ between these four nodes as well as the other fabrics.) Columbia is connected to 650 TB of on-line RAID storage through a Fibre Channel switch. Columbia has tertiary (tape) storage with a capacity of more than 10 petabytes.
The primary features and benefits of each 512-processor node are:
- Low latency to global shared memory - supports 1-microsecond worst-case latency for the Message Passing Interface (MPI), and approximately 0.5-microsecond latency for direct load/store operations, enabling strong application scalability
- High memory bandwidth - first system (in November 2003) to exceed 1 terabyte/second on the STREAM benchmark
- Global shared memory and cache coherency - supports simpler and more efficient programming models than MPI, efficient use of NFS across 10,240 processors for home file systems, large data caches that improve input/output (I/O) performance, and a Fibre Channel storage area network implementation supporting global file systems between 512-processor nodes
- Large shared memory (1 TB) - allows large problems to remain resident
With these features and benefits, Columbia is particularly well suited for large-scale applications that involve substantial inter-processor communication and I/O. Typical applications are physics-based simulations involving a discretized grid of the simulation domain. Applications that require dynamic load balancing or adaptive gridding are also more tractable on this system, because they can use shared-memory programming models.
The development and operating environment on Columbia features 72 SGI® Altix™ processors for front-end (interactive) use, a Linux-based operating system, an Altair® PBS Professional™ job scheduler, an Intel® Fortran/C/C++ compiler, and SGI® ProPack™ software. Larger interactive jobs can be submitted through PBS to the compute node(s).
NASA's Altix system is named Columbia, in honor of the crew of Columbia flight STS-107, which was lost on February 1, 2003. The Columbia system represents a unique partnership between government and industry (SGI®, Intel® and Voltaire®) to build one of the world's largest supercomputers in record time. Columbia is installed at NASA Ames Research Center near San Jose, Calif., and became fully operational on October 26, 2004. Columbia features a sustained Linpack benchmark performance of 51.9 teraflop/s, as listed on the TOP500 Supercomputer Sites.
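To put the sustained Linpack figure in context, it can be compared with a theoretical peak derived from the node counts and per-processor peaks given earlier in this description. This is a rough sketch under stated assumptions: all twenty nodes are counted, five of them at 6.4 Gflop/s per processor (the four 2048-PE nodes plus the one other Bx2 node) and fifteen at 6 Gflop/s. The names are illustrative, and the peak published with the TOP500 entry may differ, for example if the benchmark ran on a different processor count.

```python
PROCESSORS_PER_NODE = 512
LINPACK_TFLOPS = 51.9  # sustained Linpack result quoted in the text

# (number of nodes, peak Gflop/s per processor), as described in the text.
NODE_GROUPS = [
    (5, 6.4),   # four 2048-PE nodes plus one more Bx2 node, 1.6 GHz
    (15, 6.0),  # remaining nodes, 1.5 GHz
]


def theoretical_peak_tflops(groups=NODE_GROUPS) -> float:
    """Sum per-processor peaks over all nodes and convert Gflop/s to Tflop/s."""
    gflops = sum(nodes * PROCESSORS_PER_NODE * peak for nodes, peak in groups)
    return gflops / 1000.0


def linpack_efficiency(sustained_tflops: float = LINPACK_TFLOPS) -> float:
    """Fraction of the derived theoretical peak achieved on Linpack."""
    return sustained_tflops / theoretical_peak_tflops()


if __name__ == "__main__":
    print(f"derived peak: {theoretical_peak_tflops():.1f} Tflop/s")
    print(f"Linpack efficiency: {linpack_efficiency():.1%}")
```

Under these assumptions the derived peak is about 62.5 teraflop/s, so the 51.9 teraflop/s Linpack result corresponds to roughly 83% of peak.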