Scaling to 8,192 Cores

Scaling on 2 HPC Platforms

Strong scaling of a constant problem size (8 million atoms) on two different HPC platforms. Solid / dashed lines correspond to a stored / recomputed Hamiltonian matrix. The largest number of cores available was 8,192, on Cray XT3/4 and IBM B/G.

Scaling on 6 HPC Platforms

Strong scaling of a constant problem size (8 million atoms) on six different HPC platforms. Solid / dashed lines correspond to a stored / recomputed Hamiltonian matrix. The largest number of cores available was 8,192, on Cray XT3/4 and IBM B/G.

Scaling on RPI BG/L

Strong scaling (constant total problem size, diagonal dashed lines) and weak scaling (constant number of atoms per core, horizontal solid lines) on RPI BG/L in recomputing mode. Performance is excellent for sufficiently large problem sizes.


NEMO 3-D was benchmarked on a broad range of HPC platforms in June and July 2007. Five of the six benchmark platforms were ranked on the TOP500 list of June 2007. The benchmark machines, listed with their TOP500 rank, are:
  • #2, ORNL, Jaguar, Cray XT3/4, with 23016 cores, 2GB RAM/core
  • #7, RPI, eServer Blue Gene Solution, IBM B/G, with 32768 cores, 256MB RAM/core
  • #8, NCSA, Abe, XeonQ (Quad core, dual socket), with 9600 cores, 1GB RAM/core
  • #30, IUPU, Big Red, IBM JS21, with 3072 cores, 2GB RAM/core
  • #46, PSC, Big Ben, Cray XT3, with 4136 cores, 1GB RAM/core
  • unranked, Purdue, XeonD (Dual core, dual socket), with 672 cores, 2GB/4GB RAM/core

The benchmark consists of 500 Lanczos iterations. NEMO 3-D can either store the Hamiltonian matrix or recompute it on the fly for each matrix-vector multiply. Recomputing the matrix at every multiply step is computationally intensive, but it reduces the memory footprint and makes it possible to run on CPU-rich, memory-poor machines. More details can be found at the following web page.
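The trade-off between the stored and recomputed modes can be illustrated with a minimal sketch. This is not NEMO 3-D code: the toy Hamiltonian (a 1-D nearest-neighbour chain), the constants ONSITE and HOPPING, and all function names are hypothetical and chosen only for illustration. The sketch runs a plain Lanczos recurrence with two interchangeable matrix-vector multiplies, one that keeps the sparse matrix in memory and one that regenerates each row on every multiply.

```python
import math

# Hypothetical toy model: a 1-D nearest-neighbour chain.
# This is an assumption for illustration, NOT the NEMO 3-D Hamiltonian.
ONSITE = 2.0
HOPPING = -1.0


def row_elements(i, n):
    """Yield the (column, value) pairs of row i of the toy Hamiltonian."""
    if i > 0:
        yield i - 1, HOPPING
    yield i, ONSITE
    if i < n - 1:
        yield i + 1, HOPPING


def make_stored_matvec(n):
    """Stored mode: build the sparse matrix once and keep it in memory."""
    rows = [list(row_elements(i, n)) for i in range(n)]

    def matvec(v):
        return [sum(val * v[j] for j, val in row) for row in rows]

    return matvec


def make_recomputed_matvec(n):
    """Recompute mode: regenerate every matrix element on each multiply."""

    def matvec(v):
        return [sum(val * v[j] for j, val in row_elements(i, n))
                for i in range(n)]

    return matvec


def lanczos(matvec, v0, iterations):
    """Plain Lanczos recurrence without reorthogonalization.

    Returns the diagonal (alphas) and off-diagonal (betas) entries of the
    tridiagonal matrix built from the Krylov space of v0.
    """
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    v_prev = [0.0] * len(v)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(iterations):
        w = matvec(v)  # the only place the Hamiltonian is needed
        alpha = sum(a * b for a, b in zip(w, v))
        w = [wi - alpha * vi - beta * pi
             for wi, vi, pi in zip(w, v, v_prev)]
        beta = math.sqrt(sum(x * x for x in w))
        alphas.append(alpha)
        betas.append(beta)
        if beta < 1e-12:  # Krylov space exhausted
            break
        v_prev, v = v, [x / beta for x in w]
    return alphas, betas


# Usage: a few iterations on a small chain (the benchmark itself uses 500).
n = 20
start = [1.0] + [0.0] * (n - 1)
stored = lanczos(make_stored_matvec(n), start, 5)
recomputed = lanczos(make_recomputed_matvec(n), start, 5)
print(stored == recomputed)
```

Because the Lanczos loop touches the Hamiltonian only through matvec, the two modes are interchangeable and should yield the same coefficients. In this sketch the stored variant holds every nonzero element in memory, while the recomputed variant holds none and pays for it by regenerating each row on every iteration.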
