NEMO 3-D was benchmarked on a broad range of HPC platforms in June and July 2007. Five of the six benchmark platforms were ranked on the June 2007 TOP500 list. The benchmark machines are:
- #2, ORNL, Jaguar, Cray XT3/4, with 23016 cores, 2GB RAM/core
- #7, RPI, eServer Blue Gene Solution, IBM B/G, with 32768 cores, 256MB RAM/core
- #8, NCSA, Abe, XeonQ (Quad core, dual socket), with 9600 cores, 1GB RAM/core
- #30, IUPU, Big Red, IBM JS21, with 3072 cores, 2GB RAM/core
- #46, PSC, Big Ben, Cray XT3, with 4136 cores, 1GB RAM/core
- unranked, Purdue, XeonD (Dual core, dual socket), with 672 cores, 2GB/4GB RAM/core
The benchmark consists of 500 Lanczos iterations. These iterations include the communication needed to construct the reduced-order tridiagonal Lanczos matrix.
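To make the structure of one benchmark run concrete, here is a minimal serial sketch of a Lanczos loop that builds the tridiagonal matrix. It is not NEMO 3-D's implementation: the function name `lanczos_tridiagonal`, the `matvec` callback, and the breakdown tolerance are assumptions for illustration, the operator is assumed to be real symmetric, and no reorthogonalization is performed. The two inner products per iteration are where a parallel code would need global communication.

```python
import numpy as np

def lanczos_tridiagonal(matvec, v0, num_iters):
    """Run up to num_iters Lanczos steps with a user-supplied matvec.

    Returns the diagonal (alphas) and off-diagonal (betas) entries of the
    reduced-order tridiagonal matrix. Assumes a real symmetric operator.
    """
    v = v0 / np.linalg.norm(v0)
    v_prev = np.zeros_like(v)
    beta = 0.0
    alphas, betas = [], []
    for _ in range(num_iters):
        w = matvec(v) - beta * v_prev
        alpha = np.dot(v, w)       # global reduction in a distributed run
        w = w - alpha * v
        beta = np.linalg.norm(w)   # second global reduction
        alphas.append(alpha)
        if beta < 1e-12:           # Krylov space exhausted (assumed tolerance)
            break
        betas.append(beta)
        v_prev, v = v, w / beta
    # A k-by-k tridiagonal matrix has k - 1 off-diagonal entries.
    return np.array(alphas), np.array(betas[:len(alphas) - 1])

# Usage on a small toy matrix (hypothetical, not a NEMO 3-D Hamiltonian).
A = np.diag([1.0, 2.0, 3.0, 4.0])
alphas, betas = lanczos_tridiagonal(lambda x: A @ x, np.ones(4), 4)
T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
```

For this toy case the eigenvalues of `T` reproduce those of `A`; in the benchmark only the cost of the 500 iterations is measured.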
NEMO 3-D can either store the Hamiltonian matrix or recompute it on the fly for use in the matrix-vector multiply. Recomputing the matrix at every multiply step is computationally intensive; however, it reduces the memory footprint and enables running on CPU-rich, memory-poor machines. This page shows the benchmark in which the matrix elements are recomputed on the fly at every matrix-vector multiply step. The equivalent stored-matrix benchmarks are shown below. All machines are compared on a single graph on a different page.
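The trade-off between the two modes can be illustrated with a toy example. The sketch below uses a hypothetical one-dimensional nearest-neighbor chain, not the NEMO 3-D Hamiltonian; the function names and the `onsite` and `hopping` parameters are assumptions for illustration. The stored variant keeps the whole matrix in memory, while the on-the-fly variant regenerates the matrix elements inside every multiply and needs storage only for the vectors.

```python
import numpy as np

def build_chain_hamiltonian(n, onsite, hopping):
    """Stored mode: assemble the full matrix once (dense here for simplicity)."""
    h = np.zeros((n, n))
    idx = np.arange(n)
    h[idx, idx] = onsite
    h[idx[:-1], idx[1:]] = hopping
    h[idx[1:], idx[:-1]] = hopping
    return h

def matvec_on_the_fly(x, onsite, hopping):
    """Recompute mode: apply the same operator without ever storing it."""
    y = onsite * x                 # diagonal contribution (new array)
    y[:-1] += hopping * x[1:]      # coupling to the right neighbor
    y[1:] += hopping * x[:-1]      # coupling to the left neighbor
    return y

# Both modes give the same product for the toy chain.
x = np.arange(6, dtype=float)
y_stored = build_chain_hamiltonian(6, 2.0, -1.0) @ x
y_recomputed = matvec_on_the_fly(x, 2.0, -1.0)
```

In this toy case the matrix elements are constants, so recomputation is cheap; in a realistic atomistic Hamiltonian each element requires real work, which is why the recompute mode trades extra arithmetic for memory.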
Each row contains the scaling data for a particular machine. The first column shows the strong scaling representation, where the total problem size is held constant and the number of cores is increased. Ideal behavior is represented by a diagonal line on a log-log scale. The second column shows the weak scaling data, where the number of atoms per core is held constant and both the number of cores and the total problem size are increased. Here a flat horizontal line would be the ideal scaling result. The third column combines the data from columns one and two. The fourth column shows the strong scaling linear speed-up curves for a constant problem size, normalized to the data point at the lowest core count. The 8 million atom line is drawn as a dashed line, as that benchmark is the critical comparison across all platforms.
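The quantities plotted in these columns can be derived from raw wall-clock times as in the following sketch. The helper names are assumptions, and the timing values are made-up placeholders chosen to show ideal behavior; they are not measurements from any of the benchmark machines.

```python
def strong_scaling_speedup(cores, times):
    """Speed-up at fixed problem size, normalized to the lowest core count."""
    pairs = sorted(zip(cores, times))
    base_time = pairs[0][1]
    return [(c, base_time / t) for c, t in pairs]

def strong_scaling_efficiency(cores, times):
    """Speed-up divided by the ideal speed-up (ratio of core counts)."""
    base_cores = min(cores)
    return [(c, s / (c / base_cores))
            for c, s in strong_scaling_speedup(cores, times)]

def weak_scaling_efficiency(cores, times):
    """At fixed atoms per core, ideal scaling keeps the run time constant."""
    pairs = sorted(zip(cores, times))
    base_time = pairs[0][1]
    return [(c, base_time / t) for c, t in pairs]

# Placeholder data only: perfectly ideal strong and weak scaling.
cores = [64, 128, 256]
strong_times = [100.0, 50.0, 25.0]   # fixed total problem size
weak_times = [40.0, 40.0, 40.0]      # fixed atoms per core

speedup = strong_scaling_speedup(cores, strong_times)
strong_eff = strong_scaling_efficiency(cores, strong_times)
weak_eff = weak_scaling_efficiency(cores, weak_times)
```

With ideal data the speed-up doubles with each doubling of the core count and both efficiencies stay at 1.0; real measurements fall below these values as communication overhead grows.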