HPCG Software Releases

The development of the HPCG software is tracked on GitHub:

HPCG 3.0 Binary for NVIDIA GPUs Including Pascal

This new binary supports CUDA 8.0 and the new Tesla Pascal GPUs (sm 6.0). It also works on Kepler (sm 3.5) and Maxwell (sm 5.0) GPUs. The package additionally includes an improved README and example files.

It requires Open MPI 1.6.5 and GCC 4.8.5.

SHA256 89edb4215b0fb1d8818a6044be18c150301a75ba55b3384453a55896779ec7eb

HPCG_cuda8_ompi165_gcc485_sm_35_sm_50_sm60_v1c.tgz   Download
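
A downloaded archive can be checked against the SHA256 value published with each release. The following is a minimal sketch in Python using only the standard `hashlib` module; the helper names `sha256_of_file` and `matches_published_checksum` are illustrative and are not part of the HPCG distribution.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Compare a file's digest with a published value (hypothetical helper)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For the archive above, one would call `matches_published_checksum("HPCG_cuda8_ompi165_gcc485_sm_35_sm_50_sm60_v1c.tgz", "89edb4215b0fb1d8818a6044be18c150301a75ba55b3384453a55896779ec7eb")` and unpack the file only if the result is `True`.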

HPCG 3.0 Binary from NVIDIA Optimized for NVIDIA GPUs (bugfix)

This is a bugfix release of the binary-only code optimized by NVIDIA. It requires CUDA 6.5 and Open MPI 1.6.5 built with GCC 4.8.2.

The previous release had a memory allocation problem when run on more than 27 MPI ranks.

SHA256 8017999eefb2854b50b3fc93aeaa5eea7aae1fa17974b554d707b0cfa8e0a39a

hpcg-3.0_CUDA-6.5_GCC-4.8.2_OMPI-1.6.5update1.tgz   Download

HPCG 3.0 Binary from NVIDIA Optimized for NVIDIA GPUs

This is binary-only code optimized by NVIDIA employees. It requires CUDA 6.5 and Open MPI 1.6.5 built with GCC 4.8.2.

SHA256 320ddecc182c79cc980ed0be790005a4ec891f73f32883a74d0ff60959d0a095

hpcg-3.0_CUDA-6.5_GCC-4.8.2_OMPI-1.6.5.tgz   Download

HPCG 3.0 Reference Code
Features include:
  • Problem generation is a timed portion of the benchmark. This time is now added to any time spent optimizing data structures and counted as overhead when computing the official GFLOP/s rating. The total overhead time is divided by 500 to amortize its cost over 500 iterations.
  • Added memory usage counting and reporting.
  • Added memory bandwidth measurement and reporting.
  • Added a "Quick Path" option to make obtaining results on production systems easier. With this option, obtaining a rating will take only a few minutes. This option also makes profiling and debugging easier. The Quick Path option is invoked by setting the run time to zero, either in hpcg.dat or by using the --rt=0 option.
  • Added a command line option (--rt=) to specify the run time.
  • Made a few small changes to support easier builds on MS Windows.
  • Changed the way the residual variance is computed to make sure it is zero if all residual values are identical.
  • Changed the order of array allocation in the reference code in order to improve performance.
  • Set the minimum iteration count for the optimized run to be the same as what is used in the reference run.

SHA256 e2b9bb6e0e83c3a707c27e92a6b087082e6d7033f94c000a40aebf2c05881519

hpcg-3.0.tar.gz   Download
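
The overhead amortization described in the feature list can be illustrated with a short Python sketch. It assumes that the overhead is simply the sum of the problem generation time and the data structure optimization time. The function names and the per-iteration form of the rating are illustrative, not taken from the reference code, which defines the official computation.

```python
AMORTIZATION_ITERATIONS = 500  # divisor stated in the feature list


def amortized_overhead(generation_seconds, optimization_seconds):
    """Overhead charged to each iteration: total overhead spread over 500 iterations."""
    total_overhead = generation_seconds + optimization_seconds
    return total_overhead / AMORTIZATION_ITERATIONS


def rating_with_overhead(flops_per_iteration, seconds_per_iteration,
                         generation_seconds, optimization_seconds):
    """Illustrative GFLOP/s figure once the amortized overhead is added (assumed form)."""
    overhead = amortized_overhead(generation_seconds, optimization_seconds)
    return flops_per_iteration / (seconds_per_iteration + overhead) / 1e9
```

For example, 40 seconds of problem generation plus 10 seconds of optimization add 0.1 seconds to each iteration under this scheme, so overhead that is large relative to the iteration time lowers the rating noticeably.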

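The residual variance change in the feature list addresses a floating-point pitfall: a mean obtained by summing the values and dividing by their count can differ from the values by rounding error, which leaves a tiny nonzero variance even when all values are identical. One way to obtain exactly zero in that case is to accumulate differences from the first value. The Python sketch below shows this technique under that assumption; the function name `shifted_variance` is illustrative, and this is not claimed to be the code used in HPCG.

```python
def shifted_variance(values):
    """Population variance computed relative to the first value.

    If every value equals values[0], every difference is exactly 0.0,
    so the mean equals values[0] and the variance is exactly 0.0.
    """
    if not values:
        raise ValueError("at least one value is required")
    reference = values[0]
    count = len(values)
    # Accumulate offsets from the reference instead of the raw values.
    mean_offset = 0.0
    for value in values:
        mean_offset += value - reference
    mean = reference + mean_offset / count
    # Second pass: sum of squared deviations from the mean.
    squared_deviations = 0.0
    for value in values:
        squared_deviations += (value - mean) ** 2
    return squared_deviations / count
```

With ten copies of 0.1 as input, every offset is zero and the function returns exactly 0.0.
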
Intel® MKL Benchmarks - includes optimized HPCG 3.0 for Intel Xeon and Xeon Phi

Intel versions of HPCG 3.0 for Linux-based systems (including clusters) with Intel multicore and manycore (Xeon Phi KNL) processors.

Details are at

l_mklb_p_11.3.1.002.tgz   Download

Mar 25 2017