Decomposition

Without installing a different version of GROMACS, let us take a look at the decomposition. Initially we used 18 ranks.

Let us submit some jobs to run different decompositions.
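The decompositions used in this section (18 x 4, 36 x 2, 72 x 1) all multiply out to 72, which matches the 72 vCPUs of a c5n.18xlarge. A minimal sketch of that arithmetic, assuming the goal is to fill every vCPU of the node; the `VCPUS` variable and the `threads_per_rank` helper are hypothetical names used only for illustration:

```shell
#!/bin/bash
# Assumption: one node exposes 72 vCPUs (c5n.18xlarge).
VCPUS=72

# Print the OpenMP thread count per rank that fills the node
# for a given number of MPI ranks.
threads_per_rank() {
    local ranks=$1
    echo $(( VCPUS / ranks ))
}

# List the rank x thread combinations compared in this section.
for ranks in 18 36 72; do
    echo "${ranks} x $(threads_per_rank "${ranks}")"
done
```

The rank count goes into `--ntasks-per-node` and the thread count into `NTOMP` in the job scripts.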

36 ranks

cat > gromacs-single-node-c5n-36x2.sbatch << \EOF
#!/bin/bash
#SBATCH --job-name=gromacs-single-node-c5n-36x2
#SBATCH --exclusive
#SBATCH --output=/fsx/logs/%x_%j.out
#SBATCH --partition=c5n
#SBATCH --ntasks-per-node=36
NTOMP=2

mkdir -p /fsx/jobs/${SLURM_JOBID}
cd /fsx/jobs/${SLURM_JOBID}

spack env activate aws
echo ">>> spack load /GROMACS_101_HASH"
spack load /GROMACS_101_HASH

set -x
time mpirun gmx_mpi mdrun -ntomp ${NTOMP} -s /fsx/input/gromacs/benchRIB.tpr -resethway
EOF
sed -i -e "s/GROMACS_101_HASH/${GROMACS_101_HASH}/" gromacs-single-node-c5n-36x2.sbatch

Now we submit the job twice.

sbatch -N1 gromacs-single-node-c5n-36x2.sbatch 
sbatch -N1 gromacs-single-node-c5n-36x2.sbatch
squeue
sinfo

72 ranks

cat > gromacs-single-node-c5n-72x1.sbatch << \EOF
#!/bin/bash
#SBATCH --job-name=gromacs-single-node-c5n-72x1
#SBATCH --exclusive
#SBATCH --output=/fsx/logs/%x_%j.out
#SBATCH --partition=c5n
#SBATCH --ntasks-per-node=72
NTOMP=1

mkdir -p /fsx/jobs/${SLURM_JOBID}
cd /fsx/jobs/${SLURM_JOBID}

spack env activate aws
echo ">>> spack load /GROMACS_101_HASH"
spack load /GROMACS_101_HASH

set -x
time mpirun gmx_mpi mdrun -ntomp ${NTOMP} -s /fsx/input/gromacs/benchRIB.tpr -resethway
EOF
sed -i -e "s/GROMACS_101_HASH/${GROMACS_101_HASH}/" gromacs-single-node-c5n-72x1.sbatch

Finally, we submit this job twice as well.

sbatch -N1 gromacs-single-node-c5n-72x1.sbatch 
sbatch -N1 gromacs-single-node-c5n-72x1.sbatch 
squeue -o "%.5i %.5P %.50j %.8u %.2t %.10M %.6D %R"
sinfo

Results

Because each job requests its node exclusively, the second run might need to wait until a second instance is up.

watch -c 'squeue -o "%.5i %.5P %.50j %.8u %.2t %.10M %.6D %R"'

Once both jobs have run (state R) and finished, we check the performance in ns/day.

grep -B2 Performance /fsx/logs/gromacs-single-node-c5n*
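To compare runs side by side, the ns/day value can be pulled out of each log. This is a sketch that assumes the logs contain the usual GROMACS summary line of the form `Performance: <ns/day> <hour/ns>`; the `ns_per_day` helper is a hypothetical name, not part of GROMACS or Slurm:

```shell
# Print "<log file> <ns/day>" for every Performance line found
# in the given log files.
ns_per_day() {
    awk '$1 == "Performance:" { print FILENAME, $2 }' "$@"
}
```

It could be called as `ns_per_day /fsx/logs/gromacs-single-node-c5n*` once the jobs have finished.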

Taking the earlier run with 18 ranks into account, we arrive at the following measurements.

#  execution  spec            instance  Ranks x Threads  ns/day
1  native     gromacs@2021.1  c5n.18xl  18 x 4           4.7
2  native     gromacs@2021.1  c5n.18xl  36 x 2           5.3
3  native     gromacs@2021.1  c5n.18xl  72 x 1           5.5

Thus, in this case using 72 ranks with one thread each is the most performant: 5.5 ns/day, about 17% more than the initial 18 x 4 run.
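The relative gain over the 18-rank baseline can be checked with a short calculation. A minimal sketch using the ns/day values from the table; the `speedup` helper is a hypothetical name introduced here for illustration:

```shell
# Print the speedup of a run over a baseline, both given in ns/day.
speedup() {
    awk -v base="$1" -v run="$2" 'BEGIN { printf "%.2f\n", run / base }'
}

speedup 4.7 5.3   # 36 x 2 vs. 18 x 4, prints 1.13
speedup 4.7 5.5   # 72 x 1 vs. 18 x 4, prints 1.17
```

These figures come from single measurements, so small differences between decompositions should be read with some caution.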