Thread-MPI


Next, we build and run GROMACS with its built-in thread-MPI instead of an external MPI library. First, we request an exclusive c5n node:

salloc -p c5n -N1 --exclusive

Once the job is running, we log in to the allocated node.

ssh $SLURM_NODELIST
spack env list
spack env activate aws
spack env list

We use Spack to install GROMACS without MPI support; the ~mpi variant disables it.

time spack install --no-check-signature --no-checksum gromacs@2021.1 ~mpi

Spack is able to reuse many of the packages we installed earlier.

The installation took about 5 minutes. Again, we capture the hash.

read -p "Please paste the hash: " GROMACS_THREADMPI_HASH

We save the variable in ~/.bashrc in case we need to log in again.

echo "export GROMACS_THREADMPI_HASH=${GROMACS_THREADMPI_HASH}" |tee -a ~/.bashrc
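Running the echo line above more than once appends duplicate export lines to ~/.bashrc. A minimal sketch of a guard follows. The helper name persist_export is hypothetical and not part of the workshop; it appends the line only when the file does not already export that variable.

```shell
# persist_export NAME VALUE [FILE]
# Hypothetical helper: append "export NAME=VALUE" to FILE (default ~/.bashrc)
# unless FILE already contains an export line for NAME.
persist_export() {
  local name="$1" value="$2" file="${3:-$HOME/.bashrc}"
  touch "$file"
  if grep -q "^export ${name}=" "$file"; then
    # An existing entry is left untouched, even if VALUE differs.
    echo "${name} already set in ${file}, leaving it unchanged" >&2
  else
    echo "export ${name}=${value}" >> "$file"
  fi
}
```

It could be called as persist_export GROMACS_THREADMPI_HASH "${GROMACS_THREADMPI_HASH}". Note that an existing entry is kept as-is, so after a reinstall with a new hash the old line would have to be edited by hand.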

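Now we leave the compute node and release the allocation. If the salloc shell is still open after the first exit, a second exit ends it: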

exit

36 Ranks

source ~/.bashrc
cat > gromacs-single-node-c5n-threadmpi-36x2.sbatch << \EOF
#!/bin/bash
#SBATCH --job-name=gromacs-single-node-c5n-threadmpi-36x2
#SBATCH --exclusive
#SBATCH --output=/fsx/logs/%x_%j.out
#SBATCH --partition=c5n
NTOMP=2

mkdir -p /fsx/jobs/${SLURM_JOBID}
cd /fsx/jobs/${SLURM_JOBID}

spack env activate aws
echo ">>> spack load /GROMACS_THREADMPI_HASH"
spack load /GROMACS_THREADMPI_HASH

set -x
time gmx mdrun -ntomp ${NTOMP} -s /fsx/input/gromacs/benchRIB.tpr -resethway
EOF
sed -i -e "s/GROMACS_THREADMPI_HASH/${GROMACS_THREADMPI_HASH}/" gromacs-single-node-c5n-threadmpi-36x2.sbatch
sbatch gromacs-single-node-c5n-threadmpi-36x2.sbatch
sbatch gromacs-single-node-c5n-threadmpi-36x2.sbatch
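The 36x2 in the job name describes how the 72 vCPUs of a c5n.18xlarge are split into thread-MPI ranks and OpenMP threads. The script sets only -ntomp and leaves the rank count to mdrun, which on a fully used node is expected to come out as vCPUs divided by NTOMP. The sketch below illustrates only that arithmetic. The function name ranks_for is hypothetical, and the 72-vCPU default is an assumption matching the 72 x 1 row of the results table.

```shell
# ranks_for NTOMP [VCPUS]
# Print the number of ranks that fills VCPUS hardware threads when each
# rank runs NTOMP OpenMP threads. VCPUS defaults to 72 (c5n.18xlarge).
ranks_for() {
  local ntomp="$1" vcpus="${2:-72}"
  if (( ntomp < 1 || vcpus % ntomp != 0 )); then
    echo "NTOMP=${ntomp} does not divide ${vcpus} vCPUs evenly" >&2
    return 1
  fi
  echo $(( vcpus / ntomp ))
}

# Show the splits that appear in the results table.
for ntomp in 1 2 4; do
  echo "NTOMP=${ntomp}: $(ranks_for "${ntomp}") ranks x ${ntomp} threads"
done
```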

72 Ranks

source ~/.bashrc
cat > gromacs-single-node-c5n-threadmpi-72x1.sbatch << \EOF
#!/bin/bash
#SBATCH --job-name=gromacs-single-node-c5n-threadmpi-72x1
#SBATCH --exclusive
#SBATCH --output=/fsx/logs/%x_%j.out
#SBATCH --partition=c5n
NTOMP=1

mkdir -p /fsx/jobs/${SLURM_JOBID}
cd /fsx/jobs/${SLURM_JOBID}

spack env activate aws
echo ">>> spack load /GROMACS_THREADMPI_HASH"
spack load /GROMACS_THREADMPI_HASH

set -x
time gmx mdrun -ntomp ${NTOMP} -s /fsx/input/gromacs/benchRIB.tpr -resethway
EOF
sed -i -e "s/GROMACS_THREADMPI_HASH/${GROMACS_THREADMPI_HASH}/" gromacs-single-node-c5n-threadmpi-72x1.sbatch
sbatch gromacs-single-node-c5n-threadmpi-72x1.sbatch
sbatch gromacs-single-node-c5n-threadmpi-72x1.sbatch
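Both scripts are written with a quoted here-document delimiter (\EOF), so ${SLURM_JOBID} and ${NTOMP} reach the file literally and are expanded only when the job runs. For the same reason the hash cannot be expanded at write time and is patched in afterwards with sed. A small sketch of that pattern on a throwaway file follows; the hash abc1234 and the file name demo.sbatch are made up for illustration.

```shell
# Demonstrate the quoted here-document plus sed pattern on a scratch file.
workdir=$(mktemp -d)
demo="${workdir}/demo.sbatch"
DEMO_HASH=abc1234   # made-up value standing in for the real Spack hash

# \EOF quotes the delimiter: nothing inside is expanded at write time.
cat > "${demo}" << \EOF
cd /fsx/jobs/${SLURM_JOBID}
spack load /GROMACS_THREADMPI_HASH
EOF

# Replace the placeholder with the hash held in the current shell.
sed -i -e "s/GROMACS_THREADMPI_HASH/${DEMO_HASH}/" "${demo}"

# The variable reference is still literal; only the placeholder changed.
cat "${demo}"
```

If the variable is empty when sed runs (for example because ~/.bashrc was not sourced), the placeholder is replaced with nothing and spack load receives a bare slash, so it is worth checking the generated file before submitting.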

Results

After the runs have finished, we grep the logs for the performance results.

grep -B2 Performance /fsx/logs/gromacs-single-node-c5n-threadmpi-*
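grep -B2 prints each Performance line together with the two lines before it, which include the column header. To collect just the ns/day figure per log file, something like the following could be used. It assumes the usual GROMACS layout in which the line starts with "Performance:" and the first number after the label is ns/day; the function name perf_summary is hypothetical.

```shell
# perf_summary FILE...
# Print "<file> <ns/day>" for every line whose first field is "Performance:".
# Assumes the first number after the label is the ns/day value.
perf_summary() {
  awk '$1 == "Performance:" { print FILENAME, $2 }' "$@"
}
```

For the logs written by the jobs above, the call would be perf_summary /fsx/logs/gromacs-single-node-c5n-threadmpi-*.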

These results extend the table started in the decomposition section.

#  execution  spec                       instance  Ranks x Threads  ns/day
1  native     gromacs@2021.1             c5n.18xl  18 x 4           4.7
2  native     gromacs@2021.1             c5n.18xl  36 x 2           5.3
3  native     gromacs@2021.1             c5n.18xl  72 x 1           5.5
4  native     gromacs@2021.1 ^intel-mkl  c5n.18xl  36 x 2           5.4
5  native     gromacs@2021.1 ^intel-mkl  c5n.18xl  72 x 1           5.5
6  native     gromacs@2021.1 ~mpi        c5n.18xl  36 x 2           5.5
7  native     gromacs@2021.1 ~mpi        c5n.18xl  72 x 1           5.7
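
To put rows 6 and 7 in perspective, the relative change against the plain MPI builds with the same layout (rows 2 and 3) can be computed from the ns/day column. A minimal sketch with a hypothetical helper pct_gain follows; the inputs are taken from the table above.

```shell
# pct_gain BASELINE CURRENT
# Print the relative change from BASELINE to CURRENT in percent,
# rounded to one decimal place.
pct_gain() {
  awk -v base="$1" -v cur="$2" 'BEGIN { printf "%.1f\n", (cur - base) / base * 100 }'
}

echo "36 x 2, ~mpi vs. MPI: $(pct_gain 5.3 5.5)%"
echo "72 x 1, ~mpi vs. MPI: $(pct_gain 5.5 5.7)%"
```

Because the table reports ns/day to one decimal place only, the percentages are indicative rather than precise.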