GROMACS Simulation on Bare Metal GPUs
1. Overview
This document outlines the end-to-end procedure for deploying and benchmarking GROMACS 2023.2 on an NVIDIA GPU cluster using Docker and the NVIDIA Container Toolkit. It provides a reproducible environment setup for GPU performance benchmarking, ensuring consistent results across nodes and configurations.
Objectives
- Deploy Docker and NVIDIA runtimes for GPU compute workloads
- Pull and run the official NGC GROMACS container
- Benchmark molecular dynamics simulations using benchPEP dataset
- Measure throughput (ns/day) and optimize performance parameters
System Architecture
The setup runs entirely in containers for reproducibility:
- Host OS: Ubuntu 24.04 LTS
- GPU Runtime: NVIDIA CUDA 12.x
- Container Platform: Docker Engine + NVIDIA Container Toolkit
- Benchmark Container: nvcr.io/hpc/gromacs:2023.2
2. Prerequisites
| Component | Description |
|---|---|
| OS | Ubuntu 24.04 LTS |
| GPU | NVIDIA data-center GPUs (this guide assumes 8× H100 per node) |
| Driver | 550+ (CUDA 12.x compatible) |
| Docker | 27.x or newer |
| NVIDIA Toolkit | libnvidia-container ≥ 1.15 |
| Internet Access | Required to pull NGC image and benchmark dataset |
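Before installing anything, it is worth confirming the driver prerequisite programmatically. The helper below is a sketch (the function name `driver_ok` is my own); it parses a version string in the form `nvidia-smi` reports and checks it against the 550 minimum.

```shell
# Hypothetical prerequisite check: does the NVIDIA driver meet the 550+
# requirement? Takes a version string like "550.54.15".
driver_ok() {
  major=${1%%.*}                       # "550.54.15" -> "550"
  [ "$major" -ge 550 ] 2>/dev/null && echo yes || echo no
}

# On a real host, feed it the live driver version:
# driver_ok "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
```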
3. Install Docker Engine
# Remove old packages
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do
  sudo apt-get remove -y $pkg || true
done
# Base utilities
sudo apt-get update
sudo apt-get install -y ca-certificates curl gnupg
# Docker repo & key
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Engine + CLI + Compose
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
# Add your user to the Docker group
sudo usermod -aG docker $USER
# (Log out and back in for the changes to apply)
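A quick post-install sanity check can catch an outdated Docker before it bites later. This is a sketch (the function names are mine, not part of Docker); it parses the `docker --version` banner and verifies the 27.x prerequisite from the table above.

```shell
# Hypothetical helper: extract the Docker major version from the
# "docker --version" banner and compare it to the 27.x prerequisite.
docker_major() {
  # $1 = e.g. "Docker version 27.3.1, build ce12230"
  echo "$1" | sed -n 's/^Docker version \([0-9]*\)\..*/\1/p'
}

check_docker() {
  ver=$(docker_major "$1")
  if [ -n "$ver" ] && [ "$ver" -ge 27 ]; then
    echo "OK (major $ver)"
  else
    echo "Docker 27.x or newer required"
  fi
}

# On a real host:
# check_docker "$(docker --version)"
```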
4. Install NVIDIA Container Toolkit
# Add NVIDIA GPG key and repo
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
# Configure and restart Docker
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
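`nvidia-ctk runtime configure` works by writing an `nvidia` runtime entry into `/etc/docker/daemon.json`, so a simple grep confirms the step took effect. The helper below is a sketch (the function name is mine):

```shell
# Hypothetical check: did nvidia-ctk register the "nvidia" runtime in
# Docker's daemon config?
has_nvidia_runtime() {
  # $1 = path to daemon.json (normally /etc/docker/daemon.json)
  grep -q '"nvidia"' "$1" && echo configured || echo missing
}

# On a real host:
# has_nvidia_runtime /etc/docker/daemon.json
```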
5. Validate GPU Visibility
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
Expected output: all eight H100 GPUs on the node are listed.
6. Login to NGC and Pull GROMACS Container
docker login nvcr.io
# Username: $oauthtoken
# Password: <your NGC API key>
docker pull nvcr.io/hpc/gromacs:2023.2
7. Prepare Benchmark Directory
mkdir -p ~/gmxbench && cd ~/gmxbench
wget -O benchPEP.zip https://www.mpinat.mpg.de/benchPEP.zip
unzip benchPEP.zip
ls -lh benchPEP.tpr
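A truncated download or failed unzip is easiest to catch here, before wasting a GPU allocation. The helper below is a sketch (the function name is mine); it simply verifies the input file exists and is non-empty, which is the failure mode the troubleshooting table later calls "benchmark fails early".

```shell
# Hypothetical integrity check: benchPEP.tpr is a large binary input, so
# an empty or missing file means the download or unzip failed.
check_input() {
  [ -s "$1" ] && echo ok || echo "missing or empty: $1"
}

# check_input ~/gmxbench/benchPEP.tpr
```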
8. Launch Container with GPUs Mounted
docker run --rm -it --gpus all -v ~/gmxbench:/bench nvcr.io/hpc/gromacs:2023.2 bash
Inside the container:
cd /bench
export GMX_ENABLE_DIRECT_GPU_COMM=1
export OMP_NUM_THREADS=8 # Adjust per CPU
gmx mdrun -s benchPEP.tpr \
-deffnm run1 \
-nsteps 500000 \
-nb gpu -bonded gpu -pme gpu \
-update cpu -pin on -ntmpi 8 -npme 1 -nstlist 200
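Rather than hard-coding `OMP_NUM_THREADS=8`, the per-rank thread count can be derived from the host CPU count, as the tuning section later suggests. The wrapper below is a sketch (the script and function names are mine); the mdrun line is commented out because it only makes sense inside the container with GPUs attached.

```shell
# Hypothetical wrapper (run_bench.sh): derive threads per rank from the
# host CPU count, assuming one thread-MPI rank per GPU (8 on this node).
threads_per_rank() {
  # $1 = total CPU threads, $2 = number of thread-MPI ranks
  echo $(( $1 / $2 ))
}

NRANKS=8
OMP_NUM_THREADS=$(threads_per_rank "$(nproc)" "$NRANKS")
export OMP_NUM_THREADS
export GMX_ENABLE_DIRECT_GPU_COMM=1

# Uncomment inside the container to launch:
# gmx mdrun -s benchPEP.tpr -deffnm run1 -nsteps 500000 \
#   -nb gpu -bonded gpu -pme gpu -update cpu \
#   -pin on -ntmpi "$NRANKS" -npme 1 -nstlist 200
```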
9. Key Parameters Explained
| Flag | Purpose |
|---|---|
| -nb gpu | Short-range forces computed on GPU |
| -bonded gpu | Bonded terms computed on GPU |
| -pme gpu -npme 1 | Long-range PME on one GPU |
| -update cpu | Runs coordinate updates on the CPU (GPU update is not supported for every input system) |
| GMX_ENABLE_DIRECT_GPU_COMM=1 | Enables peer-to-peer GPU communication |
| -ntmpi 8 | One rank per GPU |
| -nstlist 200 | Reduces neighbor-list rebuild overhead |
10. Monitor Performance
GPU Utilization
watch -n1 nvidia-smi
Throughput (inside container)
watch -n2 "grep -E 'Performance|ns/day' run1.log | tail -n1"
Expected performance (8× H100): ~8.0 ns/day
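For scripted benchmarking, the throughput can be pulled directly from the log instead of watching it. GROMACS writes a summary line of the form `Performance:  <ns/day>  <hour/ns>`, so the second field of the last such line is the figure to record. A minimal sketch (the function name `nsday` is mine):

```shell
# Hypothetical log parser: extract ns/day from the last "Performance:"
# line that mdrun wrote to its log.
nsday() {
  awk '/^Performance:/ { v = $2 } END { print v }' "$1"
}

# Inside the container:
# nsday /bench/run1.log
```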
11. Scaling and Optimization Tips
| Setting | Description |
|---|---|
| OMP Threads | Set OMP_NUM_THREADS = total_CPU_threads / 8 |
| DLB (Dynamic Load Balancing) | Let GROMACS manage automatically |
| CUDA Graphs | Enable via export GMX_CUDA_GRAPH=1 |
| Run Length | Increase -nsteps for smoother performance trends |
| Resume Runs | gmx mdrun -s benchPEP.tpr -cpi run1.cpt -append |
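The tuning knobs above can be collected into one environment block. Note that CUDA-graph support in GROMACS 2023 is experimental, and any gain is workload-dependent, so benchmark with and without `GMX_CUDA_GRAPH`; the resume invocation is shown commented as a sketch.

```shell
# A sketch of the optional tuning environment described above.
export GMX_ENABLE_DIRECT_GPU_COMM=1   # direct GPU-to-GPU communication
export GMX_CUDA_GRAPH=1               # experimental CUDA-graph execution

# Resuming an interrupted benchmark from its checkpoint file:
# gmx mdrun -s benchPEP.tpr -cpi run1.cpt -deffnm run1 -append \
#   -nb gpu -bonded gpu -pme gpu -update cpu -pin on -ntmpi 8 -npme 1
```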
12. Results Location
All output files (e.g., run1.log, run1.xtc, run1.edr, run1.cpt) are stored in:
~/gmxbench/
13. Troubleshooting
| Issue | Possible Fix |
|---|---|
| nvidia-smi: command not found | Verify container uses --gpus all and toolkit is installed |
| CUDA driver not found | Ensure host driver ≥ 550 and Docker was restarted |
| Benchmark fails early | Re-download benchPEP.zip or check disk space |
| Poor GPU scaling | Verify GMX_ENABLE_DIRECT_GPU_COMM=1 and NVLink topology |