Hybrid Cloud HPC
Extend your on-premises infrastructure with high-performance Bare Metal GPU clusters.
Architecture Overview
This hybrid setup connects local workstations to cloud-based H100 GPU clusters through a secure VPN tunnel.
Use Cases
- Large Language Model (LLM) Training: Burst to the cloud for training runs that exceed local capacity.
- Molecular Dynamics Simulations: Run long simulations on dedicated hardware.
- Rendering Farms: Offload heavy 3D rendering tasks.
Configuration
1. Bare Metal Provisioning
- Deploy a cluster of H100 8x NVLink servers.
- Ensure they are networked in the same private VLAN for low-latency node-to-node communication.
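One way to sanity-check the networking requirement is to confirm that every node's private address falls inside a single subnet. The sketch below uses Python's standard `ipaddress` module. The subnet `10.0.0.0/24`, the node addresses, and the helper name `nodes_outside_subnet` are illustrative assumptions, not values from any particular provider. Sharing a subnet does not by itself prove that nodes share a VLAN, so treat this as a first-pass check only.

```python
import ipaddress

def nodes_outside_subnet(node_ips, subnet_cidr):
    """Return the node addresses that do not belong to the given subnet."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return [ip for ip in node_ips if ipaddress.ip_address(ip) not in subnet]

# Hypothetical private addresses for a four-node cluster.
nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13", "10.0.1.14"]
stragglers = nodes_outside_subnet(nodes, "10.0.0.0/24")
print(stragglers)  # ['10.0.1.14']
```

An empty list means every listed node sits in the expected private range.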
2. Secure Connectivity
- Establish a Site-to-Site VPN, or configure OpenVPN on a bastion host (the head node).
- Ensure firewall rules allow inbound traffic only from your office's static IP address.
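The firewall restriction can be sketched as a small generator of `iptables` commands for the bastion host. This is a minimal illustration under stated assumptions: the bastion runs Linux with `iptables`, OpenVPN listens on its default port 1194 over UDP, and the office address `203.0.113.10` is a placeholder from the documentation range. The helper name `build_allow_rules` is hypothetical. The script only prints the commands; it does not apply them.

```python
import ipaddress

def build_allow_rules(office_ip, port=1194, proto="udp"):
    """Build iptables commands that accept VPN traffic from one source only."""
    # Validate the address first; a malformed value raises ValueError.
    source = ipaddress.IPv4Address(office_ip)
    return [
        # Accept VPN traffic from the office address.
        f"iptables -A INPUT -p {proto} -s {source} --dport {port} -j ACCEPT",
        # Drop VPN traffic from every other source.
        f"iptables -A INPUT -p {proto} --dport {port} -j DROP",
    ]

# Placeholder office address (TEST-NET-3 documentation range).
for rule in build_allow_rules("203.0.113.10"):
    print(rule)
```

Rule order matters here: the ACCEPT rule has to come before the DROP rule, since `iptables` evaluates a chain from top to bottom.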
3. Storage
- Deploy a high-throughput shared filesystem (like Ceph or NFS) accessible by all compute nodes.
- Mount the shared storage at /data on all nodes.
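To confirm that the shared filesystem is actually mounted at /data on a node, one option is to inspect the node's mount table. The sketch below parses text in the `/proc/mounts` format used on Linux. The sample entries, including the NFS export `storage01:/export/data`, and the helper name `is_mounted` are made up for illustration.

```python
def is_mounted(mounts_text, mount_point):
    """Return True if mount_point appears as a mount target in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each entry is: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False

# Sample table; on a real Linux node, read the text from /proc/mounts instead.
sample = (
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
    "storage01:/export/data /data nfs4 rw,relatime 0 0\n"
)
print(is_mounted(sample, "/data"))  # True
```

Running the same check on every compute node, for example over SSH from the head node, gives a quick view of which nodes are missing the mount.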