Hybrid Cloud HPC

Extend your on-premises infrastructure with high-performance Bare Metal GPU clusters.

Architecture Overview

This hybrid setup connects local workstations to cloud-based H100 GPU clusters through a secure VPN tunnel.

Use Cases

  • Large Language Model (LLM) Training: Burst to the cloud for training runs that exceed local capacity.
  • Molecular Dynamics Simulations: Run long simulations on dedicated hardware.
  • Rendering Farms: Offload heavy 3D rendering tasks.

Configuration

1. Bare Metal Provisioning

  • Deploy a cluster of bare metal servers, each with 8 H100 GPUs connected by NVLink.
  • Place all servers in the same private VLAN for low-latency node-to-node communication.
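
As a quick sanity check on the provisioning step, the following minimal Python sketch flags nodes whose private address falls outside the expected subnet. It is an illustration only: the function name nodes_outside_subnet, the 10.0.0.0/24 range, and the node addresses are hypothetical, and it inspects IP addressing rather than VLAN tagging itself, so it is a proxy for the VLAN check, not a replacement.

```python
import ipaddress


def nodes_outside_subnet(node_ips, private_cidr):
    """Return the node addresses that do not fall inside the private subnet."""
    network = ipaddress.ip_network(private_cidr)
    return [ip for ip in node_ips if ipaddress.ip_address(ip) not in network]


if __name__ == "__main__":
    # Hypothetical addresses; replace with the private IPs of your own nodes.
    nodes = ["10.0.0.11", "10.0.0.12", "10.0.1.13"]
    stray = nodes_outside_subnet(nodes, "10.0.0.0/24")
    print("Nodes outside the private subnet:", stray)
```

An empty list suggests that every node is addressed within the same private range; any address that is returned is worth checking against the provider's network settings.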

2. Secure Connectivity

  • Establish a site-to-site VPN, or configure OpenVPN on a bastion host (the head node).
  • Restrict firewall rules so that inbound traffic is accepted only from your office's static IP address.
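
One way to express the firewall restriction above is sketched below in Python: it validates an office IP address and prints a pair of iptables commands that accept VPN traffic from that address and drop it from everywhere else. Several things here are assumptions rather than part of the original setup: the use of iptables on the bastion host, the helper name vpn_allowlist_rules, the documentation-range address 203.0.113.10, and the port and protocol (1194/UDP is OpenVPN's default, but your configuration may differ). The script only prints the commands; it does not apply them.

```python
import ipaddress


def vpn_allowlist_rules(office_ip, port=1194, proto="udp"):
    """Build iptables commands that admit VPN traffic from one address only.

    Raises ValueError if office_ip is not a valid IP address.
    """
    ipaddress.ip_address(office_ip)  # validation only
    return [
        # Accept VPN traffic from the office address.
        f"iptables -A INPUT -p {proto} --dport {port} -s {office_ip} -j ACCEPT",
        # Drop VPN traffic from any other source.
        f"iptables -A INPUT -p {proto} --dport {port} -j DROP",
    ]


if __name__ == "__main__":
    # 203.0.113.10 is a placeholder from a documentation range.
    for rule in vpn_allowlist_rules("203.0.113.10"):
        print(rule)
```

The order matters: the ACCEPT rule must precede the DROP rule, because iptables evaluates a chain from top to bottom and stops at the first match.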

3. Storage

  • Deploy a high-throughput shared filesystem (such as Ceph or NFS) that all compute nodes can access.
  • Mount the shared storage at /data on all nodes.
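
To confirm that the shared storage is actually mounted at /data on a node, a small check along the following lines may help. This sketch assumes Linux compute nodes, where /proc/mounts lists one mount per line as device, mount point, and filesystem type; the function name find_mount and the sample NFS export are hypothetical.

```python
def find_mount(mounts_text, mount_point="/data"):
    """Return (device, fstype) for mount_point, or None if it is not mounted.

    mounts_text is expected in /proc/mounts format:
    device, mount point, filesystem type, options, and two numeric fields.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            # Later entries shadow earlier ones, so keep the last match.
            result = (fields[0], fields[2])
    return result


if __name__ == "__main__":
    # Sample text with a hypothetical NFS export; on a real node, pass the
    # contents of /proc/mounts instead.
    sample = (
        "tmpfs /tmp tmpfs rw,nosuid 0 0\n"
        "10.0.0.5:/export /data nfs4 rw,relatime 0 0\n"
    )
    print(find_mount(sample))
```

On a real node, the same function can be fed the text read from /proc/mounts; a result of None on any node means the mount step has not taken effect there.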