
Hosting Open WebUI

1. Overview

This guide provides a complete, production-grade setup for deploying Ollama (GPU-enabled) and Open WebUI on an Ubuntu server with NVIDIA GPUs. The environment enables users to interact with large language models (LLMs) locally or remotely via a modern web interface with GPU acceleration.

Objectives

  • Deploy a GPU-enabled LLM backend using Ollama
  • Integrate Open WebUI as the user-friendly front-end
  • Optionally configure NGINX + SSL for secure public access
  • Ensure persistent storage and monitoring for long-term use


Architecture Overview

Layer                      Purpose                             Port
Ollama Container           GPU-accelerated LLM runtime         11434
Open WebUI Container       Web-based front-end interface       3000
NGINX Reverse Proxy        Public endpoint + SSL termination   80 / 443
Certbot (optional)         Automated TLS certificates          —
Cloudflare DNS (optional)  Domain → server IP mapping          —



2. Environment Prerequisites

Requirement      Recommended
OS               Ubuntu 20.04 / 22.04 / 24.04
GPU Driver       NVIDIA ≥ 535
Docker           Latest version (with --gpus all support)
Internet         Required for image and model downloads
DNS (optional)   e.g. ui.substrateai.net → your server IP



3. Install Docker and NVIDIA Container Toolkit

Step 1 — Install Docker

sudo apt update -y
sudo apt install -y ca-certificates curl gnupg lsb-release

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release; echo $VERSION_CODENAME) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list

sudo apt update -y
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl enable --now docker
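
To confirm the installation before moving on, a quick sanity check (hello-world is a tiny test image pulled from Docker Hub):

docker --version
sudo docker run --rm hello-world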

Step 2 — Install NVIDIA Container Toolkit

distribution=$(. /etc/os-release; echo ${ID}${VERSION_ID})

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /etc/apt/keyrings/nvidia-container-toolkit-keyring.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/etc/apt/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt update -y
sudo apt install -y nvidia-container-toolkit

sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
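
To verify that containers can now see the GPU, run nvidia-smi inside a disposable CUDA container; the image tag below is one example and should roughly match your driver version:

# Should print the same GPU table as nvidia-smi on the host
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi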

4. Deploy Ollama (GPU-Enabled LLM Backend)

docker network create ai-net || true
docker volume create ollama || true

docker run -d \
  --name ollama \
  --gpus all \
  --restart unless-stopped \
  --network ai-net \
  -p 11434:11434 \
  -v ollama:/root/.ollama \
  -e OLLAMA_USE_CUDA=1 \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  ollama/ollama:latest
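
Once the container is running, the Ollama API should respond on port 11434. For example:

# Lists installed models (an empty list on a fresh install)
curl http://localhost:11434/api/tags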

5. Pull and Test a Model

docker exec ollama ollama pull mistral:7b-instruct-q4_K_M
docker exec -it ollama ollama run mistral:7b-instruct-q4_K_M

Monitor GPU activity:

watch -n 0.5 nvidia-smi
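
For a scriptable, non-interactive test, the same model can also be exercised through Ollama's HTTP API (the prompt text here is arbitrary):

curl http://localhost:11434/api/generate -d '{
  "model": "mistral:7b-instruct-q4_K_M",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'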

6. Deploy Open WebUI (Front-End Interface)

docker volume create openwebui || true

docker run -d \
  --name open-webui \
  --restart unless-stopped \
  --network ai-net \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://ollama:11434 \
  -v openwebui:/app/backend/data \
  ghcr.io/open-webui/open-webui:latest

Access the UI: http://<server-ip>:3000
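
If you prefer one declarative file over the two docker run commands above, the same stack can be sketched as a Compose file. This is a minimal sketch assuming Docker Compose v2 and its device-reservation syntax for GPU access:

cat <<'EOF' > docker-compose.yml
services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    environment:
      - OLLAMA_USE_CUDA=1
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  open-webui:
    image: ghcr.io/open-webui/open-webui:latest
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - openwebui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  openwebui:
EOF

docker compose up -d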


7. Key Flags Explained

Flag                                         Purpose
--gpus all                                   Exposes all GPUs to the container
OLLAMA_USE_CUDA=1                            Enables CUDA backend for Ollama
NVIDIA_VISIBLE_DEVICES=all                   Makes all GPUs visible in container
NVIDIA_DRIVER_CAPABILITIES=compute,utility   Enables compute and monitoring
OLLAMA_BASE_URL                              Connects WebUI to Ollama API
-v                                           Mounts persistent volumes for data/models

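As an example of narrowing the GPU scope, a single card can be exposed instead of all of them. The container name, volume, and host port below are illustrative, chosen to avoid clashing with the existing ollama container:

# Pin a second Ollama instance to GPU 0 only (note the nested quoting on --gpus)
docker run -d \
  --name ollama-gpu0 \
  --gpus '"device=0"' \
  -p 11435:11434 \
  -v ollama-gpu0:/root/.ollama \
  ollama/ollama:latest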


8. Verify Connectivity

Check running containers:

docker ps

Expected endpoints:

  • Ollama API: http://localhost:11434
  • Open WebUI: http://localhost:3000

If using DNS + an NGINX proxy (with the example domain from Section 2):

curl http://ui.substrateai.net
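
A small loop can check both endpoints in one pass (URLs assume the default local setup):

for url in http://localhost:11434/api/tags http://localhost:3000; do
  curl -fsS -o /dev/null "$url" && echo "OK   $url" || echo "FAIL $url"
done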

9. (Optional) Enable NGINX Reverse Proxy with SSL

sudo apt install -y nginx certbot python3-certbot-nginx

cat <<'EOF' | sudo tee /etc/nginx/sites-available/openwebui.conf
server {
    listen 80;
    server_name ui.substrateai.net;  # replace with your domain

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF

sudo ln -s /etc/nginx/sites-available/openwebui.conf /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
sudo certbot --nginx -d ui.substrateai.net  # replace with your domain
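
On Ubuntu, the certbot package normally installs a systemd timer that renews certificates automatically; it can be confirmed with:

# The timer should be listed as active
systemctl status certbot.timer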

10. Maintenance Commands

Task               Command
View logs          docker logs -f ollama
Restart services   docker restart ollama open-webui
Update images      docker pull ollama/ollama:latest && docker pull ghcr.io/open-webui/open-webui:latest
Monitor GPU        watch -n 1 nvidia-smi
Test SSL renewal   sudo certbot renew --dry-run

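Note that docker pull only fetches a newer image; a running container keeps the old one until it is recreated. A minimal update sketch for Ollama (the same pattern applies to open-webui; model data survives in the named volume):

docker pull ollama/ollama:latest
docker rm -f ollama
docker run -d \
  --name ollama \
  --gpus all \
  --restart unless-stopped \
  --network ai-net \
  -p 11434:11434 \
  -v ollama:/root/.ollama \
  -e OLLAMA_USE_CUDA=1 \
  -e NVIDIA_VISIBLE_DEVICES=all \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  ollama/ollama:latest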


11. Optional Enhancements

Enhancement           Description
Systemd service       Auto-start containers on reboot
HTTPS-only redirect   Add return 301 https://$host$request_uri; in the port 80 block
Authentication        Protect WebUI with auth_basic or API keys
Monitoring            Integrate Prometheus + Grafana for GPU metrics
Load Balancing        Scale via NGINX or HAProxy across nodes
Caching               Use Redis / LMCache for faster model responses
Kubernetes scaling    Deploy with MicroK8s / Argo for multi-GPU orchestration

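For the systemd-service enhancement, a minimal unit sketch that starts the existing containers at boot. The unit name ai-stack.service is an assumption, and --restart unless-stopped already covers most reboot cases; this simply makes the dependency on docker.service explicit:

cat <<'EOF' | sudo tee /etc/systemd/system/ai-stack.service
[Unit]
Description=Start Ollama and Open WebUI containers
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/docker start ollama open-webui
ExecStop=/usr/bin/docker stop ollama open-webui

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now ai-stack.service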