Hosting Open WebUI
1. Overview
This guide provides a complete, production-grade setup for deploying Ollama (GPU-enabled) and Open WebUI on an Ubuntu server with NVIDIA GPUs. The environment enables users to interact with large language models (LLMs) locally or remotely via a modern web interface with GPU acceleration.
Objectives
- Deploy a GPU-enabled LLM backend using Ollama
- Integrate Open WebUI as the user-friendly front-end
- Optionally configure NGINX + SSL for secure public access
- Ensure persistent storage and monitoring for long-term use
Architecture Overview
| Layer | Purpose | Port |
|---|---|---|
| Ollama Container | GPU-accelerated LLM runtime | 11434 |
| Open WebUI Container | Web-based front-end interface | 3000 |
| NGINX Reverse Proxy | Public endpoint + SSL termination | 80 / 443 |
| Certbot (optional) | Automated TLS certificates | — |
| Cloudflare DNS (optional) | Domain → server IP mapping | — |
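For reference, the same two-container architecture can be captured in a single Docker Compose file. This is a minimal sketch derived from the run commands in sections 4 and 6; the file name and service names are illustrative, not part of the official images:

```bash
cat <<'EOF' > docker-compose.yml
services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  open-webui:
    image: ghcr.io/open-webui/open-webui:latest
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - openwebui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  openwebui:
EOF
docker compose up -d
```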
2. Environment Prerequisites
| Requirement | Recommended |
|---|---|
| OS | Ubuntu 20.04 / 22.04 / 24.04 |
| GPU Driver | NVIDIA ≥ 535 |
| Docker | Latest version (with --gpus all support) |
| Internet | Required for image and model downloads |
| DNS (optional) | e.g. ui.substrateai.net → your server IP |
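Before proceeding, it is worth confirming the host actually meets these requirements:

```bash
# Driver check (assumes the NVIDIA driver is already installed); expect >= 535
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Ubuntu release check; expect 20.04, 22.04, or 24.04
lsb_release -rs
```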
3. Install Docker and NVIDIA Container Toolkit
Step 1 — Install Docker
sudo apt update -y
sudo apt install -y ca-certificates curl gnupg lsb-release
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release; echo $VERSION_CODENAME) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list
sudo apt update -y
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl enable --now docker
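A quick smoke test confirms the engine works before adding GPU support:

```bash
# Pulls and runs the hello-world image, then prints a confirmation message
sudo docker run --rm hello-world
```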
Step 2 — Install NVIDIA Container Toolkit
distribution=$(. /etc/os-release; echo ${ID}${VERSION_ID})
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /etc/apt/keyrings/nvidia-container-toolkit-keyring.gpg
curl -fsSL https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/etc/apt/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update -y
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
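Before deploying Ollama, verify that containers can actually see the GPU. The CUDA image tag below is only an illustration; any recent nvidia/cuda base tag works:

```bash
# Should print the same nvidia-smi table you see on the host
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```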
4. Deploy Ollama (GPU-Enabled LLM Backend)
docker network create ai-net || true
docker volume create ollama || true
docker run -d \
--name ollama \
--gpus all \
--restart unless-stopped \
--network ai-net \
-p 11434:11434 \
-v ollama:/root/.ollama \
-e OLLAMA_USE_CUDA=1 \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
ollama/ollama:latest
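Once the container is up, the Ollama HTTP API should answer on port 11434:

```bash
# Returns the server version as JSON
curl http://localhost:11434/api/version
# Lists locally available models (empty until a model is pulled in section 5)
curl http://localhost:11434/api/tags
```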
5. Pull and Test a Model
docker exec ollama ollama pull mistral:7b-instruct-q4_K_M
docker exec -it ollama ollama run mistral:7b-instruct-q4_K_M
Monitor GPU activity:
watch -n 0.5 nvidia-smi
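Besides the interactive ollama run session, the model can be exercised directly over Ollama's REST API, which is the same interface Open WebUI uses. A one-off, non-streaming request:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "mistral:7b-instruct-q4_K_M",
  "prompt": "Explain GPU offloading in one sentence.",
  "stream": false
}'
```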
6. Deploy Open WebUI (Front-End Interface)
docker volume create openwebui || true
docker run -d \
--name open-webui \
--restart unless-stopped \
--network ai-net \
-p 3000:8080 \
-e OLLAMA_BASE_URL=http://ollama:11434 \
-v openwebui:/app/backend/data \
ghcr.io/open-webui/open-webui:latest
Access the UI at http://<server-ip>:3000
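To confirm the container started cleanly and is serving, check the logs and the HTTP endpoint (the /health route is an assumption based on current Open WebUI builds):

```bash
docker logs --tail 20 open-webui
# A 200 response means the UI is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/health
```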
7. Key Flags Explained
| Flag | Purpose |
|---|---|
| --gpus all | Exposes all GPUs to the container |
| OLLAMA_USE_CUDA=1 | Enables CUDA backend for Ollama |
| NVIDIA_VISIBLE_DEVICES=all | Makes all GPUs visible in container |
| NVIDIA_DRIVER_CAPABILITIES=compute,utility | Enables compute and monitoring |
| OLLAMA_BASE_URL | Connects WebUI to Ollama API |
| -v | Mounts persistent volumes for data/models |
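Note that --gpus and NVIDIA_VISIBLE_DEVICES accept more than all. On a multi-GPU host you can pin a container to specific devices; the device index below is illustrative:

```bash
# Sanity check: only GPU 0 should appear in the output
# (Docker requires the nested quoting around device=)
docker run --rm --gpus '"device=0"' nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```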
8. Verify Connectivity
Check running containers:
docker ps
Expected Endpoints:
- Ollama API: http://localhost:11434
- Open WebUI: http://localhost:3000
If using DNS + NGINX proxy:
curl http://<your-domain>
9. (Optional) Enable NGINX Reverse Proxy with SSL
sudo apt install -y nginx certbot python3-certbot-nginx
cat <<'EOF' | sudo tee /etc/nginx/sites-available/openwebui.conf
server {
    listen 80;
    server_name <your-domain>;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF
sudo ln -s /etc/nginx/sites-available/openwebui.conf /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
sudo certbot --nginx -d <your-domain>
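After Certbot rewrites the config, confirm the proxy chain end to end (substitute your real domain for the placeholder):

```bash
# Port 80 should redirect to HTTPS if you chose Certbot's redirect option
curl -I http://<your-domain>
# HTTPS should return the WebUI with a valid certificate
curl -I https://<your-domain>
# Ubuntu's certbot package installs a renewal timer; confirm it is active
systemctl list-timers certbot.timer
```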
10. Maintenance Commands
| Task | Command |
|---|---|
| View logs | docker logs -f ollama |
| Restart services | docker restart ollama open-webui |
| Update images | docker pull ollama/ollama:latest && docker pull ghcr.io/open-webui/open-webui:latest |
| Monitor GPU | watch -n 1 nvidia-smi |
| Test SSL renew | sudo certbot renew --dry-run |
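Note that docker pull only downloads a newer image; a running container keeps using its old one. A minimal update routine for the WebUI (flags match section 6; chat data survives because it lives in the named volume):

```bash
docker pull ghcr.io/open-webui/open-webui:latest
docker stop open-webui && docker rm open-webui
docker run -d \
  --name open-webui \
  --restart unless-stopped \
  --network ai-net \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://ollama:11434 \
  -v openwebui:/app/backend/data \
  ghcr.io/open-webui/open-webui:latest
```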
11. Optional Enhancements
| Enhancement | Description |
|---|---|
| Systemd service | Auto-start containers on reboot |
| HTTPS-only redirect | Add return 301 https://$host$request_uri; in port 80 block |
| Authentication | Protect WebUI with auth_basic or API keys |
| Monitoring | Integrate Prometheus + Grafana for GPU metrics |
| Load Balancing | Scale via NGINX or HAProxy across nodes |
| Caching | Use Redis / LMCache for faster model response |
| Kubernetes scaling | Deploy with MicroK8s / Argo for multi-GPU orchestration |
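As a sketch of the first enhancement, a small systemd unit can start both containers once the Docker daemon is up. The unit name is illustrative, and --restart unless-stopped already covers most reboot cases, so treat this as an optional extra layer:

```bash
cat <<'EOF' | sudo tee /etc/systemd/system/ai-stack.service
[Unit]
Description=Start Ollama and Open WebUI containers
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/bin/docker start ollama open-webui
ExecStop=/usr/bin/docker stop open-webui ollama

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now ai-stack.service
```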