Hosting Open WebUI
1. Overview
This guide provides a complete, production-grade setup for deploying Ollama (GPU-enabled) and Open WebUI on an Ubuntu server with NVIDIA GPUs. The environment enables users to interact with large language models (LLMs) locally or remotely via a modern web interface with GPU acceleration.
Objectives
- Deploy a GPU-enabled LLM backend using Ollama
- Integrate Open WebUI as the user-friendly front-end
- Optionally configure NGINX + SSL for secure public access
- Ensure persistent storage and monitoring for long-term use
Architecture Overview
| Layer | Purpose | Port |
|---|---|---|
| Ollama Container | GPU-accelerated LLM runtime | 11434 |
| Open WebUI Container | Web-based front-end interface | 3000 |
| NGINX Reverse Proxy | Public endpoint + SSL termination | 80 / 443 |
| Certbot (optional) | Automated TLS certificates | — |
| Cloudflare DNS (optional) | Domain → server IP mapping | — |
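For reference, the same two-container architecture can be captured in a single Docker Compose file. This is a minimal sketch derived from the run commands in sections 4 and 6; the file name and service names are illustrative, not part of the official images:

```bash
cat <<'EOF' > docker-compose.yml
services:
  ollama:
    image: ollama/ollama:latest
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  open-webui:
    image: ghcr.io/open-webui/open-webui:latest
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - openwebui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  openwebui:
EOF
docker compose up -d
```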
2. Environment Prerequisites
| Requirement | Recommended |
|---|---|
| OS | Ubuntu 20.04 / 22.04 / 24.04 |
| GPU Driver | NVIDIA ≥ 535 |
| Docker | Latest version (with --gpus all support) |
| Internet | Required for image and model downloads |
| DNS (optional) | e.g. ui.substrateai.net → your server IP |
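Before proceeding, it is worth confirming the host actually meets these requirements:

```bash
# Driver check (assumes the NVIDIA driver is already installed); expect >= 535
nvidia-smi --query-gpu=driver_version --format=csv,noheader
# Ubuntu release check; expect 20.04, 22.04, or 24.04
lsb_release -rs
```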
3. Install Docker and NVIDIA Container Toolkit
Step 1 — Install Docker
sudo apt update -y
sudo apt install -y ca-certificates curl gnupg lsb-release
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu $(. /etc/os-release; echo $VERSION_CODENAME) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list
sudo apt update -y
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo systemctl enable --now docker
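A quick smoke test confirms the engine works before adding GPU support:

```bash
# Pulls and runs the hello-world image, then prints a confirmation message
sudo docker run --rm hello-world
```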
Step 2 — Install NVIDIA Container Toolkit
distribution=$(. /etc/os-release; echo ${ID}${VERSION_ID})
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /etc/apt/keyrings/nvidia-container-toolkit-keyring.gpg
curl -fsSL https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/etc/apt/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update -y
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
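Before deploying Ollama, verify that containers can actually see the GPU. The CUDA image tag below is only an illustration; any recent nvidia/cuda base tag works:

```bash
# Should print the same nvidia-smi table you see on the host
sudo docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```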
4. Deploy Ollama (GPU-Enabled LLM Backend)
docker network create ai-net || true
docker volume create ollama || true
docker run -d \
--name ollama \
--gpus all \
--restart unless-stopped \
--network ai-net \
-p 11434:11434 \
-v ollama:/root/.ollama \
-e OLLAMA_USE_CUDA=1 \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
ollama/ollama:latest
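Once the container is up, the Ollama HTTP API should answer on port 11434:

```bash
# Returns the server version as JSON
curl http://localhost:11434/api/version
# Lists locally available models (empty until a model is pulled in section 5)
curl http://localhost:11434/api/tags
```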
5. Pull and Test a Model
docker exec ollama ollama pull mistral:7b-instruct-q4_K_M
docker exec -it ollama ollama run mistral:7b-instruct-q4_K_M
Monitor GPU activity:
watch -n 0.5 nvidia-smi
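Besides the interactive ollama run session, the model can be exercised directly over Ollama's REST API, which is the same interface Open WebUI uses. A one-off, non-streaming request:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "mistral:7b-instruct-q4_K_M",
  "prompt": "Explain GPU offloading in one sentence.",
  "stream": false
}'
```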
6. Deploy Open WebUI (Front-End Interface)
docker volume create openwebui || true
docker run -d \
--name open-webui \
--restart unless-stopped \
--network ai-net \
-p 3000:8080 \
-e OLLAMA_BASE_URL=http://ollama:11434 \
-v openwebui:/app/backend/data \
ghcr.io/open-webui/open-webui:latest
Access the UI at http://<server-ip>:3000
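To confirm the container started cleanly and is serving, check the logs and the HTTP endpoint (the /health route is an assumption based on current Open WebUI builds):

```bash
docker logs --tail 20 open-webui
# A 200 response means the UI is up
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/health
```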
7. Key Flags Explained
| Flag | Purpose |
|---|---|
| --gpus all | Exposes all GPUs to the container |
| OLLAMA_USE_CUDA=1 | Enables CUDA backend for Ollama |
| NVIDIA_VISIBLE_DEVICES=all | Makes all GPUs visible in container |
| NVIDIA_DRIVER_CAPABILITIES=compute,utility | Enables compute and monitoring |
| OLLAMA_BASE_URL | Connects WebUI to Ollama API |
| -v | Mounts persistent volumes for data/models |
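Note that --gpus and NVIDIA_VISIBLE_DEVICES accept more than all. On a multi-GPU host you can pin a container to specific devices; the device index below is illustrative:

```bash
# Sanity check: only GPU 0 should appear in the output
# (Docker requires the nested quoting around device=)
docker run --rm --gpus '"device=0"' nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```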
8. Verify Connectivity
Check running containers:
docker ps
Expected Endpoints:
- Ollama API: http://localhost:11434
- Open WebUI: http://localhost:3000
If using DNS + NGINX proxy:
curl http://<your-domain>
9. (Optional) Enable NGINX Reverse Proxy with SSL
sudo apt install -y nginx certbot python3-certbot-nginx
cat <<'EOF' | sudo tee /etc/nginx/sites-available/openwebui.conf
server {
    listen 80;
    server_name <your-domain>;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
EOF
sudo ln -s /etc/nginx/sites-available/openwebui.conf /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx
sudo certbot --nginx -d <your-domain>
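After Certbot rewrites the config, confirm the proxy chain end to end (substitute your real domain for the placeholder):

```bash
# Port 80 should redirect to HTTPS if you chose Certbot's redirect option
curl -I http://<your-domain>
# HTTPS should return the WebUI with a valid certificate
curl -I https://<your-domain>
# Ubuntu's certbot package installs a renewal timer; confirm it is active
systemctl list-timers certbot.timer
```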
10. Maintenance Commands
| Task | Command |
|---|---|
| View logs | docker logs -f ollama |
| Restart services | docker restart ollama open-webui |
| Update images | docker pull ollama/ollama:latest && docker pull ghcr.io/open-webui/open-webui:latest |
| Monitor GPU | watch -n 1 nvidia-smi |
| Test SSL renew | sudo certbot renew --dry-run |
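Note that docker pull only downloads a newer image; a running container keeps using its old one. A minimal update routine for the WebUI (flags match section 6; chat data survives because it lives in the named volume):

```bash
docker pull ghcr.io/open-webui/open-webui:latest
docker stop open-webui && docker rm open-webui
docker run -d \
  --name open-webui \
  --restart unless-stopped \
  --network ai-net \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://ollama:11434 \
  -v openwebui:/app/backend/data \
  ghcr.io/open-webui/open-webui:latest
```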
11. Optional Enhancements
| Enhancement | Description |
|---|---|
| Systemd service | Auto-start containers on reboot |
| HTTPS-only redirect | Add return 301 https://$host$request_uri; in port 80 block |
| Authentication | Protect WebUI with auth_basic or API keys |
| Monitoring | Integrate Prometheus + Grafana for GPU metrics |
| Load Balancing | Scale via NGINX or HAProxy across nodes |
| Caching | Use Redis / LMCache for faster model response |
| Kubernetes scaling | Deploy with MicroK8s / Argo for multi-GPU orchestration |
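As a sketch of the first enhancement, a small systemd unit can start both containers once the Docker daemon is up. The unit name is illustrative, and --restart unless-stopped already covers most reboot cases, so treat this as an optional extra layer:

```bash
cat <<'EOF' | sudo tee /etc/systemd/system/ai-stack.service
[Unit]
Description=Start Ollama and Open WebUI containers
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/bin/docker start ollama open-webui
ExecStop=/usr/bin/docker stop open-webui ollama

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now ai-stack.service
```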