I made a mild how-to for replicating my setup. Please note that the Nvidia driver setup is a SEPARATE "huge deal" that is out of scope for this how-to. You'll know the system is working if you see GPU activity in nvtop when you send a prompt.
I run the open-webui stack found here:
https://github.com/open-webui/open-webui

In order to have ollama (the LLM backend) available for my other fun projects, and to make sure ollama actually uses the GPU, run the api and gpu compose files:
docker compose -f docker-compose.yaml -f docker-compose.api.yaml -f docker-compose.gpu.yaml up -d --build
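These compose files read a few optional environment variables (you can see them in the files below). If you want to pin image tags, change the GPU driver, or move the exposed API port, the easiest way is a .env file sitting next to the compose files, which docker compose picks up automatically. The values here just restate the defaults, so this file is optional:

# .env -- optional overrides for the compose files below
OLLAMA_GPU_DRIVER=nvidia
OLLAMA_WEBAPI_PORT=11434
OLLAMA_DOCKER_TAG=latest
WEBUI_DOCKER_TAG=main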
I've catted my own versions of these files below:
(base) house@chonkers:~/open-webui$ cat docker-compose.gpu.yaml
services:
  ollama:
    # GPU support
    deploy:
      resources:
        reservations:
          devices:
            - driver: ${OLLAMA_GPU_DRIVER-nvidia}
              count: all
              capabilities:
                - gpu
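A quick sanity check that the container can actually see the card (this assumes the Nvidia driver plus NVIDIA Container Toolkit setup I'm skipping here is already working):

docker exec -it ollama nvidia-smi

If that prints your GPU, the reservation above is doing its job; if nvidia-smi isn't found or shows no devices, the problem is on the driver/toolkit side, not in these compose files.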
(base) house@chonkers:~/open-webui$ cat docker-compose.api.yaml
version: '3.8'
services:
  ollama:
    # Expose Ollama API outside the container stack
    ports:
      - ${OLLAMA_WEBAPI_PORT-11434}:11434
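With the api compose file in the mix, the Ollama API is published on the host (port 11434 unless you override OLLAMA_WEBAPI_PORT), which is what lets my other projects talk to it directly. A quick check from the host, which should return a JSON list of whatever models you've pulled (an empty list still counts as success):

curl http://localhost:11434/api/tags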
(base) house@chonkers:~/open-webui$ cat docker-compose.yaml
version: '3.8'

services:
  ollama:
    volumes:
      - ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:${OLLAMA_DOCKER_TAG-latest}

  open-webui:
    build:
      context: .
      args:
        OLLAMA_BASE_URL: '/ollama'
      dockerfile: Dockerfile
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG-main}
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - 3131:8080
    environment:
      - 'OLLAMA_BASE_URL=http://ollama:11434'
      - 'WEBUI_SECRET_KEY='
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama: {}
  open-webui: {}
(base) house@chonkers:~/open-webui$
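To actually see the nvtop action mentioned at the top: pull a model into the ollama container, then send it a prompt while nvtop runs in another terminal. The model name below is just an example; substitute whatever you want to run:

docker exec -it ollama ollama pull llama3
curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?"}'

While the second command streams its answer, nvtop should show the GPU lighting up. The web UI itself is on port 3131 (from the 3131:8080 mapping above), so http://localhost:3131 in a browser gets you the chat interface.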