Reviving an old GPU: Setting up Ollama and Llama 3.1 in a homelab
I’ve been wanting to put to use an old GPU that has been sitting around for a while, and with the recent release of Llama 3.1… I think I have found a reason to blow off the dust.
Hardware
The hardware in this server is not ideal, but this list should help you gauge what may be needed.
CPU: Intel i7-3770 @ 3.4GHz
RAM: 32GB (4x8GB) DDR3 1333MHz
GPU: GTX 1080 with Driver 535.183.01
As you can see, it isn’t much, but it also used to be a pretty good gaming PC!
Setup
To get started, I am using Portainer to help orchestrate my docker-compose.yml. This allows me to easily manage multiple containers across a fleet of virtual machines. This particular server is only running Plex and, now, Ollama.
Update packages
sudo apt update
Install NVIDIA drivers
sudo apt install nvidia-driver-535
Upgrade system and reboot
sudo apt upgrade
sudo reboot
Install NVIDIA toolkit
sudo apt-get install -y nvidia-container-toolkit
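Depending on your version of the toolkit, you may also need to register the NVIDIA runtime with Docker before restarting it; this is the step NVIDIA’s install docs recommend:
sudo nvidia-ctk runtime configure --runtime=docker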
Restart docker
sudo systemctl restart docker
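At this point it is worth a quick sanity check that both the driver and the container runtime can see the card. nvidia-smi should list the GTX 1080 on the host, and running it inside a throwaway CUDA container confirms Docker can reach the GPU (the image tag below is just one I know exists; any CUDA base image should work):
nvidia-smi
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi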
Your mileage may vary from system to system: you may need to install nvidia-docker2, or use nvidia-container-toolkit-base instead of what I have above. If you get the error below, you may need to reinstall the NVIDIA drivers.
Failed to initialize NVML: Driver/library version mismatch
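In my experience this mismatch usually appears after a driver upgrade while the old kernel module is still loaded; a reboot, or a clean reinstall of the driver package (the version below matches the one installed earlier), typically clears it:
sudo apt install --reinstall nvidia-driver-535
sudo reboot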
Docker Compose
I mentioned above how I orchestrate my compose files, and I managed to get everything working in a single compose file. I also have other things configured, such as a reverse proxy that allows me to add SSL certificates on top of these containers. The file below is mostly pulled from Open WebUI’s GitHub here.
---
version: "3.8"
services:
  ollama:
    volumes:
      - /opt/ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:latest
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: ["gpu"]
  open-webui:
    build:
      context: .
      args:
        OLLAMA_BASE_URL: '/ollama'
      dockerfile: Dockerfile
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    volumes:
      - /opt/ollama/webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - 8080:8080
    environment:
      - 'OLLAMA_BASE_URL=http://ollama:11434'
      - 'WEBUI_SECRET_KEY='
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped
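With the stack deployed (Portainer handles this for me, but docker compose up -d does the same thing from the command line), the last step is pulling a model into the ollama container. The tag below grabs the default 8B build of Llama 3.1:
docker compose up -d
docker exec -it ollama ollama pull llama3.1
docker exec -it ollama ollama run llama3.1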
Conclusion
Thank you for surviving this long. With the above, I was able to get Llama 3.1 running on my GTX 1080, and it is actually quite fast. I’ve been a big user of OpenAI’s ChatGPT 4o, and speed-wise, this is a bit faster in its responses.
I did notice some differences. I had to explicitly prompt Llama 3.1 to give me code as output; otherwise it leaned towards plain-text answers. I will post more in this space as I test it further.
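If you want to experiment with prompts outside of Open WebUI, Ollama also exposes an HTTP API on port 11434. A minimal example (the prompt here is just an illustration):
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Write a Python function that reverses a string. Respond with code only.",
  "stream": false
}'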
Thanks all!
Cory