Run LLMs on Nodes

To run LLMs on the Dalek nodes, you can use the Open WebUI frontend locally on your machine and connect it directly to a node.

Install and Run the Open WebUI Frontend

The Open WebUI quick-start page lists several installation methods; here we follow the Docker one. Pull the image from the container registry:

docker pull ghcr.io/open-webui/open-webui:main

Run it on your local machine, making it available on localhost:3000:

sudo docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway \
                -e WEBUI_AUTH=False \
                -v open-webui:/app/backend/data \
                --name open-webui ghcr.io/open-webui/open-webui:main

Info

The important options here are WEBUI_AUTH=False, which launches Open WebUI in single-user mode (we don't want to handle multiple accounts on a local machine), and --add-host=host.docker.internal:host-gateway, which makes host.docker.internal inside the container point to the localhost of your main OS. This will simplify the connection to the node(s) later.
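Once the container has started, you can quickly check that it is up and that the interface responds; a minimal check, assuming the container name open-webui from the command above:

sudo docker ps --filter name=open-webui   # the container should be listed with an "Up" status
curl -sI http://localhost:3000            # should answer with an HTTP 200 once the UI is ready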

Run a Backend on a Node

Multiple backends are available, but Open WebUI supports Ollama out of the box, which makes getting models up and running easy.

To start Ollama on the node, run:

module load ollama
ollama serve

Info

This makes Ollama listen for HTTP requests on the node, on port 11434.
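You can verify this directly on the node, before any port forwarding is set up. From a second shell on the same node (the first one is kept busy by ollama serve):

curl http://localhost:11434/   # should print: Ollama is running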

Connecting the Two Ends

You then need to set up port forwarding so that port 11434 on your local machine reaches port 11434 on the node. Once a node has been allocated to you, run the following from your local machine:

ssh -J front.dalek.lip6 -N -f [USER_DALEK]@[NODE_NAME] -L 0.0.0.0:11434:localhost:11434

Info

This command connects to the node by first performing a jump through the front. It then sets up port forwarding to bind your local port 11434 to the node's localhost:11434. The binding is done on all your local interfaces. The command runs in the background without launching a shell, so if no errors are printed after issuing it, it is working properly.

[USER_DALEK] is your Dalek login and [NODE_NAME] is the name of the node you want to connect to (typically something like az4-n4090-2 or az4-a7900-0).
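If you reconnect often, you can put the jump and the forwarding in your local ~/.ssh/config instead of retyping the full command. A possible sketch, using the same placeholders as above (the host alias dalek-llm is arbitrary):

Host dalek-llm
    HostName [NODE_NAME]
    User [USER_DALEK]
    ProxyJump front.dalek.lip6
    LocalForward 0.0.0.0:11434 localhost:11434

The tunnel can then be started with ssh -N -f dalek-llm and stopped later by killing the background process, for example with pkill -f "ssh -N -f dalek-llm".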

To check if it works, you can browse http://localhost:11434/ and it should display Ollama is running.

The last step is to navigate to the Open WebUI Administration Settings (on the localhost:3000 page: User > Settings > Administration Settings > Connections > Ollama API) and make sure that the Ollama API connection is set to http://host.docker.internal:11434.
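Alternatively, Open WebUI can read the backend URL from the OLLAMA_BASE_URL environment variable at container start, which lets you skip this manual step; a sketch reusing the docker run command from above:

sudo docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway \
                -e WEBUI_AUTH=False \
                -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
                -v open-webui:/app/backend/data \
                --name open-webui ghcr.io/open-webui/open-webui:main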

That's all folks!

Installation Notes

By default, the following models are installed and made available when loading the ollama module: deepseek-r1:14b, gpt-oss:latest, mixtral:8x22b, qwen3-coder:30b, qwen3:30b and mistral-small3.2.
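With ollama serve running, you can list them from a second shell on the node:

module load ollama
ollama list   # prints the models known to the running server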

By default, you cannot add or update these models. If you need specific models or specific versions, you can run unset OLLAMA_MODELS after module load ollama so that Ollama falls back to your home directory as the default model library. As the disk quota available on the NFS may be limited, you can also point OLLAMA_MODELS to the scratch and download your models there.

For example, you can do the following to store the models on the scratch:

module load ollama
mkdir -p /scratch/$USER/ollama/
OLLAMA_MODELS=/scratch/$USER/ollama/ ollama serve
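From a second shell on the node, you can then pull additional models; the running server stores them under the OLLAMA_MODELS directory given above. The model name below is only an example, pick the one you actually need:

module load ollama
ollama pull llama3.2   # example model, downloaded into /scratch/$USER/ollama/
ollama list            # the new model should now appear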

Danger

Please keep in mind that models are heavy. When you store them on the NFS or the scratch, keep an eye on what you really use and clean up occasionally.
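A few commands that help with this housekeeping, following the scratch example above (the model name is only an example):

du -sh /scratch/$USER/ollama/   # total space used by your model library
ollama list                     # models currently installed
ollama rm llama3.2              # remove a model you no longer use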