OS·WholeTech
🧱 Proxmox VE · home lab

The AI stack on Proxmox.

Proxmox VE is a free hypervisor — software that runs multiple virtual computers (VMs) and lightweight containers (LXC) on one physical machine. The goal here: create one always-on Linux container that hosts your AI tools and serves them to your whole network over Tailscale.

This is a more advanced home-lab setup than the other guides cover, but the plan is simple: make a container, then follow the Linux guide inside it. Every command is here to copy.

🗺️ New to the names? See the full map of AI tools and how they fit together at /landscape/.
Before you start

What you'll need

1

Create a Linux container

An LXC container is a lightweight virtual Linux machine — much lighter than a full VM. You'll make one container, called something like ai-box, and everything else in this guide runs inside it.

Why this first: one always-on container is the home for all your AI tools. Build it once and it quietly runs 24/7, ready for every device on your network.

Open the Proxmox web interface

In a browser on any device, go to your Proxmox box (replace with its IP address):

https://YOUR-PROXMOX-IP:8006
Get an OS template (one-time)

Open the host Shell (Datacenter → your node → Shell) and download an Ubuntu template:

pveam update
pveam available --section system
pveam download local ubuntu-24.04-standard_24.04-2_amd64.tar.zst

The exact filename from the available list may differ — use the ubuntu-24.04-standard one shown in your list.
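
Rather than copying the filename by eye, you can filter the list for it. This is a minimal sketch, assuming pveam available prints two whitespace-separated columns (section, then template name) and that you want the highest-numbered build; latest_ubuntu_template is a hypothetical helper name, not part of Proxmox.

```shell
# Hypothetical helper: read `pveam available` output on stdin and print
# the newest ubuntu-24.04-standard template filename.
# Assumes two columns per line: section, then template name.
latest_ubuntu_template() {
  awk '$2 ~ /^ubuntu-24\.04-standard/ { print $2 }' | sort -V | tail -n 1
}

# On the Proxmox host you would use it like this (not run here):
#   tmpl="$(pveam available --section system | latest_ubuntu_template)"
#   pveam download local "$tmpl"
```

If the helper prints nothing, the list has no matching template and you should pick one from the pveam available output by hand.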

Create the container in the UI

Click Create CT (top-right). Set a hostname (e.g. ai-box), pick the template you just downloaded, give it about 4 GB RAM and 2 cores to start, finish the wizard, then Start it.
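
If you prefer the host Shell to the wizard, roughly the same container can be created with pct. This is a sketch under assumptions, not the guide's own method: the storage name local-lvm, the 16 GB root disk, the bridge vmbr0, and DHCP are common defaults that may not match your node, and run and create_ai_box are hypothetical helper names. By default it only prints the commands it would run.

```shell
# Print commands instead of running them unless DRY_RUN=0 is set,
# so the sketch is safe to try on any machine.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper mirroring the wizard values from the text:
# hostname ai-box, 4 GB RAM, 2 cores.
# Assumptions: storage "local-lvm", 16 GB root disk, bridge "vmbr0", DHCP.
create_ai_box() {
  ctid="$1"
  template="$2"   # e.g. local:vztmpl/<the file you downloaded>
  run pct create "$ctid" "$template" \
    --hostname ai-box \
    --memory 4096 \
    --cores 2 \
    --rootfs local-lvm:16 \
    --net0 name=eth0,bridge=vmbr0,ip=dhcp \
    --unprivileged 1
  run pct start "$ctid"
}

# Dry run with an example CTID and the template name used above.
create_ai_box 100 "local:vztmpl/ubuntu-24.04-standard_24.04-2_amd64.tar.zst"
```

Check the printed commands against your own storage and bridge names, then run the script again with DRY_RUN=0 on the Proxmox host to create and start the container for real.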

Enter the container from the host Shell

The CTID is the number shown for your container (e.g. 100):

pct enter CTID
✓ Working when: inside the container, running cat /etc/os-release shows Ubuntu.
💡 Shortcut: there are community "helper scripts" (community-scripts.github.io, formerly tteck's scripts) that create containers in one line. They are convenient, but they are community-maintained and run as root on your host, so only run them if you trust them.
2

Node & Claude Code

The container is a small Linux machine, so from here the setup matches the Linux guide. First the basics and Node.js (the engine three of the AI tools run on), then Claude Code — Anthropic's AI coding agent that lives in your terminal and edits real files for you.

Why this first: every tool below installs through the terminal, and three of them need Node. Get this done and the rest is copy-paste.

Update the system & install basics
apt update && apt upgrade -y
apt install -y curl git build-essential
Install Node.js

This adds the NodeSource repository (a widely used third-party source of Node.js packages), then installs the current LTS version:

curl -fsSL https://deb.nodesource.com/setup_lts.x | bash -
apt install -y nodejs
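
Before installing the agents, it is worth confirming which Node version you ended up with. The sketch below parses the version string and compares its major number against a minimum. The default minimum of 20 is an assumption, not a figure from this guide, so check each tool's own requirements; node_major and node_ok are hypothetical helper names.

```shell
# Hypothetical helper: turn "v22.11.0" into "22".
node_major() {
  echo "$1" | sed -E 's/^v?([0-9]+).*/\1/'
}

# Succeeds if the version string meets the minimum major version.
# The default minimum of 20 is an assumption; check each tool's docs.
node_ok() {
  [ "$(node_major "$1")" -ge "${2:-20}" ]
}

# Only query node if it is actually installed.
if command -v node >/dev/null 2>&1; then
  if node_ok "$(node --version)"; then
    echo "Node $(node --version) looks fine"
  else
    echo "Node $(node --version) is older than expected"
  fi
fi
```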
Install Claude Code
npm install -g @anthropic-ai/claude-code
Start it & log in

Go into any folder you want to work in, then launch:

claude

The container has no desktop browser, so the first run prints a URL. Copy that link, open it in the browser on your laptop or phone, sign in to your Anthropic account, and you're authorized. After that, just type what you want in plain English.

✓ Working when: running claude --version prints a version, and typing claude drops you into a prompt that greets you.
📘 Go deeper: the full Claude playbook lives at claude.wholetech.com, and the inside-the-container steps are identical to the Linux guide.
3

Codex

OpenAI's command-line coding agent. Same idea as Claude Code, different brain — handy as a second opinion or when you've used up one tool's quota.

Why you want it: variety. Different models are stronger at different things; having both means you're never stuck.

Install
npm install -g @openai/codex
Start it & log in
codex

First run lets you sign in with your ChatGPT/OpenAI account (or an API key). With no desktop browser it prints a link to copy into your own browser, same as Claude.

✓ Working when: codex --version prints a version and codex opens its prompt.
4

Gemini CLI

Google's command-line AI agent. Notable for a very generous free tier and an enormous memory (context window) for long documents.

Why you want it: free headroom. Great for big, sprawling tasks before you spend on the others.

Install
npm install -g @google/gemini-cli
Start it & log in
gemini

First run signs you in with a Google account using the same URL-based login — it prints a link, you open it in your browser to authorize. The free tier is large, so you likely won't pay anything to start.

✓ Working when: gemini --version prints a version and gemini opens its prompt.
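
The three "Working when" checks so far can be rolled into one pass. A minimal sketch: check_tools is a hypothetical helper that only tests whether each command is on the PATH; it does not confirm you are logged in.

```shell
# Report which of the given commands are on PATH.
# Returns non-zero if any of them is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The three npm-installed agents from Steps 2 to 4.
check_tools claude codex gemini || echo "Re-run the npm install for anything marked missing."
```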

Hermes Agent bonus · 4th agent

A coding agent from Nous Research — "the agent that grows with you." Same idea as Claude Code, Codex, and Gemini, with two twists: it's self-improving (it learns your preferences over time), and it's model-agnostic — you point it at whichever brain you want (Nous, OpenAI, Anthropic, OpenRouter, and more).

Why you'd add it: it's a fourth tool in the rotation and the most model-flexible of the bunch — handy for staying un-locked-in (see the future-proofing principles).

Install it inside the Linux container you made in Step 1, the same as the other agents on this page; it runs natively there.

Inside your container
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
Alternative: with pip
pip install hermes-agent
hermes postinstall
Set it up & start it
hermes setup        # configure it the first time
hermes --tui        # start it (modern terminal UI, recommended)

During setup it asks how you want to sign in — an API key, or an OAuth login via hermes setup --portal. It works with the Nous Portal, OpenAI, Anthropic, OpenRouter, and others, so you can reuse an account you already have.

✓ Working when: hermes --version prints a version, and hermes --tui opens its interface.
📘 Official docs: hermes-agent.nousresearch.com/docs. Since you run it inside a Linux container, the Linux guide applies here too.
5

Tailscale

A private network (a "mesh VPN") that connects all your devices to each other securely — your Proxmox box, your PCs, your Macs, your phone, your NAS — as if they were in the same room, from anywhere in the world.

Why you want it: this is the glue. Once your AI box is on Tailscale, every device on your network can reach it by name — to share one Ollama for the whole house, or to use Claude Code on it from your phone.

Easiest: install on the Proxmox host

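You can run Tailscale on the Proxmox host itself. Note that this puts the host on your Tailscale network, not the container, so services inside the container are not reachable by name unless you also use the container method further down: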

curl -fsSL https://tailscale.com/install.sh | sh
tailscale up
To run it inside the container (note the quirk)

An LXC container needs access to the host's TUN device first. From the Proxmox host, edit /etc/pve/lxc/CTID.conf and add these two lines, then restart the container:

lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file
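
If you would rather not edit the file by hand, the two lines can be appended from the host Shell. This is a sketch: add_tun_lines is a hypothetical helper, and it assumes the container has no snapshots. Snapshots are stored as extra sections at the end of the same file, so anything appended would land in the last snapshot instead of the live config; in that case add the lines by hand above the first section.

```shell
# Hypothetical helper: append the two TUN lines to a container config
# file, skipping any line that is already present.
add_tun_lines() {
  conf="$1"
  for line in \
    'lxc.cgroup2.devices.allow: c 10:200 rwm' \
    'lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file'
  do
    grep -qxF -- "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
  done
}

# On the Proxmox host (CTID 100 is only an example; not run here):
#   add_tun_lines /etc/pve/lxc/100.conf
#   pct reboot 100
```

Running it twice is harmless, because each line is added only if an identical line is not already in the file.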

Then, inside the container, install and bring it up:

curl -fsSL https://tailscale.com/install.sh | sh
tailscale up

This prints a URL. Open it in your browser and sign in with the same Tailscale account you use on your other devices, so they all join one private network.

✓ Working when: run tailscale status — it lists this machine and any others already signed in, each with a 100.x.y.z address.
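
The 100.x.y.z addresses come from the 100.64.0.0/10 range, so the second number is always between 64 and 127. The sketch below checks this machine's address against that range; is_tailscale_ip is a hypothetical helper name, and tailscale ip -4 prints the machine's own Tailscale IPv4 address.

```shell
# Succeeds if the IPv4 address is inside 100.64.0.0/10, the range
# Tailscale assigns from (first octet 100, second octet 64 to 127).
is_tailscale_ip() {
  first="$(echo "$1" | cut -d. -f1)"
  second="$(echo "$1" | cut -d. -f2)"
  [ "$first" = "100" ] && [ "$second" -ge 64 ] && [ "$second" -le 127 ]
}

# Only query tailscale if it is installed.
if command -v tailscale >/dev/null 2>&1; then
  ip="$(tailscale ip -4 2>/dev/null || true)"
  if is_tailscale_ip "$ip"; then
    echo "On the tailnet as $ip"
  else
    echo "No Tailscale address yet"
  fi
fi
```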
🌐 Why it matters: now this AI box is reachable by name from every device on your Tailscale network.
6

Ollama

Runs AI models on your own box instead of the cloud — free, private, and works offline. Good for chat, summarizing, and coding help without a subscription.

Why you want it: no per-use cost, nothing leaves your machine, and combined with Tailscale (Step 5) this one always-on box can serve models to your phone and every other device.

Install & run your first model

Inside the container, install Ollama and pull a small, capable model:

curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3.2

The first run downloads the model (a couple of GB), then you can chat right in the terminal. Type /bye to leave.

✓ Working when: after the download, it gives you a >>> prompt and answers a question you type.
About GPUs: by default the container uses CPU only (slow, small models). For real speed you need an NVIDIA GPU. The clean way is to pass the GPU through to a VM (Datacenter → node → the VM's Hardware → Add → PCI Device) and install NVIDIA drivers there; GPU passthrough into an LXC is possible but advanced. If you have no GPU, stick to small models or use a GPU cloud.
🌐 Share it across the network: create a systemd override for the ollama service and add Environment="OLLAMA_HOST=0.0.0.0:11434" under [Service], then restart Ollama. It then listens on all interfaces, so every Tailscale device can share this one model server.
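
One way to create that override is to write the drop-in file directly. This is a minimal sketch, assuming the installer registered the service as ollama.service; write_ollama_override is a hypothetical helper, and it takes the target directory as a parameter so you can try it somewhere harmless first.

```shell
# Hypothetical helper: write a systemd drop-in that makes Ollama listen
# on all interfaces. The directory is a parameter so the sketch can be
# tested outside /etc first.
write_ollama_override() {
  dir="$1"
  mkdir -p "$dir"
  cat > "$dir/override.conf" <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
EOF
}

# Inside the container, as root (not run here):
#   write_ollama_override /etc/systemd/system/ollama.service.d
#   systemctl daemon-reload
#   systemctl restart ollama
```

To confirm it worked, run curl http://ai-box:11434/api/tags from another Tailscale device; it should return a JSON list of the models you have pulled. The name ai-box assumes you kept that hostname and that name resolution is enabled on your Tailscale network; otherwise use the 100.x.y.z address. Listening on 0.0.0.0 also exposes the port on your local network, not only over Tailscale.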
You're done

What you have now

You have one always-on Linux container that runs three cloud AI agents (four if you added Hermes), sits on your private Tailscale network (reachable by name from every device), and serves local models with Ollama. The whole house now shares one AI box, built once on hardware you already own.