Jeeves — Building an AI Butler for My Homelab

How I wired Claude Sonnet, n8n, Discord, VAPI, ElevenLabs, Netdata, and a handful of WireGuard tunnels into a homelab AI that monitors, alerts, and literally calls my phone when things go wrong.


The Goal

Dashboards are passive. You have to go look at them. I wanted something that would come to me — answer questions in plain English, run commands on my servers, and call my phone when the firewall goes onto battery backup.

I named him Jeeves.

Phone   ────► VAPI webhook ────────┐
                                   ├──► Jeeves (Claude Sonnet)
Discord ────► Discord bridge bot ──┘              │
                                                  ▼
             get_ups_status     run_command       get_netdata
             query_prometheus   post_to_discord   get_amp_status
                                                  │
             ┌──────────────── WireGuard ─────────┘
             │                  │                 │
         fort/halt          containy           unraid
         OPNsense + UPS     Docker host        134 containers

The stack:

Component                Role
n8n (self-hosted VPS)    Workflow engine and AI agent host
Claude Sonnet 4.6        The brain
Discord                  Primary text interface
VAPI                     Inbound and outbound phone calls
ElevenLabs               Jeeves' voice (TTS)
Netdata                  Metrics on fort, unraid, hawk
apcupsd                  UPS monitoring
WireGuard                Connects VPS to homelab subnets

Phase 1: Wiring Up the Homelab

WireGuard — the ephemeral port problem

Before Jeeves could do anything, I needed stable WireGuard tunnels from the n8n VPS to the rest of the lab. Simple enough — except the tunnels kept dying after restarts.

The culprit: wg-quick was picking a random UDP source port on every restart. The firewall on the homelab end had an allow rule for the previous port. New port, no handshake.

The fix is one line in the WireGuard config:

[Interface]
PrivateKey = ...
Address = 10.2.2.3/32
ListenPort = 51821   # ← this. prevents random port on restart.

And a follow-up lesson: wg-quick is registered as a oneshot systemd service. Once it has run successfully, systemd keeps treating the unit as active, so systemctl start does nothing even when the tunnel itself is dead. Always use systemctl restart wg-quick@wg0.
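A quick way to catch an unpinned port before the next restart is to lint the config file. The sketch below is a hypothetical helper, not part of WireGuard or wg-quick; the name pinnedListenPort is my own, and it assumes the INI-style layout shown above and only looks inside the [Interface] section.

```javascript
// Hypothetical helper: returns the pinned ListenPort from a wg-quick config,
// or null when [Interface] leaves it unset (which means a random port per start).
function pinnedListenPort(configText) {
  let section = '';
  for (const rawLine of configText.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop trailing comments
    if (!line) continue;
    const header = line.match(/^\[(.+)\]$/);
    if (header) { section = header[1].toLowerCase(); continue; }
    if (section !== 'interface') continue;           // ignore [Peer] sections
    const kv = line.match(/^ListenPort\s*=\s*(\d+)$/i);
    if (kv) return Number(kv[1]);
  }
  return null;
}

const sampleConfig = [
  '[Interface]',
  'PrivateKey = ...',
  'Address = 10.2.2.3/32',
  'ListenPort = 51821   # pinned',
].join('\n');

console.log(pinnedListenPort(sampleConfig)); // 51821
```

A null result is the signal to add a ListenPort line before trusting the tunnel to survive a restart.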

containy's invisible wall

While building Jeeves, I discovered that containy — my main Docker host — had a strange problem. SSH into it worked fine. But the host couldn't reach its own gateway, couldn't do DNS, couldn't ping 8.8.8.8.

The diagnosis process eventually led here:

# ARP resolved the gateway just fine
arp -n
# 10.0.11.100  ether  58:9c:fc:00:14:62  C  enp0s31f6

# But pinging it from containy? 100% packet loss.
# DNS stuck in a degraded loop:
journalctl -u systemd-resolved -n 20
# Mar 24 13:15:44: Using degraded feature set UDP instead of TCP
# Mar 24 13:15:49: Using degraded feature set TCP instead of UDP
# (forever)

Root cause: a missing firewall allow rule in OPNsense for 10.0.11.11 on the LAN interface. Inbound SSH worked because it was initiated from outside — the stateful firewall tracked the established connection. But new outbound flows from containy had no matching allow rule and were silently dropped.

Rule of thumb: If SSH into a host works but the host can't ping its own gateway, the answer is almost always a stateful firewall with a missing allow rule for that source IP.

Phase 2: Building the Core Agent

Jeeves lives in n8n as an AI Agent workflow. The trigger is a webhook at /webhook/jeeves that accepts {"message": "...", "source": "..."}.
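Since several clients post to the same webhook, it can help to normalize the body before it reaches the agent. This is a minimal sketch rather than the actual workflow code: normalizeRequest is a hypothetical name, the "message" and "source" fields come from the payload above, and the "unknown" fallback and 4000-character cap are assumptions.

```javascript
// Hypothetical normalizer for the /webhook/jeeves request body.
function normalizeRequest(body) {
  if (!body || typeof body.message !== 'string' || !body.message.trim()) {
    throw new Error('payload must include a non-empty "message" string');
  }
  return {
    message: body.message.trim().slice(0, 4000),                   // assumed length cap
    source: typeof body.source === 'string' ? body.source : 'unknown', // assumed fallback
  };
}

console.log(normalizeRequest({ message: '  UPS status?  ', source: 'discord' }));
```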

I started with GPT-4o mini and quickly switched to Claude Sonnet 4.6. The difference in reasoning quality for homelab tasks — where context matters and the answers need to be accurate — was immediately obvious. System prompt:

You are Jeeves, a distinguished AI butler managing the Galaxy Lab homelab. You have access to tools to check system status, query metrics, run commands on hosts, and send alerts. Be concise, accurate, and appropriately formal.

The tool architecture

Each tool is a separate n8n sub-workflow called via Execute Workflow node. This keeps the main agent workflow clean and makes each tool independently testable.

Tool                   What it does
get_ups_status         TCP query to apcupsd:3551
get_netdata_metrics    HTTP to Netdata REST API
query_prometheus       PromQL via HTTP
run_command            SSH to whitelisted hosts
post_to_discord        Discord webhook
get_amp_status         AMP game server panel
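The run_command tool is the one that most needs a guard in front of it. Here is a sketch of what a whitelist check could look like. The host names are from this post, but the allowed commands, the checkCommand name, and the metacharacter filter are all assumptions for illustration, not the real configuration.

```javascript
// Hypothetical allow-list: host names are from this post, the commands are examples.
const ALLOWED = new Map([
  ['fort', ['uptime', 'apcaccess status']],
  ['containy', ['uptime', 'docker ps']],
  ['unraid', ['uptime', 'docker ps']],
]);

// Reject shell metacharacters so an allowed prefix cannot smuggle in a second command.
const UNSAFE = /[;&|`$<>\\\n]/;

function checkCommand(host, command) {
  const allowed = ALLOWED.get(host);
  if (!allowed) return { ok: false, reason: `host "${host}" is not whitelisted` };
  if (UNSAFE.test(command)) return { ok: false, reason: 'shell metacharacters are not allowed' };
  const cmd = command.trim();
  // Accept an exact match, or the allowed command followed by arguments.
  const match = allowed.some(prefix => cmd === prefix || cmd.startsWith(prefix + ' '));
  return match ? { ok: true } : { ok: false, reason: `command not allowed on ${host}` };
}

console.log(checkCommand('containy', 'docker ps -a'));
console.log(checkCommand('containy', 'docker ps; reboot'));
```

Running the check before the SSH node means a confused or prompt-injected agent gets a refusal string back instead of a shell.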

UPS monitoring — talking to apcupsd

Fort runs apcupsd connected to an APC Back-UPS NS 600. The apcupsd daemon exposes a TCP network server on port 3551 using a simple length-prefixed protocol. You don't need SSH to query it, just a TCP socket:

// Send: 2-byte big-endian length prefix + "status"
// Receive: series of 2-byte-prefixed chunks, terminated by zero-length chunk

const net = require('net');

const raw = await new Promise((resolve, reject) => {
  const client = net.createConnection({ host: '10.0.16.100', port: 3551 }, () => {
    // Request frame: 2-byte big-endian length, then the command bytes
    const cmd = Buffer.from('status', 'ascii');
    const header = Buffer.alloc(2);
    header.writeUInt16BE(cmd.length, 0);
    client.write(Buffer.concat([header, cmd]));
  });

  // Critical: buffer all data, because a length prefix and its payload can arrive in separate 'data' events
  let rxBuf = Buffer.alloc(0);
  let chunks = '';
  client.on('data', data => {
    rxBuf = Buffer.concat([rxBuf, data]);
    while (rxBuf.length >= 2) {
      const len = rxBuf.readUInt16BE(0);
      if (len === 0) { client.destroy(); resolve(chunks); return; }  // zero-length chunk ends the reply
      if (rxBuf.length < 2 + len) break;                             // incomplete chunk: wait for more data
      chunks += rxBuf.subarray(2, 2 + len).toString('utf8');
      rxBuf = rxBuf.subarray(2 + len);
    }
  });
  client.on('error', reject);
  // Reject if the server closes before sending the terminator (a no-op once resolved)
  client.on('close', () => reject(new Error('connection closed before end of reply')));
  client.setTimeout(5000, () => { client.destroy(); reject(new Error('timeout')); });
});

That streaming buffer was not in the original version. The original worked fine locally and then produced corrupt data in production. The reason: TCP is a stream protocol — the sender can split data arbitrarily across segments. apcupsd sends the 2-byte length prefix in one TCP segment and the actual data in the next. The fix is to accumulate everything and only parse complete chunks.
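To convince yourself the accumulate-then-parse approach holds up, you can pull the framing logic into a small decoder and feed it a reply one byte at a time, which is the worst split TCP could produce. This is a sketch: createFrameDecoder and frame are hypothetical helper names, and the two status lines are sample data in apcupsd's "KEY : value" style rather than a captured reply.

```javascript
// Incremental decoder for 2-byte big-endian length-prefixed frames.
// feed() accepts any slicing of the byte stream and returns true once
// the zero-length terminator frame has been seen.
function createFrameDecoder() {
  let pending = Buffer.alloc(0);
  let text = '';
  let done = false;
  return {
    feed(data) {
      pending = Buffer.concat([pending, data]);
      while (!done && pending.length >= 2) {
        const len = pending.readUInt16BE(0);
        if (len === 0) { done = true; break; }   // terminator frame
        if (pending.length < 2 + len) break;     // frame incomplete, wait for more
        text += pending.subarray(2, 2 + len).toString('utf8');
        pending = pending.subarray(2 + len);
      }
      return done;
    },
    get text() { return text; },
  };
}

// Test helper: wrap a string in a length-prefixed frame.
function frame(s) {
  const body = Buffer.from(s, 'utf8');
  const header = Buffer.alloc(2);
  header.writeUInt16BE(body.length, 0);
  return Buffer.concat([header, body]);
}

const reply = Buffer.concat([
  frame('STATUS   : ONLINE\n'),
  frame('BCHARGE  : 100.0 Percent\n'),
  Buffer.alloc(2), // zero-length terminator
]);

const decoder = createFrameDecoder();
let finished = false;
for (const byte of reply) finished = decoder.feed(Buffer.from([byte]));

console.log(finished);     // true
console.log(decoder.text); // the two status lines, reassembled
```

Because the decoder never assumes a frame arrives whole, the same code handles one big read, two reads split at the prefix, or fifty single-byte reads.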

Discord integration — the inbound problem

Sending alerts to Discord is easy: just a webhook POST. Getting Discord messages into n8n is harder. Discord webhooks are one-way: they post messages into a channel but can't deliver channel messages to another service. n8n's current version has no built-in Discord Trigger node. Solution: a small Python bot as a Docker container.

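import os

import aiohttp
import discord

# Assumption: channel ID and bot token are read from environment variables with these (hypothetical) names
JEEVES_CHANNEL = int(os.environ["JEEVES_CHANNEL_ID"])

intents = discord.Intents.default()
intents.message_content = True  # must also be enabled in the Discord Developer Portal
client = discord.Client(intents=intents)

@client.event
async def on_message(message):
    # Ignore other bots (including Jeeves himself) and other channels
    if message.author.bot:
        return
    if message.channel.id != JEEVES_CHANNEL:
        return

    async with message.channel.typing():
        async with aiohttp.ClientSession() as session:
            payload = {"message": message.content, "source": "discord",
                       "author": message.author.display_name}
            async with session.post("http://n8n:5678/webhook/jeeves",
                                    json=payload, timeout=aiohttp.ClientTimeout(total=60)) as resp:
                data = await resp.json()
                response_text = data.get("output", str(data))

    await message.reply(response_text[:2000])  # Discord rejects messages over 2000 characters

client.run(os.environ["DISCORD_TOKEN"])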

Deployed as a Docker container on the same network as n8n, so it calls n8n at http://n8n:5678 without going through the public internet. Don't forget to enable Message Content Intent in the Discord Developer Portal — the bot won't see message content without it.


Phase 3: Voice — VAPI + ElevenLabs

Text is great, but phone calls interrupt. When something critical happens, I want Jeeves to call me — not send a Discord ping I might not see.

The architecture:

  • VAPI handles call orchestration (inbound and outbound)
  • ElevenLabs provides the voice
  • n8n webhook handles tool calls from VAPI
  • Same underlying tools as Discord — Jeeves can do the same things either way

How VAPI tool calls work

When Jeeves (running inside a VAPI call) needs to use a tool, VAPI POSTs to your server URL:

POST https://n8n.yourdomain.com/webhook/vapi-tool
{
  "message": {
    "type": "tool-calls",
    "toolCallList": [{
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_ups_status",
        "arguments": "{}"
      }
    }]
  }
}

Your server has roughly 20 seconds to respond with:

{
  "results": [{
    "toolCallId": "call_abc123",
    "result": "FortUPS: ONLINE | Battery 100% | Load 4% | Runtime 129 min"
  }]
}

The n8n VAPI tool handler workflow does exactly this: extract the tool call from the payload, route to the right sub-workflow, format the result, respond.
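Outside of n8n, the same extract, route, format, respond loop can be sketched as a single function. This is an illustration under assumptions, not the workflow itself: handleVapiToolCalls and the tools registry are hypothetical names, each registry entry stands in for a sub-workflow, and the field names follow the request and response shapes shown above.

```javascript
// Hypothetical tool registry; in n8n each entry would invoke a sub-workflow.
const tools = new Map([
  ['get_ups_status', async () => 'FortUPS: ONLINE | Battery 100% | Load 4% | Runtime 129 min'],
]);

async function handleVapiToolCalls(body, registry = tools) {
  const calls = body?.message?.toolCallList ?? [];
  const results = [];
  for (const call of calls) {
    const name = call.function?.name;
    let result;
    try {
      // Arguments arrive as a JSON string in the payload above; accept objects too.
      let args = call.function?.arguments ?? {};
      if (typeof args === 'string') args = args.trim() ? JSON.parse(args) : {};
      const tool = registry.get(name);
      result = tool ? String(await tool(args)) : `Unknown tool: ${name}`;
    } catch (err) {
      // Report the failure as the tool result so the caller still gets a response
      result = `Tool ${name} failed: ${err.message}`;
    }
    results.push({ toolCallId: call.id, result });
  }
  return { results };
}

const examplePayload = {
  message: {
    type: 'tool-calls',
    toolCallList: [{
      id: 'call_abc123',
      type: 'function',
      function: { name: 'get_ups_status', arguments: '{}' },
    }],
  },
};

handleVapiToolCalls(examplePayload).then(out => console.log(JSON.stringify(out)));
```

Returning a string result even for unknown tools or thrown errors keeps the call alive: Jeeves can say what went wrong instead of going silent until the timeout.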

Outbound calls for critical alerts

The proactive monitor (runs every 5 minutes) now has a branch for critical alerts. If the UPS goes on battery, or a host reboots unexpectedly, it doesn't just ping Discord — it calls my phone:

// In the proactive monitor workflow
const isCritical = alerts.some(a =>
  a.includes('ON BATTERY') || a.includes('LOW BATTERY') || a.includes('rebooted recently'));

if (isCritical) {
  // Trigger outbound VAPI call
  const res = await fetch('https://api.vapi.ai/call/phone', {
    method: 'POST',
    headers: { Authorization: `Bearer ${VAPI_KEY}`, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      assistantId: ASSISTANT_ID,
      phoneNumberId: PHONE_NUMBER_ID,
      customer: { number: MY_PHONE },
      assistantOverrides: {
        firstMessage: `Sir, this is Jeeves. ${voiceSummary}. Say "status" for a full rundown.`
      }
    })
  });
  // Fail the node loudly if the call request was rejected, rather than dropping the alert
  if (!res.ok) throw new Error(`VAPI call request failed: HTTP ${res.status}`);
}

When the call connects, Jeeves answers in character and has full access to all his tools. I can ask for the full status, tell him to bounce a WireGuard tunnel, or check which containers are down — all by talking.
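For context, the alerts array that the monitor snippet checks could be built from the apcupsd reply along these lines. This is a sketch with assumptions: parseApcStatus and upsAlerts are hypothetical names, the sample text is illustrative, and it assumes the STATUS field carries apcupsd's ONBATT and LOWBATT flags. The alert wording is chosen to match the 'ON BATTERY' and 'LOW BATTERY' strings the monitor looks for.

```javascript
// Parse "KEY : value" lines from apcupsd's status reply into an object.
function parseApcStatus(raw) {
  const fields = {};
  for (const line of raw.split('\n')) {
    const idx = line.indexOf(':');   // split on the first colon only
    if (idx === -1) continue;
    fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

// Hypothetical alert builder for the UPS part of the proactive monitor.
function upsAlerts(fields) {
  const alerts = [];
  const status = fields.STATUS || '';
  const name = fields.UPSNAME || 'UPS';
  if (status.includes('ONBATT')) {
    alerts.push(`${name} ON BATTERY (${fields.TIMELEFT || 'runtime unknown'})`);
  }
  if (status.includes('LOWBATT')) {
    alerts.push(`${name} LOW BATTERY (${fields.BCHARGE || 'charge unknown'})`);
  }
  return alerts;
}

// Illustrative sample, not a captured reply.
const sampleStatus = 'UPSNAME  : FortUPS\nSTATUS   : ONBATT\nTIMELEFT : 42.0 Minutes\n';
console.log(upsAlerts(parseApcStatus(sampleStatus)));
// [ 'FortUPS ON BATTERY (42.0 Minutes)' ]
```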


n8n Quirks Worth Knowing

If you're building something similar in n8n, here are the non-obvious things that cost me the most time:

1. fetch only works in an AI agent tool context. When a sub-workflow is called via executeWorkflow, fetch() and $helpers.httpRequest() are not defined. Use require('http') or require('https') instead, which in turn requires NODE_FUNCTION_ALLOW_BUILTIN=net,http,https,url,crypto in your n8n container environment.

2. workflowId must be a plain string. The n8n Execute Workflow node stores IDs internally as {__rl: true, value: "id", mode: "id"}. If you create workflows via the API and use that object format for the workflowId parameter, you get "Workflow does not exist" at runtime. Use a plain string.

3. Multi-output Code nodes break in executeWorkflow. Returning null or [] for unused outputs causes "Code doesn't return items properly" when called via executeWorkflow. Use IF chain routing instead.

4. SSH node authentication defaults to password. If you attach an SSH private key credential but don't explicitly set authentication: "privateKey" in the node parameters, n8n will look for a password credential and fail with a confusing error.

5. active is read-only on workflow creation. POST /api/v1/workflows rejects payloads that include an active field. Strip it before sending.
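Quirks 2 and 5 are easy to handle in one pre-flight step if you create workflows through the API. The function below is a sketch, assuming the workflow JSON has a top-level nodes array and that the ID lives at parameters.workflowId as described above; prepareWorkflowForCreate is a hypothetical name, not an n8n API.

```javascript
// Hypothetical cleanup for a workflow JSON before POST /api/v1/workflows.
function prepareWorkflowForCreate(workflow) {
  const { active, ...rest } = workflow; // quirk 5: "active" is read-only on create
  const nodes = (rest.nodes || []).map(node => {
    const id = node.parameters?.workflowId;
    // quirk 2: flatten {__rl, value, mode} objects to a plain string ID
    if (id && typeof id === 'object' && '__rl' in id) {
      return { ...node, parameters: { ...node.parameters, workflowId: String(id.value) } };
    }
    return node;
  });
  return { ...rest, nodes }; // the input object is left unmodified
}

const exported = {
  name: 'Jeeves',
  active: true,
  nodes: [
    { name: 'UPS tool', parameters: { workflowId: { __rl: true, value: 'abc123', mode: 'id' } } },
  ],
};

console.log(JSON.stringify(prepareWorkflowForCreate(exported)));
```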


What Jeeves Can Do Now

Via Discord or phone call, Jeeves can:

  • Report UPS battery %, load, and runtime estimate
  • Report CPU, RAM, disk, load average, and uptime for any host
  • List Docker containers and their states
  • Run PromQL queries against Prometheus
  • SSH into whitelisted hosts and run commands
  • Report game server status via AMP
  • Post alerts to Discord

Proactively, every 5 minutes:

  • Checks all thresholds (CPU, RAM, disk, uptime, containers, UPS)
  • Posts to Discord when anything is wrong
  • Calls my phone when something is critically wrong

Every Monday morning, a digest drops in Discord covering the week's health across all hosts.


What's Next

The current voice setup is inbound/outbound calling. Future ideas:

  • WireGuard tunnel health added to proactive monitor phone alerts
  • More hosts in run_command — currently limited to the servers I trust with SSH access
  • Jeeves initiating remediation — not just reporting problems but bouncing services, restarting tunnels
  • Custom ElevenLabs voice — the current voice is close but not perfect for the Jeeves character

Full setup guide and code at github.com/iamgadgetman/jeeves.

If you build something similar, I'd genuinely like to hear about it — this kind of homelab AI stack is still pretty niche and the community knowledge base is thin. The n8n quirks alone would have been a lot easier to navigate with prior art to reference.
