Security · 8 min read

How we harden every AI agent deployment

An agent box is a privileged target: it talks to your messaging platforms, holds API keys, and can run commands. Here's what hardened means for us.


An agent box isn't a typical web app server. It holds API keys, OAuth tokens, conversation history, sometimes a memory store of your whole business context — and it can run commands. If someone gets a shell on it, the impact is larger than "the website goes down." Hardening it properly isn't optional.

This is the layered baseline we apply to every deployment. None of it is exotic. The point is that all of it is on by default — not waiting to be configured after a breach.

The threat model in one paragraph

The agent has internet access (to call APIs and the model provider), it accepts inbound messages from at least one platform (Telegram, Slack, Discord, etc.), it runs code locally, and it stores long-lived secrets. The realistic threats are: someone brute-forcing SSH, someone exploiting an unpatched package, the agent itself going off the rails on a destructive action, a leaked secret being abused, and accidental footguns from the operator. Each layer below addresses at least one of those.

Network: closed unless explicitly open

  • Hetzner cloud firewall at the perimeter, with a policy that defaults to deny. The only inbound ports are the ones we need; everything else doesn't even reach the kernel.
  • ufw on the host as a second layer, also default-deny. Belt and suspenders against config drift.
  • Tailscale mesh for admin access. The operator connects to the box over the tailnet, not over the public internet, and the SSH port is meaningfully reachable only from authorised devices.
  • Public SSH stays open as a break-glass path (Tailscale can fail; control planes have outages), but it's protected as described below.
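
As a rough illustration of the default-deny posture on the host layer, the sketch below builds the list of ufw commands a provisioning step might run. The function name and the example port list are assumptions made for illustration, not our actual provisioning code; the ufw subcommands themselves are standard.

```python
from typing import Iterable, List, Tuple


def ufw_default_deny_commands(allowed: Iterable[Tuple[int, str]]) -> List[List[str]]:
    """Return ufw invocations: deny all inbound first, then open listed ports."""
    commands = [
        ["ufw", "default", "deny", "incoming"],
        ["ufw", "default", "allow", "outgoing"],
    ]
    # De-duplicate and sort so repeated runs produce the same command list.
    for port, proto in sorted(set(allowed)):
        if not 1 <= port <= 65535:
            raise ValueError(f"invalid port: {port}")
        if proto not in ("tcp", "udp"):
            raise ValueError(f"invalid protocol: {proto}")
        commands.append(["ufw", "allow", f"{port}/{proto}"])
    # --force skips the interactive confirmation prompt.
    commands.append(["ufw", "--force", "enable"])
    return commands


if __name__ == "__main__":
    # Hypothetical port list: only break-glass SSH is opened.
    for cmd in ufw_default_deny_commands([(22, "tcp")]):
        print(" ".join(cmd))
```

The ordering is the point: the deny policy is set before any port is opened, so a run that fails halfway leaves the box more closed, not more open.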

SSH: keys only, banned on abuse, drift-resistant

  • Password authentication is off. Root login is off. Only authorised public keys can log in.
  • fail2ban watches sshd and bans IPs that probe repeatedly. Most opportunistic scans give up quickly.
  • Hardened sshd config is applied via reboot, not a plain reload. Ubuntu 24.04 socket-activates ssh, and `systemctl reload sshd` doesn't actually pick up a drop-in change until the box reboots. We detect when the drop-in is newer than the kernel boot time and reboot at the end of the provisioning run, so a fresh box converges fully hardened on its first cycle.
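
The "drop-in newer than boot" check described above can be sketched as follows. This is a minimal illustration, assuming a Linux host where `/proc/uptime` is readable; the function names and the drop-in path passed by the caller are hypothetical, not the names used in our provisioning code.

```python
import os
import time
from typing import Optional


def boot_time_epoch(uptime_path: str = "/proc/uptime", now: Optional[float] = None) -> float:
    """Estimate boot time as 'now minus uptime' (Linux only)."""
    now = time.time() if now is None else now
    with open(uptime_path, "r", encoding="ascii") as fh:
        # The first field of /proc/uptime is seconds since boot.
        uptime_seconds = float(fh.read().split()[0])
    return now - uptime_seconds


def reboot_needed(dropin_mtime: float, boot_time: float) -> bool:
    """True when the drop-in was written after the machine last booted."""
    return dropin_mtime > boot_time


def check_dropin(dropin_path: str) -> bool:
    """Return True if the given drop-in exists and postdates the last boot."""
    if not os.path.exists(dropin_path):
        return False
    return reboot_needed(os.path.getmtime(dropin_path), boot_time_epoch())
```

Keeping `reboot_needed` as a pure comparison makes the decision easy to test without touching a real box; only `check_dropin` reads the filesystem.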

Users: privilege tiers, not one big root

Three users, each with the minimum access they need:

  • admin — unrestricted sudo, used for break-glass and provisioning.
  • claude — sudo with a destructive-command blacklist (rm -rf, dd to disk, mkfs, etc.). Used when an agent or operator needs to run system commands but shouldn't be able to wipe the box.
  • agentuser — no sudo at all. The Hermes agent process runs here. If the agent is exploited, the attacker is in a non-privileged account.
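
To make the idea of a destructive-command blacklist concrete, here is a small model of the matching rule for the three examples named above. It is a sketch only: the function name is hypothetical, it does not show how the rule is enforced through sudo, and it would not catch a command wrapped in something like `sh -c '...'`.

```python
import os
import shlex
from typing import List


def is_destructive(command: str) -> bool:
    """Best-effort check for a few obviously destructive invocations."""
    try:
        argv: List[str] = shlex.split(command)
    except ValueError:
        return True  # unparseable input is treated as unsafe
    if not argv:
        return False
    prog = os.path.basename(argv[0])
    args = argv[1:]
    if prog == "rm":
        # Collect single-dash flags so "-rf", "-fr" and "-r -f" all match.
        short_flags = "".join(
            a[1:] for a in args if a.startswith("-") and not a.startswith("--")
        )
        recursive = any(c in short_flags for c in "rR") or "--recursive" in args
        force = "f" in short_flags or "--force" in args
        return recursive and force
    if prog == "dd":
        # Writing to a device node, e.g. of=/dev/sda.
        return any(a.startswith("of=/dev/") for a in args)
    if prog == "mkfs" or prog.startswith("mkfs."):
        return True
    return False
```

A list like this guards against mistakes rather than a determined attacker, which is why the agent process itself runs as a user with no sudo at all.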

OS: patched automatically, with a guard

  • unattended-upgrades applies security patches on a regular cadence, with automatic reboot scheduled outside business hours. Falling behind on patches is one of the most common ways servers get popped; automating it is non-negotiable.
  • Agent upgrades go through a snapshot first. A full Hetzner snapshot is taken before each version bump; if the new version fails acceptance checks, we roll back to the snapshot. The agent runtime is a moving target; we treat upgrades as a controlled, reversible operation, not a hope.
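
The snapshot-then-upgrade flow can be sketched as a small control function. The four callables are placeholders for whatever actually takes the snapshot, bumps the version, runs the acceptance checks, and restores; none of them are real Hetzner API calls, and the names are assumptions for illustration.

```python
from typing import Callable


def upgrade_with_rollback(
    take_snapshot: Callable[[], str],
    apply_upgrade: Callable[[], None],
    acceptance_check: Callable[[], bool],
    restore_snapshot: Callable[[str], None],
) -> bool:
    """Snapshot, upgrade, verify; restore the snapshot if anything fails.

    Returns True if the upgrade was kept, False if it was rolled back.
    """
    snapshot_id = take_snapshot()  # must succeed before anything is changed
    try:
        apply_upgrade()
        ok = acceptance_check()
    except Exception:
        ok = False  # a crashed upgrade or check counts as a failed upgrade
    if not ok:
        restore_snapshot(snapshot_id)
    return ok
```

If `take_snapshot` itself raises, the exception propagates and the upgrade never starts, which matches the intent that nothing changes without a restore point.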

Agent runtime: approval-gated, with checkpoints

Inside the agent itself we keep guardrails on by default:

  • Approval mode is manual for dangerous commands. The bot asks the operator (typically via Telegram) before executing anything that could break things. Operator says approve; it runs. Operator says deny; it doesn't.
  • Destructive slash commands require confirmation. A second guard against rapid-fire mistakes.
  • File-edit checkpoints are on. The agent records a before-state when it edits files; a single /rollback undoes its last set of edits. Cheap insurance against bad edits.
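
The file-edit checkpoint idea can be illustrated with a few lines of Python. This is a sketch of the general technique (save the before-state, restore it on request), not the agent's actual implementation; the class and method names are hypothetical.

```python
import os
from typing import Dict, Optional


class EditCheckpoint:
    """Remember file contents before edits so one call can restore them."""

    def __init__(self) -> None:
        # path -> original bytes, or None if the file did not exist yet
        self._before: Dict[str, Optional[bytes]] = {}

    def record(self, path: str) -> None:
        """Capture the before-state of `path` once, prior to the first edit."""
        if path in self._before:
            return  # keep the earliest state, not an intermediate one
        if os.path.exists(path):
            with open(path, "rb") as fh:
                self._before[path] = fh.read()
        else:
            self._before[path] = None

    def rollback(self) -> None:
        """Restore every recorded file, deleting files the edits created."""
        for path, original in self._before.items():
            if original is None:
                if os.path.exists(path):
                    os.remove(path)
            else:
                with open(path, "wb") as fh:
                    fh.write(original)
        self._before.clear()
```

An editing tool would call `record(path)` before each write and expose `rollback()` behind a single command, so one set of edits is undone together.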

Secrets: encrypted at rest, excluded from backups

  • Secrets live in an encrypted vault in the control repo, not in plaintext anywhere on the box.
  • The OAuth token and `.env` files are explicitly excluded from backups. If a backup ever leaks, no working creds leak with it. Recovery means restoring data from backup and re-pairing OAuth: annoying once a year, much better than the alternative.
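
One way to picture the backup exclusion is a filter that drops secret-bearing files before anything is archived. The patterns and file names below are hypothetical; the real names depend on the deployment and on the backup tool in use.

```python
import fnmatch
import posixpath
from typing import Iterable, List

# Hypothetical patterns; the real file names depend on the deployment.
EXCLUDE_PATTERNS = [".env", ".env.*", "*oauth*token*"]


def files_to_back_up(
    paths: Iterable[str], patterns: Iterable[str] = EXCLUDE_PATTERNS
) -> List[str]:
    """Return the paths whose base name matches none of the exclusion patterns."""
    patterns = list(patterns)
    kept = []
    for path in paths:
        name = posixpath.basename(path)
        # fnmatchcase is case-sensitive and does not treat leading dots specially.
        if any(fnmatch.fnmatchcase(name, pat) for pat in patterns):
            continue
        kept.append(path)
    return kept
```

Matching on the base name means a `.env` file is excluded wherever it sits in the tree, not only at one known path.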

Health: not silent until it breaks

  • A watchdog runs every couple of minutes, checking the gateway service, disk pressure, last-backup age, and Tailscale state. After three consecutive failures it restarts the gateway.
  • The watchdog feeds an uptime monitor outside the box, so an unreachable box gets flagged in seconds — not when a user notices it's ignoring them.
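
The "three consecutive fails" rule is a small piece of state, sketched below. The class name and the shape of the `checks` dictionary are assumptions, and the sketch treats any failing check as a failed round, which may be stricter than the real watchdog.

```python
from typing import Callable, Dict


class Watchdog:
    """Restart the gateway after N consecutive failed health rounds."""

    def __init__(self, restart: Callable[[], None], threshold: int = 3) -> None:
        self._restart = restart
        self._threshold = threshold
        self._consecutive_failures = 0

    def observe(self, checks: Dict[str, bool]) -> bool:
        """Record one round of checks; return True if a restart was triggered."""
        if all(checks.values()):
            self._consecutive_failures = 0  # one healthy round clears the streak
            return False
        self._consecutive_failures += 1
        if self._consecutive_failures >= self._threshold:
            self._restart()
            self._consecutive_failures = 0  # start counting afresh after a restart
            return True
        return False
```

Requiring consecutive failures keeps a single slow check from bouncing the gateway, while a genuinely stuck service is still restarted within a few rounds.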

What this isn't

It isn't a substitute for a real security program if you have regulatory obligations that require one. It isn't exotic — it's boring sysadmin hygiene applied uniformly and never forgotten. That's the value: every deployment gets the same posture, every time, with no "we'll harden it after launch" on the roadmap.

