
Future State Architecture

A comprehensive description of the target architecture for the homelab, AI orchestration, state management, and service model. Produced from a deep planning session on 2026-03-08.

Guiding Principles

  • Declarative over imperative. The system should be defined in files that describe the desired state, not scripts that perform steps.
  • Machines are dumb compute. No machine is special. Any machine should be replaceable by installing the OS and converging it from an external definition.
  • The control path cannot depend on what it controls. The systems used to manage infrastructure must not run on that same infrastructure.
  • One brain, many hands. A single conversational AI orchestrator dispatches work to local executors on each machine. The orchestrator is the interface; the executors are the tools.
  • Markdown is the universal format. All knowledge, memory, decisions, and documentation are stored as plain text files — human-readable, machine-readable, version-controllable, vendor-agnostic.
  • Clear separation between services and tools. If something has its own web UI, its own users, or runs 24/7 independently, it’s a containerized service. If it’s something the AI uses, queries, or runs, it lives in the AI workspace.

Infrastructure

Machine Model

Machines (currently two Dell OptiPlex 3020s, potentially more in the future) run a minimal OS with Docker and SSH. They receive their configuration from an external source and can be rebuilt from scratch without losing any state that matters. The target OS model leans toward either an immutable Linux distribution (e.g., Talos), where the OS itself is managed declaratively via API with no SSH/shell, or the current Debian setup with external provisioning (e.g., Ansible) that converges machines to a declared state. The immutable OS approach is the more architecturally pure option; Debian + Ansible is the pragmatic fallback. A self-hosted PaaS layer (e.g., Coolify) is also under consideration for the deployment and visibility layer — a centralized dashboard showing all machines and services, git-push deployments for rapid prototyping, and built-in Traefik management. It could coexist with or replace manual compose file management.

Provisioning

Machine provisioning definitions live outside the machines — in a git repository, a cloud service, or a CI system. Adding a new machine to the fleet is: install the OS, run one bootstrap command (or PXE boot), and the machine joins and receives its workload assignments. If a machine dies, the recovery process is: install the OS on new/repaired hardware, re-run provisioning, restore service data from backups. The provisioning layer is never lost because it doesn’t live on the machines.

Scaling

The architecture supports N machines without re-architecture. Services are assigned to machines based on hardware capabilities (storage, RAM) and role, but the assignment mechanism is centralized and declarative. Adding a third, fourth, or fifth machine should not require rethinking how services are distributed — only updating the assignment.
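A centralized, declarative assignment can be as small as one mapping that every machine consults when it converges. A minimal sketch, where the machine names, capability fields, and placements are illustrative rather than real inventory:

```python
# Centralized, declarative service-to-machine assignment.
# Machine names, capability fields, and placements are illustrative.
MACHINES = {
    "i3":      {"ram_gb": 16, "storage": "large"},   # media-heavy role
    "pentium": {"ram_gb": 8,  "storage": "small"},   # lightweight services
}

ASSIGNMENTS = {  # service -> machine; the only thing edited when scaling
    "jellyfin": "i3",
    "stalwart": "i3",
    "miniflux": "pentium",
}

def services_for(machine: str) -> list[str]:
    """Derive what a converging machine should run from the assignment."""
    return sorted(s for s, m in ASSIGNMENTS.items() if m == machine)

print(services_for("i3"))
```

Adding a fourth machine under this model means adding one entry to `MACHINES` and moving keys in `ASSIGNMENTS`; nothing else changes.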

Networking

Traefik remains the reverse proxy. Cloudflare tunnel remains the ingress path from the internet. These are foundational infrastructure components that live on the control plane (see below), not on workload machines.

Control Plane Separation

The Problem

Currently, the entire control path — Synapse (Matrix messaging), OpenClaw (AI gateway), Traefik (routing), Cloudflare tunnel (ingress), and backup orchestration — runs on i3. If i3 goes down, the ability to communicate with and manage the infrastructure is lost. The control plane controls itself, creating a circular dependency.

The Solution

The control path lives on a separate VPS (or equivalent external compute) that is not part of the homelab hardware. This VPS runs:
  • Synapse (Matrix homeserver) — the messaging backbone
  • OpenClaw gateway — the AI orchestrator
  • Cloudflare tunnel or direct ingress — public routing into the system
  • Kiro CLI — for managing the VPS itself
Homelab machines are pure workload. They run application services (Jellyfin, *arr stack, Stalwart, etc.) and have Kiro CLI installed as the local executor. The VPS-hosted OpenClaw dispatches commands to each machine’s Kiro CLI via SSH or the Matrix bridge.

Break-Glass Path

The VPS is the one component that cannot be fully self-managed without circular dependency. OpenClaw manages the VPS for routine operations (updates, config changes, restarts). Direct SSH into the VPS is the escape hatch for when OpenClaw itself is broken. This is an accepted tradeoff — the same pattern cloud providers use for out-of-band management.

AI Orchestration

Architecture

User → Matrix client → Synapse (VPS) → OpenClaw (VPS) → Kiro CLI (on target machine) → Docker/SSH/APIs
  • OpenClaw is the brain — the always-on conversational AI agent. It receives messages, reasons about them, maintains memory and context, and dispatches work.
  • Kiro CLI is the hands — installed on each machine (including the VPS). It has shell access, Docker access, file access, and the AGENTS.md context for every service. It executes what OpenClaw tells it to.
  • Matrix is the nervous system — the messaging protocol that connects the user to OpenClaw and potentially OpenClaw to other services.
OpenClaw and Kiro are not competing. They operate at different layers. OpenClaw is the conversational interface, orchestrator, and memory system. Kiro is the local executor with deep tool access. OpenClaw may invoke Kiro to perform infrastructure operations, and Kiro’s results flow back through OpenClaw to the user.
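The dispatch path can be sketched as follows. Since the exact OpenClaw↔Kiro interface is still an open question (see Open Questions), the `kiro run` invocation is a hypothetical placeholder; the sketch only constructs the SSH command it would run.

```python
import shlex
import subprocess

def dispatch(machine: str, task: str, dry_run: bool = True):
    """Hand a task to the Kiro CLI executor on a machine over SSH.

    'kiro run' is a hypothetical placeholder subcommand; the real
    CLI interface is an open design question.
    """
    argv = ["ssh", machine, "kiro run " + shlex.quote(task)]
    if dry_run:
        return argv  # let the caller (or a test) inspect the command
    return subprocess.run(argv, capture_output=True, text=True)

print(dispatch("i3", "restart the jellyfin container"))
```

With `dry_run=False` the result's stdout would flow back through OpenClaw to the user, per the architecture above.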

Messaging Layer

Matrix is the messaging protocol. It provides:
  • Self-hosted operation (on the VPS, not dependent on third-party services)
  • Open protocol with multiple client options
  • Room-based context separation (equivalent to Discord channels)
  • Native OpenClaw integration via the Matrix plugin
  • Bridge support for other platforms (Discord already bridged via mautrix-discord, Telegram/WhatsApp possible via mautrix bridges)
The Matrix client is an open question. Element works but has UX friction (slash-command prefixing). A custom lightweight Matrix client purpose-built for AI interaction is a viable project — it would treat messages to the bot as first-class input rather than prefixed commands, support context-separated rooms, and avoid the overhead of a general-purpose Matrix client.

Context Separation

Different concerns get different Matrix rooms (or equivalent channels). This prevents context pollution — research doesn’t bleed into infrastructure management, bookmarks don’t pollute daily tasks. Each room can potentially use different models or thinking levels for cost optimization (expensive models for deep reasoning, cheap models for routine checks).
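Per-room model routing can be as simple as a lookup table with a cheap default. A sketch, where the room aliases and model identifiers are illustrative placeholders rather than real OpenClaw configuration:

```python
# Map Matrix rooms to models by how much reasoning they need.
# Room aliases and model names are illustrative placeholders.
ROOM_MODELS = {
    "#infra:example.org":     "expensive-reasoning-model",
    "#research:example.org":  "expensive-reasoning-model",
    "#heartbeat:example.org": "cheap-fast-model",
    "#bookmarks:example.org": "cheap-fast-model",
}

DEFAULT_MODEL = "cheap-fast-model"  # routine checks default to cheap

def model_for(room: str) -> str:
    """Pick the model for a message based on which room it arrived in."""
    return ROOM_MODELS.get(room, DEFAULT_MODEL)

print(model_for("#infra:example.org"))
```

Defaulting unknown rooms to the cheap model keeps cost failure modes benign: a misrouted message gets a weaker answer, not a surprise bill.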

State Management

Taxonomy of State

All state in the system falls into one of these categories, each with a defined home:
| State Type | What It Is | Where It Lives | Format |
| --- | --- | --- | --- |
| Infrastructure definitions | Compose files, Traefik configs, provisioning playbooks, AGENTS.md per service | ~/apps git repo (on machines) + provisioning repo (external) | YAML, Markdown |
| Service data | Application databases, media libraries, email stores | Docker volumes on each machine | Service-specific |
| Secrets | API keys, passwords, tokens, certificates | Encrypted storage (SOPS in git, Vault, or Bitwarden Secrets), not plaintext .env files | Encrypted |
| Agent identity | Personality, behavior rules, communication style | OpenClaw workspace: SOUL.md, IDENTITY.md, USER.md, AGENTS.md | Markdown |
| Agent memory | Conversation history, learned preferences, distilled knowledge | OpenClaw workspace: memory/YYYY-MM-DD.md (daily logs), MEMORY.md (long-term), main.sqlite (vector search) | Markdown + SQLite |
| Personal knowledge base | Research, articles, video notes, links, ideas, reference material | OpenClaw workspace: knowledge/ directory (subdirectories by type) + SQLite with vector embeddings for semantic search | Markdown + SQLite |
| Decision log | Architectural decisions with context and rationale | OpenClaw workspace: decisions/YYYY-MM-DD-title.md (ADR format) | Markdown |
| Operational history | What ran, what failed, health over time | OpenClaw workspace: ops/snapshots/ (daily) + ops/cron-log.sqlite (automation history) | Markdown + SQLite |
| Task/project tracking | What’s in progress, planned, blocked | OpenClaw workspace: tasks/ | SQLite or Markdown |
| Automation state | Cron job definitions, heartbeat state, pipeline configs | OpenClaw workspace: cron system + HEARTBEAT.md | JSON + Markdown |
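The knowledge-base pattern (markdown files indexed in SQLite) can be sketched with Python's bundled SQLite. The target design uses vector embeddings for semantic search; plain FTS5 keyword search stands in here only because it ships with the stdlib, and the paths and note bodies are made up for illustration:

```python
import sqlite3

# Stand-in sketch for knowledge.sqlite: index markdown notes, search them.
# FTS5 keyword search is a stdlib-only placeholder for vector search.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE notes USING fts5(path, body)")
db.executemany("INSERT INTO notes VALUES (?, ?)", [
    ("knowledge/articles/restic.md", "restic makes encrypted deduplicated backups"),
    ("knowledge/ideas/crm.md", "a personal CRM as SQLite plus a skill"),
])

def search(query: str) -> list[str]:
    """Return note paths ranked by relevance to the query."""
    rows = db.execute(
        "SELECT path FROM notes WHERE notes MATCH ? ORDER BY rank", (query,)
    )
    return [path for (path,) in rows]

print(search("backups"))
```

Swapping FTS5 for an embeddings table (e.g., via a vector extension) changes the index, not the markdown-as-source-of-truth layout.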

OpenClaw Workspace Structure

The OpenClaw workspace is the single source of truth for everything the AI knows, remembers, and manages (excluding infrastructure definitions and service data, which have their own homes).
workspace/
├── SOUL.md                        — personality and behavior
├── IDENTITY.md                    — self-conception
├── USER.md                        — user profile and preferences
├── AGENTS.md                      — operating instructions
├── TOOLS.md                       — environment-specific notes
├── MEMORY.md                      — distilled long-term memory
├── HEARTBEAT.md                   — periodic check instructions
├── memory/
│   └── YYYY-MM-DD.md              — daily conversation/activity logs
├── knowledge/
│   ├── articles/                  — ingested web content
│   ├── videos/                    — video notes and summaries
│   ├── ideas/                     — captured ideas
│   └── knowledge.sqlite           — vector embeddings for semantic search
├── decisions/
│   └── YYYY-MM-DD-title.md        — architecture decision records
├── ops/
│   ├── snapshots/                 — daily operational state snapshots
│   └── cron-log.sqlite            — automation run history
├── tasks/
│   └── tasks.sqlite               — project and task tracking
└── projects/
    ├── crm/                       — CRM database + skills (example)
    ├── briefing/                  — daily briefing automation (example)
    └── .../                       — other AI-built tools and automations
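
The layout above can be stamped out by a short, idempotent script, which is useful when provisioning a fresh OpenClaw host. A minimal sketch, with directory and file names taken from the tree above:

```python
from pathlib import Path

# Top-level files and directories from the workspace tree above.
TOP_FILES = ["SOUL.md", "IDENTITY.md", "USER.md", "AGENTS.md",
             "TOOLS.md", "MEMORY.md", "HEARTBEAT.md"]
DIRS = ["memory", "knowledge/articles", "knowledge/videos", "knowledge/ideas",
        "decisions", "ops/snapshots", "tasks", "projects"]

def scaffold(root) -> Path:
    """Create the workspace skeleton; safe to re-run (idempotent)."""
    root = Path(root)
    for d in DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
    for f in TOP_FILES:
        (root / f).touch()
    return root

scaffold("/tmp/openclaw-workspace-demo")
```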

The Workspace/Container Line

The rule for what lives in the OpenClaw workspace vs. what gets its own container:
  • Container: has its own web UI, has its own users, needs to run 24/7 independently of the AI, is a third-party application. Examples: Jellyfin, Synapse, Sonarr, Stalwart, Miniflux, Uptime Kuma.
  • Workspace: is a tool the AI uses, a database the AI queries, automation the AI runs, or a quick app the AI built. Examples: CRM (SQLite + skill), knowledge base (markdown + embeddings), daily briefing (cron + prompt), bookmark manager (skill + SQLite), health tracker (markdown + analysis).
AI-built tools in the workspace are typically SQLite databases, Node.js scripts, bash scripts, markdown files, and single-file HTML apps — not Docker containers. OpenClaw executes them directly via its shell access and cron system.
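As a concrete illustration of the workspace side of that line, a bookmark manager can be nothing more than one SQLite table plus two functions exposed to the agent as a skill. The schema and function names here are made up for the sketch:

```python
import sqlite3

# In practice this might live at e.g. projects/bookmarks/bookmarks.sqlite;
# an in-memory database keeps the sketch self-contained.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE bookmarks (url TEXT PRIMARY KEY, title TEXT, tags TEXT)")

def save_bookmark(url: str, title: str, tags: str = "") -> None:
    """Upsert a bookmark; re-saving a URL updates its title and tags."""
    with db:
        db.execute("INSERT OR REPLACE INTO bookmarks VALUES (?, ?, ?)",
                   (url, title, tags))

def find_bookmarks(tag: str) -> list[tuple[str, str]]:
    """Return (url, title) pairs whose tag list contains the substring."""
    return db.execute("SELECT url, title FROM bookmarks WHERE tags LIKE ?",
                      (f"%{tag}%",)).fetchall()

save_bookmark("https://restic.net", "restic", tags="backup,tools")
print(find_bookmarks("backup"))
```

No container, no web UI, no separate users: exactly the workspace criteria above.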

Services

Current Services (Remain as Containers)

These are stable, long-running services that stay as Docker containers managed via compose files.

Workload machines (i3 / Pentium / future machines):
  • Media: Jellyfin, Audiobookshelf, Calibre Web Automated
  • Media management: Sonarr, Radarr, Bazarr, Prowlarr, Chaptarr, Seerr
  • Downloads: qBittorrent, Gluetun (VPN), FlareSolverr, qSticky
  • Reading: Miniflux, Nextflux, RSSHub, Feed Scraper
  • Productivity: Excalidraw (Draw), Chromium
  • Tracking: Yamtrack
  • Auth: Pocket-ID
  • Email: Stalwart
  • Monitoring: Uptime Kuma, Homepage
Control plane VPS:
  • Synapse + Element (Matrix)
  • OpenClaw gateway
  • Cloudflare tunnel (or direct ingress)
  • Traefik

Future AI-Built Tools (Live in Workspace)

These are automations and personal tools that OpenClaw builds and runs inside its workspace, not as separate containers:
  • Personal CRM (SQLite + natural language queries)
  • Knowledge base with semantic search (markdown + vector embeddings)
  • Daily briefing system (cron + data aggregation + messaging delivery)
  • Bookmark/link manager (replaces paid services like Raindrop)
  • Operational monitoring summaries (daily snapshots of infrastructure health)
  • Task tracking (local task database managed conversationally)
  • Any rapid prototypes, quick web apps, or experimental tools

Backups

Current System (Remains)

Nightly restic backups — encrypted, deduplicated, incremental — with cross-machine replication and offsite to Backblaze B2. Database dumps before backup runs. 7 daily, 4 weekly, 3 monthly retention.
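The 7/4/3 policy corresponds to restic's `forget --keep-daily 7 --keep-weekly 4 --keep-monthly 3`. As a rough illustration of what such a policy keeps, here is a simplified model (restic's actual bucketing rules differ in edge cases), assuming one snapshot per night:

```python
from datetime import date, timedelta

def kept_snapshots(snaps, daily=7, weekly=4, monthly=3):
    """Simplified model of restic-style retention: keep the newest
    snapshot in each of the N most recent day/ISO-week/month buckets.
    Illustrative only; restic's real rules differ in edge cases."""
    kept = set()
    seen_d, seen_w, seen_m = [], [], []
    for s in sorted(snaps, reverse=True):  # newest first
        iso = s.isocalendar()
        buckets = [(seen_d, s, daily),
                   (seen_w, (iso[0], iso[1]), weekly),   # ISO year + week
                   (seen_m, (s.year, s.month), monthly)]
        for seen, key, limit in buckets:
            if key not in seen and len(seen) < limit:
                seen.append(key)
                kept.add(s)
    return sorted(kept)

# 60 nightly snapshots ending on the planning date:
days = [date(2026, 3, 8) - timedelta(n) for n in range(60)]
print(len(kept_snapshots(days)))
```

A snapshot can satisfy several buckets at once (the newest one is simultaneously the daily, weekly, and monthly pick), which is why the kept count is well under 7 + 4 + 3.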

Additions for Future State

  • OpenClaw workspace must be included in backups. It contains the AI’s entire brain — memory, knowledge base, decisions, task state, automation history. This is the most important new backup target.
  • VPS control plane needs its own backup strategy. Synapse database, OpenClaw workspace, and configuration should back up to B2 (or equivalent offsite storage). The VPS itself is rebuildable from provisioning definitions, but the data on it is not.
  • Secrets migration — moving from plaintext .env files to encrypted storage means the backup strategy for secrets changes. Encrypted secrets in git are self-backing-up. Secrets in Vault or Bitwarden need their own backup consideration.
  • Automation run history (cron-log.sqlite) and knowledge base (knowledge.sqlite) are SQLite databases that should be included in the standard backup sweep. The existing restic backup of ~/apps would cover this if the workspace is mounted from within that tree.

Cost Model

  • VPS for control plane: ~$5-10/month for a small instance running Synapse, OpenClaw, Cloudflare tunnel, Traefik
  • LLM API tokens for OpenClaw: variable, depends on usage. Cost optimization via model routing — expensive models (Opus) for deep reasoning, cheap models for routine heartbeats, cron jobs, and simple queries
  • Kiro CLI: covered by work (Kiro Power plan), no additional cost
  • Backblaze B2: existing, minimal cost
  • Local model hardware (future): under consideration for reducing API token costs. Would run on homelab hardware or dedicated GPU machine.

Open Questions

These are acknowledged but not yet decided:
  • Exact provisioning tool: Ansible, Talos, NixOS, or something else for machine convergence
  • PaaS layer: whether Coolify (or similar) adds enough value for deployment visibility and rapid prototyping to justify the added platform complexity
  • Matrix client: Element, custom client, or alternative — depends on UX needs for AI interaction
  • Secrets management: SOPS, Vault, Bitwarden Secrets, or another solution
  • Local model hosting: hardware requirements, which models, how it integrates with OpenClaw’s model routing
  • OpenClaw↔Kiro interface: the exact mechanism by which OpenClaw dispatches work to Kiro CLI on each machine (SSH + subprocess, Matrix bridge, or another approach). The kiro-bridge design doc in kiro-bridge/AGENTS.md captures the exploration so far.
  • Knowledge base search tooling: QMD, SQLite with vector extensions, or another semantic search solution
  • Rapid prototyping workflow: whether AI-built quick apps are served directly from the workspace, deployed via a PaaS, or handled some other way