skills/host-backup-restore/skills/host-backup-restore/SKILL.md
Host-level backup and restore with profile system (presets + custom YAML profiles), model-aware agents (sonnet worker for mechanical tasks), post-discovery research, and skillwiki infrastructure capture. Uses rsync with partial-dir for resumable WAN transfers. Use when backing up or restoring Caddy reverse-proxy domains, databases (postgres, mysql, redis, mongodb, sqlite), systemd services, full SSH identity/config, Tailscale state/config, and Hermes agent state on remote Linux hosts.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add karlorz/agent-skills host-backup-restore
Orchestrates host infrastructure backup and restore into a single flow: Caddy reverse-proxy domains, databases, systemd services, SSH configs, Hermes agent snapshots, and apt package lists.
Supports interactive (AskUserQuestion) mode (default), non-interactive CLI mode, backup profiles, post-discovery research, and skillwiki capture.
# Interactive backup — runs discover.sh, presents profile selection, then AskUserQuestion
/host-backup-restore sg01
# With a specific profile
/host-backup-restore sg01 --profile quick
# Non-interactive: use a preset profile
bash scripts/host-backup-cli.sh --host sg01 --profile full
# Non-interactive: quick backup (hermes + databases + base + caddy)
bash scripts/host-backup-cli.sh --host sg01 --profile quick
# Save a custom profile for reuse
bash scripts/host-backup-cli.sh --host sg01 --groups "hermes,databases,caddy_domains" --save-profile daily
# List all available profiles
bash scripts/host-backup-cli.sh --list-profiles
# Backup with post-discovery research
bash scripts/host-backup-cli.sh --host sg01 --profile full --research
# Restore to a fresh host
bash scripts/host-restore-cli.sh --archive ./sg01-backup.tar.gz --target newhost --all
# Restore SSH + Tailscale identity after OS reinstall (explicit opt-in)
bash scripts/host-restore-cli.sh --archive ./sg02-backup.tar.gz --target sg02 --groups "ssh,tailscale" --restore-identity
By default, backup and restore operations use the non-root agent user for SSH. Root access is not required if the agent user has passwordless sudo.
Run once per target host to create the agent user with passwordless sudo:
bash scripts/hermes/setup-remote-user.sh <host>
This connects as root (one-time bootstrap), creates the agent user, grants passwordless sudo, deploys your SSH key, and optionally writes an SSH config alias.
The CLI scripts default to agent@<host> for SSH connections:
| Scenario | Result | Example |
|----------|--------|---------|
| Default (no flags) | agent@<host> | ssh agent@sg01 "cmd" |
| --user root | root@<host> | ssh root@sg01 "cat /etc/caddy/Caddyfile" |
| --user deploy | deploy@<host> | ssh deploy@sg01 "cmd" |
| <host>-agent alias | Uses SSH config | ssh sg01-agent "cmd" |
The agent user requires passwordless sudo for operations that write to system paths (e.g., /etc/caddy/, /etc/hosts, systemd services). The setup-remote-user.sh script configures this automatically.
# Backup as root (if needed for system-level operations)
bash scripts/host-backup-cli.sh --host sg01 --user root --profile full
# Backup as a specific non-root user
bash scripts/host-backup-cli.sh --host sg01 --user deploy --profile quick
# Restore to a target as non-root agent (default)
bash scripts/host-restore-cli.sh --archive ./backup.tar.gz --target newhost --all
# Restore as root
bash scripts/host-restore-cli.sh --archive ./backup.tar.gz --target newhost --user root --all
# Using SSH config alias (handles user/key/port in ~/.ssh/config)
bash scripts/host-backup-cli.sh --host sg01-agent --profile full
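The user-selection rules in the table above can be sketched as a small shell function. This is an illustration only, not the code in host-backup-cli.sh: the name resolve_ssh_target is hypothetical, and treating any host ending in -agent as an SSH config alias is an assumption.

```shell
# Hypothetical helper: build the SSH destination from a host and an optional --user value.
# Assumption: a host ending in "-agent" is an alias defined in ~/.ssh/config.
resolve_ssh_target() {
  host=$1
  user=${2:-}
  case "$host" in
    *-agent)
      # With no explicit user, let ~/.ssh/config supply user, key, and port.
      if [ -z "$user" ]; then
        printf '%s\n' "$host"
        return 0
      fi
      ;;
  esac
  # Default to the non-root agent user when no user was given.
  printf '%s@%s\n' "${user:-agent}" "$host"
}

resolve_ssh_target sg01          # agent@sg01
resolve_ssh_target sg01 root     # root@sg01
resolve_ssh_target sg01-agent    # sg01-agent
```

The result would be passed straight to ssh, for example ssh "$(resolve_ssh_target sg01)" "cmd".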
- After setting up the agent user, consider disabling root SSH login on the target host
- SSH key deployed: ~/.ssh/id_ed25519.pub or a specified key
host-backup-restore/
├── SKILL.md # This file — interactive flow + orchestration
├── .claude-plugin/
│ ├── plugin.json # Plugin manifest (v{{VERSION}})
│ └── agents/
│ ├── backup-worker.md # Sonnet-pinned worker for general tasks
│ └── hermes-backup-worker.md # Sonnet-pinned worker for Hermes ops
├── scripts/
│ ├── hermes/
│ │ ├── discover-hermes.sh # Hermes-specific SSH discovery
│ │ ├── remote-backup.sh # Remote Hermes backup orchestrator
│ │ ├── remote-restore.sh # Remote Hermes restore via import
│ │ ├── pre-inspect.sh # Restore target readiness check
│ │ ├── restore-validate.sh # Post-restore Hermes validation
│ │ ├── prune-backups.sh # Retention pruning
│ │ ├── setup-remote-user.sh # Non-root user bootstrap
│ │ ├── setup-remote-cron.sh # Automated backup cron
│ │ └── setup-nonroot-hermes.sh # Non-root Hermes installation
│ ├── discover.sh # SSH discovery: parse Caddyfile, detect services
│ ├── backup-host.sh # Mechanical backup script (reads manifest)
│ ├── host-backup-cli.sh # Non-interactive CLI backup (profile-aware)
│ ├── host-restore-cli.sh # Non-interactive CLI restore
│ ├── profiles.sh # Profile management (presets + YAML)
│ └── research-host.sh # Post-discovery research query generator
└── tests/
└── test-restore.sh # Per-component restore verification (27 assertions)
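- Discovery caches its manifest at /tmp/host-backup-{hostname}-manifest.json.
- Interactive mode presents choices via AskUserQuestion, then runs backup/restore based on selection.
- Non-interactive mode selects groups on the command line, for example with --profile.
Backup profiles define which groups to back up and which hermes-tier to use. There are three built-in presets, plus unlimited custom profiles.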
| Profile | Groups | Hermes Tier | Use Case |
|---------|--------|-------------|----------|
| full | all 9 groups | full | Complete infrastructure backup including SSH identity and Tailscale state (default) |
| quick | base, caddy_domains, hermes, databases | standard | Essential state — skips systemd units + apt |
| minimal | hermes | minimal | Hermes agent state only — fastest snapshot |
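The preset table can be read as a simple lookup from profile name to groups and Hermes tier. The sketch below mirrors that table and the nine group names used elsewhere in this document. The function names preset_groups and preset_hermes_tier are hypothetical, and the actual resolution logic lives in scripts/profiles.sh, which may differ.

```shell
# Illustrative only: map a preset name to its groups, following the preset table.
preset_groups() {
  case "$1" in
    full)    echo "base ssh tailscale caddy_domains hermes databases other_services apt wiki" ;;
    quick)   echo "base caddy_domains hermes databases" ;;
    minimal) echo "hermes" ;;
    *)       echo "unknown preset: $1" >&2; return 1 ;;
  esac
}

# Illustrative only: map a preset name to its Hermes tier.
preset_hermes_tier() {
  case "$1" in
    full)    echo "full" ;;
    quick)   echo "standard" ;;
    minimal) echo "minimal" ;;
    *)       echo "unknown preset: $1" >&2; return 1 ;;
  esac
}
```

Returning a non-zero status for an unknown name lets a caller fall back to a custom-profile lookup instead of silently backing up nothing.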
Create ~/.config/host-backup-restore/profiles.yaml:
profiles:
daily:
groups: [hermes, databases, base, caddy_domains]
hermes_tier: full
description: "Daily backup of essential services"
weekly-full:
groups: [base, ssh, tailscale, caddy_domains, hermes, databases, other_services, apt, wiki]
hermes_tier: full
description: "Weekly full infrastructure backup"
hermes-only:
groups: [hermes]
hermes_tier: minimal
description: "Quick Hermes snapshot before upgrades"
| Flag | Description |
|------|-------------|
| --profile NAME | Use a named profile (preset or custom) |
| --save-profile NAME | Save current --groups + --hermes-tier as a named profile |
| --list-profiles | List all available profiles and exit |
In interactive mode, after discovery, present profile selection before group selection:
{
"question": "Which backup profile for <host>?",
"header": "Profile",
"options": [
{"label": "full (Recommended)", "description": "All 9 groups — complete infrastructure backup with SSH identity and Tailscale state"},
{"label": "quick", "description": "Essential state: Hermes, databases, Caddy, base (skips systemd + apt)"},
{"label": "minimal", "description": "Hermes agent state only — fastest snapshot"},
{"label": "Custom", "description": "Select individual groups manually"}
]
}
If "Custom" is selected, fall back to the per-group AskUserQuestion flow (Step 4b).
The skill uses a sonnet-pinned worker agent for mechanical tasks, keeping the orchestrator (main session) for user interaction and decision-making.
Defined in agents/backup-worker.md. Handles:
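- SSH discovery (discover.sh)
- Mechanical backup (backup-host.sh)
- Restore (host-restore-cli.sh)
- Restore verification (test-restore.sh)
- Profile management (profiles.sh)
User Session (opus/inherit)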
├── Interactive decisions (AskUserQuestion)
├── Profile selection
├── Post-discovery research (deep-research skill)
├── Skillwiki capture
├── Spawns backup-worker (sonnet)
│ ├── discover.sh
│ ├── backup-host.sh
│ ├── host-restore-cli.sh
│ └── test-restore.sh
└── Spawns hermes-backup-worker (sonnet)
├── discover-hermes.sh
├── remote-backup.sh
├── remote-restore.sh
├── pre-inspect.sh
├── restore-validate.sh
└── prune-backups.sh
When to spawn backup-worker:
When to stay in orchestrator:
Entry point: /host-backup-restore [host] [mode] [options]
Arguments:
- host: SSH hostname (e.g. sg01, sg03, ptcloud). Required.
- mode: backup (default) or restore. Optional.
- --profile NAME: Use a named profile. Optional.
- --redetect: Re-run discovery instead of using cached manifest.
- --dest PATH: Backup destination directory.
- --dry-run: Preview what would be backed up without doing it.
- --research: Run post-discovery research. Optional.
Run discover.sh to detect all services on the target host:
# Resolve the skill root; fall back to a placeholder path if realpath fails
SCRIPT_DIR="$(p="$(realpath "$0" 2>/dev/null)" && dirname "$p" || echo /path/to/skill)"
bash "$SCRIPT_DIR/scripts/discover.sh" <host>
Discovery output is cached at /tmp/host-backup-{hostname}-manifest.json. Use --redetect to force re-run.
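The cache behaviour described above can be sketched as follows. This is a minimal illustration, not the logic in discover.sh: the function names are hypothetical, and MANIFEST_DIR is an assumed override (defaulting to /tmp) so the check can be exercised without touching real manifests.

```shell
# Hypothetical helper: path of the cached manifest for a host.
manifest_path() {
  printf '%s/host-backup-%s-manifest.json\n' "${MANIFEST_DIR:-/tmp}" "$1"
}

# Prints "discover" when discovery should run, "cached" when the manifest can be reused.
# The second argument stands in for the --redetect flag ("yes" forces a re-run).
discovery_action() {
  host=$1
  redetect=${2:-no}
  if [ "$redetect" = "yes" ] || [ ! -s "$(manifest_path "$host")" ]; then
    echo "discover"
  else
    echo "cached"
  fi
}
```

Using -s rather than -e treats an empty manifest file as missing, so a discovery run that was interrupted before writing anything does not count as a valid cache.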
Model note: Spawn backup-worker agent for discovery to use sonnet for the SSH-heavy work.
Read the manifest and present the detected services to the user with a table. Example for sg01:
Detected services on sg01:
- Caddy domains: mon.karldigi.dev, status.karldigi.dev, term.karldigi.dev, bot.karldigi.dev, star.karldigi.dev (5 total)
- Hermes: v0.13.0 at /root/.hermes
- Databases: sqlite files (/root/.hermes/state.db, etc.), [redis/postgres/mysql as detected]
- Systemd services: hermes-gateway, hermes-dashboard, caddy, filebrowser, obsidian, xvfb, [others]
- Apt sources: deb https://... (N sources)
Use AskUserQuestion with profile options:
{
"question": "Which backup profile for <host>?",
"header": "Profile",
"options": [
{"label": "full (Recommended)", "description": "All 9 groups — complete infrastructure backup with SSH identity and Tailscale state"},
{"label": "quick", "description": "Essential state: Hermes, databases, Caddy, base (skips systemd + apt)"},
{"label": "minimal", "description": "Hermes agent state only — fastest snapshot"},
{"label": "Custom", "description": "Select individual groups manually"}
]
}
If a --profile flag was passed, skip this step and use the specified profile.
Resolve the profile via profiles.sh and run backup with the resolved groups:
source "$SCRIPT_DIR/scripts/profiles.sh"
resolve_profile "<profile_name>"
bash "$SCRIPT_DIR/scripts/backup-host.sh" "$MANIFEST_FILE" $PROFILE_GROUPS
For each detected group, use AskUserQuestion (yes/no). Iterate through groups one at a time. Only ask about groups that have detected services:
{
"question": "Back up <group_name>? (<brief_description_of_what_this_covers>)",
"header": "Service group",
"options": [
{"label": "Yes", "description": "Include this group in the backup"},
{"label": "No", "description": "Skip this group"}
]
}
Group order and descriptions:
| Group | Description | Detected on sg01 example |
|-------|------------|--------------------------|
| base | Hostname, /etc/hosts, /etc/os-release, sshd_config quick reference | hostname, /etc/hosts |
| ssh | Full SSH host identity and account access (/etc/ssh, /root/.ssh, /home/*/.ssh) | host keys, authorized_keys |
| tailscale | Tailscale machine state/config and restore-reference metadata (/var/lib/tailscale, package source, status JSON, IPs, version) | tailscaled state, tailscale status |
| caddy_domains | Caddy config (/etc/caddy/Caddyfile), SSL certs, caddy validate | 5 domains: mon, status, term, bot, star |
| hermes | hermes backup (built-in zip — handles SQLite WAL mode) | v0.13.0 at /root/.hermes |
| databases | sqlite files, postgres/mysql/redis dumps, mongodb | state.db, [any others detected] |
| other_services | systemd unit files, service states | hermes-gateway, hermes-dashboard, caddy, filebrowser, obsidian, xvfb, [others] |
| apt | Package list (apt list --installed), apt sources | N sources |
| wiki | rclone S3 mount for wiki vault (~/wiki backed by cloud:cloud/wiki) | rclone.conf, ~/wiki mount |
After collecting all answers, run backup-host.sh with the selected groups:
bash "$SCRIPT_DIR/scripts/backup-host.sh" "$MANIFEST" <selected_group1> <selected_group2> ...
Offer to save the selection as a custom profile:
{
"question": "Save this selection as a custom profile for future use?",
"header": "Save profile",
"options": [
{"label": "Yes", "description": "Save as a named profile in ~/.config/host-backup-restore/profiles.yaml"},
{"label": "No", "description": "Continue without saving"}
]
}
Look in ~/Desktop/backups/<host>/ for existing archives.
{
"question": "Which backup archive do you want to restore from?",
"header": "Restore archive",
"options": [
{"label": "backup-20260510-143000.tar.gz (78M, 2026-05-10)", "description": "Full backup with 54 files"},
{"label": "backup-20260509-120000.tar.gz (45M, 2026-05-09)", "description": "Partial backup"},
{"label": "Custom path", "description": "Specify a different archive path"}
]
}
If ssh or tailscale identity groups were selected after an OS reinstall:
ssh and tailscale are reinstall-prep identity groups. They are restorable via CLI only with explicit --restore-identity because they replace host trust and tailnet identity:
bash scripts/host-restore-cli.sh \
--archive ~/Desktop/backups/sg02/sg02-backup-YYYYMMDD-HHMMSS.tar.gz \
--target sg02 \
--groups "ssh,tailscale" \
--restore-identity
Operational findings from the sg02 Debian 13 reinstall:
- Clear stale local host keys with ssh-keygen -R <alias>; ssh-keygen -R <ip>, then reconnect with ssh -o StrictHostKeyChecking=accept-new <alias>.
- A fresh install may lack ca-certificates. Tailscale apt install can fail with certificate verify failed until apt-get install -y ca-certificates && update-ca-certificates runs.
- A fresh install may also lack rsync and python3. Identity restore must not depend on them. Stream tarballs over SSH and use POSIX tools for validation.
- Install tailscale before restoring /var/lib/tailscale. Restore the saved apt source/keyring first, install tailscale, stop tailscaled, extract the saved state, then systemctl daemon-reload && systemctl enable --now tailscaled.
- After restoring /etc/ssh, run /usr/sbin/sshd -t; if it fails, roll back from the safety tarball before restarting SSH.
- Refresh local known_hosts after restoring old SSH host keys. The server key reverts to the backup identity, so local clients that accepted the post-reinstall key will see another host-key change.
The restore CLI creates remote safety tarballs under /root/host-restore-safety-*-ssh and /root/host-restore-safety-*-tailscale before overwriting identity material.
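The validate-then-roll-back rule for /etc/ssh can be sketched generically. The helper below is hypothetical and not taken from host-restore-cli.sh. It takes the validation and rollback commands as strings, so the exact safety tarball path (not spelled out in this document) is left to the caller.

```shell
# Hypothetical helper: run a validation command and, if it fails, run a rollback command.
# Returns 0 only when validation passes, so the caller can gate the SSH restart on it.
validate_or_rollback() {
  validate_cmd=$1
  rollback_cmd=$2
  if sh -c "$validate_cmd" >/dev/null 2>&1; then
    echo "valid"
    return 0
  fi
  echo "invalid, rolling back"
  sh -c "$rollback_cmd"
  return 1
}
```

On a restore target, the validation command would be /usr/sbin/sshd -t and the rollback command would re-extract the safety tarball. SSH is restarted only if the function returns 0.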
If caddy_domains was selected and Caddy is not on the target:
Check if Caddy exists on the target:
ssh <target> "which caddy 2>/dev/null || echo MISSING"
If MISSING, prompt the user:
{
"question": "Caddy is not installed on <target>. Should I install it before restoring Caddy config?",
"header": "Caddy install",
"options": [
{"label": "Yes (Recommended)", "description": "Install caddy via apt-get on the target, then restore config and restart the service"},
{"label": "No", "description": "Restore config files only — Caddy won't serve domains until manually installed"}
]
}
If yes, run: ssh <target> "sudo apt-get install -y caddy"
Check wiki S3 mount status on target:
ssh <target> "df -T ~/wiki 2>/dev/null | grep -q fuse.rclone && echo 'MOUNTED' || echo 'MISSING'"
If MISSING, check FUSE availability:
ssh <target> "test -c /dev/fuse && echo 'FUSE_OK' || echo 'NO_FUSE'"
If FUSE_OK, prompt:
{
"question": "Wiki S3 mount is not active on <target>. Should I set it up?",
"header": "Wiki mount",
"options": [
{"label": "Yes (Recommended)", "description": "Restore rclone.conf from backup and mount wiki at ~/wiki"},
{"label": "No", "description": "Skip — wiki will not be available on this host until manually mounted"}
]
}
If yes, run the wiki restore group via: bash scripts/host-restore-cli.sh --archive <path> --target <host> --groups wiki
If NO_FUSE, inform the user with fix guidance:
FUSE is not available on this host. The wiki S3 mount cannot be set up.
Fix options:
1. **LXC template (best):** Add `features: fuse=1` to the PVE base template
2. **LXC per-container:** Set `fuse=1` on the container features in PVE
3. **tmpfiles.d:** Create `/etc/tmpfiles.d/fuse.conf` with `c /dev/fuse 0666 root root - 10:229`
After FUSE is available, re-run the restore with `--groups wiki`.
bash "$SCRIPT_DIR/scripts/host-restore-cli.sh" --archive <archive_path> --target <host> --groups <selected_groups>
Restore best practices (from vault research):
- Stop services before restoring: ssh <host> "systemctl --user stop hermes-gateway.service" to avoid conflicts with running processes. ^[queries/hermes-backup-validation-restore-preinspection.md]
- SQLite in WAL mode: do not copy a .db file without its -wal/-shm files. The hermes backup command handles this correctly. For manual sqlite files, use the .backup command.
- Validate after restoring: bash "$SCRIPT_DIR/tests/test-restore.sh" --manifest /tmp/host-backup-<host>-manifest.json
Tarball the backup directory:
cd "$(dirname "$BACKUP_DIR")"
tar czf "<host>-backup-$(date +%Y%m%d-%H%M%S).tar.gz" "$(basename "$BACKUP_DIR")"
After backup completes, offer to capture the host infrastructure snapshot to skillwiki:
{
"question": "Capture host infrastructure snapshot to skillwiki?",
"header": "Wiki capture",
"options": [
{"label": "Yes", "description": "Write infrastructure snapshot to skillwiki vault as a typed-knowledge page"},
{"label": "No", "description": "Skip wiki capture"}
]
}
If yes, use the wiki-add-task or wiki-crystallize skill to capture:
# Capture as a typed-knowledge page
skillwiki wiki-crystallize --type entity --title "Host: <hostname>" --content "
## Infrastructure Snapshot (<date>)
**Hostname:** <hostname>
**OS:** <os_id> <os_version>
**Caddy domains:** <domain_list>
**Hermes:** <version> at <home>
**Databases:** <db_summary>
**Systemd services:** <service_list>
**Apt sources:** <source_count> sources
**Profile used:** <profile_name>
**Backup archive:** <archive_path>
"
This creates a point-in-time record of host infrastructure that can be queried later for drift detection or disaster recovery reference.
If --research flag is passed or user opts in, generate research queries and run deep-research:
bash "$SCRIPT_DIR/scripts/research-host.sh" "$MANIFEST_FILE" --output "/tmp/host-backup-${HOST}-research"
Then invoke the deep-research skill for high-priority queries:
{
"question": "Run post-discovery research on detected services?",
"header": "Research",
"options": [
{"label": "Yes", "description": "Research Hermes version, OS security advisories, database backup best practices"},
{"label": "No", "description": "Skip research and proceed with backup"}
]
}
Research topics generated from manifest:
host-backup-cli.sh
Non-interactive backup for automation/cron/scripting.
bash scripts/host-backup-cli.sh [options]
| Option | Description |
|--------|-------------|
| --host HOST | SSH target hostname (required) |
| --all | Back up all available groups |
| --groups "caddy_domains,hermes,databases" | Specific group selection |
| --profile NAME | Use a backup profile (full, quick, minimal, or custom) |
| --save-profile NAME | Save current selection as a named profile |
| --list-profiles | List all available profiles and exit |
| --hermes-tier minimal\|standard\|full | Hermes backup tier |
| --dest PATH | Backup destination directory |
| --dry-run | Preview what would be backed up without doing it |
| --redetect | Re-run discovery instead of using cached manifest |
| --research | Run post-discovery research on detected services |
Hermes tier mapping:
- minimal → hermes backup --quick (config + state only)
- standard / full → hermes backup (no flags, full zip; handles SQLite WAL mode)
Important: hermes backup does NOT support --tier. Using --tier causes a silent error that produces no backup zip. Use --quick instead.
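The tier mapping above amounts to a small translation step from the --hermes-tier value to flags for hermes backup. A minimal sketch, assuming a hypothetical helper name (hermes_backup_flags) that is not part of the shipped scripts:

```shell
# Hypothetical helper: translate a --hermes-tier value into flags for `hermes backup`.
# minimal maps to --quick; standard and full map to no flags at all.
hermes_backup_flags() {
  case "$1" in
    minimal)       echo "--quick" ;;
    standard|full) echo "" ;;
    *)             echo "unknown hermes tier: $1" >&2; return 1 ;;
  esac
}

# Example use (not executed here):
#   hermes backup $(hermes_backup_flags "$TIER")
```

Rejecting unknown tiers explicitly avoids passing an unsupported flag such as --tier through to hermes backup, where it would fail silently.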
# Examples
bash scripts/host-backup-cli.sh --host sg01 --profile full
bash scripts/host-backup-cli.sh --host sg01 --profile quick --research
bash scripts/host-backup-cli.sh --host sg01 --groups "caddy_domains,hermes" --save-profile web-only
bash scripts/host-backup-cli.sh --host sg01 --all --hermes-tier minimal --dest ~/backups
bash scripts/host-backup-cli.sh --list-profiles
host-restore-cli.sh
Non-interactive restore from a backup archive.
bash scripts/host-restore-cli.sh [options]
| Option | Description |
|--------|-------------|
| --archive PATH | Backup archive path (.tar.gz) |
| --groups "caddy_domains,databases" | Groups to restore |
| --target HOST | Target host for restore |
| --all | Restore all groups |
| --dry-run | Preview restore actions without executing |
| --db-user USER | Database username for pg_restore/mysql (default: postgres/root) |
| --db-pass PASS | Database password for mysql (passed securely via temp file) |
| --allow-cross-distro | Allow apt restore across different OS (default: skip on mismatch) |
# Examples
bash scripts/host-restore-cli.sh --archive ./sg01-backup.tar.gz --target newhost --all
bash scripts/host-restore-cli.sh --archive ./sg01-backup.tar.gz --target newhost --groups "caddy_domains,hermes" --dry-run
Cached manifest at /tmp/host-backup-{hostname}-manifest.json. Use --redetect to re-run.
discover.sh connects via SSH and detects:
- Caddy config (/etc/caddy/Caddyfile): domain names and upstream targets (via caddy adapt + JSON extraction with legacy fallback)
- Systemd services (systemctl is-active, systemctl list-units)
- .db files, installed packages, apt sources
{
"hostname": "sg01",
"timestamp": "2026-05-10T14:30:00Z",
"caddy_domains": [
{"domain": "mon.karldigi.dev", "upstream": "localhost:3000"},
{"domain": "status.karldigi.dev", "upstream": "localhost:3001"},
{"domain": "term.karldigi.dev", "upstream": "localhost:8080"},
{"domain": "bot.karldigi.dev", "upstream": "localhost:7456"},
{"domain": "star.karldigi.dev", "upstream": ""}
],
"hermes": {"version": "0.13.0", "home": "/root/.hermes"},
"databases": {"sqlite": ["/root/.hermes/state.db"], "redis": ["6379"]},
"other_services": [
"hermes-gateway", "hermes-dashboard", "caddy",
"filebrowser", "obsidian", "xvfb",
"cmux-execd", "cmux-proxy", "cmux-ide", "cmux-worker-daemon"
],
"apt_sources": ["deb https://deb.debian.org/debian trixie main", "deb https://deb.debian.org/debian trixie-updates main"],
"os": "debian",
"os_version": "13"
}
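Because fresh targets may lack python3, it can be useful to read the manifest with POSIX tools alone. The sketch below lists the Caddy domains using grep and sed. The function name manifest_domains is hypothetical, and the pattern assumes the "domain" keys are formatted as in the example manifest above. It is not a general JSON parser.

```shell
# Hypothetical helper: list Caddy domains from a manifest using only grep and sed.
# Assumes entries look like {"domain": "example.dev", ...} as in the sample manifest.
manifest_domains() {
  grep -o '"domain": *"[^"]*"' "$1" | sed 's/.*: *"\(.*\)"/\1/'
}

# Example use (not executed here):
#   manifest_domains /tmp/host-backup-sg01-manifest.json
```

Where jq or python3 is available, a real JSON parser is the safer choice, since this pattern would break on escaped quotes inside a value.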
The test harness validates functional correctness per component.
# Run all tests
bash tests/test-restore.sh --manifest /tmp/manifest.json
# Test a specific group only
bash tests/test-restore.sh --manifest /tmp/manifest.json --group caddy_domains
| Group | Assertions | What's verified |
|-------|-----------|-----------------|
| base | 3 | SSH config syntax, hostname match, hosts file integrity |
| caddy_domains | 4 | caddy validate, HTTP 200 on each domain, certs valid |
| per-domain | 3 | Each domain serves correctly |
| hermes | 4 | hermes --version, gateway active, dashboard loads, CLI works |
| databases | 4 | sqlite3 opens .db, row count > 0, postgres/mysql connection |
| other_services | 3 | systemd units active, ports listening |
| apt | 3 | apt list --installed includes expected packages |
| wiki | 3 | rclone.conf, wiki mount active, fstab entry |
Backup-only groups ssh and tailscale are reinstall-prep artifacts. Restore is deliberately manual because reusing SSH host keys or Tailscale machine identity affects host trust and tailnet identity.
CLI identity restore is available only with explicit opt-in:
bash scripts/host-restore-cli.sh --archive ./backup.tar.gz --target <host> --groups "ssh,tailscale" --restore-identity
The identity restore path is intentionally dependency-light for fresh reinstall targets:
- No dependency on rsync
- Install ca-certificates before using the Tailscale apt repo
- Install tailscale before restoring /var/lib/tailscale
- Run sshd -t before restarting SSH
- known_hosts refresh guidance after SSH host key reuse
The following test "failures" are source-side edge cases, not restore bugs:
The skill integrates with hermes backup for Hermes-specific snapshots:
# Full backup (SQLite-safe, handles WAL mode)
hermes backup -o hermes-backup.zip
# Quick snapshot (config + state only)
hermes backup --quick
# Restore
hermes import hermes-backup.zip
After restoring Hermes to a target host, run this post-restore validation sequence:
1. hermes --version && hermes doctor
2. systemctl --user status hermes-gateway.service and sudo systemctl status hermes-dashboard.service
3. curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8642/health (expect 200)
4. curl -s -H "Authorization: Bearer $KEY" http://127.0.0.1:8642/v1/models
5. du -sh ~/.hermes/state.db, ls ~/.hermes/skills/ | wc -l, cat ~/.hermes/cron/jobs.json
Important: Stop the gateway BEFORE importing: systemctl --user stop hermes-gateway.service ^[entities/hermes-backup-restore-guide.md]
Also stop the dashboard if it is running as a system service: sudo systemctl stop hermes-dashboard.service
For full Hermes backup/restore reference, see the [[hermes-cli]] skill.
The scripts/hermes/ directory contains Hermes-specific backup/restore scripts absorbed from the standalone hermes-remote-backup skill. These scripts handle the Hermes agent layer of host backup via the official Hermes CLI.
| Script | Purpose |
|--------|---------|
| discover-hermes.sh | SSH discovery specific to Hermes (version, home, services) |
| remote-backup.sh | Remote Hermes backup orchestrator (hermes backup / --quick) |
| remote-restore.sh | Remote Hermes restore via hermes import with service stop/start |
| pre-inspect.sh | Restore target readiness check (arch, Python, disk, SSH, Hermes) |
| restore-validate.sh | Post-restore Hermes service validation (doctor, health, API, systemd, cron) |
| prune-backups.sh | Retention pruning for local Hermes backup archives |
| setup-remote-user.sh | Bootstrap non-root automation user on target host |
| setup-remote-cron.sh | Set up automated backup cron on target host |
| setup-nonroot-hermes.sh | Install Hermes for non-root user on target host |
The hermes-backup-worker agent (model: sonnet) orchestrates Hermes-specific mechanical tasks. It is spawned by the orchestrator (main session) for:
- Backup: discover-hermes.sh → remote-backup.sh → prune-backups.sh
- Restore: pre-inspect.sh → remote-restore.sh → restore-validate.sh
- Setup: setup-remote-user.sh → setup-remote-cron.sh → setup-nonroot-hermes.sh
Spawn pattern in interactive mode:
After user selects "hermes" group:
→ Spawn hermes-backup-worker (sonnet)
→ Agent runs: discover-hermes.sh → remote-backup.sh
→ Agent returns result summary
→ Orchestrator continues with next group or wiki capture
Model specification: Per [[concepts/claude-code-agent-model-specification]], the model: sonnet is set in the agent frontmatter (agents/hermes-backup-worker.md), not in plugin.json or SKILL.md. The Agent tool parameter can override at spawn time but defaults to the agent file setting.
Primary entry point: The hermes-backup-worker agent is the recommended way to perform Hermes backup/restore operations. Use Agent(subagent_type="hermes-backup-worker", ...) instead of calling scripts/hermes/ scripts directly. The agent handles script selection, error handling, and result reporting.
Performance note: hermes backup on sg01 creates a ~2.2 GB zip via SSH. The transfer can take 10+ minutes over WAN. Consider:
- Spawning hermes-backup-worker with run_in_background: true for non-blocking backup
- Using --profile minimal or --hermes-tier minimal for faster snapshots
- Running hermes backup directly on the host (ssh sg01 "hermes backup -o backup.zip") for large transfers
For automated restore testing on ephemeral VMs:
# Create a devsh VM (morph provider for sync support)
VM_ID=$(devsh start -p morph --json | python3 -c "import json,sys; print(json.load(sys.stdin)['id'])")
# Sync backup to VM
devsh sync "$VM_ID" ./backup-staging/
# Run test harness
devsh exec "$VM_ID" "bash /tmp/test-restore.sh --manifest /tmp/manifest.json"
# Clean up
devsh delete "$VM_ID"
Note: The devsh pve-lxc provider does NOT support devsh sync or direct SSH file transfer. Use the morph provider for restore testing. For pve-lxc, use HTTP serve (python3 -m http.server + curl) as a workaround. ^[projects/agent-skills/compound/devsh-restore-testing.md]
Before restoring to any target host, run pre-inspection to verify readiness: ^[queries/hermes-backup-validation-restore-preinspection.md]
# Architecture
ssh <host> "uname -m" # Expect: aarch64 or x86_64
# OS compatibility (critical for apt restore)
ssh <host> "cat /etc/os-release"
# Python version (Hermes requires 3.10+)
ssh <host> "python3 --version"
# Disk space (2GB+ recommended)
ssh <host> "df -h ~"
# Hermes already installed?
ssh <host> "hermes --version 2>/dev/null || echo NOT_INSTALLED"
# SSH key auth confirmed
ssh -o BatchMode=yes <host> "hostname"
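The Python requirement (3.10+) is easy to misjudge by eye, since a plain string comparison would rank 3.9 above 3.10. A minimal sketch of a numeric check on the python3 --version output follows. The helper name python_version_ok is hypothetical and not part of pre-inspect.sh.

```shell
# Hypothetical helper: succeed when a `python3 --version` string is 3.10 or newer.
# Compares major and minor as integers, so 3.9 is correctly treated as older than 3.10.
python_version_ok() {
  ver=${1#Python }        # "Python 3.11.2" -> "3.11.2"
  major=${ver%%.*}        # "3"
  rest=${ver#*.}          # "11.2"
  minor=${rest%%.*}       # "11"
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}

# Example use (not executed here):
#   python_version_ok "$(ssh <host> "python3 --version")" || echo "Python too old"
```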
- devsh sync NOT supported: use HTTP serve or morph provider
- systemctl --user FAILS (no user bus in LXC): run gateway as system service or background process
- devsh exec works for all commands
| Method | Works for | Limit | Command |
|--------|-----------|-------|---------|
| base64 + devsh exec | Text files, small binaries | ~32 KB (shell arg limit) | B64=$(base64 < file); devsh exec "$LXC" "echo \$B64 \| base64 -d > /tmp/file" |
| Chunked base64 | Any file size | Slower than HTTP for large files | devsh_transfer "$VM_ID" backup.zip /tmp/backup.zip (built into host-restore-cli.sh) |
| HTTP serve | Any file size | Requires HTTP server on local machine | python3 -m http.server 8080 & curl -o /tmp/file http://10.10.x.1:8080/file |
| SCP via sg01 bridge | Any file size | Requires sg01 as jump host | rsync -avP file sg01:/tmp/; ssh sg01 "rsync -avP /tmp/file 10.10.1.123:/tmp/" |
For backup archives larger than 32 KB (Caddy config, SSL certs, Hermes zip), use chunked base64, HTTP serve, or rsync bridge. The devsh_transfer helper in host-restore-cli.sh splits files into 30KB base64 chunks and reassembles on the remote side.
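The chunking idea can be demonstrated locally, without devsh. The sketch below is not the devsh_transfer implementation. It is a hypothetical chunked_copy function that splits a file into 22500-byte pieces (an assumed size that encodes to exactly 30000 base64 characters), encodes each piece, and decodes it back in order.

```shell
# Local illustration of chunked base64 transfer; not the devsh_transfer helper itself.
# Assumption: 22500 raw bytes per chunk, which encodes to 30000 base64 characters.
chunked_copy() {
  src=$1
  dest=$2
  workdir=$(mktemp -d)
  # -a 4 gives four-letter suffixes so large files do not exhaust chunk names.
  split -a 4 -b 22500 "$src" "$workdir/chunk."
  : > "$dest"
  for chunk in "$workdir"/chunk.*; do
    [ -e "$chunk" ] || continue   # empty source file: no chunks were produced
    # In a real transfer, this string would travel as a command argument.
    payload=$(base64 < "$chunk" | tr -d '\n')
    printf '%s' "$payload" | base64 -d >> "$dest"
  done
  rm -rf "$workdir"
}
```

The shell expands the chunk glob in sorted order, which matches the order split assigns suffixes, so the pieces are appended in sequence. Comparing source and destination with cmp (or a checksum) after the copy is a cheap way to confirm reassembly.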
- systemctl is-active quirks: prints to stdout even with stderr redirected; use &>/dev/null
- hermes backup --tier is invalid: Hermes backup uses --quick for minimal, no flag for full
- SQLite WAL mode: a bare .db copy misses the -wal/-shm files; use hermes backup or the .backup command
- Stop hermes-gateway.service before hermes import to avoid file lock conflicts ^[entities/hermes-backup-restore-guide.md]