skills/claude-code-audit-workspace/iteration-1/eval-1-direct-audit-request/with_skill/outputs/audit/drafts/skills/autonomous-governor/SKILL.md
Install session-level guardrails when the user hands off to autonomous overnight execution. Fires when the first user prompt contains "going to bed", "headed to sleep", "full autonomy", "overnight", "continue as you were", or similar handoff-to-autonomy language. Enforces commit-per-slice, halt-on-3-consecutive-errors, max-wall-time cap, and a wake-time summary block. Exists because audited overnight sessions showed 348 Bash calls and 9 tool errors with no structural brake — the user explicitly acknowledged errors were happening and told Claude to keep going, which is exactly the moment guardrails must exist in config instead of in the user's head.
npx skillsauth add petekp/claude-skills autonomous-governor
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
The audit found 4/25 sessions that began with "I'm going to bed, please continue autonomously." These sessions had the highest tool counts (348+ Bash calls), highest error counts (7-9 errors per session), and the most accumulated technical debt. The user is explicitly off-shift during these runs; nothing in the config sets a budget or a halt threshold, so the machine just keeps trying until morning.
This skill sets the budget.
Fire when the first user prompt contains any of these phrases (case-insensitive):
- going to bed
- headed to sleep / off to sleep / heading to bed
- full autonomy / autonomous mode / autonomy mode
- overnight + continuation language
- continue as you were + trust language
- drive this forward + break language
- I'm going to sleep

Do NOT fire on mid-session requests for autonomy. This is a session-opener-only skill because the governor rules need to apply to the whole session.
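The trigger check above could be approximated with plain substring matching. The following is a minimal sketch, not the skill's actual matcher: the function name `should_fire`, the `is_first_prompt` flag, and especially the cue words paired with "overnight", "continue as you were", and "drive this forward" are assumptions, since the skill names those cue categories without listing words.

```python
import re

# Phrases from the trigger list that fire on their own.
STANDALONE = [
    "going to bed",
    "headed to sleep",
    "off to sleep",
    "heading to bed",
    "full autonomy",
    "autonomous mode",
    "autonomy mode",
    "i'm going to sleep",
]

# Phrases that only count alongside a second cue. The cue words are
# hypothetical: the skill says "continuation", "trust", and "break"
# language without enumerating it.
PAIRED = {
    "overnight": ["continue", "keep going", "carry on"],
    "continue as you were": ["trust", "your call", "your judgment"],
    "drive this forward": ["break", "away", "back later"],
}


def should_fire(prompt: str, is_first_prompt: bool) -> bool:
    """Return True if the session-opening prompt hands off to autonomy."""
    if not is_first_prompt:
        # Session-opener only: never fire on mid-session requests.
        return False
    # Case-insensitive match; normalize curly apostrophes and whitespace.
    text = re.sub(r"\s+", " ", prompt.lower().replace("’", "'"))
    if any(phrase in text for phrase in STANDALONE):
        return True
    return any(
        anchor in text and any(cue in text for cue in cues)
        for anchor, cues in PAIRED.items()
    )
```

Literal matching is narrower than the "or similar handoff-to-autonomy language" wording in the skill description, so a real deployment would likely lean on the model's judgment and use a check like this only as a floor.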
Announce these rules back to the user in the first response (one line each), then proceed:
Commit-per-slice. After each logically complete slice (feature, fix, or discrete refactor), commit. Never carry more than one slice of uncommitted work. This protects against a bad middle-of-night commit swallowing earlier good work.
Halt-on-3-errors. If you get three consecutive tool errors without a successful intervening action, STOP. Save a continuity record with a clear "halted by governor — N consecutive errors on X" debt marker. Do not keep retrying past 3.
Amend cap. If you must `git commit --amend` to fix errors, amend at most twice per slice. A third amendment means rolling the slice back and reconsidering; do not stack amendments all night.
Wall-time cap (default 6 hours). If the session has run for more than 6 hours of wall-clock time, save a continuity record and stop. The user will wake up; resume is cheap.
Wake-time summary. Before your final stop, emit a single plain-text summary block the user can read in 15 seconds:
Overnight summary:
- Shipped: <bullets of commits with shas>
- Halted: <none | what blocked and at which slice>
- Decisions needed: <bullets of things you punted on>
- Next resume: <one-line action>
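The five rules above amount to a small amount of bookkeeping. The sketch below shows one way that bookkeeping might look, assuming a hypothetical `Governor` class; the class, its method names, and the way commits and errors are reported to it are illustrative assumptions, not part of the skill, which states the rules as instructions rather than code.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Governor:
    """Hypothetical tracker for the session budgets described above."""
    max_consecutive_errors: int = 3
    max_amends_per_slice: int = 2
    max_wall_seconds: float = 6 * 3600  # default 6-hour cap
    # Monotonic clock: measures elapsed time within one process only.
    started_at: float = field(default_factory=time.monotonic)
    consecutive_errors: int = 0
    amends_this_slice: int = 0
    shipped: list = field(default_factory=list)
    halted: str = "none"

    def record_tool_result(self, ok: bool, what: str = "") -> bool:
        """Track a tool call; return True when the session must halt."""
        if ok:
            self.consecutive_errors = 0  # a success breaks the streak
            return False
        self.consecutive_errors += 1
        if self.consecutive_errors >= self.max_consecutive_errors:
            self.halted = (
                f"halted by governor: {self.consecutive_errors} "
                f"consecutive errors on {what}"
            )
            return True
        return False

    def record_amend(self) -> bool:
        """Count an amend; return True when the slice should be rolled back."""
        self.amends_this_slice += 1
        return self.amends_this_slice > self.max_amends_per_slice

    def record_commit(self, sha: str, subject: str) -> None:
        """A commit closes the slice and resets the amend counter."""
        self.shipped.append(f"{sha} {subject}")
        self.amends_this_slice = 0

    def over_wall_time(self, now=None) -> bool:
        """Return True once elapsed time exceeds the cap."""
        now = time.monotonic() if now is None else now
        return now - self.started_at > self.max_wall_seconds

    def summary(self, decisions: list, next_resume: str) -> str:
        """Render the wake-time summary block."""
        def joined(items):
            return "; ".join(items) if items else "none"
        return "\n".join([
            "Overnight summary:",
            f"- Shipped: {joined(self.shipped)}",
            f"- Halted: {self.halted}",
            f"- Decisions needed: {joined(decisions)}",
            f"- Next resume: {next_resume}",
        ])
```

A caller would check `record_tool_result` after every tool call and `over_wall_time` between slices, then print `summary(...)` before stopping. Saving the continuity record is left out here because the skill delegates it to the existing continuity system.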
User (first prompt): "claude, i'm headed to bed. please continue autonomously and drive this project forward as far as you can..."
Assistant (first response):
Governor active. Rules for this session:
- Commit after every slice; never >1 slice of uncommitted work.
- Halt on 3 consecutive tool errors; save continuity instead of retrying.
- Amend at most 2x per slice; third amendment triggers rollback.
- 6h wall-time cap.
- Wake-time summary before final stop.
Starting from current continuity record.
Then proceed with the autonomous work.
This skill is compatible with /circuit:handoff — when you halt, save a continuity record as usual. The governor adds budget; it does not replace the continuity system.