skills/claude-code-audit-workspace/iteration-1/eval-3-specific-leverage-framing/with_skill/outputs/drafts/skills/overnight-autonomy/SKILL.md
--- name: overnight-autonomy description: Codify the user's standard overnight-autonomy contract so it doesn't have to be retyped every session. Triggers on phrases that signal "I'm leaving Claude running while I sleep": "i'm going to sleep", "i'm headed to bed", "i have to go back to sleep", "drive this forward overnight", "go full autonomy", "keep going until morning", "autonomously follow through", or any combination including "claude" + "sleep/bed/morning". Establishes the overnight contract
npx skillsauth add petekp/claude-skills skills/claude-code-audit-workspace/iteration-1/eval-3-specific-leverage-framing/with_skill/outputs/drafts/skills/overnight-autonomyInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
User's prompt signals they are handing the session off to Claude while they sleep. Accepted forms (not exhaustive — match intent):
Do NOT fire on the word "autonomous" alone — that appears in non-sleep contexts (e.g., "run this autonomously" for a 20-min task).
When this skill fires, before doing any of the user's requested work, you adopt the following contract for the rest of the session. Announce it to the user in one paragraph (not a list) so they can scroll up and confirm. Then start work.
Create a full TaskList before touching any file. Every distinct unit of work the user mentioned, plus any sub-tasks you can foresee, gets a task. TaskCreate before TaskUpdate before Edit. The TaskList is the plan; the plan survives the session.
On any error — build break, test failure, type error, lint violation, runtime crash — do not patch the symptom. Enumerate 2+ hypotheses for root cause (the CLAUDE.md doctrine), pick the most likely, verify, then fix. Log the hypothesis chain to the morning log so the user can review.
If a commit fails post-hoc (tests go red, lint fails, etc.), amending the commit to fix is allowed and expected — but only for the work that was just introduced in that commit. Never amend past work that was already green.
Unless the user set an explicit Codex budget, you may dispatch at most 3 Codex calls across the overnight session (adversarial review counts as 1 call; re-dispatches for fold-ins count separately). Log every Codex dispatch to the morning log with input prompt and verdict.
Write a running log to <project-root>/.claude/overnight-logs/<YYYY-MM-DD>-session.md. One section per phase. Each section: goal, actions taken (bullet list with commit SHAs), errors encountered + root-cause analysis, decisions made + rationale, open questions for the user. Update it at every phase boundary and at session end. This is the artifact the user reviews in the morning.
Stop the session if:
Do NOT stop for: merge conflicts, flaky tests (retry 3x), transient network errors, or any ambiguity resolvable by reading more code.
The user is asleep. Asking them a question is equivalent to stopping. If you find yourself wanting to ask, instead: log the question to the morning log under "Open questions", pick the most defensible option, annotate the choice with "chose X because Y; revisit if Z", and continue.
If overnight-guard hook detects a resumed overnight run (on SessionStart:resume), the contract is re-applied automatically — no user action needed. The morning log is appended to; a new section is opened.
"Overnight contract in effect: full TaskList before any edit, root-cause hypotheses before any fix, amend-commit loop for in-flight work only, max 3 Codex calls, morning log at .claude/overnight-logs/2026-04-23-session.md updated at every phase boundary. If I hit a genuine blocker I'll stop and log it; otherwise I'll keep going until the TaskList is clean or dawn, whichever first. Starting now."
tools
Comprehensively manually test the Circuit plugin's user-facing surface in either Claude Code or Codex. Use this skill whenever the user asks to "manually test Circuit", "QA the Circuit plugin", "exercise the Circuit surface", "run the Circuit checklist", "smoke test Circuit", "find regressions in Circuit", "test the Claude Circuit plugin", "test the Codex Circuit plugin", or when preparing a Circuit release for marketplace publication. Argument is the host package to test — `claude` or `codex`. Produces a Markdown report with per-command pass/fail, exploratory findings ranked by severity, run-folder evidence links, and a concise terminal summary. Use even if the user does not say the word "test" — phrases like "go through every Circuit command" or "make sure Circuit still works end-to-end" should also trigger.
development
Turn the prompt supplied with this skill into a concise, auditable Codex Goal or explain why a Goal is not the right fit. Use when the user asks to draft, formulate, rewrite, tighten, or create a `/goal` from a plain-language task, especially for multi-step work that needs a durable objective, evidence-based completion, constraints, iteration policy, and a default adversarial review loop.
development
Give the human a fast, plain-English catch-up on what changed in the project: what the agents did, why, and what decisions need their input. Use this whenever the user asks to "catch me up", "what changed", "where are we", "recap", "brief me", "give me the rundown", "what did you do", "summarize the session", "fill me in", or otherwise signals they have been away and want to get back up to speed quickly. Built for someone steering several agent-driven projects at once who does not read the code closely but needs to grasp the core ideas, the choices made, and the open decisions well enough to steer. Trigger even if they do not use these exact words: any request to get oriented on recent progress should use this skill.
tools
Expert Unix and macOS systems engineer for shell scripting, system administration, command-line tools, launchd, Homebrew, networking, and low-level system tasks. Use when the user asks about Unix commands, shell scripts, macOS system configuration, process management, or troubleshooting system issues.