skills/workbench/SKILL.md
Set up, resume, or repair a compact active execution workbench for long-horizon, multi-session or checkpointed work. Use when a task needs durable handoff, unattended iteration, human gates, auditable evidence, or active-vs-archive routing that keeps a current packet separate from stale historical context. Do not use for one-session tasks, ordinary plans/reviews/audits, one-session bug fixes, direct code edits, or simple docs cleanup; complete those directly.
npx skillsauth add plimeor/agent-skills workbench
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
The active execution packet is the single handoff. A fresh agent with no prior context can orient from the active files alone, execute exactly one open gate, emit auditable evidence, and either continue or stop cleanly at a human gate. Historical evidence remains traceable, but it does not compete with the current entrypoint. Decisions are not re-litigated, unverified claims do not pass as verified, and the loop does not do silent work.
Activate only when at least one condition is true:
Complete the work directly when it is completable and verifiable now: one-session coding tasks, audits, reviews, simple plans, single-doc cleanup, ordinary handoff notes, or direct edits. If direct work becomes multi-session, open the full loop then; there is no lighter starter mode.
Autonomous loop is the default: run complete iterations continuously, stopping only at human gates and stop conditions. Manual mode is entered only when the user asks for it (for example, "manual", "手动", "one step at a time", or an invocation argument) and changes pacing only:
Everything else - active packet, artifacts, honesty labels, evidence rules, gate protocol, and no self-certification - is identical in both modes. Record the active mode in the cursor; switching is per-request and reversible.
Before creating, resuming, or repairing a workbench, classify the packet state:
archives/ and create a small current packet.
A takeover packet created from existing files or a current user request starts as Needs approval. The first execution iteration does not start until the decision authority records D00 with an ISO date, timezone when relevant, and a verbatim user approval quote for the new active loop. Self-authored boundaries, gate classes, stop conditions, or archive routes are drafts until that row exists.
Consolidation is for same-objective evidence where the original wording does not need to remain inspectable. Archive routing is for stale instructions, failed drivers, superseded packets, human approvals, or historical wording that must remain intact.
Choose one active packet root. Follow the repo's existing convention if one exists; if none exists, use docs/workbench/<YYYYMMDD>-<task>/.
The packet is valid by roles, not filenames. Every role has one clear home. A file may host multiple small roles only under explicit headings; a large role may split only when the entry file names every part.
Required roles:
Archive route: none.
Fallback file map when no repo convention exists:
docs/workbench/<YYYYMMDD>-<task>/
  00-current-loop.md   -> Entry/cursor
  01-intent.md         -> Intent authority
  02-boundaries.md     -> Boundary and gate contract
  03-ledger.md         -> Execution ledger
  04-verification.md   -> Verification contract
  05-decisions.md      -> Decision authority or pointer index
  06-archive-map.md    -> Archive map
  10+ or evidence/     -> Evidence stream
  artifacts/           -> Artifacts store
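As a rough illustration of the fallback map, the packet root and role files could be scaffolded as below. This is a minimal sketch, not part of the skill: `packet_root` and `scaffold_packet` are hypothetical helper names, the files are created empty, and the evidence stream is left out because its numbering is decided per iteration.

```python
from datetime import date
from pathlib import Path

# Role -> fallback filename, copied from the file map above.
FALLBACK_FILES = {
    "Entry/cursor": "00-current-loop.md",
    "Intent authority": "01-intent.md",
    "Boundary and gate contract": "02-boundaries.md",
    "Execution ledger": "03-ledger.md",
    "Verification contract": "04-verification.md",
    "Decision authority or pointer index": "05-decisions.md",
    "Archive map": "06-archive-map.md",
}


def packet_root(task: str, day: date, base: Path = Path("docs/workbench")) -> Path:
    """Return the fallback packet root docs/workbench/<YYYYMMDD>-<task>/."""
    return base / f"{day:%Y%m%d}-{task}"


def scaffold_packet(root: Path) -> dict[str, Path]:
    """Create empty role files plus artifacts/ (hypothetical helper).

    Existing files are left untouched so a resumed packet is not overwritten.
    """
    (root / "artifacts").mkdir(parents=True, exist_ok=True)
    paths = {}
    for role, name in FALLBACK_FILES.items():
        path = root / name
        path.touch(exist_ok=True)
        paths[role] = path
    return paths
```

A repo with its own workbench convention would skip this map entirely; the roles matter, not the filenames.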
Archive packet follows the repo's workbench convention. Default:
docs/workbench/archives/<old-task-or-packet>/
The active packet owns current routing. Archive paths keep historical traceability and are not part of the default fresh-agent read set.
The human owns intent. The plan or intent file is authored by the user, transcribed from their words and confirmed, or points to an already-stable contract. The agent derives the active loop from that intent: read set, exit criteria, gate classes, boundary declaration, stop conditions, and archive routing.
That derived loop is a human gate. Iteration starts only after D00 records the user's dated verbatim approval. Thereafter the cursor and ledger are agent-writable every iteration; authority sections - intent, boundaries, gate classes, exit criteria, stop conditions, verification thresholds, and archive routing - change only through a human gate.
When repairing or replacing an existing packet, summarize only currently authorized facts in the new active packet. Stale claims enter as cited historical evidence, decision IDs, or Needs approval inferences. Promote them to current authority only through a decision row or current evidence that the verification contract accepts.
The archive map is a lookup table, not a summary. It points from current needs to exact historical targets without restating old decisions, checklists, evidence excerpts, or obsolete instructions.
Each archive-map row records:
An archive read requires an exact target. A directory, archived driver name, phase packet, or broad search need is insufficient. If the archive map says Archive route: none, archive files are out of scope until a human gate changes the map.
Search active files and current repo sources first. When using rg, exclude the archive path, defaulting to docs/workbench/archives/**, until the archive map authorizes an exact target.
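The archive exclusion can be made mechanical by building the rg invocation in one place. The sketch below assumes ripgrep is installed and relies on its `--glob` option, where a leading `!` excludes matching paths; `rg_args` and `search_active` are hypothetical helper names.

```python
import subprocess

# Default archive path named in the rule above.
DEFAULT_ARCHIVE_GLOB = "docs/workbench/archives/**"


def rg_args(pattern: str, search_root: str = ".",
            archive_glob: str = DEFAULT_ARCHIVE_GLOB) -> list[str]:
    """Build an rg argv that skips the archive path.

    A --glob value prefixed with "!" tells ripgrep to exclude matching paths.
    """
    return ["rg", "--glob", f"!{archive_glob}", "--", pattern, search_root]


def search_active(pattern: str, search_root: str = ".") -> str:
    """Run the search; requires rg on PATH.

    ripgrep exits with 1 when nothing matched, which is not an error here.
    """
    result = subprocess.run(rg_args(pattern, search_root),
                            capture_output=True, text=True)
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip())
    return result.stdout
```

Once the archive map authorizes an exact target, that single file can be read directly rather than by widening the search.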
Archived evidence can explain history. It closes a current gate only when the verification contract explicitly accepts it as fresh enough, or current evidence revalidates it. Archive-derived claims become current status only after the ledger cites a current evidence record or the decision authority imports them.
Preserve archive wording when exact historical text matters. Put archive status in the archive map or a parent index; prepend banners to archived files only when the user has approved changing the preserved text.
Autonomous mode may run multiple iterations only after each previous iteration has completed record, ledger, cursor, and self-review steps.
No silent work: every iteration produces exactly one numbered evidence record plus a ledger update, including iterations that only discover a blocker. Two exceptions exist: re-confirming an already-documented blocker with no new observation records a dated one-line ledger entry pointing at the existing blocker record; a consolidation iteration produces the consolidated docs it folds into instead of the single evidence record.
Declare every exit item as one of two gate classes:
Needs approval, and any capability the agent lacks. The agent assembles evidence, lists closed and open items with owners, and stops.
Approval exists in exactly one form: a decision row with ISO date, timezone when relevant, and a verbatim user quote. "They probably approved this" does not satisfy the gate.
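The shape of an approval row can be checked mechanically before an iteration starts. The sketch below is illustrative only: `DecisionRow` and its field names are assumptions, and code can confirm only that a date parses and a quote is present, not that the quote is truly verbatim.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionRow:
    decision_id: str    # for example "D00"
    iso_date: str       # for example "2025-01-31"
    quote: str          # the user's words, copied verbatim
    timezone: str = ""  # filled in only when relevant


def is_valid_approval(row: DecisionRow) -> bool:
    """True when the row has a parseable ISO date and a non-empty quote."""
    try:
        date.fromisoformat(row.iso_date)
    except ValueError:
        return False
    return bool(row.quote.strip())


def may_start_iteration(decisions: list[DecisionRow]) -> bool:
    """Execution may begin only once a well-formed D00 row exists."""
    return any(r.decision_id == "D00" and is_valid_approval(r) for r in decisions)
```

An inferred or paraphrased approval has no quote to record, so it fails the check by construction.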
A stop report contains: blocking gate, closed evidence pointers, open items with honesty labels and owners, and the exact decision or action requested. Stop reports and checkpoint exit reports may re-list the open items they put before the human; they are the exception to the no-re-listing rule.
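One way to keep stop reports complete is to give the four parts a fixed structure. This is a sketch under assumptions: the class names, field names, and rendered layout are hypothetical, and the gate and file names in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OpenItem:
    description: str
    label: str  # one honesty label, e.g. "Blocked" or "Needs approval"
    owner: str


@dataclass(frozen=True)
class StopReport:
    blocking_gate: str
    closed_evidence: list[str]  # pointers only, not restated content
    open_items: list[OpenItem]
    requested_action: str       # the exact decision or action asked of the human

    def render(self) -> str:
        """Render the report as plain text, one line per pointer or item."""
        lines = [f"Blocking gate: {self.blocking_gate}", "Closed evidence:"]
        lines += [f"  - {pointer}" for pointer in self.closed_evidence]
        lines.append("Open items:")
        lines += [
            f"  - [{item.label}] {item.description} (owner: {item.owner})"
            for item in self.open_items
        ]
        lines.append(f"Requested: {self.requested_action}")
        return "\n".join(lines)


report = StopReport(
    blocking_gate="G3 deploy approval",
    closed_evidence=["evidence/012-build.md"],
    open_items=[OpenItem("Production deploy", "Needs approval", "user")],
    requested_action="Approve or reject the deploy",
)
print(report.render())
```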
Each verification-gated item states the required source or command, expected evidence, whether prior or archive evidence is acceptable, freshness requirement, and Blocked / Not run fallback label.
Every claim in every doc carries one label:
Observed (the check ran; source and result recorded) | Inferred (engineering reasoning) | Recommended (target state) | Needs approval | Blocked | Not run
Evidence is an observed source plus a reproducible pointer: command output with exit code, file path plus line range, screenshot or artifact path, tool output, or external result with timestamp. A bare assertion is not evidence.
Inline output is the decisive excerpt - at most about 20 lines plus the exit code. Longer output is written once to artifacts/ and cited by path.
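The inline-versus-artifact rule might be applied as in the sketch below. It treats "about 20 lines" as a hard limit of 20, which is an assumption, and `record_output` is a hypothetical helper; choosing which lines are decisive remains a judgment the code does not make.

```python
from pathlib import Path

MAX_INLINE_LINES = 20  # the rule says "at most about 20 lines"


def record_output(output: str, exit_code: int, artifacts_dir: Path, name: str) -> str:
    """Return the text to place in an evidence record.

    Short output is inlined with its exit code. Longer output is written
    once under artifacts/ and cited by path instead.
    """
    lines = output.splitlines()
    if len(lines) <= MAX_INLINE_LINES:
        return f"{output.rstrip()}\nexit code: {exit_code}"
    artifacts_dir.mkdir(parents=True, exist_ok=True)
    path = artifacts_dir / name
    if not path.exists():  # written once; later calls only cite it
        path.write_text(output, encoding="utf-8")
    return f"see {path} ({len(lines)} lines)\nexit code: {exit_code}"
```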
Not run and Blocked items may never be presented, summarized, or counted as verified. Once a golden value such as a hash, frozen output, or approved figure is recorded, reuse it by reference; never re-derive it as new progress.
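The labels lend themselves to a closed enumeration, so that a count of verified items cannot quietly absorb other labels. In this sketch, treating only Observed as verified is an assumption drawn from the label definitions; the `Label` enum and `verified_items` helper are hypothetical.

```python
from enum import Enum


class Label(Enum):
    OBSERVED = "Observed"
    INFERRED = "Inferred"
    RECOMMENDED = "Recommended"
    NEEDS_APPROVAL = "Needs approval"
    BLOCKED = "Blocked"
    NOT_RUN = "Not run"


def verified_items(claims: dict[str, Label]) -> list[str]:
    """Return only the claims whose check actually ran (Observed)."""
    return [name for name, label in claims.items() if label is Label.OBSERVED]


claims = {
    "unit tests pass": Label.OBSERVED,
    "migration is reversible": Label.INFERRED,
    "load test": Label.NOT_RUN,
    "prod credentials": Label.BLOCKED,
}
print(verified_items(claims))  # ['unit tests pass']
```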
superseded by Dnn in the same edit; status stamps are the only edit existing rows accept. Failed approaches are settled decisions. Reopening a settled row requires new Observed evidence contradicting its recorded basis, cited in the superseding row, and a human gate when the row carries human approval.
Supersedes: NNN-name.md; the superseded record receives only one permitted retro-edit: SUPERSEDED by NNN - do not cite as current authority. Deletion at consolidation is the sole exception to append-only.
One home per fact: gate status lives only in the ledger, the cursor only in the active loop, decisions only in the decision authority, intent only in the intent authority, verification thresholds only in the verification contract, and archive routing only in the archive map. Every other doc cites IDs or paths instead of restating content.
Every doc lives in exactly one regime:
UPDATE:, CORRECTION:, Revised:, or dated addenda do not appear in these docs.
When a finding is promoted from evidence into intent, decision, verification, or ledger authority, the fact's home moves with it. Later docs cite the authority row, not the original passage.
Per-iteration evidence is the audit trail during active work. Consolidation triggers are mechanical: at every checkpoint exit, and whenever more than 8 evidence records have accumulated since the last consolidation. The next iteration is then a consolidation iteration before any new gate is selected.
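Because the triggers are mechanical, they reduce to a single predicate. A minimal sketch, with `consolidation_due` as a hypothetical name:

```python
CONSOLIDATION_THRESHOLD = 8  # "more than 8 evidence records" triggers consolidation


def consolidation_due(records_since_last: int, at_checkpoint_exit: bool) -> bool:
    """True when the next iteration must be a consolidation iteration."""
    return at_checkpoint_exit or records_since_last > CONSOLIDATION_THRESHOLD
```

Note the strict inequality: eight accumulated records do not trigger consolidation on their own, while the ninth does.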
To consolidate, fold iteration records into thematic evidence docs plus a checkpoint exit report, re-derive from original records, remove the originals from the active read path, rewrite active pointers, and record the old-to-new mapping in the ledger. Append-only history lines and stamped decision rows keep their original pointers, resolved through that mapping.
Deleting originals during consolidation requires reliable durable history, such as git. If durable history is unavailable or uncertain, move originals to archive or preserve a manifest before removing them from the active path.
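The delete-or-archive choice could look like the sketch below. Whether durable history is reliable is left to the caller as a boolean, since the text does not define a test for it; `retire_original` is a hypothetical helper, and it does not write the ledger mapping or archive-map row that the surrounding rules still require.

```python
import shutil
from pathlib import Path
from typing import Optional


def retire_original(record: Path, archive_dir: Path,
                    durable_history: bool) -> Optional[Path]:
    """Remove a consolidated record from the active path.

    With reliable durable history (for example, the record is committed to
    git) the file is deleted. Otherwise it is moved into the archive and
    the new location is returned so it can be recorded.
    """
    if durable_history:
        record.unlink()
        return None
    archive_dir.mkdir(parents=True, exist_ok=True)
    destination = archive_dir / record.name
    shutil.move(str(record), str(destination))
    return destination
```

When in doubt, passing `durable_history=False` is the conservative choice, matching the "unavailable or uncertain" wording above.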
A summary may exist only where the records it covers are gone from the active path. Keep summary-plus-originals only when originals are archived and routed by the archive map.
Use archive split, not consolidation, when the problem is stale historical authority in the default route: stale drivers, multiple competing phases, preserved wording, failed loops, or a current objective that can be stated in a much smaller packet with archive lookups.
At final exit: consolidate or archive as needed, write the final-state doc, reconcile the originating intent, move stable contracts to their permanent project home, set the active loop status to closed, and stop touching the directory. Memory receives retrieval cues and reusable lessons only; it does not duplicate workbench authority or project contracts.
Required confirmations - any no means the iteration is not done:
D00 contain dated verbatim human approval before execution?
Defect checks - any yes means the iteration is not done:
Inferred, Not run, or archived claim for Observed current evidence?
archives/ as default context?