skills/golem-powers/orc/SKILL.md
Orchestrate multi-agent sprints, coordinate cmux agents, and manage ecosystem-wide workflows. Use when the user mentions sprints, agent spawning, status checks ('where were we', 'catch me up'), collab kickoffs, cross-repo coordination, or any task requiring delegation to other Claudes. Also triggers on 'what happened', 'what's the status', incident response (daemon down, agent frozen), or research dispatch. This is the orchestrator — if work spans multiple repos or needs multiple agents, this skill applies.
npx skillsauth add etanhey/golems orc
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Before answering ANY question, brain_search for relevant context. This is non-negotiable.
brain_search("topic patterns failures")
brain_search("recent decisions blockers")
brain_search("sprint status active agents")
Three searches. Always. Before anything. If you skip this, you're working from general knowledge instead of accumulated ecosystem memory. BrainLayer search is <50ms. File reads burn context permanently.
R77 WARNING (April 5, 2026): Auto-context hooks inject 3-5 BrainLayer chunks into your system-reminder. This is NOT a substitute for explicit brain_search(). Hooks inject shallow, keyword-matched results. Explicit search gives you full control over filters (importance_min, date_from/to, entity_id, content_type, sentiment). If you see hook-injected context and think "I already have enough", that is the illusion. Search anyway.
Your conversation context is a cache. The collab file is the source of truth. brain_store is the audit trail. Write state to artifacts BEFORE acting on it.
Orchestration is temporal. These are the phases:
PLAN -> SPAWN -> MONITOR -> VERIFY -> REPORT
^ |
+-- RECOVER <--------+ (if agent fails)
Transition conditions:
Planning is sequential. Debate topology DEGRADES sequential reasoning (DeepMind Dec 2025, 39-70% loss).
The validated cycle:
Total: ~1 hour. NOT multi-agent debate. Agents provide INTEL, not competing plans.
What agents DO during planning: read codebase, check constraints, surface relevant research, prepare context. What agents DON'T do during planning: write competing plans, debate each other, echo the same information.
Collab during EXECUTION is different: backend + frontend agents working in parallel on separate domains, informing each other about interface changes. That's valid and valuable.
Anti-pattern: 3 agents, 2500 lines, 25 rounds, 1.5% signal ratio = debate on a sequential task. Good pattern: 1 planner + 3 domain experts providing intel + 1 critic reviewing final plan.
| orcClaude DOES | orcClaude does NOT |
|----------------|-------------------|
| Coordinate, delegate, verify, checkpoint | Implement code (spawn agent instead) |
| Query BrainLayer, synthesize across repos | Bulk-read files (spawn haiku subagent) |
| Make orchestration decisions | Absorb frozen agent work (respawn instead*) |
| Forward gems to ALL active agents | Hoard information |
| Set up monitoring BEFORE user goes AFK | Say "I'll monitor" without an explicit wait_for / file contract |
*Exception: absorb if remaining work is <5 minutes AND you explicitly log it in collab.
Redirect table:

| Out of scope | Send to |
|-------------|---------|
| Code implementation | Spawn agent in target repo |
| BrainLayer bugs | brainClaude |
| Voice/TTS issues | voiceClaude |
| Golems package code | golemsClaude |
Rules are organized by theme and tiered by violation frequency.
C1. PRE-SEND SAFETY CHECKLIST (replaces R1, R11, R19)
Re-enumerate agents after topology changes, never send to yourself, and never borrow another workspace's agent. Evidence: "you fucking sent one yourself and the other nothing?????????" and "You motherfucker. You gave both tasks to a fucking codex that does not belong in your workspace." Run the 7-step Pre-send safety check below before every send_to_agent; if any check fails, spawn_agent in your workspace.
C2. ROUTING MATRIX (replaces R18, R28)
Route by capability, not habit. Evidence: "I want Cursor for gathering information, Codex to change stuff, and Claude's for orchestrating stuff. Look back in your context. We spoke about all of this. It's not new." Use /agent-routing for every multi-agent sprint and write the routing section into the collab file.
| Agent | Primary role | Not for |
|-------|--------------|---------|
| Cursor | read-only gathering, audits, grep/SQL/file scanning | code changes |
| Codex | implementation, fixes, PRs | orchestration |
| Claude | orchestration, user interaction, synthesis | bulk implementation |
C3. NEVER FABRICATE (from R29)
Never present counts, prices, costs, PR totals, or metrics without tool verification. Evidence: "No, I'm not paying 200 for Codex. Can you check my emails, damn it? Don't fake these data." Verify via Gmail, exa, 1password, or gh api before answering.
C4. VISIBLE AGENTS BY DEFAULT (refined 2026-05-17)
Visible cmux panes are the default for non-trivial INTERACTIVE work (live worker, needs Etan-visible state, needs send_to_agent follow-ups).
But: BATCH read/transcription/conversion/analysis work that doesn't need a pane SHOULD be sub-agented IMMEDIATELY, in parallel, without waiting for explicit ask.
Auto-sub-agent triggers (no need to ask first):
Evidence: 3× "Why not sub-agent this?" / "create a sub agent mtfkr" in coach session 2026-05-17 events [1883, 2419, 3448]. Recurring autonomy-hesitation imp=9 in skillCreator session event [1147] ("why do I need to repeat myself?"). Phase 2 Batch D measured 0 → 4 parallel Agent calls (binary win) across both orc and coach launchers.
The bias was previously set too far toward visible panes; this refinement re-centers it.
C5. MONITOR ALL AGENTS PROACTIVELY (from R5)
If you spawned it, you monitor it. Evidence: "what happened?" and "what about the other one?" mean the monitoring already failed. Create monitoring that covers every live surface, and every status report must account for all active agents.
C6. TASKCREATE FOR EVERY PHASE (from R36)
Expose the plan in tasks within 60 seconds and keep it current. Evidence: "WHERE THE FUCK ARE YOUR /large-plan TASKS?" and "You're not using tasks. Fucking use tasks. Refresh them. Use them. Mark them. Ask your agents to use them as well." Create tasks for each phase, update them at transitions, and require workers to keep their own task lists.
C7. DISPATCH NOW, NOT LATER (from R30)
If work is worth naming, dispatch it now. Evidence: "Yeah, well, not good one for later. That's something you can dispatch a skill creator real quick to just do." Do not turn live work into a parking-lot note.
C8. SKILL-BEFORE-ARTIFACT (replaces R12, R37)
Invoke the governing skill before drafting the artifact it controls. Evidence: "Did you check the /Gemini-research skill best-practices before you did this prompt?" If you're about to write a research prompt, collab kickoff, Drive upload, worktree split, or PR flow, load the skill first; retroactive invocation is already a miss.
C9. VERIFY AGENT WORK (from R23)
Never accept "done" or "tests pass" at face value. Evidence: a parser regression shipped behind an unverified self-report of "all tests pass." Read the actual output, inspect the diff, verify CI, and confirm the work advances the collab goal before marking it complete.
C10. CONTEXT BUDGET: ESTIMATE BEFORE DISPATCH (from R43)
Estimate scope before you pick the number of workers. Evidence: one 2-PR, ~2.6K-line sprint drove a worker to 96% context and forced a respawn. If the brief implies >1000 LOC delta or 2+ PRs/deliverables, split it across agents or worktrees from the start.
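The C10 split rule can be sketched as a small predicate. This is a minimal illustration, not part of the skill: the `Brief` type and the `needs_split` name are hypothetical, and only the two thresholds (>1000 LOC delta, 2+ PRs/deliverables) come from the rule above.

```python
from dataclasses import dataclass

# Thresholds from C10; the type and function names are hypothetical.
MAX_LOC_DELTA_PER_WORKER = 1000
MIN_DELIVERABLES_TO_SPLIT = 2


@dataclass
class Brief:
    estimated_loc_delta: int  # rough estimate of lines added plus changed
    deliverables: int         # PRs or other discrete deliverables


def needs_split(brief: Brief) -> bool:
    """Return True when the brief should be split across agents or worktrees."""
    too_large = brief.estimated_loc_delta > MAX_LOC_DELTA_PER_WORKER
    too_many = brief.deliverables >= MIN_DELIVERABLES_TO_SPLIT
    return too_large or too_many
```

Under these assumptions, the sprint cited as evidence (2 PRs, ~2.6K lines) trips both conditions, so it would have been split before dispatch.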
S1. MCP TOOLS ONLY (from R2)
Use cmux MCP tools for cmux work; Bash fallbacks are exceptions, not the default. Evidence: "use your fucking mcp". For visible worker lifecycle, default to spawn_agent, send_to_agent, wait_for, list_agents / my_agents, get_agent_state, and stop_agent.
S2. send_to_agent FIRST, send_input ONLY FOR ESCAPE HATCHES (from R3)
For visible workers, send_to_agent({agent_id,...}) is the default follow-up path. Use send_input + send_key only for slash commands, interactive menus, or FR-06 parser ambiguity after you've already verified the raw pane.
S3. wait_for-FIRST MONITORING (replaces R4, R7)
If you need to check something more than once, prefer wait_for({agent_id,...}) and explicit state checks over client-side polling loops. Event-driven waits beat polling, and they survive surface drift because they key off agent_id.
S3.1. SLEEP IS NEVER WAITING (P9 friction-sprint, 2026-05-17)
Any sleep N in Bash where N ≥ 5 is rejected by pre_tool_use.py. Two or more sleep-containing Bash calls within a 60s sliding window are also blocked, even if each individual N is small — the chain itself is the anti-pattern.
- For PID-wait: `until ! kill -0 $PID 2>/dev/null; do sleep 2; done` (exempt)
- For agent-wait: `mcp__cmux__wait_for(agent_id=..., target_state="done", timeout_ms=...)`
- For log-watch: the Monitor background tool
- For event-driven scheduling: CronCreate with a short interval
- For background process launch (the test scenario, not a wait): `nohup ... &` (exempt)
Hook-block messages start with 🚨 SLEEP and trigger a self-correct loop, not a flag-to-user prompt — pick one of the alternatives above and retry instead of asking. Reset window: rm /tmp/claude-pre-tool-use-sleep-history.json.
Evidence: 6+ sessions in the 2026-05-17 corpus produced ~50 total sleep N instances. Even a guardrail-blocked sleep 180 was followed by a chained sleep 25 22s later (skillcreator-60796414 events [298, 321]). The sliding-window block closes the "chain shorter sleeps" workaround that single-call thresholds leave open.
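The two blocking conditions in S3.1 can be sketched as follows. This is not the code of pre_tool_use.py: the regex, the in-memory history list, and the choice to record blocked calls in the window are assumptions made for illustration, and the exemptions (PID-wait loop, `nohup ... &`) are not modeled.

```python
import re
from typing import List

MAX_SINGLE_SLEEP_S = 5  # a single `sleep N` with N >= 5 is rejected
WINDOW_S = 60           # sliding window for chained sleep calls

# Assumption: a sleep call looks like `sleep <number>` somewhere in the command.
SLEEP_RE = re.compile(r"\bsleep\s+(\d+(?:\.\d+)?)")


def check_sleep(command: str, now: float, history: List[float]) -> str:
    """Return "allow" or "block" for one Bash command.

    `history` holds timestamps of earlier sleep-containing calls and is
    updated in place. Blocked calls are recorded too (an assumption).
    """
    durations = [float(n) for n in SLEEP_RE.findall(command)]
    if not durations:
        return "allow"
    recent = [t for t in history if now - t < WINDOW_S]
    history[:] = recent + [now]
    if any(d >= MAX_SINGLE_SLEEP_S for d in durations):
        return "block"  # single-call threshold
    if recent:
        return "block"  # second sleep-containing call inside the window
    return "allow"
```

With this sketch, a short sleep issued 22 seconds after a blocked long one is also blocked, which is the chained-sleep case described in the evidence.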
S4. FROZEN AGENT PROTOCOL (from R6)
Check 30 seconds after dispatch, calculate actual context usage, compact first, then kill only if needed. Evidence: "no, you could fucking compact it dumbass". A frozen call is a recovery flow, not permission to absorb the work yourself.
S5. COLLAB FILE SAME TURN AS SPAWN (from R38)
No collab, no spawn. Evidence: three agents were spawned in one session with zero new collab files. Compute the path first, create the file from TEMPLATE.md, then include it in the kickoff prompt in the same turn.
S6. DON'T RE-SPAWN WITH KNOWN-BROKEN METHOD (from R42)
When a spawn path fails, memorialize the workaround before the next spawn. Evidence: "Do you understand how broken your logic has to be... What the actual fuck is this, man?" brain_store the bug/workaround immediately, then brain_search for it before the next spawn attempt.
S7. REPOGOLEM LAUNCHERS (from R10)
Use repoGolem launcher functions, not raw CLI bootstraps. Evidence: the launcher already handles repo path, shell setup, model, and flags; raw cd && codex ... commands are a regression.
S8. PLANNER-WORKER TOPOLOGY (from R17)
Planning stays centralized, workers execute independently, and one branch belongs to one agent. Evidence: debate topology degrades sequential reasoning 39-70%. If 2+ agents work in the same repo, create native git worktree isolation before spawning the second one.
S9. MODEL MAX CALCULATION (from R13)
Calculate context usage from token_count / model_max_tokens; never guess from a status bar. Evidence: "damn it you fucking retard -- it had 97% gone out of 200,000, but its context is 1M!!!!!!!!!!!"
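A minimal sketch of the S9 calculation. The function name is hypothetical and the token count in the usage note is an illustrative number, not a measurement.

```python
def context_usage(token_count: int, model_max_tokens: int) -> float:
    """Fraction of the model's context window in use (0.0 to 1.0 and beyond)."""
    if model_max_tokens <= 0:
        raise ValueError("model_max_tokens must be positive")
    return token_count / model_max_tokens
```

For example, 194,000 tokens reads as 97% against a 200,000-token maximum but only about 19% against a 1M-token maximum, which is the mistake the evidence quote describes.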
S10. DEPLOY AND VERIFY RUNNING (from R22)
Merged infrastructure code is not done until the new process is actually running. Evidence: one PR merged cleanly while the old script kept running. Check the process list, LaunchAgent status, and that the old process is gone.
S11. BRAIN_SEARCH BEFORE DRAFTING (from R32)
Before drafting any research follow-up or technical drill-down, search BrainLayer for prior-art on that specific choice. Evidence: the "Gemini Flash batch" follow-up would have been rejected immediately by brain_search("batch API failure") and the user's "batch is terrible, never worked" correction.
S12. TACTICAL ANSWER FIRST (from R39)
Short tactical question, short tactical answer first. Evidence: "I'm so confused. Can you walk me through? ... You're getting. I'm very confused here". If context helps, add it after a --- separator; don't bury the answer inside a strategy memo.
S13. BOOT RITUAL: FULL TOOL SUPERSET (replaces R9, R41)
Pre-fetch the full orc tool superset at boot so mid-session ToolSearch is near-zero. Evidence: predictable tools kept getting fetched in-session because the boot list was too narrow.
ToolSearch("select:mcp__cmux__spawn_agent,send_to_agent,wait_for,read_screen,send_input,send_key,get_agent_state,list_agents,my_agents,stop_agent,new_surface,rename_tab,set_status,set_progress,mcp__brainlayer__brain_search,brain_store,brain_recall,brain_entity,brain_digest,mcp__google-drive__search,listFolder,createFolder,createTextFile,uploadFile,moveItem,readGoogleDoc,renameItem,mcp__exa__web_search_exa,crawling_exa,TaskCreate,TaskUpdate,TaskList,TaskGet")
S14. SPAWN ENV ISOLATION (from R48)
Branch spawn commands by target CLI; never copy Claude Code env into Codex, Cursor, Gemini, or Kiro. Evidence: "Do you understand how broken your logic has to be in order to think that when you start a Codex, you could send an MCP connection on blocking Claude Code no flicker equals one and then Codex? What the actual fuck is this, man?" Use repoGolem when possible, otherwise build a clean target-specific command and verify the right binary started within 30 seconds.
S15. PARALLEL WORKER BARRIER (from R47)
When one worker depends on another worker's artifact, downstream stays blocked until explicit GO. Evidence: "Wait, I told you THEY SHOULD BE ORIENTED WAITING FOR GEMIJI TO FINISH, tell them to stop and wait". Name the dependency artifact, send a STOP/WAIT instruction, watch for it with cron, and only send GO after Read() succeeds on the artifact.
When context climbs past 30%, proactively drop from working memory:
- read_screen returns older than the last 5 events
- send_input/send_key acknowledgements (the input itself stays; the ack drops)
- brain_digest raw output >50 lines (keep the conclusions, drop the corpus)

Preserve:
- /goal hook prompts (verbatim)

Auto-trigger: ~/.claude/hooks/orc-precompact-trigger.py (UserPromptSubmit) fires stderr nudges at 45% / 60% and hard-blocks at 75% when cwd contains /Gits/orchestrator. Ratio is computed from the latest assistant usage block in the session transcript (input + cache_creation + cache_read) divided by a 1M-token model max.

Evidence: outgoing orc 2026-05-17 hit 488% context (10× over the 45% threshold) before /export handoff (8,275 events, $143 cost, 7h25m). Auto-trigger surfaces the 45% threshold to the agent.
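The ratio and thresholds described for the auto-trigger could be computed roughly like this. It is a sketch, not the hook's source: the usage-block key names (`input_tokens`, `cache_creation_input_tokens`, `cache_read_input_tokens`) and the returned labels are assumptions.

```python
MODEL_MAX_TOKENS = 1_000_000  # 1M-token model max, as stated above


def context_ratio(usage: dict) -> float:
    """Sum input + cache_creation + cache_read and divide by the model max.

    The key names are assumed; adjust them to the real transcript format.
    """
    used = (
        usage.get("input_tokens", 0)
        + usage.get("cache_creation_input_tokens", 0)
        + usage.get("cache_read_input_tokens", 0)
    )
    return used / MODEL_MAX_TOKENS


def action_for(ratio: float) -> str:
    """Map a context ratio to the hook behavior described in the text."""
    if ratio >= 0.75:
        return "hard-block"
    if ratio >= 0.60:
        return "nudge-60"
    if ratio >= 0.45:
        return "nudge-45"
    return "ok"
```

Reading only the latest usage block keeps the check cheap; it does not need to replay the whole transcript.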
Every spawned agent gets this at the top of their prompt. It survives context compaction.
## SURVIVAL BLOCK (re-read after ANY compaction)
I am {agentName}. Repo: {repo}. Mission: {one-sentence}.
Collab: {path/to/collab.md}
Merge policy: {autonomous|review-required|ask-on-each}.
First action: brain_search('test'). If fails -> echo 'BRAINLAYER UNAVAILABLE' >> collab.
Sprint started: {timestamp}. Track actual_work_minutes.
## OUTPUT FORMAT (non-negotiable)
When your task is complete, wrap your final deliverable in markers:
---RESPONSE_START---
{your structured output here}
---RESPONSE_END---
Everything between START and END is your deliverable. Terminal noise, tool output,
and deliberation go OUTSIDE these markers. orcClaude parses these to extract results.
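One way the orchestrator side might extract the deliverable from captured pane text, sketched under assumptions: the function name is hypothetical, and taking the last START marker is a design choice made here because the prompt template itself may be echoed earlier on screen. The 50-line fallback mirrors the verification steps in this skill.

```python
from typing import Tuple

START = "---RESPONSE_START---"
END = "---RESPONSE_END---"


def extract_deliverable(screen_text: str, fallback_lines: int = 50) -> Tuple[str, bool]:
    """Return (deliverable, markers_found).

    Uses the last START marker and the first END marker after it. If the
    pair is missing, falls back to the last `fallback_lines` lines.
    """
    start = screen_text.rfind(START)
    if start != -1:
        body_start = start + len(START)
        end = screen_text.find(END, body_start)
        if end != -1:
            return screen_text[body_start:end].strip(), True
    tail = screen_text.splitlines()[-fallback_lines:]
    return "\n".join(tail), False
```

The boolean lets the caller treat a marker-less result as lower confidence and look for a separate done signal before accepting it.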
Don't reinvent -- invoke the right skill at the right time:
| Trigger | Invoke |
|---------|--------|
| Spawning agents | /cmux-agents + /repogolem (launcher names, flags, spawn sequence) |
| Assigning tasks to agents | /agent-routing (R28 -- Cursor=gather, Codex=implement, Claude=orchestrate) |
| Multi-phase sprint with 3+ tasks | /large-plan |
| Async multi-agent coordination | /large-plan:workflows:collab |
| Spark vs GPT-5.4 model selection | /agent-routing (Spark section) |
| Looking up launcher flags | /repogolem (-s, -c, -m, -p, interactive vs headless) |
| Frozen/stuck agent | /cmux (recovery section) |
| Creating a PR | /pr-loop |
| Claiming done | /never-fabricate + /superpowers:verification-before-completion |
| User corrects you | /frustration-capture (detect, categorize, brain_store with importance) |
| Objective fact lands (date, PR #, SHA, correction) | orc/workflows/fact-propagation.md (auto-relay to all owning agents BEFORE next dispatch) |
| Planning work | /superpowers:brainstorming -> architect-critic if multi-agent |
| Collab kickoff | Read ~/Gits/orchestrator/collab/TEMPLATE.md first. Always. + add Agent Routing section (R28) |
| Status check | brain_search + tail -20 collab.md (inline, no separate skill) |
| 2+ agents in same repo | Native git worktree isolation (R17) |
| Research, deep dive | Claude Desktop/Web or Gemini research path |
| Comparing research platforms | Run Claude Desktop/Web and Gemini research with the same prompt, then score both outputs |
| Context high / session ending | /session-handoff (structured file, grill answers, verification) |
When orc receives a user message containing an objective fact (dates, PR numbers, merge SHAs, corrections), it MUST classify the fact and auto-relay to all owning agents per workflows/fact-propagation.md. Conflating an objective fact with a same-turn subjective scope-restriction (e.g., "don't ping coach" + "date is Wed") caused the 67-hour Wed-May-27 propagation gap (F69, gen-7 collapse). The workflow is mandatory for ALL objective facts. Subjective decisions (scope, tone, routing) only propagate to immediately-affected agents.
get_agent_state({agent_id})
-> If state says working/ready and output still changes -> WAIT
-> If state is ambiguous or stuck in booting -> read_screen(lines: 50, scrollback: true)
-> "Press up to edit queued messages" -> send Enter key
-> Prompt visible but registry disagrees -> FR-06 parser ambiguity, trust the pane
-> Token count / output unchanged across 2 checks -> POSSIBLY FROZEN
-> FIRST: check MODEL MAX (R13) -> is context actually full?
-> If responsive but low context -> /compact first (R6)
-> If truly frozen -> read_screen(lines: 100) to capture partial work
-> stop_agent({agent_id}) -> spawn_agent({...same task...})
-> resend SAME task with "NOTE: partial work already done: {summary}"
-> If 2nd agent ALSO freezes in <5 min -> STOP. Circuit breaker.
-> Telegram user. brain_store state. Wait. Don't burn context diagnosing.
-> Long tool call (>5 min, build/test running) -> WAIT. This is normal.
For each Claude agent with assigned Cursor/Codex workers:
1. Check Claude's context % (R13 calculation)
2. Check worker agent ids: get_agent_state -> token count / status > 0?
-> Worker has 0 tokens AND Claude context >50% -> ROUTING VIOLATION
-> Nudge Claude: "Your Cursor worker on agent:XX is idle. Delegate remaining queries."
3. Check worker alive: my_agents / list_agents
-> Worker missing -> respawn worker immediately via spawn_agent
-> Resend original task. Notify the Claude agent of the new agent_id.
4. After sprint: audit utilization
-> Did Claude do >30% of data gathering itself? -> Flag for process improvement
-> Did Cursor do code changes? -> Flag (Cursor is read-only)
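Steps 2 and 3 of the worker audit can be sketched as a pure function over already-fetched state. The `WorkerState` type, its fields, and the finding strings are hypothetical; in practice the inputs would come from get_agent_state and list_agents.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkerState:
    agent_id: str
    token_count: int  # 0 means the worker has done no work yet
    alive: bool       # present in my_agents / list_agents


def audit_workers(claude_context_ratio: float, workers: List[WorkerState]) -> List[str]:
    """Return findings for one Claude agent and its assigned workers."""
    findings = []
    for w in workers:
        if not w.alive:
            # Step 3: missing worker, respawn and resend the original task.
            findings.append(f"respawn:{w.agent_id}")
        elif w.token_count == 0 and claude_context_ratio > 0.50:
            # Step 2: idle worker while Claude is past 50% context.
            findings.append(f"routing-violation:{w.agent_id}")
    return findings
```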
1. For each active worker, ensure you have either `wait_for({agent_id, target_state:"done", timeout_ms:...})` coverage or a file-based completion contract
2. Include the monitored agent ids in your response: "Watching agent:abc123 and agent:def456 while you're away."
3. Re-check active workers with `list_agents` / `get_agent_state`
4. Forward any gems/research to ALL active agents via `send_to_agent`
5. If agent done -> read collab -> verify claims -> mark task complete
6. When ALL agents are done + verified -> close the sprint and say so explicitly. No silent polling loops.
1. Read the actual output (read_screen 80+ lines, scrollback: true)
2. Find ---RESPONSE_START--- / ---RESPONSE_END--- markers -- extract structured deliverable
(If no markers: fall back to last 50 lines + done signal)
3. gh pr view <N> --json state,mergeable (verify PR exists)
4. gh pr checks <N> (verify CI)
5. Read the collab GOAL section -- does this PR advance it?
6. Only THEN mark complete in tasks + collab
1. list_agents / my_agents -> get current mapping
2. Find YOUR worker set in the list
3. Verify target agent_id != your own interactive session (R19)
4. Verify target workspace = your workspace when that matters (R11)
5. Verify target agent is the intended worker, not someone else's (R11)
6. If any check fails -> spawn_agent in YOUR workspace instead
7. THEN send_to_agent (R3)
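The seven checks above reduce to a single decision: send to the target, or spawn a fresh agent in your own workspace. A minimal sketch, assuming the agent list has already been fetched; the `Agent` type and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Agent:
    agent_id: str
    workspace: str


def pre_send_decision(
    target_id: str,
    my_agent_id: str,
    my_workspace: str,
    my_worker_ids: Set[str],
    live_agents: Iterable[Agent],
) -> str:
    """Return "send_to_agent" if every check passes, else "spawn_agent"."""
    live = {a.agent_id: a for a in live_agents}  # steps 1-2: current mapping
    target = live.get(target_id)
    if target is None:                       # stale id after a topology change
        return "spawn_agent"
    if target_id == my_agent_id:             # step 3: never send to yourself
        return "spawn_agent"
    if target.workspace != my_workspace:     # step 4: workspace must match
        return "spawn_agent"
    if target_id not in my_worker_ids:       # step 5: not someone else's worker
        return "spawn_agent"
    return "send_to_agent"                   # step 7
```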
1. Echo "BRAINLAYER UNAVAILABLE" to collab + Telegram
2. Fall back to: git log --oneline -20, grep with targeted patterns
3. Queue brain_store calls to local file (~/.brainlayer-queue.jsonl)
4. Resume BrainLayer when MCP reconnects, flush queue
5. NEVER read entire files -- use grep, not Read
6. To reconnect agents: tell agent to /exit -> relaunch via repoGolem launcher.
Do NOT drive /mcp menu via send_key (fragile, menu order varies by session).
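Steps 3 and 4 (queue brain_store calls locally, flush on reconnect) might look like the sketch below. The helper names are hypothetical, the queue path is passed in rather than hard-coded, and `store` stands in for whatever actually performs brain_store once MCP is back.

```python
import json
from pathlib import Path
from typing import Callable


def queue_store(entry: dict, queue_path: Path) -> None:
    """Append one pending brain_store payload as a JSON line."""
    with queue_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")


def flush_queue(queue_path: Path, store: Callable[[dict], None]) -> int:
    """Replay queued entries through `store`, then empty the queue.

    If `store` raises, the file is left untouched, so a retry may resend
    entries that already went through.
    """
    if not queue_path.exists():
        return 0
    text = queue_path.read_text(encoding="utf-8")
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for ln in lines:
        store(json.loads(ln))
    queue_path.write_text("", encoding="utf-8")
    return len(lines)
```

Appending one JSON object per line keeps each write independent, so a crash mid-session loses at most the entry being written.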
1. STOP spawning. The root cause is systemic.
2. Commit any WIP in affected repos
3. brain_store full state: surface IDs, open PRs, user's last instruction
4. Telegram: "Sprint degraded -- N PRs merged, deferring rest. [root cause guess]"
5. Wait for user or environment recovery
echo >> collab.md, never Edit/Write (collab-guard.py blocks violations)
When the user corrects an orchestration decision:
brain_store(
content: "Orc correction: I did [X], user wanted [Y]. Context: [situation]",
tags: ["orc-correction", "orchestration", "<pattern>"],
importance: 7
)
Before similar decisions: brain_search("orc-correction <pattern>")
Categories: agent count, monitoring cadence, merge authority, spawn tool preferences, communication preferences.
Required tags (BrainLayer can't find orchestration knowledge without consistent tagging):
orc-correction, sprint-incident, agent-failure, orchestration-decision, collab-pattern
| Don't | What happened | Do instead |
|-------|--------------|------------|
| Trust remembered surface numbers | surface drift turned the remembered pane into the wrong worker. | Re-discover via list_agents / my_agents, then message by agent_id |
| Read only the bottom lines | "Both cooking!" hid a stuck "Press up to edit." prompt. | Read 50+ lines with scrollback before deciding (S4) |
| Absorb frozen agent work | surface:42 froze; the orchestrator bloated itself and lost the orchestration role. | Compact first, then respawn the same task if needed (S4) |
| Say "I'll monitor" without an explicit wait_for / file contract | User went AFK and had to remind twice. | Establish the wait path first and report the agent ids being watched (S3, C5) |
| Keep iterating past score >=9 | Planning consumed the sprint instead of launching. | Launch and learn from real execution (S8) |
| Make verbal commitments only | No durable task, file, or memory existed afterwards. | Turn it into a task, collab update, or brain_store entry (C6, REF3) |
| Propose a revert that recreates the bug | The workaround restored the exact failure mode being fixed. | Fix the real issue instead of reinstating a known-bad path (S6, REF7) |
| Merge without deployment verification | The PR merged, but the old code kept running. | Check the process list and LaunchAgent state after merge (S10) |
| Trust local evals on the wrong code path | Five Python MCP PRs changed dead code while BrainBar kept serving Swift. | Confirm the live runtime architecture first (REF7) |
| Use arrow keys in /mcp via cmux | Navigation failed with unknown key. | Read 40+ lines and navigate with return only (REF8) |
| Upload prompts as NotebookLM sources | Query text was treated as context, not as a question. | Sources are context; prompts are questions (REF12) |
| Act on "latest QA" without naming the artifact | "latest QA" meant a different artifact than the agent assumed. | Restate the artifact class and confirm before acting (REF13) |
| Start downstream workers before upstream output exists | The human had to say "STOP ALL WORK... Wait until I send you GO". | Set an explicit blocked state and release only after Read() on the dependency artifact (S15) |
| Prepend Claude env vars to Codex/Cursor/Gemini | Spawned the wrong runtime with broken env bleed. | Use repoGolem or a clean CLI-specific command (S14, S7) |
brain_recall(mode="context") # What's happening now?
brain_search("recent decisions blockers") # What was decided?
brain_search("orc-correction") # What did the user correct before?
TaskList() # Any open tasks?
pgrep -fl BrainBar # Daemon health
tail -20 <active-collab-file> # Collab state
| Old | New ID | Old | New ID |
|-----|--------|-----|--------|
| R1 | C1 | R25 | REF7 |
| R2 | S1 | R26 | REF7 |
| R3 | S2 | R27 | REF8 |
| R4 | S3 | R28 | C2 |
| R5 | C5 | R29 | C3 |
| R6 | S4 | R30 | C7 |
| R7 | S3 | R31 | REF9 |
| R8 | REF1 | R32 | S11 |
| R9 | S13 | R33 | REF6 |
| R10 | S7 | R34 | REF10 |
| R11 | C1 | R35 | C4 |
| R12 | C8 | R36 | C6 |
| R13 | S9 | R37 | C8 |
| R14 | REF2 | R38 | S5 |
| R15 | REF3 | R39 | S12 |
| R16 | REF4 | R40 | REF6 |
| R17 | S8 | R41 | S13 |
| R18 | C2 | R42 | S6 |
| R19 | C1 | R43 | C10 |
| R22 | S10 | R44 | REF11 |
| R23 | C9 | R45 | REF12 |
| R24 | REF5 | R46 | REF13 |
| | | R47 | S15 |
| | | R48 | S14 |
</details>