etanhey — AI Agent Skills on SkillsAuth

ecosystem-health

Run ecosystem health checks — MCP connections, BrainLayer stats, skill evals, friction scans. Use this skill when asked about ecosystem health, maintenance checks, skill monitoring, 'is everything working', 'run a health check', 'what's broken', or when proactively auditing the system. Also triggers for 'maintenance Claude', 'ecosystem audit', 'skill eval', 'MCP status', or 'BrainLayer health'. Run this before and after major changes to catch regressions.

tools3

catchup

Use when returning to work after any break — auto-detects depth. Short break (hours): reads only uncommitted changes. Long break (48h+) or context overflow: reads all branch changes vs main. Covers catchup, context recovery, refresh, rebuild understanding. NOT for: mid-task exploration (use Read/Grep directly).

development3

brave

Use as fallback browser automation when Claude-in-Chrome MCP is unavailable. Covers browser control, navigation, screenshots, clicking, typing. NOT for: headless testing (use Playwright). Claude Code users should prefer MCP first.

tools3

claude-web-research

DEPRECATED ALIAS — renamed to /claude-desktop-research on 2026-04-30. Use /claude-desktop-research instead. This alias remains active until 2026-05-30, then retires. Triggers: 'research prompt', 'Claude Web research', 'Claude Desktop research', 'deep research'.

development3

commit

Use when ready to commit changes. Runs CodeRabbit review first, then commits if review passes. Supports Ralph mode for atomic commit + criterion marking. Covers commit, ralph commit, atomic commit. NOT for: pushing or creating PRs (use pr-loop).

development3

github

Git and GitHub operations via gh CLI — branching, committing, creating PRs, managing issues, viewing CI status, and repository management. Provides ready-to-use gh commands for common workflows like creating feature branches, checking PR review status, listing open issues, viewing workflow run results, and managing labels. Use when doing git operations, creating or updating PRs, managing GitHub issues, checking CI/CD status, viewing PR comments, or working with GitHub releases. Triggers on 'git', 'github', 'PR', 'pull request', 'issue', 'branch', 'CI status', 'gh'. NOT for: Linear issue tracking (use linear), AI code reviews (use coderabbit), full PR lifecycle with review loops (use pr-loop).

tools3

convex

Manage Convex backend operations including dev server lifecycle, cloud deployments, function execution, schema management, and data import/export. Wraps the npx convex CLI with project-specific configuration. Includes workflows for starting local dev, deploying to production, running one-off functions, exporting/importing data snapshots, and managing environment variables. Use when starting a Convex dev server, deploying backend changes, running mutations or queries, managing Convex schema, or debugging Convex function errors. Triggers on 'convex', 'backend deploy', 'run function', 'schema change', 'convex dev'. NOT for: Supabase or Firebase operations (use respective tools), frontend-only React work, or general database queries.

tools3

interview-practice

Interactive mock interview simulator with 7 modes: leetcode, system-design, debugging, code-review, behavioral, optimization, and complexity drills. Conducts Socratic-style practice sessions calibrated by company and level. Use when: preparing for technical interviews, practicing coding questions, doing mock system design, or drilling Big O complexity. NOT for: actual job applications, resume writing, or outreach (use coach skill).

development3

maintenance

Two-phase agent for ecosystem maintenance (fact-gathering + verification) and publicity (content creation with collab partner). Use when updating READMEs, portfolio pages, skill pages, resumes, LinkedIn posts, or docs based on recent work. Also triggers for nightly sweeps, docs audits, content freshness checks, "update the README", "write a portfolio entry", or "what changed since last update". Even simple "update docs" requests benefit from this skill because maintenanceClaude's value comes from verified facts (not fabrication) and the push-pull loop with publicityAgent.

testing3

review-router

Dynamic code review routing with automatic fallback chain when primary reviewer is unavailable. Routes to CodeRabbit, Macroscope, requesting-code-review, or Cursor CLI. Triggers on: 'review-router', 'route review', 'reviewer unavailable'. NOT for: general code review workflow (use /code-review), receiving review feedback (use /superpowers:receiving-code-review).

tools3

test-plan

Generate structured manual testing checklists from git diffs for QA review before merging PRs. Analyzes changed files and produces step-by-step testing instructions covering happy paths, edge cases, and regression checks. Output is a markdown checklist suitable for QA handoff or self-review. Use when preparing a PR for manual QA, creating a testing checklist for a feature branch, or documenting what needs manual verification before merge. Triggers on 'test plan', 'QA checklist', 'testing checklist', 'manual testing', 'QA review prep'. NOT for: writing automated tests (write those in code), AI code reviews (use coderabbit), or CI pipeline configuration.

development3

video-showcase

Create product/project showcase videos using Remotion (React). Takes project description + screenshots → generates compositions → renders MP4. Use when asked to make demo videos, product showcases, or animated project walkthroughs.

development3

worktrees

Create and manage git worktrees for isolated feature development. Prevents branch cross-contamination by giving each feature its own working directory with shared git history. Includes workflows for creating worktrees from scratch or from Linear issues, listing active worktrees, switching between them, and cleaning up completed ones. Use when starting a new feature that needs isolation from current work, running parallel implementations, or preventing uncommitted changes from leaking between tasks. Triggers on 'worktree', 'isolated branch', 'parallel feature', 'branch isolation', 'new worktree', 'feature isolation'. NOT for: simple branch switching (use git checkout), Linear-only operations (use linear), or temporary experiments (just use a branch).

development3

cli-agents

Run external CLI agents (Gemini, Cursor, Codex, Kiro, Claude) as visible cmux workers through repoGolem launchers. Use for delegating implementation, spawning research/audit agents, or coordinating multi-agent builds. Workers split in the current workspace; audits/research use a separate named workspace. NOT for plain cmux pane management or Claude-only spawning.

tools3

1password

Manage secrets, credentials, API keys, vault items, and op:// references with the 1Password op CLI. Use for storing/rotating secrets, migrating plaintext .env files, wiring MCP configs to 1Password, and troubleshooting op auth. NOT for non-secret config or ordinary runtime shell exports.

tools3

agada-bench

Standing BrainLayer quality benchmark. Scores live brain_search against the frozen gold corpus for recall@K, MRR, precision@5, placebo rate, and regression vs baseline. Run after BrainLayer PRs, FM fixes, schema changes, or embedder swaps. Rare build mode adds a new user-domain corpus. NOT for non-BL retrieval systems or rubric edits.

development3
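
As a rough illustration of the metrics named in the agada-bench entry, the sketch below computes recall@K, precision@K, and MRR from ranked search results. The skill's actual scoring code is not shown in this listing, so these are the textbook definitions only. The function names and chunk IDs are hypothetical. Placebo rate and regression vs baseline are left out because the listing does not define them.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k slots occupied by relevant items."""
    if k <= 0:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / k


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant item, or 0.0 if none is returned."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average reciprocal rank over (ranked results, gold set) pairs."""
    scores = [reciprocal_rank(ranked, relevant) for ranked, relevant in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical chunk IDs standing in for one search result and its gold set.
ranked = ["c3", "c1", "c9", "c4", "c7"]
gold = {"c1", "c4", "c8"}
print(recall_at_k(ranked, gold, 5), precision_at_k(ranked, gold, 5))
```

Here two of the three gold chunks are retrieved, so recall@5 is 2/3 and precision@5 is 2/5. The first hit is at rank 2, which gives a reciprocal rank of 0.5.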

golem-install

Set up the golems ecosystem for the first time on a new machine. Checks CLI dependencies, wires MCP servers, creates skill symlinks. Use when: "set up golems", "install golems", "new machine setup", "wire skills". NOT for daily usage.

tools3

git-guardian

Safety gate for destructive git operations. Invoke before force-push, reset --hard, branch -D/delete, checkout ., restore ., clean -f, or any commit/push while on main/master. NOT for: normal git add, commit on feature branches, log, diff, status, stash.

testing3
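
The trigger list in the git-guardian entry can be read as a small classifier over git command lines. The sketch below is one possible reading, not the skill's implementation. The `needs_guard` function is hypothetical, the flag matching is deliberately simple, and treating `--force-with-lease` and `--delete` as guarded is an assumption.

```python
import shlex

PROTECTED_BRANCHES = {"main", "master"}


def needs_guard(command: str, current_branch: str) -> bool:
    """Return True if the git command matches one of the listed destructive cases."""
    args = shlex.split(command)
    if len(args) < 2 or args[0] != "git":
        return False
    sub, rest = args[1], args[2:]

    if sub == "push":
        # Force pushes are always guarded (assumption: --force-with-lease counts too).
        forced = any(a in ("-f", "--force") or a.startswith("--force-with-lease") for a in rest)
        return forced or current_branch in PROTECTED_BRANCHES
    if sub == "commit":
        return current_branch in PROTECTED_BRANCHES
    if sub == "reset":
        return "--hard" in rest
    if sub == "branch":
        # Assumption: "branch -D/delete" covers -D and --delete.
        return any(a in ("-D", "--delete") for a in rest)
    if sub in ("checkout", "restore"):
        # Only the whole-tree form ("checkout ." / "restore .") is listed.
        return "." in rest
    if sub == "clean":
        short_force = any(a.startswith("-") and not a.startswith("--") and "f" in a[1:] for a in rest)
        return short_force or "--force" in rest
    return False


print(needs_guard("git reset --hard HEAD~1", "feature/x"))
print(needs_guard("git commit -m 'wip'", "feature/x"))
```

A gate built this way errs toward flagging: any short flag cluster containing `f` on `git clean` is treated as forced. Refspec-based force pushes such as `+branch` are not detected.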

writing-skills

Create, edit, and verify golem-powers skills using the standard SKILL.md structure, workflow files, adapters, templates, and eval fixtures. Use for new skills, structural edits, workflows/adapters, and pre-deploy validation. NOT for invoking existing skills, superpowers skills, or skill-creator agent workflows.

development3

brain-store-fallback

Fallback for failed brain_store calls: when BrainLayer returns null chunk_id, queues, DB-busy errors, or transport closures, write the exact content to docs.local/decisions for replay.

testing3
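
A minimal sketch of the fallback pattern in the brain-store-fallback entry, assuming the store call is a plain function that returns a chunk ID or `None`. The `store_with_fallback` helper, the file-naming scheme, and the broad `except` (standing in for DB-busy errors and transport closures) are all hypothetical. The queued-response case from the entry is not modeled.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional, Tuple


def store_with_fallback(
    store: Callable[[str], Optional[str]],
    content: str,
    fallback_dir: Path,
) -> Tuple[Optional[str], Optional[Path]]:
    """Try the store call; on failure, write the exact content to disk for replay.

    Returns (chunk_id, None) on success or (None, path_to_fallback_file) on failure.
    """
    try:
        chunk_id = store(content)
    except Exception:
        # Stand-in for DB-busy errors and transport closures.
        chunk_id = None

    if chunk_id is not None:
        return chunk_id, None

    fallback_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming: timestamp plus a short content hash to avoid collisions.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
    path = fallback_dir / f"{time.strftime('%Y%m%dT%H%M%S')}-{digest}.md"
    path.write_text(content, encoding="utf-8")
    return None, path
```

A caller would pass something like `Path("docs.local/decisions")` as `fallback_dir` and replay the saved files once the store is healthy again.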

coach

Life admin assistant covering health/habits, recruiting/jobs, freelancing/contracts, Israeli law, and scheduling. Memory-first: always searches BrainLayer before responding. Use when: daily planning, schedule creation, WHOOP data review, habit tracking, job hunting, freelance contracts, Israeli business law, client management, outreach emails, or any request referencing past coaching sessions. NOT for: writing code, deployments, or infrastructure.

tools3

never-fabricate

MANDATORY before reporting on any file contents, test results, agent outputs, or audit findings. If you haven't Read() it, you don't know what's in it. Period. Use when summarizing results, reporting on agent work, or claiming anything is "green" or "complete."

testing3

cmux-agents

Spawn AI agents in cmux panes/workspaces through MCP tools and repoGolem launchers. Covers Claude, Cursor, Gemini, Codex, Kiro, external CLI agents, worker splits, audit/research workspaces, monitoring, prompt delivery, and collab patterns. Use when spawning visible AI workers, terminal agents, or multi-agent orchestration.

tools3

research-prompt-quality

Mandatory pre-flight gate before any deep-research prompt ships. Three gates: CHECK-FIRST (non-redundancy), GROUND (Drive refs + current-usage examples + prior-research stance), emit-only-if-pass. Use when writing deep research prompts or Claude Desktop research prompts, deciding whether to research, or proposing research. Triggers: 'deep research', 'research prompt', 'should we research', 'propose research'. NOT for executing research; use /research, /claude-desktop-research, or /gemini-research.

testing3

cmux

Use when running inside cmux terminal to control panes, splits, browser, sidebar, and send agent-to-agent messages. Covers split panes, notifications, browser automation, terminal reads, delivery verification. NOT for: regular terminal operations (use Bash), non-cmux sessions, agent lifecycle management (use cmux-agents).

tools3

coderabbit

Use when reviewing uncommitted changes, preparing PRs, requesting or receiving code review, handling CodeRabbit/Greptile/Bugbot/GitHub PR comments, checking security/secrets/a11y/code quality, or deciding whether to accept or reject reviewer feedback. Runs AI review via CLI and covers review triage, false-positive pushback, red/blue team profiles, PR-ready gates. NOT for: runtime debugging or test execution.

tools3

cron-payload-discipline

Use when writing or reviewing any cron, /loop, recurring monitor payload, merge/worker queue tick, or freeze-prone cmux monitor. Enforces loop discipline: live-query-first frames, no hardcoded state strings, drive-to-completion outcomes, verified counter resets, one full-read freeze rotation, and timestamped stale-tick detection.

development3

cursor-multitask

Route fan-out / parallel work to the RIGHT engine: Cursor /multitask (in-editor GUI), headless cursor-agent shell fan-out, the Claude Workflow tool, or the cmux fleet. Use when a task decomposes into independent parallel units (classify N files, audit M things, write tests+docs+examples, parallel verification passes), or when deciding 'should this be parallelized and by whom'. Triggers: multitask, /multitask, parallel agents, fan out, in parallel, all of these, batch classify, parallel audit. NOT for: a single coherent edit, sequentially dependent steps, or spawning visible multi-vendor workers (use cmux-agents).

tools3

claude-desktop-research

Write self-contained deep-research prompts for Claude Desktop or Claude Web. Use for Drive-grounded deep research, MCP-aware Desktop prompts, Claude research prompt batches, and Claude-vs-Gemini comparisons. NOT for quick local web lookups; use web/exa directly. For NotebookLM/Gemini research, use gemini-research.

tools3

drive-usage

Brain Drive filing discipline — where every artifact goes + how to name it. Use WHENEVER touching Google Drive / Brain Drive: uploading, creating folders, saving research prompts/results, audits, plans, transcripts, dashboards, or when about to leave a durable artifact in docs.local/. Teaches the numbered folder model (01_STANDARDS / 02_GROUNDING / 03_RESEARCH / 04_INGEST / 06_ARCHIVE), date-prefixed naming, and the rule: FILE durable artifacts in the right Drive folder — docs.local/ is cache-only. NOT for querying Drive via Gemini (use /braindrive) or web research (use /gemini-research); for >100KB heavy archival defer to /google-drive-archive.

development3

frustration-capture

Capture user corrections as high-importance BrainLayer entries. Use when user says no/wrong/stop, repeats instructions, expresses frustration, or during session mining. Triggers on: user correction, 'I told you', 'not that', frustration signal.

development3

research

Deep web research orchestrator. Routes research tasks to the best backend — internal subagents, CLI agents (Gemini/Cursor), or the researcher subagent. Use when asked to research, investigate, compare, find alternatives, or deep-dive into any topic. Covers web research, company research, code pattern research, and pre-implementation research.

tools3

gemini-research

NotebookLM/Gemini research workflow: create notebooks, add sources, run Gemini Deep Research, query results, and generate reports, audio, slides, quizzes, or study materials through the NotebookLM MCP. Use for Gemini/NotebookLM research, synthesis, source-grounded artifacts, and Claude-vs-Gemini comparisons.

tools3

fleet-wrap

Quiet-down protocol for sprint close: when the fleet wraps, delete ALL polling crons and monitors, send ONE final dashboard + ONE message, then go SILENT. Use when: fleet wraps, all workers done, overnight queue exhausted, sprint close, Etan asleep/away with nothing approved left. Triggers: fleet wrap, wrap the fleet, stand down, going quiet, sprint close. NOT for: mid-sprint monitoring (keep your loops), spawning a successor (use /session-handoff first).

development3

google-drive-archive

Archive heavy research outputs (transcripts, audio, video, audits, plans) to Brain Drive as forever-storage. Use when about to write >100KB / media to docs.local/, when user says 'archive this' / 'save to Drive' / 'this should be in Brain Drive', or after digesting a large artifact into BrainLayer. Enforces the source-of-truth hierarchy: Brain Drive (forever) → docs.local/ (latest cache) → BrainLayer (searchable index).

development3

html-dashboard

Use when Etan asks for a dashboard or says 'make me a dashboard', 'summarize work', 'compare these', 'status report', 'digest', 'what changed', or 'what moved'. Produces gen-12-quality static HTML by cloning the proven dashboard template/reference; never build from scratch.

development3

large-plan

Scaffold and execute folder-based multi-phase implementation plans with async agent collaboration. Creates docs.local/plans phases with acceptance criteria, owners, and PR boundaries. Use for large features, multi-PR refactors, parallel cmux agent work, or complex specs. NOT for single-file changes, simple bugs, short tasks, or brainstorming.

development3

judge-fleet

Bulk LLM-judging protocol for fleet-dispatched verdict runs (KG cluster, eval harness). Use when: dispatching or running judge workers (J1/J2/RT), planning bulk-apply from verdict JSONL, or triaging evidence_degraded outputs. Triggers: judge fleet, bulk judge, R3 verdicts, kg-judge, RT gate, evidence_degraded. NOT for: single-item code review, Phoenix view UX (use phoenix-human-view), or non-judge eval pipelines.

development3

mac-systems

macOS systems specialist — AppKit NSPanel architecture, launchd services, socket activation, MCP bridge resilience, syspolicyd, and high-frequency SwiftUI dashboards. Use when building menu-bar apps, LaunchAgents, debugging syspolicyd/Gatekeeper/TCC, resilient UDS/MCP bridges, or SwiftUI dashboards at 10Hz+.

tools3

phoenix-human-view

The human-eval UX contract for Phoenix views: turn-by-turn scrollable replay (not a scorecard), hide-but-copyable IDs, collapsed thinking, identity chips, tool filters, tiny frozen starter datasets, mark-wrong-in-thread, mobile-first. Use when: building or reviewing ANY Phoenix/eval view, annotation UI, session replay, or human-grading surface. Triggers: phoenix view, eval UI, annotation view, session replay, human eval UX, grading interface. NOT for: Phoenix data pipelines/ingest (capture scripts have their own specs).

tools3

session-handoff

Structured session handoff — write handoff file, spawn new agent, answer grill from outgoing context, verify orientation. Use when context is high, session ending, or spawning continuation. Triggers on: hand off, wrap up, context high, new session.

development3

skills

Discover and list all installed golem-powers skills with their descriptions, grades, and status. Shows which skills are available in the current environment, their trigger patterns, and whether they are active, experimental, or archived. Use when wanting to see what skills exist, searching for a skill by keyword, checking if a skill is installed, or getting an overview of ecosystem capabilities. Triggers on 'list skills', 'what skills', 'search skills', 'discover skills', 'available skills', 'show skills', 'skill inventory'. NOT for: invoking a specific skill (call it directly by name), creating new skills (use skill-creator), or finding external skills to install (use find-skills).

development3

pr-loop

The complete PR loop — branch, implement, test, commit, push, PR, WAIT FOR REVIEW, fix, merge, cleanup. Includes PR creation and review comment fetching. Use whenever creating a PR or finishing work. This is NOT optional. Every change goes through this loop. No exceptions.

development3

prd

Use when planning a feature, starting a new project, or asked to create a PRD. Generates JSON-based PRD for Ralph. Adding stories uses update.json pattern. Covers PRD, create PRD, plan feature, Ralph stories. NOT for: running Ralph (user runs externally).

tools3

repogolem

Launch agents in any repo via repoGolem launchers ({name}Claude, {name}Codex, {name}Cursor). Unified flags: -s skip, -c continue, -m model override (rare), -p scripted one-shot only (not agent sessions), -w worktree cwd. 40 projects registered. Triggers on: spawn agent, launcher, repoGolem, brainlayerClaude, flags.

development3

qa-video

Video QA and knowledge extraction pipeline for screen recordings, local video, and YouTube URLs. Use for narrated QA sessions, bugs/UX issues, QA checklists, handoffs, iterative QA rounds, qa-record/click capture, stalker pipeline, video gems, YouTube insights, extract from video, process this recording, or frame-based analysis. Routes QA recordings to record/process/handoff/iterate workflows; routes YouTube/gems requests through transcript → keyword hotspots → frames → BrainLayer/Drive archival.

tools3

research-lifecycle

Manage research context freshness — add sprint results, condense stale files, refresh descriptions, archive implemented findings. Prevents next research round from starting with wrong assumptions. Triggers on: stale context, research prep, R-number, refresh context, before research.

research3

shell-hardening

Security checklist for bash scripts — injection prevention, set -euo pipefail, printf safety, path quoting. Apply before committing any shell script. Triggers on: 'shell-hardening', 'bash script security', 'harden shell'. NOT for: general shell scripting help (just write code), non-bash languages, runtime debugging.

development3

orc

Orchestrate multi-agent sprints, coordinate cmux agents, and manage ecosystem-wide workflows. Use when the user mentions sprints, agent spawning, status checks ('where were we', 'catch me up'), collab kickoffs, cross-repo coordination, or any task requiring delegation to other Claudes. Also triggers on 'what happened', 'what's the status', incident response (daemon down, agent frozen), or research dispatch. This is the orchestrator — if work spans multiple repos or needs multiple agents, this skill applies.

testing3

skill-creator

Create, edit, audit, and evaluate golem-powers skills. Use for new skills, structural skill edits, workflow/adapter changes, pre-deploy validation, skill evals, benchmarks, live A/B tests, session JSONL mining, batch miners, and handoff digests. Triggers: create skill, edit skill, audit skill, validate skill, skill eval, live eval, mine session. NOT for invoking existing skills or convergence weaving.

development3

voice-sessions

Structured voice-powered sessions using VoiceLayer MCP with six workflows: debrief conversations, practice presentations, QA test sites with voice, quick text capture, review past sessions, and KG flag-batch review by voice. Flow is Context then Walk-through then Drill then Capture then Output to Obsidian. Use when: debriefing meetings, voice-drilling content, coaching with spoken feedback, capturing insights hands-free, or reviewing entity clusters by voice. NOT for: simple TTS announcements (use voice_speak directly).

tools3

yash

Push through dependency bugs — PR upstream instead of working around them. When you hit a bug in a dependency, don't patch around it. Fork it, fix it, PR upstream, use patch-package as a bridge. Triggers on: dependency bug, workaround, monkey-patch, 'works but hacky', upstream fix, fork and fix, patch-package, 'known issue in library'. NOT for: application-level bugs (fix them directly), configuration issues (just configure correctly), feature requests to libraries (use the issue tracker).

development3

wizard

Fresh machine setup wizard for the golems ecosystem. Checks prerequisites (brew, node, bun, claude, gh, git), reads or creates ~/.golems/config.yaml, clones required repos, runs sync-config.sh to wire MCP servers, creates .claude.local.md in each repo with machine-specific paths, verifies BrainLayer MCP connection, and reports setup status. Triggers on 'setup', 'wizard', 'fresh machine', 'new machine setup', 'configure workspace', 'onboard', 'install golems'. NOT for: daily skill usage (invoke skills directly), individual MCP server debugging (check MCP docs), updating existing skills (use skill-creator).

tools3

whats-new

Cross-reference Claude Code, Codex CLI, Cursor CLI, and Wispr Flow changelogs against your setup (hooks, skills, MCP, repoGolem launchers, model configs). Reports what affects you, new capabilities, VoiceBar competitive intel. Triggers: 'whats new', 'changelog', 'release notes', 'any updates'.

tools3

weave

Orchestrator-only convergence workflow: mine recent Claude/Codex JSONLs, weave cited findings into an action ledger, route every finding to a disposition, then red-team facts against raw logs. Triggers: weave, weave now, run weave, session weave, convergence weave. Use only when fleet is quiet; NOT for single-session mining or web research.

development3

critique-waves

Use when needing multi-agent verification of complex work. Runs parallel critique agents until consensus. Covers verification, consensus, multi-agent review, validate work. NOT for: simple code reviews (use coderabbit), single-reviewer tasks.

development3

code-review

Full code review lifecycle: requesting reviews (CodeRabbit, Greptile, Bugbot, GitHub PR comments) and receiving feedback (classify issues, implement fixes, push back on wrong suggestions). Use when: creating a PR review, reading review comments, handling reviewer feedback, fixing review items, or deciding whether to accept or reject a suggestion. NOT for: running tests directly or CI/CD pipeline issues (use relevant repo tools).

tools3

freeze-detect

Use when supervising cmux or similar agent surfaces that look unchanged, quiet, or token-frozen. Distinguishes stale parsed telemetry from genuinely idle workers by rotating one full read onto the worst offender, requiring prompt proof before calling a surface idle, and parking monitor loops around known long-running operations. Triggers on: parsed_only, frozen screen, idle codex, no token movement, stuck worker, long-running build, long-running test.

development3

mehayom

MeHayom freelance client management — daily updates, decision tracking, time logging. Use when drafting Yuval updates, logging scope changes, tracking hours, or any MeHayom client communication. Triggers: 'draft Yuval update', 'client update', 'daily update', 'log decision', 'track time', 'mehayom'.

tools3

monitor-loop

Use when running or reviewing any recurring monitor loop for merge queues, worker queues, collab tails, or agent completion. Enforces drive-to-completion ticks: every tick must query live state with `!`, classify whether real progress happened, and then dispatch, verify-and-decrement, or escalate-park. Triggers on: monitor loop, /loop, recurring tick, keep monitoring, silent autonomous, merge gate, blocked review, no-progress loop.

testing3

video-extract

Extract structured knowledge from any video source — YouTube URLs or local screen recordings. YouTube → gems workflow (yt-dlp transcript → keyword hotspots → frame extract → brain_digest → structured gems). Screen recordings → QA workflow (reuses /qa-video stalker pipeline). Use when user shares a YouTube link wanting deep extraction with frames, shares a .mov/.mp4 for QA processing, says "extract from video", "video gems", "process this recording", or mentions gem extraction from video content.

testing3

agent-routing

Enforce Cursor=gather, Codex=implement, Claude=orchestrate routing. Use when assigning tasks, spawning agents, collab kickoffs, or checking worker utilization. Triggers on: delegate, cursor worker, codex worker, assign task, routing violation.

development3

context7

Use when needing current API references, function signatures, or library usage patterns. Looks up documentation via Context7 API. Covers docs lookup, library documentation, API reference, how to use library. NOT for: web search (use WebSearch), project-specific code (read the codebase).

development3

presentation-builder

Step-by-step presentation builder using Michal's Speakers Workshop method, Oren Efraim's rules, and Uri Alon's techniques. Three sessions: find the premise, build structure and slides, then practice delivery. Use when: preparing a talk, building slides, refining a pitch, practicing a presentation, or reviewing presentation structure. Supports VoiceLayer for spoken practice runs. NOT for: writing blog posts, LinkedIn content, or general writing tasks.

development3

content-demo-creation

Create polished product demo videos by recording a real running app with Computer Use or recreating a product UX as deterministic Remotion/Three output. Use for demo videos, walkthroughs, feature showcases, and product UX mimics. NOT for static screenshots, slide decks, or QA bug-hunting.

testing3

architectural-conformance-audit

Pre-R0 sprint gate that diffs implementation vs SOTA research output verbatim. Surfaces cited counter-examples and architectural mismatches before sprint hooks fire. Triggers: 'before R0', 'architectural audit', 'verify against research'. NOT for per-PR review or post-merge.

testing3

deploy-verify

Use after any infra PR merges or any claim that a daemon, app, CLI, or MCP server is now deployed. Enforces a 4-step post-merge verification sequence: process replacement, `launchctl print`, build-stamp match, and an end-to-end live probe before declaring the deploy done. Triggers on: PR merged, deployed, shipped, launch agent, daemon restart, MCP server deploy, post-merge check.

tools3

github-research

Use when auditing an UNFAMILIAR codebase for the first time — architecture mapping, undocumented features, configuration gaps. NOT for catching up on your own branch (use catchup). Always searches BrainLayer first before touching files.

development2

linkedin-post

LinkedIn writing coach based on Aviv Levi's 2026 algorithm guidelines. Finds post topics from git history, drafts posts optimized for dwell time and saves rate, and reviews drafts against 11 data-backed rules. Use when: writing LinkedIn posts, finding content ideas, reviewing a draft for algorithm fit, or planning a weekly posting schedule. NOT for: auto-posting, other social platforms (use content skill), or resume writing.

development2

match

Run the job matching algorithm on recent scrapes — score new listings against your profile.

development2

mehayom

MeHayom freelance client management — daily updates, decision tracking, time logging. Use when drafting Yuval updates, logging scope changes, tracking hours, or any MeHayom client communication. Triggers: 'draft Yuval update', 'client update', 'daily update', 'log decision', 'track time', 'mehayom'.

tools2

nightly-docs-update

Automated sync between golems repo stats and etanheyman.com portfolio site. Collects package count, test count, skill count, BrainLayer chunk count, and PR count, then updates hardcoded numbers across portfolio files and detects dead references to removed packages. Use when: stats are stale, after merging significant PRs, nightly scheduled runs, or when someone says docs numbers look wrong. NOT for: editorial content rewrites or adding new portfolio sections.

development2

nightly-journal

Use when ending the day, checking what happened professionally, or capturing client and job pipeline activity. Triggers on: 'nightly journal', 'daily sweep', 'check my comms', 'what happened today', 'update my diary', end-of-day routines, or proactively during evening coach tasks.

tools2

notify

Send Telegram notifications to a topic-routed group chat. Supports multiple sources (alerts, nightshift, email, jobs) each routing to a dedicated Telegram topic. Use when: a task completes, hitting a blocker, waiting for user input, reporting errors, or sending urgent alerts. Available via shell function and HTTP API. NOT for: asking the user questions (use AskUserQuestion), routine progress updates, or sending messages to external contacts.

development2

orchestrator-status

Ecosystem-wide status collection and orientation. Use when returning to work, starting a new session, when the user says "where were we", "what's the status", "catch me up", "what happened", or any time you need to understand the current state across projects. This is NOT the repo-scoped /catchup (git diffs) — this is the orchestrator-level "what's happening across the whole ecosystem." Also use when you need to prepare a research prompt — this skill gathers all the context needed to write a detailed, self-contained prompt for a research agent.

documentation2

nightshift

Check Night Shift status, view recent PRs, or manually trigger a Night Shift run.

testing2

outreach

Manage the job outreach pipeline — list, draft, track, and review outreach messages.

devops2

practice

Start an interview practice session with Elo-rated skill tracking. Supports 7 interview types.

tools2

restart

Restart the Telegram bot and notification server. Use when bot is unresponsive or after code changes.

development2

status

Show ClaudeGolem status — bot process, notification server, event log, active sessions, Night Shift target.

development2

stitch-design

Design-to-code using Google Stitch MCP. Read admin designs, craft prompts for Stitch to generate new screens within the existing design system, grill user about behavior, implement in React Native. Claude = prompt engineer for Stitch. Stitch = design generator. Use when: implementing from Stitch, generating screen prototypes, extracting design tokens.

tools2

subscriptions

View and manage tracked subscriptions — active services, costs, and payment history.

tools2

video-extract

Extract structured knowledge from any video source — YouTube URLs or local screen recordings. YouTube → gems workflow (yt-dlp transcript → keyword hotspots → frame extract → brain_digest → structured gems). Screen recordings → QA workflow (reuses /qa-video stalker pipeline). Use when user shares a YouTube link wanting deep extraction with frames, shares a .mov/.mp4 for QA processing, says "extract from video", "video gems", "process this recording", or mentions gem extraction from video content.

testing2

video-showcase

Create product/project showcase videos using Remotion (React). Takes project description + screenshots → generates compositions → renders MP4. Use when asked to make demo videos, product showcases, or animated project walkthroughs.

development2

wispr-mining

Mine Wispr Flow SQLite database for ASR vocabulary gaps and correction patterns. Generates clean, importable CSV files (vocabulary + replacements). Use when: updating Wispr dictionary, finding ASR misrecognitions, auditing voice transcription quality, 'wispr mining', 'update wispr dictionary', 'voice vocabulary gaps'. NOT for: general voice processing (use voicelayer), speech-to-text implementation.

testing2

figma-loop

Iterative Figma-to-implementation pixel-perfect verification loop. Use when implementing or refining UI from Figma designs. Drills on screenshots, comparing Figma vs implementation, fixing one thing at a time until 3 consecutive checks pass. Covers figma iteration, pixel perfect, design verification, ui drilling, figma comparison. NOT for: fetching Figma specs only (use figma-workflow docs), creating new components from scratch without a reference design.

testing2
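The stopping rule in the figma-loop description (fix one thing at a time until 3 consecutive checks pass) can be sketched as a streak counter that resets on any failure. This is a minimal illustration, not the skill's actual code: `run_until_stable`, `check`, `fix`, and the `max_rounds` cap are hypothetical names and parameters.

```python
from typing import Callable


def run_until_stable(
    check: Callable[[], bool],
    fix: Callable[[], None],
    required_passes: int = 3,   # "3 consecutive checks" from the description
    max_rounds: int = 50,       # assumed safety cap, not from the source
) -> int:
    """Return how many checks ran before `required_passes` passed in a row."""
    streak = 0
    for round_no in range(1, max_rounds + 1):
        if check():
            streak += 1
            if streak == required_passes:
                return round_no
        else:
            # Any failure resets the streak; apply exactly one fix, then re-check.
            streak = 0
            fix()
    raise RuntimeError("no stable streak within max_rounds")
```

Resetting the streak on every failure is what makes the passes consecutive rather than cumulative.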

youtube-pipeline

Extract knowledge from YouTube videos into BrainLayer. Use when user shares a YouTube link or asks to process/watch/extract from a video. Chains exa (transcript) -> brain_digest (entities/relations) -> brain_store (conclusions). Works with any YouTube URL.

devops2

orchestrator-status

Ecosystem-wide status collection and orientation. Use when returning to work, starting a new session, when the user says 'where were we', 'what's the status', 'catch me up', 'what happened', or any time you need to understand the current state across projects. This is NOT the repo-scoped /catchup (git diffs) — this is the orchestrator-level 'what's happening across the whole ecosystem.' Also use when you need to prepare a research prompt — this skill gathers all the context needed to write a detailed, self-contained prompt for a research agent.

documentation2

obsidian

Direct filesystem access to the iCloud-synced Obsidian vault for reading, searching, listing, and organizing notes. Use when: accessing diary entries, searching vault content, listing recent notes, reading Golems ideas, checking memos, or writing new notes. Covers obsidian, notes, vault, ideas, diary, memos, and Hebrew-language content. NOT for: general file operations outside the vault, or BrainLayer queries (use brain_search).

development2

railway

Deploy and manage the golems cloud-worker on Railway. Use when deploying backend changes, checking logs, managing env vars, or restarting services. Wraps `railway` CLI. Covers railway, deploy, cloud-worker, redeploy, logs, variables. NOT for: Vercel deployments, frontend, Supabase.

tools2

context7

Use when needing current API references, function signatures, or library usage patterns. Looks up documentation via Context7 API. Covers docs lookup, library documentation, API reference, how to use library. NOT for: web search (use WebSearch), project-specific code (read the codebase).

development2

convex

Manage Convex backend operations including dev server lifecycle, cloud deployments, function execution, schema management, and data import/export. Wraps the npx convex CLI with project-specific configuration. Includes workflows for starting local dev, deploying to production, running one-off functions, exporting/importing data snapshots, and managing environment variables. Use when starting a Convex dev server, deploying backend changes, running mutations or queries, managing Convex schema, or debugging Convex function errors. Triggers on 'convex', 'backend deploy', 'run function', 'schema change', 'convex dev'. NOT for: Supabase or Firebase operations (use respective tools), frontend-only React work, or general database queries.

tools2

brave

Use as fallback browser automation when Claude-in-Chrome MCP is unavailable. Covers browser control, navigation, screenshots, clicking, typing. NOT for: headless testing (use Playwright). Claude Code users should prefer MCP first.

tools2

worktrees

Create and manage git worktrees for isolated feature development. Prevents branch cross-contamination by giving each feature its own working directory with shared git history. Includes workflows for creating worktrees from scratch or from Linear issues, listing active worktrees, switching between them, and cleaning up completed ones. Use when starting a new feature that needs isolation from current work, running parallel implementations, or preventing uncommitted changes from leaking between tasks. Triggers on 'worktree', 'isolated branch', 'parallel feature', 'branch isolation', 'new worktree', 'feature isolation'. NOT for: simple branch switching (use git checkout), Linear-only operations (use linear), or temporary experiments (just use a branch).

development2

github

Git and GitHub operations via gh CLI — branching, committing, creating PRs, managing issues, viewing CI status, and repository management. Provides ready-to-use gh commands for common workflows like creating feature branches, checking PR review status, listing open issues, viewing workflow run results, and managing labels. Use when doing git operations, creating or updating PRs, managing GitHub issues, checking CI/CD status, viewing PR comments, or working with GitHub releases. Triggers on 'git', 'github', 'PR', 'pull request', 'issue', 'branch', 'CI status', 'gh'. NOT for: Linear issue tracking (use linear), AI code reviews (use coderabbit), full PR lifecycle with review loops (use pr-loop).

tools2

test-plan

Generate structured manual testing checklists from git diffs for QA review before merging PRs. Analyzes changed files and produces step-by-step testing instructions covering happy paths, edge cases, and regression checks. Output is a markdown checklist suitable for QA handoff or self-review. Use when preparing a PR for manual QA, creating a testing checklist for a feature branch, or documenting what needs manual verification before merge. Triggers on 'test plan', 'QA checklist', 'testing checklist', 'manual testing', 'QA review prep'. NOT for: writing automated tests (write those in code), AI code reviews (use coderabbit), or CI pipeline configuration.

development2

review-router

Dynamic code review routing with automatic fallback chain when primary reviewer is unavailable. Routes to CodeRabbit, Macroscope, requesting-code-review, or Cursor CLI. Triggers on: 'review-router', 'route review', 'reviewer unavailable'. NOT for: general code review workflow (use /code-review), receiving review feedback (use /superpowers:receiving-code-review).

tools2
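A fallback chain like the one review-router describes can be sketched as a walk over an ordered list that returns the first available reviewer. The sketch below is an assumption-laden illustration: the ordering simply follows the order in the description, and `route_review` and `is_available` are hypothetical names, not the skill's real interface.

```python
from typing import Callable, Optional, Sequence

# Reviewer names come from the description; the order is an assumption.
FALLBACK_CHAIN = ("coderabbit", "macroscope", "requesting-code-review", "cursor-cli")


def route_review(
    is_available: Callable[[str], bool],
    chain: Sequence[str] = FALLBACK_CHAIN,
) -> Optional[str]:
    """Return the first reviewer in `chain` that is available, or None."""
    for reviewer in chain:
        if is_available(reviewer):
            return reviewer
    return None
```

Returning `None` rather than raising leaves it to the caller to decide what happens when every reviewer is down.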

search

Search job matches by keyword, company, or score threshold from the scraped job database.

data-ai2

critique-waves

Use when needing multi-agent verification of complex work. Runs parallel critique agents until consensus. Covers verification, consensus, multi-agent review, validate work. NOT for: simple code reviews (use coderabbit), single-reviewer tasks.

development2

interview-practice

Interactive mock interview simulator with 7 modes: leetcode, system-design, debugging, code-review, behavioral, optimization, and complexity drills. Conducts Socratic-style practice sessions calibrated by company and level. Use when: preparing for technical interviews, practicing coding questions, doing mock system design, or drilling Big O complexity. NOT for: actual job applications, resume writing, or outreach (use coach skill).

development2

catchup

Use when returning to work after any break — auto-detects depth. Short break (hours): reads only uncommitted changes. Long break (48h+) or context overflow: reads all branch changes vs main. Covers catchup, context recovery, refresh, rebuild understanding. NOT for: mid-task exploration (use Read/Grep directly).

development2
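The depth rule in the catchup description could look roughly like the sketch below. Only the 48-hour threshold and the two scopes (uncommitted changes versus all branch changes against main) come from the description. The function name `catchup_command` and the specific `git diff` invocations are assumptions about how those scopes might be read.

```python
from typing import List

LONG_BREAK_HOURS = 48  # threshold stated in the description


def catchup_command(hours_away: float, context_overflow: bool = False) -> List[str]:
    """Pick a git diff scope based on how long the break was."""
    if context_overflow or hours_away >= LONG_BREAK_HOURS:
        # Long break or lost context: everything on the branch since it left main.
        return ["git", "diff", "main...HEAD"]
    # Short break: only staged and unstaged changes not yet committed.
    return ["git", "diff", "HEAD"]
```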

logs

View recent Railway deployment logs for the cloud worker.

devops2

followup

Check overdue follow-ups in the outreach pipeline and suggest next actions.

testing2

publish

Publish an approved draft to Soltome or prepare it for LinkedIn posting.

tools2

nudge

Send a gentle Telegram reminder about pending tasks or upcoming events. (Phase 6 — not yet implemented)

tools2

claude-web-research

DEPRECATED ALIAS — renamed to /claude-desktop-research on 2026-04-30. Use /claude-desktop-research instead. This alias remains active until 2026-05-30, then retires. Triggers: 'research prompt', 'Claude Web research', 'Claude Desktop research', 'deep research'.

development2

ralph-commit

Use when reaching a "Commit:" criterion in Ralph stories. Atomically commits and marks criterion checked. Covers ralph commit, atomic commit, commit criterion. NOT for: regular git commits (use git directly), commits outside Ralph workflow.

testing2

lsp

Use when needing semantic code navigation - find definitions, references, or callers. Covers LSP, go to definition, find references, hover, code intelligence. NOT for: text pattern searches (use grep), file discovery (use glob).

development2

schedule

Use when planning the day, checking schedule, updating calendar events, or any scheduling task. Enforces time-awareness — always checks current time and existing calendar before changes.

testing2

plan

Generate a daily or weekly plan by reading all golem statuses and Google Calendar. (Phase 6 — not yet implemented)

development2

report

Generate financial spending reports — monthly, yearly, or tax-focused with category breakdowns.

development2

context-check

Audit and fix per-project AI context hygiene. Compares loaded skills, MCPs, hooks, and agents against the project's context profile in ~/.golems/config.yaml. Reports wasted tokens and generates .claude/settings.local.json + CLAUDE.md containerization. Use when setting up a new project, debugging context bloat, or running maintenance sweeps.

tools2

cyber

Security auditor for MCP servers, TypeScript services, Swift apps, and shell scripts. Detects silent error swallowing, unsanitized exec/spawn, path traversal, SSML injection, missing ToolAnnotations, prompt injection vectors, and data exfiltration patterns. Use when: reviewing PRs for security, auditing MCP servers, running repo-wide security scans, checking ToolAnnotations compliance, or any task mentioning 'security', 'vulnerability', 'audit', 'hardening'. NOT for: functional code review (use coderabbit), shell-only scripts (use shell-hardening), runtime debugging.

tools2

plan-validate

Extract and validate assumptions from multi-agent sprint plans. Generates research prompts, flags unverified claims, rewrites plan. Triggers on: 'validate plan', 'check assumptions', 'plan-validate'. NOT for: single-task plans (overkill), runtime debugging, or code review.

development2

branded-doc

Generate professional branded HTML documents from structured content. Use when the user asks to create a proposal, contract feedback, client response, pricing sheet, or any professional document that needs branding. Triggers on: "draft a response", "create proposal", "branded doc", "feedback document", "client document", "/branded-doc".

tools2

code-review

Full code review lifecycle: requesting reviews (CodeRabbit, Greptile, Bugbot, GitHub PR comments) and receiving feedback (classify issues, implement fixes, push back on wrong suggestions). Use when: creating a PR review, reading review comments, handling reviewer feedback, fixing review items, or deciding whether to accept or reject a suggestion. NOT for: running tests directly or CI/CD pipeline issues (use relevant repo tools).

tools2

commit

Use when ready to commit changes. Runs CodeRabbit review first, then commits if review passes. Supports Ralph mode for atomic commit + criterion marking. Covers commit, ralph commit, atomic commit. NOT for: pushing or creating PRs (use pr-loop).

development2

content

Content creation and publishing pipeline for ClaudeGolem across platforms (Soltome, blog, social). Handles drafting teasers, reveals, author posts, and quick updates in two voices (ClaudeGolem bot voice and Etan author voice). Use when: writing social posts, planning content calendar, publishing to Soltome, managing draft approval flow. NOT for: LinkedIn posts (use linkedin-post skill) or presentation slides.

development2

deploy

Deploy the cloud worker to Railway. Builds and pushes the latest code.

development2

draft

Draft content for Soltome, LinkedIn, or blog using the critique-waves pattern (generate → critique → refine → polish).

content-media2

eas-prebuild-check

Validate Expo iOS/Android sync BEFORE the first EAS build. Catches missing .easignore (2GB archive), wrong bundle ID, outdated eas-cli, missing device registration, unsynced Apple credentials, missing P8 push keys, versionCode drift, concurrency queue surprises. Run this before `eas build` on any new project or after a major branch change. Triggers on: 'eas build', 'first eas build', 'prebuild check', 'expo build validation', 'before eas', 'eas credentials', 'preview build'. NOT for: runtime debugging of a build that already started (use eas build:inspect).

tools2

email-golem

Email triage system using Gmail + MLX local inference for scoring and prioritization. Use when: checking emails, running email triage, viewing urgency scores, checking if email scheduler is running, debugging missing email notifications, or managing the launchd email-golem daemon. Scores emails 1-10 and sends Telegram alerts for urgent ones. NOT for: general Gmail search without triage context.

development2

explanatory-doc

Generate visual, branded HTML documents that explain concepts to non-technical audiences. Use when creating explanatory content for friends, partners, investors, or anyone who needs to understand a technical topic in simple language. NOT for contracts or proposals (use branded-doc). Triggers on: "explain to", "make a document for", "explanatory doc", "visual explanation", "send him/her an explanation", "/explanatory-doc".

development2

figma-swarm

Multi-agent Figma screen decomposition and component build pipeline. Use when implementing multiple screens from Figma designs, breaking down screens into components, mapping to existing component library, and coordinating parallel builds. Triggers on "decompose screens", "build from Figma", "screen components", "figma swarm", "component mapping", "design to components", or when the user shares multiple Figma screen node IDs for implementation. Also use when re-checking for design drift against Figma.

development2