
Comprehensively manually test the Circuit plugin's user-facing surface in either Claude Code or Codex. Use this skill whenever the user asks to "manually test Circuit", "QA the Circuit plugin", "exercise the Circuit surface", "run the Circuit checklist", "smoke test Circuit", "find regressions in Circuit", "test the Claude Circuit plugin", "test the Codex Circuit plugin", or when preparing a Circuit release for marketplace publication. Argument is the host package to test — `claude` or `codex`. Produces a Markdown report with per-command pass/fail, exploratory findings ranked by severity, run-folder evidence links, and a concise terminal summary. Use even if the user does not say the word "test" — phrases like "go through every Circuit command" or "make sure Circuit still works end-to-end" should also trigger.
Turn the prompt supplied with this skill into a concise, auditable Codex Goal or explain why a Goal is not the right fit. Use when the user asks to draft, formulate, rewrite, tighten, or create a `/goal` from a plain-language task, especially for multi-step work that needs a durable objective, evidence-based completion, constraints, iteration policy, and a default adversarial review loop.
Give the human a fast, plain-English catch-up on what changed in the project: what the agents did, why, and what decisions need their input. Use this whenever the user asks to "catch me up", "what changed", "where are we", "recap", "brief me", "give me the rundown", "what did you do", "summarize the session", "fill me in", or otherwise signals they have been away and want to get back up to speed quickly. Built for someone steering several agent-driven projects at once who does not read the code closely but needs to grasp the core ideas, the choices made, and the open decisions well enough to steer. Trigger even if they do not use these exact words: any request to get oriented on recent progress should use this skill.
Install session-level guardrails when the user hands off to autonomous overnight execution. Fires when the first user prompt contains "going to bed", "headed to sleep", "full autonomy", "overnight", "continue as you were", or similar handoff-to-autonomy language. Enforces commit-per-slice, halt-on-3-consecutive-errors, max-wall-time cap, and a wake-time summary block. Exists because audited overnight sessions showed 348 Bash calls and 9 tool errors with no structural brake — the user explicitly acknowledged errors were happening and told Claude to keep going, which is exactly the moment guardrails must exist in config instead of in the user's head.
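The two structural brakes named above (halt on 3 consecutive errors, and a maximum wall-time cap) could be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the class name `OvernightGuardrail`, its fields, and the 8-hour default cap are all assumptions, since the description gives no concrete wall-time value.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OvernightGuardrail:
    """Hypothetical brake for an autonomous session (all names are illustrative)."""

    max_consecutive_errors: int = 3
    max_wall_seconds: float = 8 * 60 * 60  # assumed cap; the description gives no number
    started_at: float = field(default_factory=time.monotonic)
    consecutive_errors: int = 0

    def record(self, ok: bool) -> None:
        """Record one tool call; a success resets the error streak."""
        self.consecutive_errors = 0 if ok else self.consecutive_errors + 1

    def should_halt(self, now: Optional[float] = None) -> bool:
        """Halt after too many consecutive errors or once the wall-time cap is reached."""
        now = time.monotonic() if now is None else now
        too_many_errors = self.consecutive_errors >= self.max_consecutive_errors
        out_of_time = (now - self.started_at) >= self.max_wall_seconds
        return too_many_errors or out_of_time
```

Counting only consecutive errors means a single success clears the streak, so isolated failures do not stop the run; only an unbroken run of failures does.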
Clean Architecture principles and best practices from Robert C. Martin's book. This skill should be used when designing software systems, reviewing code structure, or refactoring applications to achieve better separation of concerns. Triggers on tasks involving layers, boundaries, dependency direction, entities, use cases, or system architecture.
Write clear, plain-spoken code comments and documentation that lives alongside the code. Use when writing or reviewing code that needs inline documentation—file headers, function docs, architectural decisions, or explanatory comments. Optimized for both human readers and AI coding assistants who benefit from co-located context.
Conduct exhaustive, citation-rich research on any topic using all available tools: web search, browser automation, documentation APIs, and codebase exploration. Use when asked to "research X", "find out about Y", "investigate Z", "deep dive into...", "what's the current state of...", "compare options for...", "fact-check this...", or any request requiring comprehensive, accurate information from multiple sources. Prioritizes accuracy over speed, cross-references claims across sources, identifies conflicts, and provides full citations. Outputs structured findings with confidence levels and source quality assessments.
Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.
Perform evidence-driven, multi-subsystem audits of real codebases to find correctness bugs, race conditions, security gaps, stale documentation, dead code, and production-readiness risks. Use when asked to audit a system end-to-end, verify agent-written code before shipping, analyze a subsystem for correctness across multiple modules, or produce a structured risk report for a real implementation. Prefer other skills for a single isolated bug, a proposal or document review, or a dedicated dead-code cleanup.
Route `/gemini ...` requests to the Cursor headless CLI for one-shot autonomous execution. Use when the user explicitly invokes `/gemini` or asks to hand a task off to Cursor agent. Preserves the prompt verbatim, runs in headless print mode, and returns Cursor's output.
Interview the user relentlessly about a plan or design until reaching shared understanding, resolving each branch of the decision tree. Use when the user wants to stress-test a plan, get grilled on their design, or mentions "grill me".
Create a narrative guide to a codebase or feature in the style of Knuth's Literate Programming — code and prose interwoven as a single essay, ordered for human understanding rather than compiler needs. Use when the user asks to 'explain this codebase as a story', 'write a literate guide', 'create a narrative walkthrough', 'tell the story of this code', 'Knuth-style documentation', 'weave a guide for this feature', or when they want deep, readable documentation that treats the program as literature. Also trigger when someone wants a document that a thoughtful reader could follow from start to finish and come away understanding both WHAT the code does and WHY every design choice was made.
Guide users through targeted manual verification after code changes. Use when asked to "test this", "verify it works", "QA this", "walk me through testing", "smoke test", "sanity check", "regression test", "acceptance test", or after implementing a feature or bug fix that still needs human validation. Favor this skill for focused verification of the current change; use a broader exploratory-testing skill for open-ended bug hunting across an entire app.
--- name: overnight-autonomy description: Codify the user's standard overnight-autonomy contract so it doesn't have to be retyped every session. Triggers on phrases that signal "I'm leaving Claude running while I sleep": "i'm going to sleep", "i'm headed to bed", "i have to go back to sleep", "drive this forward overnight", "go full autonomy", "keep going until morning", "autonomously follow through", or any combination including "claude" + "sleep/bed/morning". Establishes the overnight contract
Review recent React, Next.js, or TypeScript UI code changes for hardening before merge or commit. Use when asked to review recent React code changes, audit a React diff, harden a feature, check a PR or branch for React issues, or produce a stack-ranked list of nonredundant findings and a recommended fix plan using react-doctor, Vercel React best practices, Vercel composition patterns, and React useEffect guidance.
Produce a 5-line situational summary on demand or whenever the user is resuming a session, catching up, or lost. Triggers on "resume", "where are we", "what were we doing", "status", "remind me", "catch me up", or immediately on SessionStart:resume. Replaces Claude's default behavior of dumping the entire saved continuity record back at the user — instead, always outputs exactly 5 lines in a fixed shape so the user can reorient in under a second. Pairs with circuit:handoff continuity records when present but does not require them.
Render a 4-line human orientation card (Goal / Last commit / Next / Blocker) as the FIRST visible assistant output on any resumed session. Fires when the first user prompt is "resume", "resume handoff", "continue from handoff", or any /circuit:handoff resume invocation. Exists because the continuity system stores state perfectly for machines but leaves the human disoriented after breaks — at least two 2026-04 sessions ended with the user explicitly saying "I've forgotten what we're doing". The card is mandatory, precedes any tool use, and never exceeds 4 lines.
Ruthlessly analyze architectural seams—the interfaces, boundaries, and contracts between system components—to expose coupling problems, abstraction leaks, and design failures. Use when asked to review architecture, analyze coupling, find interface problems, improve module boundaries, audit dependencies, or redesign system structure. Produces uncompromising redesign proposals that prioritize correctness over backwards compatibility.
Enforces one-subtask-per-session discipline for long Claude Code workflows. Use at the start of any prompt that begins a new subtask, or when the current session has already exceeded 60 tool uses or 45 minutes. Proposes a handoff + new session instead of continuing to pile work into the current thread. Especially valuable for users running Circuit or Codex workflows who tend to accumulate 100+ tool uses per session and then pay heavy re-entry cost on the next resume.
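The session-size trigger described above (more than 60 tool uses or more than 45 minutes) reduces to a simple threshold check. The sketch below assumes the caller already tracks both counters; the function name `should_propose_handoff` and its parameters are hypothetical.

```python
def should_propose_handoff(
    tool_uses: int,
    elapsed_minutes: float,
    max_tool_uses: int = 60,
    max_minutes: float = 45.0,
) -> bool:
    """Return True once the session has exceeded either stated limit."""
    # "Exceeded" is read as strictly greater than the limit.
    return tool_uses > max_tool_uses or elapsed_minutes > max_minutes
```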
Reviews ~/.claude/skills/ against the last N days of actual session transcripts to identify skills that never triggered and are paying context-noise cost for no value. Produces a disable/archive recommendation per skill. Use when the user has 30+ skills installed and wants to reduce trigger-time noise, or when the user asks to "clean up my skills", "find unused skills", or "which skills never fire".
Manage, audit, and maintain your skill ecosystem. Use when the user asks to "check my skills", "audit skills", "find duplicate skills", "which skills am I using", "prune unused skills", "are my skills synced", "check for skill updates", "skill report", "skill health", or mentions skill maintenance, cleanup, or organization.
Build, refactor, review, and debug native Apple-platform software in Swift. Use when working on `.swift` files, SwiftUI views, Observation-based state, `@Bindable` and binding flow, SwiftData-backed UI, scenes and windows, search/navigation structures, UIKit/AppKit interop, Liquid Glass adoption, macOS-native UX, or SwiftUI performance/accessibility. Trigger on requests to create or polish iOS, iPadOS, macOS, or visionOS features; clean up SwiftUI view structure; diagnose jank or invalidation storms; review app quality; or make a feature feel like a good Apple-platform citizen.
Test-driven development for features, bug fixes, regressions, and safe refactors using a failing-test-first workflow. Use when Codex needs to add or change behavior with proof, reproduce a bug in a test, write regression or characterization tests, make a refactor safer, or respond to prompts like "use TDD", "red-green-refactor", "write the test first", "add a regression test", "reproduce this in a test", "prove the fix", "cover this change with tests", or "make this safe to refactor". Prefer this skill when confidence should come from executable evidence instead of reasoning alone.
Apply professional typography principles to create readable, hierarchical, and aesthetically refined interfaces. Use when setting type scales, choosing fonts, adjusting spacing, designing text-heavy layouts, implementing dark mode typography, or when asked about readability, font pairing, line height, measure, typographic hierarchy, variable fonts, font loading, or OpenType features.
Extract a DDD-style ubiquitous language glossary from the current conversation, flagging ambiguities and proposing canonical terms. Saves to UBIQUITOUS_LANGUAGE.md. Use when the user wants to define domain terms, build a glossary, harden terminology, create a ubiquitous language, or mentions "domain model" or "DDD".
Expert Unix and macOS systems engineer for shell scripting, system administration, command-line tools, launchd, Homebrew, networking, and low-level system tasks. Use when the user asks about Unix commands, shell scripts, macOS system configuration, process management, or troubleshooting system issues.
Continuous formal verification of architectural constraints and code quality. Use when asked to verify, audit, or validate codebase integrity. Runs automatically via hooks on every edit (structural) and pre-commit (full). Catches ownership violations, boundary crossings, state machine bugs, and code smells that grep ratchets miss. Triggers: "verify", "formal verify", "check architecture", "audit code quality", "run verification", "/verify", "/verify --bootstrap", "/verify --grade".
This skill should be used when cleaning up codebases that have accumulated dead code, redundant implementations, and orphaned artifacts — especially codebases maintained by coding agents. Triggers on "find dead code", "clean up unused code", "remove redundant code", "prune this codebase", "dead code sweep", "code cleanup", or when a codebase has gone through multiple agent-driven refactors and likely contains overlooked remnants. Systematically identifies cruft, categorizes findings, and removes confirmed dead code with user approval.
Query DeepWiki for repository documentation and structure. Use to understand open source projects, find API docs, and explore codebases.
Explore and compare architectural options before committing to a large technical direction. Use when the user wants to evaluate different architectures, compare approaches, choose between competing designs, rethink a subsystem, or understand tradeoffs before a major refactor or migration. Also use for prompts like "explore the architecture", "what are our options", "compare approaches", "what design should we choose", "audit and recommend an improved architecture", or "help me think through a large architectural change" even if the user does not mention a formal architecture review.
Add the Agentation visual feedback toolbar to a Next.js project.
Build a compilable type-level skeleton from a high-level architecture spec before writing any implementation logic. Use when you have an architectural assessment, design doc, or restructuring plan and need to prove the new architecture is sound before migrating code. Also use when asked to "scaffold the new architecture", "create type stubs", "build the shell", "flesh out this spec", "skeleton the modules", or any request to turn architectural intent into verified structure. This skill follows the "Human Builds the Shell" paradigm: types are hard constraints that the compiler enforces, so if the skeleton compiles, the architecture is structurally sound. Especially valuable for large refactors where you don't trust agents to maintain coherence.
Slice-based, evidence-driven framework for explicit codebase migrations and convergence programs. Use when the user is moving from a named source to a named target, running a multi-session standardization effort across a codebase, consolidating parallel implementations into one target architecture, or asking for a migration playbook with slices, ratchets, handoffs, and release closeout. Do not trigger for normal refactors, routine cleanup, or one-off architecture improvements that are not framed as a migration or convergence effort.
Forensic audit of the user's recent Claude Code sessions to surface step-change workflow improvements — not marginal ones. Use when the user asks to "audit my Claude Code sessions", "analyze how I use Claude Code", "find patterns in my usage", "improve my Claude Code workflow", "review my sessions", "find leverage in my setup", or wants to understand where their Claude Code setup is leaking time. Samples dozens of real transcripts, extracts quantitative signal via scripts, uses parallel subagents for deep reads, then synthesizes into a short prioritized report with drafted implementations (new skills, CLAUDE.md rules, hooks, settings diffs) that the user can install directly. Trigger even when the user doesn't say the word "audit" — if they're asking about improving or reviewing their Claude Code habits at scale, use this skill.
Explore a codebase to find opportunities for architectural improvement, focusing on making the codebase more testable by deepening shallow modules. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more AI-navigable.
MANDATORY handoff to the local Codex CLI. Triggers when (a) the user's message begins with `/codex` as a command, or (b) the user issues an explicit handoff directive like "hand this to codex", "run this through codex", "ask codex", or "have codex do X". On trigger, pipe the user's prompt verbatim to `codex exec` and return Codex's final message verbatim. Treat this like a shell alias the user is executing through you. Do NOT interpret the task, inspect files, gather context, attempt the work yourself, judge whether Codex is the right tool, or rewrite the prompt. The only abort condition is an empty prompt (ask what Codex should do).
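As a rough illustration of trigger (a) and the empty-prompt abort condition above, the sketch below turns a `/codex ...` message into an argument list for `codex exec`. It only builds the command and does not run it. The function name `build_codex_command` is hypothetical, and passing the prompt as a single positional argument is an assumption about how the prompt reaches `codex exec`; trigger (b), the natural-language handoff directives, is not covered.

```python
from typing import List, Optional


def build_codex_command(message: str) -> Optional[List[str]]:
    """Return argv for a `/codex` handoff, or None when the prompt is empty.

    Raises ValueError if the message does not begin with the `/codex` command.
    """
    prefix = "/codex"
    if not message.startswith(prefix):
        raise ValueError("message does not begin with /codex")
    rest = message[len(prefix):]
    # Reject look-alikes such as "/codexfoo": the command must end at whitespace.
    if rest and not rest[0].isspace():
        raise ValueError("message does not begin with /codex")
    # Only surrounding whitespace is trimmed; the prompt text itself is untouched.
    prompt = rest.strip()
    if not prompt:
        return None  # empty prompt: the caller should ask what Codex should do
    # Assumption: the prompt is passed as one positional argument.
    return ["codex", "exec", prompt]
```

Returning `None` for an empty prompt keeps the single abort condition explicit, so the caller can ask the user what Codex should do instead of launching anything.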
Route `/gemini ...` requests to the Cursor headless CLI for one-shot autonomous execution. Use when the user explicitly invokes `/gemini` or asks to hand a task off to Cursor agent. Preserves the prompt verbatim, runs in headless print mode, and returns Cursor's output.
Explore a codebase to find opportunities for architectural improvement, focusing on making the codebase more testable by deepening shallow modules. Use when user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more AI-navigable.
Review recent React, Next.js, or TypeScript UI code changes for hardening before merge or commit. Use when asked to review recent React code changes, audit a React diff, harden a feature, check a PR or branch for React issues, or produce a stack-ranked list of nonredundant findings and a recommended fix plan using react-doctor, Vercel React best practices, Vercel composition patterns, and React useEffect guidance.
Explore and compare architectural options before committing to a large technical direction. Use when the user wants to evaluate different architectures, compare approaches, choose between competing designs, rethink a subsystem, or understand tradeoffs before a major refactor or migration. Also use for prompts like "explore the architecture", "what are our options", "compare approaches", "what design should we choose", "audit and recommend an improved architecture", or "help me think through a large architectural change" even if the user does not mention a formal architecture review.
This skill should be used when cleaning up codebases that have accumulated dead code, redundant implementations, and orphaned artifacts — especially codebases maintained by coding agents. Triggers on "find dead code", "clean up unused code", "remove redundant code", "prune this codebase", "dead code sweep", "code cleanup", or when a codebase has gone through multiple agent-driven refactors and likely contains overlooked remnants. Systematically identifies cruft, categorizes findings, and removes confirmed dead code with user approval.
Conduct exhaustive, citation-rich research on any topic using all available tools: web search, browser automation, documentation APIs, and codebase exploration. Use when asked to "research X", "find out about Y", "investigate Z", "deep dive into...", "what's the current state of...", "compare options for...", "fact-check this...", or any request requiring comprehensive, accurate information from multiple sources. Prioritizes accuracy over speed, cross-references claims across sources, identifies conflicts, and provides full citations. Outputs structured findings with confidence levels and source quality assessments.
Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence -- step-by-step screenshots, repro videos, and detailed repro steps for every issue -- so findings can be handed directly to the responsible teams.
Slice-based, evidence-driven framework for explicit codebase migrations and convergence programs. Use when the user is moving from a named source to a named target, running a multi-session standardization effort across a codebase, consolidating parallel implementations into one target architecture, or asking for a migration playbook with slices, ratchets, handoffs, and release closeout. Do not trigger for normal refactors, routine cleanup, or one-off architecture improvements that are not framed as a migration or convergence effort.
MANDATORY handoff to the local Codex CLI. Triggers when (a) the user's message begins with `/codex` as a command, or (b) the user issues an explicit handoff directive like "hand this to codex", "run this through codex", "ask codex", or "have codex do X". On trigger, pipe the user's prompt verbatim to `codex exec` and return Codex's final message verbatim. Treat this like a shell alias the user is executing through you. Do NOT interpret the task, inspect files, gather context, attempt the work yourself, judge whether Codex is the right tool, or rewrite the prompt. The only abort condition is an empty prompt (ask what Codex should do).
Clean Architecture principles and best practices from Robert C. Martin's book. This skill should be used when designing software systems, reviewing code structure, or refactoring applications to achieve better separation of concerns. Triggers on tasks involving layers, boundaries, dependency direction, entities, use cases, or system architecture.
Perform evidence-driven, multi-subsystem audits of real codebases to find correctness bugs, race conditions, security gaps, stale documentation, dead code, and production-readiness risks. Use when asked to audit a system end-to-end, verify agent-written code before shipping, analyze a subsystem for correctness across multiple modules, or produce a structured risk report for a real implementation. Prefer other skills for a single isolated bug, a proposal or document review, or a dedicated dead-code cleanup.
Continuous formal verification of architectural constraints and code quality. Use when asked to verify, audit, or validate codebase integrity. Runs automatically via hooks on every edit (structural) and pre-commit (full). Catches ownership violations, boundary crossings, state machine bugs, and code smells that grep ratchets miss. Triggers: "verify", "formal verify", "check architecture", "audit code quality", "run verification", "/verify", "/verify --bootstrap", "/verify --grade".
Add the Agentation visual feedback toolbar to a Next.js project.
Compile an agent-optimized changelog by cross-referencing git history with plans and documentation. Use when asked to "update changelog", "compile history", "document project evolution", or proactively after major milestones, architectural changes, or when stale/deprecated information is detected that could confuse coding agents.
Build a compilable type-level skeleton from a high-level architecture spec before writing any implementation logic. Use when you have an architectural assessment, design doc, or restructuring plan and need to prove the new architecture is sound before migrating code. Also use when asked to "scaffold the new architecture", "create type stubs", "build the shell", "flesh out this spec", "skeleton the modules", or any request to turn architectural intent into verified structure. This skill follows the "Human Builds the Shell" paradigm: types are hard constraints that the compiler enforces, so if the skeleton compiles, the architecture is structurally sound. Especially valuable for large refactors where you don't trust agents to maintain coherence.
Query DeepWiki for repository documentation and structure. Use to understand open source projects, find API docs, and explore codebases.
Interview-driven blog post drafting for technical product audiences. Use when user wants to write a blog post, article, or essay and needs help developing their thesis, structure, and initial draft. Triggers on "write a blog post", "draft an article", "help me write about X", "blog drafter", or when user has a topic they want to turn into written content. Conducts structured interviews using AskUserQuestion to extract the user's unique insights before generating drafts.
Generate a pedagogically-grounded study guide for learning an unfamiliar codebase. Use when the user wants to onboard onto a codebase, understand a project's architecture, create learning materials for a team, or asks things like "help me learn this codebase", "create an onboarding guide", "I'm new to this project", "how does this system work", "study guide for this repo", or "explain this codebase to me". Produces a structured document that builds understanding from purpose to systems to patterns, using evidence-based learning techniques (elaborative interrogation, concept mapping, threshold concepts, worked examples, progressive disclosure).
This skill should be used when cleaning up codebases that have accumulated dead code, redundant implementations, and orphaned artifacts — especially codebases maintained by coding agents. Triggers on "find dead code", "clean up unused code", "remove redundant code", "prune this codebase", "dead code sweep", "code cleanup", or when a codebase has gone through multiple agent-driven refactors and likely contains overlooked remnants. Systematically identifies cruft, categorizes findings, and removes confirmed dead code with user approval.
Compile a plain-language task into a concise, auditable Codex or Claude Code `/goal`, or explain why a normal prompt fits better. Use when the user asks to draft, formulate, rewrite, tighten, or create a goal for multi-step work that needs a durable objective, transcript-visible proof, constraints, bounded stop conditions, host-aware operation, and risk-based review depth.
Build interactive debugging interfaces that reveal internal system behavior. Use when asked to "help me understand how this works", "show me what's happening", "visualize the state", "build a debug view", "I can't see what's going on", or any request to make opaque system behavior visible. Applies to state machines, data flow, event systems, algorithms, render cycles, animations, CSS calculations, or any mechanism with hidden internals.
Perform comprehensive, deep analysis of a system and its subsystems to identify bugs, race conditions, stale documentation, dead code, and correctness issues. Use when asked to "audit this system", "exhaustive analysis of X", "analyze for correctness", "root out issues in...", "deep dive into...", "verify this code is correct", "find bugs in...", or when reviewing agent-written code for production readiness. Automatically decomposes systems into subsystems, applies appropriate analysis checklists, and produces structured findings with severity classification.
Write changelog entries for open source documentation sites using Keep a Changelog format. Use when asked to "write a changelog", "update the changelog", "add changelog entry", "document recent changes", or after a release/set of changes that should be recorded. Reviews git commits since the last changelog entry and produces a categorized, human-readable entry.
Design intuitive, meaningful interactions grounded in user goals and cognitive principles. Use when designing component behaviors, user flows, feedback systems, error handling, loading states, transitions, accessibility, keyboard navigation, touch/gesture interactions, or when evaluating interaction quality. Also use for modal vs modeless decisions, direct manipulation patterns, input device considerations, emotional/dramatic aspects of UX, or when asked about making interfaces feel responsive, humane, and goal-directed.
Conduct a thorough alignment interview to deeply understand a task before starting work. Use when starting any non-trivial task — take-home exercises, ambiguous problems, design challenges, complex implementations, research questions — anything where shared understanding matters more than speed. Triggers on phrases like "interview me", "let's align on this", "before we start", "kick off this task", "probe me on this", "I have a take-home", "help me think through", "I want to align before we begin", or whenever the user signals they want a deep upfront context-gathering session before diving in. Err strongly toward triggering for any substantive new task — measure twice, cut once. Produces a written kickoff brief that becomes the shared foundation for the work.
Use when designing or building native macOS applications with SwiftUI or AppKit. Triggers on menu bar structure, keyboard shortcuts, multi-window behavior, Liquid Glass design system, macOS Tahoe/Sequoia, sidebar navigation, toolbar design, app icons, SF Symbols, or making an app feel like a "good Mac citizen."
Apply Model-First Reasoning (MFR) to code generation tasks. Use when the user requests "model-first", "MFR", "formal modeling before coding", "model then implement", or when tasks involve complex logic, state machines, constraint systems, or any implementation requiring formal correctness guarantees. Enforces strict separation between modeling and implementation phases.
Synthesize outputs from multiple AI models into a comprehensive, verified assessment. Use when: (1) User pastes feedback/analysis from multiple LLMs (Claude, GPT, Gemini, etc.) about code or a project, (2) User wants to consolidate model outputs into a single reliable document, (3) User needs conflicting model claims resolved against actual source code. This skill verifies model claims against the codebase, resolves contradictions with evidence, and produces a more reliable assessment than any single model.
Expert guide for configuring, customizing, and creatively leveraging OpenClaw — the self-hosted AI gateway that connects LLMs to messaging channels (Telegram, WhatsApp, Discord, Slack, iMessage, etc.). Use when the user wants to: (1) Set up or modify their openclaw.json configuration, (2) Write or edit bootstrap files (SOUL.md, USER.md, AGENTS.md, IDENTITY.md, TOOLS.md), (3) Configure messaging channels, (4) Set up models and providers, (5) Create multi-agent routing, (6) Build skills, hooks, or cron jobs, (7) Troubleshoot OpenClaw issues, (8) Get creative ideas for leveraging OpenClaw in non-obvious ways. Triggers on: openclaw, gateway, SOUL.md, USER.md, AGENTS.md, IDENTITY.md, channels setup, agent routing, heartbeat, cron jobs, openclaw hooks, openclaw skills, openclaw config, openclaw.json, personal assistant setup.
Build a retrieval-optimized knowledge layer over agent documentation in dotfiles (.claude, .codex, .cursor, .aider). Use when asked to "optimize docs", "improve agent knowledge", "make docs more efficient", or when documentation has accumulated and retrieval feels inefficient. Generates a manifest mapping task-contexts to knowledge chunks, optimizes information density, and creates compiled artifacts for efficient agent consumption.
Guide users step-by-step through manually testing whatever is currently being worked on. Use when asked to "test this", "verify it works", "let's test", "manual testing", "QA this", "check if it works", or after implementing a feature that needs verification before proceeding.
Create high-quality animated explainer visuals for essays and blog posts. Use when the user wants to visualize concepts, processes, data, or ideas with interactive web animations. Triggers on requests like "create a visual for", "animate this concept", "make an explainer", "visualize this idea", "diagram this process", "show this data", or when essay content would benefit from visual explanation. Handles abstract concepts (mental models, frameworks), technical processes (algorithms, systems), and data visualization (trends, comparisons). Outputs self-contained HTML/CSS/JS that embeds directly in web content.
Product analytics expert using PostHog MCP. Triggers on requests to understand user behavior, surface insights, create dashboards, analyze funnels, track metrics, set up experiments, or answer questions about product performance. Use when working with PostHog data, discussing analytics strategy, investigating user journeys, retention, conversion, feature adoption, or when asked to help understand what's happening in the product.
Create a self-contained review package of current work for external review by another AI model or human reviewer. Bundles relevant files with a contextual README and instructional prompt. Triggers: "review package", "create review package", "hand off for review", "get a second opinion", "external code review", "cross-model review", "package for review", "prepare code review". Accepts an optional focus area argument to scope the analysis.
Facilitate methodical review of proposals (technical designs, product specs, feature requests). Use when asked to "review this proposal", "give feedback on this doc", "help me review this RFC", or when presented with a document that needs structured feedback. Handles markdown files, GitHub gists/issues/PRs, and other text formats. Chunks proposals intelligently, predicts reviewer reactions, and produces feedback adapted to the proposal's format.
Craft a high-quality prompt for a deep research agent (like ChatGPT Deep Research) through adaptive interviewing. Use when the user wants to research something but needs help formulating what to ask — when they say "I need to research X", "help me figure out what to ask about Y", "write a research prompt for Z", "I want to use deep research on...", or when they have a vague research need and want a precise, comprehensive prompt that will get excellent results from a research agent. Also use when the user mentions deep research, ChatGPT research, or preparing a query for an AI research tool.
First-principles simplification analysis for codebases. Methodically inventories what a codebase actually does, then asks whether each piece of complexity earns its keep. Use when asked to "simplify this codebase", "is this overengineered", "how could this be simpler", "reduce complexity", "first principles review", "essential complexity audit", "do we really need all this", or any request to rethink whether the current implementation is the simplest way to achieve its goals. Also useful when a codebase feels harder to work with than it should, when onboarding takes too long, or when changes that seem simple keep ballooning in scope.
Research a UI design aesthetic and produce exhaustive, implementation-ready design guidelines for coding agents. Use when the user names an aesthetic (brutalist, glassmorphism, retro-futuristic, Swiss modernist, Apple HIG, neumorphism, minimalism, cyberpunk, Material Design, art deco, vaporwave, etc.) and wants a complete style guide with exact CSS values, color palettes, component states, animations, and typography — detailed enough for a coding agent to faithfully implement the aesthetic with zero ambiguity.
CAVEMAN HUNT BAD PROCESS! Me find greedy creature eating fire and rocks. Me bonk them good. Use when tribe say "kill processes", "clean up servers", "save battery", "find resource hogs", "bonk next.js", or "hunt processes". Me bonk known bad creature automatic. Me ask before bonk mystery creature.
Enter todo recording mode to capture ideas without acting on them. Use when the user says "record todos", "let's capture some todos", "brainstorm mode", or wants to dump ideas without immediate execution. Captures thoughts to .claude/todos/, then organizes and prioritizes on exit.
Assess a codebase's readiness for autonomous agent development and provide tailored recommendations. Use when asked to evaluate how well a project supports unattended agent execution, assess development practices for agent autonomy, audit infrastructure for agent reliability, or improve a codebase for autonomous agent workflows. Triggers on requests like "assess this project for agent readiness", "how autonomous-ready is this codebase", "evaluate agent infrastructure", or "improve development practices for agents".
Write clear, plain-spoken code comments and documentation that lives alongside the code. Use when writing or reviewing code that needs inline documentation like file headers, function docs, architectural decisions, or explanatory comments. Works well for both human readers and AI coding assistants who see one file at a time.
Generate a smart bootstrap prompt to continue the current conversation in a fresh session. Use when (1) approaching context limits, (2) user says "handoff", "bootstrap", "continue later", "save session", or similar, (3) before closing a session with unfinished work, (4) user wants to resume in a different environment. Outputs a clipboard-ready prompt capturing essential context while minimizing tokens.
Ruthlessly analyze architectural seams—the interfaces, boundaries, and contracts between system components—to expose coupling problems, abstraction leaks, and design failures. Use when asked to review architecture, analyze coupling, find interface problems, improve module boundaries, audit dependencies, or redesign system structure. Produces uncompromising redesign proposals that prioritize correctness over backwards compatibility.
Expert Unix and macOS systems engineer for shell scripting, system administration, command-line tools, launchd, Homebrew, networking, and low-level system tasks. Use when the user asks about Unix commands, shell scripts, macOS system configuration, process management, or troubleshooting system issues.
Apply professional typography principles to create readable, hierarchical, and aesthetically refined interfaces. Use when setting type scales, choosing fonts, adjusting spacing, designing text-heavy layouts, implementing dark mode typography, or when asked about readability, font pairing, line height, measure, typographic hierarchy, variable fonts, font loading, or OpenType features.
Transform a codebase study guide into a polished interactive web experience. This skill should be used when the user has a completed study guide markdown file (from codebase-study-guide or similar) and wants to turn it into an interactive pedagogical app. Triggers on requests like "make this study guide interactive", "turn this into an interactive experience", "visualize this study guide", "create an interactive version", or when a user has a study guide .md file and wants a richer presentation. Produces a Vite-served single-page app with scroll-driven storytelling, interactive architecture diagrams, animated code walkthroughs, and progressive disclosure.
Robust Rust patterns for file-backed data, parsing, persistence, FFI boundaries, and system integration. Use when writing Rust that handles file formats, subprocess integration, PID/process management, Serde serialization, or UniFFI boundaries. Covers UTF-8 safety, atomic writes, state machines, and defensive error handling.
Execute architectural refactoring from an assessment document with deterministic, chunked operations and aggressive verification at every step. Use when you have an architectural assessment, clean architecture review, refactoring recommendations, or seam-ripper output and need to actually perform the refactoring safely. Also use when asked to "refactor based on this assessment", "execute these architectural recommendations", "fix architectural drift", "refactor in chunks", or any request to systematically restructure a codebase according to a plan. Designed specifically to prevent the kind of agent drift that causes architectural problems in the first place.
Remove LLM-isms and AI writing patterns from text. This skill should be used when editing prose to sound less like AI output — removing overused words, fixing structural tells, and restoring natural human voice. Triggers: "de-slop", "remove AI writing", "humanize this", "sounds too AI", "LLM-isms", "AI slop", or when reviewing text that reads like chatbot output.
Cold, methodical diagnostician for when you're stuck in agent-assisted app development. Call the fixer when: (1) You're in a loop with a coding agent and things keep getting worse, (2) Your project has accumulated so many agent-generated changes you've lost the thread, (3) Builds are broken and you can't figure out why, (4) You've tried multiple approaches and none are working, (5) You need someone to cut through confusion and give you a clear path forward. Triggers on: "I'm stuck", "nothing is working", "help me fix this", "I'm going in circles", "the agent keeps breaking things", "I've lost track of what's happening", "can you take a look at this mess", or any expression of frustration with agent-assisted development. The fixer does not commiserate — it diagnoses, intervenes, and unblocks.
Expertise in architecting, implementing, reviewing, and debugging hierarchical matching systems. Use when working with: (1) Two-sided matching (Gale-Shapley, hospital-resident, student-school), (2) Assignment/optimization problems (Hungarian algorithm, bipartite matching), (3) Multi-level hierarchy matching (org charts, taxonomies, nested categories), (4) Entity resolution and record linkage across hierarchies. Triggers: debugging match quality issues, reviewing matching algorithms, translating business requirements into constraints, validating match correctness, architecting new matching systems, fixing unstable matches, resolving constraint violations, diagnosing preference misalignment.
Create a narrative guide to a codebase or feature in the style of Knuth's Literate Programming — code and prose interwoven as a single essay, ordered for human understanding rather than compiler needs. Use when the user asks to 'explain this codebase as a story', 'write a literate guide', 'create a narrative walkthrough', 'tell the story of this code', 'Knuth-style documentation', 'weave a guide for this feature', or when they want deep, readable documentation that treats the program as literature. Also trigger when someone wants a document that a thoughtful reader could follow from start to finish and come away understanding both WHAT the code does and WHY every design choice was made.
Create visual parameter tuning panels for iterative adjustment of animations, layouts, colors, typography, physics, or any numeric/visual values. Use when the user asks to "create a tuning panel", "add parameter controls", "build a debug panel", "tweak parameters visually", "fine-tune values", "dial in the settings", or "adjust parameters interactively". Also triggers on mentions of "leva", "dat.GUI", or "tweakpane".
Meta-cognitive decision support that analyzes current context and surfaces intelligent next-step options to the user. Use this skill when: (1) User explicitly invokes /checkpoint, (2) Significant work has been completed and a checkpoint is valuable, (3) Uncertainty or ambiguity exists about requirements or approach, (4) Task complexity has expanded beyond initial scope, (5) Before finalizing or committing to ensure nothing is missed. This skill pauses execution, assesses the situation holistically, and presents 2-5 contextually-appropriate options via AskUserQuestion, with a recommended option and rationale.
Make application behavior visible to coding agents by exposing structured logs and telemetry. Use when asked to "add telemetry", "make logs accessible to agents", "add observability", "debug with logs", or when an agent needs to understand runtime behavior but has no way to query logs. Also use when debugging is difficult because there are no structured logs, when agent docs (CLAUDE.md, AGENTS.md) lack instructions for querying application logs, or when setting up logging infrastructure for a new or existing web application.
Analyze recent conversation context and capture learnings to project knowledge files (for project-specific insights) or skills/commands/subagents (for cross-project patterns). Use when the user asks to "capture this learning", "update the docs with this", "remember this for next time", "document this issue", "add this to CLAUDE.md", "save this knowledge", or "update project knowledge". Also triggers after resolving build/setup issues, discovering non-obvious patterns, or completing debugging sessions with valuable insights.
Identify non-obvious signals, hidden patterns, and clever correlations in datasets using investigative data analysis techniques. Use when analyzing social media exports, user data, behavioral datasets, or any structured data where deeper insights are desired. Pairs with personality-profiler for enhanced signal extraction. Triggers on requests like "what patterns do you see", "find hidden signals", "correlate these datasets", "what am I missing in this data", "analyze across datasets", "find non-obvious insights", or when users want to go beyond surface-level analysis. Also use proactively when you notice interesting anomalies or correlations during any data analysis task.
Structured development workflow that separates research, planning, and implementation into distinct phases with persistent markdown artifacts. Use when starting any non-trivial feature, refactor, bug investigation, or codebase change. Trigger on: "deep work", "research and plan", "plan before coding", "write a plan", "research this codebase", "don't code yet", "understand then implement", or when the user wants a disciplined approach to a complex task. Also use when the user says "research", "plan", "annotate", "implement the plan", or references research.md/plan.md artifacts.