
Conversational QA mode — user reports bugs in plain language, agent clarifies minimally, files GitHub issues that survive refactors. Trigger on "QA", "QA session", or ad-hoc bug reporting without a fixed deliverable shape. Distinct from branch-scoped and PR-scoped review.
Reduce concepts, duplication, and ceremony in internal code while touching nearby code. Use when working an existing path and you spot dead fields, redundant wrappers, or speculative abstractions; distinct from refactor-break-bw-compat (internal hygiene, not public API removal).
Auto-router for context gathering. Detects whether the task needs codebase exploration or external knowledge research and dispatches accordingly. Trigger on "get context", "background on X", "context on X", "how does X work", architectural orientation, or any setup-before-coding request.
Adversarial security audit — STRIDE, OWASP Top 10, supply-chain (CVE/SBOM), secrets scan, auth/authz analysis. Use on changes touching auth, input parsing, deserialization, network I/O, dependencies, or secrets; before any production release or external-surface PR.
Code search, analysis, and refactoring using ast-grep (sg). Use for AST-based code modifications, structural search, and linting.
Verbosity-reduction response register. Drops redundant clauses, narrative hedging, and ceremony while preserving articles, grammar, and decision-oriented register. Trigger on "caveman", "compact mode", "less tokens", "be brief", or context-window pressure in long sessions.
Verbalized Sampling (VS) protocol for intent exploration before planning, mode-aware. Default `exhaustive` runs full VS; `collaborative` runs tip-sharing dialogue; `adversarial` walks the design tree one fork at a time. Auto-detects from phrasing ("help me refine" → collaborative, "poke holes" → adversarial); override via `/askme adversarial|collaborative|exhaustive`. Use for ambiguous tasks or maximum clarifying questions before committing.
Surface deepening refactors that turn shallow modules into deep ones, informed by `CONTEXT.md` and `docs/adr/`. Use when the user asks to improve architecture, find refactor candidates, raise testability, or make a codebase more agent-navigable. Skip for single localized fixes.
Set visual and interaction direction for any UI surface (web, React, TUI, CLI, desktop, Qt, design-system tokens) before any UI code. Direction-first: generates 3-4 distinct directions via verbalized sampling, picks one via per-axis single-select, then derives palette, typography, spacing, motion budget. Loads when the user asks for UI work, palette/theme/tokens, mentions a design system, or when output looks AI-generic, vibe-coded, sloppy, or default Tailwind/shadcn/Bootstrap. Enforces two-sided anti-slop charter: forbids purple gradients, `transition: all`, system-ui, default Tailwind palette AND overkill compensation (sprites, gradients everywhere, animation on every element).
Polymorphic iterative repair loop — accept a verifier failure, structured findings (review/resolve/triage-issue), or a bug description; modify→verify→keep on green, auto-revert on guard regression, until clean or iteration cap. Use when the user says "fix", "make it pass", or "apply the findings", or hands an artifact + repo and expects patches; auto-routes to gh-fix-ci or gh-address-comments when an open PR + gh auth + GH-flavored input is detected.
Restructure Web-UI / human-triggered tasks into CLI + file-output loops the LLM can iterate alone, with structured logs and addressable scratchpads. Apply trap-or-abandon: if a step cannot be looped, improve the harness rather than babysit. Trigger on iterative grunt-work, "push a button in a web UI to trigger this", monitoring dashboards, or any workflow whose inner loop requires a human in the middle.
--- name: perf-profile description: Hotspot detection and performance root-cause analysis: flamegraph interpretation, allocation tracking, latency profiling, regression measurement. Use when a workload misses its latency, throughput, or memory budget; when a benchmark regresses; or before optimizing any hot path (no optimization without a profile). --- Performance is a contract with reality. Intuition about hot paths is wrong more often than right. Capture, locate, hypothesize, optimize, re-mea
Test-Driven Development (TDD) across any supported language. Use when implementing features or fixes with TDD methodology, writing tests before code, or following XP-style development.
Cross-domain taste skill — apply distinctive judgment to any artifact (prose, code, design, decisions) instead of converging to AI defaults. Two modes — `audit` (judge work against the two-sided charter and portable anchors) and `anchor` (load register before producing). Auto-detects by phrasing; override via `/taste audit | anchor`. Trigger on "is this slop?", "overkill?", "elegant?", "taste-test this".
ODIN's compress-operations dispatcher under the Compressor/Extender role. Invoke on "tidy", "clean up", "tidy this file/memory/workspace/git/docs", or when active context (current file, diff, stack, memory directory) has structural rot to resolve before touching behavior. Detects target domain from context and routes to the sibling skill. Requires explicit target or clear active-context signal — do not invoke speculatively.
Read-only codebase exploration: discovery, structural reading, and emission of architecture/pattern/tooling/dependency summaries. Use to understand existing code, map files, trace function flow, locate symbols, or build pre-implementation context. Defers to ODIN's Dispatch-First protocol (1/3/5 Explore-agent escalation). Trigger on "explore", "find where X is", "how does X work in the code", "map the codebase", "what files handle Y", or any architecture/pattern/tooling/dependency context request on a local repo — even without naming /explore.
Read-only external knowledge gathering via ODIN's 5-tier doc ladder (Official docs → API refs → Books/papers → Tutorials → Community). For library APIs, framework behavior, SDK migrations, version-specific docs, vendor announcements, RFCs. Verifies claims against primary sources. Invoke on "how does X library work", "migration guide", "docs for", or any named library/framework/SDK/API/CLI/service.
Plan a refactor as a sequence of tiny, working commits via adversarial interview. Default output is a markdown plan at `<project-root>/docs/refactor-plans/<name>.md`; pass `--emit-issue` to also file a GitHub issue. Trigger on structural refactors (rename, extract, move, split, dedupe) — NOT new features.
One-shot bootstrap of strict-mode tooling per ecosystem plus per-task GOALS.md scaffolding so an agentic loop can self-verify. Writes typechecker/linter/schema-validator config for TS (strict + noUncheckedIndexedAccess + exactOptionalPropertyTypes), Python (Pyright strict, Ruff strict), Rust (Clippy deny-correctness), Go (golangci-lint with staticcheck), OCaml (dune --release); establishes `.agent-tasks/<id>/GOALS.md` per-task convention distinct from project-stable AGENTS.md. C++/Java/Kotlin and framework specifics (Spring Boot, Nest, React-strict) are out of scope. Trigger on new project bootstrap, agentic-task setup, "make this self-verifying", "set the loop's goal", "scaffold goals for this issue". Pairs with `llm-self-loop` runtime.
Build safe, syntax-aware srgn CLI commands for source-code search and transformation. Use for srgn commands, scoped refactors (comments/docstrings/imports/functions), multi-file rewrites with --glob, tree-sitter queries, or CI checks with --fail-any/--fail-none.
Resolve code review comments by verifying validity, then proposing multiple architectural solutions (not naive fixes) for confirmed issues. Use when addressing review feedback or analyzing comment validity.
Install git pre-commit hooks via the project's hook tool — Husky+lint-staged (JS), pre-commit (Python/OCaml), lefthook (Go), cargo-husky (Rust). Use when the user wants commit-time formatting, linting, type-checking, or test gates. Detects ecosystem first.
Step up one layer of abstraction and surface a map of relevant modules and callers when the local view is too narrow. Trigger when the agent (or user) is unfamiliar with a code region and needs the surrounding architecture before committing to a change.
Design-by-Contract (DbC) development. Use when implementing with formal preconditions, postconditions, and invariants across any language.
Hypothesis-driven defect isolation — stack-trace forensics, breakpoint strategy, state inspection, and root-cause confirmation via minimal repro. Use when a defect surfaces (test failure, crash, exception, wrong output, intermittent flake) and the cause is not immediately obvious from the change set.
Review staged + unstaged changes and split them into one commit per logical change. Use whenever the user says "atomic commit", "commit my changes", "split this into commits", or has multiple unrelated edits sitting in the working tree — even if they don't say "atomic". Runs repo-native type-checker and linter before each commit and refuses to bundle unrelated changes.
Install a Claude-Code PreToolUse hook that blocks destructive git commands (push variants including force-push, hard reset, force clean, branch -D, checkout/restore overwrites) before Bash runs them. Use when the user wants git safety rails, force-push prevention, or repository-wipe protection.
Triage GitHub issues through a configurable label-based state machine. Use when user wants to triage incoming issues, prepare issues for an autonomous agent, or move an issue between workflow states. Repo inferred from `git remote`; all GitHub calls go through `gh`.
Hybrid interview that probes AI-engineering mastery by tip-vocabulary depth — entity referencing, loop closure, observability, harness improvement — not by token usage or LOC. Start collaborative (two-way tip exchange), escalate to adversarial probing when depth is lacking. Trigger when the user says "interview me on AI", "stress-test my Claude usage", "evaluate this candidate's AI engineering", or otherwise asks for an AI-collab skill assessment. User-only — never auto-invoke.
Software architect and planning specialist - conduct thorough read-only planning before any action. Use when exploring a codebase to design implementation plans, defining objectives, gathering relevant files, and summarizing available tools before coding begins.
Delete tests that don't catch real bugs — the inverse of TDD. Use when reviewing legacy test suites, slow CI investigations, refactor-driven test sweeps, or evaluating whether a test the type system already covers should stay. Thesis — a test exists only if removing it would let a real bug reach production.
Type-driven development. Use when developing with refined types, state machines encoded in types, or proof-carrying types; enforces totality and exhaustive pattern matching.
Grill against the existing domain model. Stress-test a plan's terminology against `CONTEXT.md` and ADRs; update both inline as decisions crystallise. Trigger when user proposes a feature/refactor that touches business concepts and the project has documented domain language to honor — or when domain language is missing and needs capture.
Produce share-safe copies of memory files under /tmp with PII redacted (paths, emails, session IDs, dates) and credentials scanned (tokens, keys); never mutates originals. Use when the user says "sanitize memory for sharing", "redact memory PII", or "scan memory for credentials".
Decompose a plan, PRD, or spec into independently-grabbable vertical-slice issues (markdown file by default, GitHub issues via flag). Trigger when the user wants implementation tickets, work decomposition, or to convert an implementation plan into parallelizable work. Takes a plan file and emits atomic vertical slices.
Mechanically tighten existing prose — restructure sections by dependency order, split or merge paragraphs, remove redundancy. Use to compress verbose plan files, READMEs, ADRs, and design docs. Does NOT change voice, register, tone, or any ODIN-mandated phrasing.
Author a single new skill — produce a SKILL.md plus optional bundled references and scripts following Anthropic's progressive-disclosure conventions. Trigger when the user asks to "write a skill", "create a skill", "draft a SKILL.md", or "add a skill" for a specific capability. Distinct from repo onboarding workflows that write AGENTS.md and project conventions.
Two-party posture — user as director, agent as executor; every fork, tradeoff, or choice surfaced via batched AskUserQuestion with a recommended default. Use when the user invokes /duet, says "ask before" / "pair with me" / "human-in-the-loop", or for aesthetic/architectural/irreversible decisions.
Merge one or more PRs into the base branch with queue-like sequencing and conflict resolution. Use when merging PRs that may conflict with each other or the base, requiring ordered application and intelligent conflict handling.
Execute an implementation plan with surgical precision. Use after a planning phase (plan-now or similar) has produced a step-by-step strategy and identified critical files. Focuses on precise code changes with verification at each step.
Write adversarial tests that intentionally stress failure paths. Use when hardening error handling, stress-testing assumptions, validating boundary behavior, or hunting silent failures.
Run the atomic-commit workflow on the current changes, then publish the resulting commits to the remote. Use whenever the user says "commit and push", "ship these changes", "atomic commit and push", "publish my work", or wants atomic commits delivered to origin in one step. Prefers `git submit` (git-branchless); falls back to a named branch + `git push origin HEAD:refs/heads/<branch>`. Refuses force-push and direct push to protected branches without explicit authorization.
Surface in-task-collaboration protocols when the user describes an AI workflow informally — URL-as-entity-reference, PR-comment threads as session memory. Trigger when the user names entities by colloquial label instead of stable URL, asks "how should I structure this for Claude", or describes a multi-step Claude workflow without a durable handle. Apply reactively, not as a checklist.
Analyze a codebase and create or improve an AGENTS.md file for future agent instances. Use when onboarding to a repository and capturing hard-to-rediscover conventions, constraints, and rationale.
Decompose a task into independent concerns and execute them through broadly parallel, specialized agent groups. Use when a request involves multiple independent sub-tasks, research across separate domains, or work that can be parallelized across files or modules.
Merge multiple PRs into a temporal integration branch before merging to base, with ordered conflict resolution. Use when you want to validate a set of PRs together on a staging branch before advancing the base branch.
Refactor by removing backward compatibility and legacy layers. Use when modernizing APIs, cleaning up migration debt, removing compat shims, or eliminating stale feature flags.
Review code changes on a given GitHub PR using gh CLI. Use when the user asks to review a pull request, analyze PR diffs, or provide feedback on open PRs with structured quality, security, and testing assessments.
Review the code changes on the current branch. Use when the user asks to review their current work, analyze recent commits, or get a code quality assessment of the active branch against the main branch.
Generate radically-different module or API contract designs in parallel, then compare on depth, simplicity, and ease-of-correct-use. Trigger when the user is shaping a new module surface, exploring contract options, comparing module shapes, or applying "design it twice" to a first-pass sketch.
Adversarial relentless interview against any plan or design until shared understanding is reached. Walk the decision tree, resolve dependencies one answer at a time, recommend a default per question. Trigger when the user says "grill me", "stress-test this", "interview me about this design", or otherwise asks for hostile-but-fair scrutiny of a proposal. General-purpose — no domain anchor required.
Initialize or idempotently revise the repo's .gitignore by composing gitignore.io templates, AI-tooling/IDE patterns, and confirmed noise from git status. Use when the user says "set up gitignore", "fix gitignore", or untracked files keep appearing in git status.
Validation-first development. Use when developing with formal state machine specifications, invariants, and temporal properties before writing implementation code.
Help address review/issue comments on the open GitHub PR for the current branch using gh CLI; verify gh auth first and prompt the user to authenticate if not logged in.
Inspect GitHub PR checks with gh, pull failing GitHub Actions logs, summarize the failure, then plan and implement the fix after user approval. Use when the user asks to debug or fix failing PR CI on GitHub Actions; external checks (Buildkite, etc.) are reported as URLs only.
Scan agent's session-history transcripts for save-worthy signals (corrections, preferences, decisions, references), propose and write auto-memory files with valid frontmatter and MEMORY.md entry. Use when the user says "save this to memory", "remember that", or "scan this session for memories".
Audit memory directory for structural issues (orphans, dangling refs, duplicates, missing sections, oversized entries) and staleness against session-history transcripts; report-first, fix-on-confirmation. Use when the user says "audit memory", "memory hygiene", or "find stale/duplicate memories".
Synthesize current conversation context and codebase understanding into a PRD artifact (markdown file by default, GitHub issue via flag). Trigger when the user asks for a PRD, requirements doc, feature spec, or wants to crystallize an in-flight discussion into a durable artifact before planning. PRD precedes implementation planning.
Dependency-upgrade campaign — outdated scan, batch-by-severity, breaking-change remediation, lockfile audit. Use when CVEs require remediation, when a major upstream version lands, when stack compatibility forces a sweep, or on a scheduled (quarterly) hygiene cadence. CVE-driven bumps consume security audit findings as input.
Investigate a reported bug to root cause, then emit a TDD-shaped fix plan as an issue artifact. Trigger when the user reports a bug, says "triage", asks for issue investigation, or wants a fix plan before code changes.
Personal taste skill — 5 evidence-derived anchors ({anchor_names}) for prose, code, design, and decisions. Two modes: audit judges an artifact against the two-sided charter; anchor loads the taste register before producing. Trigger with "{trigger_phrase}", "taste-test", "is this slop?", or "overkill?".
Evidence-first generator for a personal <name>-taste Claude Code skill. Mines local memories and agent histories for influences, slop bans, and overkill bans; asks compact confirmation forks; previews the synthesis; then writes a right-sized taste skill with exactly 5 anchors by default. Trigger with "generate my taste skill", "make my taste skill", or "derive my taste spine".
Extract a domain glossary from the current dialogue; flag ambiguities, propose canonical terms, persist to `UBIQUITOUS_LANGUAGE.md`. Trigger when the user is hardening domain terminology, building a glossary, or fresh domain concepts surface in conversation without documented language.
ODIN's compact-form conversation skill -- formal-logic English register with predicate claims, Hoare-triple framing, and ASCII shortened-English keywords. Trigger when user requests "axiom", "axiom-mode", "axiom-compact", or "compact form".
Proof-driven development. Use when implementing with formal verification using property-based testing, theorem proving, or proof tactics; zero unproven property policy enforced.
Enforce idiomatic git-branchless during planning and executing tasks — detached-HEAD-first work, in-memory rebase via `git move`, event-log recovery via `git undo`, deferred branch creation, speculative-merge `git sync` for base updates. Use when planning or executing multi-commit work, history rewrites, stack edits, rebase/reorder, fixup insertion mid-stack, stacked-PR publishing, or recovery from bad git ops; or when the user mentions branchless, smartlog, `git move`, or `git undo`. Silently inert if branchless is not initialized for the current repo.