Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

mblode/ax-audit

Name: ax-audit
Author: mblode

skills/ax-audit/SKILL.md

npx skillsauth add mblode/agent-skills ax-audit

AX Audit

Feature-level reviewer for apps where an agent acts for the user. One question: does it earn trust, and where does it break?

IS: rules-based audit of agentic surfaces (agent chat, tool execution panels, agent config, dashboards) across two layers (architecture in rules-arch/, trust/relationship design in rules-ax/), ending in a ship-readiness verdict plus an AX Relationship Summary.
IS NOT: traditional frontend UX (forms, states, focus, async, microcopy, accessibility, layout, typography, performance; use ui-audit for these); agent instruction-file quality (use agents-md).

No agentic features in scope (only forms, lists, modals)? Route to ui-audit; AX rules against traditional UI produce only noise.

Audit workflow
Two rule layers
Tiers and verdict
AX Relationship Summary
Reference files
Gotchas
Audit self-check
Related skills

Audit workflow

Track this checklist:

AX Audit progress:
- [ ] Step 1: Scope, via `git diff --name-only main` (PR mode) or explicit path (full sweep)
- [ ] Step 2: Detect agentic features per references/feature-playbooks.md
- [ ] Step 3: Run each detected feature's playbook in order, plus the diff-wide checks
- [ ] Step 4: For each check, load the rule file and follow its detection recipe
- [ ] Step 5: Tier each finding per references/ship-readiness.md (rule override table wins)
- [ ] Step 6: Render verdict + findings + AX Relationship Summary per references/output-format.md
- [ ] Step 7: Run the audit self-check and report its evidence counts

Step notes:

Scope. Default: PR diff plus the tool definitions and orchestrator code it touches. Findings in untouched files belong in a full sweep, not a PR verdict.
Detect. Heuristics (component names, hooks, routes) for the four feature types live in references/feature-playbooks.md.
Playbooks. Each feature has 5-9 ordered checks; run all, even expected passes (a pass with evidence belongs in the report). The diff-wide parity-orphan-ui-action runs on every PR-mode audit regardless of detected features.
Rules. Each rule file carries its own detection commands, false-positive guards, tier override table, and suppression syntax, and is authoritative; playbook annotations are a convenience copy.
Tier. Three tiers; precedence below.
Render. Group findings by surface; verdict block first, AX Relationship Summary last.
Self-check. Evidence or it didn't happen (see below).

Two rule layers

| Layer | Folder | Rules | Question it answers | Category index |
|---|---|---|---|---|
| 1: Agent-native architecture | rules-arch/ | 11 | Can the agent do what the user can do? Are tools atomic? Does the agent know what exists? Is completion explicit? | rules-arch/_sections.md |
| 2: Agentic experience | rules-ax/ | 12 | Does the agent earn trust? Can the user interrupt, undo, push back? Is memory visible? | rules-ax/_sections.md |

Load rules-arch/<category>-<slug>.md or rules-ax/<category>-<slug>.md when a playbook check names it. Categories: arch = parity, granularity, context, comm; ax = trust, control, context, comm. Both layers share the comm and context prefixes, but the rules differ: rules-arch/comm-no-approval-gate.md (orchestrator code has no gate logic) is not rules-ax/control-no-approval-gate.md (approval UI doesn't match the stakes).

Tiers and verdict

Every finding gets exactly one tier (full trigger lists in references/ship-readiness.md):

release-blocker, fix before merge: no escape hatch, silent execution, heuristic completion, broken parity, ungated high-stakes actions
fix-this-sprint, merge with a tracked issue: no confidence cues, no intent handshake, opaque memory, bundled config tools
backlog, ship and track: static canvas, no generative momentum, static API mapping, no checkpoint/resume

Tier precedence: a rule's own surface-override table > the generic surface bump in references/ship-readiness.md > the rule's defaultTier. Apply at most one adjustment; never stack the generic bump on a rule's explicit override.
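The precedence rule can be sketched as a small resolver. This is an illustration only, not the skill's implementation: the surface names (`tool-execution`, `agent-chat`), the function name, and the cap at release-blocker are assumptions; the real override tables and generic bump live in the rule files and references/ship-readiness.md.

```python
from typing import Optional

# Tiers ordered from least to most severe.
TIERS = ["backlog", "fix-this-sprint", "release-blocker"]


def resolve_tier(
    default_tier: str,
    surface: str,
    rule_overrides: Optional[dict] = None,
    generic_bump_surfaces: frozenset = frozenset({"tool-execution"}),
) -> str:
    """Apply at most one adjustment: explicit override, else generic bump, else default."""
    overrides = rule_overrides or {}
    # 1. The rule's own surface-override table wins outright.
    if surface in overrides:
        return overrides[surface]
    # 2. Otherwise the generic surface bump raises the default by one tier
    #    (assumed here to cap at release-blocker).
    if surface in generic_bump_surfaces:
        bumped = min(TIERS.index(default_tier) + 1, len(TIERS) - 1)
        return TIERS[bumped]
    # 3. Otherwise the rule's defaultTier stands.
    return default_tier


# An explicit override is used as-is; the generic bump is not stacked on top.
print(resolve_tier("fix-this-sprint", "tool-execution",
                   {"tool-execution": "release-blocker"}))  # release-blocker
# No override for this surface: the generic bump applies once.
print(resolve_tier("backlog", "tool-execution"))  # fix-this-sprint
# Neither applies: the default tier stands.
print(resolve_tier("backlog", "agent-chat"))  # backlog
```

Returning early from the override branch is what prevents the double upgrade: a finding never passes through both adjustments.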

Verdict: ✅ READY (0 blockers, ≤3 sprint) · ⚠️ READY WITH FOLLOW-UP (0 blockers, ≥4 sprint) · ❌ NOT READY (≥1 blocker) · 🚫 INCOMPLETE (self-check failed).
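As a rough sketch, the verdict thresholds map onto a single function. The function name and the choice to let a failed self-check take priority over every other outcome are assumptions; the authoritative logic is in references/ship-readiness.md.

```python
def verdict(blockers: int, sprint: int, self_check_passed: bool = True) -> str:
    """Map finding counts to a ship-readiness verdict.

    blockers: number of release-blocker findings
    sprint:   number of fix-this-sprint findings
    Backlog findings do not affect the verdict.
    """
    # Assumption: an audit that fails its self-check is INCOMPLETE
    # regardless of the counts, since the counts cannot be trusted.
    if not self_check_passed:
        return "INCOMPLETE"
    if blockers >= 1:
        return "NOT READY"
    if sprint >= 4:
        return "READY WITH FOLLOW-UP"
    return "READY"


print(verdict(blockers=0, sprint=3))  # READY
print(verdict(blockers=0, sprint=4))  # READY WITH FOLLOW-UP
print(verdict(blockers=1, sprint=0))  # NOT READY
```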

AX Relationship Summary

Rendered after findings when any agentic feature was detected. Findings serve engineers; this serves designers and PMs, so never skip it. Four fields:

Evolution stage: behavior description, not a label (see references/ax-evolution-curve.md)
Trust signal: high/moderate/low, one-line reasoning from trust-critical rules
Key gap: the single most important gap, one actionable sentence
Trust question: one question only prototyping or research can answer

Reference files

| File | Read when |
|---|---|
| references/feature-playbooks.md | Steps 2-3: detection heuristics, per-feature ordered checks, diff-wide checks |
| references/ship-readiness.md | Step 5: tier triggers, precedence, verdict logic |
| references/output-format.md | Step 6: findings JSON schema, summary schema, terminal rendering |
| references/agent-native-principles.md | A Layer 1 finding needs grounding: parity, granularity, CRUD completeness, context patterns, approval matrices, checkpoint/resume |
| references/ax-evolution-curve.md | Writing the evolution-stage field of the AX summary |
| rules-arch/_sections.md | Layer 1 categories and default tiers |
| rules-ax/_sections.md | Layer 2 categories, default tiers, co-firing rule pairs |

Gotchas

Scope before rules. Running all 23 rules repo-wide on a 3-file PR buries a new release-blocker under pre-existing backlog noise; the verdict stops meaning "can this PR merge."
The rule's override table is authoritative. comm-no-intent-handshake defaults to fix-this-sprint but its table says release-blocker on tool execution. Stacking the generic "+1 tier on tool execution" bump on an explicit override double-upgrades backlog findings into blockers.
A stop button not wired to AbortController.abort() is a false affordance. control-no-escape-hatch still fails: verify the abort() call, not the button label, or the audit passes a UI that lies to users.
Absence checks need a recorded file list. "Find components lacking X" greps return nothing both when everything passes and when nothing was scanned. List candidate files first (rg -l <feature-pattern>), check each for the counter-pattern, and cite the file list as evidence.
Rules marked `detection: observational` cannot fail on grep evidence alone. granularity-static-api-mapping, trust-no-uncertainty-markers, control-over-conversational, and comm-no-generative-momentum need interaction-flow judgment; on static evidence alone, return unknown with a reason, not fail.
ax-audit-ignore:<slug> comments count as suppressed, not pass. Report the count in the verdict block; a suppression with no reason is itself worth a warn.
Don't duplicate ui-audit findings. "Missing loading state" and "form clears on error" are ui-audit territory; duplicating them trains engineers to dismiss the whole AX report.
Don't inflate tiers. comm-no-generative-momentum and granularity-static-api-mapping default to backlog. Promoting cosmetic findings to blocker trains the team to ignore ❌ verdicts.
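The absence-check gotcha above can be illustrated with a two-pass scan that always returns the candidate list alongside the misses. This is a minimal sketch over in-memory file contents; the function name, file names, and patterns are hypothetical, and a regex hit on `.abort()` is only a first-pass signal, not proof that the button is wired to it.

```python
import re


def absence_check(files: dict, feature_pattern: str, counter_pattern: str):
    """Return (candidates, missing).

    candidates: files matching the feature pattern (the recorded file list)
    missing:    candidates that lack the counter-pattern
    """
    feature = re.compile(feature_pattern)
    counter = re.compile(counter_pattern)
    # Pass 1: list candidate files first, so an empty result is visible as
    # "nothing scanned" rather than mistaken for "everything passed".
    candidates = sorted(p for p, text in files.items() if feature.search(text))
    # Pass 2: check each candidate for the counter-pattern.
    missing = [p for p in candidates if not counter.search(files[p])]
    return candidates, missing


# Hypothetical file contents for illustration.
files = {
    "StopButton.tsx": "<button onClick={() => controller.abort()}>Stop</button>",
    "Chat.tsx": "<button>Stop generating</button>",
    "Form.tsx": "<input />",
}
candidates, missing = absence_check(files, r"Stop", r"\.abort\(\)")
print(candidates)  # ['Chat.tsx', 'StopButton.tsx']
print(missing)     # ['Chat.tsx']
```

When `candidates` comes back empty, the honest result is unknown with a reason, since no file was actually examined.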

Audit self-check

Flag the audit INCOMPLETE if any of these hold, and include the counts as evidence (planned vs. run rules per playbook, unknown rate, suppressed count):

Fewer rules ran than the playbooks planned
More than 30% of rules returned unknown
Any fail/warn finding lacks file:line evidence or a fix snippet
Every finding landed in the same tier (suspect blanket assignment)
AX Relationship Summary is missing despite detected agentic features
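The five conditions above can be sketched as one function that returns the reasons an audit is INCOMPLETE. The `RuleResult` shape and all field names are assumptions made for illustration; the real findings schema is defined in references/output-format.md.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleResult:
    # Hypothetical record of one rule run; not the skill's actual schema.
    rule: str
    status: str                     # "pass", "warn", "fail", or "unknown"
    tier: Optional[str] = None      # set for fail/warn findings
    evidence: Optional[str] = None  # e.g. "src/Chat.tsx:42"
    fix: Optional[str] = None       # suggested fix snippet


def self_check(planned: int, results: list, agentic_detected: bool,
               has_summary: bool) -> list:
    """Return the reasons the audit is INCOMPLETE; an empty list means it passed."""
    reasons = []
    ran = len(results)
    if ran < planned:
        reasons.append(f"ran {ran} of {planned} planned rules")
    unknown = sum(r.status == "unknown" for r in results)
    if ran and unknown / ran > 0.30:
        reasons.append(f"unknown rate {unknown}/{ran} exceeds 30%")
    findings = [r for r in results if r.status in ("fail", "warn")]
    if any(not r.evidence or not r.fix for r in findings):
        reasons.append("a fail/warn finding lacks file:line evidence or a fix snippet")
    # Read literally, a single finding also trips this check; a real
    # implementation might require a minimum number of findings first.
    if findings and len({r.tier for r in findings}) == 1:
        reasons.append("every finding landed in the same tier")
    if agentic_detected and not has_summary:
        reasons.append("AX Relationship Summary missing")
    return reasons
```

Returning reasons rather than a boolean keeps the counts available as evidence in the report.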

Related skills

ui-audit: traditional frontend UX quality around agentic surfaces; run both on agentic feature PRs, with ax-audit covering the agent layer
agents-md: audit CLAUDE.md / AGENTS.md agent instruction files
define-architecture: repo structure and module boundaries

Audits agentic applications for architecture and trust: tool parity, tool granularity, context injection, completion signals, approval gates, confidence cues, escape hatches, intent handshakes, memory visibility, and adaptive canvases. Produces a ship-readiness verdict plus an AX Relationship Summary. Use when reviewing agentic feature PRs or asking "is this agent-native", "AX review", "critique this AI feature", "does this earn user trust", or "audit this for AX". For traditional frontend UX use ui-audit.

69 stars

tools

Updated Jul 26, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add mblode/agent-skills ax-audit

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | % |
|---|---|---|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Jul 26, 2026, 4:11 AM · 137.3s · 1 file scanned

Related Skills

mblode/tidy

development

Verified · Trusted · Community

Fans out four concurrent review agents over the current diff, then APPLIES fixes directly to the working tree and verifies the build. Mutates code; it does not produce a report. Covers reuse (duplicate logic, hand-rolled stdlib, reinvented platform features), quality (hacky patterns, React/TypeScript hygiene, over-memoisation, exhaustive-deps, `any`, dead code, `CLAUDE.md`/`AGENTS.md` violations), efficiency (unnecessary work, missed concurrency, hot-path bloat), and test discipline (bug fixes without a repro test, useless tests to delete, missing tests only when they prevent a named failure). Use when the user says "tidy this up", "simplify", "clean up this diff", "polish my changes", "check for duplication", or "any reuse opportunities?", i.e. when the intent is to have the changes made automatically. For a read-only report that lists findings without touching files, use `pr-reviewer` instead. This skill edits code; for the PR's title, description, or commit history, use `pr-creator`.

69 stars · SKILL.md · Updated Jul 2, 2026

mblode/product-design

development

Verified · Trusted · Community

Decides what an interface should do before UI is built or audited: interaction choice, action scope and consequence, reachable states, resilience, and accessibility as task completion. Works from a brief, spec, mockup, intent, or existing UI. Use when asked "is this the right interaction", "design the flow", "what control should this use", "what should this action affect", "which states should this have", "make this resilient", or "what breaks here". For building or styling use ui-design; for built-code audits use ui-audit; for copy wording use copywriting.

69 stars · SKILL.md · Updated Jun 28, 2026

mblode/planning

development

Verified · Trusted · Community

Builds and stress-tests implementation plans in two modes. Create mode scans code and docs, asks one question at a time with a recommended answer, runs a blindspot pass when the user is new to the area, then writes a plan file. Review mode scores completeness, feasibility, scope, testability, risk, and assumptions, verifies checkable claims, and writes resolutions back until every dimension reaches 5/5. Use when asked to "create a plan", "plan this feature", "I want to build X", "grill me", "think this through", "blindspot pass", "unknown unknowns", "this is new to me", "review my plan", "rubber duck this", "stress test this plan", "is this plan ready", "get this plan to 5/5", "what am I missing", "verify this claim", "prove this plan", "fact-check this plan", or when the user explicitly wants a plan artifact before implementation. For code review use pr-reviewer; for architecture briefs use define-architecture.

69 stars · SKILL.md · Updated Jun 28, 2026

mblode/dx-audit

tools

Verified · Trusted · Community

Audits the smallest relevant developer-facing surface of a library, CLI, SDK, or npm package across API contracts, errors, CLI behavior, public types, onboarding, and config. Uses candidate-first rule loading, bounded local evidence, and compact root-cause findings. Use when asked to "audit my CLI", "make this CLI agent-friendly", "is this API ergonomic", "review the developer experience", "improve these errors", "simplify first run", or "review my SDK". For end-user UI use ui-audit, for agentic-app trust use ax-audit, for docs prose use docs-writing, for README work use readme-creator, and for repo architecture use define-architecture. Inside a product that also ships a UI, this is the skill for the developer-facing half, so pick it when the complaint is about an import, command, error string, exported type, or config rather than a screen.

69 stars · SKILL.md · Updated Jun 28, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.

Install manually

# Clone the repo
git clone https://github.com/mblode/agent-skills.git

# Copy into Claude Code skills folder (global), creating it if needed
mkdir -p ~/.claude/skills
cp -r agent-skills/skills/ax-audit ~/.claude/skills/

Repository

mblode/agent-skills

69 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
