skills/api-rx/SKILL.md
Metric-driven API surface design quality evaluation from a consumer's perspective. Use when: evaluating REST API quality, scoring endpoint design, reviewing response contracts, comparing API versions, validating developer experience, or when the user says "grade this API", "evaluate API", "API design review", "score this API", "REST quality", "run api-rx", "API quality check", or "how good is this API". Measures 8 dimensions (32 sub-metrics) with exact thresholds from Richardson Maturity Model, JSON:API, Google AIP, Stripe API model, OAuth 2.1, OpenAPI 3.1, Standard Webhooks, and HTTP Caching RFCs. Produces scorecards with actionable prescriptions.
Install this skill globally with one command (works with Claude Code, Cursor, and Windsurf):
npx skillsauth add acardozzo/rx-suite api-rx
Dependencies: none (POSIX only). Optional: Firecrawl MCP.
Check all dependencies: bash scripts/rx-deps.sh or bash scripts/rx-deps.sh --install
Evaluate API design quality using 8 weighted dimensions and 32 sub-metrics with exact, reproducible thresholds. No guessing — every score traces to a measured value.
Announce at start: "I'm using the api-rx skill to evaluate [target] against 8 dimensions and 32 sub-metrics."
Accepts one argument: a directory path containing API route/controller files, or all.
/api-rx src/api
/api-rx apps/backend
/api-rx all
When all: evaluate every API surface directory and produce both per-module and aggregate scorecards.
Run scripts/discover.sh to detect route files, middleware, controllers, and OpenAPI specs.

| # | Dimension | Weight | What It Measures | Source |
|---|-----------|--------|------------------|--------|
| D1 | REST Maturity & Resource Design | 15% | Resource naming, HTTP methods, status codes, HATEOAS | Richardson Maturity Model, REST API Design Rulebook (Masse) |
| D2 | Response Consistency & Contracts | 15% | Envelope uniformity, error format, pagination, filtering | JSON:API, Google AIP, Microsoft REST Guidelines |
| D3 | Versioning & Evolution | 10% | Version strategy, deprecation, backward compat, changelog | Stripe API model, API Changelog patterns |
| D4 | Authentication & Rate Limiting DX | 15% | Auth flow, rate limit headers, API key mgmt, auth errors | OAuth 2.1, IETF Rate Limiting headers |
| D5 | OpenAPI & SDK Readiness | 10% | Spec completeness, codegen compat, examples, validation | OpenAPI 3.1, Smithy, gRPC service definitions |
| D6 | Webhook & Event API | 10% | Delivery guarantees, signatures, retry, event schema | Standard Webhooks spec, Stripe Webhooks model |
| D7 | Performance & Caching DX | 15% | Cache headers, conditional requests, bulk ops, payload opt | HTTP Caching (RFC 7234), ETags, Conditional Requests |
| D8 | Documentation & Developer Experience | 10% | Interactive docs, code examples, getting started, error catalog | Stripe Docs, Twilio Docs as gold standards |
Full metric tables and thresholds: read references/grading-framework.md.
Run scripts/discover.sh [target_dir] to scan the codebase. The orchestrator dispatches 8 dimension scripts in parallel, each collecting raw measurements.
# M1.1: Resource naming
# Scan route definitions for plural nouns, hierarchical patterns, verb-in-URL violations
# M1.2: HTTP method semantics
# Check correct verb usage per route, idempotency of PUT/DELETE
# M1.3: Status code accuracy
# Count distinct status codes used, check for 200-for-everything anti-pattern
# M1.4: HATEOAS / Hypermedia
# Search for link generation in responses, self/next/prev links, rel attributes
# M2.1: Response envelope consistency
# Check if responses follow a uniform structure (data/meta/links or similar)
# M2.2: Error response format
# Scan error handlers for structured errors, machine-readable codes, i18n support
# M2.3: Pagination pattern
# Detect cursor vs offset pagination, total count, next/prev links
# M2.4: Sparse fields & filtering
# Check for field selection params, filter operators, sort params
# M3.1: Versioning strategy
# Detect version prefixes in routes (/v1/, /v2/) or version headers
# M3.2: Deprecation policy
# Search for Sunset headers, deprecation notices, @deprecated annotations
# M3.3: Backward compatibility
# Check for additive-only changes, no field removals without version bump
# M3.4: Changelog & migration
# Look for CHANGELOG files, migration guides, version diff docs
# M4.1: Auth flow clarity
# Detect auth middleware, token lifecycle, documented auth flows
# M4.2: Rate limit headers
# Search for X-RateLimit-Limit/Remaining/Reset, Retry-After in responses
# M4.3: API key management
# Check for key rotation logic, scoping, environment separation
# M4.4: Error UX for auth failures
# Verify 401 vs 403 distinction, token refresh guidance in responses
# M5.1: OpenAPI spec completeness
# Find openapi.yaml/json, check endpoint coverage, examples, schemas
# M5.2: Code generation compatibility
# Check for SDK-friendly naming, no ambiguous operationIds
# M5.3: Request/response examples
# Count endpoints with example request/response bodies
# M5.4: Schema validation
# Detect Zod/Joi/Yup at boundaries, typed responses
# M6.1: Delivery guarantees
# Search for at-least-once logic, idempotency keys in webhook handlers
# M6.2: Signature verification
# Detect HMAC signing, timestamp validation in webhook delivery
# M6.3: Retry policy
# Check for exponential backoff, dead letter, manual retry endpoints
# M6.4: Event schema & versioning
# Look for typed event definitions, schema evolution patterns
# M7.1: Cache headers
# Search for Cache-Control, ETag, Last-Modified on GET endpoints
# M7.2: Conditional requests
# Detect If-None-Match, If-Modified-Since handling, 304 responses
# M7.3: Bulk operations
# Find batch/bulk endpoints, reduced round-trip patterns
# M7.4: Response time & payload optimization
# Check for gzip/compression, field selection, lazy relation loading
# M8.1: Interactive documentation
# Detect Swagger UI, Redoc, try-it configs
# M8.2: Code examples
# Count multi-language examples, copy-pasteable snippets
# M8.3: Getting started guide
# Find quickstart/getting-started docs, time-to-first-call estimate
# M8.4: Error catalog
# Check for documented error codes, fix suggestions, searchable index
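As an illustration of what one of these collectors measures, the M1.1 verb-in-URL check could be sketched as follows. This is a minimal sketch, not the skill's actual script: the function name `verb_in_url_violations`, the `ACTION_VERBS` list, and the route syntax (`:id` and `{id}` path parameters) are all assumptions made for the example.

```python
import re

# Hypothetical verb list; the real thresholds live in references/grading-framework.md.
ACTION_VERBS = {"get", "create", "update", "delete", "fetch", "list", "add", "remove"}

# Split a path segment on hyphens, underscores, or a lower-to-upper camelCase boundary.
WORD_BOUNDARY = re.compile(r"[-_]|(?<=[a-z])(?=[A-Z])")


def verb_in_url_violations(routes):
    """Return the routes whose path contains a segment starting with an action verb."""
    violations = []
    for route in routes:
        for segment in route.strip("/").split("/"):
            # Skip path parameters such as :id or {id} (assumed syntax).
            if segment.startswith((":", "{")):
                continue
            first_word = WORD_BOUNDARY.split(segment)[0].lower()
            if first_word in ACTION_VERBS:
                violations.append(route)
                break  # one violation per route is enough
    return violations


if __name__ == "__main__":
    sample = ["/users", "/users/:id/orders", "/getUsers", "/orders/create-order"]
    print(verb_in_url_violations(sample))  # ['/getUsers', '/orders/create-order']
```

The raw value reported for M1.1 would then be something like the violation count over the total route count, which the scoring agents compare against the threshold tables.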
After collecting raw metrics, dispatch 4 parallel agents to score the 8 dimensions:
Agent 1 — D1 + D2 (REST Design + Response Contracts): Receives raw metric data for resource naming, HTTP methods, status codes, HATEOAS, envelope consistency, error format, pagination, filtering. Reads the grading framework reference file. Applies threshold tables. Returns scored sub-metrics and dimension scores.
Agent 2 — D3 + D4 (Versioning + Auth & Rate Limiting): Receives raw metric data for version strategy, deprecation, backward compat, changelog, auth flows, rate limit headers, API key management, auth error UX. Reads the grading framework reference file. Applies threshold tables. Returns scored sub-metrics and dimension scores.
Agent 3 — D5 + D6 (OpenAPI & SDK + Webhooks): Receives raw metric data for spec completeness, codegen compat, examples, schema validation, delivery guarantees, signatures, retry policy, event schemas. Reads the grading framework reference file. Applies threshold tables. Returns scored sub-metrics and dimension scores.
Agent 4 — D7 + D8 (Performance & Caching + Documentation): Receives raw metric data for cache headers, conditional requests, bulk ops, payload optimization, interactive docs, code examples, getting started, error catalog. Reads the grading framework reference file. Applies threshold tables. Returns scored sub-metrics and dimension scores.
After all agents return, compute the overall score:
Overall = (D1 * 0.15) + (D2 * 0.15) + (D3 * 0.10) + (D4 * 0.15)
+ (D5 * 0.10) + (D6 * 0.10) + (D7 * 0.15) + (D8 * 0.10)
Map to letter grade:
| Grade | Score Range |
|-------|-------------|
| A+ | 97-100 |
| A | 93-96 |
| A- | 90-92 |
| B+ | 87-89 |
| B | 83-86 |
| B- | 80-82 |
| C+ | 77-79 |
| C | 73-76 |
| C- | 70-72 |
| D+ | 67-69 |
| D | 63-66 |
| D- | 60-62 |
| F | 0-59 |
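The weighted formula and grade table above can be sketched in a few lines of Python. The function names are hypothetical, and one detail is an assumption: the table lists integer ranges only, so this sketch treats each range's lower bound as a floor (a score of 96.5 maps to A, not A+).

```python
# Dimension weights in percent, taken from the dimension table.
WEIGHTS = {"D1": 15, "D2": 15, "D3": 10, "D4": 15, "D5": 10, "D6": 10, "D7": 15, "D8": 10}

# Lower bound of each grade band, highest first.
GRADE_FLOORS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
]


def overall_score(dimension_scores):
    """Weighted average of the eight dimension scores (each 0-100)."""
    missing = WEIGHTS.keys() - dimension_scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(dimension_scores[d] * w for d, w in WEIGHTS.items()) / 100


def letter_grade(score):
    """Map a 0-100 score to a letter grade; band floors are an assumption for fractions."""
    for floor, grade in GRADE_FLOORS:
        if score >= floor:
            return grade
    return "F"


if __name__ == "__main__":
    scores = {"D1": 80, "D2": 90, "D3": 70, "D4": 85, "D5": 60, "D6": 50, "D7": 75, "D8": 95}
    total = overall_score(scores)
    print(total, letter_grade(total))  # 77.0 C+
```

Keeping the weights as integer percentages and dividing once at the end avoids floating-point drift that could push a score just below a grade boundary.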
Output format — ALWAYS use this exact structure:
# API Design Grade: [TARGET]
**Overall: [SCORE] ([GRADE])**
| # | Dimension | Weight | Score | Grade | Weakest Sub-Metric |
|----|-----------|--------|-------|-------|---------------------|
| D1 | REST Maturity & Resource Design | 15% | [X] | [G] | [metric: raw value] |
| D2 | Response Consistency & Contracts | 15% | [X] | [G] | [metric: raw value] |
| D3 | Versioning & Evolution | 10% | [X] | [G] | [metric: raw value] |
| D4 | Authentication & Rate Limiting DX | 15% | [X] | [G] | [metric: raw value] |
| D5 | OpenAPI & SDK Readiness | 10% | [X] | [G] | [metric: raw value] |
| D6 | Webhook & Event API | 10% | [X] | [G] | [metric: raw value] |
| D7 | Performance & Caching DX | 15% | [X] | [G] | [metric: raw value] |
| D8 | Documentation & Developer Experience | 10% | [X] | [G] | [metric: raw value] |
## Sub-Metric Detail
### D1: REST Maturity & Resource Design ([SCORE])
| Sub-Metric | Weight | Raw Value | Score |
|------------|--------|-----------|-------|
| M1.1 Resource Naming | 25% | [details] | [S] |
| M1.2 HTTP Method Semantics | 25% | [details] | [S] |
| M1.3 Status Code Accuracy | 25% | [details] | [S] |
| M1.4 HATEOAS / Hypermedia | 25% | [details] | [S] |
[... repeat for D2-D8 with same table format ...]
## Top 5 Issues (Highest Impact)
1. **[Issue]** — [dimension] — fixing raises score by ~[N] points
2. ...
## Recommendations
- To reach [NEXT_GRADE]: fix [specific issues]
- Estimated effort: [relative sizing]
When evaluating all, also produce an aggregate:
# Aggregate API Design Grade
**Overall: [SCORE] ([GRADE])**
| Module | Endpoints | Weight | D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 | Overall | Grade |
|--------|-----------|--------|----|----|----|----|----|----|----|----|---------|-------|
| users/ | [N] | [W]% | .. | .. | .. | .. | .. | .. | .. | .. | [S] | [G] |
| orders/ | [N] | [W]% | .. | .. | .. | .. | .. | .. | .. | .. | [S] | [G] |
[... all modules ...]
Aggregate = weighted average by endpoint count proportion
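A minimal sketch of the endpoint-weighted aggregate, assuming each module contributes in proportion to its endpoint count as stated above. The tuple layout and function names are hypothetical.

```python
def module_weights(modules):
    """Return each module's weight in percent, proportional to its endpoint count.

    `modules` is a list of (name, endpoint_count, overall_score) tuples.
    """
    total = sum(count for _, count, _ in modules)
    if total == 0:
        raise ValueError("no endpoints found in any module")
    return {name: 100 * count / total for name, count, _ in modules}


def aggregate_score(modules):
    """Average of module overall scores, weighted by endpoint count."""
    total = sum(count for _, count, _ in modules)
    if total == 0:
        raise ValueError("no endpoints found in any module")
    return sum(count * score for _, count, score in modules) / total


if __name__ == "__main__":
    modules = [("users/", 10, 90.0), ("orders/", 30, 70.0)]
    print(module_weights(modules))   # {'users/': 25.0, 'orders/': 75.0}
    print(aggregate_score(modules))  # 75.0
```

With this weighting, a large module with a poor score pulls the aggregate down more than a small one, which matches the per-module Weight column in the aggregate table.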
Save scorecard to: docs/audits/YYYY-MM-DD-api-rx-[target].md
When all: save individual module scorecards + aggregate to docs/audits/YYYY-MM-DD-api-rx-all.md
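The naming convention above could be produced with a small helper like the one below. This is a sketch under assumptions: the source does not say how a directory target such as src/api becomes the [target] part of the filename, so the slug rule here (non-alphanumeric runs become hyphens, lowercased) is hypothetical, as is the function name.

```python
import re
from datetime import date


def scorecard_path(target, on=None):
    """Build docs/audits/YYYY-MM-DD-api-rx-[target].md for a target directory or "all"."""
    day = on if on is not None else date.today()
    # Assumed slug rule: collapse anything that is not a letter or digit into a hyphen.
    slug = re.sub(r"[^A-Za-z0-9]+", "-", target).strip("-").lower()
    if not slug:
        raise ValueError("target must contain at least one letter or digit")
    return f"docs/audits/{day.isoformat()}-api-rx-{slug}.md"


if __name__ == "__main__":
    print(scorecard_path("src/api", date(2025, 1, 31)))
    # docs/audits/2025-01-31-api-rx-src-api.md
    print(scorecard_path("all", date(2025, 1, 31)))
    # docs/audits/2025-01-31-api-rx-all.md
```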
After generating the scorecard and saving the report to docs/audits/:
1. Save the report to docs/rx-plans/{this-skill-name}/{date}-report.md
2. Run the rx-plan skill to create or update the improvement plan at docs/rx-plans/{this-skill-name}/{dimension}/v{N}-{date}-plan.md
3. Update docs/rx-plans/{this-skill-name}/summary.md with current scores
4. Update docs/rx-plans/dashboard.md with overall progress

This happens automatically; the user does not need to run /rx-plan separately.