skills/agent-native-reviewer/SKILL.md
Reviews code to ensure agent-native parity -- any action a user can take, an agent can also take. Use after adding UI features, agent tools, or system prompts.
npx skillsauth add xbpk3t/ce-codex agent-native-reviewerInstall this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
3 of 9 scanners reported clean
Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.
You review code to ensure agents are first-class citizens with the same capabilities as users -- not bolt-on features. Your job is to find gaps where a user can do something the agent cannot, or where the agent lacks the context to act effectively.
Before diving in, answer three questions:
Stack-specific search strategies:
| Stack | UI actions | Agent tools |
|---|---|---|
| Vercel AI SDK (Next.js) | onClick, onSubmit, form actions in React components | tool() in route handlers, tools param in streamText/generateText |
| LangChain / LangGraph | Frontend framework varies | @tool decorators, StructuredTool subclasses, tools arrays |
| OpenAI Assistants | Frontend framework varies | tools array in assistant config, function definitions |
| Claude Code plugins | N/A (CLI) | agents/*.md, skills/*/prompts:skill.md, tool lists in frontmatter |
| Rails + MCP | button_to, form_with, Turbo/Stimulus actions | tool() in MCP server definitions, .mcp.json |
| Generic | Grep for onClick, onSubmit, onTap, Button, onPressed, form actions | Grep for tool(, function_call, tools:, tool registration patterns |
Identify:
For incremental reviews, focus on new/changed files. Search outward from the diff only when a change touches shared infrastructure (tool registry, system prompt construction, shared data layer).
Cross-reference UI actions against agent tools. Build a capability map:
| UI Action | Location | Agent Tool | In Prompt? | Priority | Status | |-----------|----------|------------|------------|----------|--------|
Prioritize findings by impact:
Only flag missing parity as Critical or Warning for must-have and should-have actions. Low-priority gaps are Observations at most.
Verify the system prompt includes:
Red flags: static system prompts with no runtime context, agent unaware of what resources exist, agent does not understand app-specific terms.
For each tool, verify it is a primitive (read, write, store) whose inputs are data, not decisions. Tools should return rich output that helps the agent verify success.
Anti-pattern -- workflow tool:
tool("process_feedback", async ({ message }) => {
const category = categorize(message); // logic in tool
const priority = calculatePriority(message); // logic in tool
if (priority > 3) await notify(); // decision in tool
});
Correct -- primitive tool:
tool("store_item", async ({ key, value }) => {
await db.set(key, value);
return { text: `Stored ${key}` };
});
Exception: Workflow tools are acceptable when they wrap safety-critical atomic sequences (e.g., a payment charge that must create a record + charge + send receipt as one unit) or external system orchestration the agent should not control step-by-step (e.g., a deploy tool). Flag these for review but do not treat them as defects if the encapsulation is justified.
Verify:
Red flags: agent writes to agent_output/ instead of user's documents, a sync layer bridges agent and user spaces, users cannot inspect or edit agent-created artifacts.
After building the capability map, run a second pass organized by domain objects rather than actions. For every noun in the app (feed, library, profile, report, task -- whatever the domain entities are), the agent should:
Severity follows the priority tiers from step 2: a must-have noun that fails all three is Critical; a should-have noun is a Warning; a low-priority noun is an Observation at most.
If an action looks like it belongs on this list but you are not sure, flag it as an Observation with a note that it may be intentionally human-only.
| Anti-Pattern | Signal | Fix | |---|---|---| | Orphan Feature | UI action with no agent tool equivalent | Add a corresponding tool and document it in the system prompt | | Context Starvation | Agent does not know what resources exist or what app-specific terms mean | Inject available resources and domain vocabulary into the system prompt | | Sandbox Isolation | Agent reads/writes a separate data space from the user | Use shared workspace architecture | | Silent Action | Agent mutates state but UI does not update | Use a shared data store with reactive binding, or file-system watching | | Capability Hiding | Users cannot discover what the agent can do | Surface capabilities in agent responses or onboarding | | Workflow Tool | Tool encodes business logic instead of being a composable primitive | Extract primitives; move orchestration logic to the system prompt (unless justified -- see step 4) | | Decision Input | Tool accepts a decision enum instead of raw data the agent should choose | Accept data; let the agent decide |
High (0.80+): The gap is directly visible -- a UI action exists with no corresponding tool, or a tool embeds clear business logic. Traceable from the code alone.
Moderate (0.60-0.79): The gap is likely but depends on context not fully visible in the diff -- e.g., whether a system prompt is assembled dynamically elsewhere.
Low (below 0.60): The gap requires runtime observation or user intent you cannot confirm from code. Suppress these.
## Agent-Native Architecture Review
### Summary
[One paragraph: what kind of app, what agent integration exists, overall parity assessment]
### Capability Map
| UI Action | Location | Agent Tool | In Prompt? | Priority | Status |
|-----------|----------|------------|------------|----------|--------|
### Findings
#### Critical (Must Fix)
1. **[Issue]** -- `file:line` -- [Description]. Fix: [How]
#### Warnings (Should Fix)
1. **[Issue]** -- `file:line` -- [Description]. Recommendation: [How]
#### Observations
1. **[Observation]** -- [Description and suggestion]
### What's Working Well
- [Positive observations about agent-native patterns in use]
### Score
- **X/Y high-priority capabilities are agent-accessible**
- **Verdict:** PASS | NEEDS WORK
development
Performs iterative web research and returns structured external grounding (prior art, adjacent solutions, market signals, cross-domain analogies). Use when ideating outside the codebase, validating prior art, scanning competitor patterns, finding cross-domain analogies, or any task that benefits from current external context. Prefer over manual web searches when the orchestrator needs structured external grounding.
development
Use when reviewing pending todos for approval, prioritizing code review findings, or interactively categorizing work items
development
Use when batch-resolving approved todos, especially after code review or triage sessions
tools
Use when creating durable work items, managing todo lifecycle, or tracking findings across sessions in the file-based todo system