skills/mcp-mastery/mcp-tool-design/SKILL.md
Design MCP tool schemas, names, and descriptions that AI agents actually pick correctly and use without hand-holding. Covers the anti-patterns that make agents loop, pick wrong tools, or hallucinate arguments. Use this skill when designing or reviewing MCP tools, debugging "the agent isn't using my tool", or pruning a bloated tool surface. Activate when: MCP tool design, tool description, agent picks wrong tool, too many tools, tool schema, tool naming.
npx skillsauth add latestaiagents/agent-skills mcp-tool-design
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
A well-designed tool is invoked correctly by the agent on the first try. A bad one causes loops, wrong-tool selection, or hallucinated arguments.
Agents pick tools based on name, description, and parameter schema — in that order of signal strength. Every design choice should strengthen at least one.
| Good | Bad | Why |
|---|---|---|
| search_issues | issueSearcher | snake_case, verb-led |
| get_user_by_email | user_lookup | specific over vague |
| create_pr_comment | comment | namespaced by object |
| list_repos | repos | action is explicit |
Rule: <verb>_<object>[_<qualifier>]. If two tools could answer the same query, one has the wrong name.
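The naming rule above can be checked mechanically. The following is a minimal sketch, not part of any MCP SDK: `checkToolName` and the `KNOWN_VERBS` allowlist are hypothetical names introduced here for illustration, and the verb list is an assumption you would adapt to your own tool surface.

```typescript
// Hypothetical verb allowlist (an assumption); extend it to match your own tools.
const KNOWN_VERBS = new Set(["get", "list", "search", "create", "update", "delete"]);

// Lowercase snake_case with at least two segments: <verb>_<object>[_<qualifier>...]
const SNAKE_CASE = /^[a-z]+(_[a-z0-9]+)+$/;

// Returns a list of problems; an empty list means the name follows the rule.
function checkToolName(name: string): string[] {
  const problems: string[] = [];
  if (!SNAKE_CASE.test(name)) {
    problems.push("not snake_case with at least <verb>_<object>");
  }
  const [firstSegment = ""] = name.split("_");
  if (!KNOWN_VERBS.has(firstSegment)) {
    problems.push(`does not start with a known verb (got "${firstSegment}")`);
  }
  return problems;
}
```

Run against the table above, the "Good" names return no problems, while `user_lookup` is flagged for not leading with a verb and `issueSearcher` is flagged on both counts.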
Descriptions are what the model reads most carefully. Budget ~1-3 sentences:
<Verb-led action>. <When to use it / when NOT to use it>. <Any gotchas>.
Weak:
Creates an issue.
Strong:
Create a GitHub issue in the specified repo. Use this for new bug reports or feature requests. Do NOT use to comment on an existing issue; use `create_issue_comment` for that. Title is required; body supports markdown.
The "when NOT to use" line is the highest-leverage sentence you can write — it routes disambiguation without the agent needing to enumerate all tools.
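One way to keep every description in the same shape is to assemble it from named parts, so a missing "when NOT to use" sentence is easy to spot in review. This is a sketch under assumptions: `DescriptionParts` and `buildDescription` are hypothetical helpers, not something the MCP SDK provides.

```typescript
// Hypothetical structure mirroring the description template above.
interface DescriptionParts {
  action: string;   // verb-led sentence: what the tool does
  useWhen: string;  // when to pick this tool
  notWhen?: string; // when NOT to pick it, ideally naming the alternative tool
  gotchas?: string; // required fields, formats, limits
}

// Joins the parts into one description, adding a final period where one is missing.
function buildDescription(parts: DescriptionParts): string {
  return [parts.action, parts.useWhen, parts.notWhen, parts.gotchas]
    .filter((s): s is string => Boolean(s))
    .map((s) => s.trim())
    .map((s) => (/[.!?]$/.test(s) ? s : s + "."))
    .join(" ");
}

const createIssueDescription = buildDescription({
  action: "Create a GitHub issue in the specified repo",
  useWhen: "Use this for new bug reports or feature requests",
  notWhen: "Do NOT use to comment on an existing issue; use create_issue_comment for that",
  gotchas: "Title is required; body supports markdown",
});
```

Keeping `notWhen` as its own field makes the routing sentence a deliberate decision rather than an afterthought.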
- Describe every field: `.describe()` in Zod, `Field(description=...)` in Pydantic. Undescribed fields get hallucinated values.
- Prefer enums over free strings: `priority: z.enum(["low", "med", "high"])` beats `priority: z.string()`.
- When a sensible value exists (for example `limit=50`), set the default; don't make the agent guess.

server.tool(
  "search_issues",
  // Description: action, result size, when to use, and which tool to use instead.
  "Search GitHub issues across a repo. Returns up to `limit` issues matching the query (default 25, max 100). " +
    "Use for finding issues by keyword or label. For a specific issue by number, use `get_issue` instead.",
  {
    // Every field carries a description, an enum, or a default so the agent never has to guess.
    repo: z.string().describe("owner/name format, e.g. 'anthropic/claude-code'"),
    query: z.string().describe("Full-text search query, GitHub search syntax supported"),
    state: z.enum(["open", "closed", "all"]).default("open").describe("Issue state to include"),
    labels: z.array(z.string()).optional().describe("Filter by label names (AND semantics)"),
    limit: z.number().int().min(1).max(100).default(25).describe("Maximum number of issues to return"),
  },
  async (args) => { /* ... */ },
);
The model sees the tool result as text. Structure matters:
- When a list is truncated, end with a "... 190 more, use offset=10" hint.

return {
  content: [{
    type: "text",
    text: [
      `Found ${results.length} issues matching "${query}":`,
      ...results.slice(0, 10).map(r => `- #${r.number} ${r.title} (${r.state})`),
      results.length > 10 ? `... ${results.length - 10} more. Narrow query or paginate with offset.` : "",
    ].filter(Boolean).join("\n"),
  }],
};
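The same formatting idea can be pulled into a standalone helper that also tells the agent exactly which offset to request next. This is a sketch under assumptions: the `Issue` shape, the `formatIssuePage` name, and the `offset`/`pageSize` parameters are hypothetical and would need matching parameters on the tool itself.

```typescript
// Hypothetical minimal issue shape for illustration.
interface Issue {
  number: number;
  title: string;
  state: "open" | "closed";
}

// Formats one page of results as plain text: summary line, one line per issue,
// and a concrete pagination hint when more results remain.
function formatIssuePage(all: Issue[], query: string, offset = 0, pageSize = 10): string {
  const page = all.slice(offset, offset + pageSize);
  const nextOffset = offset + page.length;
  const remaining = all.length - nextOffset;
  const lines = [
    `Found ${all.length} issues matching "${query}":`,
    ...page.map((r) => `- #${r.number} ${r.title} (${r.state})`),
  ];
  if (remaining > 0) {
    lines.push(`... ${remaining} more, use offset=${nextOffset}`);
  }
  return lines.join("\n");
}
```

Stating the next offset in the hint saves the agent from computing it, which removes one common source of off-by-one pagination loops.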
Tool selection accuracy degrades past ~15 tools per connected server set. If you have 40 CRUD operations, collapse them:
- Before: `create_issue`, `update_issue`, `delete_issue`, `close_issue`, `reopen_issue`, `assign_issue`, ... (12 tools)
- After: `issue_action(action: "create"|"update"|"close"|"reopen", ...)` (1 tool with discriminated schema)

Only collapse if the sub-actions share 80%+ of their schema. Otherwise the union becomes a mess.
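A discriminated schema for the collapsed tool might look like the sketch below, written as a plain TypeScript union so it runs without any schema library. The field names (`repo`, `issueNumber`, `title`, `body`) and the `describeIssueAction` helper are assumptions for illustration; in a Zod-based server the same shape would typically be expressed with a discriminated union schema.

```typescript
// Hypothetical argument shapes. The `action` field is the discriminant;
// the fields shared across variants are what make collapsing reasonable.
type IssueAction =
  | { action: "create"; repo: string; title: string; body?: string }
  | { action: "update"; repo: string; issueNumber: number; title?: string; body?: string }
  | { action: "close"; repo: string; issueNumber: number }
  | { action: "reopen"; repo: string; issueNumber: number };

// Dispatches on the discriminant; the compiler narrows `args` in each branch,
// so accessing a field that a variant lacks is a type error.
function describeIssueAction(args: IssueAction): string {
  switch (args.action) {
    case "create":
      return `create issue "${args.title}" in ${args.repo}`;
    case "update":
      return `update #${args.issueNumber} in ${args.repo}`;
    case "close":
    case "reopen":
      return `${args.action} #${args.issueNumber} in ${args.repo}`;
  }
}
```

If the variants start to diverge (for example, one needs a dozen fields the others never use), that is a signal the sub-action belongs in its own tool.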
Anti-patterns to avoid:
- `execute(query: string)` that dispatches everything. Model can't pick wisely; hallucinates queries.
- `search`, `find`, `lookup`, `query` all doing similar things.
- `options: object` where some keys are required; agent skips them.
- When several connected servers each expose a `search` tool, the model gets confused. Prefix: `linear_search`, `github_search`.

Before shipping a tool:
- Every parameter has a `.describe()`.
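Two of the checks in this section, undescribed parameters and unprefixed names that collide across servers, are easy to automate before shipping. The sketch below assumes a simplified, hypothetical `ToolSpec` shape rather than any real SDK type; `preShipCheck` is an illustrative name.

```typescript
// Hypothetical, simplified view of a tool definition (not an MCP SDK type).
interface ParamSpec {
  description?: string;
}
interface ToolSpec {
  server: string;
  name: string;
  params: Record<string, ParamSpec>;
}

// Flags parameters without a description and tool names exposed by more than one server.
function preShipCheck(tools: ToolSpec[]): string[] {
  const problems: string[] = [];
  const serversByName = new Map<string, string[]>();
  for (const tool of tools) {
    for (const [param, spec] of Object.entries(tool.params)) {
      if (!spec.description || spec.description.trim() === "") {
        problems.push(`${tool.server}/${tool.name}: parameter "${param}" has no description`);
      }
    }
    serversByName.set(tool.name, [...(serversByName.get(tool.name) ?? []), tool.server]);
  }
  for (const [name, servers] of serversByName) {
    if (servers.length > 1) {
      problems.push(`"${name}" is exposed by ${servers.join(", ")}; prefix it per server`);
    }
  }
  return problems;
}
```

A check like this fits naturally in CI, so a new tool cannot land with a bare parameter or a name that shadows another server's tool.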