creating-mcp-servers/SKILL.md
Creates production-ready MCP servers using FastMCP v2. Use when building MCP servers, optimizing tool descriptions for context efficiency, implementing progressive disclosure for multiple capabilities, or packaging servers for distribution.
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
npx skillsauth add oaustegard/claude-skills creating-mcp-servers
Build production-ready MCP servers using FastMCP v2 with optimal context efficiency through progressive disclosure patterns.
Activate this skill when:
1-3 simple tools?
→ Standard FastMCP with optimized tools
Load: references/MANDATORY_PATTERNS.md
5+ related capabilities?
→ Gateway pattern (progressive disclosure)
Load: references/PROGRESSIVE_DISCLOSURE.md
Load: references/GATEWAY_PATTERNS.md
Optimize existing server?
→ Apply mandatory patterns
Load: references/MANDATORY_PATTERNS.md
Package for distribution?
→ MCPB bundler
Load: references/MCPB_BUNDLING.md
Execute: scripts/create_mcpb.py
Need FastMCP documentation?
→ Search references/LLMS_TXT.md for relevant URLs
→ Use web_fetch on gofastmcp.com URLs
Four critical requirements for ALL implementations:
uv pip install fastmcp
Details in references/MANDATORY_PATTERNS.md
To fetch FastMCP documentation:
1. Read references/LLMS_TXT.md - complete URL index
2. Search for relevant topic keywords
3. Use web_fetch on matched URLs (append .md for markdown)
4. Apply patterns from fetched documentation
Example: Authentication patterns → Search LLMS_TXT.md for "authentication" → web_fetch https://gofastmcp.com/servers/auth/authentication.md
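The lookup steps above can be sketched in plain Python. This is a minimal sketch, assuming LLMS_TXT.md is a markdown list of `[title](url)` links (the actual file layout is not shown here). `find_doc_urls` and the sample index are hypothetical, and the fetch itself is still left to web_fetch.

```python
import re

# Assumption: the index contains markdown links of the form [title](url).
LINK_RE = re.compile(r"\[([^\]]+)\]\((https?://[^)\s]+)\)")


def find_doc_urls(index_text: str, keyword: str) -> list[str]:
    """Return gofastmcp.com URLs whose title or URL mentions the keyword."""
    keyword = keyword.lower()
    urls = []
    for title, url in LINK_RE.findall(index_text):
        if "gofastmcp.com" not in url:
            continue
        if keyword in title.lower() or keyword in url.lower():
            # Append .md to request the markdown version of the page.
            urls.append(url if url.endswith(".md") else url + ".md")
    return urls


# Illustrative stand-in for the contents of references/LLMS_TXT.md.
sample_index = """
- [Authentication](https://gofastmcp.com/servers/auth/authentication)
- [Tools](https://gofastmcp.com/servers/tools)
"""
print(find_doc_urls(sample_index, "authentication"))
```

Each returned URL is then passed to web_fetch, as in step 3.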
For servers with 5+ capabilities:
Three-tier loading:
Achieves 85-93% baseline reduction. See references/PROGRESSIVE_DISCLOSURE.md
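A minimal sketch of what the three tiers could look like, assuming they are discover, load, and execute (names inferred from the gateway verification checklist in this document). The registry, capability names, and handlers are hypothetical. In a real server each function would be registered as a FastMCP tool, which is omitted here to keep the sketch dependency-free.

```python
from typing import Any, Callable

# Hypothetical capability registry: short metadata, full docs, and a handler.
CAPABILITIES: dict[str, dict[str, Any]] = {
    "search": {
        "summary": "Full-text search",
        "docs": "search(query: str) -> list of matching item names.",
        "handler": lambda query: [n for n in ("apple", "apricot", "banana") if query in n],
    },
    "count": {
        "summary": "Count items",
        "docs": "count() -> number of items.",
        "handler": lambda: 3,
    },
}


def discover() -> dict[str, str]:
    """Tier 1: return metadata only (name -> one-line summary)."""
    return {name: cap["summary"] for name, cap in CAPABILITIES.items()}


def load(name: str) -> str:
    """Tier 2: fetch the full documentation for one capability on demand."""
    return CAPABILITIES[name]["docs"]


def execute(name: str, **params: Any) -> Any:
    """Tier 3: run the capability without loading its docs into context."""
    handler: Callable[..., Any] = CAPABILITIES[name]["handler"]
    return handler(**params)


print(discover())
print(load("search"))
print(execute("search", query="ap"))
```

Only the output of `discover` needs to sit in the baseline context. The longer documentation is pulled in through `load` when a capability is actually needed.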
Read LLMS_TXT.md → Find relevant URLs → web_fetch documentation
Load appropriate reference based on architecture decision. Apply all four mandatory patterns.
cd /home/claude
zip -r server-name.mcpb manifest.json server.py README.md
cp server-name.mcpb /mnt/user-data/outputs/
See references/MCPB_BUNDLING.md for manifest format.
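The same archive can be produced with Python's standard zipfile module. This is a sketch of the zip step only, not the contents of scripts/create_mcpb.py. `bundle_server` is a hypothetical helper, the file list mirrors the zip command above, and the demo writes placeholder files to a temporary directory.

```python
import tempfile
import zipfile
from pathlib import Path

# Files named in the zip command above.
BUNDLE_FILES = ("manifest.json", "server.py", "README.md")


def bundle_server(source_dir: Path, output_path: Path) -> Path:
    """Write the bundle files from source_dir into a .mcpb (zip) archive."""
    missing = [f for f in BUNDLE_FILES if not (source_dir / f).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing bundle files: {', '.join(missing)}")
    with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in BUNDLE_FILES:
            # Store each file at the archive root, as the zip command does.
            archive.write(source_dir / name, arcname=name)
    return output_path


# Demo with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp)
    for name in BUNDLE_FILES:
        (src / name).write_text("placeholder")
    bundle = bundle_server(src, src / "server-name.mcpb")
    with zipfile.ZipFile(bundle) as archive:
        names = sorted(archive.namelist())
print(names)
```

Failing early on a missing file avoids shipping a bundle without its manifest.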
Documentation index (load first for FastMCP knowledge):
Core patterns:
Implementation:
Scripts:
scripts/create_mcpb.py - Bundle MCP servers into .mcpb files
Before completing any FastMCP implementation:
✓ Uses uv (not pip)
✓ FastMCP docs fetched from LLMS_TXT.md URLs (not web_search)
✓ Tool annotations (readOnlyHint, title, openWorldHint)
✓ Annotated parameters with Field
✓ Single-sentence docstrings
✓ 65-70% token reduction versus verbose tool descriptions
✓ Server instructions concise (<100 chars)
For gateway implementations, additionally verify:
✓ 85%+ baseline context reduction
✓ Discover returns metadata only
✓ Load fetches content on demand
✓ Execute runs without context cost
Before (180 tokens):
@mcp.tool()
async def search_items(query: str):
    """Search for items in the database.
    This tool allows comprehensive searching..."""
After (55 tokens):
from typing import Annotated
from fastmcp import Context
from pydantic import Field

@mcp.tool(
    annotations={"title": "Search", "readOnlyHint": True, "openWorldHint": False}
)
async def search_items(
    query: Annotated[str, Field(description="Search text")],
    ctx: Context = None
):
    """Search items. Fast full-text search across all fields."""
❌ Using mcpb pack CLI (causes crashes, just use zip)
❌ Using pip instead of uv
❌ web_search for FastMCP docs (use web_fetch on LLMS_TXT.md URLs)
❌ Verbose tool descriptions
❌ Missing tool annotations
❌ Gateway for 1-3 tools (overhead exceeds benefit)
❌ Mixing unrelated capabilities in single gateway