agent-patterns-plugin/skills/mcp-code-execution/SKILL.md
Scaffold the code execution pattern for MCP-based agents. Use when agents call many MCP tools, intermediate data exceeds context, you need loops, or PII must stay out of context.
Expert knowledge for designing agent systems that generate and execute code to interact with MCP servers, instead of calling tools directly.
| Use this skill when... | Use mcp-management instead when... |
|---|---|
| Designing agents that fan out across 10+ MCP servers or 50+ tools | Installing or configuring a single MCP server in .mcp.json |
| Intermediate tool results are large (>10K tokens) and would blow context | Operating one or two servers where every result is small enough to inline |
| Workflows need loops, retries, or conditionals across tool calls | Doing a one-shot connection check or linear 2–3-call sequence |
| PII must not reach the model context | Tool responses contain no sensitive data |

| Use code execution when... | Use direct tool calls when... |
|----------------------------|-------------------------------|
| Connecting to 10+ MCP servers or 50+ tools | Few servers with a handful of tools |
| Intermediate results are large (>10K tokens) | Results are small and all needed by the model |
| Workflows need loops, retries, or conditionals | Linear sequences of 2-3 tool calls |
| PII must not reach the model context | No sensitive data in tool responses |
| Tasks benefit from state persistence across runs | Stateless, one-shot operations |
| You want agents to accumulate reusable skills | Fixed, predefined workflows |

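Instead of loading all MCP tool definitions into the model context upfront, the agent explores a file tree of typed wrappers, reads only the definitions a task needs, and writes code that runs in a sandbox, so only filtered output returns to the model.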
This reduces token usage from O(all_tool_definitions) to O(only_relevant_imports).
project/
├── servers/
│ ├── google-drive/
│ │ ├── getDocument.ts
│ │ ├── getSheet.ts
│ │ ├── listFiles.ts
│ │ └── index.ts # Re-exports all tools
│ ├── salesforce/
│ │ ├── query.ts
│ │ ├── updateRecord.ts
│ │ └── index.ts
│ └── slack/
│ ├── sendMessage.ts
│ ├── getChannelHistory.ts
│ └── index.ts
├── skills/ # Agent-accumulated reusable functions
│ └── save-sheet-as-csv.ts
├── workspace/ # Persistent state between executions
├── client.ts # MCP client that routes calls to servers
└── sandbox.config.ts # Execution environment configuration
Each MCP tool gets a typed wrapper function that the agent imports:
// servers/google-drive/getDocument.ts
import { callMCPTool } from "../../client.js";
interface GetDocumentInput {
documentId: string;
}
interface GetDocumentResponse {
content: string;
}
/** Read a document from Google Drive */
export async function getDocument(
input: GetDocumentInput
): Promise<GetDocumentResponse> {
return callMCPTool<GetDocumentResponse>("google_drive__get_document", input);
}
The agent then writes code that uses these wrappers naturally:
import * as gdrive from "./servers/google-drive";
import * as salesforce from "./servers/salesforce";
const transcript = (
await gdrive.getDocument({ documentId: "abc123" })
).content;
await salesforce.updateRecord({
objectType: "SalesMeeting",
recordId: "00Q5f000001abcXYZ",
data: { Notes: transcript },
});
The agent navigates the filesystem to find relevant tools on demand, instead of loading all definitions upfront.
Agent: "I need to read from Google Drive"
→ ls servers/
→ ls servers/google-drive/
→ cat servers/google-drive/getDocument.ts (reads signature + JSDoc)
→ generates code importing only getDocument
Token impact: 150,000 tokens (all definitions) reduced to ~2,000 tokens (one definition). 98.7% reduction.
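The discovery walk above can be approximated in code. The sketch below is illustrative only: it models the `servers/` tree as an in-memory list so it runs without a filesystem, and `ToolFile`, `listServers`, `listTools`, `readSignature`, and the sample entries are hypothetical names, not part of any MCP SDK.

```typescript
// Hypothetical in-memory stand-in for the servers/ tree.
type ToolFile = { path: string; source: string };

const tree: ToolFile[] = [
  {
    path: "servers/google-drive/getDocument.ts",
    source:
      "/** Read a document from Google Drive */\n" +
      "export async function getDocument(input: GetDocumentInput): Promise<GetDocumentResponse>",
  },
  {
    path: "servers/google-drive/listFiles.ts",
    source:
      "/** List files in a Drive folder */\n" +
      "export async function listFiles(input: ListFilesInput): Promise<ListFilesResponse>",
  },
  {
    path: "servers/slack/sendMessage.ts",
    source:
      "/** Send a message to a Slack channel */\n" +
      "export async function sendMessage(input: SendMessageInput): Promise<void>",
  },
];

// Step 1: list server directories (the equivalent of `ls servers/`).
function listServers(files: ToolFile[]): string[] {
  return Array.from(new Set(files.map((f) => f.path.split("/")[1]))).sort();
}

// Step 2: list the tool files of one server (`ls servers/<name>/`).
function listTools(files: ToolFile[], server: string): string[] {
  return files
    .filter((f) => f.path.startsWith(`servers/${server}/`))
    .map((f) => f.path);
}

// Step 3: read only the JSDoc line and signature of one tool (`cat <file>`).
function readSignature(files: ToolFile[], path: string): string {
  const file = files.find((f) => f.path === path);
  if (!file) throw new Error(`Unknown tool file: ${path}`);
  return file.source
    .split("\n")
    .filter((line) => line.startsWith("/**") || line.startsWith("export "))
    .join("\n");
}

console.log(listServers(tree));
console.log(listTools(tree, "google-drive"));
console.log(readSignature(tree, "servers/google-drive/getDocument.ts"));
```

Only the output of the last call, a single signature, would need to enter the model context; the other definitions stay on disk.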
Filter large datasets in the execution environment before results reach the model:
// Filter in the sandbox; only the summary reaches the model
import * as gdrive from "./servers/google-drive";
const allRows = await gdrive.getSheet({ sheetId: "abc123" });
const pending = allRows.filter((row) => row["Status"] === "pending");
console.log(`Found ${pending.length} pending orders`);
console.log(pending.slice(0, 5)); // Only first 5 for model review
Replace chained tool calls with code-native loops and conditionals:
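// Polling loop, bounded so it cannot spin forever; runs entirely in the sandbox
import * as slack from "./servers/slack";
const maxAttempts = 60; // 60 attempts x 5 s = 5 minutes
let found = false;
for (let attempt = 0; attempt < maxAttempts && !found; attempt++) {
  const messages = await slack.getChannelHistory({ channel: "C123456" });
  found = messages.some((m) => m.text.includes("deployment complete"));
  if (!found) await new Promise((r) => setTimeout(r, 5000));
}
console.log(found ? "Deployment notification received" : "Timed out waiting for deployment");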
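Polling loops like the one above benefit from an attempt limit and backoff, so a message that never arrives cannot hold the sandbox until its hard timeout. A minimal sketch, assuming capped exponential backoff is acceptable for the workload; `backoffDelays` and `pollUntil` are hypothetical helpers, not part of the scaffold.

```typescript
// Delay schedule: baseMs * 2^attempt, capped at capMs.
function backoffDelays(maxAttempts: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, attempt)));
  }
  return delays;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `check` once per entry in `delays`, sleeping between attempts.
// Resolves to true on the first success, false if every attempt fails.
async function pollUntil(
  check: () => Promise<boolean>,
  delays: number[]
): Promise<boolean> {
  for (let i = 0; i < delays.length; i++) {
    if (await check()) return true;
    // No need to wait after the final attempt.
    if (i < delays.length - 1) await sleep(delays[i]);
  }
  return false;
}

console.log(backoffDelays(5, 1000, 5000));
```

In the sandbox, `check` would wrap the `getChannelHistory` call and return whether the expected message is present.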
The MCP client intercepts responses and tokenizes sensitive data before it reaches the model:
// Agent writes this code
for (const row of sheet.rows) {
await salesforce.updateRecord({
objectType: "Lead",
recordId: row.salesforceId,
data: { Email: row.email, Phone: row.phone, Name: row.name },
});
}
console.log(`Updated ${sheet.rows.length} leads`);
What the model sees in the execution output:
[
{ salesforceId: "00Q...", email: "[EMAIL_1]", phone: "[PHONE_1]", name: "[NAME_1]" },
{ salesforceId: "00Q...", email: "[EMAIL_2]", phone: "[PHONE_2]", name: "[NAME_2]" }
]
Updated 247 leads
The actual PII flows between external systems without entering model context.
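One possible design for the tokenization step is a small class that swaps detected values for placeholders and remembers the mapping, so real values can be restored when data leaves the sandbox for another tool. This is a sketch under loud assumptions: `PiiTokenizer` is a hypothetical name, and the two regular expressions are illustrative only and would miss many real-world formats.

```typescript
// Hypothetical tokenizer: replaces PII with placeholders such as [EMAIL_1]
// and keeps the mapping so outgoing tool calls can be restored.
class PiiTokenizer {
  private readonly tokenToValue = new Map<string, string>();
  private readonly valueToToken = new Map<string, string>();
  private readonly counters = new Map<string, number>();

  // Illustrative patterns only; real PII detection needs far more care.
  private readonly patterns: Array<[string, RegExp]> = [
    ["EMAIL", /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g],
    ["PHONE", /\+?\d[\d\s().-]{7,}\d/g],
  ];

  // Replace every detected value in `text` with a stable placeholder.
  tokenize(text: string): string {
    let out = text;
    for (const [kind, pattern] of this.patterns) {
      out = out.replace(pattern, (match) => this.tokenFor(kind, match));
    }
    return out;
  }

  // Restore real values; unknown placeholders are left untouched.
  detokenize(text: string): string {
    return text.replace(
      /\[[A-Z]+_\d+\]/g,
      (token) => this.tokenToValue.get(token) ?? token
    );
  }

  // The same value always maps to the same placeholder.
  private tokenFor(kind: string, value: string): string {
    const existing = this.valueToToken.get(value);
    if (existing) return existing;
    const next = (this.counters.get(kind) ?? 0) + 1;
    this.counters.set(kind, next);
    const token = `[${kind}_${next}]`;
    this.tokenToValue.set(token, value);
    this.valueToToken.set(value, token);
    return token;
  }
}

const tokenizer = new PiiTokenizer();
const masked = tokenizer.tokenize("Contact ada@example.com or +1 555 010 9999");
console.log(masked); // Contact [EMAIL_1] or [PHONE_1]
console.log(tokenizer.detokenize(masked));
```

Keeping the mapping inside the client process is what lets the model refer to `[EMAIL_1]` in later code without ever seeing the address itself.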
Save intermediate results to the workspace for cross-execution continuity:
import * as fs from "fs/promises";
// Execution 1: fetch and save
const leads = await salesforce.query({
  query: "SELECT Id, Email FROM Lead LIMIT 1000",
});
await fs.writeFile("./workspace/leads.csv", leads.map((l) => `${l.Id},${l.Email}`).join("\n"));
// Execution 2: resume from saved state
const saved = await fs.readFile("./workspace/leads.csv", "utf-8");
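Saved state is most useful when a later execution can tell what is already done. The sketch below shows one way to do that with a JSON checkpoint file; the `Checkpoint` shape and the helper names are assumptions for illustration, and the demo writes to a temporary directory where the scaffold would use `./workspace`.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical checkpoint shape: record IDs handled by earlier executions.
interface Checkpoint {
  processedIds: string[];
}

function loadCheckpoint(file: string): Checkpoint {
  if (!existsSync(file)) return { processedIds: [] };
  return JSON.parse(readFileSync(file, "utf-8")) as Checkpoint;
}

function saveCheckpoint(file: string, checkpoint: Checkpoint): void {
  writeFileSync(file, JSON.stringify(checkpoint));
}

// Handle only IDs not seen before, saving progress after each one so an
// interrupted run can resume where it stopped.
function processPending(
  file: string,
  allIds: string[],
  handle: (id: string) => void
): string[] {
  const checkpoint = loadCheckpoint(file);
  const done = new Set(checkpoint.processedIds);
  const handled: string[] = [];
  for (const id of allIds) {
    if (done.has(id)) continue;
    handle(id);
    checkpoint.processedIds.push(id);
    saveCheckpoint(file, checkpoint);
    handled.push(id);
  }
  return handled;
}

// Demo: two "executions" sharing one checkpoint file.
const checkpointFile = join(mkdtempSync(join(tmpdir(), "workspace-")), "checkpoint.json");
const firstRun = processPending(checkpointFile, ["a", "b"], () => {});
const secondRun = processPending(checkpointFile, ["a", "b", "c"], () => {});
console.log(firstRun, secondRun);
```

The second call skips `a` and `b` because the first call recorded them, which is the cross-execution continuity the workspace is meant to provide.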
Agents persist reusable functions as skills for future executions:
// skills/save-sheet-as-csv.ts
import * as gdrive from "../servers/google-drive";
import * as fs from "fs/promises";
export async function saveSheetAsCsv(sheetId: string): Promise<string> {
const data = await gdrive.getSheet({ sheetId });
const csv = data.map((row) => row.join(",")).join("\n");
const path = `./workspace/sheet-${sheetId}.csv`;
await fs.writeFile(path, csv);
return path;
}
Later executions import the skill directly:
import { saveSheetAsCsv } from "./skills/save-sheet-as-csv";
const csvPath = await saveSheetAsCsv("abc123");
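The skill above joins cells with plain commas, which produces a malformed file as soon as a cell contains a comma, a double quote, or a line break. A small encoder in the style of RFC 4180 avoids that; `escapeCsvCell` and `toCsv` are hypothetical helpers, and the sketch assumes every cell is already a string.

```typescript
// Quote a cell when it contains a comma, double quote, or line break,
// doubling any embedded double quotes.
function escapeCsvCell(cell: string): string {
  return /[",\r\n]/.test(cell) ? `"${cell.replace(/"/g, '""')}"` : cell;
}

// Encode rows of string cells as CSV text.
function toCsv(rows: string[][]): string {
  return rows.map((row) => row.map(escapeCsvCell).join(",")).join("\n");
}

console.log(toCsv([["id", "note"], ["1", 'said "hi", then left']]));
```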
List the MCP servers the agent needs to interact with. Check .mcp.json or the project's MCP configuration:
cat .mcp.json 2>/dev/null || echo "No MCP config found"
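For each MCP server, create a directory with typed wrappers. Each tool gets its own file with typed input and response interfaces, a JSDoc description, and an exported async function that delegates to callMCPTool.
The client routes callMCPTool calls to the appropriate MCP server: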
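// client.ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
// Connected clients keyed by server name (e.g. "google_drive"); populate at startup.
export const clients = new Map<string, Client>();
export async function callMCPTool<T>(
  toolName: string,
  input: Record<string, unknown>
): Promise<T> {
  // Tool names follow "<server>__<tool>", e.g. "google_drive__get_document".
  const serverName = toolName.split("__")[0];
  const client = clients.get(serverName);
  if (!client) throw new Error(`No MCP client for server: ${serverName}`);
  // Assumes the server registers the tool under the full prefixed name.
  const result = await client.callTool({ name: toolName, arguments: input });
  if (result.isError) throw new Error(`MCP tool failed: ${toolName}`);
  // result.content is a list of content blocks, not the typed payload.
  // Assumes the tool returns its payload as JSON in the first text block.
  const blocks = result.content as Array<{ type: string; text?: string }>;
  const text = blocks.find((b) => b.type === "text")?.text;
  if (text === undefined) throw new Error(`No text content from ${toolName}`);
  return JSON.parse(text) as T;
}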
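The examples in this document imply a naming convention: a tool called `google_drive__get_document` lives in `servers/google-drive/getDocument.ts`. The sketch below makes that mapping explicit. The convention is inferred from those examples rather than stated anywhere, and `parseToolName`, `toCamelCase`, and `wrapperPath` are hypothetical helpers.

```typescript
// Assumed convention: "<server>__<tool>", both parts in snake_case.
function parseToolName(toolName: string): { server: string; tool: string } {
  const separator = toolName.indexOf("__");
  if (separator <= 0 || separator + 2 >= toolName.length) {
    throw new Error(`Expected "<server>__<tool>", got: ${toolName}`);
  }
  return {
    server: toolName.slice(0, separator),
    tool: toolName.slice(separator + 2),
  };
}

// snake_case -> camelCase, e.g. "get_document" -> "getDocument".
function toCamelCase(snake: string): string {
  return snake.replace(/_+([a-z0-9])/g, (_match, ch: string) => ch.toUpperCase());
}

// Wrapper file path for a tool, with the server directory in kebab-case.
function wrapperPath(toolName: string): string {
  const { server, tool } = parseToolName(toolName);
  return `servers/${server.replace(/_/g, "-")}/${toCamelCase(tool)}.ts`;
}

console.log(wrapperPath("google_drive__get_document"));
// servers/google-drive/getDocument.ts
```

Validating the name before the lookup gives a clearer error than a missing-client message when a wrapper passes a malformed tool name.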
The execution environment needs:
| Concern | Requirement |
|---------|-------------|
| Isolation | Process-level or container-level sandboxing |
| Resource limits | CPU time, memory caps, disk quotas |
| Network | Restrict to MCP server connections only |
| Timeout | Hard execution time limit per run |
| Filesystem | Scoped to workspace/ and servers/ directories |
| Monitoring | Log all executions and MCP calls |
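The requirements in the table could be captured in `sandbox.config.ts` along the following lines. Every field name and value here is a placeholder assumption, since the actual shape depends on the sandbox runtime in use; the `isPathAllowed` check illustrates the filesystem scoping row for relative POSIX-style paths only.

```typescript
// Hypothetical config shape; field names and values are illustrative.
interface SandboxConfig {
  timeoutMs: number; // hard execution time limit per run
  memoryLimitMb: number; // memory cap
  allowedHosts: string[]; // MCP server endpoints only
  readOnlyPaths: string[]; // e.g. servers
  writablePaths: string[]; // e.g. workspace
}

const config: SandboxConfig = {
  timeoutMs: 30000,
  memoryLimitMb: 512,
  allowedHosts: ["localhost"],
  readOnlyPaths: ["servers"],
  writablePaths: ["workspace"],
};

type Access = "read" | "write";

// Resolve "." and ".." segments; returns null if the path escapes the root.
function normalizePath(path: string): string | null {
  const parts: string[] = [];
  for (const segment of path.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      if (parts.length === 0) return null;
      parts.pop();
    } else {
      parts.push(segment);
    }
  }
  return parts.join("/");
}

// Writes are limited to writable roots; reads may also use read-only roots.
function isPathAllowed(cfg: SandboxConfig, path: string, access: Access): boolean {
  const normalized = normalizePath(path);
  if (normalized === null) return false;
  const roots =
    access === "write"
      ? cfg.writablePaths
      : cfg.writablePaths.concat(cfg.readOnlyPaths);
  return roots.some(
    (root) => normalized === root || normalized.startsWith(`${root}/`)
  );
}

console.log(isPathAllowed(config, "workspace/leads.csv", "write"));
console.log(isPathAllowed(config, "workspace/../servers/slack/index.ts", "write"));
```

A string-level check like this is a convenience, not a security boundary. The isolation itself has to come from the process or container sandbox.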
The agent loop becomes:
1. Receive user request
2. Agent explores servers/ tree to find relevant tools
3. Agent generates TypeScript code using typed wrappers
4. Code executes in sandbox
5. Filtered output returns to agent
6. Agent decides: done, or generate more code?
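The loop above can be sketched with the model and the sandbox injected as functions, which keeps the control flow testable without either. `Planner`, `Sandbox`, `Step`, and `runAgent` are hypothetical names, and the step cap is an added safeguard that the list does not mention.

```typescript
// Hypothetical interfaces; a real system would back these with a model API
// and a sandboxed runtime.
interface ExecutionResult {
  output: string; // filtered output returned from the sandbox
}

interface Step {
  done: boolean;
  code?: string; // TypeScript to execute when not done
  answer?: string; // final answer when done
}

type Planner = (request: string, history: ExecutionResult[]) => Promise<Step>;
type Sandbox = (code: string) => Promise<ExecutionResult>;

async function runAgent(
  request: string,
  plan: Planner,
  execute: Sandbox,
  maxSteps = 5
): Promise<string> {
  const history: ExecutionResult[] = [];
  for (let step = 0; step < maxSteps; step++) {
    // Steps 2, 3 and 6: explore, generate code, or decide the task is done.
    const next = await plan(request, history);
    if (next.done) return next.answer ?? "";
    if (!next.code) throw new Error("Planner returned neither code nor an answer");
    // Steps 4 and 5: run in the sandbox and keep only the filtered output.
    history.push(await execute(next.code));
  }
  throw new Error(`No answer after ${maxSteps} steps`);
}

// Stubs standing in for the model and the sandbox.
const stubPlanner: Planner = async (_request, history) =>
  history.length === 0
    ? { done: false, code: "console.log(1 + 1)" }
    : { done: true, answer: `Sandbox printed: ${history[0].output}` };
const stubSandbox: Sandbox = async () => ({ output: "2" });

runAgent("add numbers", stubPlanner, stubSandbox).then((answer) =>
  console.log(answer)
);
```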
| Item | Status |
|------|--------|
| Sandboxed execution environment | Required |
| Resource limits (CPU, memory, disk) | Required |
| Network isolation (MCP servers only) | Required |
| Execution timeout | Required |
| PII tokenization in MCP client | Recommended for sensitive data |
| Audit logging of all executions | Recommended |
| Read-only access to servers/ | Recommended |
| Scoped write access to workspace/ only | Recommended |

| Context | Approach |
|---------|----------|
| Many tools (50+) | Use progressive discovery via file tree |
| Large intermediate data | Filter in sandbox, return summaries |
| Multi-step workflows | Generate single code block with control flow |
| Sensitive data pipelines | Enable PII tokenization in MCP client |
| Long-running tasks | Use workspace/ for state persistence |
| Repeated operations | Extract to skills/ for reuse |

| Approach | Tool definitions | Intermediate data | Total token usage |
|----------|-----------------|-------------------|-------------------|
| Direct tool calls | All loaded upfront | Passes through context | High |
| Code execution | On-demand discovery | Stays in sandbox | Low |