skills/claude-4-6-features/code-execution/SKILL.md
Use Claude's Code Execution tool to run Python in a sandboxed environment as part of a response — for calculation, data analysis, chart generation, and verification. Covers enabling, file upload, persistence across turns, and limitations. Use this skill when building features that need Claude to actually run code (not just write it), such as data analysis, math verification, or chart creation. Activate when: Claude code execution, Python sandbox, run code tool, data analysis agent, code interpreter, code_execution_20250522.
npx skillsauth add latestaiagents/agent-skills code-execution
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
The Code Execution tool runs Python in an Anthropic-hosted sandbox as part of a model response. Use it when you need the model to actually compute, not just describe.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const response = await client.beta.messages.create(
  {
    model: "claude-sonnet-4-6",
    max_tokens: 4096,
    tools: [{ type: "code_execution_20250522", name: "code_execution" }],
    messages: [{ role: "user", content: "Analyze the attached CSV and plot monthly revenue." }],
  },
  { headers: { "anthropic-beta": "code-execution-2025-05-22" } },
);
No tool-use loop to manage — execution happens server-side and results flow back in the response.
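Because execution is server-side, the code the model ran arrives in the same `content` array as its text. The sketch below pulls that code out for logging or review. It assumes each run is reported as a `server_tool_use` block named `code_execution` with the source in `input.code`. `ContentBlock` is a simplified local type, not the SDK's own, and `executedCode` is a hypothetical helper.

```typescript
// Simplified stand-in for the SDK's content block types (assumption).
interface ContentBlock {
  type: string;
  name?: string;
  text?: string;
  input?: { code?: string };
}

// Return the source of every code_execution call in a response, in order.
function executedCode(content: ContentBlock[]): string[] {
  return content
    .filter((b) => b.type === "server_tool_use" && b.name === "code_execution")
    .map((b) => b.input?.code ?? "");
}

// Hand-written sample shaped like a response; not real API output.
const sampleContent: ContentBlock[] = [
  { type: "text", text: "Let me compute that." },
  { type: "server_tool_use", name: "code_execution", input: { code: "print(2 + 2)" } },
  { type: "code_execution_tool_result" },
  { type: "text", text: "The answer is 4." },
];

console.log(executedCode(sampleContent));
```

Keeping the executed source next to the final answer makes it easier to audit what was actually computed.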
Upload files with the Files API, then reference them in the message:
import fs from "fs";

const file = await client.beta.files.upload({ file: fs.createReadStream("sales.csv") });
const response = await client.beta.messages.create(
  {
    model: "claude-sonnet-4-6",
    max_tokens: 4096,
    tools: [{ type: "code_execution_20250522", name: "code_execution" }],
    messages: [
      {
        role: "user",
        content: [
          { type: "container_upload", file_id: file.id },
          { type: "text", text: "Compute YoY growth from sales.csv." },
        ],
      },
    ],
  },
  { headers: { "anthropic-beta": "code-execution-2025-05-22,files-api-2025-04-14" } },
);
The file appears at /mnt/user-data/ inside the sandbox.
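When a request needs more than one file, the user content array can be built programmatically. This is a minimal sketch, assuming several `container_upload` blocks may sit in one message ahead of the text prompt. `buildUploadContent` is a hypothetical helper and the file IDs are placeholders.

```typescript
type UploadBlock = { type: "container_upload"; file_id: string };
type TextBlock = { type: "text"; text: string };

// Build a user-message content array: one upload block per file, then the prompt.
function buildUploadContent(
  fileIds: string[],
  prompt: string,
): Array<UploadBlock | TextBlock> {
  if (fileIds.length === 0) {
    throw new Error("at least one file_id is required");
  }
  const uploads = fileIds.map(
    (file_id): UploadBlock => ({ type: "container_upload", file_id }),
  );
  return [...uploads, { type: "text", text: prompt }];
}

// Placeholder IDs; real ones come from the Files API upload response.
const uploadContent = buildUploadContent(
  ["file_placeholder_1", "file_placeholder_2"],
  "Join the two CSVs on customer_id and compute YoY growth.",
);
console.log(JSON.stringify(uploadContent, null, 2));
```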
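for (const block of response.content) {
  if (block.type !== "code_execution_tool_result") continue;
  const result = block.content;
  // A tool-level failure comes back as an error object instead of a result
  if (result.type !== "code_execution_result") {
    console.log("tool error:", result.error_code);
    continue;
  }
  console.log("stdout:", result.stdout);
  console.log("stderr:", result.stderr);
  if (result.return_code !== 0) console.log("FAILED");
  // Generated files are listed in result.content
  for (const file of result.content ?? []) {
    // file is a generated image/data file with file_id; download via Files API
  }
}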
Re-use the sandbox across calls by passing the container ID:
// First call creates a container
const first = await client.beta.messages.create({ /* ... */ });
const containerId = first.container?.id;

// Second call re-uses it: pip installs, written files, variables all persist
const second = await client.beta.messages.create({
  /* ... */
  container: containerId,
  messages: [
    ...firstMessages,
    { role: "user", content: "Now compute the 7-day rolling avg on that same DataFrame." },
  ],
});
Containers auto-expire after inactivity. Use persistence for multi-turn data analysis; skip it for one-shot.
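For multi-turn use, it helps to carry the container ID and the message history together. The sketch below keeps both in one state object and adds `container` to a request only once an ID is known. `SessionState`, `nextRequest`, and `recordResponse` are hypothetical names, and the response shape is reduced to the two fields used here.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: unknown;
}

interface SessionState {
  containerId?: string;
  messages: Turn[];
}

// Build the per-call fields; merge them with model, max_tokens, and tools.
function nextRequest(state: SessionState, userText: string) {
  const messages: Turn[] = [...state.messages, { role: "user", content: userText }];
  return {
    messages,
    // Omit `container` entirely on the first call so a new one is created.
    ...(state.containerId ? { container: state.containerId } : {}),
  };
}

// Fold a response back into the state so the next call reuses the container.
function recordResponse(
  state: SessionState,
  request: { messages: Turn[] },
  response: { content: unknown; container?: { id: string } | null },
): SessionState {
  return {
    containerId: response.container?.id ?? state.containerId,
    messages: [...request.messages, { role: "assistant", content: response.content }],
  };
}

// Walk-through with a stubbed response; "container_example" is a placeholder.
let session: SessionState = { messages: [] };
const firstRequest = nextRequest(session, "Load sales.csv into a DataFrame.");
session = recordResponse(session, firstRequest, {
  content: [{ type: "text", text: "Loaded." }],
  container: { id: "container_example" },
});
console.log(nextRequest(session, "Now compute the 7-day rolling avg."));
```

If a stored container has expired, resetting `containerId` to `undefined` makes the next call start a fresh one, at the cost of losing installed packages and in-memory variables.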
matplotlib figures are returned as image files the Files API can serve:
import matplotlib.pyplot as plt
plt.plot([1,2,3], [4,5,6])
plt.savefig("/tmp/out.png")
The tool result includes a file_id. Download and render:
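// download() resolves to a fetch Response; read the body before writing it out
const res = await client.beta.files.download(fileId);
await fs.promises.writeFile("out.png", Buffer.from(await res.arrayBuffer()));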
For analysis tasks, enable thinking so the model plans before coding:
const response = await client.beta.messages.create(
  {
    model: "claude-sonnet-4-6",
    max_tokens: 16_000,
    thinking: { type: "enabled", budget_tokens: 8000 },
    tools: [{ type: "code_execution_20250522", name: "code_execution" }],
    messages: [...],
  },
  { headers: { "anthropic-beta": "code-execution-2025-05-22" } },
);
Enable interleaved thinking (see the extended-thinking skill) so the model can reflect on each code output before taking the next step.
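The thinking budget and `max_tokens` have to be chosen together. The sketch below derives one from the other, assuming the usual constraints for non-interleaved thinking: a budget of at least 1024 tokens that stays below `max_tokens`. `thinkingFor` and its `share` parameter are hypothetical, and the limits should be checked against the current API documentation.

```typescript
interface ThinkingConfig {
  type: "enabled";
  budget_tokens: number;
}

// Reserve a share of max_tokens for thinking, rejecting budgets the API
// is assumed to refuse (below 1024, or not below max_tokens).
function thinkingFor(maxTokens: number, share = 0.5): ThinkingConfig {
  const budget = Math.floor(maxTokens * share);
  if (budget < 1024) {
    throw new Error(`thinking budget ${budget} is below the assumed 1024 minimum`);
  }
  if (budget >= maxTokens) {
    throw new Error("thinking budget must leave room for the visible answer");
  }
  return { type: "enabled", budget_tokens: budget };
}

// With max_tokens of 16,000 and the default share, the budget is 8000.
console.log(thinkingFor(16_000));
```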
pip install at start
The sandbox is isolated, so user data doesn't leak between containers. Your risk is what the model itself might do inside the sandbox. Scrub outputs before persisting them (see mcp-security-sandboxing for redaction patterns).
Unchecked return_code: the model can confidently report results from failed code.
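One way to guard against that is to scan the response for failed runs before trusting the model's summary. This is a minimal sketch, assuming each `code_execution_tool_result` block carries either a `code_execution_result` with a `return_code` or an error object with an `error_code`. The types are simplified local ones and `failedExecutions` is a hypothetical helper.

```typescript
// Simplified stand-ins for the SDK's result types (assumption).
interface ExecutionResult {
  type: string;
  return_code?: number;
  stderr?: string;
  error_code?: string;
}

interface ResponseBlock {
  type: string;
  content?: unknown;
}

// Describe every execution that failed; an empty array means all runs succeeded.
function failedExecutions(content: ResponseBlock[]): string[] {
  const failures: string[] = [];
  for (const block of content) {
    if (block.type !== "code_execution_tool_result") continue;
    const result = block.content as ExecutionResult;
    if (result.type !== "code_execution_result") {
      failures.push(`tool error: ${result.error_code ?? "unknown"}`);
    } else if (result.return_code !== 0) {
      failures.push(`exit ${result.return_code}: ${(result.stderr ?? "").trim()}`);
    }
  }
  return failures;
}

// Hand-written sample: one successful run, one that raised.
const sampleBlocks: ResponseBlock[] = [
  { type: "code_execution_tool_result", content: { type: "code_execution_result", return_code: 0, stderr: "" } },
  { type: "code_execution_tool_result", content: { type: "code_execution_result", return_code: 1, stderr: "ZeroDivisionError\n" } },
  { type: "text" },
];
console.log(failedExecutions(sampleBlocks));
```

A caller could refuse to display the answer, or ask the model to retry, whenever the returned list is non-empty.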