apps/docs/skills/shell-execution-sandbox/SKILL.md
Enable and configure the sandboxed shell execution tool with command allowlists, Docker isolation, and audit logging for agents that run terminal commands.
Install this skill globally with one command: `npx skillsauth add tylerjrbuell/reactive-agents-ts shell-execution-sandbox`. Works with Claude Code, Cursor, and Windsurf.
Disclaimer — your machine, your risk.
shell-execute runs real processes on the host (or in an optional Docker sandbox you configure). Allowlists and blocklists reduce accidents but are not a guarantee. Only enable for trusted codebases and accounts; review allowed command names and working-directory rules before production use. Cortex exposes this as an explicit opt-in in the Lab builder with the same warning.
Produce a builder with shell execution enabled, the correct allowlist for the task, and appropriate safety config — without exposing destructive commands.
```typescript
const agent = await ReactiveAgents.create()
  .withProvider("anthropic")
  .withReasoning({ defaultStrategy: "plan-execute-reflect", maxIterations: 15 })
  .withTools({
    allowedTools: ["shell-execute", "file-read", "checkpoint"],
    terminal: true, // registers shell-execute handler (or use .withTerminalTools())
  })
  .withSystemPrompt(`
    You have access to a shell. Use it to explore the codebase and run commands.
    Always checkpoint important findings before continuing.
  `)
  .build();
```
The shell-execute tool blocks any command not on the allowlist. Default allowed commands:
git, ls, cat, grep, find, echo, printf
mkdir, cp, mv, touch
wc, head, tail, sort, uniq, cut, tr, tee, diff, sed, awk, jq
pwd, date, which, basename, dirname, test, true, false
seq, gzip, gunzip, zip, unzip
Explicitly excluded: rm, chmod, chown — too destructive for agent sandboxes.
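To make the allowlist rule concrete, here is a minimal sketch of how such a check could work. It is not the actual `@reactive-agents/tools` implementation: `commandName` and `isCommandAllowed` are hypothetical helpers, and the assumption that the command is the first whitespace-delimited token (with any path prefix stripped) is mine.

```typescript
// Illustrative sketch only: not the actual @reactive-agents/tools implementation.
// Default allowlist copied from the list above.
const DEFAULT_ALLOWED: ReadonlySet<string> = new Set([
  "git", "ls", "cat", "grep", "find", "echo", "printf",
  "mkdir", "cp", "mv", "touch",
  "wc", "head", "tail", "sort", "uniq", "cut", "tr", "tee", "diff", "sed", "awk", "jq",
  "pwd", "date", "which", "basename", "dirname", "test", "true", "false",
  "seq", "gzip", "gunzip", "zip", "unzip",
]);

// Commands the document lists as explicitly excluded.
const HARD_EXCLUDED: ReadonlySet<string> = new Set(["rm", "chmod", "chown"]);

// Hypothetical helper: extract the executable name from a command line.
// Assumes the first whitespace-delimited token is the command; a path such as
// /bin/ls is reduced to its final segment.
function commandName(commandLine: string): string {
  const first = commandLine.trim().split(/\s+/)[0] ?? "";
  return first.split("/").pop() ?? "";
}

// Hypothetical helper: true only when the command is allowlisted and not excluded.
function isCommandAllowed(
  commandLine: string,
  allowed: ReadonlySet<string> = DEFAULT_ALLOWED,
): boolean {
  const cmd = commandName(commandLine);
  return cmd !== "" && allowed.has(cmd) && !HARD_EXCLUDED.has(cmd);
}
```

Under these assumptions, `isCommandAllowed("git status")` passes while `isCommandAllowed("rm -rf build")` and `isCommandAllowed("curl https://example.com")` are refused. A real implementation would also need to handle pipes, quoting, and subshells, which this sketch ignores.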
Build tools (Node, Bun, npm, Python, curl) are available but not on by default:
```typescript
// Available opt-in commands: node, bun, npm, npx, python, python3, curl, env, xargs, tar
// Add via ShellExecuteConfig.additionalCommands when registering the tool:
import { shellExecuteTool, shellExecuteHandler } from "@reactive-agents/tools";

const shellTool = {
  definition: shellExecuteTool,
  handler: shellExecuteHandler({
    additionalCommands: ["bun", "node", "npm"],
    timeoutMs: 60_000, // default 30s; increase for build commands
    maxOutputChars: 8_000, // default 4000
    cwd: "/workspace", // defaults to project root
  }),
};

const agent = await ReactiveAgents.create()
  .withTools({ tools: [shellTool], allowedTools: ["shell-execute"] })
  .build();
```
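One way to picture how `additionalCommands` might combine with the defaults is sketched below. `resolveAllowlist`, `OPT_IN_COMMANDS`, and `AllowlistResolution` are hypothetical names, and the rule that only the listed opt-in commands are accepted is an assumption rather than confirmed behavior of `@reactive-agents/tools`.

```typescript
// Illustrative sketch only: the real ShellExecuteConfig handling may differ.
// Opt-in commands as listed in the comment above.
const OPT_IN_COMMANDS: ReadonlySet<string> = new Set([
  "node", "bun", "npm", "npx", "python", "python3", "curl", "env", "xargs", "tar",
]);

interface AllowlistResolution {
  allowed: Set<string>; // defaults plus accepted additions
  rejected: string[]; // requested additions that were refused
}

// Hypothetical helper: merge additionalCommands into a default allowlist.
// Assumption: only names in OPT_IN_COMMANDS are accepted; anything else
// (including rm, chmod, chown) is refused.
function resolveAllowlist(
  defaults: Iterable<string>,
  additionalCommands: readonly string[] = [],
): AllowlistResolution {
  const allowed = new Set<string>(defaults);
  const rejected: string[] = [];
  for (const cmd of additionalCommands) {
    if (OPT_IN_COMMANDS.has(cmd)) {
      allowed.add(cmd);
    } else {
      rejected.push(cmd);
    }
  }
  return { allowed, rejected };
}

// Usage: "rm" is requested but refused, so it ends up in `rejected`.
const resolved = resolveAllowlist(["git", "ls", "cat"], ["bun", "node", "rm"]);
```

Returning the refused names instead of silently dropping them makes a misconfigured allowlist visible at startup rather than at the first blocked command.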
When dockerEscalation is enabled, inline code (Node --eval, Bun -e, Python -c) automatically routes through a Docker sandbox:
```typescript
shellExecuteHandler({
  additionalCommands: ["node", "python3"],
  dockerEscalation: {
    enabled: true,
    // Inline code execution is fully isolated in a fresh container
  },
})
```
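The routing decision described above can be pictured as a check for inline-evaluation flags. The sketch below is a guess at the shape of such a check, not the library's logic: `needsDockerEscalation` and `INLINE_EVAL_FLAGS` are hypothetical names, and the flag aliases beyond those named in the text (`-e` for Node, `--eval` for Bun) are added as assumptions.

```typescript
// Illustrative sketch only: the real escalation rules may differ.
// Flags named in the text above, plus their documented aliases
// (node -e, bun --eval), which are added here as an assumption.
const INLINE_EVAL_FLAGS: ReadonlyMap<string, readonly string[]> = new Map([
  ["node", ["--eval", "-e"]],
  ["bun", ["-e", "--eval"]],
  ["python", ["-c"]],
  ["python3", ["-c"]],
]);

// Hypothetical helper: true when the command line looks like inline code
// execution. Tokenization is naive (whitespace only), so combined short
// flags and quoted arguments containing spaces are not analyzed.
function needsDockerEscalation(commandLine: string): boolean {
  const tokens = commandLine.trim().split(/\s+/);
  const flags = INLINE_EVAL_FLAGS.get(tokens[0] ?? "");
  if (flags === undefined) return false;
  return tokens.slice(1).some((token) => flags.includes(token));
}
```

With this sketch, `node --eval "1+1"` and `python3 -c "print(1)"` would be escalated, while `node build.js` and `ls -c` would run through the normal path.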
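To limit the agent to read-only exploration, set allowedCommands to an explicit list:
```typescript
shellExecuteHandler({
  allowedCommands: ["ls", "cat", "grep", "find", "head", "tail", "wc"],
  // Only listing and reading: no writes, no execution
})
```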
To record every command the agent runs, pass an onAudit callback:
```typescript
shellExecuteHandler({
  onAudit: (entry: ShellAuditEntry) => {
    logger.info("shell-execute", {
      command: entry.command,
      exitCode: entry.exitCode,
      durationMs: entry.durationMs,
    });
  },
})
```
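If no logger is wired up yet, an in-memory collector is one simple way to consume audit entries. This is a sketch under stated assumptions: `AuditEntrySketch` is a local stand-in that carries only the three fields used above (the real `ShellAuditEntry` may have more), and `createAuditCollector` is a hypothetical helper.

```typescript
// Illustrative sketch only. The real ShellAuditEntry type may carry more
// fields; this local interface includes just the three used above.
interface AuditEntrySketch {
  command: string;
  exitCode: number;
  durationMs: number;
}

// Hypothetical collector: keeps entries in memory and summarizes them.
function createAuditCollector() {
  const entries: AuditEntrySketch[] = [];
  return {
    // Closes over `entries` rather than using `this`, so the function can be
    // passed directly as a callback.
    onAudit(entry: AuditEntrySketch): void {
      entries.push(entry);
    },
    summary() {
      const failed = entries.filter((e) => e.exitCode !== 0);
      const totalDurationMs = entries.reduce((sum, e) => sum + e.durationMs, 0);
      return { total: entries.length, failed: failed.length, totalDurationMs };
    },
  };
}

const collector = createAuditCollector();
collector.onAudit({ command: "git status", exitCode: 0, durationMs: 42 });
collector.onAudit({ command: "grep -r TODO src", exitCode: 1, durationMs: 120 });
const auditSummary = collector.summary();
// auditSummary is { total: 2, failed: 1, totalDurationMs: 162 }
```

In a real setup, `collector.onAudit` would be the value passed as `onAudit`, and the summary could be attached to the run report.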
The shell-execute built-in tool has these characteristics:
| Property | Value |
|----------|-------|
| riskLevel | "high" |
| requiresApproval | true |
| category | "system" |
| timeoutMs (default) | 30,000ms |
| maxOutputChars (default) | 4,000 chars |
| MAX_COMMAND_LENGTH | 4,096 chars |
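
The two size limits in the table can be illustrated with a short sketch. The constants come from the table; everything else is assumed: `checkCommandLength` and `truncateOutput` are hypothetical helpers, and keeping the head of the output when truncating is a guess about the behavior.

```typescript
// Illustrative sketch only: limits taken from the table above, behavior assumed.
const MAX_COMMAND_LENGTH = 4096;
const DEFAULT_MAX_OUTPUT_CHARS = 4000;

// Hypothetical helper: reject over-long commands before they run.
function checkCommandLength(commandLine: string): void {
  if (commandLine.length > MAX_COMMAND_LENGTH) {
    throw new Error(
      `Command is ${commandLine.length} chars; limit is ${MAX_COMMAND_LENGTH}`,
    );
  }
}

// Hypothetical helper: cap output and report whether anything was cut.
// Assumption: truncation keeps the head of the output.
function truncateOutput(
  output: string,
  maxOutputChars: number = DEFAULT_MAX_OUTPUT_CHARS,
): { text: string; truncated: boolean } {
  if (output.length <= maxOutputChars) {
    return { text: output, truncated: false };
  }
  return { text: output.slice(0, maxOutputChars), truncated: true };
}
```

Reporting a `truncated` flag alongside the text lets the agent know it should narrow the command (for example with `head` or a tighter `grep`) instead of reasoning over partial output.
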
| Method | Key params | Notes |
|--------|-----------|-------|
| .withTools({ tools, allowedTools }) | include "shell-execute" | Register custom handler for config |
| .withTools() | no args | Enables shell-execute but with requiresApproval: true |
- shell-execute has requiresApproval: true by default. In automated pipelines, register a custom handler with requiresApproval: false if no human approval flow is wired.
- git is allowed regardless of sub-command args; curl is opt-in.
- MAX_COMMAND_LENGTH is 4,096, so very long piped commands will be rejected.
- dockerEscalation routes inline code through a Docker sandbox; check before enabling in CI.
- rm, chmod, chown are hard-excluded and cannot be added via additionalCommands.
- maxOutputChars: 4000 truncates long output; increase it for commands that produce large output (e.g., git log, find on large trees).