claude/skills/test-sandbox/SKILL.md
Run Python test code in an isolated sandbox without polluting the main context. Writes test files to .claude/testing/ (gitignored), runs via sub-agent, and reports only pass/fail counts and assertion details. Use when you want to quickly verify code without writing inline python3 -c scripts. Also supports --sweep to clean stale test files. Use when the user says "run a quick test", "verify this works", "sanity check", "test this snippet", or invokes /test-sandbox.
npx skillsauth add paulnsorensen/dotfiles test-sandbox
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Run Python test code in an isolated, sandboxed environment without polluting the main context. Ideal for quick assertions and verification during development.
/test-sandbox "assert 1 + 1 == 2"
/test-sandbox "from src.orders import process_order; assert process_order({}) == expected"
/test-sandbox --file tests/test_edge_cases.py
/test-sandbox --sweep
- Writes the test to .claude/testing/test_<hash>.py (isolated from repo)
- Runs uv run pytest .claude/testing/test_<hash>.py --tb=short

The skill delegates to sub-agents to keep your main context clean: you only see the results, not the verbose test output or implementation details.
> /test-sandbox "assert 'hello'.upper() == 'HELLO'"
✓ Test passed: 1 assertion ran, all passed
Wrote: .claude/testing/test_abc123.py
> /test-sandbox "from src.auth import verify_token; assert verify_token('valid') is True; assert verify_token('bad') is False"
✓ Test passed: 2 assertions ran, all passed
> /test-sandbox "assert 1 + 1 == 3"
✗ Test failed: AssertionError
Expected: 3
Actual: 2
File: .claude/testing/test_xyz789.py (not cleaned up for inspection)
> /test-sandbox --sweep
Cleaned 7 stale test files (> 24 hours old)
`.claude/testing/` now contains 2 recent tests
| Flag | Behavior |
|------|----------|
| --sweep | Delete test files older than 24 hours. Does not run tests. |
| --file <path> | Run tests from an existing file instead of inline code. |
| --keep | Don't clean up test file after run (for inspection). |
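As a rough sketch of what --sweep might do, the function below deletes files older than 24 hours and returns the counts a report would need. The function name `sweep`, the use of file modification time as the age measure, and the `test_*.py` glob are assumptions for illustration; the source only states the 24-hour threshold.

```python
import time
from pathlib import Path
from typing import Optional

MAX_AGE_SECONDS = 24 * 60 * 60  # the 24-hour threshold from the flag table


def sweep(sandbox_dir: Path, now: Optional[float] = None) -> tuple[int, int]:
    """Delete stale test files and return (deleted, remaining)."""
    now = time.time() if now is None else now
    deleted = 0
    remaining = 0
    # Assumption: only generated test_*.py files are candidates for removal,
    # and "age" is measured from the file's last modification time.
    for path in sandbox_dir.glob("test_*.py"):
        if now - path.stat().st_mtime > MAX_AGE_SECONDS:
            path.unlink()
            deleted += 1
        else:
            remaining += 1
    return deleted, remaining
```

The `now` parameter exists only to make the function easy to test with a fixed clock.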
On first use, /test-sandbox automatically adds .claude/testing/ to .gitignore if not already present. No manual action needed.
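The idempotent .gitignore update could look something like the following. This is a sketch under stated assumptions: the helper name `ensure_gitignored` is hypothetical, and "already present" is taken to mean an exact line match, which may differ from what the skill actually checks.

```python
from pathlib import Path

IGNORE_ENTRY = ".claude/testing/"  # the path the document says is gitignored


def ensure_gitignored(repo_root: Path, entry: str = IGNORE_ENTRY) -> bool:
    """Append entry to .gitignore if missing; return True when the file changed."""
    gitignore = repo_root / ".gitignore"
    existing = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    # Assumption: an exact line match counts as "already present". Broader
    # patterns such as ".claude/" are not detected by this sketch.
    if entry in (line.strip() for line in existing.splitlines()):
        return False
    # Keep the existing content intact and make sure the entry starts a new line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    gitignore.write_text(f"{existing}{separator}{entry}\n", encoding="utf-8")
    return True
```

Calling it twice is safe: the second call finds the entry and leaves the file untouched.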
- uv run pytest, not a mock runner. Respects venv, fixtures, conftest.
- [...] $TMPDIR if needed.

Good for:

Not ideal for:

- [...] tests/ directory instead)
- [...] .claude/testing/)
- [...] conftest.py)

/test-sandbox "from src.module import fn; result = fn(); assert result > 0; print(f'Result: {result}')"
Separate with semicolons:
/test-sandbox "from mymodule import Cls; c = Cls(); assert c.x == 1; assert c.y == 2; assert c.z == 3"
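The reports shown earlier state how many assertions ran, but the source does not say how that number is produced. One possible approach, offered purely as an assumption, is to count `assert` statements statically with Python's `ast` module. The helper name `count_assertions` is hypothetical, and a static count is not the same as the number of assertions actually executed.

```python
import ast


def count_assertions(snippet: str) -> int:
    """Count assert statements in a snippet, including semicolon-separated ones."""
    # ast.parse only parses the code; nothing is imported or executed.
    tree = ast.parse(snippet)
    return sum(isinstance(node, ast.Assert) for node in ast.walk(tree))
```

Because semicolon-separated statements are ordinary Python, the one-line form used with /test-sandbox parses the same way as a multi-line file.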
/test-sandbox --file src/old_module.py # Run old module's internal test suite
/test-sandbox --keep "assert my_fn() == expected" # Don't delete file after failure
cat .claude/testing/test_*.py # Inspect the generated test
[...] --keep flag used)

See claude/CLAUDE.md for sub-agent delegation patterns and context discipline rules.
- [...] PYTHONPATH=.
- [...] tests/ are not available in .claude/testing/; copy needed fixtures
- uv must be installed; fall back to python -m pytest if unavailable
- [...] .claude/testing/ are gitignored but accumulate; use --sweep periodically
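The uv fallback noted above can be sketched as follows. The helper names `runner_command` and `runner_env` are hypothetical. The `PYTHONPATH=.` handling is an assumption about what the truncated note means (that the runner is launched with that variable set so imports such as `src.orders` resolve from the repo root); the source does not confirm it.

```python
import os
import shutil
import sys
from typing import Optional


def runner_command(test_file: str, have_uv: Optional[bool] = None) -> list[str]:
    """Prefer `uv run pytest`; fall back to `python -m pytest` without uv."""
    if have_uv is None:
        # shutil.which returns None when the executable is not on PATH.
        have_uv = shutil.which("uv") is not None
    if have_uv:
        return ["uv", "run", "pytest", test_file, "--tb=short"]
    return [sys.executable, "-m", "pytest", test_file, "--tb=short"]


def runner_env(base: Optional[dict] = None) -> dict:
    """Copy the environment and set PYTHONPATH=. (an assumed reading of the note)."""
    env = dict(os.environ if base is None else base)
    env["PYTHONPATH"] = "."
    return env
```

Both values can be handed to `subprocess.run(cmd, env=env)`; the `have_uv` override is there so the fallback branch can be exercised on a machine that has uv installed.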