skills/dev-test/SKILL.md
This skill should be used when the user needs to 'debug web applications', 'test UI interactions', 'capture screenshots or network requests', 'test desktop automation', or needs to select between testing tools. Routes to platform-specific E2E testing skills: Chrome MCP for debugging, Playwright for CI/CD, Hammerspoon for macOS, Linux for X11/Wayland.
```
Main Chat                          Task Agent
─────────────────────────────────────────────────────
/goal <condition> (set at phase entry; refires turns)
dev-implement (loads dev-tdd)
  → dev-delegate
    → Task agent ──────────────────→ uses dev-test (this skill)
                                       ↓ loads dev-tdd again
                                       has TDD protocol + gates
                                       → routes to specific tool
```
<EXTREMELY-IMPORTANT>
## Load TDD Enforcement (REQUIRED)
Before choosing testing tools, you MUST load the TDD skill to ensure gate compliance:
Read `${CLAUDE_SKILL_DIR}/../../skills/dev-tdd/SKILL.md` and follow its instructions.
This loads:
Read the dev-tdd skill content now, before selecting testing tools.
</EXTREMELY-IMPORTANT>
This skill routes to the right testing tool. The loaded dev-tdd skill provides TDD protocol details.
<EXTREMELY-IMPORTANT>
YOU MUST WRITE E2E TESTS FOR USER-FACING FEATURES. This is not negotiable.
When your changes affect what users see or interact with, you MUST:
Unit tests prove components work. E2E tests prove YOUR feature works for users.
If your "E2E test" does any of these, it's NOT E2E:
| Pattern | Why It's Fake | Real E2E Alternative |
|---------|---------------|----------------------|
| `grep "success" logs.txt` | Only proves code ran | Verify actual output file/UI/API response |
| `assert mock.called` | Tests mock, not real system | Use real integration, verify real data |
| `cat output.txt \| wc -l` | File exists ≠ correct content | Read file, assert exact expected content |
| "I ran it manually" | No automation = no evidence | Capture manual test as automated test |
| Check log for icon name | Observability, not verification | Screenshot + visual diff of rendered icon |
| Exit code 0 | Process succeeded ≠ output correct | Verify the actual output data |
The test: If removing the actual implementation still passes your "E2E test", it's fake.
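The removal test can be applied mechanically. The following is a minimal sketch, assuming a hypothetical feature that should write a `theme.conf` file; `real_set_theme`, `stub_set_theme`, `log_check`, `output_check`, and `survives_removal` are names invented for this illustration and are not part of any skill.

```python
import tempfile
from pathlib import Path


def real_set_theme(theme: str, out_dir: Path, log: list) -> None:
    """Hypothetical feature: logs the request and writes the config file."""
    log.append(f"set-theme {theme}")
    (out_dir / "theme.conf").write_text(f"icon-theme={theme}\n")


def stub_set_theme(theme: str, out_dir: Path, log: list) -> None:
    """The same feature with its implementation removed: it only logs."""
    log.append(f"set-theme {theme}")


def log_check(set_theme, out_dir: Path) -> bool:
    """Fake E2E: passes whenever the log mentions the theme."""
    log = []
    set_theme("papirus", out_dir, log)
    return any("papirus" in line for line in log)


def output_check(set_theme, out_dir: Path) -> bool:
    """Real E2E: passes only if the output file has the exact expected content."""
    log = []
    set_theme("papirus", out_dir, log)
    target = out_dir / "theme.conf"
    return target.exists() and target.read_text() == "icon-theme=papirus\n"


def survives_removal(check) -> bool:
    """Return True if `check` still passes once the implementation is stubbed out."""
    with tempfile.TemporaryDirectory() as tmp:
        return check(stub_set_theme, Path(tmp))
```

Calling `survives_removal(log_check)` returns `True`, which marks the log-based check as fake, while `survives_removal(output_check)` returns `False`.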
Example of fake E2E that caught nothing:
```python
# FAKE E2E - only checks logs
# run_command and read_logs are project-specific helpers (not defined here)
def test_icon_theme_change():
    run_command("set-theme papirus")
    logs = read_logs()
    assert "papirus" in logs  # ❌ FAKE - only proves code ran
    # BUG: 89% of icons weren't changed, test still passed!
```
Real E2E that would have caught the bug:
```python
# REAL E2E - verifies actual output
# run_command, capture_desktop, visual_diff, and threshold are project-specific (not defined here)
def test_icon_theme_change():
    run_command("set-theme papirus")
    screenshot = capture_desktop()
    assert visual_diff(screenshot, "expected_papirus.png") < threshold  # ✅ REAL
    # This would have shown 89% of icons were wrong
```
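The example leaves `visual_diff` and `threshold` undefined. One possible shape for them is sketched below, under the assumption that both images have already been decoded into equal-sized rows of `(r, g, b)` tuples; a real test would decode the screenshot files with an imaging library first. The `tolerance` parameter and the `THRESHOLD` value are illustrative choices, not values from the source.

```python
def visual_diff(actual, expected, tolerance=0):
    """Return the fraction of pixels that differ between two images.

    Each image is a list of rows, each row a list of (r, g, b) tuples.
    A pixel counts as different when any channel differs by more than `tolerance`.
    """
    same_shape = len(actual) == len(expected) and all(
        len(row_a) == len(row_e) for row_a, row_e in zip(actual, expected)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    total = sum(len(row) for row in actual)
    if total == 0:
        raise ValueError("images must not be empty")

    differing = sum(
        1
        for row_a, row_e in zip(actual, expected)
        for px_a, px_e in zip(row_a, row_e)
        if any(abs(a - e) > tolerance for a, e in zip(px_a, px_e))
    )
    return differing / total


# Hypothetical threshold: allow at most 1% of pixels to differ.
THRESHOLD = 0.01
```

With a fraction-based metric like this, a theme change that missed most icons would produce a diff far above the threshold, so the assertion would fail.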
</EXTREMELY-IMPORTANT>
```
┌─────────────────────────────────────────────────────────────────┐
│                    BROWSER TESTING REQUIRED?                    │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
          ┌─────────────────────────────────────────────┐
          │    Need to debug JS errors or API calls?    │
          │    (console.log, network requests, XHR)     │
          └─────────────────────────────────────────────┘
                    │                          │
                   YES                         NO
                    │                          │
                    ▼                          ▼
          ┌───────────────────┐   ┌──────────────────────────┐
          │    CHROME MCP     │   │    Running in CI/CD?     │
          │    (debugging)    │   │  (headless, automated)   │
          └───────────────────┘   └──────────────────────────┘
                                    │                    │
                                   YES                   NO
                                    │                    │
                                    ▼                    ▼
                            ┌──────────────┐   ┌───────────────────┐
                            │  PLAYWRIGHT  │   │   Cross-browser   │
                            │     MCP      │   │      needed?      │
                            └──────────────┘   └───────────────────┘
                                                │                │
                                               YES               NO
                                                │                │
                                                ▼                ▼
                                        ┌──────────────┐   ┌────────────┐
                                        │  PLAYWRIGHT  │   │ Either OK  │
                                        │     MCP      │   │ (prefer    │
                                        └──────────────┘   │ Playwright)│
                                                           └────────────┘
```
<EXTREMELY-IMPORTANT>
### Iron Laws: Browser MCP Selection
YOU MUST USE CHROME MCP FOR API/CONSOLE DEBUGGING. NO EXCEPTIONS.

YOU MUST USE PLAYWRIGHT MCP FOR CI/CD TESTING. NO EXCEPTIONS.
| Need | Tool | Why |
|------|------|-----|
| Debug console errors | Chrome MCP | read_console_messages |
| Inspect API calls/responses | Chrome MCP | read_network_requests |
| Execute custom JS in page | Chrome MCP | javascript_tool |
| Record interaction as GIF | Chrome MCP | gif_creator |
| Headless/CI automation | Playwright MCP | Headless mode |
| Cross-browser testing | Playwright MCP | Firefox/WebKit support |
| Standard E2E suite | Playwright MCP | Test isolation, maturity |
| Interactive debugging | Chrome MCP | Real browser, console access |

| Capability | Playwright MCP | Chrome MCP |
|------------|---------------|------------|
| Navigate/click/type | ✅ | ✅ |
| Accessibility tree | ✅ browser_snapshot | ✅ read_page |
| Screenshots | ✅ | ✅ |
| Console messages | ❌ | ✅ read_console_messages |
| Network requests | ❌ | ✅ read_network_requests |
| JavaScript execution | ❌ | ✅ javascript_tool |
| GIF recording | ❌ | ✅ gif_creator |
| Headless mode | ✅ | ❌ (requires visible browser) |
| Cross-browser | ✅ (Chromium/Firefox/WebKit) | ❌ (Chrome only) |
| Natural language find | ❌ | ✅ find |
Playwright MCP cannot read console messages or network requests (it has no `read_console_messages` or `read_network_requests`). Chrome MCP cannot run headless, so CI/CD requires Playwright MCP. Choosing by familiarity instead of by these constraints produces a test that cannot observe what it claims to verify. A statement about API behavior that was not checked with `read_network_requests` is an unverified claim presented as fact.
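The routing rules can be written down as a small function. This is only a sketch that follows the order of the flowchart in this skill; `BrowserTestNeeds` and `select_browser_tool` are hypothetical names, and the skill does not say how to resolve a debugging need that arises inside CI, so here the debugging branch simply wins, as it does in the flowchart.

```python
from dataclasses import dataclass

CHROME_MCP = "Chrome MCP"
PLAYWRIGHT_MCP = "Playwright MCP"


@dataclass(frozen=True)
class BrowserTestNeeds:
    """Hypothetical description of what a browser test must do."""
    debug_console_or_network: bool = False  # console errors, API calls, custom JS
    runs_in_ci: bool = False                # headless, automated
    cross_browser: bool = False             # Firefox/WebKit coverage


def select_browser_tool(needs: BrowserTestNeeds) -> str:
    """Walk the decision tree top to bottom and return the tool name."""
    if needs.debug_console_or_network:
        return CHROME_MCP
    if needs.runs_in_ci:
        return PLAYWRIGHT_MCP
    if needs.cross_browser:
        return PLAYWRIGHT_MCP
    # Either tool works at this point; the flowchart prefers Playwright.
    return PLAYWRIGHT_MCP
```

For example, `select_browser_tool(BrowserTestNeeds(debug_console_or_network=True))` selects Chrome MCP, while the default `BrowserTestNeeds()` falls through to Playwright MCP.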
</EXTREMELY-IMPORTANT>
Detect the operating system and display server to select the appropriate testing tool:
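```bash
# Detect platform for desktop automation
case "$(uname -s)" in
  Darwin)
    echo "macOS - use dev-test-hammerspoon"
    ;;
  Linux)
    # Anything other than "wayland" (including an unset variable) is treated as X11
    if [ "${XDG_SESSION_TYPE:-}" = "wayland" ]; then
      echo "Linux/Wayland - use dev-test-linux (ydotool)"
    else
      echo "Linux/X11 - use dev-test-linux (xdotool)"
    fi
    ;;
  *)
    # Windows and other platforms have no desktop automation skill
    echo "Unsupported platform - desktop automation not supported" >&2
    ;;
esac
```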
```
┌─────────────────────────────────────────────────────────────────┐
│                  DESKTOP AUTOMATION REQUIRED?                   │
└─────────────────────────────────────────────────────────────────┘
                                 │
                                 ▼
                        ┌─────────────────┐
                        │    Platform?    │
                        └─────────────────┘
                          /      |      \
                 macOS         Linux       Windows
                   │             │            │
                   ▼             ▼            ▼
           ┌──────────────┐ ┌─────────┐ ┌───────────┐
           │ HAMMERSPOON  │ │  LINUX  │ │    NOT    │
           │ (dev-test-   │ │ (dev-   │ │ SUPPORTED │
           │ hammerspoon) │ │ test-   │ └───────────┘
           └──────────────┘ │ linux)  │
                            └─────────┘
                                 │
                       ┌─────────┴─────────┐
                       │  Display Server?  │
                       └───────────────────┘
                           /           \
                       Wayland         X11
                          │             │
                          ▼             ▼
                     ┌──────────┐  ┌──────────┐
                     │ ydotool  │  │ xdotool  │
                     └──────────┘  └──────────┘
```
<EXTREMELY-IMPORTANT>
Verify tools are available BEFORE proceeding. Missing tools = FULL STOP.
Each sub-skill has its own availability gate. Load the appropriate skill and follow its gate.
</EXTREMELY-IMPORTANT>
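As an illustration of what such a gate might look like, the sketch below checks that the primary tool for each platform is on `PATH` and stops otherwise. The target keys, `REQUIRED_TOOLS`, `missing_tools`, and `require_tools` are assumptions made for this example; the gates defined in the individual sub-skills remain authoritative.

```python
import shutil

# Hypothetical mapping from target platform to the primary tool named in this skill.
REQUIRED_TOOLS = {
    "macos": ["hs"],
    "linux-wayland": ["ydotool"],
    "linux-x11": ["xdotool"],
}


def missing_tools(target, which=shutil.which):
    """Return the required tools for `target` that cannot be found on PATH."""
    return [tool for tool in REQUIRED_TOOLS[target] if which(tool) is None]


def require_tools(target, which=shutil.which):
    """Raise instead of continuing when any required tool is missing."""
    missing = missing_tools(target, which)
    if missing:
        raise RuntimeError(
            f"FULL STOP: missing tools for {target}: {', '.join(missing)}"
        )
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use `require_tools("linux-x11")` consults the real `PATH`.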
| Skill | Use Case | Key Capabilities |
|-------|----------|------------------|
| skills/dev-test-chrome/SKILL.md (via cache lookup) | Debugging, console/network inspection | read_console_messages, read_network_requests, javascript_tool |
| skills/dev-test-playwright/SKILL.md (via cache lookup) | CI/CD, headless, cross-browser E2E | Headless mode, Firefox/WebKit, test isolation |

| Skill | Platform | Primary Tool |
|-------|----------|--------------|
| skills/dev-test-hammerspoon/SKILL.md (via cache lookup) | macOS | Hammerspoon (hs) |
| skills/dev-test-linux/SKILL.md (via cache lookup) | Linux | ydotool (Wayland) / xdotool (X11) |
Locate test directories and identify the test framework used in the project:
```bash
# Find test directory
ls -d tests/ test/ spec/ __tests__/ 2>/dev/null

# Find test framework (grep reads each file directly; missing files are ignored)
grep -E "(test|jest)" package.json 2>/dev/null
grep -i pytest pyproject.toml 2>/dev/null
grep -i "\[dev-dependencies\]" Cargo.toml 2>/dev/null
grep -i test meson.build 2>/dev/null
```
| Language | Framework | Command |
|----------|-----------|---------|
| Python | pytest | pytest tests/ -v |
| JavaScript | jest | npm test |
| TypeScript | vitest | npx vitest |
| Rust | cargo | cargo test |
| C/C++ | meson | meson test -C build -v |
| Go | go test | go test ./... |
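The lookup from project files to a test command can also be automated. The sketch below is one possible approach: it checks the same marker files as the discovery commands and returns the matching command from the table. `detect_test_command` is a hypothetical helper, and the `go.mod` check and the `vitest` substring match are assumptions added for this example.

```python
from pathlib import Path


def detect_test_command(project_dir):
    """Guess the test command for a project from its marker files.

    Returns a command string from the framework table, or None if nothing matches.
    """
    root = Path(project_dir)

    def read(name):
        path = root / name
        return path.read_text(encoding="utf-8", errors="ignore") if path.is_file() else ""

    if "pytest" in read("pyproject.toml").lower():
        return "pytest tests/ -v"

    package_json = read("package.json")
    if "vitest" in package_json:
        return "npx vitest"
    if "jest" in package_json or '"test"' in package_json:
        return "npm test"

    if (root / "Cargo.toml").is_file():
        return "cargo test"
    if (root / "meson.build").is_file():
        return "meson test -C build -v"
    if (root / "go.mod").is_file():  # assumption: go.mod marks a Go module
        return "go test ./..."
    return None
```

The order of the checks is arbitrary; a project with several marker files would need an explicit choice rather than this first-match rule.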
Execute CLI applications with test inputs and verify outputs against expected results:
```bash
# Run with test inputs
./app --test-mode input.txt > output.txt

# Compare to expected
diff expected.txt output.txt

# Check exit code
./app --validate file && echo "PASS" || echo "FAIL"
```
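The same check can be expressed as an automated test that verifies both the exit code and the exact output. The sketch below is a minimal example; `run_cli_case` is a hypothetical helper, and `UPPERCASE_APP` is a stand-in command (a one-line Python script) used here in place of `./app` so the example is self-contained.

```python
import subprocess
import sys


def run_cli_case(command, stdin_text, expected_stdout, timeout=10):
    """Run `command`, feed it `stdin_text`, and require exit code 0 AND exact stdout."""
    result = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode == 0 and result.stdout == expected_stdout


# Stand-in for ./app: echoes its input in upper case.
UPPERCASE_APP = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
```

`run_cli_case(UPPERCASE_APP, "hello\n", "HELLO\n")` passes, while the same call with `"hello\n"` as the expected output fails even though the process exits with code 0.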
Every test run MUST be documented in LEARNINGS.md:
````markdown
## Test Run: [Description]
**Tool:** [Chrome MCP / Playwright / Hammerspoon / ydotool / pytest / etc.]
**Command:**
```bash
pytest tests/ -v
```
**Output:**
```
tests/test_feature.py::test_basic PASSED
tests/test_feature.py::test_edge_case PASSED
tests/test_feature.py::test_error FAILED
1 failed, 2 passed
```
**Result:** 2/3 PASS, 1 FAIL
**Next:** Fix test_error failure
````
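Writing the entry by hand is easy to forget, so it may help to generate it. The sketch below formats one entry with the same fields as the template and appends it to a file; `format_test_run` and `append_test_run` are hypothetical helpers, and the exact markup of the field labels is an assumption.

```python
from pathlib import Path

# Built at runtime so this example does not contain nested literal fences.
FENCE = "`" * 3


def format_test_run(description, tool, command, output, passed, failed, next_step):
    """Return one LEARNINGS.md entry as a string."""
    total = passed + failed
    lines = [
        f"## Test Run: {description}",
        f"**Tool:** {tool}",
        "**Command:**",
        f"{FENCE}bash",
        command,
        FENCE,
        "**Output:**",
        FENCE,
        output.rstrip("\n"),
        FENCE,
        f"**Result:** {passed}/{total} PASS, {failed} FAIL",
        f"**Next:** {next_step}",
    ]
    return "\n".join(lines) + "\n"


def append_test_run(learnings_path, entry):
    """Append an entry to the learnings file, creating the file if needed."""
    with Path(learnings_path).open("a", encoding="utf-8") as handle:
        handle.write("\n" + entry)
```

Appending rather than overwriting keeps earlier test runs in the file, so the history of failures and fixes stays visible.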
## Integration
For TDD protocol (RED-GREEN-REFACTOR), see:
Read `${CLAUDE_SKILL_DIR}/../../skills/dev-tdd/SKILL.md` and follow its instructions.
This skill is invoked by Task agents during `dev-implement` phase.