skills/e2e-testing/SKILL.md
E2E and visual regression testing with Playwright. Use when writing tests, running E2E tests, debugging test failures, or working with visual baselines. Contains test commands, patterns, and debugging tips.
Playwright-based E2E and visual regression tests, run inside Docker for environment isolation. The Docker setup uses a separate test database so dev work continues uninterrupted in parallel.
Two reasons that compound:
The procedure runs as direct docker compose calls so it is fully visible and the runner can recover from infrastructure issues without elevated permissions or project-specific wrappers.
Tear down stale state, build, run, fetch report, tear down. One sequence (paste as a single Bash invocation):
```shell
# Wrapper so every call uses the same compose files and profile.
dc() { docker compose -f docker-compose.yml -f docker-compose.test.yml --profile testing "$@"; }
dc down -v --remove-orphans \
  && dc up -d --build --wait \
  && dc exec -T e2e sh -c "npm run e2e -- --reporter=list,html"
e=$?  # status of the first step that failed, or 0
# Always fetch the HTML report; fetch failure artifacts only when something failed.
dc cp e2e:/workspace/playwright-report ./playwright-report 2>/dev/null || true
[ "$e" -ne 0 ] && dc cp e2e:/workspace/test-results ./test-results 2>/dev/null
dc down -v --remove-orphans
exit $e
```
--remove-orphans on every down is what makes the runner self-healing — a previous crashed run never blocks the next attempt.
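The control flow of the sequence above can be tried without Docker. The sketch below swaps each step for a hypothetical stand-in function (teardown, bring_up, run_tests, and run_suite are illustrative names, not part of the project) and makes the test step fail on purpose, to show that the report is still collected, teardown still runs, and the test exit code is what the caller sees.

```shell
# Hypothetical stand-ins for the real docker compose steps.
teardown()  { echo "teardown"; }
bring_up()  { echo "build and start"; }
run_tests() { echo "run tests"; return 3; }   # simulated test failure

run_suite() {
  teardown && bring_up && run_tests
  e=$?                                   # 0, or the status of the first failing step
  echo "copy report"                     # always attempted
  [ "$e" -ne 0 ] && echo "copy failure artifacts"
  teardown                               # always runs, even after a failure
  return "$e"
}

run_suite || echo "exit code: $?"        # last line printed: exit code: 3
```

Capturing the status in a variable right after the chain is what lets the cleanup steps run without overwriting it.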
Replace the npm run e2e -- portion in the dc exec line:
| Goal | Replace with |
| --------------------------- | -------------------------------------------------------------------------- |
| Full suite, parallel | npm run e2e -- --workers 2 --max-failures=1 --reporter=list,html |
| Single test by name | npm run e2e -- --grep 'test name' --reporter=list,html |
| Single file | npm run e2e -- tests/Feature.spec.ts --reporter=list,html |
| Multiple files | npm run e2e -- tests/A.spec.ts tests/B.spec.ts --reporter=list,html |
| Update all visual baselines | npm run e2e -- --update-snapshots --reporter=list,html |
| Update one baseline | npm run e2e -- --grep 'homepage' --update-snapshots --reporter=list,html |
Pipes inside --grep are eaten by the shell: the pattern's double quotes end the outer sh -c "..." string early, so --grep "MCP|Expense" is parsed as a pipeline. Use multiple file arguments instead, or run two separate dc exec calls.
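Why the pipe gets parsed can be checked with plain sh. In the sketch below, run_inner is a hypothetical stand-in for dc exec -T e2e sh -c, and printf stands in for the test command so each argument prints on its own line. It covers only the shell layer; whether a piped pattern then survives npm run and reaches Playwright intact is not verified here, so the advice above still applies.

```shell
# Hypothetical stand-in for `dc exec -T e2e sh -c`: hands one command string
# to a fresh shell, the way the exec line does.
run_inner() { sh -c "$1"; }

# Single quotes inside the double-quoted string reach the inner shell intact,
# so the pipe stays inside one argument.
run_inner "printf '%s\n' --grep 'MCP|Expense'"

# With nested double quotes the outer string ends early, and the calling shell sees
#   sh -c "printf '%s\n' --grep "MCP  |  Expense""
# which is a pipeline into a command named Expense.
```

The first call prints --grep and MCP|Expense on separate lines, showing that the inner shell received the pattern as a single argument.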
When only test specs under e2e/ changed (no app source, no Dockerfile), skip the rebuild:
```shell
dc() { docker compose -f docker-compose.yml -f docker-compose.test.yml --profile testing "$@"; }
dc down -v --remove-orphans
dc up -d --no-build --wait
dc exec -T e2e sh -c "npm run e2e -- --grep 'pattern' --reporter=list,html"
e=$?
dc cp e2e:/workspace/playwright-report ./playwright-report 2>/dev/null || true
[ "$e" -ne 0 ] && dc cp e2e:/workspace/test-results ./test-results 2>/dev/null
dc down -v --remove-orphans
exit $e
```
--no-build reuses existing images. If they don't exist (first run on a fresh checkout), Docker errors clearly and you can rerun the standard path.
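The choice between the two paths can be sketched as a small helper. build_flag is a hypothetical function, and feeding it from git diff --name-only is an assumption about the workflow; the e2e/ prefix and the Dockerfile exception come from the rule above.

```shell
# build_flag reads changed paths on stdin and prints the flag for `dc up`.
build_flag() {
  changed=$(cat)
  if printf '%s\n' "$changed" | grep -q 'Dockerfile' \
    || printf '%s\n' "$changed" | grep -v '^$' | grep -qv '^e2e/'; then
    echo "--build"      # app source or a Dockerfile changed
  else
    echo "--no-build"   # only files under e2e/ changed
  fi
}

# Example with a fixed list; in practice the list might come from
# `git diff --name-only` (assumption).
printf '%s\n' e2e/tests/Feature.spec.ts | build_flag
```

When in doubt, the standard path with --build is the safe default; this helper only saves time when the change set is clearly limited to specs.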
E2E full runs take minutes. Redirect to a log and use Bash run_in_background:
```shell
# inside Bash tool with run_in_background=true
{ dc() { docker compose -f docker-compose.yml -f docker-compose.test.yml --profile testing "$@"; }
  dc down -v --remove-orphans \
    && dc up -d --build --wait \
    && dc exec -T e2e sh -c "npm run e2e -- --workers 2 --reporter=list,html"
  e=$?
  dc cp e2e:/workspace/playwright-report ./playwright-report 2>/dev/null || true
  [ "$e" -ne 0 ] && dc cp e2e:/workspace/test-results ./test-results 2>/dev/null
  dc down -v --remove-orphans
  exit $e; } > /tmp/e2e.log 2>&1
```
Direct foreground runs exceed the harness output limit and are killed by SIGPIPE (exit code 141).
- Always pass --remove-orphans to down: without it, stale containers from a previous failed run will block the next attempt.
- Run the full suite (not just --grep) before finishing, to confirm nothing else broke.
- Avoid waitForLoadState('networkidle'): it's unreliable and causes timeouts. Wait for a specific visible element instead.

If dc up fails with the error 'Conflict. The container name "/<project>-db-1" is already in use', a previous crash left orphans. Try the standard dc down -v --remove-orphans first. If that also fails, list and remove the test containers explicitly. Test profile only: never touch the dev-DB container (e.g. one named <project>-local-db-1 or similar; check docker ps first):
```shell
# Replace <project> with the compose project name (usually the repo dir name).
# The grep -v guard excludes the dev DB; adjust the pattern to match your dev container's name.
# Names are listed (not the IDs that -q prints) so the guard has something to match.
docker rm -f $(docker ps -a --filter "name=<project>-" --format '{{.Names}}' | grep -v 'local-db')
docker network rm <project>_app-network 2>/dev/null || true
```
Then retry the standard run.
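Before running a forced removal, the name guard can be previewed on its own. The sketch below uses made-up container names (the myapp- prefix is hypothetical) in place of real docker ps output, and keep_test_containers is an illustrative helper; the local-db pattern is the one from the snippet above.

```shell
# keep_test_containers reads container names on stdin and drops the dev DB.
keep_test_containers() { grep -v 'local-db'; }

# Made-up names standing in for `docker ps -a --format '{{.Names}}'` output.
printf '%s\n' myapp-db-1 myapp-e2e-1 myapp-local-db-1 | keep_test_containers
```

This prints myapp-db-1 and myapp-e2e-1. The guard matches on names, so it needs names as input; container IDs would pass through unfiltered.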
The list reporter prints each test's outcome on a single line. Failures look like [FAIL] tests/Feature.spec.ts > test name.
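When the run was redirected to a log, the failing tests can be pulled out with grep. This is a sketch that assumes the [FAIL] prefix described above and a log at a path like /tmp/e2e.log; failed_tests is a hypothetical helper, and the sample log (including its [PASS] line) is made up.

```shell
# failed_tests prints the "file > test name" part of every [FAIL] line in a log.
failed_tests() { grep -F '[FAIL]' "$1" | sed 's/^.*\[FAIL] *//'; }

# Made-up sample log for illustration.
log=$(mktemp)
printf '%s\n' \
  '[PASS] tests/Login.spec.ts > signs in' \
  '[FAIL] tests/Feature.spec.ts > test name' > "$log"

failed_tests "$log"
rm -f "$log"
```

If the reporter output in your setup marks failures differently, adjust the pattern passed to grep and sed accordingly.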
- ./test-results/<test-folder>/error-context.md for the full DOM snapshot at failure time. This file is auto-generated by Playwright when a test fails.
- ./playwright-report/index.html
- ./test-results/<test-folder>/trace.zip: open via npx playwright show-trace.

Common failure causes:
- level to headings, scope to a section, or use .first().
- --update-snapshots.

Artifacts copied back to the host:

- playwright-report/: HTML report with traces. Always populated after a run.
- test-results/: Failure artifacts including error-context.md and trace.zip. Populated only when at least one test failed (the conditional copy in the run script).

Visual snapshot files live under e2e/tests/visual-baselines/ (or wherever the project keeps them) and are bind-mounted into the test container, so --update-snapshots writes back to the repo directly. Do not docker compose cp the baselines back: that nests visual-baselines/visual-baselines/.
Time-sensitive baselines (anything rendering the current date, calendars, or live counters) need regeneration after seed-data shifts. Mask dynamic regions where possible — see PATTERNS.md.
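The nesting that comes from copying baselines back can be reproduced with plain cp -r, which shows the same effect when the destination directory already exists: the source directory lands inside it. The sketch uses a throwaway temp directory, and the container/ and repo/ names are stand-ins for the two sides of the copy.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/container/visual-baselines" "$demo/repo/visual-baselines"
touch "$demo/container/visual-baselines/homepage.png"

# The destination directory already exists, so the source directory is copied into it.
cp -r "$demo/container/visual-baselines" "$demo/repo/visual-baselines"

# Show where the file ended up, relative to the temp directory.
(cd "$demo" && find repo -name homepage.png)
# prints: repo/visual-baselines/visual-baselines/homepage.png

rm -rf "$demo"
```

With the bind mount in place no copy is needed at all, which is why the baselines are left out of the artifact-copy steps.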
- waitForLoadState('networkidle'): unreliable, use visible-element waits.
- if statements in assertions: use proper matchers.

For test code patterns, selectors, and visual testing guidance, see PATTERNS.md.