plugins/gh-workflow/skills/runtime-verification/SKILL.md
Verifies implementation works at runtime by discovering and executing dev server startup, API smoke tests, E2E tests, and browser checks. Use after quality checks pass (lint, test, typecheck) to confirm the code actually runs. Use when validating acceptance criteria, running Playwright or Cypress suites, or smoke-testing endpoints before PR creation.
This skill verifies that an implementation actually works at runtime, not just that it compiles and passes lint/tests.
Quality checks (lint, test, typecheck) answer "does it compile?" This skill answers "does it work?" by discovering and running the project's dev server, API smoke tests, E2E tests, and browser checks.
Before running verifications, read timeout configuration (local > project > user > defaults):
DEV_STARTUP_TIMEOUT=$(jq -r '.timeouts.devServerStartup // empty' .claude/settings.gh-workflow.local.json 2>/dev/null)
[ -z "$DEV_STARTUP_TIMEOUT" ] && DEV_STARTUP_TIMEOUT=$(jq -r '.timeouts.devServerStartup // empty' .claude/settings.gh-workflow.json 2>/dev/null)
[ -z "$DEV_STARTUP_TIMEOUT" ] && DEV_STARTUP_TIMEOUT=$(jq -r '.timeouts.devServerStartup // empty' "$HOME/.claude/settings.gh-workflow.json" 2>/dev/null)
[ -z "$DEV_STARTUP_TIMEOUT" ] && DEV_STARTUP_TIMEOUT="30"
E2E_TIMEOUT=$(jq -r '.timeouts.e2eTest // empty' .claude/settings.gh-workflow.local.json 2>/dev/null)
[ -z "$E2E_TIMEOUT" ] && E2E_TIMEOUT=$(jq -r '.timeouts.e2eTest // empty' .claude/settings.gh-workflow.json 2>/dev/null)
[ -z "$E2E_TIMEOUT" ] && E2E_TIMEOUT=$(jq -r '.timeouts.e2eTest // empty' "$HOME/.claude/settings.gh-workflow.json" 2>/dev/null)
[ -z "$E2E_TIMEOUT" ] && E2E_TIMEOUT="120"
VERIFICATION_TIMEOUT=$(jq -r '.timeouts.verificationScript // empty' .claude/settings.gh-workflow.local.json 2>/dev/null)
[ -z "$VERIFICATION_TIMEOUT" ] && VERIFICATION_TIMEOUT=$(jq -r '.timeouts.verificationScript // empty' .claude/settings.gh-workflow.json 2>/dev/null)
[ -z "$VERIFICATION_TIMEOUT" ] && VERIFICATION_TIMEOUT=$(jq -r '.timeouts.verificationScript // empty' "$HOME/.claude/settings.gh-workflow.json" 2>/dev/null)
[ -z "$VERIFICATION_TIMEOUT" ] && VERIFICATION_TIMEOUT="180"
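The three lookups above repeat the same local > project > user > default chain. A minimal sketch of that chain as a single function follows; `read_timeout` is a hypothetical helper name, not part of the skill, and it assumes `jq` is installed as the original commands do.

```shell
# Hypothetical helper: read_timeout <key> <default>
# Checks local, project, then user settings and prints the first value found.
read_timeout() {
  key="$1"
  default="$2"
  for file in \
    .claude/settings.gh-workflow.local.json \
    .claude/settings.gh-workflow.json \
    "$HOME/.claude/settings.gh-workflow.json"
  do
    [ -f "$file" ] || continue
    # --arg passes the key as data; `// empty` prints nothing when the key is absent
    value=$(jq -r --arg k "$key" '.timeouts[$k] // empty' "$file" 2>/dev/null)
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  # No file supplied a value (or jq is unavailable): use the default
  printf '%s\n' "$default"
}

DEV_STARTUP_TIMEOUT=$(read_timeout devServerStartup 30)
E2E_TIMEOUT=$(read_timeout e2eTest 120)
VERIFICATION_TIMEOUT=$(read_timeout verificationScript 180)
```

The resulting variables are the same ones the rest of the skill uses, so either form can feed the later steps.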
Before running the full discovery process, check for an existing verification script. Many mature projects already wire everything up in one command — if one exists, run it and skip the rest.
# Check for dedicated verification scripts
ls verify.sh test-e2e.sh smoke-test.sh scripts/verify* 2>/dev/null
# Check Makefile for verify/e2e/smoke targets
grep -E "^(verify|e2e|smoke|integration-test):" Makefile 2>/dev/null
If found, run it with a timeout:
timeout $VERIFICATION_TIMEOUT ./verify.sh 2>&1 # or make verify, etc.
If it passes, skip to Output Format. If it fails or no script exists, continue with full discovery.
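The quick check can be sketched as one function that finds the first available entry point and a second that runs it under the timeout. Both function names are hypothetical, and the sketch assumes script paths without spaces and a `timeout` command (GNU coreutils) on the PATH.

```shell
# Hypothetical helper: print the first verification entry point found.
# Returns 1 when the project has none.
find_verify_command() {
  for script in verify.sh test-e2e.sh smoke-test.sh scripts/verify*; do
    # An unmatched glob stays literal and fails the -f test, so it is skipped
    if [ -f "$script" ]; then
      printf './%s\n' "$script"
      return 0
    fi
  done
  if [ -f Makefile ]; then
    target=$(grep -oE '^(verify|e2e|smoke|integration-test):' Makefile | head -n 1 | tr -d ':')
    if [ -n "$target" ]; then
      printf 'make %s\n' "$target"
      return 0
    fi
  fi
  return 1
}

# Hypothetical wrapper: returns 2 when nothing was found, otherwise the
# exit status of the verification command (124 if timeout cut it off).
run_quick_check() {
  cmd=$(find_verify_command) || return 2
  # Word splitting on $cmd is intentional: it may hold "make verify"
  timeout "${VERIFICATION_TIMEOUT:-180}" $cmd 2>&1
}
```

A caller would treat status 0 as "skip to Output Format" and any other status as "continue with full discovery".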
Search project instructions for any development or verification commands:
grep -iE "(dev server|npm run|yarn |pnpm |python.*run|go run|cargo run|docker.compose|make\s+\w+|uvicorn|gunicorn|flask run)" .claude/CLAUDE.md CLAUDE.md 2>/dev/null
grep -iE "(verify|e2e|smoke|integration|acceptance)" .claude/CLAUDE.md CLAUDE.md 2>/dev/null
# Playwright
ls playwright.config.* 2>/dev/null
grep -l "playwright" package.json 2>/dev/null
# Cypress
ls cypress.config.* cypress/ 2>/dev/null
# Selenium
grep -l "selenium" requirements.txt pyproject.toml 2>/dev/null
# Node.js
grep -E '"(dev|start|serve)"' package.json 2>/dev/null
# Makefile
grep -E "^(dev|serve|run|start|up):" Makefile 2>/dev/null
# Docker
ls docker-compose.yml docker-compose.yaml compose.yml compose.yaml 2>/dev/null
# Python
ls manage.py 2>/dev/null && echo "django: python manage.py runserver"
grep -E "(uvicorn|gunicorn|flask)" pyproject.toml requirements.txt 2>/dev/null
# Go
ls main.go cmd/*/main.go 2>/dev/null && echo "go: go run ."
# Monorepo
ls turbo.json 2>/dev/null && echo "turbo: check turbo.json for dev pipeline"
# Check for port configuration
grep -rE "PORT|:3000|:8080|:5173|:4000|:8000|:3001" package.json .env .env.local .env.development 2>/dev/null | head -10
# Common framework defaults:
# Vite/SvelteKit=5173, Next.js/Rails=3000, CRA=3000
# Django=8000, Flask=5000, Go=8080, Spring=8080
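The framework defaults listed above can be encoded as a lookup so later steps have a single port value to use. This is only a sketch: the function names and the framework labels passed to them are assumptions, and the final fallback of 3000 mirrors the default used in the startup step.

```shell
# Hypothetical helper: map a framework label to the default port listed above.
default_port_for() {
  case "$1" in
    vite|sveltekit)        echo 5173 ;;
    next|nextjs|rails|cra) echo 3000 ;;
    django)                echo 8000 ;;
    flask)                 echo 5000 ;;
    go|spring)             echo 8080 ;;
    *)                     return 1 ;;
  esac
}

# Prefer an explicit PORT from the environment, then the framework default, then 3000.
resolve_port() {
  if [ -n "${PORT:-}" ]; then
    echo "$PORT"
  else
    default_port_for "$1" || echo 3000
  fi
}
```

For example, `PORT=$(resolve_port django)` yields 8000 unless PORT is already set in the environment.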
grep -rn "health\|healthz\|ready\|alive\|ping" --include="*.ts" --include="*.py" --include="*.go" --include="*.java" . 2>/dev/null | head -10
If a dev server command is discovered:
Start server in background with PID tracking:
{dev_cmd} &
DEV_PID=$!
Wait for ready signal — try common health paths, fall back to port check:
PORT="${DETECTED_PORT:-3000}"  # DETECTED_PORT: port found during discovery, if any
for i in $(seq 1 $DEV_STARTUP_TIMEOUT); do
curl -sf http://localhost:$PORT/ > /dev/null 2>&1 && break
curl -sf http://localhost:$PORT/health > /dev/null 2>&1 && break
curl -sf http://localhost:$PORT/healthz > /dev/null 2>&1 && break
curl -sf http://localhost:$PORT/api/health > /dev/null 2>&1 && break
nc -z localhost $PORT 2>/dev/null && break
sleep 1
done
If the server doesn't start within ${DEV_STARTUP_TIMEOUT}s (default: 30s, configurable via .timeouts.devServerStartup), report a verification failure and include the last few lines of server output.
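Reporting the last lines of output requires capturing the server's output and knowing whether the wait actually succeeded. The sketch below shows one way to do both; `wait_for_ready`, `DEV_CMD`, and the log path are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: wait_for_ready <timeout_seconds> <probe command...>
# Runs the probe once per second until it succeeds or the timeout elapses.
wait_for_ready() {
  limit="$1"
  shift
  elapsed=0
  while [ "$elapsed" -lt "$limit" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}

# Usage sketch (DEV_CMD, PORT, and the log path are placeholders):
#   $DEV_CMD > /tmp/dev-server.log 2>&1 &
#   DEV_PID=$!
#   if ! wait_for_ready "$DEV_STARTUP_TIMEOUT" curl -sf "http://localhost:$PORT/"; then
#     echo "Dev server not ready after ${DEV_STARTUP_TIMEOUT}s; last output:"
#     tail -n 20 /tmp/dev-server.log
#   fi
```

Passing the probe as arguments keeps the loop independent of the check itself, so the same function works with `curl` against a health path or with `nc -z localhost "$PORT"`.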
For each new/modified API endpoint in the diff:
Detecting endpoints from the diff:
# Find route definitions in changed files
git diff origin/$DEFAULT_BRANCH..HEAD --name-only | xargs grep -nE \
"@(app|router)\.(get|post|put|patch|delete)|app\.(get|post|put|use)|router\.(get|post)|@(Get|Post|Put|Delete|Patch)Mapping|@api_view|path\(" \
2>/dev/null
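The command above relies on `DEFAULT_BRANCH` being set. If it is not, one possible way to derive it is sketched below; `detect_default_branch` is a hypothetical helper, and the fallback to `main` is an assumption rather than something the skill specifies.

```shell
# Hypothetical helper: print the remote's default branch name, falling back to "main".
detect_default_branch() {
  # refs/remotes/origin/HEAD is normally set by `git clone`;
  # --short prints it in the form "origin/<branch>"
  ref=$(git symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null)
  if [ -n "$ref" ]; then
    printf '%s\n' "${ref#origin/}"
  else
    printf '%s\n' main
  fi
}

# Keep any value that is already set; otherwise detect it.
DEFAULT_BRANCH="${DEFAULT_BRANCH:-$(detect_default_branch)}"
```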
For each discovered endpoint:
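A single endpoint check might look like the sketch below. It is an illustration, not the skill's prescribed procedure: `smoke_check` and `report_status` are hypothetical helpers, and the expected status codes have to come from the acceptance criteria or the route's contract.

```shell
# Hypothetical helper: compare expected and actual status codes and print a result line.
# Kept separate from the request so it can be exercised without a running server.
report_status() {
  if [ "$2" = "$3" ]; then
    printf '%s: Pass (%s)\n' "$1" "$3"
  else
    printf '%s: Fail (expected %s, got %s)\n' "$1" "$2" "$3"
    return 1
  fi
}

# Hypothetical helper: smoke_check <expected_status> <method> <url> [extra curl args...]
smoke_check() {
  expected="$1"
  method="$2"
  url="$3"
  shift 3
  # -o /dev/null discards the body; -w '%{http_code}' prints only the status code
  actual=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 -X "$method" "$@" "$url")
  report_status "$method $url" "$expected" "$actual"
}
```

A call such as `smoke_check 201 POST "http://localhost:$PORT/api/users" -H 'Content-Type: application/json' -d '{"email":"a@example.com"}'` would produce one line suitable for the Smoke Tests table; the payload here is made up for the example.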
Run discovered E2E test command with a timeout to prevent hanging:
timeout $E2E_TIMEOUT npx playwright test 2>&1 # or
timeout $E2E_TIMEOUT npx cypress run 2>&1 # or
timeout $E2E_TIMEOUT pytest tests/e2e/ 2>&1 # etc.
If the full suite is too large, run only tests related to changed files:
# Playwright: run specific test file
timeout $E2E_TIMEOUT npx playwright test tests/e2e/changed-feature.spec.ts 2>&1
# Pytest: run tests matching changed module names
timeout $E2E_TIMEOUT pytest tests/e2e/ -k "changed_module" 2>&1
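The `"changed_module"` placeholder above can be derived from the diff. The sketch below builds a pytest `-k` expression from changed Python file names; `changed_modules_expr` is a hypothetical helper, and matching tests by file basename is an assumption that may not fit every project layout.

```shell
# Hypothetical helper: read changed file paths on stdin and print a pytest -k
# expression built from the Python module names, joined with " or ".
changed_modules_expr() {
  grep -E '\.py$' \
    | sed -e 's#.*/##' -e 's#\.py$##' \
    | grep -vE '^(__init__|conftest)$' \
    | sort -u \
    | paste -sd'|' - \
    | sed 's/|/ or /g'
}
```

A possible use is `EXPR=$(git diff origin/$DEFAULT_BRANCH..HEAD --name-only | changed_modules_expr)`, followed by `timeout $E2E_TIMEOUT pytest tests/e2e/ -k "$EXPR"` only when `EXPR` is non-empty.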
When the diff includes frontend changes (templates, components, styles, layouts), visually verify the running application in a browser. This catches rendering issues, broken layouts, and visual regressions that automated tests often miss.
Detecting UI changes in the diff:
git diff origin/$DEFAULT_BRANCH..HEAD --name-only | grep -iE "\.(tsx|jsx|vue|svelte|html|css|scss|sass|less|ejs|hbs|pug)$"
If UI files changed and the dev server is running:
If a browser MCP tool is available (e.g., Puppeteer, Playwright MCP), use it to:
- Navigate to http://localhost:{PORT}{route}
- Take a screenshot
- Check the browser console for errors
If no browser MCP but Playwright is installed, take automated screenshots:
# Quick screenshot of affected pages
npx playwright test --project=chromium -g "screenshot" 2>&1 || \
npx playwright screenshot http://localhost:$PORT{route} screenshot-{route-name}.png 2>&1
If no browser automation is available, fetch the page HTML and verify structure:
# Fetch rendered HTML and check for expected elements
curl -s http://localhost:$PORT{route} | grep -E "<(main|section|div|h1)" | head -20
Use the WebFetch tool for richer inspection — it renders JavaScript and returns the page content, which is useful for SPAs where curl only sees a shell <div id="root">.
| Check | How | Evidence |
|-------|-----|----------|
| Page loads without errors | Browser console or curl status | Screenshot or 200 OK |
| Layout isn't broken | Screenshot or HTML structure check | Screenshot file path |
| New UI elements are present | Look for expected elements in DOM | Element found / not found |
| No console errors or warnings | Browser console output | Clean console or error list |
| Interactive elements work | Click/interact via browser tool | Before/after screenshots |
Record results in the Visual Verification section of the output.
For each acceptance criterion from the linked issue:
Kill any background services started in Step 1:
kill $DEV_PID 2>/dev/null
# For Docker Compose
docker compose down 2>/dev/null
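If a verification step aborts early, the cleanup commands above never run. One option, sketched here as an assumption rather than part of the skill, is to register the cleanup with `trap` as soon as the server starts; `cleanup` is a hypothetical function name and `DEV_PID` is the variable set at startup.

```shell
# Hypothetical cleanup routine: stop the dev server if it is still running.
cleanup() {
  if [ -n "${DEV_PID:-}" ] && kill -0 "$DEV_PID" 2>/dev/null; then
    kill "$DEV_PID" 2>/dev/null
    # Reap the child so it does not linger; ignore its termination status
    wait "$DEV_PID" 2>/dev/null || true
  fi
  return 0
}

# Run cleanup whenever the shell exits; route Ctrl-C and TERM through exit
# so the EXIT trap fires for them as well.
trap cleanup EXIT
trap 'exit 130' INT
trap 'exit 143' TERM
```

Projects started through Docker Compose would still add `docker compose down` inside `cleanup`.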
## Runtime Verification Results
### Service Status
| Service | Command | Status | Notes |
|---------|---------|--------|-------|
| Dev server | npm run dev | Started on :3000 | Healthy after 3s |
### Smoke Tests
| Endpoint/Feature | Test | Status | Evidence |
|-----------------|------|--------|----------|
| POST /api/users | Create user with valid data | Pass | 201 Created |
| POST /api/users | Create user with invalid email | Pass | 400 Bad Request |
### E2E Tests
| Suite | Status | Passed | Failed |
|-------|--------|--------|--------|
| Playwright | Pass | 12 | 0 |
### Visual Verification
| Page/Route | Check | Status | Evidence |
|------------|-------|--------|----------|
| /dashboard | Page loads | Pass | Screenshot: screenshot-dashboard.png |
| /dashboard | No console errors | Pass | Clean console |
| /users/new | Form renders correctly | Pass | All fields present |
### Acceptance Criteria
| Criterion | Verification Method | Status | Evidence |
|-----------|-------------------|--------|----------|
| Users can filter by date | GET /api/users?date=2024-01-01 | Pass | Returns filtered results |
### Not Verified (Requires Manual Check)
| Item | Reason |
|------|--------|
| Visual styling matches mockup | No browser automation available |
| Missing Capability | Fallback |
|-------------------|----------|
| No dev server command | Skip service startup, run only static checks |
| No E2E framework | Skip E2E, note as unverified |
| No health endpoint | Poll port availability with nc -z instead |
| No verification commands in CLAUDE.md | Infer from tech stack, ask user if ambiguous |
| Server won't start | Report failure with logs, don't block workflow |
| E2E tests timeout | Report timeout, suggest running a subset |
| No browser tool | Use Playwright screenshots, then WebFetch, then curl HTML check |
| No UI changes in diff | Skip visual verification entirely |
This skill is invoked by:
- gh-start: Phase 7 (after quality checks, before code review)
- gh-pr: Step 3.6 (pre-PR runtime verification)

IMPORTANT: Runtime verification is additive, not blocking. If a project has no dev server or E2E framework, this skill completes with "skipped" status and the workflow continues.