QA Run Skill

Execute the test plan and report failures as tickets in the project's tracker (YouTrack or GitHub, based on .claude/ticket-config.json).

This skill runs scenarios. It does not write or update them — that's /qa-sync.

Arguments

| Flag | Default | Effect |
|---|---|---|
| --flow <name> | all flows | Run only docs/qa/<name>.md |
| --scenario <id> | all in scope | Run a single scenario by id |
| --no-tickets | off | Run scenarios but do not create tickets on failure (dry mode) |
| --max-failures <N> | unlimited | Stop after N failed scenarios (avoids ticket flooding on broken envs) |
| --skip-preflight | off | Skip the application reachability check (use when targeting an external URL the sandbox can't ping); also skips the Step 2b pre-run hook |
| --no-grouping | off | Disable root-cause bucketing: create one ticket per FAILED scenario (legacy behavior, see Step 5c) |

Workflow

Step 1 — Pre-flight: configuration

Verify the prerequisites. Abort early with a clear message if anything is missing.

Playwright MCP available. Look up mcp__playwright__* tools in the current session. If none, tell the user:

Playwright MCP n'est pas configuré. Installez-le avec :

  claude mcp add playwright npx -- @playwright/mcp@latest

Puis relancez /qa-run.

Stop the skill.

Project config. Read .claude/ticket-config.json. If absent, print "Lancez /init-project puis /qa-sync avant /qa-run." and stop. If present but without a qa section, print "Lancez /qa-sync pour bootstrapper la QA." and stop.
.env.qa. If at least one scenario uses auth: user|admin, check for .env.qa at the project root. If missing → ask the user whether to abort or run with anonymous-only scenarios.
Test plan present. docs/qa/ must contain at least one *.md file other than README.md.

Step 2 — Pre-flight: application reachability

Skip this step when --skip-preflight is set.

BASE_URL=$(jq -r '.qa.base_url' .claude/ticket-config.json)
HTTP_CODE=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 5 "$BASE_URL") || HTTP_CODE="000"  # on failure curl already prints 000, so assign instead of appending a second 000

Treat as down when HTTP_CODE is 000 (timeout/connection refused), or starts with 5. Treat 200-499 as up (a 4xx on the home page is rare but means the server is responding).
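As a minimal sketch of the rule above, the up/down decision can be isolated in a small shell function. The name classify_http_code is hypothetical, and treating an empty value as down is an assumption the skill does not state.

```shell
# Hypothetical helper: map the curl %{http_code} string to "up" or "down".
classify_http_code() {
  case "$1" in
    ""|000|5*) echo "down" ;;   # empty is treated as down (assumption)
    *)         echo "up" ;;     # any other status means the server answered
  esac
}

classify_http_code 503   # prints: down
classify_http_code 404   # prints: up
```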

If down:

Detect docker-compose.yml, compose.yml, docker-compose.yaml, or compose.yaml at the project root.

If a compose file exists, ask:

AskUserQuestion:
  question: "L'app sur ${BASE_URL} ne répond pas. Démarrer la stack Docker ?"
  header: "App down"
  options:
    - label: "Oui, lancer docker compose up -d (Recommended)"
      description: "Démarre les services et attend que l'app réponde (max 30s)"
    - label: "Non, je gère manuellement"
      description: "Abandonne, à toi de démarrer puis de relancer /qa-run"
    - label: "Annuler"
      description: "Abandonne sans rien tenter"

Oui: run docker compose up -d. Then poll:

for i in $(seq 1 15); do
  sleep 2
  CODE=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 3 "$BASE_URL") || CODE="000"  # assign on failure; appending would yield 000000 and skip the 000 branch
  case "$CODE" in
    000|5*) continue ;;
    *) echo "App ready (HTTP $CODE)"; exit 0 ;;
  esac
done
echo "App still not reachable after 30s — check docker logs"; exit 1

If still KO after polling → abort with a message pointing to docker compose logs and stop the skill.

Non: print "OK, lance ton app puis relance /qa-run." and stop the skill.
Annuler: stop the skill.

If no compose file exists, suggest the most likely start command based on what's at the project root (package.json → yarn dev / npm run dev; composer.json with symfony/framework-bundle → symfony serve or php -S; etc.) and stop the skill. Do not spawn a long-running process from the skill: orphan processes are worse than a clear error.
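The compose-file detection used in this step can be sketched as a short lookup over the four file names listed above. The function name find_compose_file and the choice to return the first match in that order are assumptions for illustration.

```shell
# Hypothetical helper: print the first compose file found in directory $1.
# Returns non-zero when none of the four candidate names exists.
find_compose_file() {
  for name in docker-compose.yml compose.yml docker-compose.yaml compose.yaml; do
    if [ -f "$1/$name" ]; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# Usage: report the compose file at the project root, if any.
if compose=$(find_compose_file .); then
  echo "compose file: $compose"
else
  echo "no compose file at the project root"
fi
```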

Step 2b — Pre-run hook

Read .claude/ticket-config.json key qa.pre_run (string, optional). When set, this command resets fixtures / seeds the QA environment between runs (typical: make qa-cycle, npm run qa:reset, bin/console app:qa:reset && bin/console app:qa:seed).

PRE_RUN=$(jq -r '.qa.pre_run // empty' .claude/ticket-config.json)

If PRE_RUN is non-empty:

Print "Running pre-run hook: $PRE_RUN".
Execute via Bash. Capture stdout/stderr.
If the exit code is non-zero, print the captured stderr verbatim and abort the run with:
```
Pre-run hook failed (exit <code>). Aborting /qa-run.
See output above. Fix the seed/reset failure or temporarily clear `.qa.pre_run` to skip.
```
Do not proceed to scenarios — running with stale or corrupted fixtures is worse than failing fast.
Auto-extract QA_* variables from stdout. Parse the captured stdout for lines matching ^QA_[A-Z][A-Z0-9_]*=.*$ (full-line match — values embedded in log noise are ignored). Trim trailing whitespace including \r (CRLF-safe — Docker / Windows-spawned hooks often emit \r\n); keep the value as printed (the hook owns its quoting). Empty values (KEY=) are kept as-is — that is the hook's way of clearing a variable. If a key appears multiple times in stdout, the last occurrence wins (mirrors how source would resolve duplicates).
- No matches: skip silently (most hooks emit no QA vars — that's fine).
- At least one match, but .env.qa.local does not exist at the project root: print the refusal followed by the extracted pairs so the user can copy them without re-scrolling through hook output:
```
Pre-run produced QA_* vars but .env.qa.local is missing. Refusing to create it
(would silently materialize a secrets file). Add the following lines manually,
then re-run /qa-run:

  QA_SESSION_ID=<uuid>
  QA_USER_ID=<uuid>
  ...
```
  Do not abort the run — fixtures are already reset; aborting would force another full cycle. Continue with scenarios; auth-protected ones may fail until the user creates the file.
- At least one match and .env.qa.local exists: for each parsed KEY=VALUE (after duplicate resolution), replace the existing line ^KEY=... in place if present, otherwise append KEY=VALUE at the end. Preserve every other line (comments, unrelated keys, trailing newline) untouched. Use a temp-file rewrite — never an in-place sed regex on user data, since values may contain regex metachars. Print:
```
Updated .env.qa.local: <N> QA_* variable(s) refreshed (<comma-separated keys>).
```

If qa.pre_run is unset, skip silently.

Skip this whole sub-step when --skip-preflight is set (the flag now also skips this hook — same intent: "I know what I'm doing, just run scenarios").
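The extraction rule from this step (full-line match, CRLF-safe trim, empty values kept, last occurrence wins) could look roughly like the following. This is a sketch under assumptions: extract_qa_vars is a hypothetical name, the hook's stdout arrives on stdin, and keys are printed in first-seen order, which the skill does not specify.

```shell
# Hypothetical helper: read hook stdout on stdin and print the QA_* pairs.
# LC_ALL=C keeps [A-Z] limited to ASCII uppercase letters.
extract_qa_vars() {
  LC_ALL=C awk '
    { sub(/[ \t\r]+$/, "") }                  # trim trailing whitespace, including CR
    /^QA_[A-Z][A-Z0-9_]*=/ {                  # anchored at line start: log-prefixed lines are ignored
      eq = index($0, "=")
      key = substr($0, 1, eq - 1)
      if (!(key in value)) order[++n] = key   # remember first-seen order
      value[key] = substr($0, eq + 1)         # a later occurrence overwrites the earlier one
    }
    END { for (i = 1; i <= n; i++) print order[i] "=" value[order[i]] }
  '
}

# Usage with a simulated hook output (the second QA_USER_ID wins).
printf 'seeding...\nQA_USER_ID=1\r\ninfo: QA_SESSION_ID=abc\nQA_USER_ID=2\n' | extract_qa_vars
# prints: QA_USER_ID=2
```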

Pre-run hook output convention

For auto-extraction to work, the command pointed to by qa.pre_run should print one KEY=VALUE line per variable to refresh on stdout (typical: QA_SESSION_ID=<uuid>, QA_USER_ID=<uuid>, ...). Rules:

Only keys matching ^QA_[A-Z][A-Z0-9_]*$ are extracted. Anything else is left alone.
The QA_ prefix is intentional: it scopes auto-extraction to variables the hook explicitly designates as QA fixtures, avoiding accidental capture of unrelated env vars the command may echo (PATH, CI tokens, ...).
The line must be a full match — info: QA_SESSION_ID=abc is ignored.
.env.qa.local must already exist; the skill never creates it (avoids silently materializing a secrets file).
Existing keys are replaced in place; unrelated lines are preserved verbatim. If the same key is printed twice in stdout, the last occurrence wins.
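A temp-file rewrite that follows these rules might be sketched as below. The helper name update_env_file and the intermediate pairs file are hypothetical. Keys are compared as literal strings, so values containing regex metacharacters are written untouched.

```shell
# Hypothetical helper: update_env_file ENV_FILE PAIRS_FILE
# PAIRS_FILE holds the already-deduplicated KEY=VALUE lines.
update_env_file() {
  env_file=$1
  pairs_file=$2
  [ -f "$env_file" ] || return 1     # never create the secrets file
  [ -s "$pairs_file" ] || return 0   # nothing to refresh
  tmp=$(mktemp "${env_file}.XXXXXX") || return 1
  if LC_ALL=C awk '
    NR == FNR {                      # first file: the new pairs
      if (index($0, "=") == 0) next
      key = substr($0, 1, index($0, "=") - 1)
      if (!(key in fresh)) order[++n] = key
      fresh[key] = $0
      next
    }
    {                                # second file: the existing env file
      eq = index($0, "=")
      key = (eq > 0) ? substr($0, 1, eq - 1) : ""
      if (key != "" && (key in fresh)) { print fresh[key]; written[key] = 1 }
      else print                     # comments and unrelated keys pass through
    }
    END {                            # append keys that were not present yet
      for (i = 1; i <= n; i++) if (!(order[i] in written)) print fresh[order[i]]
    }
  ' "$pairs_file" "$env_file" > "$tmp"; then
    mv "$tmp" "$env_file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

One limitation of this sketch: awk always ends its output with a newline, and mv gives the file the permissions of the temp file, so a project that cares about either detail would need extra handling.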

Step 3 — Parse the test plan

Iterate over docs/qa/*.md (excluding README.md). For each file, extract every fenced ```yaml ... ``` block whose top-level keys include id: and start: — that's a scenario.

Build a flat list of scenarios with their source file and the section heading they appear under.

Apply filters:

If --scenario <id>: keep only that ID. Abort if not found.
If --flow <name>: keep only scenarios whose source file is docs/qa/<name>.md. Abort if file missing or has no scenarios.

Pre-filter: fully manual scenarios. After applying --scenario / --flow filters, inspect each remaining scenario's steps: list. A scenario is fully manual when its steps: list is non-empty AND every entry in steps: uses the check_manual: action key. For each such scenario:

Set status to SKIPPED.
Set reason to "scenario fully manual (all steps are check_manual)" (a fixed string, used by Step 7 to group these under a single "Skipped scenarios" heading).
Remove it from the active execution list.
Do not open the browser for it — it is never passed to Step 5 (execution loop).

Hybrid scenarios (at least one non-check_manual: action in steps:) and scenarios with an empty or absent steps: are not affected. The pre-filtered SKIPPED scenarios are kept in memory and surfaced in Step 7's "Skipped scenarios" subsection exactly like precondition-based and auth-based SKIPPED outcomes.

The check_manual: action is a step action only — never an assertion — per ~/.claude/skills/qa-sync/references/scenario-format.md §Step actions. That is why this pre-filter inspects steps: only and not expect:.

If the resulting list is empty, print "No scenarios match the filters." and stop.

Step 4 — Authentication setup

If any scenario uses auth: user, auth: admin, auth: user_fresh, or auth: user_unverified, the runner handles authentication per references/playwright-runner.md §Auth handling. This step pre-warms the cache for user and admin only:

For each user / admin profile referenced in the filtered plan, look for a scenario with id: AUTH-PROFILE-USER (or AUTH-PROFILE-ADMIN).
If present, run it; on success, capture the resulting storage state (cookies + localStorage) into an in-memory cache keyed by profile, per references/playwright-runner.md §Auth handling step 2.
If absent, fall back to a simple form login: navigate to /login, fill ${QA_USER_EMAIL} and ${QA_USER_PWD} from .env.qa, submit, and apply the same capture as step 2. If this fails, log a warning and SKIP auth-protected scenarios with reason auth setup failed.

user_fresh and user_unverified are not pre-warmed here: they re-login (with their fixture-specific credentials, see §Auth handling step 5) just before each scenario that uses them, and their state is never cached.

The captured storage is reused for every same-profile scenario via injection (no re-login). Re-login is triggered only on profile change, on qa.session_revalidation_endpoint returning 401, or on capture failure. The full procedure - including the injection, revalidation, and fallback rules - lives in references/playwright-runner.md §Auth handling.

Step 5 — Execute scenarios

For each scenario, execute the scenario blocks in this order: precondition → setup → start → steps → expect → teardown. Between steps and expect the runner performs an internal drain of the network and console captures (sub-step 5 below); this drain is a runtime step of the runner, not a scenario block — authors of docs/qa/*.md cannot write a drain: key. The auth profile setup (Step 4) has already run before this step — preconditions need an authenticated session for ${QA_USER_ID}-style references.

1. Open a browser context. If auth: matches the current profile and a captured storage state exists for it, inject the cached cookies + localStorage (per references/playwright-runner.md §Auth handling step 3b) and run the optional session revalidation if qa.session_revalidation_endpoint is set. On profile change, on revalidation 401, or when no cache exists, clear the context and re-login (steps 3a / 4 of §Auth handling) before recapturing.
2. Start a console + network capture for the duration of the scenario.
3. Preconditions (if precondition: block present): evaluate each guard via the browser_evaluate templates documented in references/playwright-runner.md §Precondition mapping. The schema and SKIPPED reason format are canonical in ~/.claude/skills/qa-sync/references/scenario-format.md §Preconditions.
   - On the first failing guard: mark the scenario SKIPPED with reason = <formatted guard string> (e.g., precondition GET /api/game-sessions/abc failed: status 404). Do NOT execute setup, start, steps, expect, or teardown, since setup never ran (the internal drain is also skipped, since there is nothing to drain). Skip to the next scenario.
   - On all guards passing: continue to step 4.
4. Execute setup, then navigate to start, then each step.
5. Drain network and console (internal runtime step), before evaluating any expect: assertion. Call mcp__playwright__browser_network_requests and mcp__playwright__browser_console_messages exactly once and store the raw lists on the scenario record. They are reused by the assertion sub-step below, by Step 5c (root-cause bucketing), and by Step 6 (<console_block> / <network_block> in the ticket body). Filtering rules are documented in references/playwright-runner.md §status_max and §console_clean: same-host, drop static assets, and drop OPTIONS for the network drain; type: "error" (or level: "error" on MCP builds that expose level instead) for the console.
6. Evaluate each expect: assertion. Map actions to Playwright MCP calls via references/playwright-runner.md. The new status_max: and console_clean: assertions consume the drains from sub-step 5 and FAIL with normalized <observed> lines (Réponse <METHOD> <path> → <status> dépasse status_max=<n>, Erreur console: <first_error_message>).
7. Track outcome: PASSED, FAILED (which step/assertion), or SKIPPED (precondition failure, check_manual:, or auth setup failure; see Edge cases).
8. Run teardown regardless of pass/fail (best-effort, swallow errors with a warning). Exception: if the scenario was SKIPPED due to a precondition failure, teardown is also skipped (per step 3 above).

Stop the loop early if --max-failures is reached. Note: SKIPPED scenarios do not count toward --max-failures (only FAILED scenarios do).
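To illustrate the counting rule, the sketch below replays a list of already-known outcomes and reports how many scenarios run before the loop stops. The function name count_executed is hypothetical, and using 0 to mean "unlimited" is an assumption made for this example only.

```shell
# Hypothetical helper: count_executed MAX_FAILURES OUTCOME...
# Only FAILED outcomes count toward the limit; SKIPPED and PASSED do not.
count_executed() {
  max=$1
  shift
  failed=0
  ran=0
  for status in "$@"; do
    ran=$((ran + 1))
    if [ "$status" = "FAILED" ]; then
      failed=$((failed + 1))
    fi
    # 0 stands for "unlimited" in this sketch (assumption)
    if [ "$max" -gt 0 ] && [ "$failed" -ge "$max" ]; then
      break
    fi
  done
  echo "$ran"
}

count_executed 2 PASSED FAILED SKIPPED FAILED PASSED   # prints: 4
```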

Step 5c — Bucket FAILED scenarios by root cause

Skip this step entirely if --no-grouping is set, or if --no-tickets is set (no tickets means no buckets to merge), or if there are fewer than 2 FAILED scenarios.

When N scenarios share the same root cause (stale fixture, missing endpoint, missing selector), opening N near-identical tickets pollutes the tracker. Group failures before Step 6 so the tracker sees one parent ticket per root cause instead of N siblings.

Signature derivation

For each FAILED scenario, walk the signals in priority order and stop at the first hit. The first matching signal becomes the bucket key.

| Priority | Signal | Bucket key | Confidence |
|---|---|---|---|
| 1 | <network_block> contains a 4xx/5xx response | (<flow_file>, "http", <method>, <url_path_no_query>, <status>) | HIGH if every member of the final bucket shares the same <step_number>; otherwise MEDIUM. (Evaluate after all members have been assigned. A bucket of size 1 is a singleton, so confidence does not apply.) |
| 2 | <observed> matches a missing-selector pattern (Élément introuvable: <selector>, Element not found: <selector>, or Playwright "no element matches selector ...") | (<flow_file>, "selector", <normalized_selector>) | HIGH |
| 3 | <console_block> contains a [error] ... line | (<flow_file>, "console", <normalized_first_error>) | MEDIUM |
| 4 | None of the above | (<flow_file>, "scenario", <id>), singleton | n/a (no merge) |

Normalization:

URL path: drop query string and fragment; keep host + path.
Selector: trim outer whitespace; collapse internal whitespace to single space.
Console error: strip the [error] / [warn ] prefix; truncate at first newline; cap at 120 chars.
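The three normalization rules above can be sketched with standard text tools. All function names are hypothetical. The sketch assumes the URL is passed as a single string and that the console entry starts with its [error] or [warn ] prefix.

```shell
# Drop the query string and fragment; everything before the first ? or # is kept.
normalize_url_path() {
  printf '%s\n' "$1" | sed 's/[?#].*$//'
}

# Collapse runs of whitespace to one space, then trim both ends.
normalize_selector() {
  printf '%s\n' "$1" | sed 's/[[:space:]][[:space:]]*/ /g; s/^ //; s/ $//'
}

# Keep the first line only, strip the level prefix, cap at 120 characters.
normalize_console_error() {
  printf '%s\n' "$1" |
    awk 'NR == 1 { sub(/^\[(error|warn )\] */, ""); print substr($0, 1, 120); exit }'
}

normalize_url_path 'https://app.test/api/coupons?code=X#top'   # prints: https://app.test/api/coupons
normalize_selector '  button   [data-test=apply]  '            # prints: button [data-test=apply]
```

Some awk implementations count bytes rather than characters in substr, so the 120 cap may cut a multi-byte message slightly earlier than expected.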

The bucket key always includes <flow_file> as a prefix — never group across docs/qa/<flow>.md files. A 401 in auth.md and a 401 in cards.md get separate buckets even when the URL is identical: a fix in one flow does not necessarily fix the other.

Don't combine signals into a composite key. An HTTP 500 that also produces a console error has signal 1 win — both scenarios sharing that 500 land in the same HTTP bucket whether or not they emit identical console lines.
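The priority walk can be pictured as a chain of checks that returns at the first hit. In this sketch, signal_kind is a hypothetical name and its three arguments (a space-separated list of response statuses, the <observed> line, and the console block) are a simplified stand-in for the real scenario record.

```shell
# Hypothetical helper: signal_kind "STATUSES" "OBSERVED" "CONSOLE_BLOCK"
# Prints http, selector, console, or scenario, in that priority order.
signal_kind() {
  for status in $1; do                       # priority 1: any 4xx/5xx response
    case "$status" in
      [45][0-9][0-9]) echo "http"; return 0 ;;
    esac
  done
  case "$2" in                               # priority 2: missing-selector patterns
    "Élément introuvable: "*|"Element not found: "*|*"no element matches selector"*)
      echo "selector"; return 0 ;;
  esac
  case "$3" in                               # priority 3: a console [error] line
    *"[error]"*) echo "console"; return 0 ;;
  esac
  echo "scenario"                            # priority 4: singleton fallback
}

# An HTTP 500 wins even when a console error is also present.
signal_kind '200 500' 'Texte attendu absent' '[error] boom'   # prints: http
```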

Confidence and merge policy

The skill is invocation-mode aware. Detect the mode by checking whether AskUserQuestion calls earlier in this run were honored (typical sign of an interactive Claude session) — in practice, qa-run runs in interactive Claude unless the user has explicitly opted out (e.g., a wrapper that sets a non-interactive context).

| Confidence | Interactive | Non-interactive (auto) |
|---|---|---|
| HIGH | Ask once via AskUserQuestion; default option is "merge". | Merge silently into a parent bucket. No prompt. |
| MEDIUM | Ask once via AskUserQuestion; default option is "merge". On no/timeout: fall back to per-scenario tickets. | Do not merge; fall back to per-scenario tickets (current behavior). The ticket spec calls for merging only when confidence is high in auto mode. |
| n/a (singleton) | No merge; proceed as today (one ticket per scenario). | Same. |

This split honors the ticket: "propose via AskUserQuestion 'merge these N failures into 1 parent ticket?'" (interactive default) and "in auto mode, merge if confidence is high; otherwise, current behavior" (auto fallback).
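Read as a decision function, the policy table reduces to a lookup on confidence and mode. The sketch below uses a hypothetical merge_decision helper. Treating a missing answer on a HIGH bucket the same way as on a MEDIUM one (split) is an assumption, since the table only spells out the timeout case for MEDIUM.

```shell
# Hypothetical helper: merge_decision CONFIDENCE MODE [ANSWER]
# MODE is "interactive" or "auto"; ANSWER is "yes" when the user accepts the merge.
merge_decision() {
  case "$1:$2" in
    HIGH:auto)   echo "parent" ;;            # merge silently
    MEDIUM:auto) echo "split" ;;             # per-scenario tickets
    HIGH:interactive|MEDIUM:interactive)
      if [ "${3:-}" = "yes" ]; then
        echo "parent"
      else
        echo "split"                         # "no" or timeout
      fi ;;
    *) echo "split" ;;                       # singletons (n/a) never merge
  esac
}

merge_decision HIGH auto                 # prints: parent
merge_decision MEDIUM interactive yes    # prints: parent
```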

AskUserQuestion shape for HIGH or MEDIUM in interactive mode:

AskUserQuestion:
  question: "<N> scénarios échouent avec la même cause probable: <signature_summary>. Créer 1 ticket parent au lieu de <N> ?"
  header: "Group failures"
  options:
    - label: "Oui, créer 1 ticket parent (Recommended)"
      description: "Liste les <N> scénarios concernés dans le corps"
    - label: "Non, créer <N> tickets séparés"
      description: "Comportement legacy — utile si les scénarios doivent être suivis individuellement"

Output of this step

Produce a list of buckets, where each bucket carries:

bucket_key (the tuple above)
signature_summary (human-readable, see references/issue-templates.md §Signature summary)
flow_file
members (list of {id, step_number, observed, console_block, network_block})
confidence (HIGH | MEDIUM | n/a)
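merge_decision (parent | split): follows the merge policy table above. In interactive mode it is the user's answer for HIGH and MEDIUM buckets. In auto mode it is parent for HIGH and split for MEDIUM. Singletons are always split.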

Pass this list to Step 6 instead of the flat FAILED list. Singleton buckets (merge_decision = split) flow through Step 6 unchanged; parent buckets use the parent templates.

Step 6 — Issue creation on failure

Skip this step entirely if --no-tickets is set.

Iterate over the buckets produced by Step 5c (or the flat FAILED list when Step 5c was skipped — --no-grouping or fewer than 2 failures; --no-tickets already short-circuited above). For each bucket:

Singleton bucket (size 1, or any bucket with merge_decision = split): use the per-scenario flow below — title [QA] <id> failed at step <N>, dedup key [QA] <id>, body from references/issue-templates.md §Body — GitHub or §Body — YouTrack.
Parent bucket (size >= 2 with merge_decision = parent): use the parent flow below — title [QA] <flow_basename> — <signature_summary> (<N> scenarios), dedup key [QA] <flow_basename> — <signature_summary>, body from references/issue-templates.md §Parent bucket. <flow_basename> is the source filename without extension (e.g., checkout for docs/qa/checkout.md).

6a. Duplicate detection (default behavior)

Before creating any new ticket, search the tracker for an open issue whose title starts with the bucket's dedup key.

| Bucket type | Dedup key | Rationale |
|---|---|---|
| Singleton | [QA] <id> | Same scenario failing across runs is the same bug. The trailing step number can differ between runs, so match by ID prefix. |
| Parent | [QA] <flow_basename> — <signature_summary> | Same root cause across runs is the same bug. The scenario count and member list change between runs, so match by flow + signature. |

GitHub:

gh issue list --state open --search "<dedup_key> in:title" --json number,title,url --repo "<github.repo>"

Match any result whose title starts with <dedup_key>.
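Because the tracker search is fuzzy, the prefix check is worth doing locally. The sketch below assumes the search results have already been flattened to one number<TAB>title line per issue; the helper name first_open_issue_with_prefix is hypothetical. The comparison is literal, so the brackets in [QA] are not treated as a pattern.

```shell
# Hypothetical helper: read "number<TAB>title" lines on stdin and print the
# number of the first issue whose title starts with the dedup key in $1.
first_open_issue_with_prefix() {
  DEDUP_KEY=$1 awk -F '\t' 'index($2, ENVIRON["DEDUP_KEY"]) == 1 { print $1; exit }'
}

# Usage with two simulated search results.
printf '12\t[QA] LOGIN-01 failed at step 2\n15\t[QA] CHECKOUT-01 failed at step 4\n' |
  first_open_issue_with_prefix '[QA] CHECKOUT-01'
# prints: 15
```

A plain prefix match also accepts longer IDs that share the prefix (a key ending in -1 matches an ID ending in -10). Appending a space to singleton keys is one possible guard, though the skill does not prescribe it.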

YouTrack: mcp__youtrack__search_issues with query project: <prefix> summary: "<dedup_key>" State: Open (or equivalent unresolved state).

If a matching open issue exists:

Do not create a new ticket.
Post a comment using the appropriate template in references/issue-templates.md:
- Singleton bucket → §Duplicate detection — comment template
- Parent bucket → §Parent bucket — comment template (includes the full member list of the current run)
Comment via:
- GitHub: gh issue comment <number> --body "<comment>"
- YouTrack: the appropriate mcp__youtrack__*_comment tool (or update_issue adding to the comments collection if no dedicated tool is available).
Track the comment in the run report under the existing ticket's URL — do not list it as a "new ticket".

If no match is found, proceed to 6b.

6b. Create the ticket

Submission via the project's tracker (read .claude/ticket-config.json):

YouTrack: mcp__youtrack__create_issue with project = qa.youtrack_project_prefix or youtrack.project_prefix, summary = title, description = body, type = "Bug". Add a label/tag from qa.label (default qa).
GitHub: gh issue create --title "<title>" --body "<body>" --label "<qa.label>,bug" --repo "<github.repo>".

Body placeholders <console_block> and <network_block> are filled from the per-scenario drains captured during Step 5's internal drain sub-step (between steps and expect). Formatting and caps follow references/issue-templates.md §Console block formatting (warn+error, 30-line cap) and §Network block formatting (>= 400 only, 20-row cap, sorted by status desc then URL). Empty sections are omitted per the template's "omit if empty" rule, but the drain itself always runs so the diagnostic context is captured even when an unrelated assertion is the headline failure.
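The network block rules (status >= 400 only, sorted by status descending then URL, capped at 20 rows) map naturally onto a filter, a sort, and a head. The row layout "STATUS METHOD URL" and the name format_network_rows are assumptions for this sketch; the actual row format lives in references/issue-templates.md.

```shell
# Hypothetical helper: stdin carries one "STATUS METHOD URL" row per request.
format_network_rows() {
  awk '$1 >= 400' |                          # keep error responses only
    LC_ALL=C sort -t ' ' -k1,1nr -k3,3 |     # status descending, then URL ascending
    head -n 20                               # 20-row cap
}

printf '200 GET /a\n404 GET /z\n500 POST /b\n404 GET /c\n' | format_network_rows
# prints:
# 500 POST /b
# 404 GET /c
# 404 GET /z
```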

Capture the returned issue URL/number for the final report. Track per-bucket whether the result is a singleton ticket, a parent ticket, or a duplicate-comment hit (Step 7 splits these).

Step 7 — Final report

Print a summary:

## QA run summary

| Flow | Scenarios | Passed | Failed | Skipped |
|------|-----------|--------|--------|---------|
| auth | 3 | 3 | 0 | 0 |
| checkout | 5 | 1 | 4 | 0 |

**Total**: 8 / 8 ran (4 passed, 4 failed across 2 root causes).

### Failure buckets

- **checkout — `POST /api/coupons → 500`** (3 scenarios)
  - CHECKOUT-DISCOUNT-01 (step 3)
  - CHECKOUT-DISCOUNT-02 (step 3)
  - CHECKOUT-PAYMENT-01 (step 5)

### Skipped scenarios

- **precondition GET /api/game-sessions/abc failed: status 404** (2 scenarios)
  - CHECKOUT-DISCOUNT-01
  - CHECKOUT-DISCOUNT-02
- **auth setup failed** (1 scenario)
  - LOGIN-SUCCESS-01

### Tickets

**Parent tickets opened** (root cause shared by N scenarios):
- [PROJ-142](https://...) — checkout `POST /api/coupons → 500` (3 scenarios: CHECKOUT-DISCOUNT-01, CHECKOUT-DISCOUNT-02, CHECKOUT-PAYMENT-01)

**Tickets opened**:
- [PROJ-143](https://...) — CHECKOUT-COUPON-01 failed at step 2

**Comments added** (existing tickets still failing):
- [PROJ-118](https://...) — checkout `GET /api/billing/profile → 401` still failing (2 scenarios)

Run `/qa-run --scenario <id>` to re-test a specific scenario after a fix.

Failure buckets subsection

The "Failure buckets" subsection lists every bucket from Step 5c that has at least 2 members (singletons add noise — they're already named in the per-scenario "Tickets opened" list below). Build it as follows:

Take the buckets produced by Step 5c. Drop singleton buckets (size 1).
If no buckets remain (Step 5c was skipped, every bucket was a singleton, or merge_decision was split for all MEDIUM buckets), omit the entire ### Failure buckets subsection — do not print an empty heading.
For each remaining bucket, print:
- Header line: **<flow_basename> — <signature_summary>** (<N> scenarios)
- Sub-list: each member as <id> (step <step_number>), sorted alphabetically by <id>
Sort buckets by descending member count, then alphabetically by <signature_summary> for ties.

This subsection makes root causes visible even when --no-tickets is set (no parent ticket gets opened, but the user still sees how failures cluster).

Skipped scenarios grouping

The "Skipped scenarios" subsection groups every SKIPPED scenario by its reason field. Build it as follows:

Collect all scenarios whose final status is SKIPPED. Each carries a reason string set in Steps 3 to 5: fully manual scenarios pre-filtered in Step 3 use the fixed string scenario fully manual (all steps are check_manual); precondition failure reasons match the format documented in ~/.claude/skills/qa-sync/references/scenario-format.md §Preconditions; auth setup failures use auth setup failed; check_manual: uses the manual description.
Group by exact reason string equality.
Sort groups by descending count, then alphabetically by reason for ties.
Within each group, list scenario IDs alphabetically.
If no scenario is SKIPPED, omit the entire ### Skipped scenarios subsection (do not print an empty heading).
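The grouping and ordering rules above can be sketched as a count, sort, and print pipeline. It assumes one "<reason><TAB><scenario id>" line per SKIPPED scenario on stdin; the helper name group_skipped is hypothetical. When stdin is empty nothing is printed, which leaves the caller free to omit the heading.

```shell
# Hypothetical helper: group SKIPPED scenarios by exact reason string.
group_skipped() {
  tab=$(printf '\t')
  # Pass 1: prefix every line with the size of its reason group.
  LC_ALL=C awk -F '\t' '
    { count[$1]++; line[NR] = $0; reason_of[NR] = $1 }
    END { for (i = 1; i <= NR; i++) print count[reason_of[i]] "\t" line[i] }
  ' |
    # Pass 2: count descending, then reason, then scenario id.
    LC_ALL=C sort -t "$tab" -k1,1nr -k2,2 -k3,3 |
    # Pass 3: print a header whenever the reason changes.
    LC_ALL=C awk -F '\t' '
      $2 != reason {
        reason = $2
        printf "- **%s** (%d scenario%s)\n", reason, $1, ($1 == 1 ? "" : "s")
      }
      { print "  - " $3 }
    '
}

printf '%s\t%s\n' \
  'auth setup failed' 'LOGIN-SUCCESS-01' \
  'scenario fully manual (all steps are check_manual)' 'PROFILE-02' \
  'scenario fully manual (all steps are check_manual)' 'PROFILE-01' | group_skipped
```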

Skipped scenarios do NOT open or comment on tickets — see references/issue-templates.md §When NOT to open a ticket. This subsection is the only visibility surface for skipped causes.

If no failures: end with "Tout est vert. Rien à signaler."

Patterns reused (no duplication)

Ticket source detection: same logic as ~/.claude/skills/fetch-ticket/SKILL.md. Read .claude/ticket-config.json, fall back to git remote.
Issue creation: same primitives as ~/.claude/skills/create-ticket/SKILL.md (gh CLI for GitHub, mcp__youtrack__create_issue for YouTrack).
AskUserQuestion patterns: same shape as ~/.claude/skills/init-project/SKILL.md.
Bucketing pattern: Step 5c reuses the group-by-key + sort-by-descending-count idea from the existing "Skipped scenarios grouping" (Step 7); the only difference is the key derivation (root cause signature vs. exact reason string).

Edge cases

Auth setup fails: continue with anonymous scenarios; mark all auth: user|admin scenarios as SKIPPED with reason auth setup failed.
Single scenario explodes (browser crash, MCP timeout): log the trace, mark the scenario FAILED, continue.
--max-failures reached: stop, print partial summary, mention --skip-preflight/--scenario for re-runs.

What this skill does NOT do

Does not modify scenarios. Use /qa-sync for that.
Does not commit anything.
Does not deploy or restart the application beyond docker compose up -d during preflight.
Does not perform visual regression. Use visual-verify agent for design-vs-render checks.
Does not group failures across different docs/qa/<flow>.md files. A 401 in auth.md and a 401 in cards.md produce separate buckets — a fix in one flow does not auto-fix the other.

QA Run Skill

Execute the test plan and report failures as tickets in the project's tracker (YouTrack or GitHub, based on .claude/ticket-config.json).

This skill runs scenarios. It does not write or update them — that's /qa-sync.

Arguments

Workflow

Step 1 — Pre-flight: configuration

Verify the prerequisites. Abort early with a clear message if anything is missing.

Playwright MCP available. Look up mcp__playwright__* tools in the current session. If none, tell the user:

Playwright MCP n'est pas configuré. Installez-le avec :

  claude mcp add playwright npx -- @playwright/mcp@latest

Puis relancez /qa-run.

Stop the skill.

Project config. Read .claude/ticket-config.json. If absent → Lancez /init-project puis /qa-sync avant /qa-run. and stop. If present but no qa section → Lancez /qa-sync pour bootstrapper la QA. and stop.
.env.qa. If at least one scenario uses auth: user|admin, check for .env.qa at the project root. If missing → ask the user whether to abort or run with anonymous-only scenarios.
Test plan present. docs/qa/ must contain at least one *.md file other than README.md.

Step 2 — Pre-flight: application reachability

Skip this step when --skip-preflight is set.

BASE_URL=$(jq -r '.qa.base_url' .claude/ticket-config.json)
HTTP_CODE=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 5 "$BASE_URL" || echo "000")

Treat as down when HTTP_CODE is 000 (timeout/connection refused), or starts with 5. Treat 200-499 as up (a 4xx on the home page is rare but means the server is responding).

If down:

Detect docker-compose.yml, compose.yml, docker-compose.yaml, or compose.yaml at the project root.

If a compose file exists, ask:

AskUserQuestion:
  question: "L'app sur ${BASE_URL} ne répond pas. Démarrer la stack Docker ?"
  header: "App down"
  options:
    - label: "Oui, lancer docker compose up -d (Recommended)"
      description: "Démarre les services et attend que l'app réponde (max 30s)"
    - label: "Non, je gère manuellement"
      description: "Abandonne, à toi de démarrer puis de relancer /qa-run"
    - label: "Annuler"
      description: "Abandonne sans rien tenter"

Oui: run docker compose up -d. Then poll:

for i in $(seq 1 15); do
  sleep 2
  CODE=$(curl -sS -o /dev/null -w "%{http_code}" --max-time 3 "$BASE_URL" || echo "000")
  case "$CODE" in
    000|5*) continue ;;
    *) echo "App ready (HTTP $CODE)"; exit 0 ;;
  esac
done
echo "App still not reachable after 30s — check docker logs"; exit 1

If still KO after polling → abort with a message pointing to docker compose logs and stop the skill.

Non: print OK, lance ton app puis relance /qa-run. and stop the skill.
Annuler: stop the skill.

No compose file, suggest the most likely command based on what's at the root (package.json → yarn dev / npm run dev; composer.json with symfony/framework-bundle → symfony serve or php -S; etc.) and stop the skill. Do not spawn a long-running process from the skill — orphan processes are worse than a clear error.

Step 2b — Pre-run hook

PRE_RUN=$(jq -r '.qa.pre_run // empty' .claude/ticket-config.json)

If PRE_RUN is non-empty:

Print Running pre-run hook: $PRE_RUN.
Execute via Bash. Capture stdout/stderr.
If the exit code is non-zero, print the captured stderr verbatim and abort the run with:
```
Pre-run hook failed (exit <code>). Aborting /qa-run.
See output above. Fix the seed/reset failure or temporarily clear `.qa.pre_run` to skip.
```
Do not proceed to scenarios — running with stale or corrupted fixtures is worse than failing fast.
Auto-extract QA_* variables from stdout. Parse the captured stdout for lines matching ^QA_[A-Z][A-Z0-9_]*=.*$ (full-line match — values embedded in log noise are ignored). Trim trailing whitespace including \r (CRLF-safe — Docker / Windows-spawned hooks often emit \r\n); keep the value as printed (the hook owns its quoting). Empty values (KEY=) are kept as-is — that is the hook's way of clearing a variable. If a key appears multiple times in stdout, the last occurrence wins (mirrors how source would resolve duplicates).
- No matches: skip silently (most hooks emit no QA vars — that's fine).
- At least one match, but .env.qa.local does not exist at the project root: print the refusal followed by the extracted pairs so the user can copy them without re-scrolling through hook output:
```
Pre-run produced QA_* vars but .env.qa.local is missing. Refusing to create it
(would silently materialize a secrets file). Add the following lines manually,
then re-run /qa-run:

  QA_SESSION_ID=<uuid>
  QA_USER_ID=<uuid>
  ...
```
  Do not abort the run — fixtures are already reset; aborting would force another full cycle. Continue with scenarios; auth-protected ones may fail until the user creates the file.
- At least one match and .env.qa.local exists: for each parsed KEY=VALUE (after duplicate resolution), replace the existing line ^KEY=... in place if present, otherwise append KEY=VALUE at the end. Preserve every other line (comments, unrelated keys, trailing newline) untouched. Use a temp-file rewrite — never an in-place sed regex on user data, since values may contain regex metachars. Print:
```
Updated .env.qa.local: <N> QA_* variable(s) refreshed (<comma-separated keys>).
```

If qa.pre_run is unset, skip silently.

Skip this whole sub-step when --skip-preflight is set (the flag now also skips this hook — same intent: "I know what I'm doing, just run scenarios").

Pre-run hook output convention

Only keys matching ^QA_[A-Z][A-Z0-9_]*$ are extracted. Anything else is left alone.
The QA_ prefix is intentional: it scopes auto-extraction to variables the hook explicitly designates as QA fixtures, avoiding accidental capture of unrelated env vars the command may echo (PATH, CI tokens, ...).
The line must be a full match — info: QA_SESSION_ID=abc is ignored.
.env.qa.local must already exist; the skill never creates it (avoids silently materializing a secrets file).
Existing keys are replaced in place; unrelated lines are preserved verbatim. If the same key is printed twice in stdout, the last occurrence wins.
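
The replace-or-append behaviour can be sketched as follows. This is an illustration under assumptions, not the skill's actual implementation: `merge_env_lines` and `update_env_file` are hypothetical names, and the file is assumed to be UTF-8 with LF line endings.

```python
import os
import tempfile
from pathlib import Path


def merge_env_lines(lines: list[str], updates: dict[str, str]) -> list[str]:
    """Replace existing KEY=... lines in place, then append keys not seen in the file."""
    merged, seen = [], set()
    for line in lines:
        key = line.split("=", 1)[0]
        if "=" in line and key in updates:
            # Plain string comparison on the key: the value never goes through a regex.
            merged.append(f"{key}={updates[key]}")
            seen.add(key)
        else:
            merged.append(line)  # comments and unrelated keys pass through verbatim
    merged.extend(f"{k}={v}" for k, v in updates.items() if k not in seen)
    return merged


def update_env_file(path: Path, updates: dict[str, str]) -> bool:
    """Rewrite path via a temp file; return False without creating anything if it is missing."""
    if not path.is_file():
        return False
    with open(path, encoding="utf-8", newline="") as handle:
        text = handle.read()
    if text:
        trailing_newline = text.endswith("\n")
        lines = (text[:-1] if trailing_newline else text).split("\n")
    else:
        trailing_newline, lines = True, []
    merged = merge_env_lines(lines, updates)
    new_text = "\n".join(merged) + ("\n" if merged and trailing_newline else "")
    # mkstemp creates the temp file next to the target with owner-only permissions.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as tmp:
        tmp.write(new_text)
    os.replace(tmp_name, path)  # swap the rewritten file into place
    return True
```

Writing to a sibling temp file and swapping it in keeps values with regex metacharacters (for example `a.*b`) intact, since no pattern substitution is ever run over the user's data.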

Step 3 — Parse the test plan

Iterate over docs/qa/*.md (excluding README.md). For each file, extract every fenced ```yaml ... ``` block whose top-level keys include id: and start: — that's a scenario.

Build a flat list of scenarios with their source file and the section heading they appear under.
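
One way to sketch this extraction in Python, using only the standard library. The names `top_level_keys` and `scenarios_in` are hypothetical, and the top-level key check is a simple regex rather than a real YAML parse, which is enough to detect `id:` and `start:` but not to validate the scenario.

```python
import re

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$")


def top_level_keys(block: str) -> set[str]:
    """Collect unindented `key:` names from a YAML block without parsing it."""
    return {m.group(1) for m in re.finditer(r"^([A-Za-z_][\w-]*):", block, re.MULTILINE)}


def scenarios_in(markdown: str, source: str) -> list[dict]:
    """Return one record per yaml fence that has both id: and start: at top level."""
    found = []
    heading = None
    state = "text"  # "text", "yaml" (inside a yaml fence), or "other" (any other fence)
    block: list[str] = []
    for line in markdown.split("\n"):
        stripped = line.strip()
        if state == "text":
            if stripped.startswith(FENCE):
                state = "yaml" if stripped == FENCE + "yaml" else "other"
                block = []
            else:
                match = HEADING.match(line)
                if match:
                    heading = match.group(1)  # most recent heading above the fence
        elif stripped == FENCE:
            if state == "yaml":
                text = "\n".join(block)
                if {"id", "start"} <= top_level_keys(text):
                    found.append({"source": source, "heading": heading, "yaml": text})
            state = "text"
        else:
            block.append(line)
    return found
```

Tracking fence state line by line avoids mistaking `# comment` lines inside code fences for section headings.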

Apply filters:

If --scenario <id>: keep only that ID. Abort if not found.
If --flow <name>: keep only scenarios whose source file is docs/qa/<name>.md. Abort if file missing or has no scenarios.

For each scenario in the filtered list whose steps are all check_manual (fully manual):

Set status to SKIPPED.
Set reason to "scenario fully manual (all steps are check_manual)". This is a fixed string, used by Step 7 to group these scenarios under a single "Skipped scenarios" heading.
Remove it from the active execution list.
Do not open the browser for it: it is never passed to Step 5 (execution loop).

If the resulting list is empty, print No scenarios match the filters. and stop.

Step 4 — Authentication setup

1. For each user / admin profile referenced in the filtered plan, look for a scenario with id: AUTH-PROFILE-USER (or AUTH-PROFILE-ADMIN).
2. If present, run it; on success, capture the resulting storage state (cookies + localStorage) into an in-memory cache keyed by profile, per references/playwright-runner.md §Auth handling step 2.
3. If absent, fall back to a simple form login: navigate to /login, fill ${QA_USER_EMAIL} and ${QA_USER_PWD} from .env.qa, submit, and apply the same capture as step 2. If this fails, log a warning and SKIP auth-protected scenarios with reason auth setup failed.

Step 5 — Execute scenarios

1. Open a browser context. If auth: matches the current profile and a captured storage state exists for it, inject the cached cookies + localStorage (per references/playwright-runner.md §Auth handling step 3b) and run the optional session revalidation if qa.session_revalidation_endpoint is set. On profile change, on revalidation 401, or when no cache exists, clear the context and re-login (steps 3a / 4 of §Auth handling) before recapturing.
2. Start a console + network capture for the duration of the scenario.
3. Preconditions (if a precondition: block is present): evaluate each guard via the browser_evaluate templates documented in references/playwright-runner.md §Precondition mapping. The schema and SKIPPED reason format are canonical in ~/.claude/skills/qa-sync/references/scenario-format.md §Preconditions.
   - On the first failing guard: mark the scenario SKIPPED with reason = <formatted guard string> (e.g., precondition GET /api/game-sessions/abc failed: status 404). Do NOT execute setup, start, steps, expect, or teardown, since setup never ran (the internal drain is also skipped, as there is nothing to drain). Skip to the next scenario.
   - On all guards passing: continue to step 4.
4. Execute setup, then navigate to start, then each step.
5. Drain network and console (internal runtime step) before evaluating any expect: assertion. Call mcp__playwright__browser_network_requests and mcp__playwright__browser_console_messages exactly once and store the raw lists on the scenario record. They are reused by the assertion sub-step below, by Step 5c (root-cause bucketing), and by Step 6 — Issue creation (<console_block> / <network_block> in the ticket body). The filtering rules (same-host, drop static assets, and drop OPTIONS for the network drain; type: "error", or level: "error" on MCP builds that expose level instead, for the console) are documented in references/playwright-runner.md §status_max and §console_clean.
6. Evaluate each expect: assertion. Map actions to Playwright MCP calls via references/playwright-runner.md. The new status_max: and console_clean: assertions consume the drains from sub-step 5 and FAIL with normalized <observed> lines (Réponse <METHOD> <path> → <status> dépasse status_max=<n>, Erreur console: <first_error_message>).
7. Track outcome: PASSED, FAILED (which step/assertion), or SKIPPED (precondition failure, check_manual:, or auth setup failure; see Edge cases).
8. Run teardown regardless of pass/fail (best-effort, swallow errors with a warning). Exception: if the scenario was SKIPPED due to a precondition failure, teardown is also skipped (per step 3 above).

Stop the loop early if --max-failures is reached. Note: SKIPPED scenarios do not count toward --max-failures (only FAILED scenarios do).
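
The counting rule can be illustrated with a short Python sketch. `run_all` and its `run_one` callback are hypothetical stand-ins for the execution loop and are only meant to show that SKIPPED outcomes never consume the failure budget.

```python
from typing import Callable, Iterable, Optional


def run_all(
    scenarios: Iterable[str],
    run_one: Callable[[str], str],
    max_failures: Optional[int] = None,
) -> dict[str, str]:
    """Run scenarios in order; stop once max_failures FAILED outcomes are recorded."""
    outcomes: dict[str, str] = {}
    failed = 0
    for scenario_id in scenarios:
        status = run_one(scenario_id)  # expected to return PASSED, FAILED, or SKIPPED
        outcomes[scenario_id] = status
        if status == "FAILED":
            failed += 1
            if max_failures is not None and failed >= max_failures:
                break  # remaining scenarios are not run
    return outcomes
```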

Step 5c — Bucket FAILED scenarios by root cause

Skip this step entirely if --no-grouping is set, or if --no-tickets is set (no tickets means no buckets to merge), or if there are fewer than 2 FAILED scenarios.

Signature derivation

For each FAILED scenario, walk the signals in priority order and stop at the first hit. The first matching signal becomes the bucket key.

Normalization:

URL path: drop query string and fragment; keep host + path.
Selector: trim outer whitespace; collapse internal whitespace to single space.
Console error: strip the [error] / [warn ] prefix; truncate at first newline; cap at 120 chars.
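
A possible Python rendering of these three normalization rules. The function names are hypothetical, and "host" is interpreted here as the URL's network location (including any port), which is an assumption.

```python
import re
from urllib.parse import urlsplit

# Matches a leading "[error]", "[warn]", or "[warn ]" tag plus following spaces or tabs.
CONSOLE_PREFIX = re.compile(r"^\[(?:error|warn ?)\][ \t]*")


def normalize_url(url: str) -> str:
    """Drop the query string and fragment; keep host + path."""
    parts = urlsplit(url)
    return f"{parts.netloc}{parts.path}"


def normalize_selector(selector: str) -> str:
    """Trim outer whitespace and collapse internal runs to a single space."""
    return " ".join(selector.split())


def normalize_console(message: str, cap: int = 120) -> str:
    """Strip the level prefix, keep the first line only, and cap the length."""
    first_line = CONSOLE_PREFIX.sub("", message, count=1).split("\n", 1)[0]
    return first_line[:cap]
```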

Confidence and merge policy

AskUserQuestion shape for HIGH or MEDIUM in interactive mode:

AskUserQuestion:
  question: "<N> scénarios échouent avec la même cause probable: <signature_summary>. Créer 1 ticket parent au lieu de <N> ?"
  header: "Group failures"
  options:
    - label: "Oui, créer 1 ticket parent (Recommended)"
      description: "Liste les <N> scénarios concernés dans le corps"
    - label: "Non, créer <N> tickets séparés"
      description: "Comportement legacy — utile si les scénarios doivent être suivis individuellement"

Output of this step

Produce a list of buckets, where each bucket carries:

bucket_key (the tuple above)
signature_summary (human-readable, see references/issue-templates.md §Signature summary)
flow_file
members (list of {id, step_number, observed, console_block, network_block})
confidence (HIGH | MEDIUM | n/a)
merge_decision (parent | split): set to parent for HIGH, to the user's answer for MEDIUM, and to split for singletons

Pass this list to Step 6 instead of the flat FAILED list. Singleton buckets (merge_decision = split) flow through Step 6 unchanged; parent buckets use the parent templates.
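
The grouping and the merge_decision rule listed above can be sketched in Python as follows. This is illustrative only: `group_by_key` and `merge_decision` are hypothetical helpers, each failure record is assumed to already carry its computed `bucket_key`, and the confidence level is taken as an input rather than derived.

```python
from collections import defaultdict
from typing import Callable, Hashable


def group_by_key(failures: list[dict]) -> dict[Hashable, list[dict]]:
    """Group FAILED scenario records by their precomputed bucket_key, keeping run order."""
    groups: dict[Hashable, list[dict]] = defaultdict(list)
    for failure in failures:
        groups[failure["bucket_key"]].append(failure)
    return dict(groups)


def merge_decision(size: int, confidence: str, ask_user: Callable[[], str]) -> str:
    """Return "parent" or "split" for a bucket of the given size and confidence."""
    if size < 2:
        return "split"  # singletons always flow through the per-scenario path
    if confidence == "HIGH":
        return "parent"
    return ask_user()  # MEDIUM: defer to the user's answer ("parent" or "split")
```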

Step 6 — Issue creation on failure

Skip this step entirely if --no-tickets is set.

Singleton bucket (size 1, or any bucket with merge_decision = split): use the per-scenario flow below — title [QA] <id> failed at step <N>, dedup key [QA] <id>, body from references/issue-templates.md §Body — GitHub or §Body — YouTrack.
Parent bucket (size >= 2 with merge_decision = parent): use the parent flow below — title [QA] <flow_basename> — <signature_summary> (<N> scenarios), dedup key [QA] <flow_basename> — <signature_summary>, body from references/issue-templates.md §Parent bucket. <flow_basename> is the source filename without extension (e.g., checkout for docs/qa/checkout.md).

6a. Duplicate detection (default behavior)

Before creating any new ticket, search the tracker for an open issue whose title starts with the bucket's dedup key.

GitHub:

gh issue list --state open --search "<dedup_key> in:title" --json number,title,url --repo "<github.repo>"

Match any result whose title starts with <dedup_key>.

YouTrack: mcp__youtrack__search_issues with query project: <prefix> summary: "<dedup_key>" State: Open (or equivalent unresolved state).
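
For the GitHub case, the search may return issues that merely mention the key somewhere in the title, so a prefix check on the results is still needed. A minimal Python sketch, assuming the JSON array printed by the `gh issue list` command above has been captured as a string; `find_open_duplicate` is a hypothetical name.

```python
import json
from typing import Optional


def find_open_duplicate(gh_json: str, dedup_key: str) -> Optional[dict]:
    """Return the first listed issue whose title starts with dedup_key, or None."""
    for issue in json.loads(gh_json):
        if issue["title"].startswith(dedup_key):
            return issue
    return None
```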

If a matching open issue exists:

Do not create a new ticket.
Post a comment using the appropriate template in references/issue-templates.md:
- Singleton bucket → §Duplicate detection — comment template
- Parent bucket → §Parent bucket — comment template (includes the full member list of the current run)
Comment via:
- GitHub: gh issue comment <number> --body "<comment>"
- YouTrack: the appropriate mcp__youtrack__*_comment tool (or update_issue adding to the comments collection if no dedicated tool is available).
Track the comment in the run report under the existing ticket's URL — do not list it as a "new ticket".

If no match is found, proceed to 6b.

6b. Create the ticket

Submission via the project's tracker (read .claude/ticket-config.json):

YouTrack: mcp__youtrack__create_issue with project = qa.youtrack_project_prefix or youtrack.project_prefix, summary = title, description = body, type = "Bug". Add a label/tag from qa.label (default qa).
GitHub: gh issue create --title "<title>" --body "<body>" --label "<qa.label>,bug" --repo "<github.repo>".

Capture the returned issue URL/number for the final report. Track per-bucket whether the result is a singleton ticket, a parent ticket, or a duplicate-comment hit (Step 7 splits these).

Step 7 — Final report

Print a summary:

## QA run summary

| Flow | Scenarios | Passed | Failed | Skipped |
|------|-----------|--------|--------|---------|
| auth | 3 | 3 | 0 | 0 |
| checkout | 5 | 1 | 4 | 0 |

**Total**: 8 / 8 ran (4 passed, 4 failed across 2 root causes).

### Failure buckets

- **checkout — `POST /api/coupons → 500`** (3 scenarios)
  - CHECKOUT-DISCOUNT-01 (step 3)
  - CHECKOUT-DISCOUNT-02 (step 3)
  - CHECKOUT-PAYMENT-01 (step 5)
- **checkout — selector `button[data-test=apply-coupon]` introuvable** (1 scenario)
  - CHECKOUT-COUPON-01 (step 2)

### Skipped scenarios

- **precondition GET /api/game-sessions/abc failed: status 404** (2 scenarios)
  - CHECKOUT-DISCOUNT-01
  - CHECKOUT-DISCOUNT-02
- **auth setup failed** (1 scenario)
  - LOGIN-SUCCESS-01

### Tickets

**Parent tickets opened** (root cause shared by N scenarios):
- [PROJ-142](https://...) — checkout `POST /api/coupons → 500` (3 scenarios: CHECKOUT-DISCOUNT-01, CHECKOUT-DISCOUNT-02, CHECKOUT-PAYMENT-01)

**Tickets opened**:
- [PROJ-143](https://...) — CHECKOUT-COUPON-01 failed at step 2

**Comments added** (existing tickets still failing):
- [PROJ-118](https://...) — checkout `GET /api/billing/profile → 401` still failing (2 scenarios)

Run `/qa-run --scenario <id>` to re-test a specific scenario after a fix.

Failure buckets subsection

Take the buckets produced by Step 5c. Drop singleton buckets (size 1).
If no buckets remain (Step 5c was skipped, every bucket was a singleton, or merge_decision was split for all MEDIUM buckets), omit the entire ### Failure buckets subsection — do not print an empty heading.
For each remaining bucket, print:
- Header line: **<flow_basename> — <signature_summary>** (<N> scenarios)
- Sub-list: each member as <id> (step <step_number>), sorted alphabetically by <id>
Sort buckets by descending member count, then alphabetically by <signature_summary> for ties.

This subsection makes root causes visible even when --no-tickets is set (no parent ticket gets opened, but the user still sees how failures cluster).
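
The ordering rules for this subsection can be sketched in Python as below. `order_buckets` is a hypothetical helper that only sorts; it assumes the buckets passed in have already been filtered as described above.

```python
def order_buckets(buckets: list[dict]) -> list[dict]:
    """Sort buckets by descending member count, then by signature_summary; members by id."""
    ordered = sorted(buckets, key=lambda b: (-len(b["members"]), b["signature_summary"]))
    # Build new dicts so the caller's bucket records are left untouched.
    return [{**b, "members": sorted(b["members"], key=lambda m: m["id"])} for b in ordered]
```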

Skipped scenarios grouping

The "Skipped scenarios" subsection groups every SKIPPED scenario by its reason field. Build it as follows:

Collect all scenarios whose final status is SKIPPED. Each carries a reason string (set in Step 5 — precondition failure reasons match the format documented in ~/.claude/skills/qa-sync/references/scenario-format.md §Preconditions; auth setup failures use auth setup failed; check_manual: uses the manual description).
Group by exact reason string equality.
Sort groups by descending count, then alphabetically by reason for ties.
Within each group, list scenario IDs alphabetically.
If no scenario is SKIPPED, omit the entire ### Skipped scenarios subsection (do not print an empty heading).
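
The steps above can be sketched in Python as follows. `render_skipped` is a hypothetical helper, and the record shape (`id`, `status`, `reason`) is an assumption made for the example.

```python
from collections import defaultdict


def render_skipped(scenarios: list[dict]) -> list[str]:
    """Build the lines of the Skipped scenarios subsection, or [] if nothing was skipped."""
    groups: dict[str, list[str]] = defaultdict(list)
    for scenario in scenarios:
        if scenario["status"] == "SKIPPED":
            groups[scenario["reason"]].append(scenario["id"])  # exact string equality
    if not groups:
        return []  # omit the heading entirely
    lines = ["### Skipped scenarios", ""]
    # Descending count first, then alphabetical reason for ties.
    for reason, ids in sorted(groups.items(), key=lambda item: (-len(item[1]), item[0])):
        noun = "scenario" if len(ids) == 1 else "scenarios"
        lines.append(f"- **{reason}** ({len(ids)} {noun})")
        lines.extend(f"  - {scenario_id}" for scenario_id in sorted(ids))
    return lines
```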

Skipped scenarios do NOT open or comment on tickets — see references/issue-templates.md §When NOT to open a ticket. This subsection is the only visibility surface for skipped causes.

If there are no failures, end the report with the message: Tout est vert. Rien à signaler.

Patterns reused (no duplication)

Ticket source detection: same logic as ~/.claude/skills/fetch-ticket/SKILL.md. Read .claude/ticket-config.json, fall back to git remote.
Issue creation: same primitives as ~/.claude/skills/create-ticket/SKILL.md (gh CLI for GitHub, mcp__youtrack__create_issue for YouTrack).
AskUserQuestion patterns: same shape as ~/.claude/skills/init-project/SKILL.md.
Bucketing pattern: Step 5c reuses the group-by-key + sort-by-descending-count idea from the existing "Skipped scenarios grouping" (Step 7); the only difference is the key derivation (root cause signature vs. exact reason string).

Edge cases

Auth setup fails: continue with anonymous scenarios; mark all auth: user|admin scenarios as SKIPPED with reason auth setup failed.
Single scenario explodes (browser crash, MCP timeout): log the trace, mark the scenario FAILED, continue.
--max-failures reached: stop, print partial summary, mention --skip-preflight/--scenario for re-runs.

What this skill does NOT do

Does not modify scenarios. Use /qa-sync for that.
Does not commit anything.
Does not deploy or restart the application beyond docker compose up -d during preflight.
Does not perform visual regression. Use the visual-verify agent for design-vs-render checks.
Does not group failures across different docs/qa/<flow>.md files. A 401 in auth.md and a 401 in cards.md produce separate buckets — a fix in one flow does not auto-fix the other.
