cedricziel/e2e-testing

Name: e2e-testing
Author: cedricziel

.claude/skills/e2e-testing/SKILL.md

npx skillsauth add cedricziel/assistant e2e-testing

E2E Testing (Playwright Visual Regression)

The assistant frontend is a Flutter web app embedded in the Rust binary. Playwright captures screenshots of the compiled Flutter SPA at three viewport sizes (desktop, tablet, mobile) and compares them against committed baselines.

The binary serves the Flutter web build at /. The webServer config in playwright.config.ts builds the binary (which embeds flutter build web) and starts it with --auth-token test-token --listen 127.0.0.1:8787.

Directory Layout

crates/web-ui/e2e/
  playwright.config.ts     # Config: viewports, server, reporters
  package.json             # Dependencies (playwright, @playwright/test)
  tests/
    visual-regression.spec.ts   # All visual tests
  screenshots/                  # Committed baselines (PNG)
    tests/visual-regression.spec.ts/
      login-desktop-chrome.png
      login-tablet-chrome.png
      login-mobile-chrome.png
      ...
  test-results/            # Generated on failure (diff, actual, expected PNGs)
  playwright-report/       # HTML report (gitignored)

Running Tests

# From the e2e directory
cd crates/web-ui/e2e

# Run all visual tests (starts server automatically via webServer config)
npx playwright test

# Update baselines after intentional changes
npx playwright test --update-snapshots
# or use the npm script:
npm run test:update

# Run a single test
npx playwright test -g "traces page"

# Run only desktop viewport
npx playwright test --project=desktop-chrome

# Show HTML report after failure
npx playwright show-report

The webServer config in playwright.config.ts builds and starts the binary automatically, listening on 127.0.0.1:8787 with --auth-token test-token. To skip the auto-start and test against a server that is already running, set E2E_BASE_URL to point at that server.
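The switch between auto-start and an external server could look roughly like the sketch below. It is not the project's actual config: resolveServerSetup and START_COMMAND are hypothetical names, and the launch command is a guess that only reuses the two flags mentioned above. The command, url, timeout, and reuseExistingServer keys are standard Playwright webServer options.

```typescript
// Sketch only: the real playwright.config.ts may be structured differently.
type WebServer = {
  command: string;
  url: string;
  timeout: number;
  reuseExistingServer: boolean;
};

type ServerSetup = { baseURL: string; webServer?: WebServer };

const DEFAULT_URL = "http://127.0.0.1:8787";

// Hypothetical launch command; only the two flags come from the text above.
const START_COMMAND =
  "cargo run -p assistant-cli -- --auth-token test-token --listen 127.0.0.1:8787";

function resolveServerSetup(
  env: Record<string, string | undefined>,
): ServerSetup {
  const external = env.E2E_BASE_URL;
  if (external) {
    // A server is already running: point tests at it and start nothing.
    return { baseURL: external };
  }
  return {
    baseURL: DEFAULT_URL,
    webServer: {
      command: START_COMMAND,
      url: DEFAULT_URL, // Playwright polls this URL until the server responds
      timeout: 120000, // 120 seconds, matching webServer.timeout
      reuseExistingServer: false,
    },
  };
}
```

In a config file the result would be passed on as use.baseURL and webServer, with process.env as the argument.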

Writing Tests

Test Structure

import { test, expect, type Page } from "@playwright/test"; // Page is used only as a type

const AUTH_TOKEN = "test-token";
const MAX_DIFF_RATIO = 0.03; // 3% tolerance for cross-platform fonts
const CSS_SETTLE_MS = 300; // Wait for CSS transitions

// Authenticate via the login form
async function login(page: Page) {
  await page.goto("/login");
  await page.fill('input[name="token"]', AUTH_TOKEN);
  await page.click('button[type="submit"]');
  await page.waitForURL((url) => !url.pathname.includes("/login"));
}

// Navigate and wait for network idle + CSS settle
async function navigateAndSettle(page: Page, path: string) {
  await page.goto(path, { waitUntil: "networkidle" });
  await page.waitForTimeout(CSS_SETTLE_MS);
}

Adding a New Page Test

test("my new page (empty state)", async ({ page }) => {
  await navigateAndSettle(page, "/my-page");
  await expect(page).toHaveScreenshot("my-page-empty.png", {
    fullPage: true,
    maxDiffPixelRatio: MAX_DIFF_RATIO,
  });
});

Then generate baselines:

npx playwright test --update-snapshots -g "my new page"

This creates three files in screenshots/ (one per project/viewport).
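As a rough illustration of which files to expect, the sketch below derives the three baseline paths from a screenshot name. The naming pattern is inferred from the directory layout (for example login-desktop-chrome.png), not read from the real config, and baselinePaths is a hypothetical helper.

```typescript
// Project names as listed under Viewport Projects.
const PROJECTS = ["desktop-chrome", "tablet-chrome", "mobile-chrome"];

// Assumed pattern: screenshots/<spec path>/<name>-<project><ext>,
// inferred from the directory listing rather than from the real config.
function baselinePaths(specPath: string, screenshotName: string): string[] {
  const dot = screenshotName.lastIndexOf(".");
  const stem = dot === -1 ? screenshotName : screenshotName.slice(0, dot);
  const ext = dot === -1 ? ".png" : screenshotName.slice(dot);
  return PROJECTS.map(
    (project) => `screenshots/${specPath}/${stem}-${project}${ext}`,
  );
}

const paths = baselinePaths(
  "tests/visual-regression.spec.ts",
  "my-page-empty.png",
);
```

Under that assumption, paths holds one entry per project, starting with screenshots/tests/visual-regression.spec.ts/my-page-empty-desktop-chrome.png. All three files should be committed together.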

Unauthenticated vs Authenticated Pages

Login page: Test without calling login() first
All other pages: Use test.beforeEach with login(page):

test.describe("Authenticated pages", () => {
  test.beforeEach(async ({ page }) => {
    await login(page);
  });

  test("page name", async ({ page }) => { /* ... */ });
});

Cross-Platform Tolerance

Font rendering differs between macOS (local dev) and Linux (CI). The maxDiffPixelRatio: 0.03 setting allows up to 3% pixel differences, which absorbs font hinting/anti-aliasing variance while still catching layout regressions (moved elements, missing sections, broken styles).

When to adjust:

Font-only diffs in CI should already fall within the 3% tolerance, so a font-only failure is rarely a reason to raise the global value
If a real layout regression is masked, lower the tolerance for that specific test
Never set tolerance above 5%: at that point the comparison no longer catches meaningful regressions
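To make the 3% figure concrete, the sketch below converts a ratio into a pixel budget. It assumes the budget is simply the ratio times the total pixel count, which only approximates how maxDiffPixelRatio behaves. diffBudget and withinTolerance are illustrative helpers, not part of the test suite.

```typescript
const MAX_DIFF_RATIO = 0.03;

// Approximate number of pixels that may differ before a comparison fails.
function diffBudget(
  width: number,
  height: number,
  ratio: number = MAX_DIFF_RATIO,
): number {
  return Math.round(width * height * ratio);
}

// True when the number of changed pixels stays within the budget.
function withinTolerance(
  diffPixels: number,
  width: number,
  height: number,
  ratio: number = MAX_DIFF_RATIO,
): boolean {
  return diffPixels <= width * height * ratio;
}

const desktopBudget = diffBudget(1280, 900);
const smallChangePasses = withinTolerance(100 * 100, 1280, 900);
```

A 1280x900 screenshot has 1,152,000 pixels, so the budget is about 34,560 pixels. A change confined to a 100x100 region (10,000 pixels) would therefore still pass, which is the situation where lowering the tolerance for a specific test helps. Full-page screenshots taller than the viewport get a proportionally larger budget.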

Viewport Projects

Three projects in playwright.config.ts match the app's responsive breakpoints:

| Project        | Viewport | App Layout                    |
| -------------- | -------- | ----------------------------- |
| desktop-chrome | 1280x900 | Icon rail + top bar           |
| tablet-chrome  | 768x1024 | Hamburger + drawer            |
| mobile-chrome  | Pixel 7  | Bottom tabs + stacked content |
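The table could map onto Playwright project entries along these lines. This is a sketch: ProjectSketch and toProject are hypothetical, and the device lookup is passed in as a parameter so the block stays self-contained. In a real config it would presumably be the devices export from @playwright/test.

```typescript
type ProjectSketch = {
  name: string;
  viewport?: { width: number; height: number };
  device?: string; // name of a Playwright device descriptor
};

const projects: ProjectSketch[] = [
  { name: "desktop-chrome", viewport: { width: 1280, height: 900 } },
  { name: "tablet-chrome", viewport: { width: 768, height: 1024 } },
  { name: "mobile-chrome", device: "Pixel 7" },
];

// Builds the { name, use } shape that a Playwright project entry takes.
function toProject(
  p: ProjectSketch,
  devices: Record<string, Record<string, unknown>>,
) {
  const use = p.device
    ? { ...devices[p.device] } // device descriptor supplies viewport, user agent, etc.
    : { viewport: p.viewport };
  return { name: p.name, use };
}
```

Keeping the project names stable matters because they appear in the baseline file names.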

CI Workflow

The visual-regression job in .github/workflows/ci.yml:

Installs Flutter SDK (subosito/flutter-action@v2, flutter 3.x stable) and runs flutter pub get in app/
Builds the unified assistant binary (cargo build -p assistant-cli), which embeds the Flutter web app via build.rs
Installs Playwright + Chromium
Runs npx playwright test
Uploads the HTML report as an artifact (always)
On failure + PR: uploads diff images as an artifact
On failure + PR: pushes diff PNGs to an orphan visual-diffs/pr-N branch and posts an inline comment with embedded image comparisons
On PR close: a cleanup job deletes the visual-diffs/pr-N branch

Reading Diff Comments

When visual tests fail on a PR, the bot posts a comment with:

Expected: The committed baseline
Actual: What the test rendered
Diff: Pink/red highlights showing changed pixels

Review the diff images to decide whether to:

Fix a regression: The change was unintentional — fix the code
Update baselines: The change was intentional — run npm run test:update and commit
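When reviewing many failures locally, it can help to pair up the three images per screenshot. The sketch below groups file names by the -expected, -actual, and -diff suffixes. It assumes the generated PNGs in test-results/ follow that suffix convention, and groupDiffArtifacts is a hypothetical helper.

```typescript
type DiffTriple = { expected?: string; actual?: string; diff?: string };

// Groups names such as "login-desktop-chrome-diff.png" by screenshot name.
function groupDiffArtifacts(files: string[]): Record<string, DiffTriple> {
  const groups: Record<string, DiffTriple> = {};
  for (const file of files) {
    const match = /^(.*)-(expected|actual|diff)\.png$/.exec(file);
    if (!match) continue; // not a comparison artifact
    const name = match[1];
    const kind = match[2] as keyof DiffTriple;
    const entry = groups[name] || {};
    entry[kind] = file;
    groups[name] = entry;
  }
  return groups;
}

const grouped = groupDiffArtifacts([
  "login-desktop-chrome-expected.png",
  "login-desktop-chrome-actual.png",
  "login-desktop-chrome-diff.png",
  "trace.zip",
]);
```

Here grouped has a single key, login-desktop-chrome, with all three images attached, and the unrelated trace.zip is ignored.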

Regenerating Baselines

After intentional visual changes:

cd crates/web-ui/e2e
npx playwright test --update-snapshots

Commit the updated PNGs in the same commit as the code change.

Troubleshooting

Tests pass locally but fail in CI

The usual cause is font rendering differences between macOS and Linux, which the 3% tolerance should absorb. If it does not:

Check if the diff is font-only (fuzzy text edges) vs structural (moved elements)
For font-only diffs: consider bumping tolerance for that specific test
For structural diffs: there's a real bug — investigate
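A per-test override can be kept honest with a small guard. The sketch below is one possible approach, not code from the repository: screenshotOptions and TOLERANCE_CAP are hypothetical names, and the 5% cap is taken from the Cross-Platform Tolerance guidance.

```typescript
const MAX_DIFF_RATIO = 0.03; // project-wide default
const TOLERANCE_CAP = 0.05; // upper bound from the tolerance guidance

// Returns toHaveScreenshot options, rejecting ratios outside 0..TOLERANCE_CAP.
function screenshotOptions(ratio: number = MAX_DIFF_RATIO) {
  if (ratio < 0 || ratio > TOLERANCE_CAP) {
    throw new RangeError(
      `maxDiffPixelRatio ${ratio} is outside 0..${TOLERANCE_CAP}`,
    );
  }
  return { fullPage: true, maxDiffPixelRatio: ratio };
}
```

A single noisy test could then call expect(page).toHaveScreenshot("traces.png", screenshotOptions(0.04)) while every other test keeps the default.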

Server doesn't start in time

The webServer.timeout is 120 seconds. The first build is slow because cargo build triggers flutter build web --release (via build.rs). If it still times out:

Pre-build manually: cargo build -p assistant-cli from the repo root
Ensure Flutter SDK is on $PATH (run flutter doctor)
Check if port 8787 is already in use
Set E2E_BASE_URL to point to a manually started server

Screenshot dimensions changed

Viewport size is fixed per project — if dimensions change, it's likely the page content height changed. fullPage: true captures the full scrollable height, so adding content to a page will change the baseline.

Playwright visual regression testing for the assistant Flutter SPA. Covers test structure, screenshot baselines, cross-platform diff tolerance, CI workflow with inline diff comments, and baseline management. Use when adding screens, changing layouts, or debugging visual test failures.

4 stars

development

Updated Apr 12, 2026

npx skillsauth add cedricziel/assistant e2e-testing

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status.

| Scanner         | Purpose                                        | Status  | Percentage |
| --------------- | ---------------------------------------------- | ------- | ---------- |
| Trivy           | Container and dependency vulnerability scanner | Clean   | 95%        |
| Semgrep         | Static code analysis for vulnerabilities       | Clean   | 95%        |
| mcp-scan (Snyk) | Model Context Protocol security validation     | Clean   | 95%        |
| Snyk (dep)      | Open source security scanning                  | Skipped | 50%        |
| Socket.dev      | Supply chain security analysis                 | Skipped | 50%        |
| VirusTotal      | Multi-engine malware detection                 | Skipped | 50%        |
| CrowdStrike     | Advanced threat intelligence                   | Skipped | 50%        |
| OSV-Scanner     | Open Source Vulnerability database check       | Skipped | 50%        |
| OWASP Dep-Check |                                                | Skipped | 50%        |

Last scanned: Apr 12, 2026, 3:32 AM (4.6s, 1 file scanned)

SKILL.md

name: e2e-testing
description: >
license: MIT

Install manually

# Clone the repo
git clone https://github.com/cedricziel/assistant.git

# Copy into Claude Code skills folder (global)
cp -r assistant/.claude/skills/e2e-testing ~/.claude/skills/

Repository

cedricziel/assistant

4 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT