jaykim88/test-strategy

Name: test-strategy
Author: jaykim88

plugins/frontend-toolkit/skills/test-strategy/SKILL.md

npx skillsauth add jaykim88/claude-ai-engineering test-strategy

Test Strategy

Purpose

Follow the Testing Trophy: write mostly integration tests, supported by unit tests for pure logic and a minimal set of E2E tests for critical journeys. Establish CI thresholds so coverage doesn't silently degrade.

Universal — the Testing Trophy ROI model and 3-handler-per-endpoint convention apply to any framework; only the test runner and component-rendering library differ.

Procedure

Run coverage analysis (validation loop)
- Run the test runner with coverage flag
- If branch coverage < target, identify 0%-coverage business logic and add tests; re-run until target met
- Prioritize untested business logic over framework code
- Coverage is a floor, not a goal — 90% coverage with implementation-coupled assertions gives false confidence. Chase untested behavior, not the number
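One way to drive the validation loop above is to read the runner's JSON summary and list the files whose branches are entirely untested. This is a minimal sketch, assuming an Istanbul-style `json-summary` report (a `total` entry plus one entry per file). The helper name `zeroBranchCoverage` and the file paths are hypothetical.

```typescript
// Assumed shape of one entry in an Istanbul-style coverage-summary.json.
type Metric = { total: number; covered: number; skipped: number; pct: number };
type FileCoverage = { lines: Metric; statements: Metric; functions: Metric; branches: Metric };
type CoverageSummary = Record<string, FileCoverage>;

// Returns files that contain branches but cover none of them, largest first,
// so the biggest block of untested decision logic is reviewed first.
function zeroBranchCoverage(summary: CoverageSummary): string[] {
  return Object.keys(summary)
    .filter((file) => file !== "total")
    .filter((file) => summary[file].branches.total > 0 && summary[file].branches.covered === 0)
    .sort((a, b) => summary[b].branches.total - summary[a].branches.total);
}

// Hypothetical report data, for illustration only.
const metric = (total: number, covered: number): Metric => ({
  total,
  covered,
  skipped: 0,
  pct: total === 0 ? 100 : (covered / total) * 100,
});
const entry = (branchTotal: number, branchCovered: number): FileCoverage => ({
  lines: metric(10, 10),
  statements: metric(10, 10),
  functions: metric(2, 2),
  branches: metric(branchTotal, branchCovered),
});

const summary: CoverageSummary = {
  total: entry(20, 4),
  "src/auth/session.ts": entry(4, 0),
  "src/checkout/price.ts": entry(12, 0),
  "src/ui/Button.tsx": entry(4, 4),
};

const untested = zeroBranchCoverage(summary);
```

Every file in this sample has full line coverage, yet two of them have no covered branches, which is the gap a branch threshold is meant to expose.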
Unit tests — pure functions & custom hooks
- AAA pattern (Arrange / Act / Assert), one assertion focus per test
- Mock only at system boundaries; never mock business logic (runner: Vitest; see Implementation)
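The AAA layout can be sketched without any runner. The example below is library-free so it stands alone: `applyDiscount` is a hypothetical pure function, and `expectEqual` is a stand-in for the runner's own assertion (under Vitest the body of each function would sit inside a `test(...)` callback and use `expect`).

```typescript
// Hypothetical pure function under test.
function applyDiscount(priceCents: number, percent: number): number {
  if (percent < 0 || percent > 100) {
    throw new RangeError("percent must be between 0 and 100");
  }
  return Math.round(priceCents * (1 - percent / 100));
}

// Minimal stand-in for a runner assertion; throws when the values differ.
function expectEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) {
    throw new Error(`expected ${String(expected)}, received ${String(actual)}`);
  }
}

function testAppliesPercentageDiscount(): void {
  // Arrange
  const priceCents = 1999;
  const percent = 25;
  // Act
  const result = applyDiscount(priceCents, percent);
  // Assert: one focus, the discounted price
  expectEqual(result, 1499);
}

function testRejectsOutOfRangePercent(): void {
  // Arrange
  const invalidPercent = 120;
  // Act
  let thrown: unknown;
  try {
    applyDiscount(1000, invalidPercent);
  } catch (error) {
    thrown = error;
  }
  // Assert: one focus, the rejection
  expectEqual(thrown instanceof RangeError, true);
}

testAppliesPercentageDiscount();
testRejectsOutOfRangePercent();
```

Nothing is mocked here because the function has no system boundary; the inputs and the return value are the whole contract.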
Integration tests — the bulk
- Render the component with its real children; mock at the network boundary (not component-level mocks)
- Project convention: 3 handlers per endpoint (happy / error / empty). This is a project rule, not from Kent C. Dodds's article; it operationalizes the integration-heavy stance.
- Assert behavior, not implementation (applies to all layers): query user-visible output by role/label; never assert internal state, props, or that a function was called. A test that breaks on a refactor — when behavior didn't change — is testing the wrong thing
- Query priority getByRole > getByLabelText > text > getByTestId doubles as an a11y signal: if you can't query by role, the component is probably inaccessible
- Watch for mock drift: a hand-written MSW response can diverge from the real API shape (mock returns X, prod returns Y) → type the handlers against the real schema, or the test passes while prod breaks
- User-centric assertions (query by role, simulate real user events) (RTL + MSW — see Implementation)
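The three-scenario convention and the mock-drift warning above can be combined by typing the mock data against the response contract. The sketch below is library-free and only models the data. `Order`, `ApiError`, and `listOrders` are hypothetical names, and the contract is assumed to be the same type the client code uses (ideally generated from the real API schema). In an MSW setup, each scenario would be returned from the matching request handler.

```typescript
// Response contract shared with the client code (hypothetical schema).
type Order = { id: string; totalCents: number };
type ApiError = { message: string };

// One mocked response per scenario; `body` is checked against the contract at compile time.
type Scenario<Body> = { status: number; body: Body };
type ScenarioName = "happy" | "error" | "empty";
type EndpointScenarios<Success> = {
  happy: Scenario<Success>;
  error: Scenario<ApiError>;
  empty: Scenario<Success>;
};

// GET /api/orders (hypothetical endpoint). If a field in `Order` is renamed or retyped,
// the stale mock data below stops compiling instead of silently drifting from prod.
const listOrders: EndpointScenarios<Order[]> = {
  happy: { status: 200, body: [{ id: "o-1", totalCents: 4200 }] },
  error: { status: 500, body: { message: "Internal Server Error" } },
  empty: { status: 200, body: [] },
};

// Picks the response a handler should return for the scenario under test.
function respond<S>(scenarios: EndpointScenarios<S>, name: ScenarioName): Scenario<S | ApiError> {
  return scenarios[name];
}

const emptyResponse = respond(listOrders, "empty");
const errorResponse = respond(listOrders, "error");
```

Because `EndpointScenarios` makes all three keys required, leaving out the error or empty case is a compile error, not a gap found in review.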
E2E — critical user journeys only
- Login → core action → result; limit to journeys where regression would block users
- Reserve E2E for what integration can't reach: real navigation/redirects, auth flows, third-party iframes, multi-tab
- Don't replicate integration tests at the E2E layer (Playwright — see Implementation)
- Visual regressions (CSS/layout breaks) are a separate category that role-query tests can't catch; use Playwright screenshots or Chromatic on critical pages
Component-level a11y checks
- Every shared component passes an a11y check in the component workshop
- Block PR on a11y violations (Storybook a11y addon — see Implementation)
Keep tests deterministic (flake kills suites)
- No real time: fake timers (vi.useFakeTimers) for debounce/throttle/intervals; never an arbitrary setTimeout/sleep to "wait for" something
- Await async properly — findBy* / waitFor (not getBy* before data resolves); userEvent is async, await it
- Control randomness, dates, and network: stub Math.random, freeze Date.now, mock every request (a test that hits the real network is flaky by definition)
- Tests pass in isolation and in any order — no shared mutable state between tests
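To illustrate the determinism rules above, the sketch below pins the clock and the random source around a single call and restores them afterwards. It is a hand-rolled stand-in for what a runner's fake timers and stubs provide, not a replacement for them. `makeSessionId` and `withFrozenClockAndRandom` are hypothetical names.

```typescript
// Hypothetical function under test: depends on the clock and on randomness.
function makeSessionId(): string {
  const timestamp = Date.now().toString(36);
  const suffix = Math.floor(Math.random() * 1000);
  return `${timestamp}-${suffix}`;
}

// Pins Date.now and Math.random for the duration of `run`, then restores
// the originals even if `run` throws, so no state leaks into the next test.
function withFrozenClockAndRandom<T>(nowMs: number, randomValue: number, run: () => T): T {
  const realNow = Date.now;
  const realRandom = Math.random;
  Date.now = () => nowMs;
  Math.random = () => randomValue;
  try {
    return run();
  } finally {
    Date.now = realNow;
    Math.random = realRandom;
  }
}

const firstId = withFrozenClockAndRandom(0, 0.5, makeSessionId);
const secondId = withFrozenClockAndRandom(0, 0.5, makeSessionId);
// Both calls produce "0-500": same frozen inputs, same output, in any order.
```

The `finally` block is what keeps tests independent of one another: a test that patches a global and fails midway would otherwise leave the patch in place for whatever runs next.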
CI configuration
- Set branch coverage threshold (not line coverage — line coverage rewards trivial getter tests)
- Run unit + integration on every PR; E2E on main only (or with [e2e] label)
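The PR-versus-main rule above reduces to a small predicate. This is a sketch only: the `CiContext` fields are hypothetical, and it assumes the `[e2e]` opt-in arrives as a pull-request label named `e2e`. A real pipeline would express the same condition in its own workflow syntax.

```typescript
// Hypothetical description of the CI event; field names are illustrative.
type CiContext = { event: "pull_request" | "push"; branch: string; labels: string[] };

// Unit and integration suites run on every PR. E2E runs on pushes to main,
// or on a PR that opts in through the assumed `e2e` label.
function shouldRunE2E(ctx: CiContext): boolean {
  if (ctx.event === "push") {
    return ctx.branch === "main";
  }
  return ctx.labels.some((label) => label === "e2e");
}

const onMain = shouldRunE2E({ event: "push", branch: "main", labels: [] });
const plainPr = shouldRunE2E({ event: "pull_request", branch: "feature/cart", labels: [] });
const optedInPr = shouldRunE2E({ event: "pull_request", branch: "feature/cart", labels: ["e2e"] });
```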
Explicit non-targets
- Document what you intentionally don't test (framework internals, third-party library behavior)
- Prevents test-coverage gaming

Severity tiers

| Tier | Examples | Action SLA |
|---|---|---|
| Critical | 0% coverage on auth / payment / data-integrity paths; no E2E for the primary user journey; tests fail intermittently | Block release; fix immediately |
| Major | Business-logic coverage < 50%; missing error/empty MSW handlers; tests assert implementation details (break on refactor); no Storybook a11y on shared components | Fix this sprint |
| Minor | Utility coverage < 80%; missing CI flake detection (--repeat-each); no visual-regression on critical pages; unused test helpers | Schedule within 2 sprints |

Default coverage target if no project-specific target is set: branch coverage ≥ 70% overall; ≥ 90% for security-critical paths.
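These defaults could be encoded in the runner configuration so that `vitest run --coverage` fails when coverage drops below them. This is a configuration sketch, assuming a Vitest version that supports `coverage.thresholds` with per-glob entries and that the v8 coverage provider is installed. The `src/auth/**` and `src/payments/**` globs are hypothetical stand-ins for wherever the security-critical code lives.

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8", // assumes @vitest/coverage-v8 is installed
      reporter: ["text", "json-summary"],
      thresholds: {
        // Overall default: branch coverage, not line coverage.
        branches: 70,
        // Hypothetical globs for security-critical paths; adjust to the real layout.
        "src/auth/**": { branches: 90 },
        "src/payments/**": { branches: 90 },
      },
    },
  },
});
```

Only `branches` is set, in line with the guidance that line coverage rewards trivial tests.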

Completion Criteria

[ ] Branch coverage ≥ target (default ≥ 70% if unset; ≥ 90% on critical paths)
[ ] Tests assert user-visible behavior, not implementation details (survive a refactor)
[ ] Test suite is deterministic — no real timers/network/random; passes under --repeat-each
[ ] Critical user journeys covered by E2E
[ ] MSW handlers cover happy / error / empty per endpoint
[ ] CI passes; no skipped tests left in main
[ ] Storybook a11y addon = 0 violations on shared components
[ ] All Critical findings fixed; all Major findings scheduled

Output

Test files: organized as *.test.ts (unit) / *.test.tsx (integration with RTL) / e2e/*.spec.ts (Playwright)
MSW handlers: src/mocks/handlers.ts exporting handlers for each endpoint in three variants (happy / error / empty)
Coverage report: generated by test runner; CI fails if below threshold (default 70% branch)
CI config: .github/workflows/ci.yml includes vitest run --coverage with threshold + playwright test for critical journeys
Commit format: test(<scope>): <description> for test additions; fix(<scope>): <description> + test for regression-driven additions

Implementation

React + Next.js (default)

Runner: Vitest (vitest run --coverage)
Component rendering: React Testing Library (RTL) — screen.getByRole, userEvent
Network mocking: MSW (Mock Service Worker)
E2E: Playwright (npx playwright test --repeat-each=10 for flake detection)
Determinism: vi.useFakeTimers() for timers; await userEvent.*; findBy*/waitFor for async; stub Math.random / vi.setSystemTime
Visual regression: Playwright expect(page).toHaveScreenshot() or Chromatic (Storybook)
Storybook a11y addon for component-level a11y checks

Other stacks

Vue / Nuxt: Vitest + Vue Testing Library (@testing-library/vue); MSW works identically; Playwright for E2E
SvelteKit: Vitest + @testing-library/svelte; MSW or Vitest's built-in vi.fn() for mocks; Playwright for E2E (built into SvelteKit's default template)
Angular: Jest or Vitest + Angular Testing Library (@testing-library/angular); MSW for HTTP mocks; Playwright or Cypress for E2E
Universal: Testing Trophy / Testing Pyramid logic is framework-agnostic; MSW intercepts at the network layer regardless of client; Playwright drives the browser regardless of framework

Related skills

component-quality — extracted hooks/components need accompanying tests
cicd-pipeline — wire coverage threshold + Playwright into the GitHub Actions matrix
accessibility-audit — Storybook a11y addon runs as part of the test pipeline

Reference

Key insight encoded: "Write tests. Not too many. Mostly integration." Center the stack on RTL + MSW integration tests. Reserve Playwright for critical user journeys — Vitest Browser Mode now closes the gap for component-level needs in 2025+, so E2E weight should stay light. Quality beats quantity: assert user-visible behavior (so tests survive refactors), not implementation; coverage is a floor, not a goal; and a flaky suite (real timers/network/random) gets ignored — keep tests deterministic.
Tool substitution note: Kent's original article cites Jest + Cypress. The substitution (Vitest for Jest, Playwright for Cypress, MSW for fetch mocks) is the modern equivalent at the time of writing, not Kent's literal recommendation. The 3-handler-per-endpoint convention is also a project rule, not from the article.

Apply the Testing Trophy (mostly integration tests with RTL + MSW, sparing E2E with Playwright) and set coverage thresholds. Use before new feature work, after bug fixes, when CI coverage falls below target, or when tests are flaky or break on every refactor. Not for wiring coverage gates + Playwright into the GitHub Actions matrix (use cicd-pipeline) or auditing WCAG a11y compliance (use accessibility-audit).

development

Updated May 30, 2026

npx skillsauth add jaykim88/claude-ai-engineering test-strategy

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage shown |
|---|---|---|---|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: May 30, 2026, 7:51 AM · 23.3s · 1 file scanned

SKILL.md

name: test-strategy
description: Apply the Testing Trophy (mostly integration tests with RTL + MSW, sparing E2E with Playwright) and set coverage thresholds. Use before new feature work, after bug fixes, when CI coverage falls below target, or when tests are flaky or break on every refactor. Not for wiring coverage gates + Playwright into the GitHub Actions matrix (use cicd-pipeline) or auditing WCAG a11y compliance (use accessibility-audit).
license: MIT


Related Skills

jaykim88/third-party-scripts (development; SKILL.md updated May 30, 2026)

Audit and optimize third-party scripts (analytics, tag managers, chat widgets, embeds) with the right loading strategy, performance budget, facades, and CSP/consent controls. Use when adding a script, when TBT/INP regress, when a GDPR/CCPA consent requirement arises, or before shipping. Not for first-party bundle size (use bundle-optimization) or broad Core Web Vitals diagnosis (use rendering-performance).

jaykim88/tech-debt-management (development; SKILL.md updated May 30, 2026)

Inventory and prioritize technical debt (TODO/FIXME/HACK, any usage, deprecated APIs, untested logic) with an impact × effort matrix. Use at quarter start, before a refactoring sprint, when a new teammate joins, or when feature velocity slows. Not for actually paying down debt (use code-refactoring) or recording a migration approach (use decision-records); this skill only inventories and prioritizes.

jaykim88/state-management-decisions (development; SKILL.md updated May 30, 2026)

Decision framework for choosing the right state location: URL, server cache, local component, or shared/global store. Use when state-sync bugs appear, prop drilling gets deep (3+ levels), filters/tabs lose state on reload, or at quarterly review. Not for form state specifically (use form-ux) or when the state is actually server data (use api-caching-optimization).

jaykim88/seo-metadata (development; SKILL.md updated May 30, 2026)

Apply the Next.js Metadata API per route: title template, og:image, generateMetadata for dynamic pages, JSON-LD structured data, robots, sitemap, canonical, hreflang. Use when adding a new route, when search visibility drops, when rich results are needed, or before shipping. Not for choosing a route's render mode (use render-strategy-decision); align generateMetadata with that route's caching choice.

Install manually

# Clone the repo
git clone https://github.com/jaykim88/claude-ai-engineering.git

# Copy into the Claude Code skills folder (global); create the folder first if it does not exist
mkdir -p ~/.claude/skills
cp -r claude-ai-engineering/plugins/frontend-toolkit/skills/test-strategy ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

jaykim88/claude-ai-engineering

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT