alexei-led/improving-tests

Name: improving-tests
Author: alexei-led

dist/claude/plugins/dev-workflow/skills/improving-tests/SKILL.md

npx skillsauth add alexei-led/claude-code-config improving-tests

Test Improvement

Improve tests by making them behavioral, lean, and useful. Tests are a design tool, not a line-count sport.

Use TaskCreate / TaskUpdate to track:

Choose mode
Explore test structure
Run coverage or failing-test loop
Review with language agent
Apply improvements one cluster at a time
Verify and report

Phase 1: Choose Mode

$ARGUMENTS:

review → identify weak, duplicate, brittle, or missing tests
refactor → combine to table-driven/parametrized/test.each, remove waste
coverage → add tests for uncovered business behavior
tdd → red-green-refactor loop for a feature or bug
full → review + refactor + coverage
empty → ask what to do

If empty, ask one question at a time with options: review existing, refactor tests, fill coverage gaps, TDD loop, or full improvement.
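The mode selection above can be sketched as a small lookup. This is only an illustration: `resolve_mode` and `MODES` are hypothetical names, and treating an unrecognized word the same as an empty argument is an assumption, not something the skill specifies.

```python
from __future__ import annotations

from typing import Optional

# Modes listed in Phase 1; "full" expands to review + refactor + coverage.
MODES = {
    "review": ["review"],
    "refactor": ["refactor"],
    "coverage": ["coverage"],
    "tdd": ["tdd"],
    "full": ["review", "refactor", "coverage"],
}


def resolve_mode(arguments: str) -> Optional[list[str]]:
    """Return the steps for the requested mode, or None when the user must be asked."""
    tokens = arguments.strip().lower().split()
    if not tokens:
        return None  # empty argument: ask what to do
    # Assumption: an unknown word is handled like an empty argument.
    return MODES.get(tokens[0])
```

A `None` result is the signal to ask the user one question at a time, as described above.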

Testing Principles

Test behavior through public interfaces, not implementation details.
The module interface is the test surface.
Mock only system boundaries: external APIs, network, time, randomness, filesystem, subprocesses.
Do not mock your own internal collaborators just to make tests easy.
Prefer integration-style tests when they give a clear, stable signal.
One logical assertion per test case; multiple property checks are fine after one setup.
Delete old shallow tests once deeper interface tests cover the behavior.
No pointless tests for getters, constructors, default props, or generated glue.
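As a minimal sketch of "mock only system boundaries", the example below treats the clock as the boundary and injects it, so the test goes through the public function without patching internals. The `greeting` function and its `now` parameter are hypothetical, chosen only to illustrate the principle.

```python
from datetime import datetime, timezone
from typing import Callable


def _utc_now() -> datetime:
    """Real system boundary: the wall clock."""
    return datetime.now(timezone.utc)


def greeting(name: str, now: Callable[[], datetime] = _utc_now) -> str:
    """Public behavior under test: the greeting depends on the hour of day."""
    hour = now().hour
    if hour < 12:
        part = "morning"
    elif hour < 18:
        part = "afternoon"
    else:
        part = "evening"
    return f"Good {part}, {name}"


def test_greets_by_time_of_day():
    # Only the clock is replaced; the logic under test runs for real.
    nine_am = lambda: datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
    assert greeting("Ada", now=nine_am) == "Good morning, Ada"
```

The test asserts on the returned string, not on how many times the clock was read, so it survives internal refactoring.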

Phase 2: Background Exploration

Spawn exploration agents in parallel when available:

Test structure scan:
- Find test files: *_test.go, test_*.py, *.test.ts, *.spec.ts
- Identify frameworks and helpers
- Find table-driven / parametrize / test.each patterns
- Locate mocks, fixtures, integration tests

Coverage analysis:
- Go: go test -coverprofile=/tmp/cc-cov.out ./... && go tool cover -func=/tmp/cc-cov.out
- Python: pytest --cov=. --cov-report=term-missing
- TypeScript: bun test --coverage

Exclude generated code, mocks, fixtures, type-only files, and trivial CLI entrypoints from coverage pressure.
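The scan and the exclusion rule can be sketched with simple filename matching. The four test patterns come from the list above; the directory names in `EXCLUDED_DIRS` and the `.d.ts` check for type-only files are assumptions that a real project would adjust.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Test file patterns named in the structure scan.
TEST_PATTERNS = ("*_test.go", "test_*.py", "*.test.ts", "*.spec.ts")

# Assumed directory names for code that should not create coverage pressure.
EXCLUDED_DIRS = {"generated", "mocks", "fixtures"}


def is_test_file(path: str) -> bool:
    """True when the file name matches one of the known test patterns."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in TEST_PATTERNS)


def under_coverage_pressure(path: str) -> bool:
    """True for source files whose missing coverage is worth reporting."""
    p = PurePosixPath(path)
    if is_test_file(path):
        return False
    if EXCLUDED_DIRS & set(p.parts[:-1]):
        return False
    # Assumption: TypeScript declaration files count as type-only.
    return not p.name.endswith(".d.ts")
```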

Phase 3: TDD Mode

Use this for tdd, test-first, or red-green-refactor requests.

Confirm the public interface and the first behavior.
Write one failing test for one behavior.
Run it and watch it fail for the expected reason.
Implement the smallest code that passes.
Run the narrow test.
Repeat one vertical slice at a time.
Refactor only when green.

Do not write all tests first. Bulk RED creates imagined tests coupled to guessed implementation.
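To make "one vertical slice at a time" concrete, here is the end state after two slices for a hypothetical `slugify` function. Each test was (in this imagined session) written and seen failing before the line of implementation that satisfies it was added.

```python
def slugify(title: str) -> str:
    """Hypothetical function built test-first.

    Slice 1 only needed title.lower(); slice 2 added the word joining.
    """
    return "-".join(title.lower().split())


def test_lowercases_title():
    # Slice 1: first behavior, first failing test.
    assert slugify("Hello") == "hello"


def test_joins_words_with_hyphens():
    # Slice 2: written only after slice 1 was green.
    assert slugify("Hello World") == "hello-world"
```

Neither test mentions how the string is split or joined, so the implementation stays free to change while green.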

Phase 4: Review and Improve

Detect the language from file extensions and load references/<lang>.md (go, python, typescript, or web; for mixed projects load several; for unknown languages fall back to the generic core). The active role decides what happens to the findings: a write-capable role (engineer) applies the improvements, while a read-only role (reviewer) emits them as a structured proposal.
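The detection step might look like the sketch below. The extension table is an assumption (the skill names the four reference files but not which extensions map to them), and `references_for` is a hypothetical helper name.

```python
from __future__ import annotations

from pathlib import PurePosixPath

# Assumed extension mapping; only the reference names come from the skill.
EXTENSION_TO_REFERENCE = {
    ".go": "go",
    ".py": "python",
    ".ts": "typescript",
    ".tsx": "typescript",
    ".html": "web",
    ".css": "web",
}


def references_for(paths: list[str]) -> list[str]:
    """Return the reference files to load for a set of changed or scanned paths."""
    langs = set()
    for path in paths:
        suffix = PurePosixPath(path).suffix
        if suffix in EXTENSION_TO_REFERENCE:
            langs.add(EXTENSION_TO_REFERENCE[suffix])
    # An empty list means "unknown": fall back to the generic core guidance.
    return [f"references/{lang}.md" for lang in sorted(langs)]
```

A mixed Go and Python project yields two reference files; a project with only unrecognized extensions yields none.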

Focus findings on:

tests coupled to private helpers or call counts
tests that should be table-driven / parametrized / test.each
duplicate scenarios
weak mocks (mock.Anything, unspecced mocks, untyped vi.fn) hiding real behavior
missing success, error, and edge cases on business logic
no usable seam for testing real behavior
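The "weak mocks" finding is easiest to see side by side. In this sketch `PaymentGateway` and `accepts_bad_call` are hypothetical; `Mock` and `create_autospec` are the standard `unittest.mock` tools. An unspecced `Mock` accepts a call the real class would reject, while an autospecced double enforces the real signature.

```python
from unittest.mock import Mock, create_autospec


class PaymentGateway:
    """Hypothetical system boundary: a client for an external payment API."""

    def charge(self, customer_id: str, cents: int) -> str:
        raise NotImplementedError("talks to the network in real use")


def accepts_bad_call(gateway) -> bool:
    """Return True if the double lets a call with a missing argument through."""
    try:
        gateway.charge("c1")  # the real signature also requires cents
    except TypeError:
        return False
    return True


weak = Mock()  # unspecced: any attribute, any arguments
strict = create_autospec(PaymentGateway, instance=True)  # mirrors the real API

weak_hides_bug = accepts_bad_call(weak)      # the broken call goes unnoticed
strict_hides_bug = accepts_bad_call(strict)  # the broken call raises TypeError
```

A test suite built on the unspecced double keeps passing after `charge` changes its signature, which is exactly the hidden behavior this phase is meant to surface.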

Phase 5: Apply Improvements

For refactoring brittle private-helper tests, state the public behavior surface first. Example: create_user(payload) is the primary test surface; _normalize_user_payload() is not. Replace duplicate helper tests and internal call-count assertions with behavior checks through the public API. Mock only system boundaries. Delete shallow duplicates once the public behavior tests cover them.

Preferred consolidation patterns:

Go — table-driven with t.Run(tc.name, ...)
Python — @pytest.mark.parametrize with pytest.param()
TypeScript — it.each([{ input, expected, name }])

Extract helpers only after 3+ repetitions and only when the helper improves readability. Hide setup noise; do not hide the behavior under test.
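The consolidation idea can be sketched with the standard library alone. The skill's preferred Python form is `@pytest.mark.parametrize`; the example below swaps in `unittest` with `subTest` so that it runs without third-party packages. `parse_port` is a hypothetical function under test.

```python
import unittest


def parse_port(text: str) -> int:
    """Hypothetical function under test: parse a TCP port number."""
    port = int(text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortTest(unittest.TestCase):
    # One table replaces several near-identical test functions.
    VALID_CASES = [
        ("lowest valid", "1", 1),
        ("typical", "8080", 8080),
        ("highest valid", "65535", 65535),
        ("surrounding whitespace", " 443 ", 443),
    ]

    def test_valid_ports(self):
        for name, text, expected in self.VALID_CASES:
            with self.subTest(name):
                self.assertEqual(parse_port(text), expected)

    def test_rejects_out_of_range(self):
        for text in ("0", "65536", "-1"):
            with self.subTest(text=text):
                with self.assertRaises(ValueError):
                    parse_port(text)
```

Each row carries a name, so a failure still points at one scenario, and the behavior under test stays visible in the table instead of being hidden in a helper.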

Phase 6: Verify and Report

Run and name the relevant verification command for the project. Examples:

go test ./...
pytest -v
bun test

For Python projects, name the command explicitly: pytest -v, or the repository's configured equivalent such as uv run pytest. This applies to refactor plans as well; do not write only "run tests." For other stacks, name the equivalent test command instead of reporting only "tests passed."

Output:

TEST IMPROVEMENT COMPLETE
=========================
Mode: review | refactor | coverage | tdd | full
Tests changed: N
Waste removed: N
Coverage: before → after (if measured)

Key improvements:
- file:line — change

Verification:
- <command> — pass/fail

If no tests or framework exist, report that and ask before creating a new testing stack.

Improve test design and coverage, including TDD/red-green-refactor guidance. Use when improving tests, refactoring tests, adding coverage, using TDD, or removing test waste. NOT for fixing production bugs (use fixing-code) or reviewing non-test code quality (use reviewing-code).

31 stars

development

Updated May 18, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add alexei-led/claude-code-config improving-tests

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy: Container and dependency vulnerability scanner. Clean (95%)
- Semgrep: Static code analysis for vulnerabilities. Clean (95%)
- mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
- Snyk (dep): Open source security scanning. Skipped (50%)
- Socket.dev: Supply chain security analysis. Skipped (50%)
- VirusTotal: Multi-engine malware detection. Skipped (50%)
- CrowdStrike: Advanced threat intelligence. Skipped (50%)
- OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
- OWASP Dep-Check: Skipped (50%)

Last scanned: May 18, 2026, 3:40 AM (299.4s, 1 file scanned)

SKILL.md

argument-hint: [review|refactor|coverage|tdd|full]
context: fork
description: Improve test design and coverage, including TDD/red-green-refactor guidance.
name: improving-tests
user-invocable: true

Related Skills

alexei-led/spec-flow

tools

Verified, Trusted, Community

Use when planning, executing, checkpointing, finishing, or inspecting lightweight spec-driven work. Runs one task at a time using `.spec/` markdown files and the bundled `specctl` helper. NOT for broad product discovery beyond a short requirement interview. NOT for generic implementation planning that does not read or write `.spec/` files.

35, SKILL.md, Updated Jul 18, 2026

alexei-led/writing-web

development

Verified, Trusted, Community

Simple web development with HTML, CSS, JS, and HTMX. Use when working with .html, .css, or .htmx files, web templates, stylesheets, or vanilla JS scripts. NOT for React/Vue/Angular (use writing-typescript) or Node.js backends.

35, SKILL.md, Updated Jul 18, 2026

alexei-led/writing-typescript

tools

Verified, Trusted, Community

Idiomatic TypeScript development. Use when writing TypeScript code, Node.js services, React apps, or TypeScript design advice. Emphasizes strict typing, boundary validation, composition, fast feedback, behavior tests, and project-configured tooling. NOT for Go, Python, Rust, plain HTML/CSS/JS, or server-rendered templates (use writing-web).

35, SKILL.md, Updated Jul 18, 2026

alexei-led/writing-shell

tools

Verified, Trusted, Community

Idiomatic shell development for POSIX sh, Bash, Zsh, Fish, hooks, CI shell steps, and scriptable CLI glue. Use when writing or changing `.sh`, `.bash`, `.zsh`, `.fish`, `.bats`, shell functions, shell pipelines, CI `run:` shell bodies, or command-runner recipes. Emphasizes portability, quoting, safe filesystem/process handling, non-TUI CLI tools, ShellCheck, shfmt, Bats, and ShellSpec. NOT for Python, Rust, TypeScript, Go, web code, or GitHub Actions workflow/job/permissions semantics; use operating-infra.

35, SKILL.md, Updated Jul 18, 2026

Install manually

# Clone the repo
git clone https://github.com/alexei-led/claude-code-config.git

# Copy into Claude Code skills folder (global)
cp -r claude-code-config/dist/claude/plugins/dev-workflow/skills/improving-tests ~/.claude/skills/

Repository: alexei-led/claude-code-config (31 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT