alexei-led/improving-tests

Name: improving-tests
Author: alexei-led

dist/codex/plugins/dev-workflow/skills/improving-tests/SKILL.md

npx skillsauth add alexei-led/claude-code-config improving-tests


Test Improvement

Improve tests by making them behavioral, lean, and useful.

Role-gated action

Detect your capability from your tools, not from prose:

Write-capable role (engineer): apply the test changes and run the verification command.
Read-only role (reviewer): identify the weak/missing/brittle tests and emit the changes in the Proposed Changes contract under Verify and report. Apply nothing; run nothing — a reviewer has no edit or Bash tools, so the review mode is its natural fit.
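
As a rough illustration of detecting the role from tools rather than prose, the check could look like the sketch below. The tool names (`Edit`, `Write`, `Bash`) and the `detect_role` function are hypothetical, and treating an edit-only agent as a reviewer is a conservative assumption, since the engineer role also has to run the verification command.

```python
# Hypothetical tool names; a real agent runtime may label these differently.
WRITE_TOOLS = {"Edit", "Write"}
RUN_TOOLS = {"Bash"}


def detect_role(available_tools: set[str]) -> str:
    """Return "engineer" only when the agent can both edit files and run commands."""
    can_edit = bool(available_tools & WRITE_TOOLS)
    can_run = bool(available_tools & RUN_TOOLS)
    # Assumption: without both capabilities, fall back to the read-only role.
    return "engineer" if can_edit and can_run else "reviewer"
```

An agent holding only read and search tools would resolve to `reviewer` and emit a proposal instead of applying changes.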

Language detection and references

Detect the language from the file extensions in scope and load the matching reference for language-specific test patterns and tooling:

Go → references/go.md
Python → references/python.md
TypeScript → references/typescript.md
Web → references/web.md

Mixed languages: load each matching reference. Unknown language: use the generic principles below only.
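
The lookup described above might be sketched as follows. The reference paths come from the list above; the extension mapping and the `references_for` helper are assumptions for illustration, not part of the skill.

```python
from pathlib import PurePath

# Reference paths are taken from the skill; the extensions are assumed.
REFERENCES = {
    ".go": "references/go.md",
    ".py": "references/python.md",
    ".ts": "references/typescript.md",
    ".tsx": "references/typescript.md",
    ".html": "references/web.md",
    ".css": "references/web.md",
}


def references_for(paths: list[str]) -> list[str]:
    """Return the sorted, de-duplicated references for the files in scope."""
    found = set()
    for path in paths:
        ref = REFERENCES.get(PurePath(path).suffix.lower())
        if ref is not None:
            found.add(ref)
    # An empty list means "unknown language": use the generic principles only.
    return sorted(found)
```

A mixed Go and Python change set yields both references, and a scope with no recognized extension yields an empty list.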

Modes

review → identify weak, duplicate, brittle, or missing tests
refactor → combine to table-driven/parametrized/test.each, remove waste
coverage → add tests for uncovered business behavior
tdd → red-green-refactor loop for a feature or bug
full → review + refactor + coverage

Testing principles

Test behavior through public interfaces, not implementation details.
The module interface is the test surface.
Mock only system boundaries: external APIs, network, time, randomness, filesystem, subprocesses.
Do not mock your own internal collaborators just to make tests easy.
Prefer integration-style tests when they give a clear, stable signal.
Delete old shallow tests once deeper interface tests cover the behavior.
No pointless tests for getters, constructors, default props, or generated glue.
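
A minimal sketch of these principles, assuming a hypothetical `is_trial_active` function whose only system boundary is the clock. The test replaces the clock and nothing else, and the private helper is exercised only through the public function.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def _days_between(start: datetime, end: datetime) -> int:
    # Private helper: covered through is_trial_active, never tested directly.
    return (end - start).days


def is_trial_active(
    started_at: datetime,
    trial_days: int = 14,
    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
) -> bool:
    """Public interface: True while fewer than trial_days full days have elapsed."""
    return _days_between(started_at, now()) < trial_days


def test_trial_expires_after_fourteen_days() -> None:
    start = datetime(2026, 1, 1, tzinfo=timezone.utc)
    # Only the clock (a system boundary) is substituted; the helper stays real.
    assert is_trial_active(start, now=lambda: start + timedelta(days=13))
    assert not is_trial_active(start, now=lambda: start + timedelta(days=14))
```

Passing the clock as a parameter is one possible seam; a project may already expose a different one.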

TDD workflow

Use this for test-first or red-green-refactor work.

Confirm the public interface and first behavior.
Write one failing test for one behavior.
Run it and watch it fail for the expected reason.
Implement the smallest code that passes.
Run the narrow test.
Repeat one vertical slice at a time.
Refactor only when green.

Do not write all tests first. Bulk RED creates imagined tests coupled to guessed implementation.
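
The loop above, shown for two vertical slices of a hypothetical `slugify` function. The code is the state after the second slice is green; the comments note what existed at each earlier step.

```python
import re


# Slice 1: written first and seen failing because slugify did not exist yet.
def test_slugify_replaces_spaces_with_hyphens() -> None:
    assert slugify("Hello World") == "hello-world"


# Slice 2: added only after slice 1 was green. It failed against the slice-1
# implementation, which was just: title.lower().replace(" ", "-")
def test_slugify_drops_punctuation() -> None:
    assert slugify("Hello, World!") == "hello-world"


def slugify(title: str) -> str:
    """Smallest implementation that passes both slices."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
```

Each test was written against the public function alone, so the implementation could change between slices without touching the earlier test.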

Review workflow

Explore the existing test suite. The engineer runs the commands below to gather coverage. The reviewer (no Bash) works from the test files in scope plus any coverage output the caller supplies; if that context is missing, ask for it rather than running the commands:

# Go
go test -coverprofile=/tmp/cc-cov.out ./... && go tool cover -func=/tmp/cc-cov.out

# Python
pytest --cov=. --cov-report=term-missing

# TypeScript
bun test --coverage

Look for:

tests coupled to private helpers or call counts
tests that should be table-driven / parametrized / test.each
duplicate scenarios
weak mocks hiding real behavior
missing success, error, and edge cases on business logic
no usable seam for testing real behavior

Preferred consolidation patterns

For refactoring brittle private-helper tests, state the public behavior surface first. Example: create_user(payload) is the primary test surface; _normalize_user_payload() is not. Replace duplicate helper tests and internal call-count assertions with behavior checks through the public API. Mock only system boundaries. Delete shallow duplicates once the public behavior tests cover them.

Go — table-driven with t.Run(tc.name, ...)
Python — @pytest.mark.parametrize with pytest.param()
TypeScript — it.each([{ input, expected, name }])

Extract helpers only after 3+ repetitions and only when the helper improves readability.
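
A sketch of the consolidation described above, reusing the `create_user` and `_normalize_user_payload` names from the example. The function bodies and the cases are hypothetical. To stay dependency-free, the table is walked with a plain loop; in a pytest project each row would typically become a `pytest.param(..., id=name)` entry passed to `@pytest.mark.parametrize`.

```python
def _normalize_user_payload(payload: dict) -> dict:
    # Private helper: not a test surface.
    return {"email": payload["email"].strip().lower(), "name": payload["name"].strip()}


def create_user(payload: dict) -> dict:
    """Public test surface: normalizes, validates, and returns the user record."""
    user = _normalize_user_payload(payload)
    if "@" not in user["email"]:
        raise ValueError("invalid email")
    return user


# One table of cases replaces several near-identical helper tests.
CASES = [
    ("lowercases email", {"email": "Ann@Example.COM", "name": "Ann"}, "ann@example.com"),
    ("trims whitespace", {"email": "  bob@example.com ", "name": " Bob "}, "bob@example.com"),
]


def test_create_user_normalizes_email() -> None:
    for name, payload, expected_email in CASES:
        # The case name is the assertion message, so a failure names its row.
        assert create_user(payload)["email"] == expected_email, name


def test_create_user_rejects_invalid_email() -> None:
    try:
        create_user({"email": "not-an-email", "name": "Eve"})
    except ValueError:
        return
    raise AssertionError("expected ValueError for an email without '@'")
```

Neither test asserts on how many times the helper ran, so renaming or inlining it would not break them.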

Verify and report

After applying the changes, the engineer runs the project's verification command and names it in the report. The reviewer names the command in the Proposed Changes rationale but does not run it (no Bash). Examples:

go test ./...
pytest -v
bun test

For Python, mention pytest or the project-specific equivalent explicitly. For refactor plans in Python projects, include pytest -v or the repository's configured uv run pytest command by name instead of only saying "run tests." For other stacks, name the equivalent test command instead of saying only "tests passed."

Engineer (applied the changes):

TEST IMPROVEMENT COMPLETE
=========================
Mode: review | refactor | coverage | tdd | full
Tests changed: N
Waste removed: N
Coverage: before → after (if measured)

Key improvements:
- file:line — change

Verification:
- <command> — pass/fail

Reviewer (identified only — emit the changes as a proposal, apply nothing):

## Proposed Changes

### Change 1: <brief description>

File: `path/to/test_file`
Action: CREATE | MODIFY | DELETE

Code:
<complete test code, in the file's language>

Rationale: <weak/missing/brittle test this addresses>

If no tests or framework exist, report that and ask before creating a new testing stack.

Improve test design and coverage, including TDD/red-green-refactor guidance. Use when improving tests, refactoring tests, adding coverage, using TDD, or removing test waste. NOT for fixing production bugs (use fixing-code) or reviewing non-test code quality (use reviewing-code).

31 stars

development

Updated May 18, 2026

npx skillsauth add alexei-led/claude-code-config improving-tests

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner) → Clean, 95%
Semgrep (static code analysis for vulnerabilities) → Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation) → Clean, 95%
Snyk (dep) (open source security scanning) → Skipped, 50%
Socket.dev (supply chain security analysis) → Skipped, 50%
VirusTotal (multi-engine malware detection) → Skipped, 50%
CrowdStrike (advanced threat intelligence) → Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check) → Skipped, 50%
OWASP Dep-Check → Skipped, 50%

Last scanned: May 18, 2026, 3:46 AM · 387.5s · 1 file scanned

Related Skills

alexei-led/spec-flow (tools)
Use when planning, executing, checkpointing, finishing, or inspecting lightweight spec-driven work. Runs one task at a time using `.spec/` markdown files and the bundled `specctl` helper. NOT for broad product discovery beyond a short requirement interview. NOT for generic implementation planning that does not read or write `.spec/` files.
35 · SKILL.md · Updated Jul 18, 2026

alexei-led/writing-web (development)
Simple web development with HTML, CSS, JS, and HTMX. Use when working with .html, .css, or .htmx files, web templates, stylesheets, or vanilla JS scripts. NOT for React/Vue/Angular (use writing-typescript) or Node.js backends.
35 · SKILL.md · Updated Jul 18, 2026

alexei-led/writing-typescript (tools)
Idiomatic TypeScript development. Use when writing TypeScript code, Node.js services, React apps, or TypeScript design advice. Emphasizes strict typing, boundary validation, composition, fast feedback, behavior tests, and project-configured tooling. NOT for Go, Python, Rust, plain HTML/CSS/JS, or server-rendered templates (use writing-web).
35 · SKILL.md · Updated Jul 18, 2026

alexei-led/writing-shell (tools)
Idiomatic shell development for POSIX sh, Bash, Zsh, Fish, hooks, CI shell steps, and scriptable CLI glue. Use when writing or changing `.sh`, `.bash`, `.zsh`, `.fish`, `.bats`, shell functions, shell pipelines, CI `run:` shell bodies, or command-runner recipes. Emphasizes portability, quoting, safe filesystem/process handling, non-TUI CLI tools, ShellCheck, shfmt, Bats, and ShellSpec. NOT for Python, Rust, TypeScript, Go, web code, or GitHub Actions workflow/job/permissions semantics; use operating-infra.
35 · SKILL.md · Updated Jul 18, 2026

Install manually

# Clone the repo
git clone https://github.com/alexei-led/claude-code-config.git

# Copy into Claude Code skills folder (global)
cp -r claude-code-config/dist/codex/plugins/dev-workflow/skills/improving-tests ~/.claude/skills/

Repository: alexei-led/claude-code-config (31 stars)

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT