Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

rileyhilliard/fixing-flaky-tests

Name: fixing-flaky-tests
Author: rileyhilliard

plugins/ce/skills/fixing-flaky-tests/SKILL.md

npx skillsauth add rileyhilliard/claude-essentials fixing-flaky-tests


Fixing Flaky Tests

Target symptom: Tests pass when run alone, fail when run with other tests.

Diagnose first

Test passes alone, fails with others?
    │
    ├─ Same error every time → Shared state
    │   └─ Database, globals, files, singletons
    │
    ├─ Random/timing failures → Race condition
    │   └─ Use `condition-based-waiting` skill
    │
    └─ Resource errors (port, file lock) → Resource conflict
        └─ Need unique resources per test/worker

Quick diagnosis:

1. Run the failing test 10x alone: does it always pass?
2. Run the failing test 10x with the suite: is the error the same each time, or different?
3. Check the error message: does it mention a port, file, or connection?
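The three checks above can be folded into a small triage helper. This is only a sketch: the function name, the return labels, and the keyword pattern are illustrative assumptions, not part of the skill, and the pattern should be widened for your own stack.

```python
import re

# Hypothetical keywords that hint at a contested resource (assumption, extend as needed).
RESOURCE_HINTS = re.compile(
    r"address already in use|\b(port|file|lock|connection)\b", re.IGNORECASE
)


def classify_flaky_test(passes_alone: bool, suite_errors: list[str]) -> str:
    """Map observations from repeated runs onto the categories in the tree above.

    passes_alone: True if every isolated run passed.
    suite_errors: error messages collected from failing runs inside the full suite.
    """
    if not passes_alone:
        return "fails alone too: not an ordering problem"
    if not suite_errors:
        return "no failures observed: run more iterations"
    # Resource errors are checked first because they can also repeat identically.
    if any(RESOURCE_HINTS.search(message) for message in suite_errors):
        return "resource conflict"
    # The same error on every failing run points at polluted shared state.
    if len(set(suite_errors)) == 1:
        return "shared state"
    # Varying errors across runs point at timing.
    return "race condition"
```

For example, three identical assertion messages from suite runs would be classified as shared state, while a mix of different timeout messages would be classified as a race condition.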

Shared state (deterministic failures)

Tests pollute state that other tests depend on. Fix by isolating state per test.

| State Type | Isolation Pattern |
|------------|-------------------|
| Database | Transaction rollback, savepoints, worker-specific DBs |
| Global variables | Reset in beforeEach/afterEach |
| Singletons | Provide fresh instance per test |
| Module state | jest.resetModules() or equivalent |
| Files | Unique paths per test, temp directories |
| Environment vars | Save/restore in setup/teardown |
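As one illustration of the table's last row, environment variables can be saved and restored around a test. The sketch below uses only the standard library; `preserved_environ` and `FLAKY_DEMO_FLAG` are hypothetical names chosen for the example.

```python
import os
from contextlib import contextmanager


@contextmanager
def preserved_environ():
    """Snapshot os.environ on entry and restore it on exit, even if the body raises."""
    saved = dict(os.environ)
    try:
        yield
    finally:
        os.environ.clear()
        os.environ.update(saved)


# Make sure the demo variable is not set before the example runs.
os.environ.pop("FLAKY_DEMO_FLAG", None)

with preserved_environ():
    os.environ["FLAKY_DEMO_FLAG"] = "1"            # the test mutates the environment
    inside = os.environ.get("FLAKY_DEMO_FLAG")

after = os.environ.get("FLAKY_DEMO_FLAG")          # the mutation is gone afterwards
```

In pytest, the built-in `monkeypatch` fixture (`monkeypatch.setenv`) covers the same need and undoes its changes at teardown.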

Database isolation (most common):

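# Python: savepoint rollback - each test's changes are rolled back
# Assumes pytest-asyncio and SQLAlchemy 2.x; db_engine is an AsyncEngine fixture
import pytest_asyncio
from sqlalchemy.ext.asyncio import AsyncSession

@pytest_asyncio.fixture
async def db_session(db_engine):
    async with db_engine.connect() as conn:
        await conn.begin()         # Outer transaction, never committed
        await conn.begin_nested()  # Savepoint
        async with AsyncSession(bind=conn) as session:
            yield session
        await conn.rollback()      # All changes vanish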

// Jest: Reset mocks between tests
beforeEach(() => {
  jest.clearAllMocks()
  jest.resetModules()  // Clear module cache before test
})

afterEach(() => {
  jest.restoreAllMocks()  // Restore spied functions
})

See language-specific references for complete patterns.
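The savepoint idea can also be tried without an async stack. The following is a minimal sketch using the standard-library `sqlite3` module rather than the SQLAlchemy setup above; the helper name and the `users` table are assumptions made for the example.

```python
import sqlite3


def run_in_rolled_back_transaction(conn, test_body):
    """Run test_body inside a savepoint, then discard everything it wrote."""
    conn.execute("BEGIN")                # outer transaction, never committed
    conn.execute("SAVEPOINT test_case")  # the test works inside this savepoint
    try:
        test_body(conn)
    finally:
        conn.execute("ROLLBACK")         # rolls back the outer transaction and the savepoint


# isolation_level=None disables implicit transactions so BEGIN/ROLLBACK are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (name TEXT)")


def insert_user(c):
    c.execute("INSERT INTO users VALUES ('alice')")
    # Stands in for a "commit" issued by the code under test.
    c.execute("RELEASE SAVEPOINT test_case")
    assert c.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1


run_in_rolled_back_transaction(conn, insert_user)
rows_after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

Because the outer transaction is never committed, the row is visible inside the test but absent afterwards, so the next test starts from the same state.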

Race conditions (random failures)

Tests don't wait for async operations to complete.

Use the condition-based-waiting skill for detailed patterns on:

- Framework-specific waiting (Testing Library findBy, Playwright auto-wait)
- Custom polling helpers
- When arbitrary timeouts are acceptable

Quick summary: Wait for conditions, not time:

// Bad
await sleep(500)

// Good
await waitFor(() => expect(result).toBe('done'))
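The same principle in Python might look like the polling helper below. It is a sketch, not the API of the condition-based-waiting skill: `wait_for`, its parameters, and the `result` dictionary are hypothetical names for the example.

```python
import threading
import time


def wait_for(condition, timeout=2.0, interval=0.01):
    """Poll condition() until it returns a truthy value or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


# Simulate an async operation that finishes after roughly 50 ms.
result = {"status": "pending"}
timer = threading.Timer(0.05, lambda: result.update(status="done"))
timer.start()

# Returns as soon as the status flips, instead of sleeping a fixed 500 ms.
wait_for(lambda: result["status"] == "done")
timer.join()
```

The helper returns as soon as the condition holds, so a fast machine does not pay for the worst case, and a slow one fails with a clear timeout instead of a misleading assertion.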

Resource conflicts (port/file errors)

Multiple tests or workers compete for the same resource.

Worker-specific resources:

# Python pytest-xdist: unique DB per worker
@pytest.fixture(scope="session")
def database_url(worker_id):
    if worker_id == "master":
        return "postgresql://localhost/test"
    return f"postgresql://localhost/test_{worker_id}"

// Jest/Node: dynamic port allocation
const server = app.listen(0)  // OS assigns available port
const port = server.address().port
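A Python counterpart to the port example, sketched with the standard-library `socket` module. The helper name `listen_on_free_port` is an assumption for illustration.

```python
import socket


def listen_on_free_port(host="127.0.0.1"):
    """Bind to port 0 so the OS picks a free port; return the listening socket and its port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen()
    return sock, sock.getsockname()[1]


# Two "workers" asking at the same time receive different ports.
first, first_port = listen_on_free_port()
second, second_port = listen_on_free_port()

first.close()
second.close()
```

Returning the already-bound socket, rather than only the port number, avoids the window in which another process could claim the port between discovery and use.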

File conflicts:

import tempfile

import pytest

@pytest.fixture
def temp_dir():
    # Each test gets its own directory, removed automatically afterwards
    with tempfile.TemporaryDirectory() as d:
        yield d

Language-specific isolation patterns

| Stack | Reference |
|-------|-----------|
| Python (pytest, SQLAlchemy) | references/python.md |
| Jest / Testing Library | references/jest.md |
| Playwright E2E | references/playwright.md |

Verification

After fixing, verify the fix worked:

# Run the specific test many times (--count requires the pytest-repeat plugin)
pytest tests/test_flaky.py -x --count=20

# Run with parallelism (-n requires the pytest-xdist plugin)
pytest -n auto

# Jest equivalent
jest --runInBand  # First verify serial works
jest              # Then verify parallel works
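Where a repeat plugin is not available, the "run it many times" step can be approximated with a small script. This is a sketch under that assumption; `first_failing_run` is a hypothetical helper, and the two demo commands merely stand in for a real test command.

```python
import subprocess
import sys


def first_failing_run(command, runs=20):
    """Run command up to `runs` times; return the 1-based index of the first failure, or None."""
    for attempt in range(1, runs + 1):
        completed = subprocess.run(command, capture_output=True, text=True)
        if completed.returncode != 0:
            return attempt
    return None


# Stand-ins for a test command: one that always passes, one that always fails.
always_passes = first_failing_run([sys.executable, "-c", "pass"], runs=3)
always_fails = first_failing_run([sys.executable, "-c", "raise SystemExit(1)"], runs=3)
```

With a real suite, the call would look like `first_failing_run(["pytest", "tests/test_flaky.py", "-x"])`; a return value of `None` after many runs is evidence, though not proof, that the fix holds.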


Diagnose and fix tests that pass in isolation but fail when run concurrently. Covers shared state isolation and resource conflicts. References condition-based-waiting for timing issues.

111 stars

testing

Updated Apr 11, 2026


Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 11, 2026, 10:56 PM. Duration: 4.7s. 4 files scanned.


Related Skills

rileyhilliard/structuring-articles

development

Verified, Trusted, Community

Selects and applies professional journalistic story structures (WSJ Formula, Inverted Pyramid, Hourglass, Tick-Tock, etc.) based on the content being written. Use when writing articles, blog posts, features, essays, long-form content, news stories, trend pieces, investigative reports, profiles, or any narrative prose longer than a few paragraphs. Also use when the user asks for help structuring a piece, choosing a story framework, organizing a draft, outlining an article, or wants to know which article format fits their content. Trigger on requests like "help me structure this," "what format should I use," "write a feature about," "draft a blog post on," or any mention of story structure, article architecture, or narrative frameworks. Complements the writer skill (which handles tone and anti-AI rhetoric) by providing the structural blueprint.

122 | SKILL.md | Updated May 28, 2026

rileyhilliard/writer

testing

Verified, Trusted, Community

Writing style and tone guide for human-sounding content. Use when writing documentation, READMEs, commit messages, PR descriptions, blog posts, LinkedIn posts, social media content, or any user-facing content.

122 | SKILL.md | Updated Apr 11, 2026

rileyhilliard/writing-plans

data-ai

Verified, Trusted, Community

Create implementation plans with tasks grouped by subsystem. Related tasks share agent context; groups parallelize across subsystems.

118 | SKILL.md | Updated Apr 11, 2026

rileyhilliard/systematic-debugging

development

Verified, Trusted, Community

Debugging framework that finds root causes before proposing fixes. Use when investigating bugs, errors, unexpected behavior, failed tests, or when previous fixes haven't worked.

118 | SKILL.md | Updated Apr 11, 2026

Download

For Claude Desktop. Download once, then upload the file in the app — no terminal needed.


Install manually


# Clone the repo
git clone https://github.com/rileyhilliard/claude-essentials.git

# Copy into Claude Code skills folder (global)
cp -r claude-essentials/plugins/ce/skills/fixing-flaky-tests ~/.claude/skills/


Repository

rileyhilliard/claude-essentials

111 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT