ad-sdl/galahad

Name: galahad
Author: ad-sdl

skills/galahad/SKILL.md

npx skillsauth add ad-sdl/madsci galahad

Coding Agent Quality Rules (Galahad Principle)

Based on Jonathan Lange's "The Galahad Principle": https://jml.io/galahad-principle/

Core idea: getting to 100% yields disproportionate value—especially simplicity and trust. When checks are truly "all green", any new failure is a strong, unambiguous signal; "absence of evidence becomes evidence of absence".

Assess Before Applying

Before enforcing these rules strictly, understand the context:

Read project conventions: Check tsconfig.json, pyproject.toml, .eslintrc, setup.cfg, mypy.ini for existing standards
Gauge existing tech debt: If the codebase already has 500 any types, don't block progress on fixing all of them
Match scope to task: A quick bug fix ≠ a new feature ≠ a refactor

When working in a codebase that doesn't meet these standards:

Don't make things worse: No new type escapes, no new skipped tests
Opportunistically improve: Clean up what you touch
Don't block the user's goal: Pragmatic progress beats ideological purity
Use code ratchets: Apply the "code-ratchets" skill to shrink the count of legacy patterns gradually
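
The "don't make things worse" rule can be checked mechanically. The following is a minimal sketch of a ratchet, not the "code-ratchets" skill's actual implementation: the marker list, the function names, and the baseline handling are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical marker list; adjust it to the suppressions your project tracks.
SUPPRESSION_PATTERN = re.compile(
    r"#\s*type:\s*ignore|#\s*pragma:\s*no cover|#\s*noqa"
)


def count_suppressions(root: Path, glob: str = "**/*.py") -> int:
    """Count suppression comments in every file under root that matches glob."""
    total = 0
    for path in sorted(root.glob(glob)):
        text = path.read_text(encoding="utf-8")
        total += len(SUPPRESSION_PATTERN.findall(text))
    return total


def check_ratchet(current: int, baseline: int) -> str:
    """Compare the current count with the recorded baseline."""
    if current > baseline:
        return "fail: new suppressions were added"
    if current < baseline:
        return "update: lower the baseline to lock in the improvement"
    return "ok"
```

Recording the baseline in the repository and running the check in CI means existing debt is tolerated, while any new escape shows up as a failure.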

Discovering project standards

TypeScript: Check tsconfig.json for strict, noImplicitAny, strictNullChecks. Match existing settings.

Python: Check for mypy.ini, pyproject.toml [tool.mypy], pyrightconfig.json. Note the strictness level.

General: Look at existing test files for patterns, existing code for style. When in doubt, match what's there.
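
As one way to "note the strictness level" for Python, the sketch below reads the strict flag from a mypy.ini file. The function name and return strings are hypothetical, and it deliberately ignores pyproject.toml and pyrightconfig.json, which a fuller check would also inspect.

```python
import configparser
from pathlib import Path


def mypy_strictness(project_root: Path) -> str:
    """Describe how strictly mypy is configured in project_root/mypy.ini."""
    ini_path = project_root / "mypy.ini"
    if not ini_path.is_file():
        return "no mypy.ini found"
    parser = configparser.ConfigParser()
    parser.read(ini_path, encoding="utf-8")
    if not parser.has_section("mypy"):
        return "mypy.ini has no [mypy] section"
    # A missing "strict" key is treated as non-strict.
    strict = parser.getboolean("mypy", "strict", fallback=False)
    return "strict" if strict else "non-strict"
```

The result only tells you which standard to match; it is not a reason to tighten or loosen the project's settings on your own.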

Non-negotiables: never evade feedback

Treat type errors, test failures, pre-commit hooks, lint errors, and coverage warnings as helpful feedback. Fix root causes.

Forbidden by default (unless the user explicitly orders it)

Type escapes / silencing
- TypeScript: any, sketchy unknown laundering, unchecked casts, as any, @ts-ignore, disabling strict mode, weakening compiler flags
- Python: # type: ignore, # pyright: ignore, # mypy: ignore-errors, cast() without justification, Any in public APIs, disabling type checkers
- General: noqa, pragma comments to silence legitimate warnings
Coverage gaming
- TypeScript: /* istanbul ignore */, /* c8 ignore */, artificial exclusions in config
- Python: # pragma: no cover, # coverage: skip, excluding entire modules from coverage config
- General: "generated" file tricks, decorator/macro suppression, lowering coverage thresholds
Faking results
- Skipping CI steps and claiming success; "snapshotting" coverage; lowering thresholds; marking tests flaky to ignore them
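
To illustrate fixing a root cause instead of silencing the checker, here is a small sketch with hypothetical function names. A type checker would flag the call on a possibly-None value; handling the None case removes the error and the crash path together.

```python
from typing import Optional


def find_user_email(users: dict[str, str], user_id: str) -> Optional[str]:
    """Return the email for user_id, or None when the user is unknown."""
    return users.get(user_id)


def email_domain(users: dict[str, str], user_id: str) -> str:
    """Return the domain part of a known user's email address."""
    email = find_user_email(users, user_id)
    # Adding "# type: ignore" to the return line would hide a real bug:
    # email is None for unknown users. Handle that case explicitly instead.
    if email is None:
        raise KeyError(f"unknown user: {user_id}")
    return email.rsplit("@", 1)[-1]
```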

When user requests conflict with these principles

If the user explicitly asks for a type escape, to skip tests, or similar:

Comply, but note the tradeoff: "Adding any here—this will need cleanup before the type system can catch errors in this area."
Offer alternatives briefly: "If you prefer, I could extract this to a small typed helper instead."
Don't lecture: One sentence, then move on.

The user owns the codebase. Your job is to inform, not obstruct.

Priorities

Type safety is part of correctness and outranks tests.

When tradeoffs exist, prioritize in this order:

1. Type safety / soundness
2. Correctness + meaningful tests
3. Clarity / maintainability
4. Performance
5. Backwards compatibility

Breaking changes are acceptable when they improve verifiability and simplify the system, but:

Flag breaking changes explicitly to the user
Prefer non-breaking improvements when effort is similar
Consider migration paths for public APIs

Default workflow (when anything fails)

1. Read the failure output carefully.
2. Understand the context: Why does this code exist? What was the original intent? Check git history or ask if unclear.
3. Restate the real invariant being violated in plain English.
4. Fix the root cause (not the symptom).
5. Improve tests so the behavior is pinned and regressions get caught.
6. Refactor production code if needed to make it easy to type-check and validate.

Run checks in this order

1. Typecheck
2. Unit tests
3. Integration tests
4. Doc and end-to-end tests
5. Lint / pre-commit
6. Coverage

Goal: a repo where "all green" is normal, and any new red is a loud, trustworthy signal.
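
The ordering above could be scripted roughly as follows. This is a sketch under assumptions: the helper name is hypothetical, and the commands and directory paths in PYTHON_CHECKS stand in for whatever tools the project actually uses. Doc, end-to-end, and coverage steps would slot in the same way.

```python
import subprocess
import sys
from typing import Optional, Sequence

Check = tuple[str, Sequence[str]]


def first_failing_check(checks: Sequence[Check]) -> Optional[str]:
    """Run each command in order; return the first failing name, or None."""
    for name, argv in checks:
        result = subprocess.run(argv, check=False)
        if result.returncode != 0:
            # Stop here so the earliest, cheapest signal is fixed first.
            return name
    return None


# Hypothetical commands and paths for a Python project; substitute your own.
PYTHON_CHECKS: list[Check] = [
    ("typecheck", [sys.executable, "-m", "mypy", "."]),
    ("unit tests", [sys.executable, "-m", "pytest", "tests/unit"]),
    ("integration tests", [sys.executable, "-m", "pytest", "tests/integration"]),
    ("lint / pre-commit", ["pre-commit", "run", "--all-files"]),
]
```

Stopping at the first failure keeps the feedback focused: a type error is usually cheaper to diagnose than the test failures it would cause downstream.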

What makes a test meaningful

✅ Meaningful tests:

Test observable behavior from the caller's perspective
Would catch real regressions
Document intent and edge cases
Fail when actual bugs are introduced

❌ Not meaningful:

Test implementation details (private methods, internal state)
Duplicate what the type checker already verifies
Assert only on mock interactions, not outcomes
Pass regardless of whether the code works

The test: "If this test failed, would I learn something useful about a real bug?"
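
A short sketch of the contrast, using a hypothetical apply_discount function. The first two tests would fail if the function broke; the third passes no matter what the function computes.

```python
from unittest.mock import Mock


def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, rounded down to cents."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


def test_discount_reduces_price() -> None:
    # Meaningful: asserts on the observable result a caller depends on.
    assert apply_discount(1000, 25) == 750


def test_discount_rejects_invalid_percent() -> None:
    # Meaningful: pins an edge case, so a regression in validation is caught.
    try:
        apply_discount(1000, 150)
    except ValueError:
        return
    raise AssertionError("expected ValueError for percent > 100")


def test_only_checks_mock_interaction() -> None:
    # Not meaningful: it never calls apply_discount, so it cannot detect a bug.
    fake = Mock(return_value=0)
    fake(1000, 25)
    fake.assert_called_once_with(1000, 25)
```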

Coverage: aim for meaningful, not mechanical

Do: Cover all business logic paths, edge cases, error handling
Don't: Chase 100% by testing trivial getters or truly unreachable defensive code
Legitimate exclusions exist: Platform-specific branches, debug-only code, abstract method stubs
The bar: Would a failure in this line indicate a real bug? If yes, cover it.

Coverage comes from exercising real behavior, not from exclusion comments.

Handling flaky tests

If a test is genuinely flaky:

Identify the source: Time-dependence? Race condition? External service? Order-dependence?
Fix the non-determinism: Inject clocks, add synchronization, mock external calls, isolate state
If unfixable now: Quarantine in a separate test suite (not skipped, but run separately and tracked)
Never: Mark as "expected flaky" and leave in the main CI path
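
For the time-dependence case, injecting the clock makes the test deterministic. The names below (utc_now, is_expired) are hypothetical and only illustrate the shape of the fix.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable


def utc_now() -> datetime:
    """Production clock: the current time in UTC."""
    return datetime.now(timezone.utc)


def is_expired(
    issued_at: datetime,
    ttl: timedelta,
    *,
    now: Callable[[], datetime] = utc_now,
) -> bool:
    """Return True once ttl has fully elapsed since issued_at."""
    return now() >= issued_at + ttl


def test_token_expires_exactly_at_ttl() -> None:
    issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
    ttl = timedelta(minutes=5)
    # A fixed clock replaces sleeping or racing against the wall clock.
    just_before = issued + timedelta(minutes=4, seconds=59)
    assert not is_expired(issued, ttl, now=lambda: just_before)
    assert is_expired(issued, ttl, now=lambda: issued + ttl)
```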

"Hard to test" means refactor

If something is hard to test or hard to type, treat it as a design smell.

Refactor towards:

Smaller pure functions
Explicit data flow, minimal global state
Clear boundaries between logic and side effects
Typed domain models over stringly-typed data
- TypeScript: strong interfaces/types instead of Record<string, any>
- Python: dataclasses, Pydantic models, or TypedDicts instead of dict[str, Any]
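
As an illustration of the last point, the sketch below validates untyped input once at the boundary and hands a typed model to the rest of the code. Order, OrderStatus, and parse_order are hypothetical names, not part of any existing API.

```python
from dataclasses import dataclass
from enum import Enum


class OrderStatus(Enum):
    PENDING = "pending"
    SHIPPED = "shipped"


@dataclass(frozen=True)
class Order:
    order_id: str
    quantity: int
    status: OrderStatus

    def __post_init__(self) -> None:
        # The invariant lives with the type, not in every caller.
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


def parse_order(raw: dict[str, object]) -> Order:
    """Validate untyped input at the boundary and return a typed Order."""
    order_id = raw.get("order_id")
    quantity = raw.get("quantity")
    status = raw.get("status")
    if not isinstance(order_id, str):
        raise ValueError("order_id must be a string")
    # bool is a subclass of int, so reject it explicitly.
    if not isinstance(quantity, int) or isinstance(quantity, bool):
        raise ValueError("quantity must be an integer")
    if not isinstance(status, str):
        raise ValueError("status must be a string")
    # OrderStatus(status) raises ValueError for unknown status strings.
    return Order(order_id=order_id, quantity=quantity, status=OrderStatus(status))
```

Code downstream of parse_order can then rely on the type checker instead of re-checking dictionary keys.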

Mocks: use sparingly and explicitly

Avoid injecting mocks via monkeypatching or replacing system utilities by default.

Preferred approach:

Make the function under test able to operate in multiple environments by passing in the substitutable operations explicitly (as function parameters or small interfaces)
Only do this for operations that genuinely need substitution in tests: time, randomness, network, filesystem, process execution
This makes the injection point explicit, documents what varies, and keeps tests honest

Examples:

TypeScript:

// ❌ Bad: hard-coded dependency, requires monkeypatching to test
function processOrder(orderId: string) {
  const now = new Date();
  const order = database.getOrder(orderId);
  // ...
}

// ✅ Good: explicit dependencies
function processOrder(
  orderId: string,
  deps: { getTime: () => Date; getOrder: (id: string) => Order }
) {
  const now = deps.getTime();
  const order = deps.getOrder(orderId);
  // ...
}

Python:

from collections.abc import Callable
from datetime import datetime

# Order, OrderResult, and database are assumed to be defined elsewhere in the project.

# ❌ Bad: hard-coded dependency, requires monkeypatching to test
def process_order(order_id: str) -> OrderResult:
    now = datetime.now()
    order = database.get_order(order_id)
    # ...

# ✅ Good: explicit dependencies, with production defaults that tests can override
def process_order(
    order_id: str,
    *,
    get_time: Callable[[], datetime] = datetime.now,
    get_order: Callable[[str], Order] = database.get_order,
) -> OrderResult:
    now = get_time()
    order = get_order(order_id)
    # ...

Summary: what "good" looks like

Types encode invariants; no "trust me" casts
Tests assert observable behavior (not implementation trivia)
Coverage comes from exercising real behavior, not exclusions
If a thing can't be verified cleanly, refactor until it can
Progress beats perfection; don't make things worse, do make things better

Description: How to approach tests, types, lints, and coverage
Stars: 51
Category: development
Updated: Mar 27, 2026

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

npx skillsauth add ad-sdl/madsci galahad

Security Scan Results

3 of 9 scanners reported clean. Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

- Trivy: Container and dependency vulnerability scanner. Clean (95%)
- Semgrep: Static code analysis for vulnerabilities. Clean (95%)
- mcp-scan (Snyk): Model Context Protocol security validation. Clean (95%)
- Snyk (dep): Open source security scanning. Skipped (50%)
- Socket.dev: Supply chain security analysis. Skipped (50%)
- VirusTotal: Multi-engine malware detection. Skipped (50%)
- CrowdStrike: Advanced threat intelligence. Skipped (50%)
- OSV-Scanner: Open Source Vulnerability database check. Skipped (50%)
- OWASP Dep-Check: Skipped (50%)

Last scanned: Mar 30, 2026, 11:06 AM (55.7s, 1 file scanned)

Related Skills

ad-sdl/code-ratchets
development · Verified, Trusted, Community · 51 · SKILL.md · Updated Mar 27, 2026
Implement code quality ratchets to prevent proliferation of deprecated patterns. Use when (1) migrating away from legacy code patterns, (2) enforcing gradual codebase improvements, (3) preventing copy-paste proliferation of deprecated practices, or (4) setting up pre-commit hooks to count and limit specific code patterns. A ratchet fails if the pattern count exceeds OR falls below the expected value, ensuring patterns never increase and prompting updates when they decrease.

openclaw/openclaw-secret-scanning-maintainer
development · Verified, Trusted, Community · 357,764 · SKILL.md · Updated Apr 15, 2026
Maintainer-only workflow for handling GitHub Secret Scanning alerts on OpenClaw. Use when Codex needs to triage, redact, clean up, and resolve secret leakage found in issue comments, issue bodies, PR comments, or other GitHub content.

openclaw/openclaw-release-maintainer
development · Verified, Trusted, Community · 357,764 · SKILL.md · Updated Apr 10, 2026
Maintainer workflow for OpenClaw releases, prereleases, changelog release notes, and publish validation. Use when Codex needs to prepare or verify stable or beta release steps, align version naming, assemble release notes, check release auth requirements, or validate publish-time commands and artifacts.

openclaw/openclaw-qa-testing
development · Verified, Trusted, Community · 357,764 · SKILL.md · Updated Apr 10, 2026
Run, watch, debug, and extend OpenClaw QA testing with qa-lab and qa-channel. Use when Codex needs to execute the repo-backed QA suite, inspect live QA artifacts, debug failing scenarios, add new QA scenarios, or explain the OpenClaw QA workflow. Prefer the live OpenAI lane with regular openai/gpt-5.4 in fast mode; do not use gpt-5.4-pro or gpt-5.4-mini unless the user explicitly overrides that policy.

Install manually

# Clone the repo
git clone https://github.com/ad-sdl/madsci.git

# Copy into Claude Code skills folder (global), creating it if it does not exist
mkdir -p ~/.claude/skills
cp -r madsci/skills/galahad ~/.claude/skills/

Repository

ad-sdl/madsci

51 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT