skills/testing/SKILL.md
Skill validation framework PLUS daily test-suite health and regression intelligence. Validates skill conformance (frontmatter, manifest coverage, resolver coverage). Runs the project test suite in tiered phases (unit / evals / integration / system health), classifies failures, and produces a regression-aware report.
Convention: see conventions/quality.md for the test-before-bulk pattern; this skill enforces it across the project's own test suite.
This skill has two related but distinct modes:
Skill conformance validation — gbrain's own conformance bar (the original 1.0 scope). Validates every skill has SKILL.md with frontmatter, every reference exists, manifest + resolver coverage round-trips.
Project test-suite health (v0.25.1 extension) — runs the project's tiered test suite and produces a regression-classified report. Used by daily cron, container-restart bootstrap, and "how are the tests" prompts.
Pick the mode by trigger.
This mode guarantees:
- Every skill has a SKILL.md file
- SKILL.md has valid YAML frontmatter (name, description)
- SKILL.md has required sections per test/skills-conformance.test.ts
- skills/manifest.json lists every skill directory
- skills/RESOLVER.md references every skill in the manifest
- openclaw.plugin.json skills[] round-trips with both
SKILL.md.
manifest.json.
bun test test/skills-conformance.test.ts test/resolver.test.ts
The CI-gated check is the package.json test script.
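As an illustration of the frontmatter guarantee, here is a minimal sketch of what such a check could look like. It is not taken from test/skills-conformance.test.ts: the `checkFrontmatter` name, its return shape, and the line-based scan (rather than a real YAML parser) are all assumptions made for the example.

```typescript
// Hypothetical helper, not the project's actual conformance test.
interface FrontmatterCheck {
  ok: boolean;
  missing: string[]; // required keys that were not found
}

// Checks that a SKILL.md body opens with a `---` fenced block and that the
// block declares the required top-level keys. This is a line-based scan,
// not a full YAML parse, so nested or quoted keys are not understood.
function checkFrontmatter(
  markdown: string,
  required: string[] = ["name", "description"],
): FrontmatterCheck {
  const lines = markdown.split(/\r?\n/);
  const first = lines[0] ?? "";
  if (first.trim() !== "---") {
    return { ok: false, missing: [...required] };
  }
  // Find the closing fence after the opening one.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) {
    return { ok: false, missing: [...required] };
  }
  const keys = new Set<string>();
  for (const line of lines.slice(1, end)) {
    const key = /^([A-Za-z_][\w-]*):/.exec(line)?.[1];
    if (key) keys.add(key);
  }
  const missing = required.filter((key) => !keys.has(key));
  return { ok: missing.length === 0, missing };
}
```

A file with no fence at all is reported as missing every required key instead of throwing, so one malformed skill would not abort a scan of the whole directory.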
Skill Validation Report
========================
Skills found: N
Conformance: N/N pass
Manifest coverage: N/N
Resolver coverage: N/N
Round-trip: N/N
MECE violations: N
Issues:
- <skill>: <issue>
| Tier | What it runs | Wall time | Gates |
|------|--------------|-----------|-------|
| Unit | bun test (deterministic, zero external calls) | <2s | Every commit |
| Evals | LLM-judge or quality evals | ~60s | Daily |
| Integration | E2E tests against real Postgres | ~5m | Pre-ship + nightly |
| System health | Disk / memory / CPU / service liveness | <10s | Daily |
When the cron fires (or the user asks), do ALL of this:
bun test 2>&1
Parse: total passed, total failed, total skipped, file-level results.
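The parse step above could be sketched roughly as follows. This assumes the runner prints summary lines of the form `<count> pass`, `<count> fail`, and `<count> skip`, each on its own line; that format is an assumption to verify against real output. `parseSummary` and `TestCounts` are hypothetical names, and file-level results are not handled here.

```typescript
// Hypothetical parser; the summary-line format is an assumption.
interface TestCounts {
  passed: number;
  failed: number;
  skipped: number;
}

function parseSummary(output: string): TestCounts {
  // Finds the first line that looks like "<digits> <label>" and returns the
  // number, or 0 when no such line exists.
  const count = (label: string): number => {
    const match = new RegExp(`^\\s*(\\d+)\\s+${label}\\b`, "m").exec(output);
    return match ? Number(match[1]) : 0;
  };
  return {
    passed: count("pass"),
    failed: count("fail"),
    skipped: count("skip"),
  };
}
```

Returning 0 for a missing line keeps the report usable when a run has no skips, but it also hides a crashed run that printed no summary at all, so a real implementation would probably want to check the exit code too.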
# Adapt to the project's eval config
bun test --filter eval 2>&1
Parse: same format. Note any flakes (tests that fail due to API timeouts, not code bugs).
gbrain doctor --fast --json
# What changed since last test run?
git log --oneline --since="24 hours ago"
For each failing test:
| Classification | Marker | Action |
|----------------|--------|--------|
| REGRESSION: code changed, test broke | 🔴 | Flag with the commit that broke it |
| STALE: test expects old behavior; code is correct | 🟡 | Fix the test, not the code |
| FLAKE: API timeout, service down, LLM variance | ⚠️ | Note, don't alarm; retry once |
| NEW: test was just added and isn't passing yet | 🟢 | Check if intentional |
| INFRA: container restart wiped state | 🛠 | Run bootstrap, retest |
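One way to picture the classification table is as a small decision function. The sketch below is only an illustration: the `FailureSignals` fields, the order in which signals are checked, and the extra `UNCERTAIN` outcome (standing in for "ask the user") are assumptions, not part of the skill's contract.

```typescript
type Classification = "REGRESSION" | "STALE" | "FLAKE" | "NEW" | "INFRA";

// Markers copied from the classification table.
const MARKERS: Record<Classification, string> = {
  REGRESSION: "🔴",
  STALE: "🟡",
  FLAKE: "⚠️",
  NEW: "🟢",
  INFRA: "🛠",
};

// Hypothetical signals an agent might gather for one failing test.
interface FailureSignals {
  stateWiped: boolean;           // e.g. a container restart removed local state
  retryPassed: boolean;          // the single retry succeeded
  testAddedRecently: boolean;    // the test first appears in the recent git log
  codeChangedRecently: boolean;  // code under test changed in the same window
  changeWasIntentional: boolean; // commit message or PR says the new behavior is intended
}

// Assumed precedence: environment problems first, then flakes, then new
// tests, then the code-change cases. UNCERTAIN is not a category from the
// table; it marks cases where a human should be asked.
function classify(s: FailureSignals): Classification | "UNCERTAIN" {
  if (s.stateWiped) return "INFRA";
  if (s.retryPassed) return "FLAKE";
  if (s.testAddedRecently) return "NEW";
  if (s.codeChangedRecently) {
    return s.changeWasIntentional ? "STALE" : "REGRESSION";
  }
  return "UNCERTAIN";
}
```

Checking infrastructure and flake signals before looking at commits is a deliberate choice in this sketch: it avoids blaming a commit for a failure that a bootstrap or a retry would have cleared.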
🧪 Daily Tests — YYYY-MM-DD
Unit: X/Y passed (Z skipped)
Evals: X/Y passed
System: [health summary]
REGRESSIONS:
🔴 <test-name>: broke by commit <sha> "<commit message>"
STALE TESTS:
🟡 <test-name>: expects X but code now does Y (commit <sha>)
FLAKES:
⚠️ <test-name>: timeout (retry passed)
✅ ALL CLEAR (when applicable)
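The failure sections of the report template can be produced from a list of classified failures. The following is a minimal sketch under assumed names (`ClassifiedFailure`, `renderFailureSections`); it covers only the REGRESSIONS, STALE TESTS, and FLAKES sections plus the all-clear line, not the header or the per-tier counts.

```typescript
// Hypothetical input shape: one entry per failing test.
interface ClassifiedFailure {
  test: string;
  kind: "REGRESSION" | "STALE" | "FLAKE";
  detail: string; // free-form text placed after the test name
}

// Section order, headings, and markers follow the report template.
const SECTIONS = [
  { kind: "REGRESSION", heading: "REGRESSIONS:", marker: "🔴" },
  { kind: "STALE", heading: "STALE TESTS:", marker: "🟡" },
  { kind: "FLAKE", heading: "FLAKES:", marker: "⚠️" },
] as const;

function renderFailureSections(failures: ClassifiedFailure[]): string {
  if (failures.length === 0) return "✅ ALL CLEAR";
  const lines: string[] = [];
  for (const section of SECTIONS) {
    const matching = failures.filter((f) => f.kind === section.kind);
    if (matching.length === 0) continue; // omit empty sections entirely
    lines.push(section.heading);
    for (const f of matching) {
      lines.push(`${section.marker} ${f.test}: ${f.detail}`);
    }
  }
  return lines.join("\n");
}
```

Empty sections are skipped so that a run with a single flake does not print bare REGRESSIONS and STALE TESTS headings.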
DO auto-fix:
DO NOT auto-fix:
When uncertain: check the commit message that changed the code, check if there's a related PR or conversation, ask the user if still unclear.
Track results in ~/.gbrain/test-state.json for trend tracking:
{
  "lastRun": "2026-04-16T13:37:00Z",
  "unit": { "passed": 1262, "failed": 31, "skipped": 8 },
  "evals": { "passed": 17, "failed": 0 },
  "system": { "doctor": "ok", "gbrain": "0.25.1" },
  "failureHistory": [
    { "test": "<name>", "since": "2026-04-14", "classification": "stale" }
  ]
}
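To show how `failureHistory` might be carried from one run to the next, here is a small sketch. The `updateFailureHistory` function and its rule (keep the original `since` date for tests that are still failing, drop tests that now pass) are assumptions for illustration; the state file format itself is the one shown above.

```typescript
// Mirrors one entry of "failureHistory" in the state file.
interface FailureEntry {
  test: string;
  since: string; // date the failure was first seen, YYYY-MM-DD
  classification: string;
}

// Hypothetical merge rule: a test that was already failing keeps its
// original `since` date; a newly failing test starts today; a test that no
// longer fails is dropped from the history.
function updateFailureHistory(
  previous: FailureEntry[],
  failing: { test: string; classification: string }[],
  today: string,
): FailureEntry[] {
  const known = new Map(previous.map((e) => [e.test, e] as const));
  return failing.map((f) => ({
    test: f.test,
    since: known.get(f.test)?.since ?? today,
    classification: f.classification,
  }));
}
```

Keeping the first-seen date is what lets a later report say how long a test has been red. The state file is what carries that history between runs.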
This enables:
manifest.json without adding to RESOLVER.md
This skill guarantees:
writes_to: (when applicable).
quality.md, brain-first.md, _brain-filing-rules.md) are followed.
The full behavior contract is documented in the body sections above; this section exists for the conformance test.
The skill's output shape is documented inline in the body sections above (see "Output", "Brain page format", or equivalent). The literal section header here exists for the conformance test (test/skills-conformance.test.ts).