skills/skill-tester/SKILL.md
[Hyper] Test Codex/agent skills for intended triggering and behavior with realistic positive, negative, boundary, and edge-case scenarios. Use when validating a skill folder, SKILL.md, rules/references/scripts/assets, trigger precision, workflow correctness, or regression coverage before shipping skill changes.
npx skillsauth add alpoxdev/hypercore skill-tester
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
@rules/test-matrix.md @rules/scenario-design.md @rules/evidence-reporting.md @references/prompt-pack-template.md
Prove a skill works as intended before trusting it.
<output_language>
Default all user-facing deliverables, saved artifacts, reports, plans, generated docs, summaries, handoff notes, commit/message drafts, and validation notes to Korean, even when this canonical skill file is written in English.
Preserve source code identifiers, CLI commands, file paths, schema keys, JSON/YAML field names, API names, package names, proper nouns, and quoted source excerpts in their required or original language.
Use a different language only when the user explicitly requests it, an existing target artifact must stay in another language for consistency, or a machine-readable contract requires exact English tokens. If a localized template or reference exists (for example *.ko.md or *.ko.json), prefer it for user-facing artifacts.
</output_language>
<purpose><routing_rule>
Use skill-tester when the user wants to test, validate, QA, regression-test, or edge-case-test an existing skill or skill folder.
Use skill-maker when the main job is creating or structurally refactoring a skill.
Use autoresearch-skill when the main job is repeated measured optimization across experiments.
Use qa or project-specific QA skills when the target is an application feature rather than a skill.
Do not use skill-tester when:
</routing_rule>
<trigger_conditions>
Positive examples:
- … skills/git-maker/ and tell me whether it triggers correctly."
- … SKILL.md, rules, references, and scripts before I ship this skill."

Negative examples:
- … skill-maker.
- … autoresearch-skill.

Boundary example:
- … skill-tester if the emphasis is evidence and failures; switch to skill-maker only for structural edits after the test findings are clear.
</trigger_conditions>
<supported_targets>
- SKILL.md and optional localized variants such as SKILL.ko.md.
- rules/, references/, scripts/, and assets/.
</supported_targets>
<required_inputs>
Minimum input:
If either is missing, inspect local context first. Ask only when the target skill or intended behavior cannot be inferred safely.
Optional but useful:
</required_inputs>
<skill_architecture>
Load support files deliberately:
- … scripts/validate-skill.mjs for deterministic static checks when a filesystem skill folder is available.

Keep test evidence close to the target skill when the user asks for reusable artifacts; otherwise report findings inline.
</skill_architecture>
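One kind of deterministic static check is confirming that every support file a skill references actually exists. The sketch below is illustrative only and does not reproduce scripts/validate-skill.mjs, whose contents are not shown here. It assumes references use the `@rules/...` style seen at the top of this skill; `REF_PATTERN`, `find_missing_references`, and `demo` are hypothetical names.

```python
import re
import tempfile
from pathlib import Path

# Assumption: support files are referenced as "@<folder>/<path>", e.g. "@rules/test-matrix.md".
REF_PATTERN = re.compile(r"@((?:rules|references|scripts|assets)/[\w./-]+)")


def find_missing_references(skill_dir: Path) -> list[str]:
    """Return referenced support files that do not exist under skill_dir."""
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")
    # Strip trailing sentence punctuation that the pattern may have swallowed.
    referenced = sorted({ref.rstrip(".,") for ref in REF_PATTERN.findall(text)})
    return [ref for ref in referenced if not (skill_dir / ref).is_file()]


def demo() -> list[str]:
    """Build a throwaway skill folder with one valid and one dangling reference."""
    with tempfile.TemporaryDirectory() as tmp:
        skill_dir = Path(tmp)
        (skill_dir / "rules").mkdir()
        (skill_dir / "rules" / "test-matrix.md").write_text("# matrix\n", encoding="utf-8")
        (skill_dir / "SKILL.md").write_text(
            "@rules/test-matrix.md @references/prompt-pack-template.md\n",
            encoding="utf-8",
        )
        return find_missing_references(skill_dir)


if __name__ == "__main__":
    # Each entry would be reported as a resource-drift finding.
    print(demo())
```

A missing file found this way maps naturally onto the resource-drift category in the failure taxonomy.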
<workflow>
| Phase | Task | Output |
|------|------|------|
| 0 | Identify target skill, intended behavior, and neighboring skills that might conflict | Test scope |
| 1 | Read SKILL.md and directly linked support files needed for the test | Baseline behavior map |
| 2 | Build a scenario matrix covering positive, negative, boundary, edge, and regression cases | Test matrix |
| 3 | Run static anatomy checks and inspect support-file references | Static findings |
| 4 | Simulate skill routing and workflow execution for each scenario | Pass/fail table |
| 5 | Classify failures by trigger, scope, resource placement, workflow, validation, or safety | Ranked defects |
| 6 | Recommend minimal fixes or hand off to skill-maker/autoresearch-skill when edits are needed | Evidence-backed report |
<test_requirements>
Every meaningful skill test should include at least:
For localized skills, include at least one scenario in each supported language when trigger behavior depends on language. In this repository, include at least one Korean positive or boundary request when testing skills that ship SKILL.ko.md.
</test_requirements>
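The scenario matrix from phase 2 and the localized-scenario requirement above could be checked mechanically. This is a minimal sketch, not part of the skill itself: the `Scenario` fields, the `coverage_gaps` helper, and the sample prompts are hypothetical, and the set of required types is taken from the phase 2 row of the workflow table.

```python
from dataclasses import dataclass

# Scenario types named in workflow phase 2.
REQUIRED_TYPES = {"positive", "negative", "boundary", "edge", "regression"}


@dataclass(frozen=True)
class Scenario:
    id: str
    type: str          # one of REQUIRED_TYPES
    prompt: str        # hypothetical test prompt
    expected: str      # hypothetical: the skill expected to own the request
    language: str = "en"


def coverage_gaps(scenarios: list[Scenario], ships_korean: bool) -> list[str]:
    """List scenario types (and the Korean case) that the matrix does not cover."""
    present = {s.type for s in scenarios}
    gaps = [f"missing {t} scenario" for t in sorted(REQUIRED_TYPES - present)]
    # Skills that ship SKILL.ko.md need a Korean positive or boundary request.
    has_korean = any(
        s.language == "ko" and s.type in {"positive", "boundary"} for s in scenarios
    )
    if ships_korean and not has_korean:
        gaps.append("missing Korean positive or boundary scenario")
    return gaps


if __name__ == "__main__":
    matrix = [
        Scenario("P1", "positive", "Validate this skill folder before I ship it.", "skill-tester"),
        Scenario("N1", "negative", "Create a brand-new skill for me.", "skill-maker"),
    ]
    for gap in coverage_gaps(matrix, ships_korean=True):
        print(gap)
```

Running the gap check before phase 4 keeps an incomplete matrix from being reported as a passing test.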
<failure_taxonomy>
Classify each issue as one of:
- trigger-miss: target request may not activate the skill.
- trigger-overreach: unrelated request may activate the skill.
- scope-conflict: neighboring skill or workflow owns the request better.
- workflow-gap: instructions do not tell the agent what to do next.
- resource-drift: linked files are missing, stale, duplicated, or misplaced.
- validation-gap: completion can be claimed without evidence.
- edge-case-gap: missing handling for realistic boundary conditions.
- safety-gap: instructions allow risky or irreversible behavior without checks.
</failure_taxonomy>
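The eight categories above can be modeled as a closed set so that a finding cannot carry an unknown label. The following is a sketch under assumptions: the severity levels and their ordering are not defined by this skill, and `Finding`, `rank_findings`, and `format_finding` are hypothetical helpers that mirror the `[severity] [taxonomy]` shape used in the report format.

```python
from dataclasses import dataclass
from enum import Enum


class Taxonomy(Enum):
    TRIGGER_MISS = "trigger-miss"
    TRIGGER_OVERREACH = "trigger-overreach"
    SCOPE_CONFLICT = "scope-conflict"
    WORKFLOW_GAP = "workflow-gap"
    RESOURCE_DRIFT = "resource-drift"
    VALIDATION_GAP = "validation-gap"
    EDGE_CASE_GAP = "edge-case-gap"
    SAFETY_GAP = "safety-gap"


# Assumption: three severity levels, most severe first.
SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass(frozen=True)
class Finding:
    severity: str
    taxonomy: Taxonomy
    evidence: str


def rank_findings(findings: list[Finding]) -> list[Finding]:
    """Order findings by severity; ties keep their original order."""
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])


def format_finding(index: int, finding: Finding) -> str:
    """Render one numbered line for the Findings section of the report."""
    return f"{index}. [{finding.severity}] [{finding.taxonomy.value}] {finding.evidence}"


if __name__ == "__main__":
    findings = [
        Finding("low", Taxonomy.RESOURCE_DRIFT, "references/ file is linked but missing."),
        Finding("high", Taxonomy.TRIGGER_OVERREACH, "Skill creation request activates the tester."),
    ]
    for i, finding in enumerate(rank_findings(findings), start=1):
        print(format_finding(i, finding))
```

Looking up a label with `Taxonomy("some-label")` raises `ValueError` for anything outside the list, which catches typos in hand-written findings.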
<output_contract>
Default report format:
## Skill Test Report
**Target**: `skills/example/`
**Intended behavior**: ...
**Verdict**: pass | pass-with-risks | fail
### Scenario results
| ID | Type | Prompt / condition | Expected | Observed | Result |
|----|------|--------------------|----------|----------|--------|
### Findings
1. [severity] [taxonomy] Evidence-backed issue and affected file/section.
### Edge cases covered
- ...
### Recommended fixes
- Minimal next edit or handoff target.
### Validation evidence
- Commands run, files read, and checks completed.
If the user asks for reusable tests, also create a prompt pack or checklist under the target skill's references/ or a task-specific .hypercore/ workspace.
</output_contract>
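The scenario results table and the verdict line in the default report could be produced from structured results. This is a hedged sketch: the skill does not state how the verdict is derived, so the rule in `derive_verdict` (any failed scenario means fail; open findings without failures mean pass-with-risks) is an assumption, and both function names and the row keys are hypothetical.

```python
def derive_verdict(results: list[str], open_findings: int) -> str:
    """Assumed rule: any failed scenario fails the test; findings alone add risk."""
    if any(result == "fail" for result in results):
        return "fail"
    return "pass-with-risks" if open_findings else "pass"


def render_scenario_table(rows: list[dict[str, str]]) -> str:
    """Render rows using the column layout from the default report format."""
    columns = ["id", "type", "prompt", "expected", "observed", "result"]
    lines = [
        "| ID | Type | Prompt / condition | Expected | Observed | Result |",
        "|----|------|--------------------|----------|----------|--------|",
    ]
    for row in rows:
        # Escape pipes so a prompt cannot break the table layout.
        cells = [row[column].replace("|", "\\|") for column in columns]
        lines.append("| " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = [
        {
            "id": "P1",
            "type": "positive",
            "prompt": "Validate this skill folder before I ship it.",
            "expected": "skill-tester",
            "observed": "skill-tester",
            "result": "pass",
        },
    ]
    print(f"**Verdict**: {derive_verdict([r['result'] for r in rows], open_findings=1)}")
    print(render_scenario_table(rows))
```

Keeping the verdict a pure function of the results makes it harder to claim a pass without the evidence rows behind it.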
<validation_checklist>
Before declaring a skill tested, confirm:
</validation_checklist>