skills/exploratory-testing/SKILL.md
Use when a feature feels under-tested, after implementing new functionality, or before a release to discover edge cases, UX issues, and bugs through hands-on CLI exploration
Exploratory testing discovers what automated tests miss. The agent
acts as a curious, methodical user - running yx commands in a
sandbox, observing output, trying edge cases, and logging findings.
Each session is guided by a charter that focuses exploration on a
specific area with specific heuristics.
Don't use when:
The user provides (or the agent suggests) a target area to explore. Good targets are specific commands, workflows, or quality attributes:
- "`yx add` command"
- "`--under` flag"
- "`--format` options"

Pick 2-4 heuristics from the menu below that suit the target:
| Heuristic | CLI Application |
|-----------|----------------|
| CRUD | Add, list, show, update, remove yaks through full lifecycle |
| Zero, One, Many | Empty state, single yak, many yaks, deep nesting |
| Boundary Values | Long names, special chars, spaces, empty strings, unicode |
| Never and Always | Invariants (done yaks always show done, removed yaks never listed) |
| Follow the Data | Add -> list -> modify -> list -> verify consistency |
| Some, None, All | Filters with matching/non-matching/all items |
| Starve | Missing .yaks directory, non-existent directories, no permissions |
| Interrupt | Broken pipes (yx ls \| head -1), partial stdin, Ctrl-C |
| Configuration Tour | --format options, --only filters, env vars |
| Claims Tour | Does --help text match actual behaviour? |
| Sequence Variation | Unusual command orders (done before add, prune with no done) |
| User Tour | Common real-world workflows end-to-end |
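
As an illustration of the Boundary Values row, a handful of candidate inputs can be prepared before probing. This is only a sketch: the specific strings are arbitrary examples, and the way `yx add` treats a quoted argument is an assumption worth probing in its own right. The loop prints each input rather than calling `yx`, so it runs anywhere.

```shell
# Candidate inputs for the Boundary Values heuristic (illustrative, not exhaustive).
empty=''
spaces='   '
special='semi;colon & "quotes" $dollar'
unicode='café ヤク'
long_name=$(printf '%0300d' 0 | tr 0 a)   # 300 'a' characters

for name in "$empty" "$spaces" "$special" "$unicode" "$long_name"; do
  # In a real session this line might become (assumed invocation):
  #   cd <sandbox-path> && YX_SKIP_GIT_CHECKS=1 yx add "$name"
  printf 'would probe with: [%s]\n' "$name"
done
```

Each input then becomes one probe in the session log, with its own expected and actual result.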
Format the charter using Elisabeth Hendrickson's template:
Explore [target area] With [selected heuristics] To Discover [risks or information we seek]
Example:
Explore the `yx add` command With Boundary Values, Zero/One/Many, Claims Tour To Discover how it handles edge-case input and whether `--help` accurately describes its behaviour
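
The template has three slots, so a charter can be filled in mechanically. The sketch below uses a hypothetical helper, `render_charter`, which is not part of `yx` or of this skill; it only shows how the target, heuristics, and goal fit the template.

```shell
# render_charter is a hypothetical helper for illustration only.
# $1 = target area, $2 = selected heuristics, $3 = risks or information sought
render_charter() {
  printf 'Explore %s\nWith %s\nTo Discover %s\n' "$1" "$2" "$3"
}

render_charter "the yx add command" \
  "Boundary Values, Zero/One/Many, Claims Tour" \
  "how it handles edge-case input and whether --help accurately describes its behaviour"
```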
Present the charter to the user. Do not proceed until the user confirms the charter. The user may adjust the target, heuristics, or risk focus.
Create an isolated environment so exploration never touches the
project's real .yaks/ data.
Follow the yx-sandbox skill to set up
and use a temp directory. Use the sandbox prefix on every yx
command during exploration.
Before exploring, create enough data to work with:
# Using the literal sandbox path from mktemp output:
cd /tmp/tmp.xYz123AbC
YX_SKIP_GIT_CHECKS=1 yx add make the tea
YX_SKIP_GIT_CHECKS=1 yx add buy biscuits
YX_SKIP_GIT_CHECKS=1 yx add wash the cups --under make the tea
YX_SKIP_GIT_CHECKS=1 yx state wash the cups wip
YX_SKIP_GIT_CHECKS=1 yx done buy biscuits
Adapt seeding to the charter - if exploring hierarchy, create deeper nesting; if exploring empty state, skip seeding entirely at first.
Work through each chartered heuristic. For each one:
Keep a running session log in this format:
### [Heuristic Name]
**Probe:** [what you're trying]
**Command:** `cd <sandbox-path> && YX_SKIP_GIT_CHECKS=1 yx ...`
**Expected:** [what you thought would happen]
**Actual:** [what actually happened]
**Verdict:** OK | BUG | UX-ISSUE | INCONSISTENCY | UNEXPECTED | QUESTION
**Notes:** [any additional observations]
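
One way to keep entries consistent is to let a small wrapper run the probe and print the entry skeleton. The following is a minimal sketch, not part of `yx`: `log_probe` is a hypothetical helper, and the demo uses a stand-in `sh -c` command so it runs without a sandbox. Verdict and Notes are left blank for the explorer to fill in.

```shell
# log_probe is a hypothetical helper for illustration; it is not part of yx.
# Usage: log_probe HEURISTIC PROBE EXPECTED COMMAND [ARGS...]
log_probe() {
  heuristic=$1 probe=$2 expected=$3
  shift 3
  # Run the command, capture stdout and stderr together, and keep its exit code.
  if actual=$("$@" 2>&1); then rc=0; else rc=$?; fi
  printf '### %s\n' "$heuristic"
  printf '**Probe:** %s\n' "$probe"
  # "$*" joins the arguments with spaces, so the original quoting is not shown.
  printf '**Command:** `%s`\n' "$*"
  printf '**Expected:** %s\n' "$expected"
  printf '**Actual:** %s (exit %s)\n' "$actual" "$rc"
  printf '**Verdict:** \n**Notes:** \n'
}

# Stand-in command; in a session this would wrap something like
#   sh -c 'cd <sandbox-path> && YX_SKIP_GIT_CHECKS=1 yx ...'
log_probe "Boundary Values" "empty name" "error and non-zero exit" \
  sh -c 'echo "no name given" >&2; exit 2'
```

Recording the exit code alongside the output makes it harder to miss a command that prints an error but still exits zero.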
- `echo $?` after commands that should fail - do they return non-zero?
- `--format plain` for scripting.

After exploration, present a structured report to the user.
One-line summary: how many probes, how many findings, overall impression.
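
The probe and finding counts for the summary can be tallied from the session log. The sketch below assumes the log is saved to a file and that each entry's verdict line holds exactly one verdict with no trailing whitespace; `tally_verdicts` and the sample entries are made up for illustration.

```shell
# tally_verdicts is a hypothetical helper; it counts each verdict in a log file.
tally_verdicts() {
  for verdict in OK BUG UX-ISSUE INCONSISTENCY UNEXPECTED QUESTION; do
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    count=$(grep -c -F -x "**Verdict:** $verdict" "$1" || true)
    printf '%s: %s\n' "$verdict" "$count"
  done
}

# Made-up sample log, only to show the helper running.
log=$(mktemp)
printf '%s\n' \
  '### Boundary Values' '**Verdict:** BUG' \
  '### Claims Tour' '**Verdict:** OK' \
  '### CRUD' '**Verdict:** OK' > "$log"
tally_verdicts "$log"
rm -f "$log"
```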
Group findings into these categories (skip empty ones):
| Category | Description |
|----------|-------------|
| Bugs | Incorrect behaviour, crashes, wrong exit codes |
| UX Issues | Confusing output, unclear errors, surprising defaults |
| Inconsistencies | Behaviour differs between similar commands |
| Missing Error Handling | No error where one is expected, unhelpful messages |
| Unexpected Behaviour | Works but not how a user would expect |
| Worked Well | Things that behaved exactly right, good UX moments |
For each finding, include:
Concrete next steps:
- `--help` is misleading

Remove the sandbox as described in the yx-sandbox skill.
| Phase | What Happens | Gate |
|-------|-------------|------|
| Charter | Agree target, select heuristics, write charter | User approves charter |
| Exploration | Sandbox setup, systematic probing, session log | None - explore autonomously |
| Report | Categorised findings, follow-ups, cleanup | User reviews findings |
| Mistake | Fix |
|---------|-----|
| Exploring without a charter | Always agree on target and heuristics first |
| Only testing happy paths | Heuristics exist to push beyond the obvious |
| Logging only failures | Record successes too - they confirm expected behaviour |
| Exploring everything at once | Pick 2-4 heuristics per session, stay focused |
| Skipping exit code checks | `echo $?` after commands that should fail |
| Not seeding enough data | Adapt seed data to what the charter needs |
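
The exit code check in the table can be made routine with a small wrapper. This is a sketch under assumptions: `expect_failure` is a hypothetical helper, not something `yx` provides, and the demo passes stand-in commands where a session would pass the `yx` invocation under test.

```shell
# expect_failure is a hypothetical helper for illustration only.
# It runs a command and reports whether it exited non-zero, as a failing command should.
expect_failure() {
  if "$@" >/dev/null 2>&1; then rc=0; else rc=$?; fi
  if [ "$rc" -ne 0 ]; then
    echo "OK: exited $rc"
  else
    echo "BUG?: exited 0 but a failure was expected"
    return 1
  fi
}

# Stand-in commands so the sketch runs without yx.
expect_failure sh -c 'exit 2'
expect_failure true || echo "logged as a finding"
```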