skills/requirements/req-to-test/SKILL.md
Derives executable test cases directly from requirements (user stories, acceptance criteria, specs) by extracting testable conditions, enumerating equivalence classes and boundaries, and producing a traceability map from each test back to its source requirement. Use when building acceptance tests from a spec, when checking whether requirements are covered by existing tests, when translating Gherkin or plain-English criteria into code, or when proving coverage for compliance.
npx skillsauth add santosomar/general-secure-coding-agent-skills req-to-test
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Turn each requirement into the smallest set of tests that (a) fail if the requirement is violated and (b) pass otherwise. Every test traces back to exactly one requirement. Every requirement has at least one test, or is explicitly marked untestable with the reason.
Not all requirements produce tests the same way. Classify first:
| Type            | Recognition cues                                       | What it yields                                    |
| --------------- | ------------------------------------------------------ | ------------------------------------------------- |
| Functional      | "shall," "the system does X," "when Y then Z"          | Direct input/output test cases                    |
| Constraint      | "must not exceed," "within N ms," "at most," "never"   | Boundary and property tests                       |
| Conditional     | "if," "unless," "when … otherwise"                     | One test per branch, including the else           |
| State-based     | "after," "once … then," "until," references to phases  | Sequence tests with setup state                   |
| Non-functional  | "available," "secure," "scalable," "performant"        | Usually untestable at this level → split or defer |
If a requirement doesn't fit any row, it's probably ambiguous. Route it to ambiguity-detector before wasting effort deriving tests from mush.
Strip filler and normalize to trigger → condition(s) → expected outcome. If any of the three is missing, the requirement is incomplete and the gap is your first test.
Requirement: "When a user submits an order with a total exceeding their credit limit, the system shall reject the order and display an error message explaining why."
Kernel:
- Trigger: order submitted
- Condition: order.total > user.creditLimit
- Outcome: order.status == REJECTED
- Outcome: error message visible AND mentions credit limit

Two outcomes → two assertions. They can live in one test (same arrange/act), but both must be checked.
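A normalized requirement can be held in a small record so that a missing part shows up mechanically. The following is a minimal sketch under assumptions: the `Kernel` class, its field names, and `missing_parts` are hypothetical illustrations, not part of the skill itself.

```python
from dataclasses import dataclass, field


@dataclass
class Kernel:
    """Hypothetical container for a requirement normalized to
    trigger -> condition(s) -> expected outcome(s)."""
    trigger: str = ""
    conditions: list[str] = field(default_factory=list)
    outcomes: list[str] = field(default_factory=list)

    def missing_parts(self) -> list[str]:
        """Return the names of the kernel parts that are empty."""
        missing = []
        if not self.trigger.strip():
            missing.append("trigger")
        if not self.conditions:
            missing.append("condition")
        if not self.outcomes:
            missing.append("outcome")
        return missing


# The credit-limit requirement from the text, normalized.
credit_limit_kernel = Kernel(
    trigger="order submitted",
    conditions=["order.total > user.creditLimit"],
    outcomes=[
        "order.status == REJECTED",
        "error message visible AND mentions credit limit",
    ],
)

# One assertion is needed per outcome.
assertions_needed = len(credit_limit_kernel.outcomes)
```

An empty `missing_parts()` result means the requirement is complete enough to derive tests from; any name it returns is a gap to report before writing tests.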
The condition order.total > user.creditLimit splits the input space. Don't test every value — test one representative per class plus every boundary.
| Class | Representative | Expected |
| ------------------------ | ------------------------------- | ------------ |
| Well below limit | total = limit × 0.5 | Accepted |
| At limit (boundary) | total = limit | Check the spec: is "exceeding" > or >=? If ambiguous, both readings are tests until clarified. |
| Just over (boundary) | total = limit + smallest-unit | Rejected |
| Well over | total = limit × 2 | Rejected |
Boundaries are 0, 1, limit-ε, limit, limit+ε, max. For each, ask the spec. If the spec doesn't say — that's a finding, not a guess.
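The class-and-boundary enumeration for the credit-limit condition might be generated like this. It is a sketch only: `credit_limit_cases` and `decide` are hypothetical names, `decide` is a stand-in for the real system under test, and the strict reading of "exceeding" is an assumption that the spec still has to confirm. The 0 and max boundaries are left out because the requirement says nothing about them.

```python
from decimal import Decimal


def credit_limit_cases(limit, unit=Decimal("0.01"), exceeding_is_strict=True):
    """Return (label, total, expected) rows: one per class or boundary.

    exceeding_is_strict=True assumes "exceeding" means total > limit;
    pass False to generate the total >= limit reading instead.
    """
    at_limit = "ACCEPTED" if exceeding_is_strict else "REJECTED"
    return [
        ("well_below", limit * Decimal("0.5"), "ACCEPTED"),
        ("just_below", limit - unit, "ACCEPTED"),
        ("at_limit", limit, at_limit),
        ("just_over", limit + unit, "REJECTED"),
        ("well_over", limit * 2, "REJECTED"),
    ]


def decide(total, limit):
    """Stand-in for the system under test (strict reading assumed)."""
    return "REJECTED" if total > limit else "ACCEPTED"


limit = Decimal("1000.00")
cases = credit_limit_cases(limit)

# Collect every case where the stand-in disagrees with the expectation.
failures = [
    (label, total)
    for label, total, expected in cases
    if decide(total, limit) != expected
]
```

Using `Decimal` keeps `limit + unit` exact, which matters when the whole point of the test is a one-cent boundary.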
Every positive test has a shadow. Enumerate:
- … MAX_INT. Credit limit = 0.
- … > limit is specified, is acceptance at <= limit specified? Often one direction is implicit; make both explicit.

These aren't padding. A requirement that only specifies the happy path is a half-requirement, and the negative tests are how you prove that.
| Clue in the requirement             | Test level       | Framework style                              |
| ----------------------------------- | ---------------- | -------------------------------------------- |
| "the function returns…"             | Unit             | pytest, JUnit, jest                          |
| "the API responds with…"            | Integration      | pytest + httpx, supertest, RestAssured       |
| "the user sees…", "the page shows…" | Acceptance / E2E | Gherkin → Playwright/Cypress                 |
| "under N concurrent users…"         | Load             | k6, Locust (note it, don't generate inline)  |
Output template (per test):

    TEST: rejects_order_exceeding_credit_limit
      Requirement: REQ-042 (clause 2: "shall reject the order")
      Level: Integration
      Classes covered: just-over-boundary
      ARRANGE  user.creditLimit = 1000.00
               order.total = 1000.01
      ACT      POST /orders
      ASSERT   response.status == 422
               response.body.error.code == 'CREDIT_LIMIT_EXCEEDED'

    TEST: shows_credit_limit_error_message
      Requirement: REQ-042 (clause 3: "display an error message explaining why")
      Level: Acceptance
      ...

Follow with a traceability map:

    REQ-042 → rejects_order_exceeding_credit_limit (clause 2, just-over)
            → rejects_order_well_over_limit (clause 2, well-over)
            → accepts_order_at_limit (clause 2, boundary; assumes "exceeding" = strictly greater)
            → shows_credit_limit_error_message (clause 3)
    REQ-043 → [UNCOVERED: non-functional, "the system shall be responsive"]
              Action: split into measurable sub-requirements or defer to perf suite.

Explicitly list every uncovered requirement with a reason. Silence on coverage reads as "forgot," not "deliberately deferred."
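A traceability map with explicit uncovered entries could be assembled from test metadata along these lines. This is an illustrative sketch: the `traceability` function and its input shapes are assumptions, not a format the skill prescribes.

```python
def traceability(requirements, tests):
    """Group test names by requirement ID and report requirements with no test.

    requirements: dict of requirement ID -> reason to report if uncovered
    tests: list of (test name, requirement ID); one requirement per test
    """
    covered = {req_id: [] for req_id in requirements}
    for test_name, req_id in tests:
        if req_id not in covered:
            # A test that traces to nothing is as suspicious as an untested requirement.
            raise ValueError(f"{test_name} traces to unknown requirement {req_id}")
        covered[req_id].append(test_name)
    uncovered = {
        req_id: requirements[req_id]
        for req_id, names in covered.items()
        if not names
    }
    return covered, uncovered


requirements = {
    "REQ-042": "",
    "REQ-043": 'non-functional, "the system shall be responsive"',
}
tests = [
    ("rejects_order_exceeding_credit_limit", "REQ-042"),
    ("rejects_order_well_over_limit", "REQ-042"),
    ("accepts_order_at_limit", "REQ-042"),
    ("shows_credit_limit_error_message", "REQ-042"),
]

covered, uncovered = traceability(requirements, tests)
```

Here `uncovered` holds only REQ-043 together with its reason, so the gap is reported rather than silently dropped.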
Input:

    Scenario: Free shipping threshold
      Given a cart with subtotal $50 or more
      When the user proceeds to checkout
      Then shipping cost is $0

Analysis:
- Condition: subtotal >= 50. "Or more" is >=, not >. Boundary is 50 exactly.
- Outcome: shipping == 0

Derived:
| Test | Subtotal | Expected shipping | Covers |
| ------------------------------- | -------- | ----------------- | ------------------------- |
| free_shipping_at_threshold | 50.00 | 0.00 | Boundary, inclusive |
| free_shipping_above_threshold | 75.00 | 0.00 | Well-above class |
| paid_shipping_just_below | 49.99 | > 0.00 | Boundary-minus; exact value unspecified → assert non-zero only |
| shipping_at_zero_subtotal | 0.00 | ??? — GAP | Empty cart: is checkout even allowed? Flag to spec owner. |
Four tests, one gap surfaced. Done.
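Written as pytest-style functions, the three specified rows might look like the sketch below. `shipping_cost` is a hypothetical stand-in for the code under test, and `ASSUMED_FLAT_RATE` is invented purely so the example runs; the scenario gives no paid rate.

```python
from decimal import Decimal

FREE_SHIPPING_THRESHOLD = Decimal("50.00")
ASSUMED_FLAT_RATE = Decimal("5.99")  # hypothetical; not stated in the scenario


def shipping_cost(subtotal):
    """Stand-in implementation: free at or above the threshold."""
    if subtotal >= FREE_SHIPPING_THRESHOLD:
        return Decimal("0.00")
    return ASSUMED_FLAT_RATE


def test_free_shipping_at_threshold():
    # Boundary, inclusive: "or more" means >=.
    assert shipping_cost(Decimal("50.00")) == Decimal("0.00")


def test_free_shipping_above_threshold():
    # Well-above class.
    assert shipping_cost(Decimal("75.00")) == Decimal("0.00")


def test_paid_shipping_just_below():
    # The exact paid amount is unspecified, so assert non-zero only.
    assert shipping_cost(Decimal("49.99")) > Decimal("0.00")


# shipping_at_zero_subtotal is deliberately not implemented:
# the expected result is a spec gap, so it is reported instead of guessed.
```

The just-below test asserts only that shipping is greater than zero, so it would still pass if the real rate differs from the assumed one.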
- … UNTESTABLE: too broad and kick back to the author.
- … total = 50.00 and total = 50.01 are different classes when the operator could be > vs >=. Test both or prove the spec picks one.
- … ambiguity-detector.