.stencila/skills/software-test-execution/SKILL.md
Run scoped tests for a TDD slice, determine the appropriate test framework and command, and report structured pass/fail results. Use when tests need to be executed after writing, implementing, or refactoring code. Reads test metadata, discovers the test framework if needed, executes the scoped test command, parses output into structured results, and reports whether tests passed or failed. Handles compilation errors, missing dependencies, timeouts, and works with any language and test framework.
Execute scoped tests for a TDD slice and report structured results. This skill runs only the tests relevant to the current slice, parses the output, and reports a clear pass/fail result with details.
This skill does not write or modify any code or test files. It only reads, executes, and reports.
This skill requires the following information to operate:
| Input           | Required | Description                                                        |
|-----------------|----------|--------------------------------------------------------------------|
| Test command    | No       | Command to run the scoped tests (discovered if not provided)       |
| Test files      | No       | List of test file paths (used to scope the command if needed)      |
| Slice scope     | No       | Description of what the slice covers (for context in the report)   |
| Target packages | No       | Packages, crates, or directories involved (for command discovery)  |
| Slice name      | No       | Name or identifier of the current slice (for the report header)    |
When used standalone, these inputs come from the user or the agent's prompt. When used within a workflow, the workflow's stage prompt will specify how to obtain them.
After completing its work, this skill reports:
| Output | Description                                                  |
|--------|--------------------------------------------------------------|
| Result | Whether the tests passed or failed (Pass / Fail)             |
| Report | A structured report with counts, failure details, and notes  |
Ensure the available inputs are collected:
If a test command is provided and non-empty, use it. Otherwise, discover the command:
Use `glob` to search for build files in the target directories:

- `Cargo.toml`, `go.mod`, `package.json`, `pyproject.toml`, `setup.py`, `pom.xml`, `build.gradle*`, `Gemfile`, `mix.exs`, `Makefile`
- Check `Makefile` for `test:` targets
- Check `package.json` `scripts.test` for JS/TS projects
- Check CI configuration (`.github/workflows/*.yml`) for test commands
- Use `references/framework-detection.md` for the mapping from build file to test command
- Use `references/framework-detection.md` for scoping patterns per framework

Run the command with `shell` with an appropriate timeout (default 120 seconds; increase to 300 seconds for compiled languages like Rust, Go, or C++ where compilation may be slow).

- If the output shows a compilation error (e.g., `error[E...]` in Rust, `SyntaxError` in Python, `Cannot find module` in JS), report this as a failure
- If the output shows a missing dependency (e.g., `ModuleNotFoundError: No module named 'pytest'`, `error: no matching package named`), report this as a failure

Analyze the combined stdout/stderr output to extract the counts of passed, failed, and skipped tests, the names of failing tests with their failure messages, and the duration if available.
Use references/output-parsing.md for framework-specific parsing patterns. If the output format is unfamiliar, extract what you can and include the raw output in the report.
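As a rough illustration of the parsing step, the sketch below pulls counts out of the two summary formats that appear in this document's examples: the `cargo test` result line and the pytest summary line. The function name `parse_summary` and the regular expressions are assumptions made for illustration; they are not taken from `references/output-parsing.md`.

```python
import re
from typing import Optional

# Illustrative patterns that cover only the two summary formats shown in
# this document's examples; real output may need more cases.
CARGO_SUMMARY = re.compile(
    r"test result: \w+\. (\d+) passed; (\d+) failed; (\d+) ignored"
)
PYTEST_COUNT = re.compile(r"(\d+) (passed|failed|skipped)")


def parse_summary(output: str) -> Optional[dict]:
    """Return passed/failed/skipped counts, or None if no summary is found."""
    cargo_matches = CARGO_SUMMARY.findall(output)
    if cargo_matches:
        # cargo prints one result line per test binary, so sum them.
        return {
            "passed": sum(int(m[0]) for m in cargo_matches),
            "failed": sum(int(m[1]) for m in cargo_matches),
            "skipped": sum(int(m[2]) for m in cargo_matches),
        }
    # pytest: inspect only the final summary line, e.g.
    # "====== 0 passed, 2 failed in 0.12s ======"
    for line in reversed(output.splitlines()):
        if line.startswith("=") and " in " in line:
            counts = {"passed": 0, "failed": 0, "skipped": 0}
            for number, label in PYTEST_COUNT.findall(line):
                counts[label] = int(number)
            return counts
    return None
```

A `None` return is the cue to fall back to including the raw output in the report, as described above.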
The result is Pass if and only if:
The result is Fail if:
When the exit code and output disagree (e.g., exit code 0 but failures in output), trust the output — some frameworks have bugs or configurations that mask failures in exit codes.
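The tie-break rule above can be sketched as a small function. This is a minimal sketch, not the skill's actual logic: `decide_result` is a hypothetical name, and treating a non-zero exit code with no parsed failures as Fail is an assumption (it matches the compilation-error case, where no tests run at all).

```python
def decide_result(exit_code: int, failed_count: int) -> str:
    """Return "Pass" or "Fail" for one test run."""
    # Failures found in the output win, even if the exit code is 0.
    if failed_count > 0:
        return "Fail"
    # Assumption: a non-zero exit code with no parsed failures (for example,
    # a compilation error that stopped the run) also counts as Fail.
    if exit_code != 0:
        return "Fail"
    return "Pass"
```

For instance, `decide_result(0, 2)` yields `"Fail"` because the parsed output reports failures that the exit code hides.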
Output a clear structured report:
## Test Results: [PASS | FAIL]
**Slice**: <slice name>
**Command**: `<the command that was run>`
**Exit code**: <code>
**Duration**: <time if available>
### Summary
- Passed: N
- Failed: N
- Skipped: N
### Failed Tests
1. `<test_name>` — <brief failure reason>
```
<assertion message or error detail>
```
### Passed Tests
1. `<test_name>`
### Notes
- <any observations: warnings, slow tests, suspicious patterns>
If no tests failed, omit the "Failed Tests" section. If the list of passing tests is very long (>20), summarize rather than listing each one.
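One way to assemble this report programmatically is sketched below. The function name `render_report` and its parameters are assumptions for illustration. The sketch leaves out the Duration and Notes fields, and it separates a failing test's name from its reason with a plain hyphen.

```python
from typing import List, Tuple

MAX_LISTED_PASSES = 20  # above this, passing tests are summarized


def render_report(
    slice_name: str,
    command: str,
    exit_code: int,
    passed: List[str],
    failed: List[Tuple[str, str]],
    skipped: int = 0,
) -> str:
    """Render the Markdown report; `failed` holds (test name, reason) pairs."""
    status = "FAIL" if failed or exit_code != 0 else "PASS"
    lines = [
        f"## Test Results: {status}",
        f"**Slice**: {slice_name}",
        f"**Command**: `{command}`",
        f"**Exit code**: {exit_code}",
        "### Summary",
        f"- Passed: {len(passed)}",
        f"- Failed: {len(failed)}",
        f"- Skipped: {skipped}",
    ]
    if failed:  # the section is omitted when nothing failed
        lines.append("### Failed Tests")
        for index, (name, reason) in enumerate(failed, start=1):
            lines.append(f"{index}. `{name}` - {reason}")
    if passed:
        lines.append("### Passed Tests")
        if len(passed) > MAX_LISTED_PASSES:
            # Long lists are summarized rather than enumerated.
            lines.append(f"{len(passed)} tests passed (list omitted for brevity).")
        else:
            for index, name in enumerate(passed, start=1):
                lines.append(f"{index}. `{name}`")
    return "\n".join(lines)
```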
Test command: cargo test -p my-auth, scope: "token validation"
$ cargo test -p my-auth
Compiling my-auth v0.1.0
Finished test target(s)
Running unittests src/lib.rs
running 4 tests
test token::tests::test_valid_token_parses ... ok
test token::tests::test_expired_token_rejected ... ok
test token::tests::test_malformed_token_returns_error ... ok
test token::tests::test_missing_roles_uses_empty_vec ... ok
test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
Report: PASS, 4 passed, 0 failed. Result: Pass.
Test command: pytest tests/test_parser.py, scope: "CSV parser error handling"
$ pytest tests/test_parser.py
FAILED tests/test_parser.py::test_empty_file_raises_error - ImportError: cannot import name 'parse_csv'
FAILED tests/test_parser.py::test_malformed_row_skipped - ImportError: cannot import name 'parse_csv'
====== 0 passed, 2 failed in 0.12s ======
Report: FAIL, 0 passed, 2 failed. Failures are import errors because the implementation does not exist yet — expected in Red phase. Result: Fail.
Test command: absent. Target packages: "frontend". Test files: `frontend/src/components/__tests__/Button.test.tsx`
Discovery:
- `glob` finds `frontend/package.json`
- `package.json` → `"scripts": { "test": "vitest run" }`
- Scoped command: `cd frontend && npx vitest run src/components/__tests__/Button.test.tsx`

Execute, parse, report as normal.
Test command: cargo test -p my-parser
$ cargo test -p my-parser
error[E0433]: failed to resolve: could not find `Parser` in `parser`
--> src/lib.rs:15:22
|
15 | use crate::parser::Parser;
| ^^^^^^^ not found in `parser`
Report: FAIL, 0 passed, 0 failed (compilation error prevented tests from running). Include the full compiler error. Result: Fail.
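Detecting this kind of failure before any test counts are parsed might look like the sketch below. The marker strings are the ones quoted in this document; the function name `classify_failure`, the two category labels, and the grouping of markers into them are assumptions for illustration.

```python
from typing import Optional

# Marker strings quoted in this document; the grouping is an assumption.
COMPILE_MARKERS = ("error[E", "SyntaxError", "Cannot find module")
DEPENDENCY_MARKERS = (
    "ModuleNotFoundError: No module named",
    "no matching package named",
    "command not found",
)


def classify_failure(output: str) -> Optional[str]:
    """Return a failure category if the output shows tests never ran."""
    # Dependency problems are checked first so they are reported as such.
    if any(marker in output for marker in DEPENDENCY_MARKERS):
        return "missing dependency"
    if any(marker in output for marker in COMPILE_MARKERS):
        return "compilation error"
    return None
```

Applied to the compiler output in this example, the function returns `"compilation error"`, which the report can state alongside the full error text.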
- If no build file is found, look for a `Makefile` with a test target or a CI config with a test step. If nothing is found, report the failure clearly ("Could not determine how to run tests for this project") and report Fail.
- If `shell` execution fails because the test runner binary is not installed (e.g., `command not found: pytest`), report Fail with the specific error so the upstream agent can install the dependency.
- Some projects run tests from the workspace root (e.g., `cargo test -p <crate>` from the repo root) while others require `cd <dir> && <command>`. Check whether the test command includes a directory change or package selector, and if it fails with a "not found" error, try running from the workspace root with a package flag.
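For the execution step itself, a minimal sketch using Python's standard `subprocess` module is shown below. The function name `run_scoped_tests` is hypothetical, the 120-second default mirrors the timeout mentioned earlier, and returning `-1` on timeout is an assumption rather than part of the skill.

```python
import subprocess
from typing import Tuple


def run_scoped_tests(command: str, timeout_seconds: int = 120) -> Tuple[int, str]:
    """Run command in a shell; return (exit code, combined stdout/stderr)."""
    try:
        completed = subprocess.run(
            command,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # Assumption: a timeout is signalled with exit code -1 and reported as Fail.
        return -1, f"Timed out after {timeout_seconds} seconds"
    return completed.returncode, completed.stdout
```

A missing test runner does not raise here: the shell itself reports the "command not found" message and a non-zero exit code, so the specific error ends up in the returned output.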