skills/remy-testgen/SKILL.md
Generate persistent unit tests for existing or stub code. Supports post-hoc testing (default) and TDD mode (--tdd). Multi-angle agent analysis at medium/high effort levels.
npx skillsauth add till-crazy-tears-us-apart/claude-code-engineering-suite remy-testgen
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Generate persistent unit tests and write them into the project's test directory. Supports two modes:
- Post-hoc (default): Generate tests for existing code.
- TDD (`--tdd`): Generate failing test skeletons from interface signatures or plan packets, before implementation exists.

Path Convention: All paths below are relative to `~/.claude/`. Use `Read("~/.claude/skills/remy-testgen/...")` to access them.
| File | Purpose |
| :--- | :--- |
| skills/remy-testgen/frameworks.json | Test framework detection rules (Phase 2). User-extensible. |
| skills/remy-testgen/schemas/test_scenario.json | Output schema for test generation agents (Phase 3). |
| skills/remy-testgen/prompts/generate_behavioral.md | Agent A: Behavioral contract analysis prompt (medium+). |
| skills/remy-testgen/prompts/generate_boundary.md | Agent B: Boundary exploration prompt (medium+). |
| skills/remy-testgen/prompts/generate_property.md | Agent C: Property-based testing prompt (high only). |
| skills/remy-testgen/templates/test_python.py.j2 | Jinja2 template for Python test files. |
| skills/remy-testgen/templates/test_typescript.ts.j2 | Jinja2 template for TypeScript test files. |
| skills/remy-testgen/templates/test_go.go.j2 | Jinja2 template for Go test files. |
| skills/remy-testgen/templates/test_c.c.j2 | Jinja2 template for C test files (multi-framework: kunit/cmocka/Unity/criterion/plain). |
| skills/remy-testgen/templates/report.md.j2 | Jinja2 template for the coverage report. |
| skills/remy-testgen/render.py | Template rendering helper. Uses Jinja2 when available, falls back to built-in formatting. |
| skills/remy-testgen/output_schema.json | Final output schema (coverage report structure). |
render.py attempts import jinja2. If unavailable, all templates are rendered via built-in string formatting. Jinja2 can be installed via install.py (optional step).
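
The optional-dependency fallback described above could be sketched as follows. This is a minimal illustration, not the actual contents of `render.py`: the helper `render_text` and its paired `fallback` format string are hypothetical names introduced here.

```python
try:
    import jinja2  # optional dependency
except ImportError:
    jinja2 = None


def render_text(template: str, fallback: str, context: dict) -> str:
    """Render with Jinja2 when it is importable, else with str.format.

    Hypothetical helper: `template` uses Jinja2 syntax and `fallback` is an
    equivalent str.format string used only when Jinja2 is missing.
    """
    if jinja2 is not None:
        return jinja2.Template(template).render(**context)
    return fallback.format(**context)


header = render_text(
    "# Tests for {{ module_name }}",
    "# Tests for {module_name}",
    {"module_name": "policy"},
)
print(header)
```

Both code paths produce the same text for this input, which is the property a fallback renderer needs to preserve.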
| Environment Variable | Default | Description |
| :--- | :--- | :--- |
| TEST_GEN_EFFORT | medium | Fallback effort level when not specified as argument. |
| TEST_COVERAGE_THRESHOLD | 80 | Branch coverage percentage target. Shared with /remy-inspect. |
| TEST_COVERAGE_MAX_SUPPLEMENT_ROUNDS | 3 | Maximum coverage supplement iterations before stopping. |
/remy-testgen [effort] [--tdd [packet_file]] [target_files_or_functions...]
1. If the first argument is `low`, `medium`, or `high` (case-insensitive): use it as effort level, remaining args are parsed further.
2. Otherwise: use the `TEST_GEN_EFFORT` env var (default `medium`), all args are parsed further.
3. If `--tdd` is present: enable TDD mode. If followed by a filename matching `*.json`, treat it as a remy-plan packet file.
4. If no targets are given: use `git diff --name-only HEAD` (or `git diff --cached --name-only` if staged).

| Effort | Generation Strategy | Agents | Coverage Supplement |
| :--- | :--- | :--- | :--- |
| low | Heuristic: signatures + docstrings → happy/edge/error cases | 0 | Disabled |
| medium | 2 parallel agents: Behavioral Contract (A) + Boundary Exploration (B) | 2 | Enabled (ask user) |
| high | 3 parallel agents: A + B + Property-Based Testing (C) | 3 | Enabled (ask user) |
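
The argument rules above can be sketched in Python. This is an illustrative reading of the rules, not the skill's own parser: `parse_args` and the shape of its return value are assumptions, and the `git diff` fallback for an empty target list is left to the caller.

```python
import fnmatch
import os

EFFORT_LEVELS = ("low", "medium", "high")


def parse_args(args, env=None):
    """Split /remy-testgen arguments into effort, TDD flag, packet file, targets."""
    env = os.environ if env is None else env
    rest = list(args)

    # A leading effort keyword wins; otherwise fall back to the env var.
    if rest and rest[0].lower() in EFFORT_LEVELS:
        effort = rest.pop(0).lower()
    else:
        effort = env.get("TEST_GEN_EFFORT", "medium").lower()

    tdd, packet_file = False, None
    if "--tdd" in rest:
        i = rest.index("--tdd")
        rest.pop(i)
        tdd = True
        # An immediately following *.json argument is treated as the packet file.
        if i < len(rest) and fnmatch.fnmatch(rest[i], "*.json"):
            packet_file = rest.pop(i)

    # An empty target list means "use the git diff fallback" (not done here).
    return {"effort": effort, "tdd": tdd, "packet_file": packet_file, "targets": rest}
```

For example, `parse_args(["HIGH", "--tdd", "plan.json", "src/a.py"], env={})` yields effort `high`, TDD enabled, packet file `plan.json`, and one target.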
Goal: Determine what code needs test coverage.
1. If no targets were specified: use `git diff --name-only HEAD` (or `git diff --cached --name-only` if staged). If not a git repo, ask the user to specify targets via `AskUserQuestion`.
2. Build `target_set: [{file, symbol, type, signature}]`.

If `--tdd` with packet file:
1. Read the packet from `.claude/temp_task/{packet_file}`.
2. Extract `sender_payload.plan[]` and `evidence_packet.proposed_changes[]`.
3. Build `target_set` from the described interfaces.

If `--tdd` without packet file:
1. Scan the target files for stubs: function bodies consisting of `pass`, `...`, `raise NotImplementedError`, or `# TODO` comments.
2. Build `target_set` from stub signatures.

Caller/callee context:
1. Check whether the logic index exists: `Bash("test -f .claude/logic_index.json && echo EXISTS || echo MISSING")`.
2. If it exists, run impact analysis: `Bash("python \"~/.claude/skills/remy-index/impact.py\" <target_file_1> ...")`.
Output: Print the target set as a summary table.
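
For Python targets, the stub scan used in TDD mode without a packet file could be approximated with the standard `ast` module. This is a sketch under stated assumptions: `is_stub` and `find_stubs` are hypothetical names, a docstring-only body is treated as a stub, and `# TODO` comments are not detected because comments do not appear in the AST.

```python
import ast


def is_stub(func) -> bool:
    """True if the function body is only pass, ..., or raise NotImplementedError."""
    body = list(func.body)
    # Ignore a leading docstring.
    if (body and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)):
        body = body[1:]
    if not body:
        return True  # assumption: a docstring-only body counts as a stub
    if len(body) != 1:
        return False
    stmt = body[0]
    if isinstance(stmt, ast.Pass):
        return True
    if (isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant)
            and stmt.value.value is Ellipsis):
        return True
    if isinstance(stmt, ast.Raise) and stmt.exc is not None:
        exc = stmt.exc.func if isinstance(stmt.exc, ast.Call) else stmt.exc
        return isinstance(exc, ast.Name) and exc.id == "NotImplementedError"
    return False


def find_stubs(source: str, filename: str):
    """Return target_set-style entries for every stub function in `source`."""
    tree = ast.parse(source)
    return [
        {
            "file": filename,
            "symbol": node.name,
            "type": "function",
            "signature": f"{node.name}({ast.unparse(node.args)})",
        }
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and is_stub(node)
    ]
```

`ast.unparse` requires Python 3.9 or later.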
Goal: Identify the project's testing conventions.
Load detection rules from frameworks.json. Each entry defines:
- `indicators`: file existence or file-content checks, evaluated in priority order.
- `run_command` / `coverage_command`: command templates.
- `test_file_pattern` / `test_dir_patterns`: glob patterns for locating test files.

Execute indicator checks for each framework entry in priority order. Stop at the first match.
If no framework detected: use AskUserQuestion to ask the user which framework to target.
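
First-match detection can be sketched as below. The entry layout is an assumption made for illustration (each indicator is a dict with a `file` key and an optional `contains` substring); the real `frameworks.json` schema may differ, and list order is taken to be the priority order.

```python
from pathlib import Path


def indicator_matches(root: Path, indicator: dict) -> bool:
    """File-existence check, optionally followed by a file-content check."""
    path = root / indicator["file"]
    if not path.is_file():
        return False
    needle = indicator.get("contains")
    return needle is None or needle in path.read_text(errors="ignore")


def detect_framework(root: Path, frameworks: list):
    """Return the first framework entry with a matching indicator, else None."""
    for entry in frameworks:  # assumed to be sorted by priority
        if any(indicator_matches(root, ind) for ind in entry["indicators"]):
            return entry
    return None  # caller falls back to asking the user
```

A `None` result corresponds to the "no framework detected" branch, where the user is asked which framework to target.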
1. Glob for `test_dir_patterns` from the matched framework.
2. If a test directory is found: record it as `test_output_dir`.
3. If none is found: use `AskUserQuestion` to ask the user where to place test files. Options:
   - "Create a tests/ directory" (with framework-appropriate structure)
   - Place tests next to the source file (e.g., `src/foo.py` → `src/test_foo.py`)

For each symbol in `target_set`:
1. `Grep` for the symbol name in test directories.
2. `Grep` for import/require statements referencing the target file.
3. Build the mapping `{symbol → [test_file:test_function]}`.

Output: Print existing coverage mapping. Flag uncovered symbols.
Goal: Design test cases for uncovered symbols.
For each uncovered symbol in target_set:
- Build `test_plan: [{symbol, test_name, category, description, setup, assertion}]`.

Read the prompt templates from `~/.claude/skills/remy-testgen/prompts/`:
| Effort | Agents Launched (in parallel) |
| :--- | :--- |
| medium | generate_behavioral.md (Agent A) + generate_boundary.md (Agent B) |
| high | A + B + generate_property.md (Agent C) |
For each agent, construct the Agent call:
Agent({
description: "remy-testgen: [angle name]",
prompt: "[prompt template content]\n\n---\n\n## Provided Context\n\n### Source\n```\n{source}\n```\n\n### Signatures\n{signatures}\n\n### Existing Tests\n{existing_test_names}\n\n### Caller/Callee Context\n{impact_summary}\n\n### Mode\n{post-hoc|tdd}"
})
Launch all agents in parallel (single message, multiple Agent tool calls).
1. Validate each agent's output against the `test_scenario.json` schema. If parsing fails, discard that angle's results with a warning.
2. Deduplicate: for scenarios targeting the same `symbol` with >80% description overlap, keep the one with higher priority.
3. Merge the remaining scenarios into `test_plan`.

Output: Print the merged test plan as a table:

| # | Symbol | Test Name | Category | Priority | Source |
| :--- | :--- | :--- | :--- | :--- | :--- |
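
The deduplication rule (same symbol, more than 80% description overlap, keep the higher priority) could be sketched as follows. The source does not say how overlap is measured, so `difflib.SequenceMatcher` is an assumption here, as are the `high`/`medium`/`low` priority labels and the function name `merge_scenarios`.

```python
from difflib import SequenceMatcher

PRIORITY_RANK = {"high": 3, "medium": 2, "low": 1}  # assumed priority labels


def merge_scenarios(scenarios, threshold=0.8):
    """Merge agent scenarios, dropping near-duplicates per symbol."""
    merged = []
    for cand in scenarios:
        for i, kept in enumerate(merged):
            if kept["symbol"] != cand["symbol"]:
                continue
            overlap = SequenceMatcher(
                None, kept["description"].lower(), cand["description"].lower()
            ).ratio()
            if overlap > threshold:
                # Near-duplicate: keep whichever has the higher priority.
                if PRIORITY_RANK[cand["priority"]] > PRIORITY_RANK[kept["priority"]]:
                    merged[i] = cand
                break
        else:
            merged.append(cand)  # no near-duplicate found
    return merged
```

A token-based measure such as Jaccard similarity would be an equally plausible reading of "description overlap".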
Goal: Write persistent test files into the project.
Using test_output_dir from Phase 2:
- Python: `{test_output_dir}/test_{source_module}.py`
- TypeScript: `{test_output_dir}/{source_module}.test.ts`
- Go: `{source_dir}/{source_module}_test.go` (Go convention: same directory)
- C: `{test_output_dir}/test_{source_module}.c` (or `{source_dir}/{source_module}_test.c` for kunit)

Before writing each test file:
AskUserQuestion:
Use render.render_template() to generate test files. Populate the context dict:
Python / TypeScript / Go:
{
"module_name": "...",
"imports": ["import ...", ...],
"test_cases": [
{
"name": "function_name_scenario_expected",
"description": "...",
"body_lines": ["result = func(arg)", "assert result == expected"],
"is_async": False
}
]
}
C (additional keys):
{
"framework": "kunit|cmocka|unity|criterion|plain_c",
"module_name": "...",
"suite_name": "...",
"includes": ['"header.h"', "<system.h>"],
"test_cases": [
{
"name": "test_func_scenario_expected",
"description": "...",
"body_lines": ["KUNIT_EXPECT_EQ(test, 2, add(1, 1));"]
}
]
}
The framework value is determined by Phase 2 detection. suite_name is derived from module_name (e.g., my_module → my_module_test). includes replaces imports for C — each entry is a literal #include argument (with quotes or angle brackets).
If the target language has no matching template, generate tests directly via LLM (no template).
test_{function_name}_{scenario}_{expected_outcome}
Example: test_load_policy_empty_string_returns_default
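
A small helper can build names in this format from free-text parts. It is an illustrative sketch: `build_test_name` is a hypothetical name and the slug rules (lowercase, non-alphanumeric runs collapsed to underscores) are assumptions.

```python
import re


def build_test_name(function_name: str, scenario: str, expected: str) -> str:
    """Join the three parts as test_{function_name}_{scenario}_{expected_outcome}."""
    def slug(text: str) -> str:
        # Collapse anything that is not a letter or digit into one underscore.
        return re.sub(r"[^0-9A-Za-z]+", "_", text).strip("_").lower()

    return "test_" + "_".join(slug(p) for p in (function_name, scenario, expected))


name = build_test_name("load_policy", "empty string", "returns default")
print(name)
```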
Generated tests MUST satisfy:
In TDD mode, generated tests MUST:
- `# Expected to FAIL until implementation is complete.`

Output: Print the list of generated test files and their locations.
Goal: Execute generated tests to validate correctness.
- Run the tests with `run_command` from the detected framework (Phase 2).

`AskUserQuestion`:
/remy-patch."

`AskUserQuestion`:
Goal: Measure and report test coverage of the target symbols.
Trigger: Post-hoc mode only (TDD mode skips to Phase 7).
Read TEST_COVERAGE_THRESHOLD from environment (default: 80).
- Run `coverage_command` from `frameworks.json`.

Print a table:
| Symbol | Branches | Covered | Coverage | Status |
| :--- | :--- | :--- | :--- | :--- |
| load_policy | 6 | 5 | 83% | PASS |
| inject_all | 10 | 7 | 70% | FAIL |
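
The status column follows directly from the branch counts and the threshold. A minimal sketch, assuming the counts are already parsed into `(branches, covered)` pairs (the parsing itself depends on the coverage tool), with `coverage_rows` as a hypothetical name:

```python
import os


def coverage_rows(branch_counts: dict, threshold=None) -> list:
    """Turn {symbol: (branches, covered)} into report rows with PASS/FAIL."""
    if threshold is None:
        threshold = float(os.environ.get("TEST_COVERAGE_THRESHOLD", "80"))
    rows = []
    for symbol, (branches, covered) in branch_counts.items():
        # Assumption: a symbol with no branches counts as fully covered.
        raw = 100.0 * covered / branches if branches else 100.0
        rows.append({
            "symbol": symbol,
            "branches": branches,
            "covered": covered,
            "percent": round(raw),
            # Compare the unrounded value so 79.6% does not pass an 80% target.
            "status": "PASS" if raw >= threshold else "FAIL",
        })
    return rows


rows = coverage_rows({"load_policy": (6, 5), "inject_all": (10, 7)}, threshold=80)
```

With the counts from the table above, this reproduces 83% PASS for `load_policy` and 70% FAIL for `inject_all`.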
If any symbol is below TEST_COVERAGE_THRESHOLD:
AskUserQuestion:
Initialize: _supplement_round = 0
LOOP:
    IF _supplement_round >= TEST_COVERAGE_MAX_SUPPLEMENT_ROUNDS:
        HALT loop. Report: "Reached supplement limit ({max} rounds). Coverage: {current}%."
        Break to Phase 7.
    1. Identify uncovered branches from coverage output.
    2. Generate additional test cases targeting those branches (same rules as Phase 4).
    3. Append to existing test file.
    4. Re-run tests + coverage.
    5. IF coverage >= threshold for all symbols → break LOOP.
    6. IF no new test cases generated (all branches are impractical to test) → break LOOP.
    7. _supplement_round += 1, continue LOOP.
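
The loop's control flow can be expressed as a small Python function. This is a sketch of the termination logic only: `measure` and `generate` are hypothetical callables standing in for "re-run tests + coverage" and "generate and append test cases", and the returned reason strings are illustrative.

```python
def supplement_loop(measure, generate, threshold=80, max_rounds=3):
    """Run coverage supplement rounds until one of the three exits is hit.

    measure()        -> {symbol: percent}         (hypothetical callable)
    generate(below)  -> number of new test cases  (hypothetical callable)
    Returns (coverage, completed_rounds, reason).
    """
    rounds = 0
    coverage = measure()
    while True:
        if rounds >= max_rounds:
            return coverage, rounds, "limit_reached"
        below = sorted(s for s, pct in coverage.items() if pct < threshold)
        added = generate(below)   # steps 1-3: target uncovered branches
        coverage = measure()      # step 4: re-run tests + coverage
        if all(pct >= threshold for pct in coverage.values()):
            return coverage, rounds, "threshold_met"   # step 5
        if added == 0:
            return coverage, rounds, "no_new_tests"    # step 6
        rounds += 1               # step 7
```

As in the pseudocode, the counter is incremented only at the end of a round, so a run that reaches the threshold in its first iteration reports zero completed rounds.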
Goal: Produce a persistent report and (in TDD mode) an evidence packet.
1. Create the output directory: `Bash("mkdir -p '.claude/temp_testgen'")`.
2. Generate a timestamp: `Bash("date +\"%Y%m%d_%H%M%S\"")` → `{TIMESTAMP}`.
3. Use `render.save_report()` to generate and persist the report to `.claude/temp_testgen/testgen_{TIMESTAMP}.md`.

Populate the context dict:
{
"project_name": "...",
"mode": "post-hoc|tdd",
"effort_level": "medium",
"target_set": [{"file": "...", "symbol": "...", "type": "..."}],
"test_plan": [{"symbol": "...", "test_name": "...", "category": "...", "priority": "..."}],
"generated_files": [{"path": "...", "test_count": N}],
"test_results": [{"name": "...", "status": "PASS/FAIL"}],
"passed": N,
"total": N,
"coverage_data": [{"symbol": "...", "branches": N, "covered": N, "percent": N, "status": "PASS/FAIL"}],
"supplement_rounds": N,
"final_status": "PASS / FAIL / RED (TDD)"
}
If the user declined supplement in Phase 6.3:
- Save the coverage table to `.claude/temp_testgen/coverage_{TIMESTAMP}.md`.

In TDD mode, produce an evidence packet for `/remy-patch`:
1. Get the current commit: `Bash("git rev-parse HEAD 2>/dev/null || echo NO_GIT")`.
2. Write the packet to `.claude/temp_task/testgen_{TIMESTAMP}.json`:
{
"v": "1.0.0",
"task": {
"id": "testgen_{TIMESTAMP}",
"mode": "write",
"summary": "Implement functions to make generated TDD tests pass",
"read_only_until_evidence": true
},
"sender_payload": {
"plan": ["Implement {symbol_1} to satisfy test_{name_1}", "..."],
"analysis": "Tests expect: {interface_contracts_summary}",
"assumptions": []
},
"evidence_packet": {
"source_revision": {
"type": "git",
"commit": "{COMMIT}",
"retrieved_at": "{ISO-8601}"
},
"evidence": [
{
"id": "E-001",
"file_type": "test",
"path": "{test_file_path}",
"range": {"start": 1, "end": 50},
"why": "Generated TDD test defining expected interface behavior",
"status": "confirmed",
"confidence": 1.0,
"excerpt": "{verbatim test content}"
}
],
"proposed_changes": [
{
"id": "C-001",
"description": "Implement {symbol} to pass {test_count} tests",
"evidence_refs": ["E-001"]
}
]
}
}
3. Update `.active_packet`: `Bash("rm -f '.claude/temp_task/.active_packet' && echo 'testgen_{TIMESTAMP}.json' > '.claude/temp_task/.active_packet'")`.

Print a condensed summary to stdout:
Test Generation Complete
========================
Mode: {post-hoc|tdd}
Effort: {effort_level}
Target Set: {N} symbols across {M} files
Generated: {test_count} tests in {file_count} files
Results: {passed}/{total} {passed_text} | Status: {PASS|FAIL|RED}
Coverage: {min}% - {max}% (threshold: {TEST_COVERAGE_THRESHOLD}%)
Supplement: {rounds} rounds
Report: .claude/temp_testgen/testgen_{TIMESTAMP}.md
{Packet: .claude/temp_task/testgen_{TIMESTAMP}.json | Execute: /remy-patch testgen_{TIMESTAMP}.json}
The Packet line is printed only in TDD mode.
- /remy-patch and /remy-inspect.
- AskUserQuestion.
- TEST_COVERAGE_THRESHOLD. The env var is the single source of truth.
- At effort `low`, no agents are spawned. All test generation is heuristic-based.
- plan → test → patch workflow chain.