pangcheng1849/test-designer

Name: test-designer
Author: pangcheng1849

skills/test-designer/SKILL.md

npx skillsauth add pangcheng1849/g-claude-code-plugins test-designer

Test Designer

Independent test-design orchestrator. Encodes Independent Evaluation: the agent writing the tests must not be the agent implementing the feature, and must not inherit the implementation's assumptions.

When to Use

- TDD red phase for a complex / non-trivial feature (multi-file, multi-branch logic, new subsystem)
- Requirement is ambiguous enough that the implementer's tests would likely rationalize the implementation instead of catching bugs
- User explicitly asks for "independent test design", "fresh-eyes tests", or runs /test-designer

Don't use for:

- Trivial changes (one-line fix, rename): just write the test inline
- Bug reproduction tests: write directly from the bug report
- Non-code changes (pure docs, pure config, pure prompt)

The Iron Law

The agent designing the tests must not carry the implementation's context. If you (the main Agent) are about to implement the feature, you are disqualified from designing its tests. Dispatch.

Violating this produces tests that pass because they mirror the buggy implementation.

Steps

Step 1: Assemble the dispatch package

Collect only these inputs — nothing else:

1. Requirement description: "what to do" and acceptance criteria (not "how to do")
2. Relevant code file paths: read-only access to the code the feature will touch or integrate with
3. Edge case prompts: categories the dispatched agent should enumerate:
   - Boundary inputs (empty, max, min, off-by-one)
   - Concurrency / ordering (if applicable)
   - Resource lifecycle (cleanup on error, partial failure)
   - Invariants (data consistency, idempotency)
   - Adversarial inputs (malformed, oversized, mis-encoded)

Explicitly exclude:

- The implementation plan or design you've been developing
- Hints about which approach you've chosen
- Code excerpts from a work-in-progress branch
- Your own guesses about "the right way to test this"
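The allow-list idea in Step 1 can be sketched in code. This is a minimal illustration, not part of the skill itself: the `DispatchPackage` shape, the key names, and `assembleDispatchPackage` are all hypothetical. It accepts only the three listed inputs and rejects anything else, so excluded context cannot slip in unnoticed.

```typescript
// Hypothetical shape for the Step 1 dispatch package; all names are illustrative.
type EdgeCaseCategory =
  | "boundary"
  | "concurrency"
  | "lifecycle"
  | "invariants"
  | "adversarial";

interface DispatchPackage {
  requirement: string; // "what to do" plus acceptance criteria
  codePaths: string[]; // read-only paths the feature touches
  edgeCasePrompts: EdgeCaseCategory[];
}

const ALLOWED_KEYS = ["requirement", "codePaths", "edgeCasePrompts"];

// Builds the package and rejects anything beyond the three allowed inputs,
// so an implementation plan or WIP excerpt cannot ride along by accident.
function assembleDispatchPackage(input: { [key: string]: unknown }): DispatchPackage {
  const extras = Object.keys(input).filter((key) => ALLOWED_KEYS.indexOf(key) === -1);
  if (extras.length > 0) {
    throw new Error("Excluded context in dispatch package: " + extras.join(", "));
  }
  const requirement = input["requirement"];
  const codePaths = input["codePaths"];
  const edgeCasePrompts = input["edgeCasePrompts"];
  if (typeof requirement !== "string" || requirement.trim() === "") {
    throw new Error("A requirement description is required");
  }
  if (!Array.isArray(codePaths) || !Array.isArray(edgeCasePrompts)) {
    throw new Error("codePaths and edgeCasePrompts must be arrays");
  }
  return {
    requirement,
    codePaths: codePaths as string[],
    edgeCasePrompts: edgeCasePrompts as EdgeCaseCategory[],
  };
}
```

An allow-list is used rather than a deny-list because "nothing else" is easier to enforce by naming what is permitted than by guessing every way implementation context might be labeled.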

Step 2: Choose the executor

| Task shape | Executor | Reason |
|---|---|---|
| Complex, architectural implications | Independent Agent (e.g., codex-agent or claude-code-agent with fresh session) | True zero-context isolation; can use strongest model at highest effort |
| Medium complexity, current conversation clean | In-conversation subagent | Cheaper; still acceptable if main Agent hasn't yet proposed an implementation |
| Trivial | Don't dispatch; write tests inline | |

Default to Independent Agent when the main Agent has already discussed or sketched implementation. Subagent isolation within the same conversation doesn't undo prior context pollution.
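The executor choice can be read as a small decision function. The sketch below is one possible encoding, with hypothetical type and function names; it treats "the main Agent has already discussed or sketched implementation" as a boolean the caller supplies.

```typescript
// Hypothetical names; the three shapes and executors mirror the table above.
type TaskShape = "complex" | "medium" | "trivial";
type Executor = "independent-agent" | "in-conversation-subagent" | "inline";

function chooseExecutor(shape: TaskShape, implementationDiscussed: boolean): Executor {
  // Trivial work is never dispatched.
  if (shape === "trivial") return "inline";
  // Prior implementation talk pollutes the conversation, so a same-conversation
  // subagent no longer gives real isolation.
  if (implementationDiscussed) return "independent-agent";
  return shape === "complex" ? "independent-agent" : "in-conversation-subagent";
}
```

For example, `chooseExecutor("medium", true)` yields `"independent-agent"`, reflecting the default stated above.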

Step 3: Dispatch with the strongest model and highest effort

Test design is a correctness-critical reasoning task, not a rote mechanical one. Use:

- Model: strongest reasoning model the runtime offers; inherit if the main Agent is already on that tier, otherwise override. Don't hardcode a specific brand name
- Effort: xhigh (the maximum level the runtime supports). Escalation ladder: low → medium → high → xhigh
- Tools: Read / Grep / Glob on code paths; Write on test files only
- Permission: read-only on non-test files; writable on test files
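As a rough sketch of how the effort and permission rules might be enforced by a harness: the code below picks the highest effort level a runtime reports and gates Write access to test files. The test-file convention (a `tests/` prefix or a `.test`/`.spec` suffix) and every function name are assumptions for illustration; a real project would substitute its own layout.

```typescript
type Effort = "low" | "medium" | "high" | "xhigh";
const EFFORT_LADDER: Effort[] = ["low", "medium", "high", "xhigh"];

// Returns the top rung of the ladder that the runtime actually supports.
function highestSupportedEffort(supported: Effort[]): Effort {
  for (const level of EFFORT_LADDER.slice().reverse()) {
    if (supported.indexOf(level) !== -1) return level;
  }
  throw new Error("Runtime reports no supported effort level");
}

type Tool = "Read" | "Grep" | "Glob" | "Write";

// Assumed convention: tests live under tests/ or end in .test.* / .spec.*
function isTestFile(path: string): boolean {
  return /^tests\//.test(path) || /\.(test|spec)\.[jt]sx?$/.test(path);
}

// Read-style tools are allowed on any provided path; Write only on test files.
function isToolAllowed(tool: Tool, path: string): boolean {
  return tool === "Write" ? isTestFile(path) : true;
}
```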

Example dispatch prompt skeleton:

You are designing failing tests for a feature. You will NOT see or write the
implementation. Your job is to produce executable tests that fail today and
pass only when the feature is correctly implemented.

Requirement:
<paste requirement description + acceptance criteria>

Code paths (read-only, for understanding context):
<list of file paths>

Existing test framework and conventions:
<infer from repo or specify>

Produce:
1. A test plan — enumerate the behaviors being tested (happy path + edge
   cases), grouped by category (boundary / concurrency / lifecycle /
   invariants / adversarial).
2. Executable test files that fail against the current code (or against
   an empty implementation).
3. For each test, one-line rationale explaining the bug it would catch.

Constraints:
- Do NOT propose an implementation.
- Do NOT edit files outside the test directory.
- Cover edge cases explicitly; don't only test the happy path.
- Use the project's existing test framework and style.
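If the skeleton above is filled in programmatically, a small renderer keeps the fixed wording in one place and makes it harder to append stray context. The following is a sketch under that assumption; `PromptInputs` and `renderDispatchPrompt` are hypothetical names, and the "Produce" section is abbreviated relative to the full skeleton.

```typescript
interface PromptInputs {
  requirement: string; // requirement description + acceptance criteria
  codePaths: string[]; // read-only paths for context
  testConventions: string; // framework and style, inferred or specified
}

function renderDispatchPrompt(inputs: PromptInputs): string {
  const pathLines = inputs.codePaths.map((p) => "- " + p).join("\n");
  return [
    "You are designing failing tests for a feature. You will NOT see or write the",
    "implementation. Your job is to produce executable tests that fail today and",
    "pass only when the feature is correctly implemented.",
    "",
    "Requirement:",
    inputs.requirement,
    "",
    "Code paths (read-only, for understanding context):",
    pathLines,
    "",
    "Existing test framework and conventions:",
    inputs.testConventions,
    "",
    "Produce: a test plan, executable failing test files, and a one-line",
    "rationale per test.",
    "",
    "Constraints:",
    "- Do NOT propose an implementation.",
    "- Do NOT edit files outside the test directory.",
    "- Cover edge cases explicitly; don't only test the happy path.",
    "- Use the project's existing test framework and style.",
  ].join("\n");
}
```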

Step 4: Validate the returned tests

Before handing the tests to the implementation phase:

1. Run the tests: they should FAIL (red), and fail for the reason the rationale predicts. A test that fails on ImportError, missing fixture, syntax error, or "module not found" is fake red; the test isn't actually exercising the behavior it claims to. Fix the test or drop it.
2. Scan the rationale: does each test catch a distinct failure mode? Drop duplicates.
3. Check coverage: are all edge case categories represented? Request additions if not.
4. Confirm the test framework matches: ensure the dispatched agent used the right runner / assertion lib / fixtures.
5. Check for shape-to-example tests: a test that asserts on specific happy-path values (e.g., "output equals exactly [1, 2, 3] for this fixture") is shaping the test to the example, not to the requirement. Such a test passes when the implementation matches the fixture and breaks for any valid variant input. Replace with property-style assertions ("output is sorted and contains all input elements") or add a second test with a different input that exercises the same property.
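The fake-red check in the first item can be partly automated. The sketch below is a heuristic only: the message patterns are illustrative guesses at common setup failures, not an exhaustive or authoritative list, and `TestRunResult` is a hypothetical shape for whatever the test runner reports. A result classified as `"red"` still needs a manual comparison against the test's stated rationale.

```typescript
type RedVerdict = "passes" | "fake-red" | "red";

interface TestRunResult {
  passed: boolean;
  failureMessage?: string;
}

// Illustrative patterns for failures caused by setup problems, not behavior.
const FAKE_RED_PATTERNS: RegExp[] = [
  /ImportError/,
  /module not found/i,
  /Cannot find module/,
  /SyntaxError/,
  /fixture .* not found/i,
];

function classifyRed(result: TestRunResult): RedVerdict {
  if (result.passed) return "passes"; // not red at all: the test constrains nothing yet
  const message = result.failureMessage || "";
  const isSetupFailure = FAKE_RED_PATTERNS.some((pattern) => pattern.test(message));
  return isSetupFailure ? "fake-red" : "red";
}
```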

Step 5: Hand off to implementation

With the validated failing tests in place, implementation proceeds per the test-driven-development skill: write minimal code to make them pass (green), then run regression checks.

Output Format (from the dispatched agent)

Require the agent to return:

A test plan (bullet list, grouped by category) followed by the test files. Each test must include a one-line rationale comment. No implementation code. No commentary on how to implement. If assumptions about the code are needed, list them explicitly at the top of the test file.

Anti-patterns

❌ Main Agent writes the tests after sketching the implementation — tests will mirror the implementation's assumptions
❌ Dispatching with medium effort / weaker model to save cost — test design quality compounds across the whole feature's lifetime
❌ Passing the work-in-progress branch contents to the dispatched agent — defeats Independent Evaluation
❌ Accepting tests that pass against an empty implementation — those tests don't constrain anything
❌ Skipping Step 4 validation — unvalidated tests get merged as fake green
❌ Accepting "shape-to-example" tests — a test that asserts on specific happy-path values from the requirement's example data passes whenever input==fixture and breaks for any variant. Use property assertions (sorted, idempotent, contains-all-inputs) or pair the example test with a variant-input test that exercises the same invariant
❌ Accepting fake red — a test that fails on ImportError, missing fixture, or "module not found" looks red but isn't testing anything. Step 4 must verify the test fails for the reason the rationale predicts
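To make the shape-to-example anti-pattern concrete, here is a minimal sketch using sorting as a stand-in feature (the skill does not prescribe this example; all function names are hypothetical). A hard-coded implementation satisfies the example fixture but is caught as soon as the same property is checked on a variant input.

```typescript
function isSortedAscending(values: number[]): boolean {
  for (let i = 1; i < values.length; i++) {
    if (values[i - 1] > values[i]) return false;
  }
  return true;
}

// True when both arrays hold the same elements with the same multiplicities.
function hasSameElements(a: number[], b: number[]): boolean {
  if (a.length !== b.length) return false;
  const x = a.slice().sort((m, n) => m - n);
  const y = b.slice().sort((m, n) => m - n);
  return x.every((value, index) => value === y[index]);
}

// Property: output is sorted and contains all input elements.
function satisfiesSortProperty(sortFn: (xs: number[]) => number[], input: number[]): boolean {
  const output = sortFn(input.slice());
  return isSortedAscending(output) && hasSameElements(input, output);
}

// A deliberately wrong implementation shaped to the example fixture.
const hardCodedSort = (_xs: number[]): number[] => [1, 2, 3];

const passesExample = satisfiesSortProperty(hardCodedSort, [3, 1, 2]); // true
const passesVariant = satisfiesSortProperty(hardCodedSort, [5, 4]); // false
```

The example input alone cannot distinguish the hard-coded function from a real sort; the variant input does.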

Relationship to other skills

brainstorming → clarifies the requirement (upstream of test-designer)
test-driven-development → governs the red-green-refactor loop (downstream; consumes the failing tests)
systematic-debugging → kicks in if tests unexpectedly fail after implementation (downstream)
verification-before-completion → runs the tests at the "done" gate (downstream)

Example invocation

User: Starting work on the new plugin dependency resolver. Design tests first.
Assistant:
  1. Requirement: "Resolver takes a plugin manifest and returns install order
     respecting deps and detecting cycles. Must handle: transitive deps,
     diamond deps, self-references, missing deps, cycles."
  2. Code paths: src/plugins.ts, .claude/plugins.json schema, tests/ dir
  3. Dispatch to an independent-agent skill (fresh session) at `xhigh` effort,
     read-only on src/, writable on tests/
  4. Agent returns: test plan (5 categories, 18 tests), tests/resolver.test.ts
     with failing assertions + per-test rationale comments
  5. Main Agent runs tests → all red → validates rationale → hands off
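For the resolver in this example, a returned test might lean on an order-checking helper rather than a fixed expected array, since several install orders can be valid for the same manifest. The sketch below assumes a manifest shaped as a map from plugin name to its dependency names; that shape and the helper name are hypothetical, not taken from the actual repository.

```typescript
// Assumed manifest shape: plugin name -> names of plugins it depends on.
type Manifest = { [plugin: string]: string[] };

// Returns a description of the first violation, or null if the order is valid.
function findOrderViolation(manifest: Manifest, order: string[]): string | null {
  const plugins = Object.keys(manifest);
  if (order.length !== plugins.length) {
    return "order must list each plugin exactly once";
  }
  const position: { [plugin: string]: number } = Object.create(null);
  order.forEach((plugin, index) => {
    position[plugin] = index;
  });
  for (const plugin of plugins) {
    if (position[plugin] === undefined) return plugin + " is missing from the order";
    for (const dep of manifest[plugin]) {
      if (position[dep] === undefined) return dep + " (dependency of " + plugin + ") is missing";
      // >= also flags a self-reference, which can never be satisfied.
      if (position[dep] >= position[plugin]) return dep + " must be installed before " + plugin;
    }
  }
  return null;
}

// Diamond dependency: a needs b and c, which both need d.
const diamond: Manifest = { a: ["b", "c"], b: ["d"], c: ["d"], d: [] };
const validOrder = findOrderViolation(diamond, ["d", "b", "c", "a"]); // null
const invalidOrder = findOrderViolation(diamond, ["a", "b", "c", "d"]); // "b must be installed before a"
```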

Design failing tests for complex features using Independent Evaluation — dispatches a context-free agent that sees only the requirement spec and code paths (not the implementation approach), then returns executable failing tests. Use when starting TDD for a non-trivial feature, when the requirement is ambiguous enough that biased tests are a risk, or when the user asks for independent test design.

development

Updated May 10, 2026

The `npx skillsauth add` command above installs this skill globally. It works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Scanners in report:

- Trivy: Container and dependency vulnerability scanner. Status: Clean (95%)
- Semgrep: Static code analysis for vulnerabilities. Status: Clean (95%)
- mcp-scan (Snyk): Model Context Protocol security validation. Status: Clean (95%)
- Snyk (dep): Open source security scanning. Status: Skipped (50%)
- Socket.dev: Supply chain security analysis. Status: Skipped (50%)
- VirusTotal: Multi-engine malware detection. Status: Skipped (50%)
- CrowdStrike: Advanced threat intelligence. Status: Skipped (50%)
- OSV-Scanner: Open Source Vulnerability database check. Status: Skipped (50%)
- OWASP Dep-Check. Status: Skipped (50%)

Last scanned: May 10, 2026, 5:15 AM. Duration: 210.4s. 1 file scanned.

Related Skills

pangcheng1849/parallel-implementation

Category: tools

Plan how to slice a non-trivial coding task across parallel subagents. Returns a dispatch plan (file assignments, dependencies, output-format contracts); the main Agent then executes it with the Agent tool + `isolation: "worktree"`. Invoke only when work justifies multi-agent overhead: (a) greenfield 0→1 across multiple independent modules, (b) change touches ≥3 modules, or (c) ≥5 files each with >50 lines of diff. Small changes are written inline.

SKILL.md, updated May 10, 2026

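pangcheng1849/ip-diagnosis

Category: development

On macOS + Chrome, troubleshoot public IPv4/IPv6 egress, country/region, ASN/organization, DNS, default route, utun status, and browser-side Server Response and WebRTC exposure. Use when the user asks to check IP, region consistency, VPN/proxy takeover, IPv6 issues, or browser network exposure; outputs a detailed operations report with recheck links.

SKILL.md, updated May 10, 2026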

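pangcheng1849/gemini-agent

Category: tools

Delegate coding, review, diagnosis, planning, and structured-output tasks to an independent Gemini session via the Gemini CLI. Use cases include `gemini -p` for non-interactive execution, `gemini -r latest` to continue the most recent session, `gemini -r "<session-id>"` to resume a specific session, and scripted / CI invocations that need `--output-format json` / `stream-json`, `--approval-mode plan` for read-only review, `--sandbox` for isolated execution, or `--worktree` to run the task in a separate git worktree.

SKILL.md, updated May 10, 2026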

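pangcheng1849/codex-agent

Category: tools

Delegate coding, review, diagnosis, planning, structured-output, and local browser research tasks to an independent Codex session via the Codex CLI. Use cases include `codex exec` to start a new task, `codex exec resume` to continue a multi-turn session, `codex exec review` for read-only review, and cases that need the `--json` event stream, `-o` to write the final message to disk, image input, or Computer Use browser operation.

SKILL.md, updated May 10, 2026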

Install manually

# Clone the repo
git clone https://github.com/pangcheng1849/g-claude-code-plugins.git

# Copy into Claude Code skills folder (global)
cp -r g-claude-code-plugins/skills/test-designer ~/.claude/skills/

Repository: pangcheng1849/g-claude-code-plugins

Compatible with: Claude Code, OpenAI Codex CLI, ChatGPT