rp1-run/prompt-eval-builder

Name: prompt-eval-builder
Author: rp1-run

plugins/utils/skills/prompt-eval-builder/SKILL.md

Domain knowledge for extracting eval assertions and generating test invocation prompts from command/agent specs. Used for building promptfoo evaluation configs.

21 stars

development

Updated Apr 16, 2026

npx skillsauth add rp1-run/rp1 prompt-eval-builder

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Percentage |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 24, 2026, 8:51 PM (1.8s, 1 file scanned)

SKILL.md

name: prompt-eval-builder
description: Domain knowledge for extracting eval assertions and generating test invocation prompts from command/agent specs. Used for building promptfoo evaluation configs.
category: prompt
is_workflow: false

Prompt Eval Builder

Domain knowledge for building evaluation artifacts from prompt specifications. Provides extraction patterns, output templates, validation logic, and invocation prompt generation rules.

When to Use

Generating promptfoo eval configs from command/agent prompts
Extracting testable assertions from instruction text (primarily LLM rubrics)
Creating test invocation prompts (user inputs that test the command)
Validating generated YAML output

Assertion Types

LLM Rubrics (Default): Generated for most behavioral requirements. Rubrics contain REQUIRED/PROHIBITED/EDGE CASES sections and evaluate behavior holistically by checking output text AND Metadata JSON.

Programmatic Assertions (Complex Cases): Generated when requirements need exact counting, strict sequencing, or complex conditional logic that cannot be expressed in natural language rubrics. Uses type: javascript with file:// references to shared assertion functions.
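As an illustration of the "exact counting" case, here is a minimal sketch of what a shared assertion function might look like. The requirement (exactly N numbered steps), the function names, and the step-matching regex are all hypothetical and do not come from this skill's files. The result shape follows the pass/score/reason object that promptfoo JavaScript assertions can return.

```typescript
// Result shape modeled on promptfoo's grading result (pass, score, reason).
interface GradingResult {
  pass: boolean;
  score: number;
  reason: string;
}

// Hypothetical helper: count lines that look like "1. Do something".
function countNumberedSteps(output: string): number {
  return output
    .split("\n")
    .filter((line) => /^\s*\d+\.\s+\S/.test(line)).length;
}

// Hypothetical requirement: the output must contain exactly `expected` numbered steps.
// A rubric can only estimate this; a programmatic check counts it exactly.
function assertExactStepCount(output: string, expected: number): GradingResult {
  const actual = countNumberedSteps(output);
  const pass = actual === expected;
  return {
    pass,
    score: pass ? 1 : 0,
    reason: pass
      ? `Found exactly ${expected} numbered steps`
      : `Expected ${expected} numbered steps, found ${actual}`,
  };
}
```

To use a function like this from a promptfoo config it would need to be exported from the file that the file:// reference points to; the export is omitted here so the snippet runs standalone.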

Skill Files

| File | Purpose | When to Load |
|------|---------|--------------|
| PATTERNS.md | Extraction categories, tool mappings, selection rules, invocation generation | Always - core knowledge |
| TEMPLATES.md | promptfoo YAML output templates, assertion formats | When generating YAML output |
| VALIDATION.md | YAML validation loop, error handling | When validating/writing output |

Loading Instructions

Agents using this skill:

Read SKILL.md for overview
Read PATTERNS.md for extraction/invocation rules (always needed)
Read TEMPLATES.md for output format (for extraction agent)
Use scripts/validate-yaml.ts for YAML validation

Scripts

| Script | Purpose | Usage |
|--------|---------|-------|
| scripts/validate-yaml.ts | Validate YAML syntax | bun {skill_path}/scripts/validate-yaml.ts {output_file} |

Output format: { "valid": true } or { "valid": false, "error": "message" }
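A caller that consumes the validator's output could model the two documented shapes as a discriminated union. The sketch below is an assumption about how a consumer might parse that JSON; the type and function names are hypothetical and are not part of scripts/validate-yaml.ts.

```typescript
// The two output shapes documented above, as a discriminated union on `valid`.
type ValidationResult =
  | { valid: true }
  | { valid: false; error: string };

// Hypothetical parser: turns the script's stdout into a typed result,
// rejecting anything that does not match the documented shapes.
function parseValidationResult(raw: string): ValidationResult {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("Validator output is not a JSON object");
  }
  const record = data as Record<string, unknown>;
  if (record.valid === true) {
    return { valid: true };
  }
  if (record.valid === false && typeof record.error === "string") {
    return { valid: false, error: record.error };
  }
  throw new Error("Validator output does not match the documented shape");
}
```

Checking `result.valid` first lets TypeScript narrow the type, so `result.error` is only accessible on the failure branch.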

Workflow Overview

Extraction Flow

Prompt Text -> Pattern Analysis -> Requirement Categorization -> LLM Rubric Generation -> YAML Validation
                                                              \-> Programmatic Assertion (if complex)

Invocation Prompt Flow

Command/Agent Spec -> Metadata Extraction -> Variable Mapping -> Invocation Prompt

Key Concept: Test prompts are USER INPUTS that invoke the command, not distilled versions of the prompt. Example:

/rp1-dev:build-fast "{{REQUEST}}" --git-commit={{GIT_COMMIT}} --afk={{AFK_MODE}}

Both flows share PATTERNS.md for domain knowledge.
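To make the variable-mapping step concrete, the sketch below fills the `{{NAME}}` placeholders of the example invocation with test values. It is only an illustration: promptfoo performs its own variable substitution when it runs a test, and the function name and sample values here are hypothetical.

```typescript
// The example invocation prompt from above, with its three placeholders.
const template =
  '/rp1-dev:build-fast "{{REQUEST}}" --git-commit={{GIT_COMMIT}} --afk={{AFK_MODE}}';

// Hypothetical helper: replace each {{NAME}} with its value,
// failing loudly if a placeholder has no value supplied.
function renderInvocation(tpl: string, vars: Record<string, string>): string {
  return tpl.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    const value: string | undefined = vars[name];
    if (value === undefined) {
      throw new Error(`Missing value for {{${name}}}`);
    }
    return value;
  });
}

// Sample values are made up for illustration.
const rendered = renderInvocation(template, {
  REQUEST: "Add a health check endpoint",
  GIT_COMMIT: "false",
  AFK_MODE: "true",
});
console.log(rendered);
```

The rendered string is what a user would type to invoke the command, which is the point of the key concept above: the test input exercises the command rather than restating its prompt.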

Install manually

# Clone the repo
git clone https://github.com/rp1-run/rp1.git

# Copy into Claude Code skills folder (global)
cp -r rp1/plugins/utils/skills/prompt-eval-builder ~/.claude/skills/

Repository

rp1-run/rp1

21 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT
