
paralleldrive/aidd-riteway-ai

Name: aidd-riteway-ai
Author: paralleldrive

ai/skills/aidd-riteway-ai/SKILL.md

npx skillsauth add paralleldrive/aidd aidd-riteway-ai


🧪 aidd-riteway-ai

Act as a top-tier AI test engineer to write correct riteway ai prompt evals for multi-step agent skills that involve tool calls.

Refer to /aidd-tdd for assertion style (given/should/actual/expected) and test isolation principles.

Refer to /aidd-requirements for the "Given X, should Y" format when writing assertions inside .sudo eval files.

Process

1. Read the skill under test and its functional requirements
2. Identify the discrete steps in the skill's flow
3. Create one .sudo eval file per step (Rule 1), placed in ai-evals/<skill-name>/
4. For each file, write the userPrompt: include mock tool preambles for unit evals (Rule 2), assert tool calls for step 1 (Rule 3), supply previous step output for step N > 1 (Rule 4)
5. Write assertions derived strictly from functional requirements in Given X, should Y format (Rule 7)
6. Create small, single-condition fixture files as needed (Rule 6)
7. Verify against the Eval Authoring Checklist below

Eval File Structure

A .sudo eval file has three sections: an import of the skill under test, the userPrompt block, and the assertions:

import 'ai/skills/<skill-name>/SKILL.md'

userPrompt = """
<prompt sent to the agent under test>
"""

- Given <condition>, should <observable behavior>
- Given <condition>, should <observable behavior>

Assertions are bullet points written after the userPrompt block. Each assertion tests one distinct observable behavior derived from the functional requirements of the skill under test.
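To make the three-section layout concrete, here is a minimal sketch that assembles an eval file as text. The helper name renderEval, its argument shape, and the skill name my-skill are hypothetical and exist only for illustration; riteway does not provide them.

```javascript
// Hypothetical helper: assembles the three sections of a .sudo eval file.
// renderEval and its argument shape are illustrative, not part of riteway.
function renderEval({ skillName, userPrompt, assertions }) {
  // Section 1: import of the skill under test
  const importLine = `import 'ai/skills/${skillName}/SKILL.md'`;
  // Section 2: the prompt sent to the agent under test
  const promptBlock = `userPrompt = """\n${userPrompt.trim()}\n"""`;
  // Section 3: one bullet per distinct observable behavior
  const assertionLines = assertions
    .map(({ given, should }) => `- Given ${given}, should ${should}`)
    .join("\n");
  return [importLine, promptBlock, assertionLines].join("\n\n") + "\n";
}

// "my-skill" is a placeholder skill name.
const evalText = renderEval({
  skillName: "my-skill",
  userPrompt: "Run step 1 of your skill under test.",
  assertions: [
    { given: "mock gh tools", should: "call gh pr view to retrieve the PR branch name" },
  ],
});
console.log(evalText);
```

Generating the file from structured parts is optional; writing it by hand is equally valid as long as the three sections appear in this order.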

Rule 1 — One eval file per step

Given a multi-step flow under test, write one .sudo eval file per step rather than combining all steps into a single overloaded userPrompt.

Naming convention:

ai-evals/<skill-name>/step-1-<description>-test.sudo
ai-evals/<skill-name>/step-2-<description>-test.sudo

Do not collapse multiple steps into one file. Each file tests exactly one discrete agent action.
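The naming convention can be checked mechanically. The sketch below is one possible check, assuming lowercase names made of letters, digits, and hyphens; that character set and the function name parseUnitEvalPath are assumptions, not something the skill specifies.

```javascript
// Pattern for the unit-eval naming convention:
//   ai-evals/<skill-name>/step-<n>-<description>-test.sudo
// The allowed character set is an illustrative assumption.
const UNIT_EVAL_PATH =
  /^ai-evals\/([a-z0-9-]+)\/step-(\d+)-([a-z0-9-]+)-test\.sudo$/;

// Returns the parsed parts, or null when the path does not follow the convention.
function parseUnitEvalPath(path) {
  const match = UNIT_EVAL_PATH.exec(path);
  if (!match) return null;
  const [, skillName, step, description] = match;
  return { skillName, step: Number(step), description };
}

// "my-skill" and "triage" are placeholder names.
console.log(parseUnitEvalPath("ai-evals/my-skill/step-1-triage-test.sudo"));
console.log(parseUnitEvalPath("ai-evals/my-skill/all-steps.sudo"));
```

A file that bundles several steps has no single step number to put in its name, so it fails the check by construction.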

Rule 2 — Unit evals: tell the agent it is in a test environment

Given a unit eval for a step that involves tool calls (gh, GraphQL, REST API), include a preamble in the userPrompt that:

- Tells the prompted agent it is operating in a test environment.
- Provides mock tools with stub return values.
- Instructs the agent to use the mock tools instead of calling real APIs.

Example preamble:

You have the following mock tools available. Use them instead of real gh or GraphQL calls:

mock gh pr view => returns:
  title: My PR
  branch: feature/foo
  base: main

mock gh api (list review threads) => returns:
  [{ id: "T_01", resolved: false, body: "..." }]
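If several evals share the same stubs, the preamble can be generated rather than copied. This is a rough sketch under that assumption; buildMockPreamble is a hypothetical helper, and the first header sentence is wording chosen here to satisfy the "test environment" requirement.

```javascript
// Hypothetical helper: renders a mock-tool preamble from a map of
// tool description -> stub return value (both plain strings).
function buildMockPreamble(mocks) {
  const header = [
    "You are operating in a test environment.",
    "You have the following mock tools available. Use them instead of real gh or GraphQL calls:",
  ].join("\n");
  const stubs = Object.entries(mocks).map(([tool, returns]) => {
    // Indent every line of the stub value by two spaces, as in the example above.
    const body = returns
      .split("\n")
      .map((line) => `  ${line}`)
      .join("\n");
    return `mock ${tool} => returns:\n${body}`;
  });
  return [header, ...stubs].join("\n\n");
}

const preamble = buildMockPreamble({
  "gh pr view": "title: My PR\nbranch: feature/foo\nbase: main",
  "gh api (list review threads)": '[{ id: "T_01", resolved: false, body: "..." }]',
});
console.log(preamble);
```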

Rule 3 — Step 1: assert tool calls, do not pre-supply answers

Given a unit eval for step 1 of a tool-calling flow, assert that the agent makes the correct tool calls. Do not pre-supply the answers those calls would return — that defeats the purpose of the eval.

Correct pattern for step 1:

userPrompt = """
You have mock tools available. Use them instead of real API calls.
Run step 1 of your skill under test: fetch the PR details and review threads.
"""

- Given mock gh tools, should call gh pr view to retrieve the PR branch name
- Given mock gh tools, should call gh api to list the open review threads
- Given the review threads, should present them before taking any action

Wrong pattern (pre-supplying answers in step 1):

# ❌ Do not do this — it removes the assertion value
userPrompt = """
The PR branch is feature/foo.
The review threads are: [...]
Now generate delegation prompts.
"""

Rule 4 — Step N > 1: supply previous step output as context

Given a unit eval for step N > 1, include the output of the previous step as context inside the userPrompt. This makes each eval independently executable without running the prior steps live.

Example for step 2:

userPrompt = """
You have mock tools available. Use them instead of real calls.

Triage is complete. The following issues remain unresolved:

Issue 1 (thread ID: T_01):
  File: src/utils.js, line 5
  "add() subtracts instead of adding"

Generate delegation prompts for the remaining issues.
"""
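One way to keep this rule from being skipped is to make the previous step's output a required input when the prompt is composed. The sketch below assumes a hypothetical buildStepPrompt helper; the field names are illustrative.

```javascript
// Hypothetical helper: composes the userPrompt for step N > 1.
// It refuses to build a prompt without the previous step's output,
// so the eval cannot silently depend on a live prior run.
function buildStepPrompt({ preamble, previousOutput, instruction }) {
  if (!previousOutput || !previousOutput.trim()) {
    throw new Error("Step N > 1 evals need the previous step's output as context");
  }
  return [preamble, previousOutput.trim(), instruction].join("\n\n");
}

const step2Prompt = buildStepPrompt({
  preamble: "You have mock tools available. Use them instead of real calls.",
  previousOutput: [
    "Triage is complete. The following issues remain unresolved:",
    "",
    "Issue 1 (thread ID: T_01):",
    "  File: src/utils.js, line 5",
    '  "add() subtracts instead of adding"',
  ].join("\n"),
  instruction: "Generate delegation prompts for the remaining issues.",
});
console.log(step2Prompt);
```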

Rule 5 — E2E evals: use real tools, follow -e2e.test.sudo naming

Given an e2e eval, use real tools (no mock preamble) and follow the -e2e.test.sudo naming convention to mirror the project's existing unit/e2e split:

ai-evals/<skill-name>/step-1-<description>-e2e.test.sudo

E2E evals run against live APIs. Only run them when the environment is configured with the necessary credentials.
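The credential guard could look like the sketch below. It assumes the e2e evals need a GitHub token exposed as GH_TOKEN, which is one of the environment variables the gh CLI reads; the variable list and the function name selectRunnableEvals are assumptions to adapt per project.

```javascript
// Hypothetical helper: drops e2e evals from a run when the required
// credentials are missing. The default variable list is an assumption.
function selectRunnableEvals(paths, env, requiredVars = ["GH_TOKEN"]) {
  const hasCredentials = requiredVars.every((name) => Boolean(env[name]));
  // Unit evals always run; e2e evals run only with credentials present.
  return paths.filter(
    (path) => !path.endsWith("-e2e.test.sudo") || hasCredentials
  );
}

// Placeholder paths for illustration.
const allEvals = [
  "ai-evals/my-skill/step-1-triage-test.sudo",
  "ai-evals/my-skill/step-1-triage-e2e.test.sudo",
];
console.log(selectRunnableEvals(allEvals, process.env));
```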

Rule 6 — Fixture files: small, one condition per file

Given fixture files needed by an eval, keep them small (< 20 lines) with one clear bug or condition per file. Fixtures live in:

ai-evals/<skill-name>/fixtures/<filename>

Example fixture (add.js):

export const add = (a, b) => a - b; // bug: subtracts instead of adds

Do not combine multiple bugs in one fixture file. Each fixture must make the assertion conditions unambiguous.
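The size limit is easy to check automatically. This sketch counts non-blank lines, which is an interpretation of "< 20 lines" rather than something the rule spells out; isSmallFixture is a hypothetical name.

```javascript
// Upper bound from Rule 6: fixtures should stay under 20 lines.
const MAX_FIXTURE_LINES = 20;

// Hypothetical check: counts non-blank lines only (an assumption about
// how the limit is meant to be measured).
function isSmallFixture(source) {
  const lines = source.split("\n").filter((line) => line.trim() !== "");
  return lines.length < MAX_FIXTURE_LINES;
}

const addFixture =
  "export const add = (a, b) => a - b; // bug: subtracts instead of adds\n";
console.log(isSmallFixture(addFixture));
```

Whether a fixture contains exactly one bug still needs human review; a line count only guards against fixtures growing large enough to hide several.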

Rule 7 — Assertions: derived from functional requirements only

Given assertions in a .sudo eval, derive them strictly from the functional requirements of the skill under test using the /aidd-requirements format:

- Given <condition>, should <observable behavior>

Include only assertions that test distinct observable behaviors. Do not:

- Assert implementation details (e.g. internal variable names)
- Repeat the same observable behavior with different wording
- Assert things that are implied by another assertion already in the file
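Part of this rule can be linted. The sketch below checks the Given/should format and flags assertions that are identical after normalizing case and whitespace. It cannot detect rewordings or implied assertions, which still need review; lintAssertions is a hypothetical helper.

```javascript
// Expected shape of an assertion line in a .sudo eval.
const ASSERTION_FORMAT = /^- Given .+, should .+$/;

// Hypothetical lint: reports malformed lines and exact duplicates
// (case and whitespace insensitive). Rewordings are not detected.
function lintAssertions(lines) {
  const problems = [];
  const seen = new Set();
  for (const line of lines) {
    if (!ASSERTION_FORMAT.test(line)) {
      problems.push(`bad format: ${line}`);
    }
    const key = line.toLowerCase().replace(/\s+/g, " ").trim();
    if (seen.has(key)) {
      problems.push(`duplicate: ${line}`);
    }
    seen.add(key);
  }
  return problems;
}

const lintProblems = lintAssertions([
  "- Given mock gh tools, should call gh pr view",
  "- Given mock gh tools, should call gh pr view",
  "Should work",
]);
console.log(lintProblems);
```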

Eval Authoring Checklist

Before saving a .sudo eval file, verify:

[ ] One step per file (Rule 1)
[ ] Unit evals include mock tool preamble (Rule 2)
[ ] Step 1 asserts tool calls, not pre-supplied answers (Rule 3)
[ ] Step N > 1 includes previous step output as context (Rule 4)
[ ] E2E evals use -e2e.test.sudo suffix (Rule 5)
[ ] Fixture files are small, one condition each (Rule 6)
[ ] Assertions derived from functional requirements, no duplicates (Rule 7)

Commands { 🧪 /aidd-riteway-ai - write correct riteway ai prompt evals for multi-step tool-calling flows }


Teaches agents how to write correct riteway ai prompt evals (.sudo files) for multi-step flows that involve tool calls. Use when writing prompt evals, creating .sudo test files, or testing agent skills that use tools such as gh, GraphQL, or external APIs.

344 stars

tools

Updated May 11, 2026


npx skillsauth add paralleldrive/aidd aidd-riteway-ai

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

Trivy (container and dependency vulnerability scanner): Clean, 95%
Semgrep (static code analysis for vulnerabilities): Clean, 95%
mcp-scan (Snyk) (Model Context Protocol security validation): Clean, 95%
Snyk (dep) (open source security scanning): Skipped, 50%
Socket.dev (supply chain security analysis): Skipped, 50%
VirusTotal (multi-engine malware detection): Skipped, 50%
CrowdStrike (advanced threat intelligence): Skipped, 50%
OSV-Scanner (Open Source Vulnerability database check): Skipped, 50%
OWASP Dep-Check: Skipped, 50%

Last scanned: May 11, 2026, 5:07 AM (350.6s, 3 files scanned)

SKILL.md

name: aidd-riteway-ai
description: >
compatibility: Requires riteway >=9 with the `riteway ai` subcommand available.



Install manually

# Clone the repo
git clone https://github.com/paralleldrive/aidd.git

# Create the global Claude Code skills folder if it does not exist yet
mkdir -p ~/.claude/skills

# Copy the skill into the Claude Code skills folder (global)
cp -r aidd/ai/skills/aidd-riteway-ai ~/.claude/skills/


Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT