Adoption

Agent Skills are supported by leading AI development tools.

VS Code, Gemini CLI, GitHub, Goose, Amp, Cursor, Claude Code, Letta, OpenCode, Claude, OpenAI Codex, Factory

alfredolopez80/edd

Name: edd
Author: alfredolopez80

.claude/skills/edd/SKILL.md

npx skillsauth add alfredolopez80/multi-agent-ralph-loop edd


EDD (Eval-Driven Development) Framework v2.64

Eval-Driven Development is a quality-first development pattern that enforces a define-before-implement workflow with structured evaluations.

v2.88 Key Changes (MODEL-AGNOSTIC)

Model-agnostic: Uses the model configured in ~/.claude/settings.json or via CLI/env vars
No flags required: Works with the configured default model
Flexible: Works with GLM-5, Claude, Minimax, or any other configured model
Settings-driven: Model selection via ANTHROPIC_DEFAULT_*_MODEL env vars
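To see which model-selection variables are in effect, a minimal Python sketch follows. It assumes the ANTHROPIC_DEFAULT_*_MODEL settings are ordinary environment variables matching that pattern; `configured_default_models` is a hypothetical helper written for illustration and is not part of EDD or Claude Code.

```python
import fnmatch
import os


def configured_default_models(environ=None):
    """Return {name: value} for each non-empty ANTHROPIC_DEFAULT_*_MODEL variable.

    Hypothetical helper: it only inspects the environment and does not read
    ~/.claude/settings.json.
    """
    env = os.environ if environ is None else environ
    return {
        name: value
        for name, value in sorted(env.items())
        if fnmatch.fnmatchcase(name, "ANTHROPIC_DEFAULT_*_MODEL") and value
    }


# Print whatever is set in the current shell (possibly nothing).
for name, value in configured_default_models().items():
    print(f"{name}={value}")
```

An empty result only means that no matching variable is exported in the current shell; a model may still be configured in the settings file.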

What is EDD?

EDD provides a systematic approach to software development with three phases:

DEFINE - Create structured eval specifications using TEMPLATE.md
IMPLEMENT - Build features according to eval definitions
VERIFY - Validate implementation against eval criteria

Check Types

| Prefix | Type | Purpose |
|--------|------|---------|
| CC- | Capability Checks | Feature capabilities and functionality |
| BC- | Behavior Checks | Expected behaviors and responses |
| NFC- | Non-Functional Checks | Performance, security, maintainability |
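The prefix convention can be checked mechanically. The sketch below assumes check IDs always take the form prefix, hyphen, number (for example `NFC-2`), as in the examples in this document; `CheckType` and `classify` are illustrative names, not part of the EDD scripts.

```python
import re
from enum import Enum
from typing import Tuple


class CheckType(Enum):
    CAPABILITY = "CC"
    BEHAVIOR = "BC"
    NON_FUNCTIONAL = "NFC"


# Assumed ID shape: one of the three prefixes, a hyphen, then a number.
_CHECK_ID = re.compile(r"^(CC|BC|NFC)-(\d+)$")


def classify(check_id: str) -> Tuple[CheckType, int]:
    """Split a check ID such as 'BC-3' into its type and sequence number."""
    match = _CHECK_ID.match(check_id.strip())
    if match is None:
        raise ValueError(f"not a recognised check ID: {check_id!r}")
    return CheckType(match.group(1)), int(match.group(2))


print(classify("NFC-2"))
```

Rejecting unknown prefixes early keeps a typo such as `NC-1` from silently dropping out of a check category.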

Usage

# Invoke EDD workflow
/edd "Define memory-search feature"

# CLI script (if available)
ralph edd define memory-search
ralph edd check memory-search

Components

TEMPLATE.md: Template for creating eval definitions
edd.sh: CLI script for eval management
/edd skill: Skill invocation from Claude Code
~/.claude/evals/: Directory for eval definitions

Template Structure

Each eval definition includes:

Capability Checks (CC-) - What the feature can do
Behavior Checks (BC-) - How the feature behaves
Non-Functional Checks (NFC-) - Performance, security, etc.
Implementation Notes - Technical guidance
Verification Evidence - Test results

Example: memory-search.md

# Memory Search Eval

**Status**: DRAFT
**Created**: 2026-01-30

## Capability Checks
- [ ] CC-1: Search across semantic memory
- [ ] CC-2: Support filtering by type

## Behavior Checks
- [ ] BC-1: Returns ranked results
- [ ] BC-2: Handles empty queries gracefully

## Non-Functional Checks
- [ ] NFC-1: Search completes in <2s
- [ ] NFC-2: Memory usage <100MB

## Implementation Notes
- Use parallel search for performance
- Cache frequent queries

## Verification Evidence
- Test results attached
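An eval file in this shape is easy to summarize with a script. The following is a rough sketch, assuming each check is a Markdown task-list line (`- [ ]` or `- [x]`) and that a ticked box means the check has been verified; that reading of the checkbox, the `summarize_eval` helper, and the ticked CC-2 line in the sample are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict

# Matches lines such as "- [ ] CC-1: Search across semantic memory".
CHECK_LINE = re.compile(r"^- \[( |x|X)\] ((?:CC|BC|NFC)-\d+): (.+)$")


def summarize_eval(markdown: str) -> Dict[str, Counter]:
    """Count pending and passed checks per prefix in an eval definition."""
    summary: Dict[str, Counter] = {}
    for line in markdown.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match is None:
            continue  # headings, notes, and blank lines are ignored
        mark, check_id, _description = match.groups()
        prefix = check_id.split("-")[0]
        state = "pending" if mark == " " else "passed"
        summary.setdefault(prefix, Counter())[state] += 1
    return summary


SAMPLE = """\
## Capability Checks
- [ ] CC-1: Search across semantic memory
- [x] CC-2: Support filtering by type

## Non-Functional Checks
- [ ] NFC-1: Search completes in <2s
"""

for prefix, counts in summarize_eval(SAMPLE).items():
    print(prefix, dict(counts))
```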

Integration with Orchestrator

EDD integrates with the orchestrator workflow to ensure quality-first development:

Clarify phase - Define evals
Plan phase - Review eval requirements
Implement phase - Build to eval specs
Validate phase - Verify against evals

Swarm Mode Integration (v2.81.1)

The EDD framework now supports swarm mode for parallel evaluation across multiple check types.

Auto-Spawn Configuration

When invoked via /edd, the framework automatically spawns a specialized evaluation team:

Task:
  subagent_type: "general-purpose"
  model: "sonnet"
  team_name: "edd-evaluation-team"
  name: "edd-coordinator"
  mode: "delegate"
  run_in_background: true
  prompt: |
    Execute Eval-Driven Development workflow for: $ARGUMENTS

    EDD Pattern:
    1. DEFINE - Create structured eval specifications
    2. DISTRIBUTE - Assign check types to specialists
    3. VERIFY - Validate against eval criteria
    4. CONSOLIDATE - Merge findings from all evaluators

Team Composition

| Role | Purpose | Specialization |
|------|---------|----------------|
| Coordinator | EDD workflow orchestration | Manages eval lifecycle, consolidates findings |
| Teammate 1 | Capability Checks specialist | CC- prefix: feature capabilities and functionality |
| Teammate 2 | Behavior Checks specialist | BC- prefix: expected behaviors and responses |
| Teammate 3 | Non-Functional Checks specialist | NFC- prefix: performance, security, maintainability |

Swarm Mode Workflow

User invokes: /edd "Define memory-search feature"

1. Team "edd-evaluation-team" created
2. Coordinator (edd-coordinator) receives task
3. Three teammates spawned with check-type specializations
4. Eval definition distributed:
   - Teammate 1 → Capability Checks (CC-)
   - Teammate 2 → Behavior Checks (BC-)
   - Teammate 3 → Non-Functional Checks (NFC-)
5. Teammates work in parallel (background execution)
6. Coordinator monitors progress and gathers results
7. Findings consolidated into single eval specification
8. Final eval document returned
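The distribute, run-in-parallel, consolidate shape of the steps above can be sketched in plain Python. This is only an analogy built on a thread pool: it does not use the Claude Code team or Task mechanism, and `draft_checks` is a hypothetical stand-in that returns placeholder lines where a real teammate would write checks.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Teammate names follow the task-list example; the mapping is illustrative.
SPECIALISTS = {
    "teammate-1": "CC",
    "teammate-2": "BC",
    "teammate-3": "NFC",
}


def draft_checks(teammate: str, prefix: str, feature: str) -> List[str]:
    """Stand-in for one teammate's work: return placeholder check lines."""
    return [
        f"- [ ] {prefix}-{n}: TODO define for {feature} ({teammate})"
        for n in (1, 2)
    ]


def run_swarm(feature: str) -> str:
    """Coordinator: fan out one job per check type, then merge in fixed order."""
    with ThreadPoolExecutor(max_workers=len(SPECIALISTS)) as pool:
        futures = {
            prefix: pool.submit(draft_checks, teammate, prefix, feature)
            for teammate, prefix in SPECIALISTS.items()
        }
        sections = {prefix: future.result() for prefix, future in futures.items()}
    lines = [f"# {feature} Eval"]
    for prefix in ("CC", "BC", "NFC"):
        lines.append("")
        lines.extend(sections[prefix])
    return "\n".join(lines)


print(run_swarm("memory-search"))
```

Consolidating in a fixed prefix order keeps the final document stable regardless of which worker finishes first.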

Parallel Evaluation Pattern

Each teammate focuses on their check type:

# Teammate 1: Capability Checks
CC-1: Feature can perform X
CC-2: Feature supports Y configuration
CC-3: Feature integrates with Z system

# Teammate 2: Behavior Checks
BC-1: Feature handles error case A gracefully
BC-2: Feature returns expected response for B
BC-3: Feature maintains state across C

# Teammate 3: Non-Functional Checks
NFC-1: Response time < 100ms
NFC-2: Memory usage < 50MB
NFC-3: Security vulnerability scan passes

Communication Between Teammates

Teammates use the built-in mailbox system:

# Teammate sends finding to coordinator
SendMessage:
  type: "message"
  recipient: "edd-coordinator"
  content: "CC-3 defined: Feature integrates with auth system via OAuth2"

Task List Coordination

All teammates share a unified task list:

# Location: ~/.claude/tasks/edd-evaluation-team/tasks.json

# Example tasks:
[
  {"id": "1", "subject": "Define Capability Checks", "owner": "teammate-1"},
  {"id": "2", "subject": "Define Behavior Checks", "owner": "teammate-2"},
  {"id": "3", "subject": "Define Non-Functional Checks", "owner": "teammate-3"},
  {"id": "4", "subject": "Consolidate eval specification", "owner": "edd-coordinator"}
]
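A short sketch of reading such a task list and grouping it by owner follows. It assumes the file holds a plain JSON array with `id`, `subject`, and `owner` keys as in the example above; the real on-disk schema is not confirmed here, and `tasks_by_owner` is a hypothetical helper.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Inline copy of the example tasks, so the sketch runs without the real file.
TASKS_JSON = """
[
  {"id": "1", "subject": "Define Capability Checks", "owner": "teammate-1"},
  {"id": "2", "subject": "Define Behavior Checks", "owner": "teammate-2"},
  {"id": "3", "subject": "Define Non-Functional Checks", "owner": "teammate-3"},
  {"id": "4", "subject": "Consolidate eval specification", "owner": "edd-coordinator"}
]
"""


def tasks_by_owner(raw: str) -> Dict[str, List[str]]:
    """Group task subjects by owner, preserving the order in the file."""
    grouped = defaultdict(list)
    for task in json.loads(raw):
        grouped[task["owner"]].append(task["subject"])
    return dict(grouped)


for owner, subjects in tasks_by_owner(TASKS_JSON).items():
    print(f"{owner}: {', '.join(subjects)}")
```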

Manual Override

To disable swarm mode:

/edd "Define feature X" --no-swarm

Output Location

# Evals saved to ~/.claude/evals/
ls ~/.claude/evals/

# View last eval
cat ~/.claude/evals/latest.md

Testing

Test suite: tests/test_v264_edd_framework.bats (33 tests)

Run tests:

bats tests/test_v264_edd_framework.bats

Swarm Mode Tests

Additional tests for swarm mode integration:

# Test swarm team creation
tests/edd/test-swarm-team-creation.sh

# Test parallel evaluation
tests/edd/test-parallel-evaluation.sh

Status

Current: Framework defined with swarm mode integration (v2.81.1)
Note: TEMPLATE.md and evals directory structure ready for use

Version: v2.64 | Status: DRAFT | Tests: 33 passing

Action Reporting (v2.93.0)

This skill generates complete automatic reports for traceability:

Automatic Report

When this skill completes, the following are generated automatically:

In the Claude conversation: Visible results
In the repository: docs/actions/edd/{timestamp}.md
JSON metadata: .claude/metadata/actions/edd/{timestamp}.json

Report Contents

Each report includes:

✅ Summary: Description of the executed task
✅ Execution Details: Duration, iterations, modified files
✅ Results: Errors found, recommendations
✅ Next Steps: Suggested next actions

View Previous Reports

# List all reports for this skill
ls -lt docs/actions/edd/

# View the most recent report
cat "$(ls -t docs/actions/edd/*.md | head -1)"

# Find failed reports
grep -l "Status: FAILED" docs/actions/edd/*.md
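The same lookups can be scripted where an empty or missing report set should not produce a shell error. A minimal Python sketch, assuming reports are `.md` files in a single directory and that failed runs contain the literal text `Status: FAILED`, as the grep pattern suggests; `latest_report` and `failed_reports` are hypothetical helpers, not part of action-report-lib.sh.

```python
from pathlib import Path
from typing import List, Optional


def latest_report(report_dir: Path) -> Optional[Path]:
    """Return the most recently modified .md report, or None if there is none."""
    reports = sorted(report_dir.glob("*.md"), key=lambda p: p.stat().st_mtime)
    return reports[-1] if reports else None


def failed_reports(report_dir: Path) -> List[Path]:
    """Return reports whose text contains the marker 'Status: FAILED'."""
    return sorted(
        path
        for path in report_dir.glob("*.md")
        if "Status: FAILED" in path.read_text(encoding="utf-8", errors="replace")
    )


report_dir = Path("docs/actions/edd")
if report_dir.is_dir():
    print("latest:", latest_report(report_dir))
    print("failed:", [p.name for p in failed_reports(report_dir)])
```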

Manual Generation (Optional)

source .claude/lib/action-report-lib.sh
start_action_report "edd" "Task description"
# ... execution ...
complete_action_report "success" "Summary" "Recommendations"

System References

Action Reports System - Complete documentation
action-report-lib.sh - Helper library
action-report-generator.sh - Generator


Eval-Driven Development (EDD) Framework v2.87.0 - Define-before-implement pattern with structured evals. Provides workflow: Define specifications → Implement features → Verify against evals. Components: TEMPLATE.md for eval definitions, edd.sh CLI script, /edd skill invocation. Check types: CC- (Capability), BC- (Behavior), NFC- (Non-Functional). Integrates with orchestrator workflow for quality-first development. Keywords: evals, define, implement, verify, capability checks, behavior checks, non-functional checks, template, quality assurance, test-driven, specification. Use when: defining new features with structured evals, implementing with verification requirements, creating quality specifications, TDD-style workflow with evals.

121 stars

tools

Updated Apr 15, 2026


npx skillsauth add alfredolopez80/multi-agent-ralph-loop edd

Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.

Security Scan Results

3 of 9 scanners reported clean

Some scanners were skipped, did not run, or reported a non-clean status. Review each row below.

| Scanner | Description | Status | Reported % |
|---------|-------------|--------|------------|
| Trivy | Container and dependency vulnerability scanner | Clean | 95% |
| Semgrep | Static code analysis for vulnerabilities | Clean | 95% |
| mcp-scan (Snyk) | Model Context Protocol security validation | Clean | 95% |
| Snyk (dep) | Open source security scanning | Skipped | 50% |
| Socket.dev | Supply chain security analysis | Skipped | 50% |
| VirusTotal | Multi-engine malware detection | Skipped | 50% |
| CrowdStrike | Advanced threat intelligence | Skipped | 50% |
| OSV-Scanner | Open Source Vulnerability database check | Skipped | 50% |
| OWASP Dep-Check | | Skipped | 50% |

Last scanned: Apr 15, 2026, 6:06 PM | 13.5s | 1 file scanned

SKILL.md

# VERSION: 3.0.0
name: edd


Related Skills

alfredolopez80/vault

development

Verified | Trusted | Community

Living knowledge base management. Actions: search (query vault), save (store learning), index (update indices), compile (raw->wiki->rules graduation), init (create vault structure). Follows Karpathy pipeline: ingest->compile->query. Use when: (1) searching accumulated knowledge, (2) saving learnings, (3) compiling raw notes into wiki, (4) initializing a new vault. Triggers: /vault, 'vault search', 'knowledge base', 'save learning'.

121 stars | SKILL.md | Updated Apr 15, 2026

alfredolopez80/spec

testing

Verified | Trusted | Community

Produce a verifiable technical specification before coding. 6 mandatory sections: Interfaces, Behaviors, Invariants (from Aristotle Phase 2), File Plan, Test Plan, Exit Criteria (executable bash commands + expected results). Use when: (1) before implementing features with complexity > 4, (2) as Step 1.5 in orchestrator workflow, (3) when requirements need formalization. Triggers: /spec, 'create spec', 'write specification', 'technical spec'.

121 stars | SKILL.md | Updated Apr 15, 2026

alfredolopez80/ship

testing

Verified | Trusted | Community

Pre-launch shipping checklist orchestrating /gates, /security, /browser-test, /perf. Ensures nothing ships without passing all quality checks. Use when: (1) before deploying, (2) before merging to main, (3) before release. Triggers: /ship, 'ship it', 'ready to deploy', 'pre-launch check'.

121 stars | SKILL.md | Updated Apr 15, 2026

alfredolopez80/perf

development

Verified | Trusted | Community

Performance optimization skill. Core Web Vitals via Lighthouse, bundle size analysis, metrics tracking over time. Use when: (1) optimizing frontend performance, (2) analyzing bundle size, (3) tracking metrics regression. Triggers: /perf, 'performance audit', 'core web vitals', 'bundle size'.

121 stars | SKILL.md | Updated Apr 15, 2026


Install manually


# Clone the repo
git clone https://github.com/alfredolopez80/multi-agent-ralph-loop.git

# Copy into Claude Code skills folder (global)
cp -r multi-agent-ralph-loop/.claude/skills/edd ~/.claude/skills/

Claude Code Skills — official skills path docs.

Repository

alfredolopez80/multi-agent-ralph-loop

121 stars

Compatible with

Claude Code

OpenAI Codex CLI

ChatGPT