Assay

Overview

All subagent dispatches are disk-mediated. See shared/dispatch-convention.md for the full protocol.

Evaluate competing approaches against codebase constraints. Returns a structured Assay Report with a recommendation, alternatives with kill criteria, and confidence scoring. Evidence-grounded — recommendations cite specific file:line references, not generic best practices.

Skill type: Rigid — follow exactly, no shortcuts.

Models:

Evaluator agent: Opus (synthesis/judgment work needs the best model)
Orchestrator: runs on whatever model the session uses

Announce at start: "I'm using the assay skill to evaluate competing approaches."

Name origin: In metallurgy, an assay tests raw material to determine its quality and composition before committing it to the forge.

Invocation API

/assay
  question: "How should the auth middleware handle token refresh?"
  context: { ... }
  decision_type: "architecture"
  approaches: [...]
  cascading_decisions: [...]

Parameters

question (required) — The decision or question to evaluate. One clear sentence.

context (required) — Evidence for the evaluator to reason against. Accepts different shapes depending on the caller:

| Caller | Context Shape | Key Fields |
|---|---|---|
| /design | Recon brief + agent findings | project_structure, existing_patterns, scope_boundaries, prior_art |
| /spec | Recon brief + agent findings (autonomous) | project_structure, existing_patterns, scope_boundaries, prior_art |
| /migrate | Recon brief + migration analysis | project_structure, migration_target, breaking_changes, blast_radius |
| Generic caller | Freeform evidence | description (string): unstructured context, lower confidence |

When context contains unrecognized keys, the evaluator treats them as additional evidence. When context is a bare string, the orchestrator treats it as { "description": context }.
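The normalization rule above can be sketched in a few lines. This is only an illustration: the function name `normalize_context` is hypothetical and not part of the skill.

```python
from typing import Any, Dict, Union


def normalize_context(context: Union[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Return context as a dict, wrapping a bare string as {"description": ...}."""
    if isinstance(context, str):
        return {"description": context}
    # Unrecognized keys are kept untouched so the evaluator can read them
    # as additional evidence.
    return dict(context)
```

A structured context passes through unchanged, so callers such as /design can keep sending recon briefs as objects.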

decision_type (optional) — architecture | strategy | diagnosis | optimization. Auto-detected from the question if omitted. Defaults to architecture when ambiguous.

approaches (optional) — Array of { name, description } candidates to evaluate. When omitted, the evaluator generates 2-4 candidates from the question and context.

cascading_decisions (optional) — Array of { decision, reasoning } representing prior decisions. Treated as hard constraints — the evaluator cannot modify or challenge them. Conflicts are reported in prior_decision_conflicts.

The Process

Phase 1: Input Validation

Verify question is present and non-empty
Verify context is present (object or string)
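If decision_type is provided, validate it's one of the 4 recognized values; if it is not, warn and default to architecture
If approaches is provided, verify it's an array with at least 2 entries, each having name and description; with fewer than 2 entries, ignore it and let the evaluator generate candidates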
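A minimal sketch of these checks, assuming the request arrives as a plain dict. The name `validate_input` and the message strings are hypothetical. The soft-failure behavior follows the Error Handling table; treating entries that lack name or description the same way as a too-short array is an assumption, since the skill only specifies the fewer-than-2 case.

```python
from typing import Any, Dict, List, Optional, Tuple

DECISION_TYPES = {"architecture", "strategy", "diagnosis", "optimization"}


def validate_input(request: Dict[str, Any]) -> Tuple[Optional[Dict[str, Any]], List[str]]:
    """Return (cleaned_request, messages); cleaned_request is None on a hard error."""
    question = request.get("question")
    if not isinstance(question, str) or not question.strip():
        return None, ["error: question is required and must be non-empty"]
    if not isinstance(request.get("context"), (dict, str)):
        return None, ["error: context is required (object or string)"]

    cleaned = dict(request)
    messages: List[str] = []

    decision_type = request.get("decision_type")
    if decision_type is not None and (
        not isinstance(decision_type, str) or decision_type not in DECISION_TYPES
    ):
        # Soft failure: warn and fall back to the default type.
        messages.append("warning: invalid decision_type, defaulting to architecture")
        cleaned["decision_type"] = "architecture"

    approaches = request.get("approaches")
    if approaches is not None:
        usable = (
            isinstance(approaches, list)
            and len(approaches) >= 2
            and all(
                isinstance(a, dict) and "name" in a and "description" in a
                for a in approaches
            )
        )
        if not usable:
            # Assumption: malformed entries are handled like a too-short array.
            messages.append("warning: approaches ignored, evaluator will generate candidates")
            cleaned.pop("approaches")

    return cleaned, messages
```

Only a missing question or context stops the run; the other two problems downgrade to warnings so the evaluator can still be dispatched.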

Phase 2: Dispatch Evaluator

Dispatch a single Opus agent using skills/assay/assay-evaluator-prompt.md.

Fill template placeholders before writing the dispatch file:

{{QUESTION}} — the decision question
{{CONTEXT}} — the full context object/string
{{DECISION_TYPE}} — the decision type (provided or "auto-detect")
{{APPROACHES}} — the approaches array (or "Generate 2-4 candidates")
{{CASCADING_DECISIONS}} — cascading decisions array (or "None")
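The placeholder substitution could look like the sketch below. The function name `fill_template` and the choice to serialize non-string values as indented JSON are assumptions; the actual contents of assay-evaluator-prompt.md and the dispatch-file format are not shown here.

```python
import json
import re
from typing import Any, Dict, List, Optional

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def fill_template(
    template: str,
    question: str,
    context: Any,
    decision_type: Optional[str] = None,
    approaches: Optional[List[Dict[str, str]]] = None,
    cascading_decisions: Optional[List[Dict[str, str]]] = None,
) -> str:
    """Substitute the five documented placeholders in a single pass."""

    def render(value: Any) -> str:
        # Strings go in verbatim; objects and arrays are serialized as JSON.
        return value if isinstance(value, str) else json.dumps(value, indent=2)

    values = {
        "QUESTION": question,
        "CONTEXT": render(context),
        "DECISION_TYPE": decision_type or "auto-detect",
        "APPROACHES": render(approaches) if approaches else "Generate 2-4 candidates",
        "CASCADING_DECISIONS": render(cascading_decisions) if cascading_decisions else "None",
    }
    # A single regex pass means placeholder-like text inside a substituted
    # value is not expanded again; unknown placeholders are left as-is.
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), m.group(0)), template)
```

Omitted optional parameters fall back to the literal strings listed above ("auto-detect", "Generate 2-4 candidates", "None").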

Phase 3: Validate Output

Parse the evaluator's response as JSON. Validate:

All required fields present: decision_type, confidence, missing_information, recommended, alternatives, prior_decision_conflicts
recommended has: name, rationale, evidence, risks, kill_criteria, constraint_fit
Each alternative has: name, constraint_fit, pros, cons, would_recommend_if
constraint_fit objects have: pattern_alignment, scope_fit, reversibility, integration_risk
confidence is one of: high, medium, low

On validation failure: Retry once with the validation errors as feedback. On second failure, return:

{ "error": "Evaluator produced invalid output after retry", "raw_output": "..." }
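The field checks listed in this phase can be sketched as a function that returns a list of validation errors, which is then usable as retry feedback. The names `validate_report` and `_missing` and the wording of the error strings are hypothetical; only the required field names come from the lists above.

```python
import json
from typing import Any, List, Tuple

TOP_LEVEL = ["decision_type", "confidence", "missing_information",
             "recommended", "alternatives", "prior_decision_conflicts"]
RECOMMENDED = ["name", "rationale", "evidence", "risks", "kill_criteria", "constraint_fit"]
ALTERNATIVE = ["name", "constraint_fit", "pros", "cons", "would_recommend_if"]
CONSTRAINT_FIT = ["pattern_alignment", "scope_fit", "reversibility", "integration_risk"]
CONFIDENCE = {"high", "medium", "low"}


def _missing(obj: Any, keys: List[str], where: str) -> List[str]:
    """List the required keys absent from obj, or flag obj as a non-object."""
    if not isinstance(obj, dict):
        return [f"{where} must be an object"]
    return [f"{where} is missing '{k}'" for k in keys if k not in obj]


def validate_report(raw: str) -> Tuple[Any, List[str]]:
    """Parse the evaluator response and return (report, errors)."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"response is not valid JSON: {exc}"]
    if not isinstance(report, dict):
        return None, ["report must be an object"]

    errors = _missing(report, TOP_LEVEL, "report")

    if "recommended" in report:
        rec = report["recommended"]
        errors += _missing(rec, RECOMMENDED, "recommended")
        if isinstance(rec, dict) and "constraint_fit" in rec:
            errors += _missing(rec["constraint_fit"], CONSTRAINT_FIT,
                               "recommended.constraint_fit")

    alternatives = report.get("alternatives", [])
    if not isinstance(alternatives, list):
        errors.append("alternatives must be an array")
        alternatives = []
    for i, alt in enumerate(alternatives):
        where = f"alternatives[{i}]"
        errors += _missing(alt, ALTERNATIVE, where)
        if isinstance(alt, dict) and "constraint_fit" in alt:
            errors += _missing(alt["constraint_fit"], CONSTRAINT_FIT,
                               f"{where}.constraint_fit")

    confidence = report.get("confidence")
    if "confidence" in report and (
        not isinstance(confidence, str) or confidence not in CONFIDENCE
    ):
        errors.append("confidence must be one of: high, medium, low")

    return report, errors
```

An empty error list means the report can be returned as-is in Phase 4.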

Phase 4: Return Report

Return the validated Assay Report to the caller.

Decision Type Adaptation

The evaluator adapts scoring weights based on decision type:

| Type | Primary Weight | Secondary Weight |
|---|---|---|
| architecture | Reversibility, constraint fit | Long-term cost, extensibility |
| strategy | Risk, phasing | Blast radius, team capacity |
| diagnosis | Evidence strength, testability | Explanation coverage, simplicity |
| optimization | Measurable improvement | Disruption cost, reversibility |

Output: Assay Report

{
  "decision_type": "architecture",
  "confidence": "high",
  "missing_information": [],
  "recommended": {
    "name": "Event-driven via message bus",
    "rationale": "Aligns with existing src/events/bus.ts pattern...",
    "evidence": ["src/events/bus.ts:14 — existing event dispatch"],
    "risks": ["Adds async complexity to currently synchronous flow"],
    "kill_criteria": "Switch away if latency requirements exceed 50ms p99",
    "constraint_fit": {
      "pattern_alignment": "high",
      "scope_fit": "high",
      "reversibility": "two-way door",
      "integration_risk": "low"
    }
  },
  "alternatives": [
    {
      "name": "Direct service calls",
      "constraint_fit": {
        "pattern_alignment": "medium",
        "scope_fit": "high",
        "reversibility": "one-way door",
        "integration_risk": "medium"
      },
      "pros": ["Simpler mental model", "Synchronous"],
      "cons": ["Tight coupling", "Requires shared deployment"],
      "would_recommend_if": "Latency is critical or team prefers simplicity"
    }
  ],
  "prior_decision_conflicts": []
}

Confidence Scoring

| Level | Criteria |
|---|---|
| high | One approach clearly dominates on all weighted dimensions |
| medium | Two viable options with trade-offs that depend on priority |
| low | Need more information; missing_information lists what would help |

Evidence Grounding

Every recommendation must cite specific evidence from the context:

File:line references from recon briefs
Specific pattern names from the codebase
Concrete constraint violations or alignments

"This is the industry standard approach" is NOT evidence. "This aligns with how src/api/routes/users.ts already handles it" IS evidence.

Without a recon brief, evidence cites the caller's context. Confidence scores skew lower.

Kill Criteria

kill_criteria on recommended approach: condition that would flip the recommendation
would_recommend_if on each alternative: condition that would make it the recommendation

These make decisions revisitable without re-running the full analysis.

Error Handling

| Failure | Behavior |
|---|---|
| Missing question or context | Return error immediately, no dispatch |
| Evaluator returns invalid JSON | Retry once with validation errors. Second failure returns { "error": ... } |
| Evaluator timeout | Return { "error": "Evaluator timed out" } |
| Invalid decision_type | Warn and default to architecture |
| approaches has fewer than 2 entries | Ignore provided approaches, let evaluator generate candidates |
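The retry and timeout rows can be read as a small control loop. In the sketch below, `dispatch` and `validate` are hypothetical callables supplied by the caller: `dispatch(prompt, feedback)` stands in for the disk-mediated dispatch and is assumed to raise `TimeoutError` on timeout, and `validate(raw)` is assumed to return `(report, errors)`.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Dispatcher = Callable[[str, Optional[List[str]]], str]
Validator = Callable[[str], Tuple[Any, List[str]]]


def run_assay(prompt: str, dispatch: Dispatcher, validate: Validator) -> Dict[str, Any]:
    """Dispatch once, retry once with validation feedback, then give up."""
    feedback: Optional[List[str]] = None
    raw = ""
    for _attempt in range(2):  # first try plus exactly one retry
        try:
            raw = dispatch(prompt, feedback)
        except TimeoutError:
            return {"error": "Evaluator timed out"}
        report, errors = validate(raw)
        if not errors:
            return report
        feedback = errors  # passed to the evaluator on the retry
    return {"error": "Evaluator produced invalid output after retry", "raw_output": raw}
```

Keeping the loop bounded at two attempts matches the "one dispatch, one report" intent: the retry only repairs malformed output, it does not re-open the analysis.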

Integration

Called by

| Skill | Decision Type | Context Source | Approaches |
|---|---|---|---|
| /design | architecture | Recon brief + cascading decisions | Evaluator generates |
| /spec | architecture | Recon brief + cascading decisions (autonomous, confidence routing) | Evaluator generates |
| /migrate | strategy | Recon brief + migration analysis | Evaluator generates |

Not called by (investigated, not a fit): /debugging (hypothesis evaluation uses quality-gate, not assay), /prospector (competing design evaluation is more sophisticated than assay for this use case). See #147 for rationale.

Consumer Dispatch Examples

From /design:

/assay
  question: "How should components communicate in the new auth module?"
  context: { recon brief with project_structure, existing_patterns }
  decision_type: "architecture"
  cascading_decisions: [{ decision: "Using Redis for session store", reasoning: "..." }]

From /spec:

/assay
  question: "How should the auth middleware handle token refresh?"
  context: { recon brief + investigation findings }
  decision_type: "architecture"
  cascading_decisions: [{ decision: "Using Redis for session store", reasoning: "..." }]

/spec consumes assay output autonomously, routing on confidence: high = accept, medium = terminal alert, low = block alert.
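The confidence routing used by /spec amounts to a lookup on the report's confidence field. A minimal sketch, where the function name `route_report` and the action labels are hypothetical, and routing error objects or unknown values to the most conservative action is an assumption the skill does not state:

```python
from typing import Any, Dict

ROUTES = {"high": "accept", "medium": "terminal_alert", "low": "block_alert"}


def route_report(report: Dict[str, Any]) -> str:
    """Map an Assay Report to the action /spec takes on it."""
    if "error" in report:
        # Assumption: an error object is treated like low confidence.
        return "block_alert"
    confidence = report.get("confidence")
    if not isinstance(confidence, str):
        return "block_alert"
    # Assumption: an unrecognized level also blocks.
    return ROUTES.get(confidence, "block_alert")
```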

From /migrate:

/assay
  question: "What migration strategy minimizes risk for the React 18→19 upgrade?"
  context: { recon brief + migration_target: "React 19", breaking_changes: [...] }
  decision_type: "strategy"

Standalone Usage

/assay question: "Should we use PostgreSQL or SQLite for this project?"
  context: "Small team, <10K users, read-heavy workload, deployed on single server"

Dispatches

Evaluator agent (Opus) via skills/assay/assay-evaluator-prompt.md

Does NOT

Investigate the codebase (that's /recon)
Challenge prior decisions (that's /design's Challenger agent)
Make the decision for the user (it recommends; the caller decides)
Iterate or loop (one dispatch, one report)
