Announce: "Using ds-verify (Phase 5) to confirm reproducibility and completion."

The Iron Law of DS Verification
Verification Facts
The Verification Gate
Verification Checklist
Reproducibility Demonstration
Claims Requiring Evidence
Insufficient Evidence
Required Output Structure
Completion Criteria

Context Monitoring

| Level | Remaining Context | Action |
|-------|-------------------|--------|
| Normal | >35% | Proceed normally |
| Warning | >25% to 35% | Complete current review cycle, then trigger ds-handoff |
| Critical | ≤25% | Immediately trigger ds-handoff; do not start new review cycles |
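
The thresholds in the table can be read as a simple lookup. The sketch below is illustrative only: the function name `context_action` and the idea of passing remaining context as a percentage are assumptions, not part of the skill. A value of exactly 25% is treated as Critical, following the ≤25% row.

```python
def context_action(remaining_pct: float) -> str:
    """Map remaining context (percent) to the monitoring level in the table.

    Hypothetical helper; the skill itself does not define this function.
    """
    if remaining_pct > 35:
        return "Normal"      # proceed normally
    if remaining_pct > 25:
        return "Warning"     # finish the current review cycle, then hand off
    return "Critical"        # hand off immediately, start no new cycles


for pct in (60, 30, 25):
    print(pct, context_action(pct))
```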

Verification Gate

Final verification with reproducibility checks and user acceptance interview.

<EXTREMELY-IMPORTANT> ## The Iron Law of DS Verification

NO COMPLETION CLAIMS WITHOUT FRESH VERIFICATION. This is not negotiable.

Load shared enforcement first.

Auto-load all constraints matching applies-to: ds-verify:

!uv run python3 ${CLAUDE_SKILL_DIR}/../../scripts/load-constraints.py ds-verify

You MUST have these constraints loaded before proceeding. No claiming you "remember" them.

Before claiming analysis is complete, you MUST:

RE-RUN - Execute analysis fresh (not cached results)
CHECK - Verify outputs match expectations
REPRODUCE - Confirm results are reproducible
ASK - Interview user about constraints and acceptance
Only THEN claim completion

This applies even when:

"I just ran it"
"Results look the same"
"It should reproduce"
"User seemed happy earlier"

If you are about to claim COMPLETE without a fresh re-run in this session, STOP: you would be shipping unverified results that waste the user's time.

</EXTREMELY-IMPORTANT>

Verification Facts

Review checks methodology; verify checks reproducibility. They are different gates — a passed review is not evidence the analysis reproduces.
Reproducibility means re-running, not re-reading: prior results don't prove current reproducibility, because code, data, or environment may have changed since. Reading cached output proves nothing.
Self-verification inherits the implementer's biases, so the reproducibility demonstration must be run fresh by an independent agent, not vouched for by its author. "Verified" without re-execution is an unverified claim.
A 10-minute reproducibility check is cheap compared with the 10 days of debugging that follow when someone else cannot run the analysis; skipping it to deliver faster is a false economy.
"User will be happy" is an assumption, not acceptance — completion requires the user's explicit confirmation, not your prediction of it.

Static Analysis (Constraint Check Scripts)

Before running runtime DQ checks, run the static analysis constraint check suite:

bash "${CLAUDE_SKILL_DIR}/../../scripts/check-all-ds.sh" "$(pwd)"

This runs all DS constraint check scripts (determinism, join audits, idempotency, error handling, schema contracts, standard errors, visualization integrity).

If any check FAILS: Report the failures in LEARNINGS.md. These are code quality issues in the analysis scripts that must be fixed before proceeding. Dispatch a fix subagent if needed.

This re-run is defense-in-depth. The same check-all-ds.sh floor is hook-enforced one phase upstream at ds-validate (mechanical-floor-gate.py, FLOOR=ds). ds-verify deliberately carries no hook for it: its only fan-out is a single fresh-Agent reproducibility check, and gating Agent here would deadlock against ds-no-main-chat-code-guard (a failing floor could not be fixed — fixes require an Agent).

If all checks PASS: Proceed to runtime DQ checks.

The Verification Gate

Checkpoint type: decision (user confirms results — cannot auto-advance)

Before making ANY completion claim, follow this flowchart.

This flowchart IS the specification. If prose elsewhere and this diagram disagree, the diagram wins.

        ┌──────────────────────────────┐
        │ 1. RE-RUN (fresh, not cached) │
        └──────────────┬───────────────┘
                       ▼
        ┌──────────────────────────────┐
        │ 2. CHECK vs success criteria  │
        └──────────────┬───────────────┘
                  pass? │
            ┌───── no ──┴── yes ─────┐
            ▼                        ▼
   ┌─────────────────┐   ┌──────────────────────────┐
   │ NEEDS WORK →    │   │ 3. REPRODUCE             │
   │ log + dispatch  │   │ (same inputs→same outputs)│
   │ fix subagent    │   └────────────┬─────────────┘
   └────────┬────────┘        match?  │
            │           ┌──── no ──────┴── yes ───┐
            │           ▼                         ▼
            │  ┌─────────────────┐   ┌─────────────────────────┐
            │  │ NEEDS WORK →    │   │ 4. ASK — user           │
            │  │ non-determinism │   │ acceptance interview    │
            │  │ is a defect     │   └───────────┬─────────────┘
            │  └────────┬────────┘    accept?     │
            │           │      ┌── no/partial ────┴── yes ──┐
            │           │      ▼                            ▼
            │           │  ┌──────────────────┐  ┌────────────────────┐
            └───────────┴─▶│ loop: ds-fix /   │  │ 5. CLAIM COMPLETE  │
                           │ ds-implement,    │  │ (only after 1-4)   │
                           │ then re-verify   │  └────────────────────┘
                           └──────────────────┘

Skipping any step is not verification. Reaching step 5 without passing 1-4 is a false completion claim.
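
As a reading aid, the ordering in the flowchart can be sketched as a function that returns the first verdict reached. This is a hedged illustration, not part of the skill: the name `verification_gate`, its parameters, and the returned strings are assumptions, and the flowchart remains the specification.

```python
def verification_gate(rerun_fresh: bool, checks_pass: bool,
                      outputs_match: bool, user_answer: str) -> str:
    """Walk steps 1-4 in order; only a clean pass through all of them
    reaches step 5. All names here are hypothetical."""
    if not rerun_fresh:
        return "NOT VERIFIED: step 1 (fresh re-run) was skipped"
    if not checks_pass:
        return "NEEDS WORK: log and dispatch fix subagent"
    if not outputs_match:
        return "NEEDS WORK: non-determinism is a defect"
    if user_answer != "yes":  # "no" and "partial" both loop back
        return "NEEDS WORK: loop to ds-fix / ds-implement, then re-verify"
    return "COMPLETE"


print(verification_gate(True, True, True, "partial"))
print(verification_gate(True, True, True, "yes"))
```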

Visual Diagnostics for Verification

When presenting verification results to the user in the acceptance interview, generate diagnostic plots to support the decision:

| Verification Check | Diagnostic to Generate |
|--------------------|------------------------|
| Reproducibility comparison | Overlay plot of Run 1 vs Run 2 key outputs |
| Data integrity | Pipeline waterfall chart (input rows → cleaning → joins → final) |
| Distribution sanity | Histogram/density plots of key variables with expected ranges annotated |
| Model performance | ROC curve, residual plot, or coefficient comparison (as appropriate) |

Format: Inline plots in notebooks, or saved to scratch/diagnostics/ for script-based workflows. Present alongside the acceptance interview questions.

Verification Checklist

Technical Verification

Outputs Match Expectations

[ ] All required outputs generated
[ ] Output formats correct (files, figures, tables)
[ ] Numbers are reasonable (sanity checks)
[ ] Visualizations render correctly

Reproducibility Confirmed

[ ] Ran analysis twice, got same results
[ ] Random seeds produce consistent output
[ ] No dependency on execution order
[ ] Environment documented (packages, versions)
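
One way to cover the "Environment documented" item is to record interpreter and package versions at verification time. The sketch below uses only the standard library; the helper name `environment_snapshot` and the package names passed to it are placeholders, not something the skill prescribes.

```python
import platform
from importlib import metadata


def environment_snapshot(packages):
    """Return the Python version and installed versions of the named packages.

    Hypothetical helper: pass whichever packages the analysis depends on.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return {"python": platform.python_version(), "packages": versions}


# Example package names only; substitute the project's real dependencies.
snapshot = environment_snapshot(["numpy", "pandas"])
print(snapshot)
```

The resulting dictionary maps directly onto the Environment section of the verification report.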

Data Integrity

[ ] Input data unchanged
[ ] Row counts traceable through pipeline
[ ] No silent data loss or corruption
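
The data integrity items can be checked mechanically. A minimal sketch, assuming file-based inputs: hash each input before and after the run to show it is unchanged, and log row counts per stage so any drop is visible. The helper names, the temporary file, and the row counts are all illustrative, not taken from a real analysis.

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def row_count_trail(stages):
    """Given (stage, rows) pairs, return (stage, rows, change_vs_previous)."""
    trail, previous = [], None
    for name, rows in stages:
        trail.append((name, rows, 0 if previous is None else rows - previous))
        previous = rows
    return trail


# Demo with a throwaway input file; real inputs would be the project's data.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "input.csv"
    raw.write_text("id,value\n1,10\n2,20\n")
    before = file_sha256(raw)
    # ... the analysis would run here and must only read the input ...
    after = file_sha256(raw)
    input_unchanged = before == after

# Illustrative row counts, not real results.
for name, rows, change in row_count_trail(
    [("input", 1000), ("cleaning", 950), ("joins", 950), ("final", 900)]
):
    print(f"{name:<10} {rows:>6} {change:+d}")
```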

Trace to Requirements: For each success criterion, reference its requirement ID (e.g., "DATA-01: Panel has 50K+ firm-years - VERIFIED with df.shape output"). This keeps end-to-end traceability from SPEC.md through PLAN.md and VALIDATION.md to verification.

User Acceptance Interview

CRITICAL: Before claiming completion, conduct user interview.

Step 1: Replication Constraints

AskUserQuestion:
  question: "Were there specific methodology requirements I should have followed?"
  options:
    - label: "Yes, replicating existing analysis"
      description: "Results should match a reference"
    - label: "Yes, required methodology"
      description: "Specific methods were mandated"
    - label: "No constraints"
      description: "Methodology was flexible"

If replicating:

Ask for reference to compare against
Verify results match within tolerance
Document any deviations and reasons
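
"Match within tolerance" is easier to audit when the comparison is explicit. The following is a sketch under stated assumptions: results and reference are flat dictionaries of numbers, and the tolerance values are placeholders to be agreed with the user. The function name and the sample figures are hypothetical.

```python
import math


def compare_to_reference(results, reference, rel_tol=1e-6, abs_tol=0.0):
    """Return {key: (actual, expected)} for every value outside tolerance.

    An empty dict means all reference values were matched.
    """
    deviations = {}
    for key, expected in reference.items():
        actual = results.get(key)
        if actual is None or not math.isclose(
            actual, expected, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            deviations[key] = (actual, expected)
    return deviations


# Made-up numbers for illustration only.
reference = {"beta": 0.5123, "n_obs": 50000}
results = {"beta": 0.5124, "n_obs": 50000}
print(compare_to_reference(results, reference, rel_tol=1e-3))
print(compare_to_reference(results, reference, rel_tol=1e-6))
```

Any non-empty result is what gets documented under "deviations and reasons".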

Step 2: Results Verification

AskUserQuestion:
  question: "Do these results answer your original question?"
  options:
    - label: "Yes, fully"
      description: "Analysis addresses the core question"
    - label: "Partially"
      description: "Some aspects addressed, others missing"
    - label: "No"
      description: "Does not answer the question"

If "Partially" or "No":

Ask which aspects are missing
Return to /ds-implement to address gaps
Re-run verification

Step 3: Output Format

AskUserQuestion:
  question: "Are the outputs in the format you need?"
  options:
    - label: "Yes"
      description: "Format is correct"
    - label: "Need adjustments"
      description: "Format needs modification"

Step 4: Confidence in Results

AskUserQuestion:
  question: "Do you have any concerns about the methodology or results?"
  options:
    - label: "No concerns"
      description: "Comfortable with approach and results"
    - label: "Minor concerns"
      description: "Would like clarification on some points"
    - label: "Major concerns"
      description: "Significant issues need addressing"

Reproducibility Demonstration

MANDATORY: Demonstrate reproducibility before completion.

<EXTREMELY-IMPORTANT> ## Independent Verification Required

You MUST NOT verify your own work. Spawn a fresh Task agent for reproducibility.

The implementer shares biases and sunk-cost attachment. A fresh subagent sees only the spec and outputs — it verifies without context pollution.

If you're about to re-run the analysis yourself, STOP. Dispatch a Task agent. </EXTREMELY-IMPORTANT>

Dispatch a fresh Task agent to run the reproducibility check:

All paths below are relative to this skill's base directory.

Agent(subagent_type="general-purpose",
  allowed_tools=["Read", "Glob", "Grep", "Bash(read-only)"],
  prompt="""
# Reproducibility Verification

**Tool Restrictions:** The verifier is READ-ONLY. It re-runs analyses and checks output but MUST NOT modify notebooks, scripts, or code. It MUST NOT use Write or Edit.

Verify this analysis produces consistent results from a fresh run.

## Context
- Read .planning/SPEC.md for objectives and success criteria
- Read .planning/PLAN.md for expected outputs
- Read .planning/LEARNINGS.md for pipeline documentation

## Shared Checks
Read the shared check definitions:
Read `${CLAUDE_SKILL_DIR}/../../skills/ds-implement/references/ds-checks.md` and follow its instructions.

Run checks: DQ1-DQ4, DQ6, COV, M1, R1

## Reproducibility Protocol

### For scripts:
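```python
import hashlib

def result_digest(result) -> str:
    # Built-in hash() is salted per interpreter session for strings, so its
    # values cannot be compared across sessions; SHA-256 is stable.
    # Note: str() truncates large DataFrames/arrays; serialize those in full.
    return hashlib.sha256(str(result).encode("utf-8")).hexdigest()

# Run 1 (run_analysis is the project's own entry point)
result1 = run_analysis(seed=42)
hash1 = result_digest(result1)

# Run 2
result2 = run_analysis(seed=42)
hash2 = result_digest(result2)

# Verify
assert hash1 == hash2, "Results not reproducible!"
print(f"Reproducibility confirmed: {hash1} == {hash2}")
```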

### For notebooks:

jupyter nbconvert --to notebook --execute --output executed.ipynb notebook.ipynb
papermill notebook.ipynb output.ipynb -p seed 42

## Required Checks

- RE-RUN: Execute analysis fresh (not cached results)
- CHECK: Verify outputs match SPEC.md success criteria
- REPRODUCE: Same inputs → same outputs (run twice, compare hashes)
- DATA INTEGRITY: Input data unchanged, row counts traceable
- COVERAGE (COV): each windowed source spans the Required window of every task reading it; gaps dispositioned

## Output

Report:

- Reproducibility: PASS/FAIL (with hash comparison)
- Data quality checks: DQ1-DQ4, DQ6 results
- Sample-period coverage: COV result (each windowed source spans its Required window; gaps dispositioned)
- Spec compliance: M1 result
- Any discrepancies found
""")


**Post-subagent boundary (C5):** After verification agent returns, read its report only. Do NOT read source code, notebooks, or data files yourself. If FAIL, dispatch a fresh investigation subagent.

**If Task agent reports FAIL:** Dispatch a fresh Task agent to investigate the discrepancy. Do NOT investigate yourself — that violates the post-subagent boundary (C5 from ds-common-constraints.md).

## Claims Requiring Evidence

| Claim | Required Evidence |
|-------|-------------------|
| "Analysis complete" | All success criteria verified |
| "Results reproducible" | Same output from fresh run |
| "Matches reference" | Comparison showing match |
| "Data quality handled" | Documented cleaning steps |
| "Methodology appropriate" | Assumptions checked |

## Insufficient Evidence

These do NOT count as verification:

- Previous run results (must be fresh)
- "Should be reproducible" (demonstrate it)
- Visual inspection only (quantify where possible)
- Single run (need reproducibility check)
- Skipped user acceptance (must ask)

## Required Output Structure

```markdown
## Verification Report: [Analysis Name]

### Technical Verification

#### Outputs Generated
- [ ] Output 1: [location] - verified [date/time]
- [ ] Output 2: [location] - verified [date/time]

#### Reproducibility Check
- Run 1 hash: [value]
- Run 2 hash: [value]
- Match: YES/NO

#### Environment
- Python: [version]
- Key packages: [list with versions]
- Random seed: [value]

### User Acceptance

#### Replication Check
- Constraint: [none/replicating/required methodology]
- Reference: [if applicable]
- Match status: [if applicable]

#### User Responses
- Results address question: [yes/partial/no]
- Output format acceptable: [yes/needs adjustment]
- Methodology concerns: [none/minor/major]

### Verdict

**COMPLETE** or **NEEDS WORK**

[If COMPLETE]
- All technical checks passed
- User accepted results
- Reproducibility demonstrated

[If NEEDS WORK]
- [List items requiring attention]
- Recommended next steps
```

Workflow Loops (If NEEDS WORK)

Identify which item(s) need fixing
Return to ds-implement with specific task(s) to fix
Re-run those tasks with output-first verification
Update LEARNINGS.md with fixes
Re-invoke ds-verify for fresh verification

Maximum 3 verification cycles. If issues persist after 3 rounds, escalate to user with summary of blocking issues.
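
The three-cycle cap can be sketched as a bounded loop. This is an illustration under assumptions, not the skill's implementation: `verify_once` and `fix_issues` are hypothetical callables standing in for a ds-verify pass and the return trip to ds-implement, and "PASS" here covers only the technical side, not user acceptance.

```python
def run_verification_cycles(verify_once, fix_issues, max_cycles=3):
    """Re-verify up to max_cycles times, then escalate.

    verify_once() returns a list of open issues (empty list means pass);
    fix_issues(issues) is called between failed cycles.
    """
    issues = []
    for cycle in range(1, max_cycles + 1):
        issues = verify_once()
        if not issues:
            return {"status": "PASS", "cycles": cycle, "blocking": []}
        if cycle < max_cycles:
            fix_issues(issues)
    # Cap reached: hand the blocking issues to the user instead of looping.
    return {"status": "ESCALATE", "cycles": max_cycles, "blocking": issues}


# Stub run: two failing rounds, then a clean one (issue texts are made up).
outcomes = iter([["seed not set"], ["row count drift"], []])
fixed = []
result = run_verification_cycles(lambda: next(outcomes), fixed.append)
print(result)
```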

Chaining instruction (if NEEDS WORK). Discover and load ds-implement: Read ${CLAUDE_SKILL_DIR}/../../skills/ds-implement/SKILL.md and follow its instructions. Then fix the identified issues and re-run verification.

Completion Criteria

Only claim COMPLETE when ALL are true:

[ ] All success criteria from SPEC.md verified
[ ] Results reproducible (demonstrated, not assumed)
[ ] User confirmed results address their question
[ ] User has no major concerns
[ ] Outputs in acceptable format
[ ] If replicating: results match reference

Both technical and user acceptance must pass. No shortcuts.
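
The criteria above can also be written as an explicit check that lists what is still unmet. The sketch below is one possible encoding: the `AcceptanceState` fields and the `unmet_criteria` helper are assumptions made for illustration, with the interview answers recorded as plain strings.

```python
from dataclasses import dataclass


@dataclass
class AcceptanceState:
    """Hypothetical record of technical results and interview answers."""
    spec_criteria_verified: bool
    reproducibility_demonstrated: bool
    answers_question: str          # "yes", "partial", or "no"
    concerns: str                  # "none", "minor", or "major"
    format_acceptable: bool
    replicating: bool = False
    matches_reference: bool = False


def unmet_criteria(state: AcceptanceState) -> list:
    """Return a description of every completion criterion not yet satisfied."""
    unmet = []
    if not state.spec_criteria_verified:
        unmet.append("success criteria from SPEC.md not all verified")
    if not state.reproducibility_demonstrated:
        unmet.append("reproducibility not demonstrated")
    if state.answers_question != "yes":
        unmet.append("user has not confirmed results address their question")
    if state.concerns == "major":
        unmet.append("user has major concerns")
    if not state.format_acceptable:
        unmet.append("output format not accepted")
    if state.replicating and not state.matches_reference:
        unmet.append("results do not match reference")
    return unmet


state = AcceptanceState(True, True, "yes", "minor", True)
print("COMPLETE" if not unmet_criteria(state) else "NEEDS WORK")
```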

Workflow Complete

When user confirms all criteria are met:

Announce: "DS workflow complete. All 5 phases passed."

The /ds workflow is now finished. Offer to:

Export results to final format
Clean up .planning/ files
Start a new analysis with /ds

