skills/validate-feature/SKILL.md
Deploy locally, run security scans and behavioral tests, check CI/CD, and verify OpenSpec spec compliance
npx skillsauth add jankneumann/agentic-coding-tools validate-feature
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
Deploy the feature locally with DEBUG logging, run security scans and behavioral tests against live services, check CI/CD status, and verify OpenSpec spec compliance. Produces a structured validation report and posts it to the PR.
$ARGUMENTS - OpenSpec change-id (required), optionally followed by flags:
- --skip-e2e or --skip-playwright: skip the Playwright E2E phase
- --skip-ci: skip the CI/CD status check
- --skip-security: skip the Security Scan phase
- --phase <name>[,<name>]: run only specified phases (e.g., --phase smoke,security)
Valid phase names: deploy, smoke, gen-eval, security, e2e, architecture, spec, logs, ci
- openspec/<change-id>, or the operator-mandated branch when OPENSPEC_BRANCH_OVERRIDE is set
- openspec/changes/<change-id>/
- Run /implement-feature first if no implementation exists
When validation delegates checks or evidence review, treat the provider-neutral dispatch adapter as the canonical cross-provider path. Claude Code, Codex, and Gemini/Jules are first-class providers when configured; Claude-specific harness examples are adapter internals, with inline validation as the fallback.
Use OpenSpec-generated runtime assets first, then CLI fallback:
- .claude/commands/opsx/*.md or .claude/skills/openspec-*/SKILL.md
- .codex/skills/openspec-*/SKILL.md
- .gemini/commands/opsx/*.toml or .gemini/skills/openspec-*/SKILL.md
- openspec CLI commands
Use docs/coordination-detection-template.md as the shared detection preamble.
- CAN_* flag is true
Validation writes reports, evidence, logs, and sometimes follow-up artifacts. In local CLI execution, validation MUST run inside the feature worktree or another managed worktree before the first write:
eval "$(python3 "<skill-base-dir>/../worktree/scripts/worktree.py" setup "$CHANGE_ID")"
cd "$WORKTREE_PATH"
skills/.venv/bin/python skills/shared/checkout_policy.py require-mutation
Read-only CI status checks may inspect from the shared checkout, but any report or evidence file must be written in a worktree so it lands on the PR branch.
At skill start, run the coordination detection preamble and set:
- COORDINATOR_AVAILABLE
- COORDINATION_TRANSPORT (mcp|http|none)
- CAN_LOCK, CAN_QUEUE_WORK, CAN_HANDOFF, CAN_MEMORY, CAN_GUARDRAILS
If CAN_MEMORY=true, recall relevant validation history:
- recall
- "<skill-base-dir>/../coordination-bridge/scripts/coordination_bridge.py" try_recall(...)
On recall failure or unavailability, continue with validation and log the failure informationally.
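The "continue on failure" rule for recall can be sketched as a small wrapper. `best_effort` is a hypothetical helper, not part of the coordination bridge, and the `true`/`false` commands below are stand-ins for whichever recall transport is in use.

```shell
# Hypothetical helper: run a command, log informationally on failure,
# and always return success so validation continues.
best_effort() {
  label=$1
  shift
  if "$@"; then
    echo "INFO: $label succeeded"
  else
    echo "INFO: $label unavailable or failed (exit $?); continuing"
  fi
  return 0
}

# Stand-in commands; replace with the real recall call.
best_effort "memory recall" true
best_effort "memory recall" false
```

Because the wrapper always returns 0, it is safe to use even when the surrounding script runs with `set -e`.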
# Parse change-id from argument or current branch
BRANCH=$(git branch --show-current)
CHANGE_ID=${ARGUMENTS%% --*} # Everything before the first flag
CHANGE_ID=${CHANGE_ID:-${BRANCH#openspec/}} # Fall back to the branch name without its openspec/ prefix
# Detect worktree context and resolve OpenSpec path
# Note: detect auto-discovers context from the working directory;
# agent-id information is available via the worktree registry if needed.
eval "$(python3 "<skill-base-dir>/../worktree/scripts/worktree.py" detect)"
PROJECT_ROOT="${MAIN_REPO:-$(git rev-parse --show-toplevel)}"
Parse flags from $ARGUMENTS:
- --skip-e2e or --skip-playwright → set SKIP_E2E=true
- --skip-ci → set SKIP_CI=true
- --skip-security → set SKIP_SECURITY=true
- --phase <names> → set PHASES to comma-separated list; only run those phases
If --phase is provided, only the listed phases execute. If --phase does not include deploy, assume services are already running (skip deploy and teardown).
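The flag handling described above could look roughly like the following. This is a minimal sketch: `parse_validate_args` and `phase_enabled` are hypothetical names, `add-widget` is a made-up change-id, and only the variables (`SKIP_E2E`, `SKIP_CI`, `SKIP_SECURITY`, `PHASES`) come from this document.

```shell
# Hypothetical parser: first non-flag word is the change-id, the rest are flags.
parse_validate_args() {
  CHANGE_ID="" SKIP_E2E=false SKIP_CI=false SKIP_SECURITY=false PHASES=""
  while [ $# -gt 0 ]; do
    case $1 in
      --skip-e2e|--skip-playwright) SKIP_E2E=true ;;
      --skip-ci)                    SKIP_CI=true ;;
      --skip-security)              SKIP_SECURITY=true ;;
      --phase)                      PHASES=${2:-}; [ $# -ge 2 ] && shift ;;
      --*)                          echo "WARN: unknown flag $1" >&2 ;;
      *)                            [ -z "$CHANGE_ID" ] && CHANGE_ID=$1 ;;
    esac
    shift
  done
}

# Returns 0 when a phase should run: always if --phase was not given,
# otherwise only when the name appears in the comma-separated list.
phase_enabled() {
  [ -z "$PHASES" ] && return 0
  case ",$PHASES," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Example invocation with a hypothetical change-id.
parse_validate_args add-widget --skip-ci --phase smoke,security
echo "change=$CHANGE_ID skip_ci=$SKIP_CI phases=$PHASES"
```

With this shape, each phase can be guarded by `if phase_enabled deploy; then ...; fi`, which also covers the "services are already running" case when deploy is not listed.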
# Resolve the expected feature branch — honors registry + OPENSPEC_BRANCH_OVERRIDE
eval "$(python3 "<skill-base-dir>/../worktree/scripts/worktree.py" resolve-branch "$CHANGE_ID")"
FEATURE_BRANCH="$BRANCH"
# Verify on feature branch
CURRENT_BRANCH="$(git branch --show-current)"
if [[ "$CURRENT_BRANCH" != "$FEATURE_BRANCH" ]]; then
echo "ERROR: on '$CURRENT_BRANCH' but expected '$FEATURE_BRANCH' (source: $BRANCH_SOURCE)" >&2
exit 1
fi
# Verify proposal exists
openspec show "$CHANGE_ID"
# Verify implementation commits exist
COMMIT_COUNT=$(git log --oneline main..HEAD | wc -l)
if [ "$COMMIT_COUNT" -eq 0 ]; then
echo "ERROR: No implementation commits found on this branch."
echo "Run /implement-feature $CHANGE_ID first."
exit 1
fi
# Check Docker availability (only if Deploy phase will run)
if docker info > /dev/null 2>&1; then
echo "Docker is available"
else
echo "ERROR: Docker is not available. Install Docker Desktop or start the Docker daemon."
echo " macOS: brew install --cask docker"
echo " Linux: sudo systemctl start docker"
exit 1
fi
If not on the feature branch, check out $FEATURE_BRANCH (which honors OPENSPEC_BRANCH_OVERRIDE). If no implementation commits exist, abort with guidance.
Preferred path:
- opsx:verify equivalent) for artifact guidance.
CLI fallback path:
openspec instructions validation-report --change "$CHANGE_ID"
openspec instructions architecture-impact --change "$CHANGE_ID"
openspec status --change "$CHANGE_ID"
Ensure validation-report.md and architecture-impact.md are updated in the change directory as part of this validation run.
Phase name: deploy
Criticality: Critical (stops validation on failure)
# Find docker-compose file
COMPOSE_FILE=$(find "$PROJECT_ROOT" -maxdepth 2 -name "docker-compose.yml" | head -1)
if [ -z "$COMPOSE_FILE" ]; then
echo "SKIP: No docker-compose.yml found. Skipping Deploy phase."
echo " Smoke tests will run against already-running services."
DEPLOY_SKIPPED=true
else
COMPOSE_DIR=$(dirname "$COMPOSE_FILE")
LOG_FILE="/tmp/validate-feature-${CHANGE_ID}-$(date +%s).log"
echo "Starting services with DEBUG logging..."
echo " Compose file: $COMPOSE_FILE"
echo " Log file: $LOG_FILE"
# Start services with DEBUG logging, redirect output to log file.
#
# When the compose file gates the API server behind a profile (e.g. the
# agent-coordinator's `coordinator-api` service uses `profiles: [api]` so it
# doesn't auto-start during simple `docker compose up`), pass
# `COMPOSE_PROFILES` so the API process IS started here — without it the
# smoke + e2e phases get connection-refused on the API port. Multiple
# profiles can be comma-separated (`api,langfuse`).
AGENT_COORDINATOR_DB_PORT=${AGENT_COORDINATOR_DB_PORT:-54322} \
AGENT_COORDINATOR_REST_PORT=${AGENT_COORDINATOR_REST_PORT:-8081} \
AGENT_COORDINATOR_REALTIME_PORT=${AGENT_COORDINATOR_REALTIME_PORT:-4000} \
COMPOSE_PROFILES=${COMPOSE_PROFILES:-api} \
LOG_LEVEL=DEBUG docker-compose -f "$COMPOSE_FILE" up -d --build 2>&1 | tee "$LOG_FILE"
# Wait for health checks
echo "Waiting for services to be healthy..."
docker-compose -f "$COMPOSE_FILE" ps
# Wait for PostgreSQL health check (up to 30 seconds)
for i in $(seq 1 30); do
if docker-compose -f "$COMPOSE_FILE" exec -T postgres pg_isready -U postgres > /dev/null 2>&1; then
echo "PostgreSQL is ready"
break
fi
sleep 1
done
# Wait for REST API health endpoint (up to 30 seconds — the API container
# may need build time on first run + warmup before /health flips to 200)
for i in $(seq 1 30); do
if curl -sf http://localhost:${AGENT_COORDINATOR_REST_PORT:-8081}/health > /dev/null 2>&1; then
echo "REST API is ready"
break
fi
sleep 1
done
# Collect running container logs in background
docker-compose -f "$COMPOSE_FILE" logs -f >> "$LOG_FILE" 2>&1 &
LOG_PID=$!
DEPLOY_RESULT="pass"
fi
If Deploy fails, report the failure with Docker logs and skip to Teardown.
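The readiness loops in the Deploy snippet fall through silently when a service never becomes ready, so Deploy is still recorded as a pass. One way to surface that is a retry helper that returns non-zero on timeout. This is a sketch only; `wait_for` is a hypothetical name and is not part of the skill.

```shell
# Hypothetical helper: retry a command once per second until it succeeds
# or the attempt budget is exhausted. Returns 0 on success, 1 on timeout.
wait_for() {
  label=$1
  attempts=$2
  shift 2
  n=0
  while [ "$n" -lt "$attempts" ]; do
    if "$@" > /dev/null 2>&1; then
      echo "$label is ready"
      return 0
    fi
    n=$((n + 1))
    sleep 1
  done
  echo "ERROR: $label not ready after $attempts attempts" >&2
  return 1
}

# Example wiring (commented out because it needs the compose services):
# wait_for "REST API" 30 curl -sf "http://localhost:${AGENT_COORDINATOR_REST_PORT:-8081}/health" \
#   || DEPLOY_RESULT="fail"
```

Setting `DEPLOY_RESULT="fail"` on timeout lets the "skip to Teardown" rule apply to readiness failures as well as to build failures.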
Phase name: smoke
Criticality: Critical (stops validation on failure)
Run the reusable pytest smoke test suite against the live services. The suite is configurable via environment variables so it works with any deployed HTTP API.
# Configure for the target API (adjust per project)
export API_BASE_URL="${API_BASE_URL:-http://localhost:8000}"
export API_HEALTH_ENDPOINT="${API_HEALTH_ENDPOINT:-/health}"
export API_READY_ENDPOINT="${API_READY_ENDPOINT:-/ready}"
export API_AUTH_HEADER="${API_AUTH_HEADER:-X-Admin-Key}"
export API_AUTH_VALUE="${API_AUTH_VALUE:-$ADMIN_API_KEY}"
export API_PROTECTED_ENDPOINT="${API_PROTECTED_ENDPOINT:-/api/v1/settings/prompts}"
export API_CORS_ORIGIN="${API_CORS_ORIGIN:-http://localhost:5173}"
# Run smoke tests
SKILL_DIR="$(git rev-parse --show-toplevel)/skills/validate-feature"
pytest "$SKILL_DIR/scripts/smoke_tests/" -v --tb=short 2>&1
SMOKE_EXIT=$?
if [ $SMOKE_EXIT -eq 0 ]; then
SMOKE_RESULT="pass"
elif [ $SMOKE_EXIT -eq 5 ]; then
# Exit code 5 = pytest collected no tests (treated here as services not running)
SMOKE_RESULT="skip"
echo "SKIP: Services not running — smoke tests auto-skipped"
else
SMOKE_RESULT="fail"
SMOKE_FAILED=true
fi
The smoke tests cover:
If Smoke fails (SMOKE_EXIT != 0 and != 5), stop validation and skip to Teardown.
Phase name: gen-eval
Criticality: Non-critical (continues on failure)
Run generator-evaluator testing when interface descriptors exist for the project. This phase auto-detects descriptor files and selects between two modes:
- cli-augmented mode when both an interface descriptor AND an OpenSpec change directory at openspec/changes/<change-id>/specs/ exist. Gen-eval is invoked with --mode cli-augmented --openspec-change <change-id> so the generator seeds scenarios from the change's WHEN/THEN spec blocks.
- template-only mode (existing fallback) when descriptors exist but no OpenSpec change directory. Requires no CLI or SDK dependencies.
# Auto-detect gen-eval descriptors
GENEVAL_DESCRIPTORS=$(find "$PROJECT_ROOT" -path "*/evaluation/gen_eval/descriptors/*.yaml" -type f 2>/dev/null)
if [ -z "$GENEVAL_DESCRIPTORS" ]; then
echo "SKIP: No gen-eval descriptors found. Skipping gen-eval phase."
GENEVAL_RESULT="skip"
else
# Mode selection: cli-augmented requires both descriptor AND OpenSpec change dir
GENEVAL_CHANGE_DIR="$PROJECT_ROOT/openspec/changes/$CHANGE_ID/specs"
if [ -d "$GENEVAL_CHANGE_DIR" ]; then
GENEVAL_MODE_FLAGS="--mode cli-augmented --openspec-change $CHANGE_ID"
GENEVAL_MODE_LABEL="mode=cli-augmented"
echo "gen-eval: $GENEVAL_MODE_LABEL (descriptor + OpenSpec change present at $GENEVAL_CHANGE_DIR)"
else
GENEVAL_MODE_FLAGS="--mode template-only --no-services"
GENEVAL_MODE_LABEL="mode=template-only"
echo "gen-eval: $GENEVAL_MODE_LABEL (no OpenSpec change at openspec/changes/$CHANGE_ID/specs/, falling back to template-only)"
fi
echo "Running gen-eval testing ($GENEVAL_MODE_LABEL)..."
GENEVAL_FAILED=false
for DESCRIPTOR in $GENEVAL_DESCRIPTORS; do
echo " Descriptor: $DESCRIPTOR"
# Resolve the module root (parent of evaluation/) and cd into it
GENEVAL_MODULE_ROOT=$(dirname "$(dirname "$(dirname "$(dirname "$DESCRIPTOR")")")")
GENEVAL_PYTHON="$GENEVAL_MODULE_ROOT/.venv/bin/python"
if [ ! -f "$GENEVAL_PYTHON" ]; then GENEVAL_PYTHON="python3"; fi
(cd "$GENEVAL_MODULE_ROOT" && "$GENEVAL_PYTHON" -m evaluation.gen_eval \
--descriptor "$DESCRIPTOR" \
$GENEVAL_MODE_FLAGS \
--report-format both \
--output-dir "$PROJECT_ROOT/openspec/changes/$CHANGE_ID" 2>&1)
GENEVAL_EXIT=$?
if [ $GENEVAL_EXIT -ne 0 ]; then
GENEVAL_FAILED=true
echo " gen-eval: FAIL for $DESCRIPTOR (exit $GENEVAL_EXIT)"
else
echo " gen-eval: PASS for $DESCRIPTOR"
fi
done
if [ "$GENEVAL_FAILED" = true ]; then
GENEVAL_RESULT="fail"
echo "Gen-eval: FAIL — One or more descriptors had failures (non-blocking)"
else
GENEVAL_RESULT="pass"
echo "Gen-eval: PASS — All descriptors passed"
fi
fi
Gen-eval failures are non-critical and do not block validation. Results are included in the validation report for informational purposes. cli-augmented mode failures (e.g., from prompt-injection attempts caught by the parser) still degrade gracefully — the validate-feature pipeline continues to subsequent phases.
Phase name: security
Criticality: Non-critical (continues on failure)
Run security scanners (OWASP Dependency-Check and ZAP) against the live deployment using the existing security-review orchestrator.
# Skip if --skip-security flag was provided
if [ "$SKIP_SECURITY" = true ]; then
echo "SKIP: Security phase skipped (--skip-security flag)"
SECURITY_RESULT="skip"
else
echo "Running security scans against live deployment..."
# Invoke the security-review orchestrator with the live API target
python3 skills/security-review/scripts/main.py \
--repo . \
--out-dir docs/security-review \
--zap-target "http://localhost:${AGENT_COORDINATOR_REST_PORT:-8081}" \
--change "$CHANGE_ID" \
--allow-degraded-pass 2>&1
SECURITY_EXIT=$?
if [ $SECURITY_EXIT -eq 0 ]; then
SECURITY_RESULT="pass"
echo "Security: PASS — No threshold findings"
elif [ $SECURITY_EXIT -eq 10 ]; then
SECURITY_RESULT="fail"
echo "Security: FAIL — Threshold findings detected"
elif [ $SECURITY_EXIT -eq 11 ]; then
SECURITY_RESULT="degraded"
echo "Security: INCONCLUSIVE — Scanners degraded (check prerequisites)"
else
SECURITY_RESULT="fail"
echo "Security: ERROR — Unexpected exit code $SECURITY_EXIT"
fi
fi
The Security phase reuses the /security-review skill's scripts without requiring a separate invocation. The --allow-degraded-pass flag ensures missing prerequisites (Java, container runtime) degrade gracefully instead of blocking validation.
Phase name: e2e
Criticality: Non-critical (continues on failure)
# Skip if --skip-e2e flag was provided
if [ "$SKIP_E2E" = true ]; then
echo "SKIP: E2E phase skipped (--skip-e2e flag)"
else
# Check if pytest-playwright is installed
if python3 -c "import playwright" 2>/dev/null; then
PLAYWRIGHT_AVAILABLE=true
else
PLAYWRIGHT_AVAILABLE=false
fi
# Check if E2E tests exist
E2E_DIR=$(find "$PROJECT_ROOT" -path "*/tests/e2e" -type d | head -1)
if [ -z "$E2E_DIR" ]; then
echo "SKIP: No tests/e2e/ directory found. Skipping E2E phase."
elif [ "$PLAYWRIGHT_AVAILABLE" = false ]; then
echo "SKIP: pytest-playwright not installed. To install:"
echo " pip install pytest-playwright"
echo " playwright install chromium"
else
echo "Running E2E tests from $E2E_DIR..."
pytest "$E2E_DIR" -v --tb=short 2>&1
E2E_EXIT=$?
if [ $E2E_EXIT -eq 0 ]; then
E2E_RESULT="pass"
else
E2E_RESULT="fail"
fi
fi
fi
Phase name: architecture
Criticality: Non-critical (continues on failure)
Run architecture flow validation against the changed files:
# Get changed files relative to main
CHANGED_FILES=$(git diff --name-only main...HEAD | paste -sd, -) # comma-joined, no trailing comma
if [ -f "<skill-base-dir>/../validate-flows/scripts/validate_flows.py" ] && [ -f "docs/architecture-analysis/architecture.graph.json" ]; then
echo "Running architecture validation on changed files..."
python3 "<skill-base-dir>/../validate-flows/scripts/validate_flows.py" \
--graph docs/architecture-analysis/architecture.graph.json \
--output docs/architecture-analysis/architecture.diagnostics.json \
--files "$CHANGED_FILES" 2>&1
ARCH_EXIT=$?
if [ $ARCH_EXIT -eq 0 ]; then
ARCH_RESULT="pass"
ARCH_ERRORS=$(python3 -c "import json; d=json.load(open('docs/architecture-analysis/architecture.diagnostics.json')); print(d['summary']['errors'])" 2>/dev/null || echo 0)
ARCH_WARNINGS=$(python3 -c "import json; d=json.load(open('docs/architecture-analysis/architecture.diagnostics.json')); print(d['summary']['warnings'])" 2>/dev/null || echo 0)
if [ "$ARCH_ERRORS" -gt 0 ]; then
ARCH_RESULT="fail"
elif [ "$ARCH_WARNINGS" -gt 0 ]; then
ARCH_RESULT="warn"
fi
else
ARCH_RESULT="fail"
fi
else
echo "SKIP: Architecture validation not available (missing scripts or artifacts)"
echo " Run 'make architecture' to generate architecture artifacts"
ARCH_RESULT="skip"
fi
Report architecture diagnostics including broken flows, missing test coverage, orphaned code, and disconnected endpoints.
Phase name: spec
Criticality: Non-critical (continues on failure) — EXCEPT the task-drift gate (7.0) which is CRITICAL within this phase.
Before verifying requirements against the live system, enforce that tasks.md reflects commit reality. Drift between the two is a spec-compliance failure: the plan document either overstates completeness (dangerous) or understates it (bookkeeping debt that breaks archive-time invariants).
# Run from the change's worktree or feature branch
TASKS_FILE="openspec/changes/<change-id>/tasks.md"
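# grep -c prints 0 itself on no match (with exit status 1), so only mask the status
UNCHECKED=$(grep -cE "^[[:space:]]*- \[ \]" "$TASKS_FILE" 2>/dev/null || true)
UNCHECKED=${UNCHECKED:-0} # empty only when the tasks file is missing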
# Count commits on the feature branch since divergence from main
# (excludes commits on main that the branch inherited)
COMMIT_COUNT=$(git rev-list --count main..HEAD 2>/dev/null || echo 0)
if [ "$UNCHECKED" -gt 0 ] && [ "$COMMIT_COUNT" -gt 0 ]; then
echo "FAIL: task checkbox drift detected"
echo " $TASKS_FILE has $UNCHECKED unchecked boxes"
echo " branch has $COMMIT_COUNT commit(s) since main"
echo ""
echo "Either:"
echo " (a) complete the remaining tasks — in which case this validation run is premature"
echo " (b) reconcile — flip checkboxes for tasks whose code has landed (new commit, do NOT amend)"
echo " (c) defer — move genuinely-skipped tasks to deferred-tasks.md"
echo ""
echo "Do not proceed to requirement verification with drifted tasks.md."
exit 1 # CRITICAL failure — halts the spec phase
fi
Why this is CRITICAL: Archive validation (openspec archive) checks the tasks artifact's overall status, not individual checkboxes — meaning drift can slip through archive-time and leave inaccurate history. Catching it here, before the archive path, ensures the spec phase is the single source of truth for "does the plan document match what was built?" Per the incident log, specialized-workflow-agents shipped 29 tasks' worth of implementation to main with 0/29 checkboxes flipped because the validation gate didn't catch the drift (this check was added 2026-04-22 in response).
CI-vs-local behavior: In local validation, exit 1 halts the phase immediately. In CI-invoked validation (where halting would abort merge-gate automation unhelpfully), record the drift as a CRITICAL finding in validation-report.md under "Phase Results" with Result=fail and Details listing the specific unchecked task IDs; do not silently continue.
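The local-versus-CI split could be sketched as below. Assumptions are labeled in the comments: `handle_task_drift` is a hypothetical helper, CI runs are detected via a `CI=true` environment variable, and the lines appended to the report are illustrative rather than the real validation-report.md layout. The commit-count condition from the gate is omitted for brevity.

```shell
# Hypothetical helper. Arguments: tasks file, report file.
# Local run: print the unchecked tasks and return 1 so the caller halts.
# CI run (assumed to export CI=true): record a finding and return 0.
handle_task_drift() {
  tasks_file=$1
  report_file=$2
  unchecked=$(grep -E '^[[:space:]]*- \[ \]' "$tasks_file" 2>/dev/null || true)
  [ -z "$unchecked" ] && return 0

  if [ "${CI:-false}" = "true" ]; then
    SPEC_RESULT="fail"
    {
      echo "Spec (task drift): fail (CRITICAL)"
      echo "$unchecked" | sed 's/^[[:space:]]*/  /'
    } >> "$report_file"
    return 0
  fi

  echo "FAIL: task checkbox drift detected" >&2
  echo "$unchecked" >&2
  return 1
}
```

A caller would run `handle_task_drift "$TASKS_FILE" "$REPORT_FILE" || exit 1`, which halts locally and falls through in CI with `SPEC_RESULT` already set to fail.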
Use the change-context.md traceability matrix as the spec compliance artifact:
Read change-context.md from the change directory ($OPENSPEC_PATH/changes/<change-id>/change-context.md).
- specs/, extract SHALL/MUST clauses, and create rows with Req ID, Spec Source, Description, and Test(s) derived from git diff --name-only main..HEAD.
For each row in the Requirement Traceability Matrix, verify the requirement against the live system:
Update the Evidence column for each row:
- pass <short-SHA>: requirement verified successfully against the live system
- fail <short-SHA>: requirement verification failed (include brief reason)
- deferred <reason>: cannot verify in this environment (e.g., requires production)
Update Coverage Summary with final counts: requirements traced, tests mapped, evidence collected, gaps, and deferred items.
Report results sourced from the updated change-context.md:
Spec Compliance Results (from change-context.md):
✓ skill-workflow.1: Change context artifact generated during implementation
✓ skill-workflow.2: 3-phase incremental generation
✗ skill-workflow.3: TDD enforcement — test written after implementation
✓ skill-workflow.4: Validation report references change-context.md
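The three Evidence forms (pass <short-SHA>, fail <short-SHA>, deferred <reason>) follow a fixed shape, so a quick format check before writing them back can catch typos. The following is a sketch under stated assumptions: `valid_evidence` is a hypothetical helper, and the 7 to 40 hex-digit SHA length is an assumption based on common git abbreviations.

```shell
# Hypothetical check: accept only the three Evidence forms.
# pass/fail need a hex SHA (optionally followed by a reason);
# deferred needs a non-empty reason.
valid_evidence() {
  printf '%s\n' "$1" | grep -Eq \
    '^((pass|fail) [0-9a-f]{7,40}( .*)?|deferred .+)$'
}

# Example: build a value from the current commit when inside a repository.
# EVIDENCE="pass $(git rev-parse --short HEAD)"
# valid_evidence "$EVIDENCE" || echo "WARN: malformed evidence: $EVIDENCE" >&2
```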
Phase name: evidence
Criticality: Non-critical (continues on failure)
This phase runs only when work-packages.yaml exists at openspec/changes/<change-id>/. It audits that all work packages produced valid results and contract compliance evidence is present.
For each work package, validate its result (if artifacts/<package-id>/work-queue-result.json exists):
skills/.venv/bin/python "<skill-base-dir>/../validate-packages/scripts/validate_work_result.py" \
artifacts/<package-id>/work-queue-result.json
Checks per package:
- work-queue-result.schema.json
- contracts_revision and plan_revision match work-packages.yaml
- scope_check.passed is true
- verification.passed is true
Cross-package consistency:
If change-context.md exists, populate the Evidence column from work-queue results.
Phase name: logs
Criticality: Non-critical (continues on failure)
Scan the collected log file for warning signs:
if [ -f "$LOG_FILE" ]; then
echo "Analyzing logs: $LOG_FILE"
echo "Log file size: $(wc -l < "$LOG_FILE") lines"
# Count by severity (grep -c prints 0 itself on no match, so only mask its exit status)
WARNINGS=$(grep -c -i "WARNING" "$LOG_FILE" 2>/dev/null || true)
ERRORS=$(grep -c -i "ERROR" "$LOG_FILE" 2>/dev/null || true)
CRITICALS=$(grep -c -i "CRITICAL" "$LOG_FILE" 2>/dev/null || true)
# Check for specific patterns
DEPRECATIONS=$(grep -c -i "deprecat" "$LOG_FILE" 2>/dev/null || true)
STACK_TRACES=$(grep -c "Traceback" "$LOG_FILE" 2>/dev/null || true)
UNHANDLED=$(grep -c "unhandled\|uncaught" "$LOG_FILE" 2>/dev/null || true)
echo " Warnings: $WARNINGS"
echo " Errors: $ERRORS"
echo " Critical: $CRITICALS"
echo " Deprecations: $DEPRECATIONS"
echo " Stack traces: $STACK_TRACES"
echo " Unhandled exceptions: $UNHANDLED"
# Show context for errors and critical entries
if [ "$ERRORS" -gt 0 ] || [ "$CRITICALS" -gt 0 ]; then
echo ""
echo "Error/Critical entries with context:"
grep -n -i -B2 -A2 "ERROR\|CRITICAL" "$LOG_FILE" | head -50
fi
# Show deprecation warnings
if [ "$DEPRECATIONS" -gt 0 ]; then
echo ""
echo "Deprecation notices:"
grep -n -i "deprecat" "$LOG_FILE" | head -20
fi
else
echo "SKIP: No log file available (Deploy phase was skipped or no services started)"
fi
Categorize findings by severity:
Phase name: ci
Criticality: Non-critical (continues on failure)
# Skip if --skip-ci flag was provided
if [ "$SKIP_CI" = true ]; then
echo "SKIP: CI/CD check skipped (--skip-ci flag)"
else
# Check if GitHub remote is configured
if git remote get-url origin > /dev/null 2>&1; then
# Check if PR exists for this branch (uses resolved FEATURE_BRANCH)
PR_URL=$(gh pr view "$FEATURE_BRANCH" --json url --jq '.url' 2>/dev/null)
if [ -n "$PR_URL" ]; then
echo "PR found: $PR_URL"
echo ""
echo "CI/CD Check Status:"
gh pr checks "$FEATURE_BRANCH" 2>/dev/null || echo " No CI checks configured yet"
else
echo "No PR found for $FEATURE_BRANCH"
echo "Checking latest workflow runs..."
gh run list --branch "$FEATURE_BRANCH" --limit 3 2>/dev/null || echo " No workflow runs found"
fi
else
echo "SKIP: No GitHub remote configured"
fi
fi
Stop services and clean up:
# Only teardown if we started services (Deploy phase ran)
if [ "$DEPLOY_SKIPPED" != true ] && [ -n "$COMPOSE_FILE" ]; then
echo "Stopping services..."
# Stop background log collection
if [ -n "$LOG_PID" ]; then
kill $LOG_PID 2>/dev/null
fi
# Stop docker-compose services
docker-compose -f "$COMPOSE_FILE" down
echo "Services stopped"
fi
# Handle log file
if [ -f "$LOG_FILE" ]; then
if [ "$ALL_PHASES_PASSED" = true ]; then
rm "$LOG_FILE"
echo "Log file removed (all phases passed)"
else
echo "Log file preserved for inspection: $LOG_FILE"
fi
fi
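The teardown snippet reads ALL_PHASES_PASSED, but none of the phase snippets set it. A minimal sketch of deriving it from the per-phase result variables follows. `compute_all_phases_passed` is a hypothetical helper, and treating "skip" or an unset result as non-failing is an assumption. Only phases that define a *_RESULT variable in this document are included.

```shell
# Hypothetical helper: set ALL_PHASES_PASSED from the *_RESULT variables.
compute_all_phases_passed() {
  ALL_PHASES_PASSED=true
  for result in "${DEPLOY_RESULT:-}" "${SMOKE_RESULT:-}" "${GENEVAL_RESULT:-}" \
                "${SECURITY_RESULT:-}" "${E2E_RESULT:-}" "${ARCH_RESULT:-}"; do
    case $result in
      ""|pass|skip) ;;                # assumed non-failing
      *) ALL_PHASES_PASSED=false ;;   # fail, warn, degraded, or unknown
    esac
  done
}
```

Counting "warn" and "degraded" as not passed errs on the side of preserving the log file for inspection.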
Produce a structured summary of all phases:
## Validation Report: <change-id>
**Date**: YYYY-MM-DD HH:MM:SS
**Commit**: <short SHA>
**Branch**: <resolved feature branch — openspec/<change-id> by default, or operator override>
### Phase Results
✓ Deploy: Services started (N containers, DEBUG logging enabled)
✓ Smoke: All health checks passed (API, MCP, database)
○ Gen-Eval: Skipped (no descriptors found) _or_ ✓ Gen-Eval: 12/12 scenarios passed (template-only)
✓ Security: PASS — No threshold findings (dependency-check: ok, zap: ok)
✗ E2E: 3/5 tests passed, 2 failures
- test_login_flow: TimeoutError on /api/auth
- test_dashboard_load: Element not found: #stats-panel
✓ Architecture: No broken flows (2 warnings: orphaned functions)
✓ Spec Compliance: 8/8 requirements verified (see change-context.md)
⚠ Log Analysis: 3 warnings found
- [WARNING] Deprecated function call: old_api_handler (line 142)
✓ CI/CD: All checks passing
### Result
**PASS** — Ready for `/cleanup-feature <change-id>`
_or_
**FAIL** — Address findings, then re-run `/validate-feature` or `/iterate-on-implementation`
Use these symbols:
- ✓ Pass
- ✗ Fail
- ⚠ Warning
- ○ Skipped
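The symbols in the sample report can be produced from the per-phase result values with a small mapping. This is a sketch: `result_symbol` is a hypothetical helper, and mapping "degraded" to the warning symbol is an assumption.

```shell
# Hypothetical mapping from a phase result to its report symbol.
result_symbol() {
  case $1 in
    pass)          echo "✓" ;;
    fail)          echo "✗" ;;
    warn|degraded) echo "⚠" ;;
    skip)          echo "○" ;;
    *)             echo "?" ;;
  esac
}

# Example: render one summary line.
echo "$(result_symbol skip) Gen-Eval: Skipped (no descriptors found)"
```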
Write the validation report to the OpenSpec change directory:
REPORT_FILE="$OPENSPEC_PATH/changes/$CHANGE_ID/validation-report.md"
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
COMMIT_SHA=$(git rev-parse --short HEAD)
# Write report (overwrites previous)
cat > "$REPORT_FILE" << EOF
# Validation Report: $CHANGE_ID
**Date**: $TIMESTAMP
**Commit**: $COMMIT_SHA
**Branch**: $FEATURE_BRANCH
## Phase Results
<phase results from Step 10>
## Result
<PASS or FAIL with guidance>
EOF
echo "Report written to: $REPORT_FILE"
Post the validation report as a PR comment:
PR_NUMBER=$(gh pr view "$FEATURE_BRANCH" --json number --jq '.number' 2>/dev/null)
if [ -n "$PR_NUMBER" ]; then
gh pr comment "$PR_NUMBER" --body "$(cat <<EOF
## 🔍 Automated Validation Report
<contents of validation report from Step 10>
---
_Generated by \`/validate-feature $CHANGE_ID\` at $TIMESTAMP_
EOF
)"
echo "Report posted to PR #$PR_NUMBER"
else
echo "SKIP: No PR found for $FEATURE_BRANCH — report not posted"
echo " Create a PR first, then re-run to post the report"
fi
Construct a PhaseRecord for the Validation phase and call write_both(). Validation phases are typically read-only; emphasize the validation outcomes (pass/fail summary, waivers, deferred issues) in the structured fields rather than free-form prose.
Capture from this validation:
- ["spec", "evidence", "deploy", "smoke"]).
Persist via PhaseRecord.write_both():
python3 - <<'EOF'
import sys
sys.path.insert(0, "skills/session-log/scripts")
from phase_record import PhaseRecord, Decision
record = PhaseRecord(
change_id="<change-id>",
phase_name="Validation",
agent_type="<agent-type>",
summary="<2-3 sentences: phases run, pass/fail summary, waivers>",
decisions=[
Decision(title="<title>", rationale="<rationale>"),
],
completed_work=["spec", "evidence"], # phases that passed
next_steps=["/cleanup-feature <change-id>"],
)
result = record.write_both()
print(f"markdown_path={result.markdown_path}")
print(f"sanitized={result.sanitized}")
print(f"handoff_id={result.handoff_id or '(local fallback)'}")
print(f"handoff_local_path={result.handoff_local_path}")
for w in result.warnings:
print(f"WARN: {w}", file=sys.stderr)
EOF
write_both() runs three best-effort steps internally: append rendered markdown → sanitize in-place → coordinator handoff (or local fallback at openspec/changes/<change-id>/handoffs/validation-<N>.json). Each step logs warnings on failure but does not raise.
Commit and push (validate-feature is read-only, so this needs a dedicated commit):
git add "openspec/changes/<change-id>/session-log.md"
git commit -m "chore: append validation session log for <change-id>"
git push
If all phases PASS:
Ready for cleanup:
/cleanup-feature <change-id>
If phases FAIL:
Option 1: Fix findings and re-validate:
/iterate-on-implementation <change-id>
/validate-feature <change-id>
Option 2: Re-run specific failing phases:
/validate-feature <change-id> --phase smoke,spec
Option 3: Skip non-critical failures and proceed:
/cleanup-feature <change-id>
Present the validation report and let the user decide the next step.
- openspec/changes/<change-id>/validation-report.md
If CAN_MEMORY=true, remember validation outcomes (phase pass/fail, key regressions, and next actions):
- remember
- "<skill-base-dir>/../coordination-bridge/scripts/coordination_bridge.py" try_remember(...)
After validation passes:
/cleanup-feature <change-id>