skills/forge-repair-state/SKILL.md
Fix corrupted pipeline state — repair counters, stale locks, invalid stages, and WAL recovery in state.json. Use when /forge-diagnose reports problems, pipeline fails with state errors, or counters seem wrong. Confirms changes before writing. Trigger: /forge-repair-state, fix state, repair pipeline, state corrupted
You validate .forge/state.json and fix specific issues found. Unlike /forge-diagnose (read-only) and /forge-reset (full wipe), this skill makes targeted repairs while preserving pipeline progress.
Before any action, verify:
1. Git repository: run `git rev-parse --show-toplevel 2>/dev/null`. If it fails: report "Not a git repository. Navigate to a project directory." and STOP.
2. State file: check that `.forge/state.json` exists. If not: report "No pipeline state found. Nothing to repair. Run /forge-run to start a pipeline." and STOP.

Then load the inputs for the checks:

1. Parse `.forge/state.json` as JSON. If it cannot be parsed: attempt WAL recovery first; if that also fails, report "state.json is corrupted beyond repair. Run /forge-reset to start fresh." and stop.
2. Read `.claude/forge-config.md` for configured maximums (fall back to defaults).

Before any other repair, check for pending WAL entries:
bash shared/forge-state-write.sh recover --forge-dir .forge
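The prerequisite checks above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function names and the `NO_STATE` / `UNPARSEABLE` / `OK` labels are hypothetical, and the caller is assumed to print the messages quoted above and to attempt WAL recovery before giving up on an unparseable file.

```python
import json
import subprocess
from pathlib import Path


def in_git_repo(project_dir: Path) -> bool:
    """True if `git rev-parse --show-toplevel` succeeds in project_dir."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=project_dir,
            capture_output=True,
        )
    except OSError:  # git is not installed, or project_dir is not a directory
        return False
    return result.returncode == 0


def check_state_file(project_dir: Path) -> str:
    """Classify .forge/state.json as NO_STATE, UNPARSEABLE, or OK (hypothetical labels)."""
    state_path = project_dir / ".forge" / "state.json"
    if not state_path.is_file():
        return "NO_STATE"
    try:
        json.loads(state_path.read_text(encoding="utf-8"))
    except ValueError:  # covers json.JSONDecodeError and invalid UTF-8
        return "UNPARSEABLE"  # the caller tries WAL recovery before stopping
    return "OK"
```

Keeping the two checks separate makes it easy to stop at the first failure, in the same order as the list above.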
Run each check below. Collect all needed repairs before applying any.
R1: Schema version mismatch
- If `version` is missing or not "1.5.0": propose setting `version` to "1.5.0".

R2: Missing required fields
- For each of `story_id`, `requirement`, `story_state`, `mode`, `complete`: if missing, propose a repair.
  - `story_id`: set to "unknown-repair-{date}" (e.g., "unknown-repair-2026-04-10").
  - `requirement`: set to "unknown (recovered by repair-state)".
  - `story_state`: infer from `stage_timestamps` (latest completed stage + 1). If no timestamps: set to "PREFLIGHT".
  - `mode`: set to "standard".
  - `complete`: set to `false`.

R3: Invalid story_state
- If `story_state` is not a recognized state (see `/forge-diagnose` for the full list): propose resetting to the last valid state inferred from `stage_timestamps`.

R4: Invalid mode
- If `mode` is not one of `standard`, `bugfix`, `migration`, `bootstrap`, `testing`, `refactor`, `performance`: propose setting to "standard".

R5: Corrupted sequence counter
- If `_seq` is missing, zero, negative, or non-numeric: propose setting to 1.

R6: Counter overflows
- Read maximums from `forge-config.md` or use defaults: `total_retries_max` = 10, `max_weight` = 5.5, `max_iterations` = 8.
- If `total_retries` > `total_retries_max`: propose capping to `total_retries_max`.
- If `recovery_budget.total_weight` > `recovery_budget.max_weight`: propose capping to `max_weight`.
- If `convergence.total_iterations` > configured `max_iterations`: propose capping to `max_iterations`.

R7: Completion inconsistency
- If `complete: true` but `story_state` is not `COMPLETE` or `ABORTED`: propose setting `story_state` to "COMPLETE".
- If `complete: false` but `story_state` is `COMPLETE` or `ABORTED`: propose setting `complete` to `true`.

R8: Stale lock file
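- If `.forge/.lock` exists:
  - Check whether the process that holds the lock is still alive: `kill -0 $pid 2>/dev/null`.
  - If the process is no longer running: propose removing `.forge/.lock`.

R9: Missing convergence object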
- If `convergence` is missing entirely: propose initializing it with defaults:
{
"phase": "correctness",
"phase_iterations": 0,
"total_iterations": 0,
"plateau_count": 0,
"last_score_delta": 0,
"convergence_state": "IMPROVING",
"phase_history": [],
"safety_gate_passed": false,
"safety_gate_failures": 0,
"unfixable_findings": [],
"diminishing_count": 0,
"unfixable_info_count": 0
}
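As a rough sketch of how R9 could be proposed without touching the state yet, the snippet below returns a proposal tuple when `convergence` is absent. The function name and the `(check_id, field, new_value)` tuple shape are assumptions made for illustration; only the default values come from the block above.

```python
import copy

# Defaults copied from the R9 block above.
CONVERGENCE_DEFAULTS = {
    "phase": "correctness",
    "phase_iterations": 0,
    "total_iterations": 0,
    "plateau_count": 0,
    "last_score_delta": 0,
    "convergence_state": "IMPROVING",
    "phase_history": [],
    "safety_gate_passed": False,
    "safety_gate_failures": 0,
    "unfixable_findings": [],
    "diminishing_count": 0,
    "unfixable_info_count": 0,
}


def propose_convergence(state: dict):
    """Return an R9 proposal if `convergence` is absent, else None.

    The state dict is never modified here; applying happens after confirmation.
    """
    if "convergence" in state:
        return None
    # deepcopy so the default lists are not shared with the repaired state
    return ("R9", "convergence", copy.deepcopy(CONVERGENCE_DEFAULTS))
```

Returning a proposal instead of mutating in place keeps this check consistent with the rule that all repairs are collected before any is applied.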
R10: Missing recovery_budget object
- If `recovery_budget` is missing entirely: propose initializing it with defaults:
{
"total_weight": 0.0,
"max_weight": 5.5,
"applications": []
}
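Returning to the stale-lock check (R8): the `kill -0 $pid` probe has a direct Python counterpart, sketched below for POSIX systems. How the PID is obtained is not specified here, so it is passed in as a parameter; the function names and the proposal tuple are hypothetical.

```python
import os


def pid_alive(pid: int) -> bool:
    """Python equivalent of `kill -0 $pid`: signal 0 only tests for existence."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not one process
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def propose_lock_removal(lock_path: str, pid: int):
    """Return an R8 proposal when the lock exists and its owner is gone, else None."""
    if os.path.exists(lock_path) and not pid_alive(pid):
        return ("R8", lock_path, "remove stale lock")
    return None
```

Treating `PermissionError` as "alive" errs on the side of leaving the lock in place when ownership cannot be confirmed.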
If no repairs are needed: report "State is healthy. No repairs needed." and stop.
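The collect-then-report flow can be illustrated with a small Python sketch covering the scalar checks R1, R4, R5, R6, and R7. It is only an approximation under stated assumptions: the `(check_id, dotted_field, new_value)` tuple shape and both function names are hypothetical, and R2, R3, and R8 through R10 are left out because they need timestamps, the lock file, or default objects.

```python
VALID_MODES = ("standard", "bugfix", "migration", "bootstrap",
               "testing", "refactor", "performance")
TERMINAL_STATES = ("COMPLETE", "ABORTED")


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def collect_repairs(state, total_retries_max=10, max_weight=5.5, max_iterations=8):
    """Return (check_id, dotted_field, new_value) proposals without mutating state."""
    repairs = []

    if state.get("version") != "1.5.0":  # R1
        repairs.append(("R1", "version", "1.5.0"))

    if "mode" in state and state["mode"] not in VALID_MODES:  # R4
        repairs.append(("R4", "mode", "standard"))

    seq = state.get("_seq")  # R5: missing, zero, negative, or non-numeric
    if not _is_number(seq) or seq <= 0:
        repairs.append(("R5", "_seq", 1))

    retries = state.get("total_retries")  # R6: cap counters at their maximums
    if _is_number(retries) and retries > total_retries_max:
        repairs.append(("R6", "total_retries", total_retries_max))
    budget = state.get("recovery_budget")
    if isinstance(budget, dict):
        stored_cap = budget.get("max_weight")
        cap = stored_cap if _is_number(stored_cap) else max_weight
        weight = budget.get("total_weight")
        if _is_number(weight) and weight > cap:
            repairs.append(("R6", "recovery_budget.total_weight", cap))
    convergence = state.get("convergence")
    if isinstance(convergence, dict):
        iterations = convergence.get("total_iterations")
        if _is_number(iterations) and iterations > max_iterations:
            repairs.append(("R6", "convergence.total_iterations", max_iterations))

    complete, story_state = state.get("complete"), state.get("story_state")  # R7
    if complete is True and story_state not in TERMINAL_STATES:
        repairs.append(("R7", "story_state", "COMPLETE"))
    elif complete is False and story_state in TERMINAL_STATES:
        repairs.append(("R7", "complete", True))

    return repairs


def summarize(repairs) -> str:
    """Report a healthy state, or list one proposal per line."""
    if not repairs:
        return "State is healthy. No repairs needed."
    return "\n".join(f"{cid}: set {field} to {value!r}" for cid, field, value in repairs)
```

Because nothing is written while collecting, the full list can be shown to the user before any change is made.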
If repairs are needed, present them to the user:
Use AskUserQuestion:
If the user confirms:
Apply the confirmed repairs with `python3`:
python3 -c "
import json, sys
state = json.load(open('.forge/state.json'))
# ... apply mutations ...
state['_seq'] = state.get('_seq', 0) + 1 # increment _seq for consistency
print(json.dumps(state, indent=2))
" > /tmp/forge-repair-state.json
Write the result through `forge-state-write.sh` (preserves WAL and atomic write semantics):
bash "${CLAUDE_PLUGIN_ROOT}/shared/forge-state-write.sh" write "$(cat /tmp/forge-repair-state.json)" --forge-dir .forge
If forge-state-write.sh is not available, fall back to direct write with mv:
mv /tmp/forge-repair-state.json .forge/state.json
If a stale lock was flagged (R8), remove `.forge/.lock`.

Report the outcome:

## Repair Results
Applied {n} repairs to .forge/state.json:
- R{n}: {description} — FIXED
- ...
State file is now valid. Run `/forge-diagnose` to verify.
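The `# ... apply mutations ...` placeholder in the python3 step could be filled in along these lines. This is a sketch only: it assumes repairs are represented as hypothetical `(check_id, dotted_field, new_value)` tuples, that every parent along a dotted path is either missing or a JSON object, and that `_seq` is numeric by the time the bump runs (for example because R5 is among the applied repairs).

```python
import copy


def apply_repairs(state: dict, repairs) -> dict:
    """Return a repaired deep copy of state; the input is left untouched."""
    repaired = copy.deepcopy(state)
    for _check_id, dotted_field, new_value in repairs:
        *parents, leaf = dotted_field.split(".")
        target = repaired
        for key in parents:
            # create missing parent objects on the way down
            target = target.setdefault(key, {})
        target[leaf] = new_value
    # Mirrors the python3 step above: bump _seq once per repair run.
    repaired["_seq"] = repaired.get("_seq", 0) + 1
    return repaired
```

Working on a copy means a cancelled or failed run leaves the loaded state exactly as it was read.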
If the user cancels: report "Repair cancelled. State unchanged."
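One caveat on the `mv` fallback: a rename is atomic only when source and destination are on the same filesystem, and `/tmp` is often a separate mount. A possible alternative, shown here as a hedged sketch and not as part of the skill, is to create the temporary file next to `state.json` and swap it in with `os.replace`. The function name is hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state_atomically(state: dict, state_path) -> None:
    """Write state as JSON via a temp file in the same directory, then rename."""
    state_path = Path(state_path)
    fd, tmp_name = tempfile.mkstemp(
        dir=state_path.parent, prefix=".state-", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, state_path)  # same directory, so same filesystem
    except BaseException:
        # on any failure, remove the temp file and leave state.json untouched
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Like the `mv` fallback, this path bypasses the WAL, so it only makes sense when `forge-state-write.sh` is unavailable.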
- Do not delete `.forge/state.json`; that is what `/forge-reset` does.
- Do not modify `.claude/forge.local.md`, `.claude/forge-config.md`, or `.claude/forge-log.md`.
- Always use `forge-state-write.sh recover` for WAL recovery; do not manually parse the WAL file.

| Condition | Action |
|-----------|--------|
| state.json missing | Report "No pipeline state found. Nothing to repair. Run /forge-run to start a pipeline." and STOP |
| state.json unparseable JSON | Attempt WAL recovery first. If WAL recovery fails, report "state.json is corrupted beyond repair. Run /forge-reset to start fresh." and STOP |
| WAL file missing or corrupt | Log INFO "No WAL to recover." Continue with remaining checks |
| forge-state-write.sh unavailable | Fall back to direct write with mv (atomic on same filesystem) |
| User cancels repair | Report "Repair cancelled. State unchanged." and STOP |
| Write fails (permissions) | Report "Could not write repaired state. Check file permissions." and STOP |
| forge-config.md missing | Use default values for all maximum checks |

- `/forge-diagnose` -- Read-only diagnostic to identify issues before repairing (run diagnose first, then repair)
- `/forge-reset` -- Clear all state when repair is insufficient (more destructive)
- `/forge-resume` -- Resume pipeline after state is repaired
- `/forge-status` -- Check pipeline state after repair to verify the result