plugins/legacy/skills/oneshot/SKILL.md
End-to-end autonomous workflow — research, plan, implement, validate, ship, and merge in one command. **ALWAYS use when** the user says 'oneshot', 'do everything end to end', 'full workflow', or wants to go from ticket/idea to merged PR autonomously. All phases run sequentially in the current session, using agent teams for parallelism when needed.
npx skillsauth add coalesce-labs/catalyst oneshot
Install this skill globally with one command. Works with Claude Code, Cursor, and Windsurf.
End-to-end autonomous workflow that chains research → plan → implement → validate → ship → merge in a single session. All phases run sequentially in the current Claude Code session, invoking skills directly. Context is managed naturally via Claude's automatic compaction, and the thoughts/ system provides persistent handoff documents between phases.
Legacy mode for the orchestrator (post-2026-06-15).
oneshot remains the canonical single-shot lifecycle for direct user invocation. The orchestrator's default worker dispatch is also still oneshot-legacy; opt in to the per-phase pipeline by setting catalyst.orchestration.dispatchMode to "phase-agents" in .catalyst/config.json, which runs nine short-lived claude --bg skills (phase-triage … phase-monitor-deploy) instead of one long claude -p oneshot worker. See Phase agents for the pipeline, model assignment, and cost economics.
# Resolve the shared catalyst-dev scripts dir (skills live in catalyst-legacy; scripts stay in
# catalyst-dev). Fail fast with an actionable message if catalyst-dev is not installed.
source "${CLAUDE_PLUGIN_ROOT:-plugins/legacy}/scripts/require-catalyst-dev.sh" \
"${CLAUDE_PLUGIN_ROOT:-plugins/legacy}" || exit 1
# 0. Check project setup (thoughts, config, workflow context init)
if [[ -f "${CATALYST_DEV_SCRIPTS}/check-project-setup.sh" ]]; then
"${CATALYST_DEV_SCRIPTS}/check-project-setup.sh" || exit 1
fi
# 1. Validate thoughts system (REQUIRED)
if [[ ! -d "thoughts/shared" ]]; then
echo "❌ ERROR: Thoughts system not configured"
echo "Run: ./scripts/humanlayer/init-project.sh . {project-name}"
exit 1
fi
Supports two input modes:
Ticket-based:
/catalyst-dev:oneshot PROJ-123
Reads ticket from Linear, uses title/description as research query.
Freeform:
/catalyst-dev:oneshot "How does authentication work and can we add OAuth?"
Uses the provided text as the research query directly.
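The two input modes can be distinguished mechanically. The sketch below is one possible way to do it, assuming ticket IDs look like PROJ-123 (an uppercase team key, a hyphen, then digits). Both that pattern and the classify_input name are illustrative assumptions, not part of the skill.

```shell
#!/usr/bin/env sh
# Hypothetical helper: classify the first argument as a ticket ID or freeform text.
# Assumption: ticket IDs match TEAM-123 (uppercase letters/digits, hyphen, digits).
classify_input() {
  arg="${1:-}"
  if [ -z "$arg" ]; then
    echo "empty"
  elif printf '%s\n' "$arg" | grep -Eq '^[A-Z][A-Z0-9]*-[0-9]+$'; then
    echo "ticket"
  else
    echo "freeform"
  fi
}

classify_input "PROJ-123"
classify_input "How does authentication work and can we add OAuth?"
```

A caller could branch on the result: read the ticket from Linear for "ticket", or pass the text straight through as the research query for "freeform".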
| Flag | Description |
| ---------------------- | ------------------------------------------------------------- |
| --team | Use agent teams for parallel implementation in Phase 3 |
| --label <text> | Custom display label for the session (overrides auto-derived) |
| --no-merge | Stop after PR creation — do NOT enter listen loop or merge |
| --no-ticket | Skip Linear ticket creation in freeform mode |
| --skip-validation | Skip Phase 4 entirely |
| --skip-quality-gates | Run /validate-plan but skip quality gate loop |
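As a rough illustration of how the flags in the table might be consumed, here is a minimal argument parser. The function name and most variable names are hypothetical; only USER_LABEL is taken from the label-derivation snippet later in this document.

```shell
#!/usr/bin/env sh
# Hypothetical parser for the flags listed in the table above.
# Sets one variable per flag plus QUERY (the ticket ID or freeform text).
parse_oneshot_args() {
  USE_TEAM=false; NO_MERGE=false; NO_TICKET=false
  SKIP_VALIDATION=false; SKIP_QUALITY_GATES=false
  USER_LABEL=""; QUERY=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --team)               USE_TEAM=true ;;
      --no-merge)           NO_MERGE=true ;;
      --no-ticket)          NO_TICKET=true ;;
      --skip-validation)    SKIP_VALIDATION=true ;;
      --skip-quality-gates) SKIP_QUALITY_GATES=true ;;
      --label)
        # --label takes a value; fail if it is missing.
        if [ $# -lt 2 ]; then
          echo "error: --label needs a value" >&2
          return 2
        fi
        USER_LABEL="$2"
        shift ;;
      --*)
        echo "error: unknown flag $1" >&2
        return 2 ;;
      *)
        # The first positional argument is the ticket ID or query; extras are ignored.
        if [ -z "$QUERY" ]; then QUERY="$1"; fi ;;
    esac
    shift
  done
  if [ -z "$QUERY" ]; then
    echo "error: missing ticket ID or query" >&2
    return 2
  fi
}
```

Rejecting unknown flags, rather than silently ignoring them, keeps a typo such as --no-merg from quietly enabling the merge step.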
When running under an /orchestrate coordinator, oneshot writes status updates to a worker signal
file so the orchestrator can track progress and run adversarial verification.
Single-ticket scope contract (READ FIRST — CTL-208). Your assigned scope is exactly the ticket
ID passed as the first positional argument ($1). This is the SOLE source of truth for what work to
do. The orchestrator state directory ($ORCH_DIR), wave briefings, sibling worker signal files, and
comms channel participant lists exist for write-through state reporting and one-way context
absorption — they NEVER expand or modify your scope.
DO:
- Use ${TICKET_ID} (= $1) as your single ticket throughout the workflow.
- Read and write ${ORCH_DIR}/workers/${TICKET_ID}.json: the SPECIFIC file named for
your ticket, not the directory.
- Read ${ORCH_DIR}/wave-${WAVE}-briefing.md, where
${WAVE} comes from your signal file's wave field (set by the dispatcher in
orchestrate-dispatch-next).

DO NOT:
- List ${ORCH_DIR}/workers/*.json to discover sibling tickets.
- Read ${ORCH_DIR}/state.json to see what other tickets are queued or in flight.
- … $1.
- Treat the comms channel participant list (catalyst-comms status) as your scope.
- … $1. If $1 is empty or missing, fail loudly; do not search for tickets to do.

Detection (checked once at startup):
# 1. CATALYST_ORCHESTRATOR_DIR env var (set by orchestrator in dispatch)
ORCH_DIR="${CATALYST_ORCHESTRATOR_DIR:-}"
# 2. CATALYST_ORCHESTRATOR_ID env var (set by orchestrator in dispatch)
ORCH_ID="${CATALYST_ORCHESTRATOR_ID:-}"
# 3. Sibling directory with workers/ subdirectory (convention-based)
if [ -z "$ORCH_DIR" ]; then
  PARENT=$(dirname "$(pwd)")
  for DIR in "$PARENT"/*/workers; do
    if [ -d "$DIR" ]; then
      ORCH_DIR=$(dirname "$DIR")
      break
    fi
  done
fi
# Resolve global state script path
STATE_SCRIPT="${CATALYST_DEV_SCRIPTS}/catalyst-state.sh"
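The scope contract above asks the worker to fail loudly when $1 is missing and to touch only its own signal file. A guard along these lines could enforce both. The require_ticket_id name is hypothetical; ORCH_DIR, TICKET_ID, and SIGNAL_FILE are the variables this document already uses.

```shell
#!/usr/bin/env sh
# Hypothetical guard: refuse to start without an explicit ticket ID, and derive
# the one signal file that is in scope for this worker.
require_ticket_id() {
  TICKET_ID="${1:-}"
  if [ -z "$TICKET_ID" ]; then
    echo "ERROR: no ticket ID passed as \$1; refusing to search for work" >&2
    return 1
  fi
  # Empty when not running under an orchestrator (ORCH_DIR unset or empty).
  SIGNAL_FILE="${ORCH_DIR:+${ORCH_DIR}/workers/${TICKET_ID}.json}"
}
```

A typical call would be `require_ticket_id "$1" || exit 1` right after orchestrator detection, so every later step reads SIGNAL_FILE instead of building paths from the workers/ directory.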
Shared comms channel (CTL-111 / CTL-249): if CATALYST_COMMS_CHANNEL is set by the
orchestrator, the worker joins the shared channel, posts real traffic at each lifecycle boundary,
and reads inbound messages (directed to $TICKET_ID) after each phase transition. Best-effort —
every call is wrapped so a missing catalyst-comms CLI never crashes the worker. The worker posts
at minimum 4 messages per run: start + phase transitions + done. Inbound reads are driven by
comms_check (see below) — a non-blocking poll that checks for abort, use-event-driven, and
reprioritize signals from the orchestrator.
# Resolve the catalyst-comms binary. Prefer the plugin-shipped copy so installs
# where `catalyst-comms` is only a shell alias (which doesn't propagate to
# subshells) still work. Fall back to PATH for users who have symlinked it.
COMMS_BIN="${CATALYST_DEV_SCRIPTS}/catalyst-comms"
[ -x "$COMMS_BIN" ] || COMMS_BIN="$(command -v catalyst-comms 2>/dev/null || true)"
if [ -z "$COMMS_BIN" ] || [ ! -x "$COMMS_BIN" ]; then
echo "warn: catalyst-comms not found — worker comms disabled" >&2
COMMS_BIN=""
fi
# Helper — called at every hook point below. Silent no-op when comms is unavailable.
comms_post() {
  local type="$1" body="$2"
  [ -z "${CATALYST_COMMS_CHANNEL:-}" ] && return 0
  [ -n "$COMMS_BIN" ] || return 0
  "$COMMS_BIN" send "$CATALYST_COMMS_CHANNEL" "$body" \
    --as "$TICKET_ID" --type "$type" >/dev/null 2>&1 || true
}
# Inbound comms — read messages directed to this worker at each phase boundary.
# COMMS_LAST_READ tracks the channel file line offset so we skip historical messages.
# Initialized after join (below) to the current end-of-file.
CATALYST_DIR="${CATALYST_DIR:-$HOME/catalyst}"
COMMS_CHANNEL_FILE="${CATALYST_DIR}/comms/channels/${CATALYST_COMMS_CHANNEL:-_}.jsonl"
COMMS_LAST_READ=0
comms_check() {
[ -z "${CATALYST_COMMS_CHANNEL:-}" ] && return 0
[ -n "$COMMS_BIN" ] || return 0
[ -f "$COMMS_CHANNEL_FILE" ] || return 0
local msgs next_pos
# Snapshot line count BEFORE polling so messages arriving during the read
# window are picked up on the next call rather than silently skipped.
next_pos=$(wc -l < "$COMMS_CHANNEL_FILE" | tr -d ' ')
msgs=$("$COMMS_BIN" poll "$CATALYST_COMMS_CHANNEL" \
--filter-to "$TICKET_ID" --since "$COMMS_LAST_READ" 2>/dev/null || true)
COMMS_LAST_READ="$next_pos"
[ -z "$msgs" ] && return 0
while IFS= read -r msg; do
[ -z "$msg" ] && continue
local msg_type msg_body
msg_type=$(printf '%s' "$msg" | jq -r '.type // "info"' 2>/dev/null || echo "info")
msg_body=$(printf '%s' "$msg" | jq -r '.body // ""' 2>/dev/null || echo "")
echo "[comms] Inbound ($msg_type): $msg_body" >&2
case "$msg_body" in
abort*|ABORT*) echo "[comms] Abort signal — exiting" >&2; exit 1 ;;
esac
done <<< "$msgs"
}
# Once, at startup — right after orchestrator mode detection:
if [ -n "${CATALYST_COMMS_CHANNEL:-}" ] && [ -n "$COMMS_BIN" ]; then
"$COMMS_BIN" join "$CATALYST_COMMS_CHANNEL" \
--as "$TICKET_ID" \
--capabilities "oneshot: ${TICKET_ID}" \
--orch "${CATALYST_ORCHESTRATOR_ID:-}" \
--parent orchestrator \
--ttl 3600 >/dev/null 2>&1 || true
comms_post info "started oneshot for $TICKET_ID"
# Snapshot the channel file line count so comms_check skips pre-worker messages
COMMS_CHANNEL_FILE="${CATALYST_DIR}/comms/channels/${CATALYST_COMMS_CHANNEL}.jsonl"
[ -f "$COMMS_CHANNEL_FILE" ] && COMMS_LAST_READ=$(wc -l < "$COMMS_CHANNEL_FILE" | tr -d ' ')
fi
# CTL-303: broker registration helpers. The worker uses agent.checkin auto-correlation
# instead of explicit filter.register calls. After PR creation, a second agent.checkin
# with claimed_pr set is all the broker needs to auto-derive a pr_lifecycle interest.
# The Phase 5 listen loop waits on filter.wake.${CATALYST_SESSION_ID} as before.
# When the daemon is not running, the loop falls back to direct catalyst-events wait-for.
broker_daemon_running() {
  if command -v catalyst-broker >/dev/null 2>&1; then
    catalyst-broker probe 2>/dev/null
  elif command -v catalyst-filter >/dev/null 2>&1; then
    catalyst-filter status 2>/dev/null | grep -q "^running"
  else
    return 1
  fi
}
# CTL-429: retry wrapper; the daemon may be starting up or momentarily unavailable.
# Tries up to 3 times with 2s gaps before declaring the daemon absent.
wait_for_broker_ready() {
  local max_attempts=3 attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    broker_daemon_running && return 0
    attempt=$((attempt + 1))
    [ "$attempt" -lt "$max_attempts" ] && sleep 2
  done
  return 1
}
# CTL-303: update claimed_pr in the broker via a second agent.checkin.
# The broker auto-derives pr_lifecycle from the check-in — no explicit filter.register needed.
broker_claim_pr() {
# Args: $1 = PR_NUMBER, $2 = TICKET_ID, $3 = BRANCH_NAME, $4 = REPO (org/name), $5 = BASE_BRANCH (default: main)
wait_for_broker_ready || return 1
[ -n "${CATALYST_SESSION_ID:-}" ] || return 1
local pr="$1" ticket="$2" repo="$4" base="${5:-main}"
local ts; ts="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
"$STATE_SCRIPT" event "$(jq -nc \
--arg ts "$ts" \
--arg sid "$CATALYST_SESSION_ID" \
--arg orch "${CATALYST_ORCHESTRATOR_ID:-}" \
--argjson pr "$pr" \
--arg ticket "$ticket" \
--arg repo "$repo" \
--arg base "$base" \
'{ts: $ts, event: "agent.checkin",
detail: {
session_id: $sid,
ticket: (if $ticket == "" then null else $ticket end),
orchestrator: (if $orch == "" then null else $orch end),
claimed_pr: $pr,
repo: $repo,
base_branches: [{pr: $pr, base: $base}]
}}')" 2>/dev/null || return 1
return 0
}
# Keep filter_register_worker as an alias for backward compat.
# New code should use broker_claim_pr instead.
filter_daemon_running() { broker_daemon_running; }
filter_register_worker() {
local pr="$1" ticket="$2" branch="$3" repo="$4" base="${5:-main}"
broker_claim_pr "$pr" "$ticket" "$branch" "$repo" "$base"
}
filter_deregister_worker() {
broker_daemon_running || return 0
[ -n "${CATALYST_SESSION_ID:-}" ] || return 0
# Deregister the auto-correlated pr_lifecycle interest (interest_id = sid).
"$STATE_SCRIPT" event "$(jq -nc \
--arg sid "$CATALYST_SESSION_ID" \
'{ts: (now | todate), event: "filter.deregister", orchestrator: null, worker: null, detail: {interest_id: $sid}}')" 2>/dev/null || true
# CTL-357: also deregister the comms_lifecycle interest (interest_id = sid-comms).
"$STATE_SCRIPT" event "$(jq -nc \
--arg id "${CATALYST_SESSION_ID}-comms" \
'{ts: (now | todate), event: "filter.deregister", orchestrator: null, worker: null, detail: {interest_id: $id}}')" 2>/dev/null || true
}
# CTL-357: register a comms_lifecycle worker subscription so the broker wakes
# the worker on messages directed at it (`to=$TICKET_ID` or `to=all`) on the
# shared channel. The interest_id is "${CATALYST_SESSION_ID}-comms" — distinct
# from the broker's auto-correlated pr_lifecycle interest (which uses just
# the session_id) so the two interests coexist. Both share the same
# notify_event = filter.wake.${CATALYST_SESSION_ID}, so the Phase 5 listen
# loop predicate is unchanged. Best-effort; broker absence is fine.
broker_register_comms() {
wait_for_broker_ready || return 0
[ -n "${CATALYST_SESSION_ID:-}" ] || return 0
[ -n "${CATALYST_COMMS_CHANNEL:-}" ] || return 0
[ -n "${TICKET_ID:-}" ] || return 0
"$STATE_SCRIPT" event "$(jq -nc \
--arg sid "$CATALYST_SESSION_ID" \
--arg id "${CATALYST_SESSION_ID}-comms" \
--arg orch "${CATALYST_ORCHESTRATOR_ID:-}" \
--arg notify "filter.wake.${CATALYST_SESSION_ID}" \
--arg channel "$CATALYST_COMMS_CHANNEL" \
--arg ticket "$TICKET_ID" \
'{ts: (now | todate), event: "filter.register",
orchestrator: (if $orch == "" then null else $orch end),
worker: null,
detail: {
interest_id: $id,
interest_type: "comms_lifecycle",
notify_event: $notify,
persistent: true,
session_id: $sid,
channel: $channel,
subscriber_kind: "worker",
subscriber_ticket: $ticket
}}')" 2>/dev/null || true
}
# Register the worker's comms subscription once at startup. The wait-for
# filter (filter.wake.${CATALYST_SESSION_ID}) is unchanged — this just gives
# the broker a second deterministic reason to fire that wake.
broker_register_comms
# Belt-and-suspenders: trap on EXIT/INT/TERM ensures graceful deregister.
# Watchdog cleanup in the daemon handles crash cases via session_id matching.
trap 'filter_deregister_worker' EXIT INT TERM
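The comms_check helper above tracks a line offset into the channel file and snapshots the line count before reading. The stand-alone sketch below isolates that bookkeeping so it can be tried on any append-only file. The read_new_lines name and the NEXT_OFFSET variable are illustrative assumptions, and the sketch reads the file directly with sed instead of calling catalyst-comms poll.

```shell
#!/usr/bin/env sh
# Hypothetical stand-alone version of the offset bookkeeping in comms_check.
# read_new_lines <file> <last_read_line_count>
# Prints the lines after the offset and sets NEXT_OFFSET for the next call.
read_new_lines() {
  local file="$1" last_read="$2"
  NEXT_OFFSET="$last_read"
  [ -f "$file" ] || return 0
  # Snapshot first: anything appended after this point waits for the next call.
  NEXT_OFFSET=$(wc -l < "$file" | tr -d ' ')
  if [ "$NEXT_OFFSET" -gt "$last_read" ]; then
    # Print exactly the lines in (last_read, NEXT_OFFSET].
    sed -n "$((last_read + 1)),${NEXT_OFFSET}p" "$file"
  fi
}
```

Call it with a redirect (for example `read_new_lines "$file" "$offset" > new.jsonl`) and not inside `$(...)`, because a command substitution runs in a subshell and the updated NEXT_OFFSET would be lost.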
If ORCH_DIR is detected, the worker:
- Reads and writes ${ORCH_DIR}/workers/${TICKET_ID}.json (the single named file for
this worker; do NOT list other files in the workers/ directory)
- Writes status, phase, and updatedAt to both
the local signal file AND the global state at ~/catalyst/state.json
- Writes label to the signal file at startup (see Label Derivation below)
- Updates definitionOfDone at Phase 4 (validation) and Phase 5 (ship) with actual results
- Reads ${ORCH_DIR}/wave-${WAVE}-briefing.md if it exists, where
${WAVE} is read from the worker's own signal file's wave field (set by dispatcher). Do NOT
glob wave-*-briefing.md; only the worker's own wave is in scope. If the signal file has no
wave field (older orchestrators), skip briefing read entirely.

Label Derivation (at startup, before first phase transition):
The label field in the signal file gives the session a human-readable display name. It is derived
automatically unless overridden with --label:
# If --label flag was provided, use it directly
if [ -n "${USER_LABEL:-}" ]; then
  LABEL="$USER_LABEL"
else
  # Auto-derive: "<skill> <ticket>"
  SKILL_NAME="oneshot" # or the current skill name
  LABEL="${SKILL_NAME} ${TICKET_ID}"
fi
# Write to signal file (once, at startup). SIGNAL_FILE is the worker's own file,
# the same path the update helper below uses.
SIGNAL_FILE="${ORCH_DIR}/workers/${TICKET_ID}.json"
if [ -f "$SIGNAL_FILE" ]; then
  jq --arg label "$LABEL" '.label = $label' "$SIGNAL_FILE" > "${SIGNAL_FILE}.tmp" \
    && mv "${SIGNAL_FILE}.tmp" "$SIGNAL_FILE"
fi
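Every signal-file write in this document uses the same idiom: write to a .tmp sibling, then mv it over the original. A small wrapper can make that pattern harder to get wrong. The atomic_update name is hypothetical; the behavior mirrors the jq ... > tmp && mv lines above.

```shell
#!/usr/bin/env sh
# Hypothetical wrapper around the "write to .tmp, then mv" idiom.
# atomic_update <file> <filter command...>
# The filter reads the current file on stdin and writes the new content to stdout.
atomic_update() {
  local file="$1"
  shift
  local tmp="${file}.tmp"
  if "$@" < "$file" > "$tmp"; then
    # mv within one filesystem is a rename, so readers see the old or the new
    # file, never a half-written one.
    mv "$tmp" "$file"
  else
    # Filter failed: discard partial output and leave the original untouched.
    rm -f "$tmp"
    return 1
  fi
}
```

With this in place, the label write could read `atomic_update "$SIGNAL_FILE" jq --arg label "$LABEL" '.label = $label'`, since jq reads standard input when no file argument is given.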
Signal file + global state update helper (run at each phase boundary):
SIGNAL_FILE="${ORCH_DIR}/workers/${TICKET_ID}.json"
if [ -f "$SIGNAL_FILE" ]; then
OLD_STATUS=$(jq -r '.status' "$SIGNAL_FILE")
TS="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
# Update local signal file. Atomically records phaseTimestamps[$status] = $ts
# so the monitor can build a Gantt timeline. Sets completedAt for terminal states.
IS_TERMINAL="false"
case "$NEW_STATUS" in done|failed|stalled) IS_TERMINAL="true" ;; esac
jq --arg status "$NEW_STATUS" \
--arg phase "$PHASE_NUM" \
--arg ts "$TS" \
--argjson terminal "$IS_TERMINAL" \
'.status = $status
| .phase = ($phase | tonumber)
| .updatedAt = $ts
| .phaseTimestamps = ((.phaseTimestamps // {}) | .[$status] = $ts)
| (if $terminal then .completedAt = $ts else . end)' \
"$SIGNAL_FILE" > "${SIGNAL_FILE}.tmp" && mv "${SIGNAL_FILE}.tmp" "$SIGNAL_FILE"
# Update global state (if orchestrator ID is known)
if [ -n "$ORCH_ID" ] && [ -f "$STATE_SCRIPT" ]; then
"$STATE_SCRIPT" worker "$ORCH_ID" "$TICKET_ID" \
".status = \"$NEW_STATUS\" | .phase = $PHASE_NUM"
# Emit phase-advance / terminal event via the central producer (CTL-229).
# Routine info-tier transitions (researching → planning → …) coalesce
# within the configured window; terminal transitions (pr-created, merging,
# done, failed, stalled, deploy-failed) flush any pending queue and emit
# immediately, with PR enrichment when --to is PR-bearing.
EMITTER="${CATALYST_DEV_SCRIPTS}/emit-worker-status-change.sh"
if [ -x "$EMITTER" ]; then
"$EMITTER" emit \
--orch "$ORCH_ID" \
--ticket "$TICKET_ID" \
--from "$OLD_STATUS" \
--to "$NEW_STATUS" \
--signal-file "$SIGNAL_FILE" >/dev/null 2>&1 || true
fi
fi
# CTL-111: announce phase transition to shared comms channel. Runs 5× in the
# normal path (researching → planning → implementing → validating → shipping),
# comfortably above the ≥2-transition floor.
comms_post info "${OLD_STATUS} → ${NEW_STATUS}"
fi
# CTL-249: check for inbound orchestrator messages after each phase transition.
comms_check
When worker creates a PR, also update global state with PR details. Record prOpenedAt
immediately so the dashboard can show how long the PR has been open separately from how long it took
to merge:
PR_OPENED_AT="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
if [ -n "$ORCH_ID" ] && [ -f "$STATE_SCRIPT" ]; then
"$STATE_SCRIPT" worker "$ORCH_ID" "$TICKET_ID" \
".pr = {number: ${PR_NUMBER}, url: \"${PR_URL}\", ciStatus: \"pending\", prOpenedAt: \"${PR_OPENED_AT}\", autoMergeArmedAt: null, mergedAt: null}"
"$STATE_SCRIPT" event "$(jq -nc \
--arg ts "$PR_OPENED_AT" \
--arg orch "$ORCH_ID" --arg w "$TICKET_ID" \
--argjson pr "$PR_NUMBER" --arg url "$PR_URL" \
'{ts: $ts, orchestrator: $orch, worker: $w, event: "worker-pr-created", detail: {pr: $pr, url: $url}}')"
fi
# CTL-111: announce PR opening on the shared comms channel
comms_post info "pr:#${PR_NUMBER} opened"
When PR is merged (CTL-252: written by the worker after active listen loop confirms CLEAN):
The worker actively merges its own PR after the listen loop confirms the PR is CLEAN (CI green +
reviews satisfied). The worker writes pr.mergedAt + status: "done" to the signal file and
transitions the Linear ticket. The orchestrator's Phase 4 is a safety-net fallback for workers
that stalled or crashed before completing their own merge.
When worker reaches terminal state (done or failed):
Mandatory attention on block (per [[catalyst-comms]] § Posting Discipline §3): in addition to
the failure path below, the worker MUST also comms_post attention "<reason>" when it hits any of
the following mid-flight, even if it is not yet writing status: "failed":
- status: "stalled" (any phase)

Use a single attention per blocker (do not retry). Continue with whatever work is still possible,
or exit if the blocker is total. The orchestrator's poll loop will promote the message to a
state-level NEEDS ATTENTION item.
if [ -n "$ORCH_ID" ] && [ -f "$STATE_SCRIPT" ]; then
if [ "$NEW_STATUS" = "done" ]; then
"$STATE_SCRIPT" event "$(jq -nc --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--arg orch "$ORCH_ID" --arg w "$TICKET_ID" \
'{ts: $ts, orchestrator: $orch, worker: $w, event: "worker-done", detail: null}')"
elif [ "$NEW_STATUS" = "failed" ]; then
"$STATE_SCRIPT" event "$(jq -nc --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
--arg orch "$ORCH_ID" --arg w "$TICKET_ID" --arg reason "$ERROR_MSG" \
'{ts: $ts, orchestrator: $orch, worker: $w, event: "worker-failed", detail: {reason: $reason}}')"
"$STATE_SCRIPT" attention "$ORCH_ID" "waiting-for-user" "$TICKET_ID" \
"Worker failed: ${ERROR_MSG}"
# CTL-111: post attention on shared comms channel so sibling workers / orchestrator
# monitoring loop can observe the blocker without reading the state file.
comms_post attention "worker failed: ${ERROR_MSG:-unknown}"
fi
fi
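The "single attention per blocker" rule above can be enforced with a small de-duplicating wrapper. This is only a sketch: attention_once and ATTENTION_SENT are hypothetical names, blocker keys are assumed not to contain the | character, and a stub stands in for comms_post when the real helper is not loaded.

```shell
#!/usr/bin/env sh
# Stub used only when the document's comms_post helper is not already defined.
command -v comms_post >/dev/null 2>&1 || comms_post() { echo "[$1] $2"; }

# "|"-separated list of blocker keys that have already been reported.
ATTENTION_SENT=""

# attention_once <blocker_key> <reason>
# Posts an attention message the first time a key is seen; later calls are no-ops.
attention_once() {
  local key="$1" reason="$2"
  case "|${ATTENTION_SENT}|" in
    *"|${key}|"*) return 0 ;;   # this blocker was already reported
  esac
  ATTENTION_SENT="${ATTENTION_SENT:+${ATTENTION_SENT}|}${key}"
  comms_post attention "$reason"
}
```

Keying on a short blocker name rather than the message text means a reworded reason for the same blocker still counts as a repeat.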
Phase-to-status mapping for signal file:
| Phase | Signal Status | Writer |
| ----------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------- |
| 1 start | researching | worker |
| 2 start | planning | worker |
| 3 start | implementing | worker |
| 4 start | validating | worker |
| 5 start | shipping | worker |
| 5 PR opened | pr-created + pr.prOpenedAt + pr.ciStatus: "pending" | worker |
| 5 PR listen loop: inline blocker handled | (worker fixes CI/reviews and loops) | worker |
| 5 PR listen loop: human changes-requested | status: "stalled" + attention | worker |
| 5 PR listen loop: unresolvable conflicts | status: "stalled" + attention | worker |
| 5 PR merged by worker (skipDeployVerification=true or no deploy config) | pr.ciStatus: "merged" + pr.mergedAt + status: "done" + completedAt | worker |
| 5 PR merged by worker (skipDeployVerification=false, CTL-211) | pr.ciStatus: "merged" + pr.mergedAt + pr.mergeCommitSha + deploy.startedAt + deploy.environment → waits for deployment_status | worker |
| 5 deployment_status.success on production env | status: "done" + deploy.completedAt + deploy.result: "success" | worker (orchestrator Phase 4 as fallback) |
| 5 deployment_status.failure on production env | status: "deploy-failed" + deploy.failedAttempts + attention | worker (orchestrator Phase 4 as fallback) |
| 5 worker stalled/crashed — merge fallback | pr.ciStatus: "merged" + pr.mergedAt + status: "done" | orchestrator Phase 4 (safety net) |
| Any failure | failed | worker |
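The "N start" rows of the table reduce to a simple lookup. The two helpers below are a sketch with hypothetical names. The status strings come from the table, and the terminal set (done, failed, stalled) is the one the signal-file update helper in this document uses.

```shell
#!/usr/bin/env sh
# Hypothetical lookup for the "N start" rows of the phase-to-status table.
status_for_phase() {
  case "$1" in
    1) echo "researching" ;;
    2) echo "planning" ;;
    3) echo "implementing" ;;
    4) echo "validating" ;;
    5) echo "shipping" ;;
    *) echo "error: unknown phase '$1'" >&2; return 1 ;;
  esac
}

# Returns 0 for statuses the signal-file update helper treats as terminal.
is_terminal_status() {
  case "$1" in
    done|failed|stalled) return 0 ;;
    *) return 1 ;;
  esac
}
```

A phase boundary could then set `NEW_STATUS=$(status_for_phase "$PHASE_NUM")` before invoking the update helper.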
Start a catalyst-session at the very beginning of the workflow, before Phase 1. This session spans the entire oneshot lifecycle and records phase transitions, PR creation, and completion.
SESSION_SCRIPT="${CATALYST_DEV_SCRIPTS}/catalyst-session.sh"
if [[ -x "$SESSION_SCRIPT" ]]; then
CATALYST_SESSION_ID=$("$SESSION_SCRIPT" start --skill "oneshot" \
--ticket "${TICKET_ID:-}" \
--workflow "${CATALYST_SESSION_ID:-}")
export CATALYST_SESSION_ID
# CTL-455: mirror the catalyst session_id into the worker signal file so
# `orchestrate-roll-usage.sh` can populate session_metrics when the worker's
# stream produces a result event. Orchestrator mode only — standalone runs
# have no signal file.
if [ -n "${ORCH_DIR:-}" ] && [ -f "${ORCH_DIR}/workers/${TICKET_ID}.json" ]; then
_SF="${ORCH_DIR}/workers/${TICKET_ID}.json"
jq --arg sid "$CATALYST_SESSION_ID" '.catalystSessionId = $sid' \
"$_SF" > "${_SF}.tmp" && mv "${_SF}.tmp" "$_SF"
fi
fi
At each phase transition, call BOTH the signal file update helper (above) AND the session phase call. The session phase call is additive — it never replaces the signal file write:
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" phase "$CATALYST_SESSION_ID" "$NEW_STATUS" --phase "$PHASE_NUM"
fi
When a PR is created (Phase 5), record it in the session:
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" pr "$CATALYST_SESSION_ID" --number "$PR_NUMBER" --url "$PR_URL"
fi
At terminal states (done or failed), end the session:
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status done --reason "PR merged"
# or: "$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status failed --reason "<why>"
fi
Iteration counter (see CTL-158): bump --kind plan whenever the plan is re-entered
(validate-plan kicks back to create-plan) and --kind fix whenever an automated fix retry runs in
Phase 4 (quality gates) or Phase 5 (CI auto-fix). The counts are flushed to OTLP as
claude_code_iteration_count_total{linear_key,kind} at session end so downstream estimation can
read rework signal per ticket:
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" iteration "$CATALYST_SESSION_ID" --kind fix # or --kind plan
fi
This phase runs in the current session to allow user interaction during research.
"${CATALYST_DEV_SCRIPTS}/workflow-context.sh" set-ticket "TICKET-ID"
This ensures .catalyst/.workflow-context.json exists and currentTicket is set before any
other work begins. Downstream skills and hooks depend on this file existing.
- Linear: if a ticket exists, move it to stateMap.research (default: "In Progress")
- Freeform mode (unless --no-ticket): After research completes, offer to create a Linear
ticket from the findings:
Research complete. Would you like to create a Linear ticket from these findings?
[y/N]
If yes, create a ticket via the Linearis CLI (run linearis issues usage for create syntax)
using the research summary as description, then register the ticket ID:
workflow-context.sh set-ticket "NEW-TICKET-ID"
- Follow the /catalyst-dev:research-codebase process exactly. This is the
single source of truth for how codebase research works (including DeepWiki orientation, sub-agent
spawning, synthesis, and document creation). The research document MUST be written to
thoughts/shared/research/ and tracked in workflow context before proceeding to Phase 2.

Runs /catalyst-dev:create-plan directly in the current session.
/catalyst-dev:create-plan thoughts/shared/research/$RESEARCH_DOC
What happens:
- Runs /create-plan interactively with the user
- Saves the plan to thoughts/shared/plans/YYYY-MM-DD-{ticket}-{description}.md

User interaction: The user interacts with the planning session normally. The plan is refined iteratively until approved.
Linear: If ticket exists, move to stateMap.planning (default: "In Progress").
After the plan is approved, runs implementation directly:
/catalyst-dev:implement-plan thoughts/shared/plans/$PLAN_DOC
What happens:
- Runs /implement-plan with full capabilities; follows TDD (tests written before implementation
per phase)

Linear: If ticket exists, move to stateMap.inProgress (default: "In Progress").
Skip this phase entirely with --skip-validation.
Runs validation and quality enforcement directly:
/catalyst-dev:validate-plan thoughts/shared/plans/$PLAN_DOC
Step 1: Validate plan implementation
- Runs /validate-plan against the plan document

Step 2: Run skill-based quality gates (skip with --skip-quality-gates)
Run these skill/agent gates in order:
Gate 1: /validate-type-safety → tsc + reward hacking scan + tests + lint
Gate 2: /security-review → security vulnerability scan (built-in)
Gate 3: code-reviewer agent → style/guideline adherence
Gate 4: pr-test-analyzer agent → test coverage verification
For each gate: run it; if it fails and is auto-fixable (gates 1 and 2), attempt a fix and re-run. Gates 3 and 4 produce advisory findings; address them if significant.
Step 3: Run config-based quality gates
Reads additional gates from .catalyst/config.json under catalyst.qualityGates (see Configuration
section below). Runs each gate in sequence, sorted by its order field:
For each gate (sorted by order):
1. Run gate.command
2. If passes → mark ✅, continue to next gate
3. If fails AND gate.autofix is true:
- Analyze errors
- Attempt automated fix
- Re-run gate.command
- After the fix attempt (pass OR fail), bump the iteration counter:
catalyst-session.sh iteration "$CATALYST_SESSION_ID" --kind fix
4. If fails AND (gate.autofix is false OR the autofix attempt failed):
- Log failure, continue to next gate
5. After all gates, if any required gate failed:
- Retry from first failed gate (up to maxRetries total cycles)
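The numbered steps above can be sketched as a small shell loop. Several things here are assumptions made for illustration: the "order|name|autofix|command" line format, the attempt_fix hook (a stub that always fails unless the caller defines one), the FIX_ITERATIONS counter standing in for the catalyst-session.sh iteration call, and the simplification that every gate counts as required.

```shell
#!/usr/bin/env sh
# Stub hook: a real attempt_fix would analyze the errors and edit code (hypothetical).
command -v attempt_fix >/dev/null 2>&1 || attempt_fix() { return 1; }

# run_gates <gates> <max_retries>
# <gates> holds one gate per line as "order|name|autofix|command" (illustrative format).
# Returns 0 once a cycle finishes with no failures, 1 after max_retries cycles.
run_gates() {
  local gates="$1" max_retries="$2"
  local sorted cycle=1 start_order=0 first_failed passed
  local order name autofix cmd
  FIX_ITERATIONS="${FIX_ITERATIONS:-0}"
  sorted=$(printf '%s\n' "$gates" | sort -t'|' -k1,1n)
  while [ "$cycle" -le "$max_retries" ]; do
    first_failed=""
    while IFS='|' read -r order name autofix cmd; do
      [ -n "$name" ] || continue
      # On retry cycles, resume from the first gate that failed last time.
      [ "$order" -ge "$start_order" ] || continue
      passed=false
      if sh -c "$cmd" </dev/null; then
        passed=true
      elif [ "$autofix" = "true" ]; then
        attempt_fix "$name" </dev/null || true
        # Counted whether or not the fix worked, as step 3 requires.
        FIX_ITERATIONS=$((FIX_ITERATIONS + 1))
        if sh -c "$cmd" </dev/null; then passed=true; fi
      fi
      if [ "$passed" = "true" ]; then
        echo "✅ $name"
      else
        echo "❌ $name"
        [ -n "$first_failed" ] || first_failed="$order"
      fi
    done <<EOF
$sorted
EOF
    [ -z "$first_failed" ] && return 0
    start_order="$first_failed"
    cycle=$((cycle + 1))
  done
  return 1
}
```

Feeding the inner loop from a here-document rather than a pipe keeps it in the current shell, so first_failed and FIX_ITERATIONS survive the loop. Each gate command gets /dev/null on stdin so it cannot consume the remaining gate lines.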
After max retries exhausted with failures: Present the user with options:
⚠️ Quality gates failed after {maxRetries} attempts:
❌ typecheck: 3 errors remaining
❌ test: 2 failing tests
Options:
[1] Fix manually and re-run gates
[2] Continue to Ship phase anyway (gates marked as skipped)
[3] Create handoff document and stop
Fallback behavior (no qualityGates config): If catalyst.qualityGates is not configured,
construct default gates from legacy config keys:
| Legacy Key | Gate | Order |
| ------------------------------ | --------- | ----- |
| catalyst.pr.typecheckCommand | typecheck | 1 |
| catalyst.pr.lintCommand | lint | 2 |
| catalyst.pr.testCommand | test | 3 |
| catalyst.pr.buildCommand | build | 4 |
If none of those keys exist either, skip quality gates entirely (validation-only mode).
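The fallback table translates directly into a function that emits one gate per configured legacy command. The default_gates name and the "order|name|command" output format are hypothetical. The gate names and order values come from the table.

```shell
#!/usr/bin/env sh
# Hypothetical builder for the fallback gates in the table above.
# default_gates <typecheck_cmd> <lint_cmd> <test_cmd> <build_cmd>
# Prints "order|name|command" for each non-empty command; prints nothing when
# no legacy key is configured (validation-only mode).
default_gates() {
  local order=0 name cmd
  for name in typecheck lint test build; do
    order=$((order + 1))
    cmd="${1:-}"
    if [ $# -gt 0 ]; then shift; fi
    if [ -n "$cmd" ]; then
      printf '%s|%s|%s\n' "$order" "$name" "$cmd"
    fi
  done
}
```

Each argument could be read with something like `jq -r '.catalyst.pr.typecheckCommand // ""' .catalyst/config.json`, assuming the dotted keys in the table map to nested JSON objects in the config file.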
Step 1: Smart PR Creation/Update
Check if a PR already exists for the current branch:
EXISTING_PR=$(gh pr list --head "$(git branch --show-current)" --json number --jq '.[0].number' 2>/dev/null)
If PR exists ($EXISTING_PR is not empty):
- Run /describe-pr

If no PR exists:
- Run /create-pr (handles commit, push, PR creation, description, Linear linking)

Step 2: Active PR Listen Loop, Wait for CLEAN then Merge (replaces auto-merge)
After the PR is created, enter an event-driven listen loop. The preferred wake mechanism (CTL-269)
is a single filter.register covering CI, comms inbound, reviews, BEHIND, and Linear ticket changes
— the worker then waits on filter.wake.${CATALYST_SESSION_ID} and the Groq-backed filter daemon
decides which raw events match. When the daemon is not running, the loop falls back to the
[[wait-for-github]] two-phase pattern with per-concern jq filters. See [[catalyst-filter]] for
registration recipes. The worker actively resolves blockers (CI failures, bot review threads,
BEHIND) inline and proceeds to Step 3 only when the PR is CLEAN (CI green + reviews satisfied). On
unrecoverable blockers (human changes-requested, persistent DIRTY) the worker writes
status: "stalled" and exits; the orchestrator's Phase 4 is a safety-net fallback.
Wake narration (MANDATORY, CTL-369). Every iteration of the listen loop — both on a wait-for
return AND on each mergeable_state re-check — must produce a single short line of assistant text
before re-entering the wait. This defeats the Human:\n<task-id> rendering bleed that occurs when
the agent returns an end_turn containing only thinking blocks after a Monitor-wrapped wake. The
line shape is:
wake: <event.name> #<PR_NUMBER> [interest=<type>] — <action being taken>
wake: <event.name> — routine, staying in event loop
wake: <event.name> — already addressed, no-op
wake: rest-poll — broker down, polling gh api
Surface .body.payload.interest_id and .body.payload.reason from the wake envelope when present
(broker wakes carry both); for hand-rolled two-phase filter wakes, surface the matched event.name
and #${PR_NUMBER} instead. See plugins/dev/skills/monitor-events/SKILL.md § Narration for the
full rule and the good-vs-bad transcript fixture.
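A tiny formatter keeps the narration line consistent across every wake path. The narrate_wake name and its argument order are hypothetical; the output follows the line shapes listed above, including their dash separator.

```shell
#!/usr/bin/env sh
# Hypothetical formatter for the wake narration line shapes shown above.
# narrate_wake <event_name> <action> [pr_number] [interest_type]
narrate_wake() {
  local event="$1" action="$2" pr="${3:-}" interest="${4:-}"
  local line="wake: ${event}"
  if [ -n "$pr" ]; then line="${line} #${pr}"; fi
  if [ -n "$interest" ]; then line="${line} [interest=${interest}]"; fi
  printf '%s — %s\n' "$line" "$action"
}

narrate_wake github.check_suite.completed "re-checking mergeable_state" 42 pr_lifecycle
narrate_wake rest-poll "broker down, polling gh api"
```

The PR number and interest type are optional, so the same helper covers both broker wakes, which carry an interest, and the shorter routine or REST-poll lines.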
REPO=$(gh repo view --json nameWithOwner --jq '.nameWithOwner')
PR_OPENED_AT=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# Record PR opening immediately
jq --arg ts "$PR_OPENED_AT" '.pr.prOpenedAt = $ts | .status = "pr-created"' \
"$SIGNAL_FILE" > "$SIGNAL_FILE.tmp" && mv "$SIGNAL_FILE.tmp" "$SIGNAL_FILE"
# Pre-flight: verify event infrastructure (from [[wait-for-github]])
# CTL-572: probe .webhookTunnel.connected — the field that exists. The old
# probe read a field catalyst-monitor never emits, so it always resolved to
# "unknown" and forced REST polling on every run. .connected is optimistic
# (EventSource.isStarted(), not live delivery), so also fall back when the
# tunnel has gone quiet: no webhook event across the shared smee channel for
# TUNNEL_STALE_AGE_SEC.
INFRA_STATUS=$(catalyst-monitor status --json 2>/dev/null)
TUNNEL_STALE_AGE_SEC="${CATALYST_TUNNEL_STALE_AGE_SEC:-21600}" # 6h default
USE_REST=false
TUNNEL_CONNECTED=$(echo "$INFRA_STATUS" | jq -r '.webhookTunnel.connected // false' 2>/dev/null)
if [ "$TUNNEL_CONNECTED" != "true" ]; then
echo "WARN: webhook tunnel not connected — using REST fallback"
USE_REST=true
else
# Staleness guard. jq handles the ISO-8601 math (portable across GNU/BSD date).
# lastEventAt absent (null) => fresh monitor, NOT treated as stale.
TUNNEL_EVENT_AGE=$(echo "$INFRA_STATUS" | jq -r '
(.webhookTunnel.lastEventAt // "") as $t
| if $t == "" then ""
else (now - (($t | sub("\\.[0-9]+Z$"; "Z")) | fromdateiso8601)) | floor
end' 2>/dev/null)
if [ -n "$TUNNEL_EVENT_AGE" ] && [ "$TUNNEL_EVENT_AGE" -gt "$TUNNEL_STALE_AGE_SEC" ]; then
echo "WARN: webhook tunnel last event ${TUNNEL_EVENT_AGE}s ago (> ${TUNNEL_STALE_AGE_SEC}s) — stale, using REST fallback"
USE_REST=true
fi
fi
# CTL-303: use broker auto-correlation. Emit a second agent.checkin with claimed_pr
# so the broker auto-derives a pr_lifecycle interest — no explicit filter.register needed.
# When the broker is not running, the loop falls back to the two-phase pattern below.
USE_FILTER_DAEMON=false
PR_BASE_BRANCH=$(gh pr view "$PR_NUMBER" --json baseRefName --jq '.baseRefName' 2>/dev/null || echo "main")
if broker_claim_pr "$PR_NUMBER" "$TICKET_ID" "$(git branch --show-current)" "$REPO" "$PR_BASE_BRANCH"; then
USE_FILTER_DAEMON=true
echo "[Phase 5] Broker registered pr_lifecycle for session ${CATALYST_SESSION_ID} on PR #${PR_NUMBER}"
else
# Broker absent or unavailable — emit debug telemetry and proceed with two-phase fallback
"$STATE_SCRIPT" event "$(jq -nc \
--arg sid "${CATALYST_SESSION_ID:-}" \
--arg pr "${PR_NUMBER:-}" \
'{ts: (now | todate), event: "broker.fallback.taken",
detail: {reason: "daemon absent", session_id: $sid, pr: ($pr | tonumber? // $pr)}}')" \
2>/dev/null || true
echo "[Phase 5] Broker unavailable — using two-phase wait-for fallback for PR #${PR_NUMBER}"
fi
CI_FIX_ATTEMPTS=0
MAX_CI_FIX_ATTEMPTS=3
PR_DONE=false
while [ "$PR_DONE" = "false" ]; do
# CTL-269 preferred path: single semantic wake covers all concerns.
if [ "$USE_FILTER_DAEMON" = "true" ] && [ "$USE_REST" != "true" ]; then
EVENT=$(catalyst-events wait-for \
--filter ".attributes.\"event.name\" == \"filter.wake.${CATALYST_SESSION_ID}\"" \
--timeout 600 2>/dev/null || true)
if [ -n "$EVENT" ]; then
WAKE_REASON=$(echo "$EVENT" | jq -r '.body.payload.reason // "unknown"' 2>/dev/null || echo "unknown")
echo "[Phase 5] Filter wake: ${WAKE_REASON}"
fi
# Drain inbound comms inside the loop now that filter.wake fires on
# comms.message.posted events too — comms_check is idempotent (advances
# COMMS_LAST_READ atomically) and a no-op when nothing arrived.
comms_check
elif [ "$USE_REST" != "true" ]; then
# Fallback: two-phase event wait (see [[wait-for-github]]).
# Filter field reference: [[event-schema]] — note check_suite/workflow_run use
# detail.prNumbers, not scope.pr. PR/review events DO populate scope.pr.
EVENT=$(catalyst-events wait-for \
--filter "(.attributes.\"vcs.pr.number\" == ${PR_NUMBER} or (.body.payload.prNumbers // [] | contains([${PR_NUMBER}]))) and (
.attributes.\"event.name\" == \"github.pr.merged\" or
.attributes.\"event.name\" == \"github.check_suite.completed\" or
(.attributes.\"event.name\" | startswith(\"github.pr_review\")) or
.attributes.\"event.name\" == \"github.push\"
)" \
--timeout 180 2>/dev/null || true)
if [ -z "$EVENT" ]; then
# Phase 1 timed out — run diagnostics (see [[wait-for-github]] diagnostic block)
_LOG_FILE=~/catalyst/events/$(date -u +%Y-%m).jsonl
_LOG_LINES=$(wc -l < "$_LOG_FILE" 2>/dev/null | tr -d ' ')
_SINCE_LINE=$(( ${_LOG_LINES:-0} > 500 ? ${_LOG_LINES:-0} - 500 : 0 ))
HEARTBEATS=$(catalyst-events tail --since-line "$_SINCE_LINE" 2>/dev/null \
| jq -c 'select(.attributes."event.name" == "heartbeat")' | wc -l | tr -d ' ')
# CTL-572: probe .webhookTunnel.connected (the field that exists). The
# HEARTBEATS == 0 term remains the stronger liveness signal here.
TUNNEL_NOW=$(catalyst-monitor status --json 2>/dev/null \
| jq -r '.webhookTunnel.connected // false')
if [ "${HEARTBEATS:-0}" -eq 0 ] || [ "$TUNNEL_NOW" != "true" ]; then
echo "Infrastructure issue detected — switching to REST polling"
USE_REST=true
else
# Infrastructure healthy — extend to Phase 2 (7200s)
EVENT=$(catalyst-events wait-for \
--filter "(.attributes.\"vcs.pr.number\" == ${PR_NUMBER} or (.body.payload.prNumbers // [] | contains([${PR_NUMBER}]))) and (
.attributes.\"event.name\" == \"github.pr.merged\" or
.attributes.\"event.name\" == \"github.check_suite.completed\" or
(.attributes.\"event.name\" | startswith(\"github.pr_review\")) or
.attributes.\"event.name\" == \"github.push\"
)" \
--timeout 7200 2>/dev/null || true)
fi
fi
# Drain inbound comms after each wake so messages don't sit until phase boundary.
comms_check
fi
# Authoritative REST check — never gh pr view --json (GraphQL); REST only
PR_JSON=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" 2>/dev/null || echo '{}')
PR_MERGED=$(echo "$PR_JSON" | jq -r '.merged // false')
MERGE_STATE=$(echo "$PR_JSON" | jq -r '.mergeable_state // "unknown"')
# REST .mergeable_state values: "clean", "blocked", "behind", "dirty", "unknown", "unstable"
if [ "$PR_MERGED" = "true" ]; then
PR_DONE=true; break
fi
# Check for human reviewer changes-requested (escalates to stalled)
LAST_CR=$(gh pr view "$PR_NUMBER" --json reviews \
--jq '[.reviews[] | select(.state == "CHANGES_REQUESTED")] | last | .author.login // ""' \
2>/dev/null || echo "")
if [ -n "$LAST_CR" ]; then
ERROR_MSG="Changes requested by human reviewer ${LAST_CR} — operator action required"
NEW_STATUS="stalled"; PHASE_NUM=5
jq --arg status "stalled" --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
'.status = $status | .updatedAt = $ts' \
"$SIGNAL_FILE" > "$SIGNAL_FILE.tmp" && mv "$SIGNAL_FILE.tmp" "$SIGNAL_FILE"
comms_post attention "stalled: ${ERROR_MSG}"
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status failed --reason "changes requested by ${LAST_CR}"
fi
exit 1
fi
case "$MERGE_STATE" in
clean)
# CI passed and reviews satisfied — proceed to Step 3
PR_DONE=true
;;
blocked)
# Unresolved bot review threads or CI failure
UNRESOLVED=$(gh pr view "$PR_NUMBER" --json reviewThreads \
--jq '[.reviewThreads[] | select(.isResolved == false)] | length' 2>/dev/null || echo 0)
if [ "${UNRESOLVED:-0}" -gt 0 ]; then
# Bot review threads — resolve via /review-comments, then loop
/catalyst-dev:review-comments "$PR_NUMBER"
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" iteration "$CATALYST_SESSION_ID" --kind fix
fi
elif [ "$CI_FIX_ATTEMPTS" -lt "$MAX_CI_FIX_ATTEMPTS" ]; then
# CI failure — attempt automated fix, push commit, then loop
CI_FIX_ATTEMPTS=$((CI_FIX_ATTEMPTS + 1))
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" iteration "$CATALYST_SESSION_ID" --kind fix
fi
# Analyze CI failure from check run logs and push a targeted fix commit
# (per the Phase 4 quality gate retry pattern)
else
ERROR_MSG="CI blocked after ${MAX_CI_FIX_ATTEMPTS} fix attempts — escalating"
NEW_STATUS="stalled"; PHASE_NUM=5
jq --arg status "stalled" --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
'.status = $status | .updatedAt = $ts' \
"$SIGNAL_FILE" > "$SIGNAL_FILE.tmp" && mv "$SIGNAL_FILE.tmp" "$SIGNAL_FILE"
comms_post attention "stalled: ${ERROR_MSG}"
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status failed --reason "CI blocked after ${MAX_CI_FIX_ATTEMPTS} fix attempts"
fi
exit 1
fi
;;
behind)
# Branch is behind base — rebase and push
BASE_BRANCH_NAME=$(git remote show origin 2>/dev/null \
| grep "HEAD branch" | awk '{print $NF}')
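# Push only if the rebase succeeded; otherwise abort it and let the next loop pass re-evaluate
if git fetch origin && git rebase "origin/${BASE_BRANCH_NAME:-main}"; then
git push --force-with-lease
else
git rebase --abort 2>/dev/null || true
fi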
;;
dirty)
ERROR_MSG="Merge conflicts (DIRTY) — cannot auto-resolve"
NEW_STATUS="stalled"; PHASE_NUM=5
jq --arg status "stalled" --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
'.status = $status | .updatedAt = $ts' \
"$SIGNAL_FILE" > "$SIGNAL_FILE.tmp" && mv "$SIGNAL_FILE.tmp" "$SIGNAL_FILE"
comms_post attention "stalled: ${ERROR_MSG}"
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status failed --reason "merge conflicts (DIRTY)"
fi
exit 1
;;
unknown|unstable)
# Transient state — continue waiting for next event
;;
esac
# REST fallback sleep interval (no event tunnel)
[ "$USE_REST" = "true" ] && sleep 300
done
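The branch logic of the listen loop above can be read as a small decision table. The sketch below restates it as a pure function so the outcomes are easy to scan. `next_action` and its action labels are hypothetical names introduced for illustration; they are not part of the skill.

```shell
# Hypothetical helper: map the REST mergeable_state plus the loop counters to
# the action the listen loop takes. The labels are illustrative only.
next_action() {
  local state="$1" unresolved="${2:-0}" ci_attempts="${3:-0}" max_attempts="${4:-3}"
  case "$state" in
    clean)  echo "merge" ;;
    blocked)
      if [ "$unresolved" -gt 0 ]; then
        echo "resolve-review-threads"
      elif [ "$ci_attempts" -lt "$max_attempts" ]; then
        echo "attempt-ci-fix"
      else
        echo "stall"
      fi
      ;;
    behind) echo "rebase-and-push" ;;
    dirty)  echo "stall" ;;
    *)      echo "wait" ;;   # unknown, unstable, or anything else: wait for the next event
  esac
}

next_action blocked 0 3   # prints "stall"
```

Keeping the decision separate from the side effects (rebase, comms, session end) makes the escalation rules testable without touching GitHub.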
Step 3: Merge + Record Success
PR is CLEAN (or already merged). Execute the merge directly (no --auto), optionally verify
deployment, write status: "done", and exit.
# Execute merge now that PR is ready (no --auto; worker owns the merge in CTL-252 contract)
if [ "$PR_MERGED" != "true" ]; then
gh pr merge "$PR_NUMBER" --squash --delete-branch
fi
# Confirm via REST (authoritative — never gh pr view --json)
MERGED_OK=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" --jq '.merged' 2>/dev/null || echo "false")
if [ "$MERGED_OK" != "true" ]; then
ERROR_MSG="REST does not confirm PR #${PR_NUMBER} as merged after gh pr merge; escalating"
comms_post attention "stalled: ${ERROR_MSG}"
jq --arg status "stalled" --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
'.status = $status | .updatedAt = $ts' \
"$SIGNAL_FILE" > "$SIGNAL_FILE.tmp" && mv "$SIGNAL_FILE.tmp" "$SIGNAL_FILE"
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status failed --reason "merge not confirmed via REST"
fi
exit 1
fi
MERGE_COMMIT_SHA=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" \
--jq '.merge_commit_sha // empty' 2>/dev/null || echo "")
MERGED_AT=$(date -u +%Y-%m-%dT%H:%M:%SZ)
# Optional deployment verification (CTL-211)
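# jq's // operator treats false like null, so an explicit false would be lost; test for null instead
SKIP_DEPLOY=$(jq -r --arg repo "${REPO}" \
'.catalyst.deploy[$repo].skipDeployVerification | if . == null then true else . end' .catalyst/config.json 2>/dev/null \
|| echo "true")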
PROD_ENV=$(jq -r --arg repo "${REPO}" \
'.catalyst.deploy[$repo].productionEnvironment // "production"' .catalyst/config.json 2>/dev/null)
DEPLOYMENT_URL=""
if [ "$SKIP_DEPLOY" != "true" ] && [ -n "$MERGE_COMMIT_SHA" ]; then
# Two-phase wait for deployment_status (see [[wait-for-github]])
DEPLOY_TIMEOUT=$(jq -r --arg repo "${REPO}" \
'.catalyst.deploy[$repo].timeoutSec // 1800' .catalyst/config.json 2>/dev/null || echo 1800)
DEPLOY_EVENT=$(catalyst-events wait-for \
--filter "(.attributes.\"event.name\" | startswith(\"github.deployment_status\")) and
.attributes.\"deployment.environment\" == \"${PROD_ENV}\" and
.attributes.\"vcs.revision\" == \"${MERGE_COMMIT_SHA}\"" \
--timeout 180 2>/dev/null || true)
# Authoritative deploy lookup (REST — see [[wait-for-github]] REST fallback pattern)
DEPLOY_JSON=$(gh api -X GET "/repos/${REPO}/deployments" \
-f sha="$MERGE_COMMIT_SHA" -f environment="$PROD_ENV" --jq '.[0] // empty' 2>/dev/null || echo "")
if [ -n "$DEPLOY_JSON" ]; then
DEPLOY_ID=$(echo "$DEPLOY_JSON" | jq -r '.id // empty')
STATUS_JSON=$(gh api "/repos/${REPO}/deployments/${DEPLOY_ID}/statuses" \
--jq '.[0] // empty' 2>/dev/null || echo "")
DEPLOY_STATE=$(echo "$STATUS_JSON" | jq -r '.state // "pending"')
DEPLOYMENT_URL=$(echo "$STATUS_JSON" | jq -r '.environment_url // empty')
if [ "$DEPLOY_STATE" = "failure" ] || [ "$DEPLOY_STATE" = "error" ]; then
jq --arg ts "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
'.status = "deploy-failed" | .updatedAt = $ts' \
"$SIGNAL_FILE" > "$SIGNAL_FILE.tmp" && mv "$SIGNAL_FILE.tmp" "$SIGNAL_FILE"
comms_post attention "deploy-failed: ${PROD_ENV} deploy failed for PR #${PR_NUMBER}"
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status failed --reason "deployment failed in ${PROD_ENV}"
fi
exit 1
fi
fi
fi
# Record merge in signal file — worker writes status=done (CTL-252 contract)
TS=$(date -u +%Y-%m-%dT%H:%M:%SZ)
jq --arg ts "$MERGED_AT" --arg now "$TS" --arg sha "${MERGE_COMMIT_SHA:-}" \
--arg deploy_url "${DEPLOYMENT_URL:-}" \
'.pr.mergedAt = $ts | .pr.ciStatus = "merged"
| (if $sha != "" then .pr.mergeCommitSha = $sha else . end)
| .status = "done" | .phase = 5 | .updatedAt = $now
| .completedAt = $now | .phaseTimestamps.done = $now
| (if $deploy_url != "" then .deployment = {url: $deploy_url} else . end)' \
"$SIGNAL_FILE" > "${SIGNAL_FILE}.tmp" && mv "${SIGNAL_FILE}.tmp" "$SIGNAL_FILE"
# Update global state
if [ -n "$ORCH_ID" ] && [ -f "$STATE_SCRIPT" ]; then
"$STATE_SCRIPT" worker "$ORCH_ID" "$TICKET_ID" \
".status = \"done\" | .phase = 5 | .pr.mergedAt = \"${MERGED_AT}\" | .pr.ciStatus = \"merged\""
"$STATE_SCRIPT" event "$(jq -nc \
--arg ts "$TS" --arg orch "$ORCH_ID" --arg w "$TICKET_ID" \
--argjson pr "$PR_NUMBER" --arg mt "$MERGED_AT" \
'{ts:$ts, orchestrator:$orch, worker:$w, event:"worker-pr-merged", detail:{pr:$pr, mergedAt:$mt}}')"
"$STATE_SCRIPT" event "$(jq -nc --arg ts "$TS" --arg orch "$ORCH_ID" --arg w "$TICKET_ID" \
'{ts:$ts, orchestrator:$orch, worker:$w, event:"worker-done", detail:null}')"
fi
# Transition Linear ticket to done (worker owns this in CTL-252 contract)
"${CATALYST_DEV_SCRIPTS}/linear-transition.sh" \
--ticket "$TICKET_ID" --transition done --config .catalyst/config.json 2>/dev/null || true
# End session
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" pr "$CATALYST_SESSION_ID" --number "$PR_NUMBER" --url "$PR_URL"
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status done --reason "PR merged"
fi
# CTL-111: post done to shared comms channel
if [ -n "${CATALYST_COMMS_CHANNEL:-}" ] && [ -n "$COMMS_BIN" ]; then
"$COMMS_BIN" done "$CATALYST_COMMS_CHANNEL" --as "$TICKET_ID" >/dev/null 2>&1 || true
fi
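Every signal-file write above goes through a temp file followed by `mv`, so a reader never observes a half-written file. A minimal sketch of that pattern as a reusable function follows. The name `atomic_update` is hypothetical, and `sed` stands in for the jq filter the skill actually uses.

```shell
# Hypothetical helper: run a filter command over FILE and replace FILE only if
# the filter succeeds. The temp file is created next to FILE so the final mv
# is a rename within the same directory.
atomic_update() {
  local file="$1"; shift
  local tmp
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  if "$@" < "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"      # leave the original untouched on failure
    return 1
  fi
}

# Usage with a stand-in filter.
demo=$(mktemp)
printf 'status=running\n' > "$demo"
atomic_update "$demo" sed 's/running/done/'
cat "$demo"   # prints "status=done"
```

One difference from the inline version: `mktemp` creates the temp file with restrictive permissions, so the replaced file inherits those rather than the original mode.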
Step 4: Optional Rollup Fragment Contribution (CTL-108)
Before exiting, under orchestrator mode only (ORCH_DIR set), the worker MAY write a short markdown
fragment describing anything surprising, risky, or worth flagging to human reviewers of the whole
orchestrator's output:
if [ -n "$ORCH_DIR" ] && [ -d "$ORCH_DIR/workers" ]; then
FRAGMENT_PATH="${ORCH_DIR}/workers/${TICKET_ID}-rollup.md"
# Write a short note — first line is the one-liner used in the rollup "what shipped" list.
cat > "$FRAGMENT_PATH" <<EOF
One-sentence summary of what shipped and any reviewer heads-up.
Additional context, migration notes, follow-up tickets, etc.
EOF
fi
- The file must be named ${TICKET_ID}-rollup.md exactly; the orch-monitor scans for this
  pattern to assemble the orchestrator-level rollup briefing.
- `### ${TICKET_ID}` heading in the "Gotchas" section.

The orchestrator's Phase 4 poll loop transitions the Linear ticket to stateMap.done when it
confirms state=MERGED via the shared helper (CTL-69):
"${CATALYST_DEV_SCRIPTS}/linear-transition.sh" \
--ticket "$TICKET_ID" --transition done --config .catalyst/config.json
In standalone mode (no orchestrator), the user runs /catalyst-dev:merge-pr which handles this
transition.
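The exact `${TICKET_ID}-rollup.md` filename in Step 4 matters because the rollup is assembled by scanning for that pattern, with the first line of each fragment used as the one-liner. A sketch of what such a scan could look like, assuming a hypothetical function `collect_rollup_lines` (the real orch-monitor implementation is not shown in this document):

```shell
# Hypothetical scanner: print "<ticket>: <first line>" for every rollup
# fragment in a workers directory. Only files named <TICKET>-rollup.md match.
collect_rollup_lines() {
  local workers_dir="$1" f ticket
  for f in "$workers_dir"/*-rollup.md; do
    [ -f "$f" ] || continue        # the glob matched nothing
    ticket=${f##*/}                # strip the directory
    ticket=${ticket%-rollup.md}    # strip the suffix, leaving the ticket ID
    printf '%s: %s\n' "$ticket" "$(head -n 1 "$f")"
  done
}

dir=$(mktemp -d)
printf 'Added retry to uploader.\nMore detail.\n' > "$dir/PROJ-123-rollup.md"
collect_rollup_lines "$dir"   # prints "PROJ-123: Added retry to uploader."
```

A fragment saved under any other name would simply be skipped by a scan like this, which is why the naming rule is strict.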
Step 5: File improvement findings (CTL-176 / CTL-183 routing)
Drain the findings queue and file one ticket per entry. Orchestrator-dispatched oneshot runs share
the orchestrator's queue ($CATALYST_FINDINGS_FILE=$ORCH_DIR/findings.jsonl) and the orchestrator's
Phase 7 files everything — this step is still safe to run and will find an empty queue in that case.
Standalone oneshot runs (no orchestrator) use a session-scoped queue path derived from
$CATALYST_SESSION_ID, falling back to .catalyst/findings/current.jsonl.
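The queue-path fallback described above is a nested parameter expansion. The sketch below wraps it in a function (the name `resolve_findings_file` is hypothetical) to show the three cases side by side.

```shell
# An explicit CATALYST_FINDINGS_FILE wins, then a session-scoped file, then
# .catalyst/findings/current.jsonl.
resolve_findings_file() {
  echo "${CATALYST_FINDINGS_FILE:-.catalyst/findings/${CATALYST_SESSION_ID:-current}.jsonl}"
}

( unset CATALYST_FINDINGS_FILE CATALYST_SESSION_ID; resolve_findings_file )
# prints ".catalyst/findings/current.jsonl"
( unset CATALYST_FINDINGS_FILE; CATALYST_SESSION_ID=sess-42; resolve_findings_file )
# prints ".catalyst/findings/sess-42.jsonl"
```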
Recording findings during the run. The moment you notice friction worth fixing (workflow gaps, bugs spotted in adjacent code, recurring manual steps), record it on the queue:
"${CATALYST_DEV_SCRIPTS}/add-finding.sh" \
--title "Short imperative title" \
--body "Reproduction + expected + observed + any links" \
--skill oneshot --severity low
Record inline, not as a post-run retrospective — context compaction loses observations that wait. Don't prompt the user; don't batch. Step 5 below files the whole queue in one pass.
What counts: friction the maintainer would want fixed, bugs in adjacent catalyst code spotted
incidentally, gaps in tooling, manual steps that should be automated. What doesn't: this
ticket's own follow-up TODOs (PR body), user preferences that should be durable memory, routine
debugging. In orchestrator-dispatched workers, stdin is not a TTY and CATALYST_AUTONOMOUS=1 is
expected to be set — the helper silently skips filing when consent is not already granted, never
prompts. Standalone oneshot runs prompt interactively once and persist "yes":
FEEDBACK="${CATALYST_DEV_SCRIPTS}/file-feedback.sh"
CONSENT="${CATALYST_DEV_SCRIPTS}/feedback-consent.sh"
FINDINGS_FILE="${CATALYST_FINDINGS_FILE:-.catalyst/findings/${CATALYST_SESSION_ID:-current}.jsonl}"
if [ -x "$FEEDBACK" ] && [ -f "$FINDINGS_FILE" ] && [ -s "$FINDINGS_FILE" ]; then
COUNT=$(wc -l < "$FINDINGS_FILE" | tr -d ' ')
if [ "$("$CONSENT" check)" != "granted" ] && [ -z "${CATALYST_AUTONOMOUS:-}" ] && [ -t 0 ]; then
read -r -p "File $COUNT improvement tickets at end of run? [Y/n] " yn
case "$yn" in [Nn]*) : ;; *) "$CONSENT" grant >/dev/null ;; esac
fi
if [ "$("$CONSENT" check)" = "granted" ]; then
FILED=0
while IFS= read -r line; do
TITLE=$(jq -r '.title' <<<"$line")
BODY=$(jq -r '.body' <<<"$line")
SKILL=$(jq -r '.skill // "oneshot"' <<<"$line")
RESULT=$("$FEEDBACK" --title "$TITLE" --body "$BODY" --skill "$SKILL" --json 2>/dev/null || true)
STATUS=$(jq -r '.status // "failed"' <<<"$RESULT")
if [ "$STATUS" = "filed" ]; then
ID=$(jq -r '.identifier // .url // ""' <<<"$RESULT")
echo " filed: $ID ($TITLE)"
FILED=$((FILED + 1))
fi
done < "$FINDINGS_FILE"
[ "$FILED" -eq "$COUNT" ] && rm -f "$FINDINGS_FILE"
fi
fi
If --no-merge was set, skip Steps 2–3 (listen loop and merge) entirely and report PR status
instead:
PR ready: https://github.com/org/repo/pull/{number}
Merge state: $mergeStateStatus
✅ CI passed
✅ Threads resolved ({N} addressed)
✅ Reviews addressed
❌ Review required — 1 approval needed (if applicable)
Merge later with: /catalyst-dev:merge-pr
Linear: /create-pr moves ticket to stateMap.inReview (default: "In Review"). The worker
transitions it to stateMap.done after merge in Step 3; the orchestrator's Phase 4 handles this
only as a fallback for stalled workers.
Phase 6 used to run /merge-pr separately. Workers now exit at status: "done" after actively
merging their own PR and verifying deployment (CTL-252). /merge-pr is still useful as a standalone
tool for merging a PR opened outside the oneshot flow.
For complex implementations spanning multiple files/layers:
/catalyst-dev:oneshot --team PROJ-123
In team mode, Phase 3 uses agent teams for parallel implementation:
When to use --team:
When NOT to use --team:
All phases run in a single session. Context is managed through:
thoughts/shared/ (research documents, plans). Subsequent phases read these files, so the
essential information is always available even after compaction.

Phase 1: Research - spawns parallel sub-agents, writes to thoughts/shared/research/
Phase 2: Plan - reads research doc, runs /create-plan, writes to thoughts/shared/plans/
Phase 3: Implement - reads plan doc, runs /implement-plan (can use --team for agent teams)
Phase 4: Validate - reads plan doc, runs /validate-plan + quality gates
Phase 5: Ship - runs /create-pr, enters active listen loop (event-driven, resolves CI/review
blockers inline), merges when CLEAN, verifies deployment, writes status=done, exits.
Orchestrator Phase 4 is a safety-net fallback for stalled/crashed workers only.
Configure quality gates in the consuming project's .catalyst/config.json:
{
"catalyst": {
"qualityGates": {
"enabled": true,
"maxRetries": 3,
"gates": [
{
"name": "typecheck",
"command": "npm run type-check",
"required": true,
"autofix": true,
"order": 1
},
{
"name": "lint",
"command": "npm run lint:fix",
"required": true,
"autofix": true,
"order": 2
},
{
"name": "test",
"command": "npm run test",
"required": true,
"autofix": false,
"order": 3
},
{
"name": "build",
"command": "npm run build",
"required": true,
"autofix": false,
"order": 4
}
]
}
}
}
Schema:
| Field | Type | Description |
| ------------------ | ------- | -------------------------------------------------------------------- |
| enabled | boolean | Master toggle for quality gates (default: true) |
| maxRetries | number | Max retry cycles across all gates (default: 3) |
| gates[].name | string | Display name for the gate |
| gates[].command | string | Shell command to run |
| gates[].required | boolean | If true, failure blocks shipping. If false, failure is a warning |
| gates[].autofix | boolean | If true, attempt automated fixes on failure before retrying |
| gates[].order | number | Execution order (lowest first) |
Backward compatibility: If qualityGates is absent, the command falls back to constructing
gates from catalyst.pr.typecheckCommand, catalyst.pr.lintCommand, catalyst.pr.testCommand, and
catalyst.pr.buildCommand. If none of those exist, quality gates are skipped entirely.
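The `order` and `required` semantics in the schema can be sketched with a small runner. This is an illustration under stated assumptions, not the skill's implementation: `run_gates` is a hypothetical name, gates are passed as `order|name|required|command` strings instead of being read from config, the runner continues after a failure, and retries and autofix are left out.

```shell
# Hypothetical runner. Each gate is passed as "order|name|required|command".
run_gates() {
  local sorted blocked=0 order name required command
  sorted=$(printf '%s\n' "$@" | sort -t'|' -k1,1n)   # lowest order first
  while IFS='|' read -r order name required command; do
    [ -n "$name" ] || continue
    if sh -c "$command" </dev/null >/dev/null 2>&1; then
      echo "PASS $name"
    elif [ "$required" = "true" ]; then
      echo "FAIL $name"      # required gate: blocks shipping
      blocked=1
    else
      echo "WARN $name"      # optional gate: reported, does not block
    fi
  done <<EOF
$sorted
EOF
  return "$blocked"
}

run_gates "2|lint|false|false" "1|typecheck|true|true"
# prints "PASS typecheck" then "WARN lint"
```

The function returns non-zero only when a required gate fails, which matches the "failure blocks shipping" wording in the table.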
All phases run in the current session using whatever model the session was started with. When
running as an orchestrator worker, the model is set by the orchestrator's workerModel config
(default: Opus).
State transitions throughout the lifecycle:
| Phase | Transition | Config Key | Default |
| ---------------------------------- | ------------ | --------------------- | ------------- |
| 1 start | → research | stateMap.research | "In Progress" |
| 1 end (ticket created in freeform) | → backlog | stateMap.backlog | "Backlog" |
| 2 start | → planning | stateMap.planning | "In Progress" |
| 3 start | → inProgress | stateMap.inProgress | "In Progress" |
| 5 (PR created) | → inReview | stateMap.inReview | "In Review" |
| 5 (PR merged by worker) | → done | stateMap.done | "Done" |
The worker transitions the ticket to stateMap.done after actively merging the PR in Step 3. The
orchestrator's Phase 4 handles this transition only as a fallback for workers that stalled before
completing their own merge (CTL-252).
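The defaults in the table can be expressed as a lookup with an optional override. In this sketch the function name `linear_state_for` is hypothetical, and the second argument stands in for whatever the project's stateMap configuration supplies; how that value is read from config is not shown here.

```shell
# Defaults copied from the state-transition table; an explicit override wins.
linear_state_for() {
  local transition="$1" override="${2:-}"
  if [ -n "$override" ]; then
    echo "$override"
    return 0
  fi
  case "$transition" in
    "research"|"planning"|"inProgress") echo "In Progress" ;;
    "backlog")  echo "Backlog" ;;
    "inReview") echo "In Review" ;;
    "done")     echo "Done" ;;
    *) echo "unknown transition: $transition" >&2; return 1 ;;
  esac
}

linear_state_for inReview            # prints "In Review"
linear_state_for "done" "Shipped"    # prints "Shipped"
```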
All error paths must end the session. Before presenting errors or creating handoffs, always run:
if [[ -n "${CATALYST_SESSION_ID:-}" && -x "$SESSION_SCRIPT" ]]; then
"$SESSION_SCRIPT" end "$CATALYST_SESSION_ID" --status failed --reason "<why it failed>"
fi
If research phase fails:
- --status failed
- /catalyst-dev:research-codebase manually

If implementation fails:
- --status failed
- /catalyst-dev:resume-handoff

If quality gates fail after max retries:
- --status failed, remaining phases are documented for next session

If CI checks fail in Phase 5:
- mergeable_state
- status: "stalled" and posts attention to comms
- orchestrate-auto-fixup (CTL-64)

Automatic handoff on stop: When the workflow stops at any phase (user choice, unrecoverable
error, context exhaustion):
- --status failed
- /create-handoff with: phases completed, current phase status, unresolved issues,
  CI/review status, and remaining phases
- thoughts/shared/handoffs/
- /catalyst-dev:resume-handoff

- [[filename]]), not full paths
- status: "done" (CTL-252): the worker opens the PR, enters an event-driven listen loop using
  catalyst-events wait-for, resolves CI/review blockers inline, merges when CLEAN with
  gh pr merge --squash --delete-branch (no --auto), and writes status: "done" with pr.mergedAt
  and deployment.url (if applicable). Workers do NOT use ScheduleWakeup (unreliable in -p mode);
  they use catalyst-events wait-for, which is a blocking subprocess call that works reliably in
  non-interactive sessions
- status: "stalled"
- pr.mergedAt + status: "done": after actively merging the PR in Phase 5 Step 3. The
  orchestrator's Phase 4 handles this only for workers that stalled before completing
- status: "done": this is the expected success path. The orchestrator distinguishes this from
  stalls (no PR, no progress for 15+ minutes)
- info is the default heartbeat (phase transitions only, ~5–7 per session), attention is reserved
  for orchestrator action (0–2 per session, MANDATORY on the escalation triggers listed there:
  scope conflict, missing access, ambiguous spec, 3+ repeated CI failures, status="stalled"),
  done fires once at terminal success via the done subcommand. The existing comms_post helper in
  this skill already routes correctly; these rules govern when you call it.
- comms_check is called after each phase transition via the signal-file update block. It polls
  for messages directed to $TICKET_ID (skipping pre-worker history via $COMMS_LAST_READ), logs
  all inbound messages, and exits on abort. catalyst-comms send already emits
  comms.message.posted events to the global event log (CTL-210), so Option B event emission is
  complete; extending catalyst-events wait-for to include comms.message filters is tracked in
  CTL-247 (wait-for-github skill).

IMPORTANT: Document Storage Rules
- thoughts/shared/ (research, plans, prs subdirectories)
- thoughts/searchable/: this is a read-only search index