monitor-events — Event-driven waits in skill prose

What this is for

CTL-210 unified the Catalyst event log: every GitHub webhook, Linear webhook, comms post, and orchestrator/worker lifecycle event flows through ~/catalyst/events/YYYY-MM.jsonl. Consumers no longer poll gh pr view, linearis read, or signal files; they subscribe to the event stream with a jq filter.

This skill documents the canonical patterns. Use it as a reference when writing or migrating skill prose; do not invoke it as a slash command.

Prerequisite — orch-monitor daemon must be running

The two primitives below read from ~/catalyst/events/YYYY-MM.jsonl, which is populated by the orch-monitor daemon (plugins/dev/scripts/orch-monitor/server.ts). When the daemon is not running:

catalyst-events tail returns an empty stream
catalyst-events wait-for blocks until its --timeout expires (default 600s) and exits non-zero — callers fall back to gh pr view polling, which can't see deploys

Liveness check (the same call wired into check-project-setup.sh):

plugins/dev/scripts/catalyst-monitor.sh status        # human-readable
plugins/dev/scripts/catalyst-monitor.sh status --json # {"running":true,"pid":...}

Skills that invoke check-project-setup.sh (orchestrate, oneshot, merge-pr) handle the liveness check automatically — interactive runs prompt to start the daemon, autonomous runs warn-to-stderr and proceed. If you reuse the primitives outside those skills, run the status check yourself and either start the daemon (catalyst-monitor.sh start) or plan for the polling fallback.
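For code that runs outside those skills, the manual check could look like the sketch below. It assumes the status JSON has the shape shown above ({"running":true,...}). pick_wait_strategy is a hypothetical helper name, and the whitespace-stripping string match is a shortcut that avoids a jq dependency for a single field.

```shell
#!/usr/bin/env bash
# Hypothetical helper: choose between the event-driven path and the polling
# fallback from the JSON printed by `catalyst-monitor.sh status --json`.
# Assumes the {"running":true,...} shape shown above.
pick_wait_strategy() {
  local status_json=${1//[[:space:]]/}   # drop whitespace so the match ignores layout
  case "$status_json" in
    *'"running":true'*) echo "events" ;;
    *)                  echo "poll" ;;    # daemon down, unreadable, or empty output
  esac
}

# Usage (not executed here):
#   STATUS=$(plugins/dev/scripts/catalyst-monitor.sh status --json 2>/dev/null || true)
#   if [ "$(pick_wait_strategy "$STATUS")" != "events" ]; then
#     echo "WARN: orch-monitor not running, planning for the polling fallback" >&2
#   fi
```

Anything other than an explicit "running": true is treated as daemon-down, which keeps the failure mode on the safe side: the caller plans for polling rather than blocking on a stream that will stay empty.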

Pattern selection & cost tradeoffs

Three patterns are available; pick by cost shape, not just by mechanism. Listed cheapest first.

| Pattern | Mechanism | Cost shape | When to use |
|---|---|---|---|
| Broker interest (preferred) | The catalyst-broker daemon (see [[broker]]) classifies events between Claude turns and emits filter.wake.{id} only on semantic match. Deterministic interest types (pr_lifecycle, ticket_lifecycle, comms_lifecycle) match by typed-field comparison; prose interests go through Groq Llama 3.1 8B. | Lowest. Zero turns while blocked; 1 turn per matched wake. Deterministic routes cost 0 LLM tokens; prose routes ~$0.05–0.10/M tokens at small batch sizes. Pre-filtering happens out of Claude's context entirely. | Whenever the broker daemon is running. Worker-scope (single PR) and orchestrator-scope (many PRs) both supported via the same registration mechanism. |
| catalyst-events wait-for with jq | Blocking Bash CLI; one jq predicate; exits on first match or --timeout. Works without the broker. | Zero turns while blocked, 1 turn per match. Same per-match cost as Monitor but workers usually wait for ONE specific event, so total turn count stays low. Filter expansion (CI + reviews + push + merge) widens the per-match probability. | Short-lived claude -p workers when the broker is not running; standalone one-shot waits; CI scripts. |
| Monitor over catalyst-events tail | Claude Code's Monitor tool wraps tail --filter; every matching line surfaces as a turn-resuming notification. | Highest. 1 wake per matching line means broad filters can dominate context. | Long-lived orchestrator only. NOT for short-lived claude -p workers, which have no long-lived turn loop to consume notifications. |

Worker contract (matches oneshot Phase 5, CTL-371): dispatched workers prefer the broker (first row of the table above), fall back to wait-for (second row) when the broker daemon is down, and never use Monitor over tail (third row). See plugins/dev/skills/oneshot/SKILL.md Phase 5 and the orchestrate dispatch prompt for the canonical invocation.

Note on numbering. The "Pattern N" labels in the recipe sections below (Pattern 1 — worker waits for PR merge, Pattern 2 — long-lived orchestrator wakes, Pattern 3 — reactive PR lifecycle, Pattern 4 — tail by ticket) are recipe IDs, not the cost-tier rank in the table above. The recipes pre-date the broker integration; both the broker-preferred path and the wait-for fallback inside each recipe map to the table rows above.

Both wait-for and Monitor use catalyst-events under the hood. tail is the streaming foundation; wait-for is tail | head -n 1 with a timeout. The broker reads the same event log as tail does but classifies before waking, which is why its per-wake cost is so much lower.
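The "first match or timeout" contract can be sketched in plain bash. This only illustrates the semantics and is not how catalyst-events is implemented. first_line_within is a hypothetical name, and the producer on stdin stands in for an already-filtered tail.

```shell
#!/usr/bin/env bash
# Hypothetical illustration of the wait-for contract: print the first line
# that arrives on stdin and succeed, or fail if none arrives within N seconds.
first_line_within() {
  local timeout_s=$1 line
  if IFS= read -r -t "$timeout_s" line; then
    printf '%s\n' "$line"
    return 0
  fi
  return 1   # timed out, or the stream closed without producing a line
}

# Usage (not executed here):
#   some_filtered_stream | first_line_within 180
```

Because the function returns after one line, a caller pays for at most one wake per invocation, which is the property that keeps wait-for cheap for workers waiting on a single event.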

Pattern 1 — Worker waits for its PR to merge

A claude -p worker that just opened PR #342 needs to block until the PR merges, then do post-merge work.

Preferred (when catalyst-filter is running, CTL-269): register a single semantic interest covering every concern the worker cares about (CI, comms, reviews, BEHIND, Linear), then wait on filter.wake.${CATALYST_SESSION_ID}. The Groq-backed daemon classifies raw events against the natural-language prompt and emits one wake per match. See [[catalyst-filter]] for the full registration recipe and the daemon-restart contract. The two-phase pattern below is the fallback for environments where the daemon is not running.

Use the two-phase pattern from [[wait-for-github]]: a 3-minute Phase 1 with a diagnostic checkpoint before committing to the full 2-hour wait.

# Two-phase pattern — see [[wait-for-github]] for full reference.
REPO=$(gh repo view --json nameWithOwner --jq '.nameWithOwner')
EVENT=""
_WFG_MATCHED=false

# Phase 1: short wait with diagnostic checkpoint (3 minutes).
EVENT=$(catalyst-events wait-for \
  --filter ".attributes.\"event.name\" == \"github.pr.merged\" and .attributes.\"vcs.pr.number\" == ${PR_NUMBER}" \
  --timeout 180 2>/dev/null || true)

if [ -n "$EVENT" ]; then
  _WFG_MATCHED=true
else
  # Phase 1 timed out — run diagnostics before extending to Phase 2.
  echo "Phase 1 timed out after 3 min — running diagnostics..."
  STALLED=false
  FILTER_MISMATCH=false

  _LOG_FILE=~/catalyst/events/$(date -u +%Y-%m).jsonl
  _LOG_LINES=$(wc -l < "$_LOG_FILE" 2>/dev/null | tr -d ' ')
  _SINCE_LINE=$(( ${_LOG_LINES:-0} > 500 ? ${_LOG_LINES:-0} - 500 : 0 ))
  HEARTBEATS=$(catalyst-events tail --since-line "$_SINCE_LINE" 2>/dev/null \
    | jq -c 'select(.attributes."event.name" == "session.heartbeat")' | wc -l | tr -d ' ')
  [ "${HEARTBEATS:-0}" -eq 0 ] && { echo "WARN: No heartbeats — event log may be stalled"; STALLED=true; }

  RAW_HIT=$(catalyst-events tail --since-line "$_SINCE_LINE" 2>/dev/null | jq -c \
    --argjson pr "$PR_NUMBER" \
    'select((.attributes."vcs.pr.number" == $pr) or (.body.payload.prNumbers // [] | contains([$pr])))' | head -1)
  if [ -n "$RAW_HIT" ]; then
    echo "WARN: Event arrived but filter did not match. Raw event:"; echo "$RAW_HIT" | jq .
    FILTER_MISMATCH=true
  fi

  # The smee→monitor webhook tunnel is the GitHub-event ingestion path and is NOT yet
  # retired (Linear smee retires first; GitHub smee is gated on CTC-134). A dead tunnel
  # produces zero events while the monitor keeps heartbeating — so without this check a
  # worker would treat infra as healthy and enter the 2-hour Phase 2 wait. Tunnel down →
  # skip the extension and rely on the authoritative REST confirmation below.
  TUNNEL_STATE=$(catalyst-monitor status --json 2>/dev/null | jq -r '.webhookTunnel.connected // false')
  [ "$TUNNEL_STATE" != "true" ] && { echo "WARN: Webhook tunnel not running"; STALLED=true; }

  if [ "$FILTER_MISMATCH" = "false" ] && [ "$STALLED" = "false" ]; then
    # Infrastructure healthy — extend to Phase 2.
    EVENT=$(catalyst-events wait-for \
      --filter ".attributes.\"event.name\" == \"github.pr.merged\" and .attributes.\"vcs.pr.number\" == ${PR_NUMBER}" \
      --timeout 7200 2>/dev/null || true)
    [ -n "$EVENT" ] && _WFG_MATCHED=true
  fi
fi

# Authoritative REST confirmation — always follows any wait-for path.
MERGED=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" --jq '.merged' 2>/dev/null || echo "false")
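if [ "$MERGED" = "true" ]; then
  # Proceed with post-merge work here. The ":" no-op keeps the otherwise
  # empty branch syntactically valid until real commands replace it.
  :
fi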

Non-negotiable: every wait-for is paired with an authoritative REST check. Reasons:

The orch-monitor daemon may be down. No daemon → no webhook events → wait-for blocks until timeout. The gh api call after timeout is the safety net.
Transient state can race the event. The webhook may arrive while the worker is doing setup before reaching wait-for. The fallback covers that gap too.
Filters may not match exactly. wait-for returns the first matching line; gh api returns canonical truth. Use gh api (REST), never gh pr view --json (GraphQL).

Pattern 2 — Long-lived orchestrator wakes on multiple event types

The orchestrator's Phase 4 used to poll every 2–3 minutes for every active worker. With CTL-210, the orchestrator runs a Monitor watching all PR/CI/push/lifecycle events, and the reactive scan drops to a 10-minute idle fallback as the safety net (CTL-243).

Preferred (when catalyst-filter is running, CTL-257 + CTL-269): the orchestrate skill emits filter.register at Phase 4 start with a prompt covering CI events, PR transitions, BEHIND-state pushes, comms attention from workers, and Linear ticket changes. Phase 4 then waits on filter.wake.${ORCH_NAME} for a single unified wake covering all those concerns. See [[catalyst-filter]]. The Monitor-over-tail pattern below is the fallback for environments without the daemon.

The recommended shape is scope-aware, generated from the orchestrator's worker signal directory (CTL-240):

Use the `Monitor` tool with this command:

FILTER=$(catalyst-events build-orchestrator-filter "$ORCH_DIR")
catalyst-events tail --filter "$FILTER"

When a notification arrives, re-evaluate the affected worker's state via the
canonical `gh pr view` query. Do NOT trust the event's payload as the source
of truth — use it only as a wake-up trigger.

build-orchestrator-filter reads ${ORCH_DIR}/workers/*.json and emits a single jq predicate that scopes catalyst-origin events by orchestrator name, github events by branch-ref prefix and PR-number set, check_suite / workflow_run events by detail.prNumbers, and linear events by ticket. Re-build it after dispatching new workers so the PR/ticket sets stay in sync.

If you need a hand-rolled equivalent (e.g. the orchestrator name isn't yet known, or you only want broad event-type coverage and don't care about scoping out sibling orchestrators), the broad form is:

catalyst-events tail --filter '
  (.attributes."event.name" | startswith("github.pr.")) or
  (.attributes."event.name" | startswith("github.pr_review")) or
  (.attributes."event.name" | startswith("github.issue_comment")) or
  (.attributes."event.name" | startswith("github.check_")) or
  (.attributes."event.name" | startswith("github.workflow_run")) or
  (.attributes."event.name" | startswith("github.deployment")) or
  (.attributes."event.name" == "github.push") or
  (.attributes."event.name" | startswith("linear.issue.")) or
  (.attributes."event.name" == "orchestrator.worker.phase_advanced") or
  (.attributes."event.name" == "orchestrator.worker.status_terminal") or
  (.attributes."event.name" == "orchestrator.worker.pr_created") or
  (.attributes."event.name" == "orchestrator.worker.done") or
  (.attributes."event.name" == "orchestrator.worker.failed") or
  (.attributes."event.name" == "orchestrator.attention.raised") or
  (.attributes."event.name" == "orchestrator.attention.resolved")
'

pr_review_comment events are where Codex review threads land (required for CTL-64 BLOCKED auto-fixup detection); workflow_run.completed is the most reliable CI-done signal. The filter is intentionally broad — it covers every event type that could require a dashboard re-render, a fix-up dispatch, or a merge-confirmation re-scan. See orchestrate/SKILL.md Phase 4 for the wake-up classification table that maps each event to its reaction.

The orchestrator continues to maintain its 10-minute fallback scan (defense-in-depth). The fast path is event-driven; the slow path is the safety net.

Cross-orchestrator scoping (CTL-234). When multiple orchestrators run on the same machine, narrow the filter with (.attributes."catalyst.orchestrator.id" == "orch-foo") to ignore events from sibling runs. As of CTL-234, the webhook receiver stamps .attributes."catalyst.orchestrator.id" (and the back-compat top-level .orchestrator) on github.* events for PRs whose head branch starts with <orchId>-, so the filter

(.attributes."catalyst.orchestrator.id" == "orch-foo") and (
  (.attributes."event.name" | startswith("github.pr.")) or
  (.attributes."event.name" | startswith("github.check_")) or
  (.attributes."event.name" == "github.push") or
  (.attributes."event.name" | startswith("orchestrator.worker."))
)

works for both worker-lifecycle events (already attributed) and webhook events (now attributed via PR-number lookup or head-ref prefix). Events that don't belong to any active orchestrator (human-merged PRs to main, dependabot PRs, etc.) keep .orchestrator == null and are filtered out, which is the desired behaviour.

Narration: one short line per wake (CTL-369)

Every Monitor wake and every event-driven wait-for return MUST be acknowledged with a single short line of assistant text before the agent returns to waiting. This is the canonical way to defeat the Human:\n<task-id> rendering bleed — without it, transcripts become unreadable.

Why this is non-negotiable. The Claude Code harness wraps every Monitor wake in a user-role <task-notification> XML message whose <summary> is the description field you passed to the Monitor tool. If the assistant returns an end_turn containing only thinking blocks (no text content) — which happens when the model decides the event is routine and has "nothing to say" — the UI renders the next <task-notification> raw and the <task-id> element leaks into the visible transcript as a phantom user message:

⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ Human:
  ba18h9cyy
⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ Human:
  ba18h9cyy

The "ba18h9cyy" lines look like the human user spoke but are actually harness XML element content. The fix is on the agent side, not the harness side: emit any text content in the response turn and the rendering artifact disappears.

Required line shapes. Pick the one that matches the wake. Each is a single line under ~120 characters; both "what arrived" and "what we're doing about it" must appear (even if "what we're doing" is "nothing").

| Situation | Line shape |
|---|---|
| Actionable wake | wake: <event.name> #<pr> [interest=<type>] — <action being taken> |
| Non-actionable / routine | wake: <event.name> — routine, staying in event loop |
| Already addressed (stale broker re-fire) | wake: <event.name> — already addressed, no-op |
| Idle-timeout safety-net scan | wake: idle-timeout — running periodic reconciliation scan |
| Daemon-down / REST fallback | wake: rest-poll — broker down, polling gh api |

Include the matched filter clause or interest type when available. Broker wakes (filter.wake.<sid>) carry .body.payload.interest_id and .body.payload.reason; surface both. Hand-rolled tail / wait-for filters have no interest_id — surface the event name and the matched PR/ticket scope instead. The audience for this line is a human reading the transcript later trying to reconstruct why the agent fired now and what it decided to do.
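A small formatter can keep these lines consistent across skills. The sketch below is one possible shape, not an existing Catalyst helper: narrate_wake and its argument order are hypothetical, and the separator is written as raw UTF-8 bytes so it matches the dash used in the line shapes above.

```shell
#!/usr/bin/env bash
# Hypothetical formatter for the wake-narration line shapes in the table above.
# DASH holds the UTF-8 bytes of the U+2014 separator those shapes use.
DASH=$'\xe2\x80\x94'

# narrate_wake <event-name> <action> [pr-number] [interest-type]
narrate_wake() {
  local event=$1 action=$2 pr=${3:-} interest=${4:-}
  local line="wake: ${event}"
  [ -n "$pr" ] && line+=" #${pr}"                       # PR scope, when known
  [ -n "$interest" ] && line+=" [interest=${interest}]" # broker wakes only
  printf '%s %s %s\n' "$line" "$DASH" "$action"
}

# Usage (not executed here):
#   narrate_wake "github.check_suite.completed" "CI failed, dispatching auto-fixup" 412 pr_lifecycle
#   narrate_wake "orchestrator.worker.phase_advanced" "routine, staying in event loop"
```

The action argument is mandatory by position, which mirrors the rule that "what we're doing about it" must always appear, even when the answer is "nothing".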

Wake-narration fixture (good vs bad). Use this as the acceptance check when reviewing a transcript:

# BAD — orchestrator returned thinking-only end_turn, task-id leaks

⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ Human:
  ba18h9cyy
⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ Human:
  c9a4f2x1q

# GOOD — orchestrator narrates every wake; no task-id bleed

⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ wake: github.check_suite.completed #412 [interest=pr_lifecycle] —
       CI failed on PR #412, dispatching auto-fixup
⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ wake: orchestrator.worker.phase_advanced — routine, staying in event loop

This requirement applies to every event-driven listen loop in Catalyst skills:

orchestrate Phase 4 (orchestrator's own Monitor over catalyst-events tail)
orchestrate worker dispatch prompt (each worker's listen loop)
oneshot Phase 5 (worker's PR listen loop)
Any new skill that wraps Monitor or catalyst-events wait-for

The narration line is not for the agent — it is for the operator reading the transcript later. Treat the wake as a question; treat the line as the answer.

Reading broker wake payloads

When the broker daemon is running and a filter.wake.* event arrives, the payload contains richer context than just the reason string. Use catalyst-events wake-extract to normalize the varied payload shapes into a single predictable object:

EVENT=$(catalyst-events wait-for \
  --filter ".attributes.\"event.name\" | startswith(\"filter.wake.${CATALYST_SESSION_ID}\")" \
  --timeout 600)

# Narrate the wake (mandatory — see Narration section above)
FIELDS=$(echo "$EVENT" | catalyst-events wake-extract)
REASON=$(echo "$FIELDS"   | jq -r '.reason // "unknown"')
INTEREST=$(echo "$FIELDS" | jq -r '.interest_id // "unknown"')
echo "wake: filter.wake [interest=${INTEREST}] — ${REASON}"

# Branch on normalized fields instead of re-querying GitHub/Linear
CI_CONCLUSION=$(echo "$FIELDS" | jq -r '.ci_conclusion // empty')
REVIEW_STATE=$(echo "$FIELDS"  | jq -r '.review_state // empty')
MERGED=$(echo "$FIELDS"        | jq -r '.merged // empty')

case "$CI_CONCLUSION" in
  failure|timed_out)
    # CI failed — pull logs, fix, push without a separate gh api call
    ;;
esac
case "$MERGED" in
  true)
    # PR merged event in the payload — still confirm via gh api REST before declaring done
    ;;
esac

See [[broker]] §10 for the complete wake-extract output schema and the per-interest-type reason string catalogue.

When source_events is empty (watchdog wakes, some Groq prose wakes): all wake-extract fields are null except interest_id and reason. Treat the wake as a "go re-check" signal and fall back to the authoritative REST check.
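One way to detect that case is sketched below, reusing the three variables read in the snippet above. wake_is_recheck_only is a hypothetical helper name, and checking only these three fields is an assumption; the full field list is in the [[broker]] schema.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given the three normalized fields read from wake-extract
# output (empty string when the field was null), succeed when the wake carries
# nothing to branch on and should be treated as a "go re-check" signal.
wake_is_recheck_only() {
  local ci_conclusion=$1 review_state=$2 merged=$3
  [ -z "$ci_conclusion" ] && [ -z "$review_state" ] && [ -z "$merged" ]
}

# Usage (not executed here), with the variables from the snippet above:
#   if wake_is_recheck_only "$CI_CONCLUSION" "$REVIEW_STATE" "$MERGED"; then
#     : # run the authoritative gh api REST check instead of branching on fields
#   fi
```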

Pattern 2/3 fallback (no broker): these recipes consume raw github.* events from wait-for rather than filter.wake.* wakes, so wake-extract does not apply to them.

Pattern 3 — Reactive PR lifecycle (multi-event wait + classify + dispatch)

Pattern 1's single-event wait is fine for the happy path: the PR merges, the worker exits. But between PR-create and PR-merge, four things can happen that the agent should react to, not just sleep through:

| Event | Means | Agent should |
|---|---|---|
| github.check_suite.completed (conclusion=failure / timed_out) | CI failed | pull failure logs, fix, push, re-enter the wait |
| github.pr_review.submitted (state=changes_requested) | Reviewer requested changes | run /review-comments, push, re-enter the wait |
| github.push to the base branch | PR is now BEHIND | gh pr update-branch, re-enter the wait |
| github.pr.merged / github.pr.closed | terminal | confirm via gh api REST, exit |

Wrap one disjunctive wait-for around all of them; classify with a case on the matched event's .attributes."event.name"; re-enter the loop on every non-terminal event. The authoritative gh api REST check runs on every wake-up, the same safety rule as Pattern 1.

# Two-phase compliant cadence loop — see [[wait-for-github]]. The 1800s timeout
# serves as a cadence fallback; the authoritative REST check runs on every wake-up.
REPO=$(gh repo view --json nameWithOwner --jq '.nameWithOwner')
BASE_BRANCH=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" --jq '.base.ref')
ITER=0
MAX_ITER=20

while [ $ITER -lt $MAX_ITER ]; do
  ITER=$((ITER + 1))

  EVENT_JSON=$(catalyst-events wait-for \
    --filter '
      (.attributes."event.name" == "github.pr.merged" and .attributes."vcs.pr.number" == '"$PR_NUMBER"') or
      (.attributes."event.name" == "github.pr.closed" and .attributes."vcs.pr.number" == '"$PR_NUMBER"') or
      (.attributes."event.name" == "github.check_suite.completed"
         and (.body.payload.prNumbers // [] | index('"$PR_NUMBER"') != null)
         and (.attributes."cicd.pipeline.run.conclusion" == "failure" or .attributes."cicd.pipeline.run.conclusion" == "timed_out")) or
      (.attributes."event.name" == "github.pr_review.submitted"
         and .attributes."vcs.pr.number" == '"$PR_NUMBER"'
         and (.body.payload.state == "changes_requested"
              or (.body.payload.state == "commented" and (.body.payload.author.type // "") == "Bot"))) or
      (.attributes."event.name" == "github.push" and .attributes."vcs.ref.name" == "refs/heads/'"$BASE_BRANCH"'")
    ' \
    --timeout 1800 || true)

  # MANDATORY authoritative REST re-check on every wake-up.
  PR_STATE=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" \
    --jq 'if .merged then "MERGED" elif .state == "closed" then "CLOSED" else "OPEN" end' \
    2>/dev/null || echo "OPEN")
  if [ "$PR_STATE" = "MERGED" ]; then break; fi
  if [ "$PR_STATE" = "CLOSED" ]; then exit 1; fi

  EVENT=$(echo "$EVENT_JSON" | jq -r '.attributes."event.name" // ""')
  case "$EVENT" in
    github.check_suite.completed)
      # Pull failure logs, classify, fix, push. Then re-enter the loop.
      ;;
    github.pr_review.submitted)
      # Bot reviewers are addressable inline; humans require operator action.
      # See "Bot vs human authorship" below for the routing heuristic.
      AUTHOR_TYPE=$(echo "$EVENT_JSON" | jq -r '.body.payload.author.type // "User"')
      if [ "$AUTHOR_TYPE" = "Bot" ]; then
        /catalyst-dev:review-comments "$PR_NUMBER"
      fi
      ;;
    github.push)
      gh pr update-branch "$PR_NUMBER" || true
      ;;
    "")
      # Timed out — no event. The gh api check above confirmed not merged;
      # fall through to next iteration.
      ;;
  esac
done

Bot vs human authorship

Review and comment events carry body.payload.author = { login, type } where type is GitHub's user.type field — typically "User" or "Bot". Use it to route review-changes-requested events without re-fetching from the GitHub API:

AUTHOR_TYPE=$(echo "$EVENT_JSON" | jq -r '.body.payload.author.type // "User"')
case "$AUTHOR_TYPE" in
  Bot)
    # codex, claude-code-review, dependabot — addressable inline.
    /catalyst-dev:review-comments "$PR_NUMBER"
    ;;
  *)
    # Human reviewer — surface to the operator and keep waiting.
    ;;
esac

The // "User" fallback ensures pre-CTL-228 events (no author field) are treated as human-authored — the safer default.

Gotchas

check_suite.completed has no vcs.pr.number. A check suite spans many PRs; the affected PR numbers live in body.payload.prNumbers. Filter with (.body.payload.prNumbers // [] | index($PR) != null), not .attributes."vcs.pr.number" == $PR.
The filter is one jq expression. Clauses are joined with or, not comma. Each clause is parenthesized.
Bash quoting. The shell-variable interpolation ('"$PR_NUMBER"') is intentional — the outer single quotes protect the jq syntax from $-expansion, the inner double quotes re-enable it for one variable. Test your filter by piping a fixture event through jq -c "select(<filter>)" before trusting it in production.
Iteration cap. MAX_ITER=20 prevents runaway loops on a stuck failure mode. Apply per-failure-type fix budgets inside each handler too (e.g. give up after 3 distinct fix attempts on the same CI check).
All filtering belongs inside the --filter jq predicate (CTL-240, CTL-372). Do NOT add a downstream | grep … / | awk … / | sed … / | jq … post-pipe to a catalyst-events tail invocation. The primary reason is clarity: --filter is the single place a reader can look to know what reaches the consumer. Splitting filter logic across two stages hides conditions and invites small regressions (someone drops the --line-buffered flag, or the post-pipe pattern no longer matches the canonical envelope). Use catalyst-events build-orchestrator-filter "$ORCH_DIR" to generate a complete scope-aware predicate from the worker signal directory instead of hand-rolling secondary pipes.

Secondary reason (the historical CTL-240 concern): BSD awk, unflagged BSD grep, and unflagged sed buffer stdout in 4 KB blocks when stdout is not a TTY (the Monitor harness captures it). With the typical ~1–3 events/min orchestrator cadence the buffer never fills and notifications stall silently for 15+ minutes despite live PR activity. grep --line-buffered and jq --unbuffered DO mechanically flush per line on macOS and Linux (per their man pages), so the buffering failure mode is conditional, not absolute — but you should still not need either flag, because filtering belongs in --filter.

Anti-pattern: | grep -v '"event.name":"filter.wake"' on the orchestrator's Monitor (observed in real sessions). Wrong for two reasons: (a) filter.wake.* envelopes are canonical-only and do not satisfy any clause of build-orchestrator-filter's v1 predicate, so they never reach the consumer in the first place. (b) The pattern would also strip the orchestrator's OWN intended filter.wake.${ORCH_NAME} wake — the event the orchestrator registered for. Since CTL-346 the broker no longer re-classifies its own emissions, so there is no feedback loop to defend against on the consumer side either.
github.* events carry orchestrator: null and worker: null (CTL-240). Real webhook events are scoped only by .attributes."vcs.repository.name", .attributes."vcs.ref.name", .attributes."vcs.pr.number", .attributes."vcs.revision", and .body.payload.prNumbers. A scope predicate like .attributes."catalyst.orchestrator.id" == "orch-foo" will silently drop every github event. Use branch-ref prefix matching (.attributes."vcs.ref.name" | startswith("refs/heads/orch-foo-")) and PR-number-set matching (.attributes."vcs.pr.number" | IN(501,502)) instead — or use build-orchestrator-filter which handles this for you.
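The two matching styles named in the last gotcha, and the fixture test recommended under "Bash quoting", can be combined in a short sketch. This is a hand-rolled illustration, not the build-orchestrator-filter implementation. build_github_scope and filter_matches are hypothetical names, and the `// ""` guard on the ref name is an added assumption so events without a ref do not raise a jq error.

```shell
#!/usr/bin/env bash
# build_github_scope <orch-id> [pr-number...]
# Hypothetical: emit a jq predicate that matches github events by branch-ref
# prefix, plus PR-number set membership when PR numbers are given.
build_github_scope() {
  local orch_id=$1; shift
  local filter
  filter=$(printf '(.attributes."vcs.ref.name" // "" | startswith("refs/heads/%s-"))' "$orch_id")
  if [ "$#" -gt 0 ]; then
    local prs
    prs=$(IFS=,; echo "$*")   # join the remaining args with commas: 501,502
    filter="(${filter} or (.attributes.\"vcs.pr.number\" | IN(${prs})))"
  fi
  printf '%s\n' "$filter"
}

# filter_matches <jq-predicate> <event-json>
# Exit 0 when the fixture event passes the predicate (requires jq).
filter_matches() {
  printf '%s\n' "$2" | jq -e "select($1)" >/dev/null
}

# Usage (not executed here):
#   FILTER=$(build_github_scope orch-foo 501 502)
#   filter_matches "$FILTER" '{"attributes":{"event.name":"github.pr.merged","vcs.pr.number":501}}' \
#     && echo "fixture matches"
```

Running a known-good and a known-bad fixture through filter_matches before wiring a predicate into --filter catches quoting mistakes without waiting for a live event.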

Long-lived precedent

The orchestrator's Phase 4 loop has used this shape for a while — Monitor over tail with a disjunctive filter, then case on the gh pr view result. The pattern above is the short-lived claude -p-friendly equivalent: wait-for instead of Monitor, case on the matched event instead of the canonical PR state. They share the same safety rule: treat events as wake-up triggers; treat gh pr view (or its equivalent) as truth.

Worker phase events — severity tiers and coalescing (CTL-229)

The worker emitter splits phase transitions into two topics so subscribers can filter by severity instead of inspecting .detail fields:

| Topic | Tier | When | Coalesces? | Carries detail.pr? |
|---|---|---|---|---|
| worker-phase-advanced | info | routine in-flight phases (researching, planning, implementing, validating, shipping) | yes, batched per orchestrator within windowSec (default 30 s) | no |
| worker-status-terminal | act | actionable transitions (pr-created, merging, merged, done, failed, stalled, deploy-failed, deploying) | no, emitted immediately; also flushes any pending coalesce queue | yes when to ∈ {pr-created, merging, merged, done, deploy-failed} |

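Coalesced orchestrator.worker.phase_advanced events leave attributes."catalyst.worker.ticket" unset at the envelope level; the per-change worker lives inside .body.payload.changes[]. The sample below is written in the legacy v1 shape ({ ts, event, orchestrator, worker, detail }), so the array appears there as .detail.changes[]: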

{
  "ts": "2026-05-04T22:00:00Z",
  "orchestrator": "orch-foo",
  "worker": null,
  "event": "worker-phase-advanced",
  "detail": {
    "windowSec": 30,
    "changes": [
      { "ts": "2026-05-04T21:59:32Z", "worker": "CTL-229", "from": "researching", "to": "planning" },
      { "ts": "2026-05-04T21:59:36Z", "worker": "CTL-232", "from": "planning",    "to": "implementing" }
    ]
  }
}

Stragglers (the last event in a sequence) flush via the next emit OR via an explicit emit-worker-status-change.sh flush --orch <id> invocation. The orchestrator's 10-min idle scan is the documented contract for periodic flushing — a worker exiting between phases does not need to flush its own queue.
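A subscriber that needs per-worker lines can unpack a coalesced batch after it has been matched. The sketch below assumes the canonical path .body.payload.changes[] and the worker / from / to field names shown in the sample event. expand_phase_changes is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical reader for coalesced batches: print one "worker from -> to"
# line per entry in .body.payload.changes[]. The `[]?` form tolerates events
# that carry no batch. Requires jq.
expand_phase_changes() {
  jq -r '.body.payload.changes[]? | "\(.worker) \(.from) -> \(.to)"'
}

# Usage (not executed here): unpack a single already-matched event.
#   EVENT=$(catalyst-events wait-for --timeout 600 \
#     --filter '.attributes."event.name" == "orchestrator.worker.phase_advanced"')
#   echo "$EVENT" | expand_phase_changes
```

The unpacking runs on one event returned by wait-for, so selection still happens entirely inside --filter and nothing is post-piped onto a live tail.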

Subscriber recipes:

# Subscribe to actionable transitions only (no routine progress noise)
catalyst-events tail --filter '.attributes."event.name" == "orchestrator.worker.status_terminal"'

# Subscribe to routine progress (already coalesced into batches)
catalyst-events tail --filter '.attributes."event.name" == "orchestrator.worker.phase_advanced"'

# A worker just opened a PR — wait until it tells you the PR number
catalyst-events wait-for --timeout 600 \
  --filter '.attributes."event.name" == "orchestrator.worker.status_terminal" and .body.payload.to == "pr-created" and .attributes."catalyst.worker.ticket" == "CTL-229"' \
  | jq -r '.body.payload.pr.number'

Pattern 4 — Tail everything happening to a ticket

Useful for live debugging or operator dashboards:

# linear.issue.identifier for Linear-event context; catalyst.worker.ticket for worker/orchestrator context
catalyst-events tail --filter '.attributes."linear.issue.identifier" == "CTL-210" or .attributes."catalyst.worker.ticket" == "CTL-210"'

Captures GitHub PR events scoped to that ticket, Linear webhook events for the issue, comms posts where the ticket is the from/parent, and orchestrator/worker lifecycle events.

Diagnostic mode vs subscription mode

The patterns above are all subscription-mode usage. tail and wait-for seek to EOF on first run, so they only see events that arrive after the command starts. That is the correct default when a worker is blocking on a fresh PR merge or an orchestrator is waking on live progress — historical heartbeat noise would otherwise drown out the signal.

It is the wrong default when the question is "are events flowing at all?"

# User runs this to "check if any events are coming through"
catalyst-events tail --filter '.attributes."event.name" | startswith("github.")'
# Sits silent. User concludes: tunnel is dead.
# Reality: tunnel is fine, just no NEW events since they started tailing.

A silent live-tail does NOT mean the tunnel is dead. It means there has been no NEW activity matching your filter since you started tailing. To verify flow, switch to diagnostic mode by passing --since-line 0, which reads the entire current month's log from the start.

Diagnostic recipes

# Most recent github event of any kind, regardless of repo
catalyst-events tail --since-line 0 --filter '.attributes."event.name" | startswith("github.")' \
  | tail -1

# Hourly count over the current log file
catalyst-events tail --since-line 0 --filter '.attributes."event.name" | startswith("github.")' \
  | jq -r '.ts | sub("Z$"; "") | sub(":[0-9]{2}:[0-9]{2}$"; ":00:00")' \
  | sort | uniq -c

# Per-repo breakdown — distinguishes "quiet repo" from "dead tunnel"
catalyst-events tail --since-line 0 --filter '.attributes."event.name" | startswith("github.")' \
  | jq -r '.attributes."vcs.repository.name"' | sort | uniq -c | sort -rn

The per-repo breakdown is the one that most often resolves the misdiagnosis — a tunnel can be perfectly healthy while one watched repo has been quiet for hours and another is flowing normally.

Prefer status JSON when available

Once CTL-244 lands, catalyst-monitor status --json will expose a webhookTunnel object ({connected, smeeUrl, lastEventAt, eventCount24h, eventCount24hByRepo}). That is the structured first diagnostic step and should be checked before reaching for the recipes above. The diagnostic recipes here are the manual deep-dive when status JSON is unavailable, insufficient, or contradicts what you expect.

Filter cookbook

All event.name values are the canonical OTel form that appears on disk. The authoritative list of actionable names for workers lives in [[event-name-allowlist]]; the rows below are illustrative filters built from it.

| Need | Filter |
|---|---|
| All GitHub webhook events | .attributes."event.name" \| startswith("github.") |
| All Linear webhook events | .attributes."event.name" \| startswith("linear.") |
| One PR's merge | .attributes."event.name" == "github.pr.merged" and .attributes."vcs.pr.number" == 342 |
| Any push to a branch | .attributes."event.name" == "github.push" and .attributes."vcs.ref.name" == "refs/heads/main" |
| CI completion | .attributes."event.name" \| startswith("github.check_suite.") |
| CI failure for one PR | .attributes."event.name" == "github.check_suite.completed" and .attributes."cicd.pipeline.run.conclusion" == "failure" and (.body.payload.prNumbers // [] \| index(342) != null) |
| Review changes-requested by a bot | .attributes."event.name" == "github.pr_review.submitted" and .body.payload.state == "changes_requested" and .body.payload.author.type == "Bot" |
| Comment from a human on a PR | .attributes."event.name" == "github.issue_comment.created" and (.body.payload.author.type // "User") != "Bot" |
| Linear ticket state change | .attributes."event.name" == "linear.issue.state_changed" and .attributes."linear.issue.identifier" == "CTL-210" |
| Comms message in one channel | .attributes."event.name" == "comms.message.posted" and .body.payload.channel == "orch-foo" |
| Routine worker phase transitions (info-tier, coalesced batches; CTL-229) | .attributes."event.name" == "orchestrator.worker.phase_advanced" |
| Worker terminal transitions (PR-created, merging, done, fail; CTL-229) | .attributes."event.name" == "orchestrator.worker.status_terminal" |
| One worker's terminal events with PR number | .attributes."event.name" == "orchestrator.worker.status_terminal" and .attributes."catalyst.worker.ticket" == "CTL-210" and (.body.payload.pr.number // null) |
| Worker reached terminal state | .attributes."event.name" == "orchestrator.worker.done" or .attributes."event.name" == "orchestrator.worker.failed" |
| PR review activity | (.attributes."event.name" \| startswith("github.pr_review")) or (.attributes."event.name" == "github.issue_comment.created") |
| Deploy outcome | .attributes."event.name" \| startswith("github.deployment") |
| Attention raised in this orchestrator | .attributes."event.name" == "orchestrator.attention.raised" and .attributes."catalyst.orchestrator.id" == "orch-foo" |

`--timeout` semantics

wait-for --timeout N exits 1 after N seconds with no output. The caller decides what to do (usually: run the authoritative one-shot, then either re-invoke wait-for or give up).
Default timeout is 1800 s (30 min) — long enough for human-paced events, short enough to recover from a daemon crash.
For long waits (e.g. PR merge: hours), set --timeout 7200. The fallback after timeout re-checks via gh and either continues or re-invokes wait-for.
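
The timeout-then-recheck flow described above can be sketched as a small wrapper. This is a minimal sketch under stated assumptions, not part of catalyst-events: wait_with_recheck and its argument convention are hypothetical, and the caller supplies both the blocking wait and the authoritative check as command names so that each stays defined in one place.

```shell
# Hypothetical helper (not shipped with catalyst-events): retry a blocking
# wait and run an authoritative check after every wake-up or timeout.
#   $1 = maximum number of attempts
#   $2 = command/function that blocks like `catalyst-events wait-for`
#        (exit 0 = matched, exit 1 = timed out, anything else = hard error)
#   $3 = command/function performing the authoritative one-shot check
#        (exit 0 = the awaited condition is confirmed)
wait_with_recheck() {
  local max_attempts=$1 wait_cmd=$2 check_cmd=$3
  local attempt=0 rc
  while [ "$attempt" -lt "$max_attempts" ]; do
    attempt=$((attempt + 1))
    rc=0
    "$wait_cmd" || rc=$?
    case "$rc" in
      0|1)
        # Matched or timed out: either way, ask the authoritative source.
        if "$check_cmd"; then
          return 0
        fi
        ;;
      *)
        # Usage error or unexpected failure: retrying will not help.
        return "$rc"
        ;;
    esac
  done
  return 1  # gave up: condition never confirmed within max_attempts
}

# Usage sketch (FILTER, REPO and PR_NUMBER are assumed to be set by the caller):
#   wait_merge()  { catalyst-events wait-for --filter "$FILTER" --timeout 7200 >/dev/null; }
#   check_merge() { [ "$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" --jq '.merged')" = "true" ]; }
#   wait_with_recheck 3 wait_merge check_merge
```

Running the check after a match as well as after a timeout keeps the event purely a wake-up trigger, which is the same rule the recipes in this skill follow.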

Centralization risk

The event stream is a single point of failure. Mitigations:

Always pair wait-for with a one-shot fallback. No skill prose may say "trust the event stream" — every wait must be paired with an authoritative check.
The 10-minute fallback poll inside orch-monitor keeps writing events even when webhook delivery is broken. So daemon-up-but-webhooks-down is recoverable.
The event log is plain JSONL on the local filesystem. Anyone with shell access can tail -F it; no daemon required for reads.
Catalyst-origin events (worker-dispatched, phase-changed, comms.message.posted) are written by writers that don't depend on the daemon. Daemon-down only loses GitHub/Linear webhook events.

v1 vs v2 vs canonical envelopes

The event log carries two legacy schemas plus the new canonical shape (CTL-300):

v1 (bash writers, catalyst-state.sh event): { ts, event, orchestrator, worker, detail }
v2 (TypeScript writers, webhook receiver, CTL-209+): adds id, schemaVersion: 2, source, scope (replacing flat orchestrator / worker with a nested object; v2 still emits the flat fields too as backward-compat aliases).
canonical (CTL-300+): OTel-shaped envelope with attributes."event.name", attributes."vcs.pr.number", etc. All new producers emit canonical; filters in this doc target canonical paths.

Filters that read .attributes."vcs.repository.name" / .attributes."vcs.pr.number" / .attributes."linear.issue.identifier" only match canonical envelopes. Filters that read .attributes."event.name" work for canonical; .event / .worker / .orchestrator work for v1/v2. Choose based on which sources you need to match — webhook events use canonical, orchestrator events may still use v1/v2.
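
When a consumer has to see both legacy and canonical producers, one option is to coalesce the two name fields inside jq. The following is a minimal sketch, assuming jq is installed; event_names and PHASE_ADVANCED_ANY are hypothetical names introduced here for illustration, while the two event names come from the examples in this document.

```shell
# Print the event name of each JSONL envelope on stdin, whichever schema it
# uses: canonical envelopes carry .attributes."event.name", v1/v2 carry .event.
event_names() {
  jq -r '.attributes."event.name" // .event // "unknown"'
}

# A jq predicate that matches coalesced phase batches under either naming
# scheme. Pass it to --filter when both envelope generations may be present.
PHASE_ADVANCED_ANY='
  (.attributes."event.name" == "orchestrator.worker.phase_advanced")
  or (.event == "worker-phase-advanced")
'
```

The `//` alternative operator falls through on null, so an envelope lacking the attributes object simply yields its .event value.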

Quick reference

catalyst-events tail [--filter <jq>] [--since-line <N>]
catalyst-events wait-for [--filter <jq>] [--timeout <sec>]

# Exit codes:
#   0   wait-for: matched a line (printed to stdout)
#   1   wait-for: timed out
#   2   usage error

Environment:

CATALYST_DIR — base directory (default $HOME/catalyst)
CATALYST_EVENTS_DIR — events directory (default $CATALYST_DIR/events)
CATALYST_EVENTS_FILE — override path entirely (used by tests)
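
For scripts that read the log directly, the three variables can be resolved in one place. This is a minimal sketch: events_log_path is a hypothetical helper, and the precedence it applies (file override first, then events directory, then base directory) is an assumption inferred from the descriptions above rather than confirmed behavior of catalyst-events.

```shell
# Resolve the current month's event log, honoring the documented overrides.
# Assumed precedence: CATALYST_EVENTS_FILE, then CATALYST_EVENTS_DIR,
# then CATALYST_DIR, then $HOME/catalyst.
events_log_path() {
  if [ -n "${CATALYST_EVENTS_FILE:-}" ]; then
    printf '%s\n' "$CATALYST_EVENTS_FILE"
    return 0
  fi
  local base="${CATALYST_DIR:-$HOME/catalyst}"
  local dir="${CATALYST_EVENTS_DIR:-$base/events}"
  # Log files are named YYYY-MM.jsonl; use UTC so the name does not depend
  # on the local timezone.
  printf '%s/%s.jsonl\n' "$dir" "$(date -u +%Y-%m)"
}

# Usage sketch: follow the log without any daemon involvement.
#   tail -F "$(events_log_path)"
```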

Related skills

merge-pr Phase 6 — uses Pattern 3 (reactive PR lifecycle, CTL-228)
create-pr Step 12 — uses Pattern 3 (reactive PR lifecycle, CTL-228)
oneshot Phase 5 — worker exits at merging; long-lived watchers (orchestrator Phase 4, standalone /merge-pr) consume Pattern 3 on its behalf
orchestrate Phase 4 — uses Monitor over tail with a disjunctive filter; the long-lived precedent for Pattern 3
catalyst-comms — agent-to-agent pub/sub on per-channel files; comms.message.posted fan-out events go through this same log
[[broker]] — catalyst-broker daemon protocol (auto-correlation via agent.checkin, deterministic pr_lifecycle / ticket_lifecycle / comms_lifecycle routes). Preferred wake mechanism when running; collapses the per-concern jq filters in the recipes below into a single filter.wake.{id} per agent (CTL-303, CTL-371)
[[catalyst-filter]] — Groq-backed semantic event router (deprecated alias for the broker; prose-classification path retained but env-gated off by default since CTL-357)

Worker contract (matches oneshot Phase 5, CTL-371): dispatched workers prefer the broker, fall back to catalyst-events wait-for when the broker is not running, and never use Monitor over catalyst-events tail. See plugins/dev/skills/oneshot/SKILL.md Phase 5 and the orchestrate dispatch prompt for the canonical invocation.

Note on numbering. The "Pattern N" labels in the recipe sections below (Pattern 1 — worker waits for PR merge, Pattern 2 — long-lived orchestrator wakes, Pattern 3 — reactive PR lifecycle, Pattern 4 — tail by ticket) are recipe IDs, not the cost-tier rank in the table above. The recipes pre-date the broker integration; both the broker-preferred path and the wait-for fallback inside each recipe map to the table rows above.

Pattern 1 — Worker waits for its PR to merge

A claude -p worker that just opened PR #342 needs to block until the PR merges, then do post-merge work.

Use the two-phase pattern from [[wait-for-github]]: a 3-minute Phase 1 with a diagnostic checkpoint before committing to the full 2-hour wait.

# Two-phase pattern — see [[wait-for-github]] for full reference.
REPO=$(gh repo view --json nameWithOwner --jq '.nameWithOwner')
EVENT=""
_WFG_MATCHED=false

# Phase 1: short wait with diagnostic checkpoint (3 minutes).
EVENT=$(catalyst-events wait-for \
  --filter ".attributes.\"event.name\" == \"github.pr.merged\" and .attributes.\"vcs.pr.number\" == ${PR_NUMBER}" \
  --timeout 180 2>/dev/null || true)

if [ -n "$EVENT" ]; then
  _WFG_MATCHED=true
else
  # Phase 1 timed out — run diagnostics before extending to Phase 2.
  echo "Phase 1 timed out after 3 min — running diagnostics..."
  STALLED=false
  FILTER_MISMATCH=false

  _LOG_FILE=~/catalyst/events/$(date -u +%Y-%m).jsonl
  # 2>/dev/null comes first so a missing log file does not print a shell error.
  _LOG_LINES=$(wc -l 2>/dev/null < "$_LOG_FILE" | tr -d ' ')
  _SINCE_LINE=$(( ${_LOG_LINES:-0} > 500 ? ${_LOG_LINES:-0} - 500 : 0 ))
  HEARTBEATS=$(catalyst-events tail --since-line "$_SINCE_LINE" 2>/dev/null \
    | jq -c 'select(.attributes."event.name" == "session.heartbeat")' | wc -l | tr -d ' ')
  [ "${HEARTBEATS:-0}" -eq 0 ] && { echo "WARN: No heartbeats — event log may be stalled"; STALLED=true; }

  RAW_HIT=$(catalyst-events tail --since-line "$_SINCE_LINE" 2>/dev/null | jq -c \
    --argjson pr "$PR_NUMBER" \
    'select((.attributes."vcs.pr.number" == $pr) or (.body.payload.prNumbers // [] | contains([$pr])))' | head -1)
  if [ -n "$RAW_HIT" ]; then
    echo "WARN: Event arrived but filter did not match. Raw event:"; echo "$RAW_HIT" | jq .
    FILTER_MISMATCH=true
  fi

  # The smee→monitor webhook tunnel is the GitHub-event ingestion path and is NOT yet
  # retired (Linear smee retires first; GitHub smee is gated on CTC-134). A dead tunnel
  # produces zero events while the monitor keeps heartbeating — so without this check a
  # worker would treat infra as healthy and enter the 2-hour Phase 2 wait. Tunnel down →
  # skip the extension and rely on the authoritative REST confirmation below.
  TUNNEL_STATE=$(catalyst-monitor status --json 2>/dev/null | jq -r '.webhookTunnel.connected // false')
  [ "$TUNNEL_STATE" != "true" ] && { echo "WARN: Webhook tunnel not running"; STALLED=true; }

  if [ "$FILTER_MISMATCH" = "false" ] && [ "$STALLED" = "false" ]; then
    # Infrastructure healthy — extend to Phase 2.
    EVENT=$(catalyst-events wait-for \
      --filter ".attributes.\"event.name\" == \"github.pr.merged\" and .attributes.\"vcs.pr.number\" == ${PR_NUMBER}" \
      --timeout 7200 2>/dev/null || true)
    [ -n "$EVENT" ] && _WFG_MATCHED=true
  fi
fi

# Authoritative REST confirmation — always follows any wait-for path.
MERGED=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" --jq '.merged' 2>/dev/null || echo "false")
if [ "$MERGED" = "true" ]; then
  : # Proceed with post-merge work (the ":" no-op keeps the empty branch valid bash)
fi

Non-negotiable: every wait-for is paired with an authoritative REST check. Reasons:

The orch-monitor daemon may be down. No daemon → no webhook events → wait-for blocks until timeout. The gh api call after timeout is the safety net.
Transient state can race the event. The webhook may arrive while the worker is doing setup before reaching wait-for. The fallback covers that gap too.
Filters may not match exactly. wait-for returns the first matching line; gh api returns canonical truth. Use gh api (REST), never gh pr view --json (GraphQL).

Pattern 2 — Long-lived orchestrator wakes on multiple event types

The recommended shape is scope-aware, generated from the orchestrator's worker signal directory (CTL-240):

Use the `Monitor` tool with this command:

FILTER=$(catalyst-events build-orchestrator-filter "$ORCH_DIR")
catalyst-events tail --filter "$FILTER"

When a notification arrives, re-evaluate the affected worker's state via the
canonical `gh pr view` query. Do NOT trust the event's payload as the source
of truth — use it only as a wake-up trigger.

catalyst-events tail --filter '
  (.attributes."event.name" | startswith("github.pr.")) or
  (.attributes."event.name" | startswith("github.pr_review")) or
  (.attributes."event.name" | startswith("github.issue_comment")) or
  (.attributes."event.name" | startswith("github.check_")) or
  (.attributes."event.name" | startswith("github.workflow_run")) or
  (.attributes."event.name" | startswith("github.deployment")) or
  (.attributes."event.name" == "github.push") or
  (.attributes."event.name" | startswith("linear.issue.")) or
  (.attributes."event.name" == "orchestrator.worker.phase_advanced") or
  (.attributes."event.name" == "orchestrator.worker.status_terminal") or
  (.attributes."event.name" == "orchestrator.worker.pr_created") or
  (.attributes."event.name" == "orchestrator.worker.done") or
  (.attributes."event.name" == "orchestrator.worker.failed") or
  (.attributes."event.name" == "orchestrator.attention.raised") or
  (.attributes."event.name" == "orchestrator.attention.resolved")
'

The orchestrator continues to maintain its 10-minute fallback scan (defense-in-depth). The fast path is event-driven; the slow path is the safety net.

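Do not scope the whole predicate by orchestrator ID, as in the shape below: github.* envelopes carry no orchestrator scope, so every github clause silently matches nothing (see Gotchas). Use build-orchestrator-filter instead.

(.attributes."catalyst.orchestrator.id" == "orch-foo") and (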
  (.attributes."event.name" | startswith("github.pr.")) or
  (.attributes."event.name" | startswith("github.check_")) or
  (.attributes."event.name" == "github.push") or
  (.attributes."event.name" | startswith("worker-"))
)

Narration: one short line per wake (CTL-369)

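Every wake must produce one short narration line in the transcript. When the orchestrator ends its turn with thinking only, the transcript shows the Monitor event followed by a leaked task ID instead:

⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"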
⏺ Human:
  ba18h9cyy
⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ Human:
  ba18h9cyy

Wake-narration fixture (good vs bad). Use this as the acceptance check when reviewing a transcript:

# BAD — orchestrator returned thinking-only end_turn, task-id leaks

⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ Human:
  ba18h9cyy
⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ Human:
  c9a4f2x1q

# GOOD — orchestrator narrates every wake; no task-id bleed

⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ wake: github.check_suite.completed #412 [interest=pr_lifecycle] —
       CI failed on PR #412, dispatching auto-fixup
⏺ Monitor event: "orch-adv-931 events (PR/CI/worker/comms)"
⏺ wake: orchestrator.worker.phase_advanced — routine, staying in event loop

This requirement applies to every event-driven listen loop in Catalyst skills:

orchestrate Phase 4 (orchestrator's own Monitor over catalyst-events tail)
orchestrate worker dispatch prompt (each worker's listen loop)
oneshot Phase 5 (worker's PR listen loop)
Any new skill that wraps Monitor or catalyst-events wait-for

The narration line is not for the agent — it is for the operator reading the transcript later. Treat the wake as a question; treat the line as the answer.

Reading broker wake payloads

EVENT=$(catalyst-events wait-for \
  --filter ".attributes.\"event.name\" | startswith(\"filter.wake.${CATALYST_SESSION_ID}\")" \
  --timeout 600)

# Narrate the wake (mandatory — see Narration section above)
FIELDS=$(echo "$EVENT" | catalyst-events wake-extract)
REASON=$(echo "$FIELDS"   | jq -r '.reason // "unknown"')
INTEREST=$(echo "$FIELDS" | jq -r '.interest_id // "unknown"')
echo "wake: filter.wake [interest=${INTEREST}] — ${REASON}"

# Branch on normalized fields instead of re-querying GitHub/Linear
CI_CONCLUSION=$(echo "$FIELDS" | jq -r '.ci_conclusion // empty')
REVIEW_STATE=$(echo "$FIELDS"  | jq -r '.review_state // empty')
MERGED=$(echo "$FIELDS"        | jq -r '.merged // empty')

case "$CI_CONCLUSION" in
  failure|timed_out)
    # CI failed — pull logs, fix, push without a separate gh api call
    ;;
esac
case "$MERGED" in
  true)
    # PR merged event in the payload — still confirm via gh api REST before declaring done
    ;;
esac

See [[broker]] §10 for the complete wake-extract output schema and the per-interest-type reason string catalogue.

Pattern 2/3 fallback (no broker): the fallback recipes in this skill consume raw github.* events from wait-for, not filter.wake.* wakes, so wake-extract does not apply to those paths.

Pattern 3 — Reactive PR lifecycle (multi-event wait + classify + dispatch)

# Two-phase compliant cadence loop — see [[wait-for-github]]. The 1800s timeout
# serves as a cadence fallback; the authoritative REST check runs on every wake-up.
REPO=$(gh repo view --json nameWithOwner --jq '.nameWithOwner')
BASE_BRANCH=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" --jq '.base.ref')
ITER=0
MAX_ITER=20

while [ $ITER -lt $MAX_ITER ]; do
  ITER=$((ITER + 1))

  EVENT_JSON=$(catalyst-events wait-for \
    --filter '
      (.attributes."event.name" == "github.pr.merged" and .attributes."vcs.pr.number" == '"$PR_NUMBER"') or
      (.attributes."event.name" == "github.pr.closed" and .attributes."vcs.pr.number" == '"$PR_NUMBER"') or
      (.attributes."event.name" == "github.check_suite.completed"
         and (.body.payload.prNumbers // [] | index('"$PR_NUMBER"') != null)
         and (.attributes."cicd.pipeline.run.conclusion" == "failure" or .attributes."cicd.pipeline.run.conclusion" == "timed_out")) or
      (.attributes."event.name" == "github.pr_review.submitted"
         and .attributes."vcs.pr.number" == '"$PR_NUMBER"'
         and (.body.payload.state == "changes_requested"
              or (.body.payload.state == "commented" and (.body.payload.author.type // "") == "Bot"))) or
      (.attributes."event.name" == "github.push" and .attributes."vcs.ref.name" == "refs/heads/'"$BASE_BRANCH"'")
    ' \
    --timeout 1800 || true)

  # MANDATORY authoritative REST re-check on every wake-up.
  PR_STATE=$(gh api "repos/${REPO}/pulls/${PR_NUMBER}" \
    --jq 'if .merged then "MERGED" elif .state == "closed" then "CLOSED" else "OPEN" end' \
    2>/dev/null || echo "OPEN")
  if [ "$PR_STATE" = "MERGED" ]; then break; fi
  if [ "$PR_STATE" = "CLOSED" ]; then exit 1; fi

  EVENT=$(echo "$EVENT_JSON" | jq -r '.attributes."event.name" // ""')
  case "$EVENT" in
    github.check_suite.completed)
      # Pull failure logs, classify, fix, push. Then re-enter the loop.
      ;;
    github.pr_review.submitted)
      # Bot reviewers are addressable inline; humans require operator action.
      # See "Bot vs human authorship" below for the routing heuristic.
      AUTHOR_TYPE=$(echo "$EVENT_JSON" | jq -r '.body.payload.author.type // "User"')
      if [ "$AUTHOR_TYPE" = "Bot" ]; then
        /catalyst-dev:review-comments "$PR_NUMBER"
      fi
      ;;
    github.push)
      gh pr update-branch "$PR_NUMBER" || true
      ;;
    "")
      # Timed out — no event. The gh api check above confirmed not merged;
      # fall through to next iteration.
      ;;
  esac
done

Bot vs human authorship

AUTHOR_TYPE=$(echo "$EVENT_JSON" | jq -r '.body.payload.author.type // "User"')
case "$AUTHOR_TYPE" in
  Bot)
    # codex, claude-code-review, dependabot — addressable inline.
    /catalyst-dev:review-comments "$PR_NUMBER"
    ;;
  *)
    # Human reviewer — surface to the operator and keep waiting.
    ;;
esac

The // "User" fallback ensures pre-CTL-228 events (no author field) are treated as human-authored — the safer default.

Gotchas

check_suite.completed has no vcs.pr.number. A check suite spans many PRs; the affected PR numbers live in body.payload.prNumbers. Filter with (.body.payload.prNumbers // [] | index($PR) != null), not .attributes."vcs.pr.number" == $PR.
The filter is one jq expression. Clauses are joined with or, not comma. Each clause is parenthesized.
Bash quoting. The shell-variable interpolation ('"$PR_NUMBER"') is intentional — the outer single quotes protect the jq syntax from $-expansion, the inner double quotes re-enable it for one variable. Test your filter by piping a fixture event through jq -c "select(<filter>)" before trusting it in production.
Iteration cap. MAX_ITER=20 prevents runaway loops on a stuck failure mode. Apply per-failure-type fix budgets inside each handler too (e.g. give up after 3 distinct fix attempts on the same CI check).
All filtering belongs inside the --filter jq predicate (CTL-240, CTL-372). Do NOT add a downstream | grep … / | awk … / | sed … / | jq … post-pipe to a catalyst-events tail invocation. The primary reason is clarity: --filter is the single place a reader can look to know what reaches the consumer. Splitting filter logic across two stages hides conditions and invites small regressions (someone drops the --line-buffered flag, or the post-pipe pattern no longer matches the canonical envelope). Use catalyst-events build-orchestrator-filter "$ORCH_DIR" to generate a complete scope-aware predicate from the worker signal directory instead of hand-rolling secondary pipes.

Secondary reason (the historical CTL-240 concern): BSD awk, unflagged BSD grep, and unflagged sed buffer stdout in 4 KB blocks when stdout is not a TTY (the Monitor harness captures it). With the typical ~1–3 events/min orchestrator cadence the buffer never fills and notifications stall silently for 15+ minutes despite live PR activity. grep --line-buffered and jq --unbuffered DO mechanically flush per line on macOS and Linux (per their man pages), so the buffering failure mode is conditional, not absolute — but you should still not need either flag, because filtering belongs in --filter.

Anti-pattern: | grep -v '"event.name":"filter.wake"' on the orchestrator's Monitor (observed in real sessions). Wrong for two reasons: (a) filter.wake.* envelopes are canonical-only and do not satisfy any clause of build-orchestrator-filter's v1 predicate, so they never reach the consumer in the first place. (b) The pattern would also strip the orchestrator's OWN intended filter.wake.${ORCH_NAME} wake — the event the orchestrator registered for. Since CTL-346 the broker no longer re-classifies its own emissions, so there is no feedback loop to defend against on the consumer side either.
github.* events carry orchestrator: null and worker: null (CTL-240). Real webhook events are scoped only by .attributes."vcs.repository.name", .attributes."vcs.ref.name", .attributes."vcs.pr.number", .attributes."vcs.revision", and .body.payload.prNumbers. A scope predicate like .attributes."catalyst.orchestrator.id" == "orch-foo" will silently drop every github event. Use branch-ref prefix matching (.attributes."vcs.ref.name" | startswith("refs/heads/orch-foo-")) and PR-number-set matching (.attributes."vcs.pr.number" | IN(501,502)) instead — or use build-orchestrator-filter which handles this for you.
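
The quoting and fixture-testing advice above can be sketched as two small helpers. Both names (pr_merged_filter, filter_matches) are hypothetical and exist only for illustration; the predicate itself is the "One PR's merge" shape used elsewhere in this skill, and the fixture check assumes jq is installed.

```shell
# Hypothetical helper: build the "one PR merged" predicate for a given PR.
# Rejects non-numeric input so the value can be spliced into jq source safely.
pr_merged_filter() {
  case "$1" in
    ''|*[!0-9]*)
      echo "pr_merged_filter: PR number must be numeric" >&2
      return 2
      ;;
  esac
  # Single quotes protect the jq syntax; the PR number is appended outside them.
  printf '%s' '.attributes."event.name" == "github.pr.merged" and .attributes."vcs.pr.number" == '"$1"
}

# Check a filter ($1) against one fixture event ($2) before relying on it.
# Exits 0 when the fixture matches and non-zero when it does not (requires jq).
filter_matches() {
  printf '%s\n' "$2" | jq -e "select($1)" >/dev/null
}

# Usage sketch:
#   FILTER=$(pr_merged_filter "$PR_NUMBER") || exit 2
#   filter_matches "$FILTER" "$(cat fixture-event.json)" && echo "filter ok"
```

Validating the number before interpolation is a design choice of this sketch: anything spliced into the predicate becomes jq source, so a stray value would otherwise surface only as a filter that never matches.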

Long-lived precedent

Worker phase events — severity tiers and coalescing (CTL-229)

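The worker emitter splits phase transitions into two topics so subscribers can filter by severity instead of inspecting .detail fields:

orchestrator.worker.phase_advanced: routine, info-tier transitions, coalesced into batches.
orchestrator.worker.status_terminal: actionable terminal transitions (PR-created, merging, done, fail).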

Coalesced orchestrator.worker.phase_advanced events leave attributes."catalyst.worker.ticket" unset at the envelope level; the per-change worker lives inside .body.payload.changes[]. The example below is in the legacy v1 shape, where the batch appears under .detail.changes[]:

{
  "ts": "2026-05-04T22:00:00Z",
  "orchestrator": "orch-foo",
  "worker": null,
  "event": "worker-phase-advanced",
  "detail": {
    "windowSec": 30,
    "changes": [
      { "ts": "2026-05-04T21:59:32Z", "worker": "CTL-229", "from": "researching", "to": "planning" },
      { "ts": "2026-05-04T21:59:36Z", "worker": "CTL-232", "from": "planning",    "to": "implementing" }
    ]
  }
}
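
Because the worker identity sits inside each change rather than on the envelope, a subscriber has to unpack the batch. The following is a minimal sketch, assuming jq is installed; phase_changes is a hypothetical helper, and it assumes canonical batches use the same per-change keys (worker, from, to) as the v1 example above.

```shell
# Print one "worker: from -> to" line per change in coalesced phase batches.
# Reads JSONL envelopes on stdin. Tries the canonical location first, then
# the v1 location; envelopes with no changes array produce no output.
phase_changes() {
  jq -r '
    (.body.payload.changes // .detail.changes // [])[]
    | "\(.worker): \(.from) -> \(.to)"
  '
}

# Usage sketch:
#   catalyst-events tail --filter '.attributes."event.name" == "orchestrator.worker.phase_advanced"' \
#     | phase_changes
```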

Subscriber recipes:

# Subscribe to actionable transitions only (no routine progress noise)
catalyst-events tail --filter '.attributes."event.name" == "orchestrator.worker.status_terminal"'

# Subscribe to routine progress (already coalesced into batches)
catalyst-events tail --filter '.attributes."event.name" == "orchestrator.worker.phase_advanced"'

# A worker just opened a PR — wait until it tells you the PR number
catalyst-events wait-for --timeout 600 \
  --filter '.attributes."event.name" == "orchestrator.worker.status_terminal" and .body.payload.to == "pr-created" and .attributes."catalyst.worker.ticket" == "CTL-229"' \
  | jq -r '.body.payload.pr.number'

Pattern 4 — Tail everything happening to a ticket

Useful for live debugging or operator dashboards:

# linear.issue.identifier for Linear-event context; catalyst.worker.ticket for worker/orchestrator context
catalyst-events tail --filter '.attributes."linear.issue.identifier" == "CTL-210" or .attributes."catalyst.worker.ticket" == "CTL-210"'

Captures GitHub PR events scoped to that ticket, Linear webhook events for the issue, comms posts where the ticket is the from/parent, and orchestrator/worker lifecycle events.

Diagnostic mode vs subscription mode

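Without --since-line, catalyst-events tail is a subscription: it surfaces only events that arrive after it starts. That is the wrong default when the question is "are events flowing at all?"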

# User runs this to "check if any events are coming through"
catalyst-events tail --filter '.attributes."event.name" | startswith("github.")'
# Sits silent. User concludes: tunnel is dead.
# Reality: tunnel is fine, just no NEW events since they started tailing.

Diagnostic recipes

# Most recent github event of any kind, regardless of repo
catalyst-events tail --since-line 0 --filter '.attributes."event.name" | startswith("github.")' \
  | tail -1

# Hourly count over the current log file
catalyst-events tail --since-line 0 --filter '.attributes."event.name" | startswith("github.")' \
  | jq -r '.ts | sub("Z$"; "") | sub(":[0-9]{2}:[0-9]{2}$"; ":00:00")' \
  | sort | uniq -c

# Per-repo breakdown — distinguishes "quiet repo" from "dead tunnel"
catalyst-events tail --since-line 0 --filter '.attributes."event.name" | startswith("github.")' \
  | jq -r '.attributes."vcs.repository.name"' | sort | uniq -c | sort -rn

The per-repo breakdown is the one that most often resolves the misdiagnosis — a tunnel can be perfectly healthy while one watched repo has been quiet for hours and another is flowing normally.

