Event architecture

Catalyst’s observability layer has three tiers of durability, and frontends (web UI, terminal UI, scripts) consume a fused stream built from all three.

| Source | Kind | Writer | Lifetime |
| --- | --- | --- | --- |
| Worker signal file (<orch-dir>/workers/<ticket>.json) | Mutable snapshot | Worker, orchestrator | Archived with the orchestrator dir |
| Global state (~/catalyst/state.json) | Mutable snapshot | catalyst-state.sh worker / orchestrator | Persists across all orchestrations |
| Global event log (~/catalyst/events/YYYY-MM.jsonl) | Append-only (monthly rotation) | catalyst-state.sh event | Never truncated automatically |

Snapshots answer “what is the state right now?” — the event log answers “what happened and when?”. Both matter.

An earlier design used only the signal file. It was insufficient because:

  1. Workers exit before merge — the subprocess running /oneshot reliably terminates at its final tool-use, which happens before PR merge completes. If pr.mergedAt lived only in the signal file, it would never be written by the worker itself.
  2. Multi-orchestrator aggregation — a single signal file describes one worker. The dashboard needs to query across all orchestrators to show “how many waves are active right now?”
  3. Audit trails — status snapshots overwrite each other. The event log keeps the full history (researching → planning → implementing → ...) even when the worker is long gone.

So signal files handle the first layer (per-worker snapshot), global state handles the second (fleet snapshot), and the event log handles the third (append-only history).
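
The difference between a snapshot and a history can be illustrated with a small fold over status events. This is a minimal sketch, not Catalyst's implementation: the helper name `foldWorkerEvents` is hypothetical, the sample records are made up for illustration, and the field names (`worker`, `status`, `ts`) are borrowed from the worker-update payload shown in this document.

```javascript
// Replay status events (oldest first) into a per-worker snapshot plus history.
// Hypothetical helper; field names mirror the worker-update payload.
function foldWorkerEvents(events) {
  const snapshot = {}; // worker -> latest status (what a mutable snapshot holds)
  const history = {};  // worker -> every status seen, in order (what the log keeps)
  for (const { worker, status } of events) {
    snapshot[worker] = status;              // overwrites the previous value
    (history[worker] ??= []).push(status);  // never overwrites, only appends
  }
  return { snapshot, history };
}

// Illustrative records only; timestamps are invented for the example.
const { snapshot, history } = foldWorkerEvents([
  { worker: 'CTL-48', status: 'researching', ts: '2026-04-14T18:40:00Z' },
  { worker: 'CTL-48', status: 'planning', ts: '2026-04-14T18:52:00Z' },
  { worker: 'CTL-48', status: 'implementing', ts: '2026-04-14T19:03:01Z' },
]);

console.log(snapshot['CTL-48']);              // implementing
console.log(history['CTL-48'].join(' -> '));  // researching -> planning -> implementing
```

The snapshot alone cannot tell you that the worker ever passed through planning; the history can, which is the gap the event log fills.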

Using a status transition as the example:

```text
Worker writes ───> Signal file (local snapshot, atomic via tmp+mv)
             └───> catalyst-state.sh worker ───> Global state (atomic via jq+mkdir lock)
                                            └──> Event (appended to events.jsonl)

orch-monitor ──> fs.watch on signal files ───> Recomputes snapshot
            └──> fs.watch on state.json   ───> Fan out via SSE
            └──> tail -f on events.jsonl  ───> SSE event stream
```

The monitor never writes — it only reads. Remediation (advancing a ticket, re-dispatching a worker) always goes through the skill layer, which in turn goes through catalyst-state.sh to maintain the write ordering invariants.

The orch-monitor exposes GET /events as a Server-Sent Events stream. Events are JSON objects following the same schema as events.jsonl:

```text
event: worker-update
data: {"orchestrator":"orch-...","worker":"CTL-48","status":"implementing","phase":3,"ts":"2026-04-14T19:03:01Z"}

event: pr-update
data: {"orchestrator":"orch-...","worker":"CTL-48","pr":123,"ciStatus":"passing","ts":"2026-04-14T19:20:44Z"}

event: liveness-change
data: {"orchestrator":"orch-...","worker":"CTL-48","alive":false,"pid":63709,"ts":"2026-04-14T19:22:00Z"}

event: snapshot
data: {"orchestrators":[...],"generatedAt":"2026-04-14T19:22:00Z"}
```

| Event | Source | When |
| --- | --- | --- |
| snapshot | Generated by monitor | On connect, every 60s, and on any state change |
| worker-update | Signal file change | Worker writes a new status or phase |
| pr-update | GitHub webhook (fallback: 10-min poll) | PR state or CI status changed |
| liveness-change | PID check (every 5s) | A worker’s PID stopped responding |
| attention-raised | Global state change | Orchestrator added an attention item |
| wave-completed | Event log tail | Orchestrator emitted wave-completed |
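
The table names a PID check as the source of liveness-change but does not say how the monitor performs it. One common technique in Node is to send signal 0, which tests for the process without delivering a signal. The sketch below assumes that approach; `isAlive` and `livenessChange` are hypothetical names, and the returned object only mirrors the `worker`, `alive`, and `pid` fields from the example payload.

```javascript
// Returns true if a process with this PID exists, false if it is gone.
// Signal 0 runs the existence/permission check without delivering a signal.
function isAlive(pid) {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM means the process exists but belongs to another user.
    return err.code === 'EPERM';
  }
}

// Produce a liveness-change payload only when the state actually flips.
// Returns null when nothing changed, so callers can skip emitting an event.
function livenessChange(worker, wasAlive, pid) {
  const alive = isAlive(pid);
  return alive === wasAlive ? null : { worker, alive, pid };
}

console.log(isAlive(process.pid)); // true: the current process is running
```

Emitting only on a flip keeps the stream quiet: a check that runs every 5s would otherwise produce an event per worker per tick.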

Clients (web UI, terminal UI, custom dashboards) subscribe once and render incrementally. No polling.
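
For clients that read the raw HTTP body instead of using an EventSource library, the wire format is simple enough to parse by hand. The following is a minimal sketch of an SSE frame parser, assuming every data payload is JSON as described above; `parseSseFrames` is a hypothetical name, and fields such as `id` and `retry` are ignored for brevity.

```javascript
// Parse complete SSE frames out of a text buffer.
// Returns the parsed events plus any trailing partial frame to carry over
// into the next chunk.
function parseSseFrames(buffer) {
  const normalized = buffer.replace(/\r\n?/g, '\n');
  const frames = normalized.split('\n\n');
  const rest = frames.pop(); // incomplete tail: no terminating blank line yet
  const events = [];
  for (const frame of frames) {
    let event = 'message'; // default event type when no "event:" field is sent
    const data = [];
    for (const line of frame.split('\n')) {
      if (line === '' || line.startsWith(':')) continue; // comment or keep-alive
      const idx = line.indexOf(':');
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? '' : line.slice(idx + 1);
      if (value.startsWith(' ')) value = value.slice(1); // strip one leading space
      if (field === 'event') event = value;
      else if (field === 'data') data.push(value);
    }
    // Frames without data are not dispatched.
    if (data.length > 0) events.push({ event, data: JSON.parse(data.join('\n')) });
  }
  return { events, rest };
}

const chunk =
  'event: worker-update\n' +
  'data: {"worker":"CTL-48","status":"implementing"}\n\n' +
  'event: pr-up'; // second frame is cut off mid-stream
const parsed = parseSseFrames(chunk);
console.log(parsed.events[0].event);       // worker-update
console.log(parsed.events[0].data.status); // implementing
console.log(parsed.rest);                  // event: pr-up
```

Prepend `rest` to the next chunk before parsing again so that frames split across network reads are reassembled.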

Any SSE-capable client works. Example in Node:

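```js
import { EventSource } from 'eventsource';

// Placeholder notifier (not part of Catalyst); swap in your own Slack integration.
const notifySlack = (text) => console.warn(`[slack] ${text}`);

const es = new EventSource('http://localhost:7400/events');

// Log every worker status change.
es.addEventListener('worker-update', (e) => {
  const { worker, status } = JSON.parse(e.data);
  console.log(`${worker}: ${status}`);
});

// Alert when CI fails on a worker's PR.
es.addEventListener('pr-update', (e) => {
  const { worker, pr, ciStatus } = JSON.parse(e.data);
  if (ciStatus === 'failing') notifySlack(`${worker} PR #${pr} CI failed`);
});
```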

Or in Bash, for quick ad-hoc piping:

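```bash
# -s hides curl's progress meter; --line-buffered stops grep from holding
# matches in its output buffer while the stream is still open.
curl -sN http://localhost:7400/events \
  | grep --line-buffered -E -A 1 '^event: (worker-update|pr-update)' \
  | grep '^data:'
```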

catalyst-events is the command-line interface for reading the unified event log at ~/catalyst/events/YYYY-MM.jsonl. It supports real-time tailing and waiting for specific events.

```bash
catalyst-events tail [--filter <jq>] [--since-line <N>]
```

Tails the current month’s event log, printing each new line as it arrives. Use --filter with any jq expression to narrow output:

```bash
# All GitHub webhook events
catalyst-events tail --filter '.source == "github.webhook"'

# Linear issue state changes
catalyst-events tail --filter '.event | startswith("linear.issue.state")'

# Events for a specific ticket
catalyst-events tail --filter '.worker == "CTL-48"'

# Worker lifecycle events only
catalyst-events tail --filter '.event | startswith("worker-")'
```

```bash
catalyst-events wait-for --filter <jq> [--timeout <sec>]
```

Blocks until a matching event appears (or timeout expires). Used internally by skills that need to synchronize on external events:

```bash
# Wait up to 120s for a PR merge event for ticket CTL-48
catalyst-events wait-for --filter '.event == "github.pr.merged" and .worker == "CTL-48"' --timeout 120
```

When the orch-monitor is running with webhooks configured, wait-for returns within ~1s of the event arriving. Without the monitor, it falls back to polling the event log file at 10-minute intervals (600s maximum latency).

Every record in ~/catalyst/events/YYYY-MM.jsonl has an event field with a dot-namespaced topic:

| Topic pattern | Source | Description |
| --- | --- | --- |
| github.pr.opened / github.pr.merged / github.pr.closed | GitHub webhook | PR lifecycle |
| github.pr.review_submitted | GitHub webhook | PR review submitted |
| github.check_suite.completed | GitHub webhook | CI suite result |
| github.workflow_run.completed | GitHub webhook | GitHub Actions workflow result |
| github.push | GitHub webhook | Push to a tracked branch |
| github.deployment / github.deployment_status | GitHub webhook | Deploy lifecycle |
| linear.issue.created / linear.issue.state_changed | Linear webhook | Issue lifecycle |
| linear.issue.priority_changed / linear.issue.assignee_changed | Linear webhook | Issue field updates |
| linear.comment.created | Linear webhook | Comment on a Linear issue |
| linear.cycle.* | Linear webhook | Cycle create/update/remove |
| comms.message.posted | catalyst-comms | Agent coordination message |
| worker-dispatched / worker-pr-created / worker-done | Orchestrator | Worker lifecycle |
| catalyst.* | catalyst-session | Skill/session lifecycle events |

GitHub and Linear topics arrive via webhooks when the monitor is configured (see Webhook Pipeline Setup). Fallback polling writes the same topic shapes but with a source: "poll" field instead of "github.webhook" or "linear.webhook".
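
A consumer that has already parsed log records may want to select topics by the same patterns the table uses. The sketch below is one possible client-side helper, not part of Catalyst: `matchesTopic` and `isPolled` are hypothetical names, and treating a trailing `.*` as "any deeper topic" is an assumption about what the table's wildcard notation means.

```javascript
// Match a dot-namespaced topic against a pattern. A trailing ".*" matches
// any topic under that prefix (for example "linear.cycle.*"); anything else
// must match exactly.
function matchesTopic(pattern, topic) {
  if (pattern.endsWith('.*')) {
    // Drop only the "*" so the trailing dot stays part of the prefix.
    return topic.startsWith(pattern.slice(0, -1));
  }
  return pattern === topic;
}

// True when the record came from fallback polling rather than a webhook.
function isPolled(record) {
  return record.source === 'poll';
}

console.log(matchesTopic('linear.cycle.*', 'linear.cycle.updated'));   // true
console.log(matchesTopic('linear.cycle.*', 'linear.comment.created')); // false
console.log(isPolled({ event: 'github.pr.merged', source: 'poll' }));  // true
```

Keeping the dot in the prefix prevents `linear.cycle.*` from matching an unrelated topic that merely starts with the same letters.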

If a client disconnects (network blip, process restart), it reconnects automatically: SSE clients are expected to retry on their own, so the browser or EventSource library handles it without application code. On reconnect the monitor immediately sends a fresh snapshot event so the client can reconcile any missed updates.
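
On the client side, reconciliation can be as simple as letting each snapshot supersede whatever incremental state came before it. The reducer below is a minimal sketch under that assumption; `createState` and `applyEvent` are hypothetical names, and the snapshot payload is stored as-is because its internal shape is not spelled out here.

```javascript
// Client-side state: the last snapshot (kept opaque) plus the per-worker
// updates received since that snapshot.
function createState() {
  return { snapshot: null, updates: new Map() };
}

// Apply one SSE event and return the next state without mutating the old one.
function applyEvent(state, type, payload) {
  if (type === 'snapshot') {
    // A snapshot supersedes every incremental update seen before it.
    return { snapshot: payload, updates: new Map() };
  }
  if (type === 'worker-update') {
    const updates = new Map(state.updates);
    const key = `${payload.orchestrator}/${payload.worker}`;
    updates.set(key, { ...updates.get(key), ...payload });
    return { snapshot: state.snapshot, updates };
  }
  return state; // other event types are ignored in this sketch
}

let state = createState();
state = applyEvent(state, 'worker-update', {
  orchestrator: 'orch-a', worker: 'CTL-48', status: 'implementing',
});
console.log(state.updates.size); // 1

// After a reconnect the first event is a snapshot, which resets the overlay.
state = applyEvent(state, 'snapshot', { generatedAt: '2026-04-14T19:22:00Z' });
console.log(state.updates.size); // 0
```

Because the snapshot resets the overlay, a client never needs to know which updates it missed while disconnected.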

The event log (events.jsonl) is the durable fallback: if a client missed events while disconnected, it can replay from the log by its last-seen timestamp:

```bash
awk -F'"ts":' '$2 > "\"2026-04-14T19:00:00Z\"" {print}' ~/catalyst/events/2026-04.jsonl
```
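
The awk one-liner compares raw text, so it depends on compact JSON and on timestamps that sort lexicographically. A client that already holds the file contents can do the same replay by parsing each line. This is a minimal sketch, assuming each record carries an ISO 8601 `ts` field as in the examples above; `replaySince` is a hypothetical name.

```javascript
// Return parsed records from JSONL text whose "ts" is strictly later than
// sinceIso. Blank, partial, or corrupt lines are skipped rather than fatal.
function replaySince(jsonlText, sinceIso) {
  const since = Date.parse(sinceIso);
  const records = [];
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue;
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // e.g. a line still being appended when the file was read
    }
    // Records without a parseable ts yield NaN, which never compares greater.
    if (Date.parse(record.ts) > since) records.push(record);
  }
  return records;
}

// Illustrative log contents only.
const log = [
  '{"event":"worker-dispatched","worker":"CTL-48","ts":"2026-04-14T18:30:00Z"}',
  '{"event":"github.pr.merged","worker":"CTL-48","ts":"2026-04-14T19:25:10Z"}',
].join('\n');

const missed = replaySince(log, '2026-04-14T19:00:00Z');
console.log(missed.map((r) => r.event)); // [ 'github.pr.merged' ]
```

If the disconnect spans a month boundary, the same function would need to run over each monthly file in turn.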

File-based append-only logs and filesystem watches are intentionally boring. They:

  • Require no additional process (no Redis, no Kafka)
  • Survive monitor restarts (events.jsonl is the source of truth)
  • Are debuggable with cat, tail, and jq
  • Work offline

The cost is that you’re limited to one machine — if you need multi-host aggregation, pipe events.jsonl into your regular log shipping stack (Vector, Fluent Bit, whatever you already run). The schema is stable.