Post-Merge Verification (Anti-Reward-Hacking)
Autonomous workers have an incentive to declare victory. If you let them self-report success, they’ll cheerfully mark tasks “done” while types are widened to any, tests are marked .skip, and silent failures are caught by catch(e){}. The orchestrator defends against this with adversarial verification — a separate agent that re-reads the merged commit from scratch and challenges the worker’s claims.
When verification runs
Section titled “When verification runs”Workers merge their own PRs via gh pr merge --squash --delete-branch. After the merge is observed, the orchestrator runs orchestrate-verify.sh on the merged commit as a surface-gaps step. Verification does not gate the merge — it runs post-merge and files a remediation ticket when it finds gaps.
Worker: "I wrote tests, types check, security review passed" → merges own PR │ vOrchestrator (post-merge): runs orchestrate-verify.sh on merged commit │ ├─ verified → advance wave, mark ticket done └─ failed → file remediation ticket; wave advancement blocked until ticket filed (or unblocked if allowSelfReportedCompletion is true)What verification checks
Section titled “What verification checks”The verifier runs the orchestrate-verify.sh script (in plugins/dev/scripts/) plus a dedicated LLM pass. The combined checks:
| Check | What it looks for |
|---|---|
| Test existence | .test.ts / _test.go / test_*.py files for new functions/endpoints |
| Test execution | Runs the project’s test command; demands green |
| Type safety | Runs the typecheck command; fails on any new errors |
| Reward-hacking patterns | as any, @ts-ignore, // eslint-disable, any return types, .skip tests, empty catches |
| Silent failure hunter | Catch blocks that swallow without logging/rethrowing |
| PR description vs reality | Does the PR description match the diff? Any undocumented changes? |
| Linear ticket AC vs diff | Do the acceptance criteria from the ticket appear to be addressed? |
The LLM pass is done by a different model than the worker used (if the worker was Opus, verification uses Sonnet; and vice versa) so the two aren’t making identical mistakes.
Verification outcomes
Section titled “Verification outcomes”Two possible outcomes per verification run:
1. Pass
Section titled “1. Pass”The orchestrator records verification-passed in the event log. The wave advances normally and the ticket moves to Done.
2. Fail — remediation ticket filed
Section titled “2. Fail — remediation ticket filed”Verification found gaps in the merged commit (missing tests, reward-hacking patterns, or a discrepancy between the PR description and the diff). The orchestrator files a new remediation ticket with specific findings:
Verification found issues in CTL-48 (merged):1. Added function `validateEmail` at src/auth.ts:42 has no unit tests2. Used `as any` at src/api.ts:88 — should be a proper type guard3. Test file src/auth.test.ts:112 is `.skip`'d without explanationThe remediation ticket enters the backlog and can be scheduled in a future wave. Whether the current wave waits for the remediation ticket to be filed before advancing is controlled by allowSelfReportedCompletion:
allowSelfReportedCompletion: false(default) — wave blocks until the remediation ticket is filedallowSelfReportedCompletion: true— verification failures are advisory; wave advances immediately
Note: blocking wave advancement waits for the ticket to be filed, not for the issues to be resolved. The remediation work itself happens separately.
Why worker-side checks aren’t enough
Section titled “Why worker-side checks aren’t enough”Every worker already runs its own quality gates (typecheck, lint, test, build, security review, code review) during Phase 4. Verification is different because:
| Worker-side (Phase 4) | Verification (orchestrator) |
|---|---|
| Same agent that wrote the code | Different agent with fresh context |
| Trusts its own claims | Adversarial — assumes nothing |
| Runs inside worker context budget | Runs inside orchestrator context — can read full diff + ticket + history |
| Can be fooled by its own rationalizations | Has no priors about what should be there |
Worker Phase 4 is necessary but not sufficient. The orchestrator’s verification is the thing that catches “I wrote a test that calls expect(true).toBe(true) and shipped it.”
Event log integration
Section titled “Event log integration”Each verification run emits events:
verification-started detail: { ticket, round, verifier-model }verification-passed detail: { ticket, round }verification-failed detail: { ticket, round, findings: [...] }These appear in the dashboard and the /events SSE stream, so you can watch verification run live on the same screen as phase progress.
Configuration
Section titled “Configuration”Controlled via catalyst.orchestration in .catalyst/config.json:
| Field | Type | Default | Description |
|---|---|---|---|
verifyBeforeMerge | boolean | true | Run adversarial verification on merged commits (post-merge) |
allowSelfReportedCompletion | boolean | false | When true, verification failures are advisory — wave advances without waiting for a remediation ticket to be filed |
To disable verification (not recommended):
{ "catalyst": { "orchestration": { "verifyBeforeMerge": false } }}To allow waves to advance even when verification finds gaps:
{ "catalyst": { "orchestration": { "allowSelfReportedCompletion": true } }}Verification for manual Level 2 work
Section titled “Verification for manual Level 2 work”Running /catalyst-dev:oneshot standalone (no orchestrator)? You don’t get verification — it’s orchestrator-only. The standalone path runs Phase 4 gates and that’s it. If you want adversarial verification without full orchestration, the workaround is to open the PR, then manually run the code-reviewer agent and silent-failure-hunter agent against it. Or just wrap the oneshot in a single-worker orchestrator — verification will run.