bug-agency-tasks — Workgraph live mirror

Metadata

Status	failed
Assigned	`agent-2581`
Agent identity	`02e879681e52e0a384106169be043416c4d946e850ab26b2269c57681b52a6e7`
Created	2026-05-05T04:50:23.822099139+00:00
Started	2026-05-06T15:24:41.303334762+00:00
Tags	`bug,agency,executor,pinning`, `eval-scheduled`
Tokens	0 in / 0 out
Failure reason	Agent exited with code 1

Description

Observed

TUI runtime panel for .flip-* (and other .assign-* / .evaluate-*) tasks shows:

Executor: codex
Model:    haiku

This combination is incoherent — codex is the OpenAI CLI handler, haiku is an Anthropic model. The codex CLI cannot run haiku.

Confirmed across at least:

.flip-retry-graph-lock (executor: codex, model: haiku) — recently completed
.flip-review-fix-nex-cursor-corruption (visible in TUI now)

Expected (per CLAUDE.md / project contract)

Agency tasks (.assign-*, .flip-*, .evaluate-*) are supposed to be pinned to claude:haiku running on the claude CLI handler, ignoring project-level provider cascade from coordinator.model and from the active profile. The whole point of the pinning is to keep agency cheap and immune to 'openrouter/codex configured but no key / wrong model family' silent failures.

So expected runtime is:

Executor: claude
Model:    haiku

Actual root-cause hypothesis (not investigated)

The pinning logic likely sets model = haiku correctly but doesn't override the executor/handler when an active profile (wg profile use codex) supplies a non-claude default executor. Either:

The handler-for-model derivation (src/dispatch/handler_for_model.rs per CLAUDE.md) is being short-circuited by an explicit executor inherited from the profile, OR
The agency-task scaffolding sets executor at creation time from the active config without the override, OR
[models.evaluator] / [models.assigner] overrides are missing and the cascade falls back to profile executor instead of the hard-coded claude default

The haiku value is set but the claude executor is not — meaning the override is partial.

What to do

Reproduce: with active profile set to codex (wg profile use codex), publish any task and wg show .flip-<task>. Confirm Executor: codex.
Find where agency-task runtime is resolved. Likely candidates: src/agency/ (task scaffolding), src/dispatch/handler_for_model.rs, the publish path that creates .assign-/.flip-/.evaluate- tasks.
Identify the exact path where executor is decided for these tasks. The model-pinning is working; find why the executor pinning isn't.
Fix so all three agency task families ALWAYS resolve to (executor=claude, model=haiku) regardless of active profile, unless an explicit per-role override ([models.evaluator] etc.) is present in config.
Add a regression test: publish a task with active codex profile, assert agency-task runtime is (claude, haiku).

Out of scope

Worker-task runtime (the retry-graph-lock worker on opus is correct — that's the user's profile choice for actual work)
Adding new override config knobs
TUI display changes — the TUI is correctly showing what's stored; the bug is in what gets stored

Validation

Reproduction confirmed and documented (profile codex → agency task spawned with codex executor)
Root cause identified (file:line in commit message)
Fix forces executor=claude, model=haiku for .assign-*, .flip-*, .evaluate-* tasks unconditionally (modulo explicit per-role overrides)
Existing per-role override path ([models.evaluator] etc.) still works
Regression test: publish task under codex profile, assert agency tasks resolve to claude+haiku
cargo build clean, cargo test clean
cargo install --path . after, so the fix is live in global wg
Manual smoke: publish a task, run wg show .flip-<id> and wg show .assign-<id>, both must show Executor: claude, Model: haiku

## Observed
TUI runtime panel for `.flip-*` (and other `.assign-*` / `.evaluate-*`) tasks shows:
```
Executor: codex
Model:    haiku
```

This combination is incoherent — codex is the OpenAI CLI handler, haiku is an Anthropic model. The codex CLI cannot run haiku.

Confirmed across at least:
- `.flip-retry-graph-lock` (executor: codex, model: haiku) — recently completed
- `.flip-review-fix-nex-cursor-corruption` (visible in TUI now)

## Expected (per CLAUDE.md / project contract)
Agency tasks (`.assign-*`, `.flip-*`, `.evaluate-*`) are supposed to be **pinned to claude:haiku running on the claude CLI handler**, ignoring project-level provider cascade from `coordinator.model` and from the active profile. The whole point of the pinning is to keep agency cheap and immune to 'openrouter/codex configured but no key / wrong model family' silent failures.

So expected runtime is:
```
Executor: claude
Model:    haiku
```

## Actual root-cause hypothesis (not investigated)
The pinning logic likely sets `model = haiku` correctly but doesn't override the executor/handler when an active profile (`wg profile use codex`) supplies a non-claude default executor. Either:
- The handler-for-model derivation (`src/dispatch/handler_for_model.rs` per CLAUDE.md) is being short-circuited by an explicit executor inherited from the profile, OR
- The agency-task scaffolding sets executor at creation time from the active config without the override, OR
- `[models.evaluator]` / `[models.assigner]` overrides are missing and the cascade falls back to profile executor instead of the hard-coded claude default

The `haiku` value is set but the `claude` executor is not — meaning the override is partial.

## What to do
1. Reproduce: with active profile set to codex (`wg profile use codex`), publish any task and `wg show .flip-<task>`. Confirm `Executor: codex`.
2. Find where agency-task runtime is resolved. Likely candidates: `src/agency/` (task scaffolding), `src/dispatch/handler_for_model.rs`, the publish path that creates `.assign-/.flip-/.evaluate-` tasks.
3. Identify the exact path where `executor` is decided for these tasks. The model-pinning is working; find why the executor pinning isn't.
4. Fix so all three agency task families ALWAYS resolve to `(executor=claude, model=haiku)` regardless of active profile, unless an explicit per-role override (`[models.evaluator]` etc.) is present in config.
5. Add a regression test: publish a task with active codex profile, assert agency-task runtime is `(claude, haiku)`.

## Out of scope
- Worker-task runtime (the `retry-graph-lock` worker on opus is correct — that's the user's profile choice for actual work)
- Adding new override config knobs
- TUI display changes — the TUI is correctly showing what's stored; the bug is in what gets stored

## Validation
- [ ] Reproduction confirmed and documented (profile codex → agency task spawned with codex executor)
- [ ] Root cause identified (file:line in commit message)
- [ ] Fix forces `executor=claude, model=haiku` for `.assign-*`, `.flip-*`, `.evaluate-*` tasks unconditionally (modulo explicit per-role overrides)
- [ ] Existing per-role override path (`[models.evaluator]` etc.) still works
- [ ] Regression test: publish task under codex profile, assert agency tasks resolve to claude+haiku
- [ ] `cargo build` clean, `cargo test` clean
- [ ] `cargo install --path .` after, so the fix is live in global `wg`
- [ ] Manual smoke: publish a task, run `wg show .flip-<id>` and `wg show .assign-<id>`, both must show `Executor: claude, Model: haiku`

Depends on

done .assign-bug-agency-tasks

Required by

open .flip-bug-agency-tasks

Log

2026-05-05T04:50:23.793703988+00:00 Task paused
2026-05-05T04:50:27.953972627+00:00 Task published
2026-05-05T04:50:52.889835019+00:00 Lightweight assignment: agent=Careful Programmer (02e87968), exec_mode=full, context_scope=task, reason=Careful Programmer is ideal for this bug fix—strong track record (0.81, 640 tasks) on code modification, testing, and exhaustive validation matches the need to trace executor pinning, fix it without breaking per-role overrides, and smoke-test thoroughly.
2026-05-05T04:50:56.805819608+00:00 Spawned by coordinator --executor claude --model opus
2026-05-05T04:51:09.183861305+00:00 Starting investigation: agency tasks (.assign-*, .flip-*, .evaluate-*) showing executor=codex despite pinning to claude:haiku
2026-05-06T15:24:27.382422640+00:00 Task unclaimed: agent 'agent-2579' (PID 1370228) process exited
2026-05-06T15:24:41.303339872+00:00 Spawned by coordinator --executor claude --model opus
2026-05-06T15:24:46.618392823+00:00 Task marked as failed: Agent exited with code 1