fix-agency-tasks

Fix: agency tasks ignore explicit [models.assigner]/[models.evaluator]/[models.flip] overrides

Metadata

Statusdone
Assignedagent-1109
Agent identityf51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e
Created2026-04-29T14:37:10.497590740+00:00
Started2026-04-29T14:38:07.217484840+00:00
Completed2026-04-29T15:00:40.717812643+00:00
Tagspriority-high,bug,agency,config, eval-scheduled
Eval score0.82
└ blocking impact0.85
└ completeness0.80
└ constraint fidelity0.70
└ coordination overhead0.85
└ correctness0.85
└ downstream usability0.85
└ efficiency0.80
└ intent fidelity0.87
└ style adherence0.90

Description

Description

Per CLAUDE.md contract:

.evaluate-, .flip-, and .assign-* tasks ... are pinned to claude:haiku ... Power users who want a non-Anthropic provider for these roles can override per-role via [models.evaluator] / [models.assigner] etc. in config; explicit overrides win, cascade does not.

Observed 2026-04-29: in ~/autohaiku (initialized via wg init --route codex-cli, which writes per-role codex overrides):

[models.assigner]
model = "codex:gpt-5.4-mini"

[models.evaluator]
model = "codex:gpt-5.4-mini"

[models.flip_inference]
model = "codex:gpt-5.4-mini"

[models.flip_comparison]
model = "codex:gpt-5.4-mini"

But wg agents shows the .assign-* task ran on claude:

agent-1   .assign-quality-pass-20260429t000000  claude    3145576    1m  done
agent-2   quality-pass-20260429t000000           codex     3145750    59s  working

The explicit override is NOT being honored. Worker correctly uses codex; agency assigner falls back to the claude pin even when the user explicitly chose codex for that role.

This makes "easy switch to codex" misleading — the route writes the right config but the runtime ignores it.

Repro

  1. mkdir tmp-test && cd tmp-test
  2. wg init --route codex-cli
  3. grep -A1 'models.assigner' .wg/config.toml → confirms codex:gpt-5.4-mini override
  4. wg service start --max-agents 1
  5. wg add 'hello' && wg publish hello
  6. wg agents after .assign-hello spawns → BUG: executor is claude (should be codex)

Investigation

  1. Find where the agency spawn site picks the model. Likely in src/agency/ or src/dispatch/ near the .assign / .evaluate / .flip task spawn paths.
  2. Check the model-resolution order. Per CLAUDE.md it should be:
    • Explicit [models.] override → use that
    • Else → pin to claude:haiku
  3. Verify whether the bug is:
    • The override is read but the pin overrides it (logic bug — pin should be FALLBACK, not preempting)
    • The override is never read at agency spawn time (lookup bug)
    • The override is applied to a different code path than what spawns these tasks (architectural drift)

Validation

  • Failing test written first (TDD): build a project with [models.assigner] = codex:X, spawn an .assign-* task, assert the spawned process's executor=codex
  • Same test for [models.evaluator] and [models.flip_inference] / [models.flip_comparison]
  • Live smoke: in a fresh tmpdir with wg init --route codex-cli, spawn a task, observe .assign-* runs on codex (paste wg agents output as evidence)
  • Fallback still works: a project with NO [models.assigner] override correctly falls back to claude:haiku (don't break the default)
  • CLAUDE.md contract rephrased / clarified if needed — the current wording is right, the implementation just doesn't match
  • cargo build + cargo test pass with no regressions
  • Permanent smoke scenario added that exercises the override path; this task id in owners
  • cargo install --path . was run before claiming done

Process note

This is a 'config UX promise vs runtime behavior' divergence. The codex-cli route writes the override, gives the user the impression of an all-codex setup, but the runtime silently falls back to claude on agency tasks. Surface this asymmetry in the task log if not fully resolvable: at minimum the user should get a clear warning at wg service start if [models.] overrides exist but won't be honored.

Depends on

Required by

Log