Metadata
| Status | done |
|---|---|
| Assigned | agent-998 |
| Agent identity | f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e |
| Model | claude:sonnet |
| Created | 2026-04-28T22:38:58.152051934+00:00 |
| Started | 2026-04-28T22:51:35.642610665+00:00 |
| Completed | 2026-04-28T23:06:32.577282998+00:00 |
| Tags | priority-high,fix,codex,config, eval-scheduled |
| Eval score | 0.83 |
| └ blocking impact | 0.90 |
| └ completeness | 0.75 |
| └ coordination overhead | 0.85 |
| └ correctness | 0.88 |
| └ downstream usability | 0.82 |
| └ efficiency | 0.82 |
| └ intent fidelity | 0.94 |
| └ style adherence | 0.85 |
Description
Description
Implement the recommendations from research-current-openai. Read that task's log via wg show research-current-openai for the verified codex model strings + tier mapping.
Goal: a user running wg init --route codex-cli (or wg init -m codex:<something>) gets a working all-codex setup — including agency meta-tasks (.evaluate-*, .flip-*, .assign-*) running on codex, not silently falling back to claude:haiku.
Scope
- Update the
codex-cliroute in wg init to write:[agent].modeland[dispatcher].modelset to the balanced/sonnet-equivalent codex tier[models.evaluator],[models.flip],[models.assigner]set to the cheap/haiku-equivalent codex tier- Any other per-role overrides that today default to claude:haiku
- If the current
codex:o1-prodefault is stale, replace it with the current top-tier model name from research. - Add a deprecation/migration path: existing configs with the old defaults should warn (via
wg config lint) andwg migrate configshould rewrite them. - Document the change: update the codex-cli section of the relevant docs (probably docs/config-ux-design.md or a setup guide) with the new defaults.
Validation
-
Failing test written first:
wg init --route codex-cli --dry-runoutput contains the new [models.*] sections AND the updated [agent]/[dispatcher].model strings -
Live smoke: in a fresh tmpdir, run
wg init --route codex-clithenwg config show— confirm all four model strings are codex-prefixed (noclaude:anywhere unless explicitly chosen) -
Live smoke: run a tiny task end-to-end on this fresh project and confirm the spawned agent uses codex (check
wg agentsfor executor=codex, OR check WG_EXECUTOR_TYPE in the spawned process env) -
Live smoke: confirm the agency pipeline (
.evaluate-*/.flip-*) also runs on codex on this project — not claude:haiku. Inspect the spawned meta-agent's executor. -
wg config lintflags old codex configs andwg migrate configrewrites them - cargo build + cargo test pass with no regressions
- Permanent smoke scenario added under tests/smoke/scenarios/ with this task id in owners
- cargo install --path . was run before claiming done
Depends on
Required by
- (none)
Log
- 2026-04-28T22:38:58.112579242+00:00 Task paused
- 2026-04-28T22:39:16.280411716+00:00 Task published
- 2026-04-28T22:39:40.907093950+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer role matches implementation requirements; Careful tradeoff suits correctness-critical model routing and comprehensive validation.
- 2026-04-28T22:51:35.642615604+00:00 Spawned by coordinator --executor claude --model sonnet
- 2026-04-28T22:51:44.971898353+00:00 Starting implementation: fix codex-cli route defaults + agency model overrides
- 2026-04-28T22:54:40.714044079+00:00 Baseline tests pass. Updating tests to use new model strings (TDD step 1: failing tests first)
- 2026-04-28T22:57:52.675645266+00:00 All tests pass (1 pre-existing failure unrelated to changes). Installing binary.
- 2026-04-28T23:06:04.105398311+00:00 Committed 3ddf04961 and pushed to origin. Running wg done.
- 2026-04-28T23:06:22.269241119+00:00 Validation complete: - cargo build + cargo test pass (1 pre-existing unrelated failure: config::tests::test_global_config_path) - New tests: test_route_codex_cli_role_split, test_route_codex_cli_complete_config, test_codex_profile_is_static, fixes_stale_codex_o1_pro_to_gpt54, fixes_stale_codex_gpt5_codex_to_gpt54 - all pass - Live smoke: wg init --route codex-cli --dry-run shows gpt-5.4/gpt-5.4-mini/gpt-5.5, no o1-pro, no claude: - Live smoke: wg config --show after init shows all codex models - Smoke scenario passes: codex_init_route_has_correct_defaults - cargo install --path . completed - Committed 3ddf04961, pushed to origin
- 2026-04-28T23:06:32.577286865+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
- 2026-04-28T23:09:41.472456708+00:00 PendingEval → Done (evaluator passed; downstream unblocks)