codex-handler-doesn — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-911`
Agent identity	`f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e`
Model	`claude:opus`
Created	2026-04-28T20:05:48.752478128+00:00
Started	2026-04-28T20:06:06.585870720+00:00
Completed	2026-04-28T20:35:46.261187319+00:00
Tags	`eval-scheduled`
Eval score	0.86
└ blocking impact	0.85
└ completeness	0.85
└ constraint fidelity	0.85
└ coordination overhead	0.90
└ correctness	0.85
└ downstream usability	0.85
└ efficiency	0.90
└ intent fidelity	0.91
└ style adherence	0.95

## Description

Every wg task dispatched with `--model codex:*` fails immediately. Reproduced today on `create-agents-md` (agent-908): codex CLI was spawned, sat reading stdin, got nothing, exited 1.

Verbatim from `/home/erik/workgraph/.wg/agents/agent-908/output.log`:

```
Reading prompt from stdin...
No prompt provided via stdin.
[wrapper] Agent exited with code 1, marking task failed
```

The prompt file IS being written by the dispatcher (see `/home/erik/workgraph/.wg/agents/agent-908/prompt.txt` and confirm it has content). The bug is between `prompt.txt` and the codex subprocess: either the handler launches codex without redirecting `prompt.txt` onto its stdin, or codex needs a different invocation shape (`-p <file>`, `--prompt-file`, or the prompt passed as argv) that the handler isn't using.

This is the FIRST live test of the codex handler — it appears the handler was wired in but never exercised against a real task. Whatever the gap is, it's been latent.

### Suggested investigation order
1. Look at `src/dispatch/handler_for_model.rs` and whichever module implements the codex handler (likely `src/dispatch/handlers/codex.rs` or similar — grep for codex CLI invocation).
2. Compare to the claude handler (which works): how does it pass the prompt to claude CLI? stdin redirect? arg? Mirror that pattern.
3. Check codex CLI's actual invocation contract — `codex --help` on the user's installed binary will show whether it expects stdin, `-p`, or argv.
4. Wire the prompt through whichever channel codex actually wants.
5. Live smoke against a fresh tiny task on `codex:gpt-5.5` — agent must read the prompt, do the work, exit 0.
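
If step 3 shows that codex wants the prompt on stdin (the `Reading prompt from stdin...` line in `output.log` suggests it does), the redirect itself is small. The following is a minimal sketch using only the Rust standard library, not the handler's actual code; the function name `run_with_prompt_on_stdin` is hypothetical, and the real handler may build a shell script rather than spawn the CLI directly.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::process::{Command, Output, Stdio};

/// Hypothetical helper: runs `program` with `args`, connects the file at
/// `prompt_path` to the child's stdin, and returns the captured output.
fn run_with_prompt_on_stdin(
    program: &str,
    args: &[&str],
    prompt_path: &Path,
) -> io::Result<Output> {
    // Opening the file first means a missing prompt fails here,
    // before any subprocess is spawned.
    let prompt = File::open(prompt_path)?;
    Command::new(program)
        .args(args)
        // Without this line the child gets no prompt on stdin.
        .stdin(Stdio::from(prompt))
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .output()
}
```

On a Unix machine, calling this with `cat` in place of the codex binary echoes the prompt file back on stdout, which is a cheap way to check the plumbing before involving the real CLI.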

### Bonus (probably trivial once the main fix lands)
After fixing, run `wg retry create-agents-md` to unblock the existing smoke pipeline (`smoke-test-confirm` is queued behind it and will run automatically once `create-agents-md` is done).

### Out of scope
- Fixing AGENTS.md content / formatting (that's the smoke task itself, will run after this lands)
- Adding new codex-only features (model resolution, env vars, etc) beyond what's needed for the prompt to reach the agent
- Codex tier mapping / profile switching (separate design discussion in flight)

## Validation

- [ ] Reproduce: dispatch a trivial task with `--model codex:gpt-5.5` against current main, confirm `No prompt provided via stdin` in `output.log`
- [ ] Failing test first: a unit/integration test that exercises the codex handler's prompt-passing path; asserts the prompt content reaches the subprocess (mock the subprocess, assert what stdin/argv it was given)
- [ ] Implementation makes the test pass
- [ ] Live smoke: `wg add 'echo hello in a comment' --model codex:gpt-5.5` (or similar trivial task), agent runs to completion, exits 0, prompt content was visible to the agent
- [ ] `wg retry create-agents-md` succeeds end-to-end, AGENTS.md is created, smoke-test-confirm runs and reports the AGENTS.md content correctly
- [ ] `cargo build` + `cargo test` pass with no regressions
- [ ] No regression on claude handler — `wg add ... --model claude:opus` still works
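
The "failing test first" item could take roughly the following shape: put the process layer behind a trait, record what the handler asks for, and assert on it. This is a sketch under stated assumptions. `SpawnRequest`, `Spawner`, `RecordingSpawner`, and `launch_codex` are all hypothetical names, not types from the workgraph codebase, and the sketch assumes the prompt travels over stdin to `codex exec`.

```rust
/// Hypothetical record of what a handler asked the process layer to run.
#[derive(Debug, Clone, PartialEq)]
struct SpawnRequest {
    program: String,
    args: Vec<String>,
    /// Bytes to feed on stdin, or `None` if nothing is piped.
    stdin: Option<String>,
}

/// Hypothetical seam between a handler and real process spawning.
trait Spawner {
    fn spawn(&mut self, request: SpawnRequest);
}

/// Test double: records requests instead of launching anything.
#[derive(Default)]
struct RecordingSpawner {
    requests: Vec<SpawnRequest>,
}

impl Spawner for RecordingSpawner {
    fn spawn(&mut self, request: SpawnRequest) {
        self.requests.push(request);
    }
}

/// Hypothetical handler entry point. The buggy behaviour corresponds to
/// `stdin: None`; the fixed behaviour passes the prompt through.
fn launch_codex(spawner: &mut dyn Spawner, prompt: &str) {
    spawner.spawn(SpawnRequest {
        program: "codex".to_string(),
        args: vec!["exec".to_string()],
        stdin: Some(prompt.to_string()),
    });
}
```

A test then builds a `RecordingSpawner`, calls `launch_codex`, and asserts that the single recorded request carries the prompt text in `stdin`. With `stdin: None` in the handler the assertion fails, which is the red state this checklist item asks for.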

Depends on

done .assign-codex-handler-doesn

Required by

(none)

Log

2026-04-28T20:05:48.744998421+00:00 Task paused
2026-04-28T20:05:51.529807402+00:00 Task published
2026-04-28T20:06:05.611181417+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer is ideal: bug fix requiring rigorous testing, live smoke validation, and zero regressions—exactly what the Careful tradeoff prioritizes.
2026-04-28T20:06:06.585874217+00:00 Spawned by coordinator --executor claude --model opus
2026-04-28T20:06:16.107287981+00:00 Starting investigation: codex handler not piping prompt to stdin
2026-04-28T20:07:48.405015617+00:00 Found root cause: src/commands/spawn/execution.rs:407 build_prompt() condition omits 'codex', so prompt_template stays None and codex case falls through to the no-pipe branch. Fix is one-line addition.
2026-04-28T20:33:42.822734585+00:00 Fix complete: added 'codex' to executor_uses_auto_prompt() gate. Live smoke confirms run.sh now contains 'cat prompt.txt | codex exec ...' and prompt reaches the codex CLI (codex emits thread.started + turn.started before failing on a separate model-availability issue, NOT the original 'No prompt provided via stdin').
2026-04-28T20:33:42.978261792+00:00 Tests added: 3 unit tests (test_executor_uses_auto_prompt_*) + 1 smoke scenario (codex_handler_pipes_prompt_to_stdin) — verified scenario fails when fix is reverted, passes when restored.
2026-04-28T20:34:05.738661266+00:00 Committed: 0d2f51cb0 — pushed to remote
2026-04-28T20:35:32.451746751+00:00 Bonus retry deferred: the running dispatcher (PID 205653, started 36min before fix) has the OLD spawn code in-process. New binary is installed; any dispatcher restart picks up the fix. Codex CLI itself confirmed working with gpt-5.5 (separate live test: codex emits agent_message + turn.completed). A user-driven service restart unblocks create-agents-md.
2026-04-28T20:35:32.558246380+00:00 Validated: cargo build clean, 3 unit tests pass, 1 smoke scenario passes (and verified-fails when fix is reverted), no regression to existing dispatcher_codex_wins_over_agency scenario.
2026-04-28T20:35:46.261190916+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
2026-04-28T20:38:29.594361099+00:00 PendingEval → Done (evaluator passed; downstream unblocks)
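
The root cause recorded in the log, a gate that did not list `codex`, can be illustrated as follows. This is a reconstruction for illustration only, not the code from commit 0d2f51cb0. The log confirms the function name `executor_uses_auto_prompt` and the resulting `cat prompt.txt | codex exec ...` line in `run.sh`; the function body, the `claude` entry, and the helper `run_script_line` are assumptions.

```rust
/// Assumed shape of the gate named in the log. Only the addition of
/// "codex" is confirmed; listing "claude" here is an assumption.
fn executor_uses_auto_prompt(executor: &str) -> bool {
    matches!(executor, "claude" | "codex")
}

/// Hypothetical helper: builds the command line written to run.sh.
/// Executors that pass the gate get prompt.txt piped onto stdin;
/// all others are launched with no pipe.
fn run_script_line(executor: &str, cli_invocation: &str) -> String {
    if executor_uses_auto_prompt(executor) {
        format!("cat prompt.txt | {}", cli_invocation)
    } else {
        cli_invocation.to_string()
    }
}
```

With `codex` missing from the gate, `run_script_line("codex", "codex exec")` would return the bare invocation, and the CLI would wait on an empty stdin exactly as in the original `output.log`.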