Metadata
| Status | done |
|---|---|
| Assigned | agent-911 |
| Agent identity | f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e |
| Model | claude:opus |
| Created | 2026-04-28T20:05:48.752478128+00:00 |
| Started | 2026-04-28T20:06:06.585870720+00:00 |
| Completed | 2026-04-28T20:35:46.261187319+00:00 |
| Tags | eval-scheduled |
| Eval score | 0.86 |
| └ blocking impact | 0.85 |
| └ completeness | 0.85 |
| └ constraint fidelity | 0.85 |
| └ coordination overhead | 0.90 |
| └ correctness | 0.85 |
| └ downstream usability | 0.85 |
| └ efficiency | 0.90 |
| └ intent fidelity | 0.91 |
| └ style adherence | 0.95 |
Description
Every wg task dispatched with --model codex:* fails immediately. Reproduced today on create-agents-md (agent-908): codex CLI was spawned, sat reading stdin, got nothing, exited 1.
Verbatim from /home/erik/workgraph/.wg/agents/agent-908/output.log:
Reading prompt from stdin...
No prompt provided via stdin.
[wrapper] Agent exited with code 1, marking task failed
The prompt file IS being written by the dispatcher (see /home/erik/workgraph/.wg/agents/agent-908/prompt.txt — confirm it has content). The bug is between prompt.txt and the codex subprocess: the handler is launching codex without redirecting prompt.txt onto its stdin, OR codex needs a different invocation shape (-p <file>, --prompt-file, or pasted as argv) that the handler isn't using.
This is the FIRST live test of the codex handler — it appears the handler was wired in but never exercised against a real task. Whatever the gap is, it's been latent.
Suggested investigation order
- Look at `src/dispatch/handler_for_model.rs` and whichever module implements the codex handler (likely `src/dispatch/handlers/codex.rs` or similar; grep for the codex CLI invocation).
- Compare to the claude handler (which works): how does it pass the prompt to the claude CLI? stdin redirect? arg? Mirror that pattern.
- Check codex CLI's actual invocation contract: `codex --help` on the user's installed binary will show whether it expects stdin, `-p`, or argv.
- Wire the prompt through whichever channel codex actually wants.
- Live smoke against a fresh tiny task on `codex:gpt-5.5`: agent must read the prompt, do the work, exit 0.
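If the answer to the third step turns out to be "stdin", the wiring could look roughly like the sketch below. This is not the wg handler code: `run_with_prompt_on_stdin` is a hypothetical helper, and it uses only `std::process` so that any program that reads stdin (for example `cat`) can stand in for the codex CLI while testing.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::process::{Command, Output, Stdio};

/// Hypothetical helper (not part of wg): runs `program` with `args` and
/// connects the file at `prompt_path` to the child's stdin.
fn run_with_prompt_on_stdin(
    program: &str,
    args: &[&str],
    prompt_path: &Path,
) -> io::Result<Output> {
    // Opening the file first means a missing prompt.txt fails loudly here
    // instead of the child starting with an empty stdin.
    let prompt = File::open(prompt_path)?;

    Command::new(program)
        .args(args)
        // Equivalent to `program args < prompt.txt` in a shell.
        .stdin(Stdio::from(prompt))
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        // Waits for the child and collects its exit status and output.
        .output()
}
```

Calling it with `"cat"` and an empty argument list should echo the prompt file back on stdout, which gives a cheap way to check the plumbing before pointing it at the real binary.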
Bonus (probably trivial once the main fix lands)
After fixing, run wg retry create-agents-md to unblock the existing smoke pipeline (smoke-test-confirm is queued behind it and will run automatically once create-agents-md goes done).
Out of scope
- Fixing AGENTS.md content / formatting (that's the smoke task itself, will run after this lands)
- Adding new codex-only features (model resolution, env vars, etc) beyond what's needed for the prompt to reach the agent
- Codex tier mapping / profile switching (separate design discussion in flight)
Validation
- Reproduce: dispatch a trivial task with `--model codex:gpt-5.5` against current main, confirm 'No prompt provided via stdin' in output.log
- Failing test first: a unit/integration test that exercises the codex handler's prompt-passing path; asserts the prompt content reaches the subprocess (mock the subprocess, assert what stdin/argv it was given)
- Implementation makes the test pass
- Live smoke: `wg add 'echo hello in a comment' --model codex:gpt-5.5` (or similar trivial task), agent runs to completion, exits 0, prompt content was visible to the agent
- `wg retry create-agents-md` succeeds end-to-end, AGENTS.md is created, smoke-test-confirm runs and reports the AGENTS.md content correctly
- cargo build + cargo test pass with no regressions
- No regression on claude handler: `wg add ... --model claude:opus` still works
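One possible shape for the "mock the subprocess" check in the second item is sketched here. Everything in it is hypothetical (`SpawnRequest`, `Spawner`, `RecordingSpawner`, `dispatch_codex`); the real handler may not have a seam like this, in which case the same assertion would need to be made against whatever the handler actually produces.

```rust
/// Hypothetical description of what a handler asks the process layer to run.
#[derive(Debug, Clone, PartialEq)]
struct SpawnRequest {
    program: String,
    args: Vec<String>,
    /// Bytes the child should receive on stdin, if any.
    stdin: Option<String>,
}

/// Hypothetical seam between the handler and the real process spawner.
trait Spawner {
    /// Returns the child's exit code.
    fn spawn(&mut self, req: SpawnRequest) -> i32;
}

/// Test double: records every request instead of starting a process.
#[derive(Default)]
struct RecordingSpawner {
    seen: Vec<SpawnRequest>,
}

impl Spawner for RecordingSpawner {
    fn spawn(&mut self, req: SpawnRequest) -> i32 {
        self.seen.push(req);
        0 // pretend the child exited cleanly
    }
}

/// Hypothetical codex handler: the behaviour under test is that the prompt
/// is attached to stdin rather than dropped.
fn dispatch_codex(spawner: &mut dyn Spawner, prompt: &str) -> i32 {
    spawner.spawn(SpawnRequest {
        program: "codex".to_string(),
        args: vec!["exec".to_string()],
        stdin: Some(prompt.to_string()),
    })
}
```

A test written against this seam would fail before the fix if the handler built a request with `stdin: None`, which is the "failing test first" property the list asks for.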
Depends on
Required by
- (none)
Log
- 2026-04-28T20:05:48.744998421+00:00 Task paused
- 2026-04-28T20:05:51.529807402+00:00 Task published
- 2026-04-28T20:06:05.611181417+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer is ideal: bug fix requiring rigorous testing, live smoke validation, and zero regressions—exactly what the Careful tradeoff prioritizes.
- 2026-04-28T20:06:06.585874217+00:00 Spawned by coordinator --executor claude --model opus
- 2026-04-28T20:06:16.107287981+00:00 Starting investigation: codex handler not piping prompt to stdin
- 2026-04-28T20:07:48.405015617+00:00 Found root cause: src/commands/spawn/execution.rs:407 build_prompt() condition omits 'codex', so prompt_template stays None and codex case falls through to the no-pipe branch. Fix is one-line addition.
- 2026-04-28T20:33:42.822734585+00:00 Fix complete: added 'codex' to executor_uses_auto_prompt() gate. Live smoke confirms run.sh now contains 'cat prompt.txt | codex exec ...' and prompt reaches the codex CLI (codex emits thread.started + turn.started before failing on a separate model-availability issue, NOT the original 'No prompt provided via stdin').
- 2026-04-28T20:33:42.978261792+00:00 Tests added: 3 unit tests (test_executor_uses_auto_prompt_*) + 1 smoke scenario (codex_handler_pipes_prompt_to_stdin) — verified scenario fails when fix is reverted, passes when restored.
- 2026-04-28T20:34:05.738661266+00:00 Committed: 0d2f51cb0 — pushed to remote
- 2026-04-28T20:35:32.451746751+00:00 Bonus retry deferred: the running dispatcher (PID 205653, started 36min before fix) has the OLD spawn code in-process. New binary is installed; any dispatcher restart picks up the fix. Codex CLI itself confirmed working with gpt-5.5 (separate live test: codex emits agent_message + turn.completed). A user-driven service restart unblocks create-agents-md.
- 2026-04-28T20:35:32.558246380+00:00 Validated: cargo build clean, 3 unit tests pass, 1 smoke scenario passes (and verified-fails when fix is reverted), no regression to existing dispatcher_codex_wins_over_agency scenario.
- 2026-04-28T20:35:46.261190916+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
- 2026-04-28T20:38:29.594361099+00:00 PendingEval → Done (evaluator passed; downstream unblocks)
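For readers who have not opened commit 0d2f51cb0, the root cause and fix recorded in the log can be pictured with the sketch below. Only the function name `executor_uses_auto_prompt` and the resulting `cat prompt.txt | codex exec ...` line come from the log; the signature, the body, the assumption that "claude" was the entry already present, and the `run_line` helper are illustrative guesses, not the code in `src/commands/spawn/execution.rs`.

```rust
/// Assumed shape of the gate named in the log. Before the fix the match is
/// presumed to have lacked "codex", so the codex case fell through to the
/// no-pipe branch.
fn executor_uses_auto_prompt(executor: &str) -> bool {
    // "claude" is an assumption about the pre-existing list;
    // "codex" is the one-line addition described in the log.
    matches!(executor, "claude" | "codex")
}

/// Hypothetical helper showing the effect on the generated run.sh line.
fn run_line(executor: &str, cli_invocation: &str) -> String {
    if executor_uses_auto_prompt(executor) {
        // Prompt file is piped onto the CLI's stdin.
        format!("cat prompt.txt | {}", cli_invocation)
    } else {
        // No-pipe branch: the CLI starts with nothing on stdin.
        cli_invocation.to_string()
    }
}
```

With the gate returning false for codex, `run_line("codex", "codex exec")` would yield a bare `codex exec`, which matches the original symptom of the CLI reporting that no prompt was provided via stdin.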