fix-agent-prompting

Fix: agent prompting — chat agents incorrectly invoke wg nex as one-shot, hanging on stdin

Metadata

Status: done
Assigned: agent-2465
Agent identity: 02e879681e52e0a384106169be043416c4d946e850ab26b2269c57681b52a6e7
Model: codex:gpt-5.5
Created: 2026-05-04T21:38:29.819505239+00:00
Started: 2026-05-04T21:39:41.156166881+00:00
Completed: 2026-05-04T22:04:13.227859980+00:00
Tags: fix, docs, agents, prompting, eval-scheduled
Eval score: 0.78
└ blocking impact: 0.80
└ completeness: 0.70
└ constraint fidelity: 0.25
└ coordination overhead: 0.85
└ correctness: 0.85
└ downstream usability: 0.80
└ efficiency: 0.75
└ intent fidelity: 0.84
└ style adherence: 0.85

Description

Chat agents (observed: codex agent inside .chat-0) interpret wg nex as a way to dispatch one-shot LLM requests, invoking it like:

wg nex "Please create a weather forecast for Copenhagen ..."

wg nex is an interactive REPL that needs a TTY. Without one, it waits indefinitely on stdin. The agent's bash subprocess hangs, the chat tab freezes, and the user can't cancel cleanly (a separate concern; see fix-tui-chat-cancel).

User report 2026-05-04: 'It's kind of annoying I just got a freeze in this state and I can't even cancel it. It's like it seems that the agent thought that I had to run WG next for some reason. I don't really understand why that happened.'
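
The hang comes down to stdin not being a terminal. As an illustration only (this is not the fix chosen in this task, and the helper name `run_if_tty` is hypothetical, not part of wg), a minimal shell sketch that refuses to start an interactive command when stdin is not a TTY could look like this:

```shell
# Hypothetical guard: run a command only when stdin is a terminal.
# `run_if_tty` is an illustrative name, not a wg command.
run_if_tty() {
  if [ -t 0 ]; then
    # stdin is a terminal, so an interactive program can read from it
    "$@"
  else
    # no terminal: fail fast instead of blocking on stdin
    echo "refusing to run '$*': stdin is not a terminal" >&2
    return 1
  fi
}
```

With this guard, `run_if_tty wg nex` would start the REPL from a developer's own shell but return a non-zero status immediately wherever stdin is not a terminal, which is the situation the agent's bash subprocess is assumed to be in.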

Root cause

The bundled agent-guide (wg agent-guide) likely doesn't clearly say:

  • wg nex is interactive-only (REPL); chat agents should NEVER invoke it from bash
  • For one-shot LLM calls inside an agent's task, use wg add to file a sub-task, not wg nex
  • The orchestration model is graph-based (tasks filed with wg add), NOT a chain of LLM calls made from shell processes

Fix

Update the bundled agent-guide content (src/text/agent_guide.md or wherever wg agent-guide reads from) to include explicit guidance:

## Don't run wg nex from bash

`wg nex` is an interactive REPL that needs a terminal. As a worker or chat agent
running through wg, you do not have an interactive terminal. Invoking `wg nex`
from bash will hang on stdin and block your task.

If you need to dispatch additional LLM work:
- File a sub-task with `wg add 'description' --after <current-task-id>` — let
  the dispatcher spawn an agent for it
- For evaluation / scoring, use `wg evaluate run <task>` or related agency
  commands that are batch-mode and won't hang

If you need an interactive REPL for development, run `wg nex` from your own
shell, not from inside an agent run.

This goes into the universal contract bundled in the binary. Once the change lands, cargo install has been run, and agents have respawned, both claude and codex chat agents see it via wg agent-guide.

Validation

  • wg agent-guide output contains the 'Don't run wg nex from bash' section
  • Live smoke: spawn a chat agent in a fresh project and ask it to do something that previously triggered a wg nex invocation. Pre-fix: agent runs wg nex and hangs. Post-fix: agent files a sub-task with wg add (or notes that it can't, or asks the user).
  • No regression of any other agent-guide content
  • cargo build + cargo test pass
  • cargo install --path . was run before claiming done
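
The first validation item can be checked mechanically. The sketch below assumes that `wg agent-guide` prints the guide to stdout and that the section title appears verbatim, without markdown backticks around `wg nex`; `has_nex_section` is a hypothetical helper name, not a wg command:

```shell
# Reads guide text on stdin and succeeds only if the section title is present.
# `has_nex_section` is an illustrative helper, not part of wg.
has_nex_section() {
  # -q: no output, exit status only; -F: match the title as a literal string
  grep -qF "Don't run wg nex from bash"
}
```

A possible invocation after `cargo install --path .` would be `wg agent-guide | has_nex_section && echo "section present"`. If the bundled heading wraps `wg nex` in backticks, the literal pattern would need adjusting.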

Coordinate

  • fix-agents-md (shipped) — established lock-step CLAUDE.md / AGENTS.md
  • architectural-remove-wg (in flight) — removes wg_* MCP tools, simplifies the path agents take
  • This task adds the SPECIFIC guidance about wg nex that agents are getting wrong

Depends on

Required by

Log