wg-nex-chat

wg nex --chat: fix multi-message break; model after claude-handler.rs structure

Metadata

Status: abandoned
Assigned: agent-159
Agent identity: f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e
Created: 2026-04-26T23:00:44.728798063+00:00
Started: 2026-04-26T23:02:44.605706231+00:00
Tags: eval-scheduled

Description

User insight: wg nex works fine as a TASK agent (spawns, processes, exits). What's broken is the chat-handler integration: wg nex --chat <ref> faults on the second message in a chat session. User quote: 'we can use nex as a task agent. it's just the damn tui integration which should be like falling off a log if we just rebuild it based on how claude is integrated and codex presumably.'

The chat-handler shim ALREADY exists. claude-handler.rs:1-7 says: 'Peer of wg nex --chat <ref>: where nex IS a native handler that speaks chat/*.jsonl directly, this handler spawns the claude CLI'. So nex's chat-handler is wired in but its multi-message handling has a regression.

This is the OPPOSITE direction from the research-into-impl 'thin-wrapper around an external OAI-compat CLI' approach — that's still valid as a future option, but THIS task is the targeted fix for what already exists.

Diagnose

  1. Reproduce in a scratch dir against lambda01 endpoint:
    rm -rf /tmp/wg-nex-chat && mkdir /tmp/wg-nex-chat && cd /tmp/wg-nex-chat
    wg init -m qwen3-coder -e https://lambda01.tail334fe6.ts.net:30000 -x nex
    wg service start
    wg service create-chat --name test --executor native --model qwen3-coder
    wg tui  # OR programmatic via wg chat send / wg msg send
    # send msg 1: 'hi' — should get response
    # send msg 2: 'hi again' — currently faults
    
  2. Capture the exact stack trace / error from the daemon log + chat session jsonl + handler log when msg 2 faults
  3. Identify where in the nex chat-handler code path the regression is (inbox cursor mismanagement? session state not carried across turns? tool-result accumulation? message-id collision?)

Fix model: claude-handler.rs

claude-handler.rs gets multi-message handling right. Read its approach:

  • inbox cursor: tracks last answered message id; only processes ids > cursor
  • session lock: held for handler lifetime; released cleanly on exit
  • subprocess: claude CLI process is long-lived across turns; not restarted per message
  • restart on crash: supervisor restarts handler; claude process picks up from chat session jsonl

Mirror this structure for the nex chat path. nex doesn't have a separate CLI subprocess (it's in-process), so the equivalent is: keep the nex loop's conversation state alive across inbox polls; don't reinit the LLM client per message.
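
A minimal sketch of that shape, assuming message ids increase monotonically and that the model call can be abstracted as a closure. ChatSession, InboxMessage, and poll are hypothetical names for illustration, not the actual types in src/executor/native/. The point is that the cursor and the conversation history live in one long-lived value that is reused on every inbox poll:

```rust
/// One inbox entry. Assumption: ids increase monotonically within a session.
#[derive(Debug, Clone)]
pub struct InboxMessage {
    pub id: u64,
    pub text: String,
}

/// Hypothetical per-session state that outlives any single turn.
#[derive(Debug, Default)]
pub struct ChatSession {
    /// Id of the last answered message; 0 means nothing answered yet.
    cursor: u64,
    /// Conversation carried across turns as (role, content) pairs.
    history: Vec<(String, String)>,
}

impl ChatSession {
    pub fn new() -> Self {
        Self::default()
    }

    /// Answers every message with id > cursor, in id order, and advances the
    /// cursor after each reply. `respond` stands in for the LLM call and
    /// receives the full history so far.
    pub fn poll<F>(&mut self, inbox: &[InboxMessage], mut respond: F) -> Vec<String>
    where
        F: FnMut(&[(String, String)]) -> String,
    {
        let mut pending: Vec<&InboxMessage> =
            inbox.iter().filter(|m| m.id > self.cursor).collect();
        pending.sort_by_key(|m| m.id);

        let mut replies = Vec::new();
        for msg in pending {
            self.history.push(("user".to_string(), msg.text.clone()));
            let reply = respond(self.history.as_slice());
            self.history.push(("assistant".to_string(), reply.clone()));
            // Advance only after the reply exists, so a crash mid-turn
            // leaves the message unanswered rather than skipped.
            self.cursor = msg.id;
            replies.push(reply);
        }
        replies
    }

    pub fn cursor(&self) -> u64 {
        self.cursor
    }

    pub fn history_len(&self) -> usize {
        self.history.len()
    }
}
```

Calling poll twice on the same ChatSession with a growing inbox answers only the new message on the second call, and the closure sees the earlier turn in the history. Constructing a fresh ChatSession per message would reset both the cursor and the history, which is one of the suspected failure modes listed under Diagnose.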

Files likely to touch

  • src/commands/nex.rs (the wg nex --chat entry point — likely)
  • src/executor/native/agent.rs (nex's loop)
  • src/executor/native/inbox.rs (inbox handling — strong suspect for the bug)
  • Compare side-by-side with src/commands/claude_handler.rs as reference

Hard gate

Before claiming done:

  1. Run the verbatim repro above
  2. Send AT LEAST 5 messages back-to-back; ALL 5 must produce non-fault responses
  3. Capture daemon log + handler log + chat session jsonl as evidence
  4. Add the multi-message scenario to the smoke manifest (per smoke-gate-is) so this regression is locked

Validation

  • Failing test first: test_nex_chat_handler_multi_message — programmatic 5-message roundtrip against a stub OAI endpoint; all 5 succeed
  • Implementation: identify and fix the regression (likely inbox cursor or session state)
  • cargo build + cargo test pass with no regressions
  • HARD GATE: live repro against lambda01 produces 5 successful responses; evidence attached
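
One possible shape for the failing-test-first item, sketched under assumptions: StubEndpoint and run_roundtrip are hypothetical stand-ins, not existing wg APIs, and the stub simply echoes the last message instead of speaking the real OAI wire format. The actual test would drive the nex chat handler rather than this local loop:

```rust
/// Hypothetical stand-in for an OAI-compatible endpoint; counts calls.
pub struct StubEndpoint {
    pub calls: usize,
}

impl StubEndpoint {
    /// Echoes the last message, or faults on an empty request.
    pub fn complete(&mut self, messages: &[String]) -> Result<String, String> {
        self.calls += 1;
        match messages.last() {
            Some(last) => Ok(format!("echo: {last}")),
            None => Err("empty request".to_string()),
        }
    }
}

/// Sends `n` messages through one long-lived history and returns all
/// replies, or the first fault encountered.
pub fn run_roundtrip(endpoint: &mut StubEndpoint, n: usize) -> Result<Vec<String>, String> {
    let mut history: Vec<String> = Vec::new();
    let mut replies = Vec::new();
    for i in 1..=n {
        history.push(format!("message {i}"));
        let reply = endpoint.complete(&history)?;
        history.push(reply.clone());
        replies.push(reply);
    }
    Ok(replies)
}

/// Sketch of the multi-message assertion: all 5 turns must succeed.
#[test]
fn multi_message_roundtrip_sketch() {
    let mut endpoint = StubEndpoint { calls: 0 };
    let replies = run_roundtrip(&mut endpoint, 5).expect("no turn may fault");
    assert_eq!(replies.len(), 5);
    assert_eq!(endpoint.calls, 5);
}
```

The assertion that matters is that turn 2 and later return Ok; checking the call count as well guards against a handler that answers the first message and silently drops the rest.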

Out of scope

  • The thin-wrapper-around-codex/aider approach (separate path via research-into-impl + thin-wrapper-impl tasks)
  • TUI dialog fixes for picking nex executor (that's tui-new-coordinator)
  • Making nex feature-parity with claude CLI (auth, prompt cache, etc) — just fix the multi-message break

Depends on

Required by

Log