design-chat-agent

Design: chat agent persistence — tmux wrapper vs targeted codex exit fix vs custom detach

Metadata

Status: done
Assigned: agent-1133
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Created: 2026-04-29T17:00:10.811618835+00:00
Started: 2026-04-29T17:01:14.390620700+00:00
Completed: 2026-04-29T17:11:21.001557824+00:00
Tags: priority-high, design, tui, pty, architecture, eval-scheduled
Eval score: 0.84
└ blocking impact: 0.88
└ completeness: 0.85
└ constraint fidelity: 0.70
└ coordination overhead: 0.87
└ correctness: 0.87
└ downstream usability: 0.87
└ efficiency: 0.82
└ intent fidelity: 0.95
└ style adherence: 0.83

Description

User wants tmux-like behavior for chat agents: if the wg TUI exits, the chat agent process should survive (and ideally be reattachable from a fresh TUI session).

User quote: 'we don't get tmux-like behavior with the pty. say, when our host exits. could... we just jam tmux in there lol? to get auto-resume? well, whatever happens with claude seems to work but if i exit while it's working, codex breaks and i assume claude too but i haven't tested.'

Three-part design question:

Part A: investigate the claude-vs-codex asymmetry

  • WHY does claude seem to survive TUI exit while codex breaks?
  • Hypotheses:
    • claude is mostly line-streaming so an abrupt PTY close affects only the in-flight chunk; codex maintains alt-screen + tool-state and corrupts on signal
    • claude handles SIGHUP / SIGTERM more gracefully because it's a long-running text stream
    • claude's chat handler in wg has different signal-forwarding behavior than codex's
    • Maybe there's already a partial detach/respawn for claude that codex doesn't get
  • Reproduce both and compare: wg tui → spawn each chat → kill the TUI mid-conversation → observe what happens to the child process and any in-flight state
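The repro step can be approximated outside wg with a small harness. The sketch below is an assumption-laden stand-in, not wg's actual spawn path: it puts a child on a fresh PTY, closes the master side (roughly what happens when the TUI host exits), and reports whether the child survives. The names observe_pty_hangup, DEFAULT_CHILD and IGNORING_CHILD are hypothetical, and the two Python one-liners only stand in for the real claude and codex command lines, which are not specified here.

```python
import os
import pty
import signal
import sys
import time

# Hypothetical stand-ins for the real agent commands (not claude/codex themselves).
DEFAULT_CHILD = [sys.executable, "-c",
                 "import time; print('ready', flush=True); time.sleep(30)"]
IGNORING_CHILD = [sys.executable, "-c",
                  "import signal, time; signal.signal(signal.SIGHUP, signal.SIG_IGN); "
                  "print('ready', flush=True); time.sleep(30)"]


def observe_pty_hangup(argv, ready_marker=b"ready", grace=1.0):
    """Run argv on a new PTY, close the master, and report the child's fate."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: pty.fork() made us a session leader with the slave as our tty.
        try:
            # Start from the default SIGHUP disposition even if the harness
            # itself was launched with SIGHUP ignored (e.g. under nohup).
            signal.signal(signal.SIGHUP, signal.SIG_DFL)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)

    # Wait for the child to announce readiness so its handlers are installed.
    seen = b""
    while ready_marker not in seen:
        try:
            chunk = os.read(master_fd, 1024)
        except OSError:  # child went away before printing the marker
            break
        if not chunk:
            break
        seen += chunk

    os.close(master_fd)  # simulate the PTY host (the TUI) disappearing

    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done:
            if os.WIFSIGNALED(status):
                return "killed by " + signal.Signals(os.WTERMSIG(status)).name
            return "exited %d" % os.WEXITSTATUS(status)
        time.sleep(0.05)

    # Still alive after the grace period: clean up and report survival.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return "survived"
```

Running the harness against each real agent binary, with a ready marker that matches its startup output, would give a first comparable data point for the hypotheses above. It does not capture in-flight state such as alt-screen or tool-call progress, which still needs manual observation.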

Part B: pick a persistence strategy

Four options to evaluate:

  • A. Wrap every chat agent in tmux. Spawn tmux new-session -d -s wg-chat-<ref> and have wg's PTY pane attach to it. Pros: battle-tested, free reattach via tmux attach -t wg-chat-<ref>, free scrollback as tmux's own buffer (could help other PTY issues), survives TUI restart trivially. Cons: hard dep on tmux being installed, extra control-mode complexity, our PTY emulator now wraps tmux's PTY (two layers).
  • B. Lighter detach utility (dtach / abduco). Smaller dep than tmux, just gives detached PTY semantics, no multiplexing or scrollback features. Pros: minimal. Cons: less common, smaller user base, fewer free features.
  • C. Custom detached-process supervision. wg spawns chat agents with setsid + a control socket; TUI connects/disconnects via socket. Pros: native, no external dep. Cons: most work, reinvents what tmux already does well.
  • D. Targeted fix only for codex SIGHUP/exit handling. If Part A's investigation shows the issue is a specific signal-handling bug rather than a general persistence gap, just fix that bug — don't add a persistence layer.
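For option A, the spawn and reattach commands can be sketched as plain argv builders. This is a minimal illustration under assumptions: the helper names are hypothetical, the chat ref is assumed to be a short string, and nothing here reflects how wg actually spawns chat agents. The tmux subcommands used (new-session -d -s, attach-session -t, has-session -t) are standard tmux commands.

```python
import re
import shlex
import shutil


def session_name(chat_ref):
    """Map a chat ref to a tmux session name of the form wg-chat-<ref>."""
    # ':' and '.' are separators in tmux target syntax, so replace anything
    # outside a conservative character set (an assumption about ref format).
    return "wg-chat-" + re.sub(r"[^A-Za-z0-9_-]", "_", chat_ref)


def tmux_available():
    """Option A makes tmux a hard dependency, so check for it up front."""
    return shutil.which("tmux") is not None


def tmux_spawn_argv(chat_ref, agent_argv):
    """argv that starts the agent in a detached tmux session."""
    # The agent command is passed as one shell-quoted string.
    return ["tmux", "new-session", "-d", "-s", session_name(chat_ref),
            shlex.join(agent_argv)]


def tmux_has_session_argv(chat_ref):
    """argv whose exit status tells whether the session still exists."""
    return ["tmux", "has-session", "-t", session_name(chat_ref)]


def tmux_attach_argv(chat_ref):
    """argv to run inside wg's PTY pane in order to (re)attach."""
    return ["tmux", "attach-session", "-t", session_name(chat_ref)]
```

On TUI start, a plausible flow is: run the has-session argv, spawn the session if it is missing, then run the attach argv inside the PTY pane. If tmux_available() returns False, falling back to the current direct-PTY spawn keeps tmux an optional dependency, at the cost of losing persistence on those hosts.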

Part C: what 'auto-resume' means

  • Just survive (process keeps running, output buffered until TUI reconnects)?
  • Or actively reattach (TUI on restart re-establishes the visible PTY)?
  • Or auto-fork with a control socket so multiple TUIs can attach to the same chat?

The simpler the requirement, the smaller the scope. Recommend: 'process survives TUI exit + can be reattached on next wg tui' — not full multi-attach.
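The "just survive" end of that spectrum can be illustrated with a detached spawn whose output is buffered in a file until a reader comes back. This is only a sketch of the semantics, with hypothetical helper names and a log file standing in for whatever buffer wg would use. It deliberately omits the PTY, which interactive agents need and which a real option C would have to hold open in a supervisor process.

```python
import subprocess


def spawn_detached(argv, log_path):
    """Start argv in its own session with output appended to log_path."""
    log = open(log_path, "ab")
    try:
        proc = subprocess.Popen(
            argv,
            stdin=subprocess.DEVNULL,
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # child calls setsid(), leaving our session
        )
    finally:
        log.close()  # the child keeps its own copy of the descriptor
    return proc


def read_since(log_path, offset):
    """Return (new_bytes, new_offset) for output written after offset."""
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)
```

A reconnecting TUI would persist the pid and the last-read offset, then call read_since to catch up. Because the child is in its own session, a hangup on the TUI's terminal is not delivered to it, which is the survival property the recommendation asks for.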

Deliverable

Design doc posted via wg log with:

  1. Part A findings: empirically measured claude-vs-codex behavior on TUI exit
  2. Recommended option (A/B/C/D) with rationale
  3. Concrete implementation plan (file paths, spawn-site changes, IPC protocol if needed)
  4. Smoke scenario list for the implementation task

Validation

  • Part A repro performed and documented (kill TUI mid-chat for both claude and codex; describe observed behavior including process tree, file descriptors, in-flight state)
  • Recommendation chosen with rationale
  • Implementation plan concrete enough that a follow-up task can execute without re-investigation
  • Smoke scenario list specified
  • No source modifications — design only
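To document the process tree and file descriptors during the Part A repro, a small snapshot helper can be run against the agent's pid before and after the TUI is killed. The sketch below assumes Linux with /proc mounted. The function name and the returned dictionary keys are hypothetical.

```python
import os


def process_snapshot(pid):
    """Collect state, parent, process group, session, tty and open fds for pid."""
    with open("/proc/%d/stat" % pid) as f:
        stat = f.read()
    # The command name is parenthesised and may contain spaces,
    # so split only what follows the last ')'.
    fields = stat[stat.rindex(")") + 2:].split()

    fds = {}
    fd_dir = "/proc/%d/fd" % pid
    for name in os.listdir(fd_dir):
        try:
            fds[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            pass  # descriptor was closed between listing and reading

    return {
        "state": fields[0],
        "ppid": int(fields[1]),
        "pgrp": int(fields[2]),
        "sid": int(fields[3]),
        "tty_nr": int(fields[4]),
        "fds": fds,
    }
```

Comparing two snapshots shows, for example, whether the agent was reparented (ppid changed), whether it is still a session leader (sid equals its pid), and whether descriptors 0 to 2 still point at a /dev/pts entry after the host exits.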

Depends on

Required by

Log