Metadata
| Status | done |
|---|---|
| Assigned | agent-1133 |
| Agent identity | 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3 |
| Created | 2026-04-29T17:00:10.811618835+00:00 |
| Started | 2026-04-29T17:01:14.390620700+00:00 |
| Completed | 2026-04-29T17:11:21.001557824+00:00 |
| Tags | priority-high,design,tui,pty,architecture, eval-scheduled |
| Eval score | 0.84 |
| └ blocking impact | 0.88 |
| └ completeness | 0.85 |
| └ constraint fidelity | 0.70 |
| └ coordination overhead | 0.87 |
| └ correctness | 0.87 |
| └ downstream usability | 0.87 |
| └ efficiency | 0.82 |
| └ intent fidelity | 0.95 |
| └ style adherence | 0.83 |
Description
Description
User wants tmux-like behavior for chat agents: if the wg TUI exits, the chat agent process should survive (and ideally be reattachable from a fresh TUI session).
User quote: 'we don't get tmux-like behavior with the pty. say, when our host exits. could... we just jam tmux in there lol? to get auto-resume? well, whatever happens with claude seems to work but if i exit while it's working, codex breaks and i assume claude too but i haven't tested.'
Two-part design question:
Part A: investigate the claude-vs-codex asymmetry
- WHY does claude seem to survive TUI exit while codex breaks?
- Hypotheses:
- claude is mostly line-streaming so an abrupt PTY close affects only the in-flight chunk; codex maintains alt-screen + tool-state and corrupts on signal
- claude handles SIGHUP / SIGTERM more gracefully because it's a long-running text stream
- claude's chat handler in wg has different signal-forwarding behavior than codex's
- Maybe there's already a partial detach/respawn for claude that codex doesn't get
- Reproduce both and compare:
wg tui→ spawn each chat → kill the TUI mid-conversation → observe what happens to the child process and any in-flight state
Part B: pick a persistence strategy
Three options to evaluate:
- A. Wrap every chat agent in tmux. Spawn
tmux new-session -d -s wg-chat-<ref>and have wg's PTY pane attach to it. Pros: battle-tested, free reattach viatmux attach -t wg-chat-<ref>, free scrollback as tmux's own buffer (could help other PTY issues), survives TUI restart trivially. Cons: hard dep on tmux being installed, extra control-mode complexity, our PTY emulator now wraps tmux's PTY (two layers). - B. Lighter detach utility (dtach / abduco). Smaller dep than tmux, just gives detached PTY semantics, no multiplexing or scrollback features. Pros: minimal. Cons: less common, smaller user base, fewer free features.
- C. Custom detached-process supervision. wg spawns chat agents with setsid + a control socket; TUI connects/disconnects via socket. Pros: native, no external dep. Cons: most work, reinvents what tmux already does well.
- D. Targeted fix only for codex SIGHUP/exit handling. If Part A's investigation shows the issue is a specific signal-handling bug rather than a general persistence gap, just fix that bug — don't add a persistence layer.
Part C: what 'auto-resume' means
- Just survive (process keeps running, output buffered until TUI reconnects)?
- Or actively reattach (TUI on restart re-establishes the visible PTY)?
- Or auto-fork with a control socket so multiple TUIs can attach to the same chat?
The simpler the requirement, the smaller the scope. Recommend: 'process survives TUI exit + can be reattached on next wg tui' — not full multi-attach.
Deliverable
Design doc posted via wg log with:
- Part A findings: empirically measured claude-vs-codex behavior on TUI exit
- Recommended option (A/B/C/D) with rationale
- Concrete implementation plan (file paths, spawn-site changes, IPC protocol if needed)
- Smoke scenario list for the implementation task
Validation
- Part A repro performed and documented (kill TUI mid-chat for both claude and codex; describe observed behavior including process tree, file descriptors, in-flight state)
- Recommendation chosen with rationale
- Implementation plan concrete enough that a follow-up task can execute without re-investigation
- Smoke scenario list specified
- No source modifications — design only
Depends on
Required by
- (none)
Log
- 2026-04-29T17:00:10.796749657+00:00 Task paused
- 2026-04-29T17:00:14.630744065+00:00 Task published
- 2026-04-29T17:01:14.390623916+00:00 Spawned by coordinator --executor claude --model opus
- 2026-04-29T17:01:47.913714693+00:00 Starting design investigation: exploring chat agent spawn paths for claude vs codex, PTY handling, and TUI exit behavior
- 2026-04-29T17:05:20.842184325+00:00 Investigation phase complete. Key code-level findings: (1) PtyPane Drop calls child.kill() unconditionally for both claude+codex (src/tui/pty_pane.rs:551-555); (2) chat agent spawn site is src/tui/viz_viewer/state.rs:13039-13183 — same PtyPane path for both, no setsid/setpgid; (3) claude resume uses uuid + .jsonl session log under ~/.claude/projects/, codex uses .codex-session-id or 'resume --last' marker — asymmetry in resume robustness, not in spawn signal handling; (4) codex_handler.rs has NO signal handling, claude_handler.rs has SIGTERM forwarding — but these are daemon handlers, not the TUI PTY path; (5) wg already uses tmux for outer wg-tui multi-user wrapping (docs/design/terminal-wrapping-strategy.md) and 'wg server'. Now writing design doc.
- 2026-04-29T17:07:46.550149481+00:00 Design doc written: docs/design/chat-agent-persistence.md. Recommendation: option A (tmux wrapper) with graceful fallback. Key finding: codex's mid-turn breakage is rollout-integrity, not signal-handling — option D rejected because the codex process MUST survive the kill for the rollout to remain well-formed. Implementation plan touches pty_pane.rs (new spawn_via_tmux constructor), state.rs (chat spawn site + archive cleanup), small setup.rs messaging tweak. Seven smoke scenarios specified covering claude+codex persistence, mid-tool-call survival, archive cleanup, orphan sweep, tmux-missing fallback, and wg chat attach.
- 2026-04-29T17:11:13.031022372+00:00 Committed: 65612d6c1 — pushed to origin/wg/agent-1133/design-chat-agent. Validation: all 5 task criteria addressed (Part A documented with code-level evidence + concrete repro recipe with assertions, recommendation A with rationale, implementation plan with exact file paths + line ranges, 7 smoke scenarios named, no source modifications).
- 2026-04-29T17:11:21.001562933+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
- 2026-04-29T17:13:38.890731236+00:00 PendingEval → Done (evaluator passed; downstream unblocks)