diagnose-scrollback-corruption

Diagnose: scrollback corruption on SIGWINCH — reflow/rewrap state recompute hypothesis

Metadata

Status: done
Assigned: agent-1104
Agent identity: f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e
Created: 2026-04-29T13:54:21.441682499+00:00
Started: 2026-04-29T13:54:49.090996497+00:00
Completed: 2026-04-29T14:08:54.277301630+00:00
Tags: priority-high, bug, tui, pty, eval-scheduled
Eval score: 0.86
└ blocking impact: 0.85
└ completeness: 0.90
└ constraint fidelity: 0.85
└ coordination overhead: 0.90
└ correctness: 0.85
└ downstream usability: 0.85
└ efficiency: 0.85
└ intent fidelity: 0.66
└ style adherence: 0.85

Description

Multiple targeted fixes have failed to hold (fix-tui-pty resize, fix-pty-scrollback initial-render). Symptom recurs. User's refined hypothesis:

User quote: 'we should be able to fix this. might need recompute of flow/wrap??? unclear. interaction with claude or codex is weird potentially.'

Hypotheses to test:

  1. Reflow/rewrap on SIGWINCH — when terminal width changes, the scrollback buffer needs to be re-wrapped to the new column count. If the rewrap logic doesn't reset all state (cursor position, line-continuation flags, scroll region top/bottom), garbled output / duplication emerges.
  2. Child-agent output interaction — claude and codex CLIs may emit:
    • Alt-screen enter/exit sequences (\x1b[?1049h / \x1b[?1049l)
    • Scroll region commands (DECSTBM, \x1b[<top>;<bottom>r)
    • Cursor save/restore (\x1b7 / \x1b8 or \x1b[s / \x1b[u)
    • Mouse mode toggles (\x1b[?1000h for button tracking, \x1b[?1006h for SGR extended coordinates)
    • DECSET/DECRST mode bits

     If our reflow code doesn't preserve these mode bits across SIGWINCH, the next render writes into the wrong buffer / wrong scroll region and we get the symptom.
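
For hypothesis 1, the following is a minimal sketch of what a full rewrap looks like when line-continuation flags are recomputed rather than carried over. It is not the project's reflow code: the row representation `(text, wrapped)` and the function name `reflow` are assumptions made for illustration, and it treats one character as one cell (no wide or combining characters).

```python
def reflow(rows, new_width):
    """Rewrap soft-wrapped rows to new_width.

    rows: list of (text, wrapped) pairs, where wrapped=True means the row
    continues onto the next row (soft wrap). Hypothetical representation.
    """
    if new_width <= 0:
        raise ValueError("new_width must be positive")

    # Step 1: rebuild logical lines by joining rows across soft wraps.
    logical = []
    current = ""
    for text, wrapped in rows:
        current += text
        if not wrapped:
            logical.append(current)
            current = ""
    if current:  # buffer ended on a soft-wrapped row
        logical.append(current)

    # Step 2: split each logical line at the new width and recompute every
    # continuation flag from scratch, so no stale flag survives the resize.
    out = []
    for line in logical:
        chunks = [line[i:i + new_width] for i in range(0, len(line), new_width)] or [""]
        for k, chunk in enumerate(chunks):
            out.append((chunk, k < len(chunks) - 1))
    return out
```

If the real reflow path keeps old flags while changing the row contents, comparing its output with a from-scratch rewrap like this one may help show where duplication or garbling first appears.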

Repro fidelity required

Many SIGWINCH events fire from sources other than deliberate resize: window manager focus changes, parent terminal redraws, tmux operations. The repro should be deterministic: fire SIGWINCH programmatically (via kill -WINCH) at known points in a controlled chat output stream rather than relying on a human resizing a window.
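
A minimal sketch of firing SIGWINCH at a known frame index, assuming a Unix host and a replay loop running in the main thread. The function name `replay_with_winch` and the event-log shape are hypothetical; a real harness would feed each frame into the scrollback emulator where this sketch only records the ordering.

```python
import os
import signal


def replay_with_winch(frames, fire_at):
    """Replay frames in order, sending SIGWINCH just before frame fire_at.

    Returns an event log of ("frame", i) entries plus one ("winch", i) entry
    marking where the signal was observed.
    """
    log = []
    pending = []

    def on_winch(signum, _stack):
        pending.append(signum)

    previous = signal.signal(signal.SIGWINCH, on_winch)
    try:
        for i, _frame in enumerate(frames):
            if i == fire_at:
                # Equivalent to `kill -WINCH <pid>` aimed at this process.
                os.kill(os.getpid(), signal.SIGWINCH)
            if pending:
                pending.clear()
                log.append(("winch", i))
            # A real harness would feed _frame to the emulator here.
            log.append(("frame", i))
    finally:
        if previous is not None:
            signal.signal(signal.SIGWINCH, previous)
    return log
```

Running the same capture once per value of `fire_at`, plus one run with no signal as a baseline, gives a table of which frame indices produce output that differs from the baseline.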

Investigation steps (research only — do NOT write the fix yet)

  1. Capture raw bytes from a typical claude chat session AND a typical codex chat session (separate files). Use script or wg's chat-history JSONL.
  2. In a test harness, replay those bytes into the current scrollback emulator with a SIGWINCH fired at frame N. Vary N. Identify which N values produce the corruption.
  3. Read the current src/tui/ reflow path. Specifically look for:
    • Whether all of: cursor, scroll region, alt-screen state, SGR state, mode bits are saved before reflow and restored after
    • Whether the reflow re-parses from buffer-start or from current-cursor — re-parsing from cursor while buffer still contains pre-resize bytes is a known bug class
    • Whether streaming-text + finalized-message double-emit (the recent regression in commit 572a28d37 fix-pty-output) sneaks back in via the reflow path
  4. Compare claude vs codex output streams: which mode bits / sequences differ? Codex CLI may use alt-screen more aggressively (it's a fuller TUI than claude's stream-json output).
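
For step 4, a rough sketch of counting the escape sequences listed in the hypothesis in two raw captures and reporting where the counts differ. The labels, the `SEQUENCES` table, and both function names are illustrative assumptions. The patterns only match single-parameter forms, so a combined sequence such as \x1b[?1000;1006h would be missed.

```python
import re
from collections import Counter

# Sequences named in the hypothesis; labels are illustrative only.
SEQUENCES = {
    "alt_screen_enter": re.compile(rb"\x1b\[\?1049h"),
    "alt_screen_exit": re.compile(rb"\x1b\[\?1049l"),
    "scroll_region": re.compile(rb"\x1b\[\d*(?:;\d*)?r"),
    "cursor_save": re.compile(rb"\x1b7|\x1b\[s"),
    "cursor_restore": re.compile(rb"\x1b8|\x1b\[u"),
    "mouse_1000_on": re.compile(rb"\x1b\[\?1000h"),
    "mouse_1006_on": re.compile(rb"\x1b\[\?1006h"),
}


def count_sequences(data: bytes) -> Counter:
    """Count occurrences of each tracked sequence in a raw byte capture."""
    return Counter({name: len(p.findall(data)) for name, p in SEQUENCES.items()})


def differing(a: bytes, b: bytes) -> dict:
    """Return {label: (count_in_a, count_in_b)} for labels whose counts differ."""
    ca, cb = count_sequences(a), count_sequences(b)
    return {name: (ca[name], cb[name]) for name in SEQUENCES if ca[name] != cb[name]}
```

In practice `a` and `b` would be the contents of the two capture files from step 1, read in binary mode.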

Deliverable

A diagnostic write-up posted via wg log, NOT a code change:

  • Confirmed root cause with file:line citations
  • Specific mode bits / state that's lost across SIGWINCH (with citation showing where it should be saved+restored but isn't)
  • Whether claude vs codex paths trigger different bugs OR the same bug at different rates
  • Concrete fix proposal (1-2 paragraphs) so the follow-up implementation task can execute against a clear spec
  • If after investigation the conclusion is 'this is fundamentally an architectural problem, not patchable' — say so explicitly and refer to the queued replace-custom-pty task

Validation

  • Reproduction is deterministic (kill -WINCH at known points produces the bug; without SIGWINCH at those points, no bug)
  • Root cause identified with file:line citations
  • Mode bits / state preservation issues enumerated specifically
  • claude vs codex difference characterized
  • Fix proposal concrete enough that the follow-up task can implement without re-investigation
  • No source modifications — diagnose only
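
The state-preservation check could be made mechanical along these lines. This is a sketch under assumptions: the `TermState` fields mirror the list in investigation step 3 (cursor, scroll region, alt-screen state, SGR state, mode bits), but the type, its defaults, and `changed_fields` are hypothetical and do not describe the emulator's actual structures.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TermState:
    """Hypothetical snapshot of emulator state taken around a reflow."""
    cursor: tuple = (0, 0)            # (row, col)
    scroll_region: tuple = (0, 23)    # (top, bottom)
    alt_screen: bool = False
    sgr: tuple = ()                   # active SGR parameters
    modes: frozenset = frozenset()    # DECSET mode numbers currently set


def changed_fields(before: TermState, after: TermState) -> list:
    """List the names of fields whose values differ between two snapshots."""
    return [
        f.name
        for f in fields(TermState)
        if getattr(before, f.name) != getattr(after, f.name)
    ]
```

Snapshotting before and after the reflow at each SIGWINCH and listing the changed fields gives the enumeration the write-up asks for. Which changes are legitimate (the cursor, for instance, is expected to move when lines rewrap) is a decision for the fix proposal.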

Depends on

Required by

Log