fix-last-interaction — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-2281`
Agent identity	`f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e`
Model	`codex:gpt-5.5`
Created	2026-05-04T14:59:13.810118068+00:00
Started	2026-05-04T15:00:22.126231066+00:00
Completed	2026-05-04T15:21:54.207424506+00:00
Tags	`priority-high,fix,bug,tui,sort,chat`, `eval-scheduled`
Eval score	0.77
└ blocking impact	0.85
└ completeness	0.90
└ constraint fidelity	0.85
└ coordination overhead	0.90
└ correctness	0.85
└ downstream usability	0.85
└ efficiency	0.80
└ intent fidelity	0.68
└ style adherence	0.85

## Description
fix-chat-tasks (commit 6a3fc523e) added the last_interaction_at field and wired the `wg chat send` CLI path to update it. But the actual user flow (typing in a TUI chat tab) does NOT update the field.

User report 2026-05-04: '.chat-27 still not sorting at the top even though we talking in that one' ... 'the fix was clearly not tested'

Hard evidence:
```
$ grep chat-27 .wg/graph.jsonl | tail -1
  last_interaction_at: 2026-05-04T02:28:38   ← over 12h before user observed bug
```

User has been actively chatting in .chat-27 the whole session; the field is frozen.

## Root cause hypothesis
fix-chat-tasks only added the bump in CLI command paths (`wg chat send`, possibly `wg log`). It did NOT add the bump for:
- Keystroke typed in chat tab → message sent through TUI's chat-input handler
- Agent response appended to chat history (chat history append event)
- Possibly other interaction types (state transitions, etc.)

## Required fix

Find ALL the interaction sites and ensure they ALL bump last_interaction_at:
1. TUI chat tab user-typed-message handler — wherever the chat input is committed
2. Chat history append (agent's response written to JSONL) — both directions of conversation should count
3. Worker agent activity / heartbeat (debounced — don't trigger constant re-sort, per the recurring perf concern)
4. State transitions (probably already wired; verify)
5. `wg log` entries (probably already wired; verify)
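
For item 3, a minimal sketch of the debounce, assuming one debouncer per chat task. `BumpDebouncer` and `should_bump` are hypothetical names for illustration, not existing workgraph APIs, and the 5-second interval in the usage note is only an example value.

```rust
use std::time::{Duration, Instant};

/// Hypothetical debouncer: lets a heartbeat bump through at most once
/// per `min_interval`, so frequent agent activity does not force a re-sort
/// on every event.
pub struct BumpDebouncer {
    min_interval: Duration,
    last_bump: Option<Instant>,
}

impl BumpDebouncer {
    pub fn new(min_interval: Duration) -> Self {
        Self { min_interval, last_bump: None }
    }

    /// Returns true if a bump should be written at `now`.
    /// The caller passes the clock in, which keeps the logic testable.
    pub fn should_bump(&mut self, now: Instant) -> bool {
        match self.last_bump {
            // Too soon since the last accepted bump: suppress this one.
            Some(prev) if now.saturating_duration_since(prev) < self.min_interval => false,
            // First bump, or enough time has passed: accept and remember it.
            _ => {
                self.last_bump = Some(now);
                true
            }
        }
    }
}
```

With `BumpDebouncer::new(Duration::from_secs(5))`, a heartbeat one second after an accepted bump is dropped, and one five seconds after it is accepted. User-typed messages and agent replies would presumably bypass the debouncer, since those must show up within the 5-second window required below.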

Use ONE central helper that wraps any task mutation with timestamp-bump (per the original revert-redo-fix design). If the helper exists, audit ALL mutation paths and ensure they go through it. If sites bypass the helper, that's the bug.
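
A rough sketch of what such a helper could look like, under stated assumptions: `Task`, `Graph`, and `mutate_task` are hypothetical, simplified stand-ins (the real task record and the on-disk `.wg/graph.jsonl` handling are not shown here), and the timestamp is modeled as plain seconds rather than the RFC 3339 string the real field holds.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified task record (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct Task {
    pub id: String,
    pub status: String,
    pub log: Vec<String>,
    /// Modeled as seconds since the Unix epoch for this sketch.
    pub last_interaction_at: u64,
}

/// Hypothetical in-memory stand-in for the task graph.
#[derive(Default)]
pub struct Graph {
    pub tasks: HashMap<String, Task>,
}

impl Graph {
    /// Single entry point for task mutations: apply `change`, then bump
    /// the timestamp. Returns false if the task does not exist.
    pub fn mutate_task<F>(&mut self, id: &str, now: u64, change: F) -> bool
    where
        F: FnOnce(&mut Task),
    {
        match self.tasks.get_mut(id) {
            Some(task) => {
                change(task);
                // Never move the timestamp backwards if clocks disagree.
                task.last_interaction_at = task.last_interaction_at.max(now);
                true
            }
            None => false,
        }
    }
}
```

If every interaction site (TUI input commit, reply append, `wg log`, state transitions) goes through one function like this, a site that forgets the bump cannot exist; the audit then reduces to finding code that writes to a task without calling it.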

## Validation — STRICT live test

The validation rubric for fix-chat-tasks was inadequate. This task requires:

- [ ] Failing test written first: simulate user-typed message in TUI chat tab; assert last_interaction_at on that chat task updates
- [ ] **LIVE smoke against the user's actual flow**: `wg tui` → click into an existing chat tab → type and send a message. ASSERT last_interaction_at on that chat updates within 5 seconds. Capture the BEFORE timestamp, the typing event, the AFTER timestamp; paste evidence.
- [ ] Same test for agent response append (the chat reply that arrives) — receiving a reply should ALSO bump last_interaction_at
- [ ] Sort behavior: the user-active chat bubbles to top of its status group within 5s of typing
- [ ] No regression of existing wired paths (`wg chat send`, etc.)
- [ ] No regression of revert-redo-fix's sort-stability + render-debounce work
- [ ] `cargo build` + `cargo test` pass
- [ ] `cargo install --path .` was run before claiming done; binary timestamp verified
- [ ] Call `wg done` at completion
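
The sort-behavior item can be pinned down with a small model of the expected ordering. This is a sketch only: `StatusGroup`, `Row`, and `sort_rows` are hypothetical names, the group order is an assumption, and the real TUI sort (including the sort-stability work from revert-redo-fix) may differ.

```rust
/// Hypothetical status groups; declaration order is the assumed display order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum StatusGroup {
    InProgress,
    Open,
    Done,
}

/// Hypothetical row in the task list.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Row {
    pub id: String,
    pub group: StatusGroup,
    /// Modeled as seconds since the Unix epoch for this sketch.
    pub last_interaction_at: u64,
}

/// Sort by status group first, then most recent interaction first.
/// The id tie-break keeps the order deterministic when timestamps match.
pub fn sort_rows(rows: &mut [Row]) {
    rows.sort_by(|a, b| {
        a.group
            .cmp(&b.group)
            .then_with(|| b.last_interaction_at.cmp(&a.last_interaction_at))
            .then_with(|| a.id.cmp(&b.id))
    });
}
```

Under this model, bumping last_interaction_at on `.chat-27` and re-sorting moves it to the top of its own group without lifting it above rows in an earlier group, which is the behavior the checklist asks to observe within 5s of typing.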

## Process note
This is the SECOND time fix-chat-tasks-class work has shipped without testing the actual user flow. The pattern: 'CLI command paths exist and tests pass' → 'shipped' → 'user observes the user-flow path is unfixed.' Worth a deeper look at why the agent's validation rubric was self-referential (testing only the paths the agent thought of, not the user flow).

Suggest amending the doc-sync function template OR a separate process improvement: any 'user-visible behavior' fix MUST validate via live human-flow simulation, not just CLI invocation paths.

Depends on

done .assign-fix-last-interaction

Required by

done process-require-live

Log

2026-05-04T14:59:13.788177732+00:00 Task paused
2026-05-04T14:59:45.731206987+00:00 Task published
2026-05-04T15:00:10.500353842+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer (0.80 score, 572 tasks) is the strongest match: this priority-high bug fix requires strict user-flow validation, test-driven development, and live smoke testing—exactly the thorough validation discipline this task demands after the previous fix shipped untested.
2026-05-04T15:00:22.126237789+00:00 Spawned by coordinator --executor codex --model gpt-5.5
2026-05-04T15:00:40.267386628+00:00 Starting implementation: auditing last_interaction_at mutation paths for TUI chat, chat append, logs, state transitions, and agent activity.
2026-05-04T15:03:53.025435838+00:00 Patched TUI embedded chat PTY activity to bump last_interaction_at via chat::bump_chat_interaction; added tests for TUI Enter and outbox response append.
2026-05-04T15:08:02.601396032+00:00 Added permanent smoke scenario tui_chat_pty_last_interaction to drive real wg tui typing in a custom-command chat PTY and assert timestamp/sort behavior.
2026-05-04T15:09:07.625070104+00:00 Validation in progress: cargo build passed; running full cargo test.
2026-05-04T15:12:26.722890836+00:00 Full cargo test reached integration_cycle_detection and failed on two pre-existing bare-wg-init expectation tests; continuing with install and live TUI smoke. Failed tests: test_cli_add_with_exec_flag, test_cli_add_with_exec_and_timeout.
2026-05-04T15:16:34.175232115+00:00 Validated: cargo install --path . completed and replaced installed wg binary.
2026-05-04T15:17:15.596438634+00:00 LIVE TUI smoke passed: before .chat-0=2026-05-04T15:16:51.742884092+00:00; typed via wg tui tmux session 'wgsmoke-tui-chat-pty-lia-3783689': smoke lia 1777907815; after .chat-0=2026-05-04T15:16:55.492630118+00:00; active chat sorted top.
2026-05-04T15:18:45.026750530+00:00 Validated: focused tests passed (integration_last_interaction_at; TUI Enter regression); cargo build passed; live TUI smoke passed; CLI chat send smoke passed. Full cargo test blocked by unrelated bare-wg-init expectation failures in integration_cycle_detection.
2026-05-04T15:20:46.336851873+00:00 Committed: 3058fe243 — pushed to remote
2026-05-04T15:21:54.207443953+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
2026-05-04T15:24:22.955576363+00:00 PendingEval → Done (evaluator passed; downstream unblocks)