fix-chat-cap — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-915`
Agent identity	`f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e`
Created	2026-04-28T20:16:01.095654886+00:00
Started	2026-04-28T20:17:33.427322073+00:00
Completed	2026-04-28T20:45:44.915820162+00:00
Tags	`eval-scheduled`
Eval score	0.77
└ blocking impact	0.85
└ completeness	0.70
└ constraint fidelity	0.85
└ coordination overhead	0.85
└ correctness	0.75
└ downstream usability	0.80
└ efficiency	0.85
└ intent fidelity	0.87
└ style adherence	0.75

Description

User reports the TUI status bar shows 'chat cap reached 4/4' but only 2 chat tabs are visible. The counter is including supervised chats / orphan handlers that aren't part of the user's working set.

Same family as the dispatcher-auto-respawns bug (zombie supervisors keeping alive things that should have been torn down). The `wg service purge-chats` command landed in commit `da9c291bf`; the count should respect what purge-chats considers 'live' (handler attached + consumer recently seen) vs 'archived/zombie'.

Investigation

Find the cap-checking code path (likely `src/commands/service/coordinator_agent.rs` or `src/commands/service/mod.rs`): what does it iterate to produce 4/4?
Reconcile against actual TUI tab visibility: a chat that has no consumer + is archived MUST NOT count.
Decide: does cap mean 'spawnable slots' (active chats only) or 'reserved slots' (any non-archived chat)? Should be the former.
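
The two readings of the cap in the last item can be contrasted with a minimal sketch. Everything here is hypothetical: the `Chat` struct, its `archived` and `live` fields, and both function names are illustrative and are not taken from the workgraph source. `live` stands in for "handler attached + consumer recently seen".

```rust
/// Hypothetical, simplified view of a chat task (not the real workgraph type).
#[derive(Debug, Clone, Copy)]
struct Chat {
    /// The chat has been archived.
    archived: bool,
    /// Assumed meaning: a handler is attached and a consumer was seen recently.
    live: bool,
}

/// "Reserved slots" reading: every non-archived chat holds a slot.
fn reserved_slots(chats: &[Chat]) -> usize {
    chats.iter().filter(|c| !c.archived).count()
}

/// "Spawnable slots" reading: only non-archived, live chats count against the cap.
fn spawnable_slots_used(chats: &[Chat]) -> usize {
    chats.iter().filter(|c| !c.archived && c.live).count()
}
```

With four non-archived chats of which two have no live consumer, the first function returns 4 and the second returns 2. That would match the reported 4/4 against two visible tabs, if the extra two are indeed orphaned chats.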

Repro

Open the TUI with no chats. Spawn one. Spawn another. Stop the dispatcher. Restart it. Observe the cap counter — it should still show N/16 (or whatever default), not balloon based on resurrected supervisors.

Validation

Failing test first: a state where there are 4 `.chat-N` tasks of which 2 are archived → counter reports 2/cap, not 4/cap
Failing test: zombie supervisor (handler dead, not yet reaped) → counter does NOT include it
Manual smoke: TUI with 2 chats visible, run `wg list --tag chat` to confirm what's in the graph, verify the cap counter matches the live-consumer count
`cargo build` + `cargo test` pass

Depends on

done .assign-fix-chat-cap

Required by

(none)

Log

2026-04-28T20:16:01.081554344+00:00 Task paused
2026-04-28T20:17:05.890835166+00:00 Task published
2026-04-28T20:17:32.381154185+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer is the best fit for this correctness-critical bug fix requiring TDD, code modification, and exhaustive manual smoke testing per user feedback
2026-04-28T20:17:33.427324788+00:00 Spawned by coordinator --executor claude --model opus
2026-04-28T20:17:58.699076335+00:00 Starting investigation: looking for chat cap check logic
2026-04-28T20:22:26.170262917+00:00 Found cap check sites: 1 in ipc.rs:1450 (CreateChat), 3 in state.rs (TUI launcher 12800, launch_from_launcher 12873, create_with_defaults 12936). All four use the same filter: chat-loop tag + !Abandoned + !archived. Boot enumeration also excludes Done — cap counter inconsistency. Plan: add count_live_chats helper that excludes (a) Done status, (b) chats stale per chat_session_is_idle with 24h threshold (still counts fresh chats and chats with pending inbox/recent consumer).
2026-04-28T20:44:59.888213918+00:00 Smoke validation: created scratch wg dir, hand-crafted graph scenarios. (1) 4 chat tasks with 2 archived: cap reads 2/4, can create 2 more. (2) 4 zombies (created 2h ago, no cursor, no inbox): cap reads 0/4, can create new. (3) 4 fresh live chats: cap reads 4/4, blocks creation. All three behaviors match the validation criteria.
2026-04-28T20:45:34.074889521+00:00 Committed: 22e1036e7 — pushed to remote
2026-04-28T20:45:44.915824020+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
2026-04-28T20:48:58.453833714+00:00 PendingEval → Done (evaluator passed; downstream unblocks)
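
The filter described in the 20:22 log entry (skip Abandoned, archived, and Done chats, plus chats idle past a 24h threshold unless they have pending inbox) might look roughly like the sketch below. This is not the committed code: the `ChatTask` fields, the `Status` enum, and the function names are assumptions, and the real `chat_session_is_idle` rules, including how freshly created chats are treated, are not shown in the log and are not modelled here.

```rust
use std::time::Duration;

/// Hypothetical status enum; the real task statuses may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Open,
    Done,
    Abandoned,
}

/// Hypothetical, simplified chat task record.
struct ChatTask {
    status: Status,
    archived: bool,
    /// Unread messages are waiting for this chat.
    pending_inbox: bool,
    /// Time since a consumer was last seen; `None` if none was ever seen.
    idle_for: Option<Duration>,
}

/// 24h idle threshold, as mentioned in the log.
const IDLE_THRESHOLD: Duration = Duration::from_secs(24 * 60 * 60);

/// Assumed idle rule: pending inbox keeps a chat live; otherwise a chat is idle
/// when no consumer was ever seen or the last one is older than the threshold.
fn is_idle(chat: &ChatTask) -> bool {
    if chat.pending_inbox {
        return false;
    }
    match chat.idle_for {
        Some(elapsed) => elapsed >= IDLE_THRESHOLD,
        None => true,
    }
}

/// Count the chats that should occupy a cap slot under the assumptions above.
fn count_live_chats_sketch(chats: &[ChatTask]) -> usize {
    chats
        .iter()
        .filter(|c| c.status == Status::Open && !c.archived && !is_idle(c))
        .count()
}
```

Keeping the rule in one helper means all four cap check sites listed in the log could share the same definition of "live" instead of repeating the filter.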