tui-detail-view — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-953`
Agent identity	`f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e`
Created	2026-04-28T21:34:06.196716882+00:00
Started	2026-04-28T21:51:41.196169093+00:00
Completed	2026-04-28T22:14:09.740289868+00:00
Tags	`eval-scheduled`
Tokens	15103228 in / 25425 out
Eval score	0.87
└ blocking impact	0.88
└ completeness	0.85
└ constraint fidelity	0.85
└ coordination overhead	0.88
└ correctness	0.88
└ downstream usability	0.85
└ efficiency	0.78
└ intent fidelity	0.78
└ style adherence	0.92

## Description

When a task is in a cycle and iterates, the agency companion tasks (`.flip-X`, `.evaluate-X`, `.assign-X`) also iterate alongside it. But the TUI detail view shows FLIP/eval scores without labeling which iteration they came from, so iteration 2 of the user's task displays the iteration 1 FLIP score with no indication that it is stale.

User quote: 'in the display in the tui detail view the flip is not iteration specific. but we are iterating the flip tasks too. all the tasks are iterating.'

Currently visible on this user's session via the create-agents-md ↔ verify-agents-md cycle (max-iterations=3): `.flip-create-agents-md` re-runs each iteration, but the detail panel just shows 'Score: 0.04 Source: flip' with one timestamp, no iteration label.

### What to fix
1. Each `.flip-*` / `.evaluate-*` run records or is tagged with the `loop_iteration` of the parent task at the time it ran.
2. The TUI detail view's eval/flip section either:
   - Shows scores grouped by iteration (e.g. 'Iteration 1: flip 0.04 / eval 0.04 | Iteration 2: flip 0.65 / eval 0.74'), OR
   - Shows ONLY the current iteration's score and labels it as such ('Iteration 2: flip 0.65')
3. The CLI `wg show <task>` output should match — same iteration labeling so users grepping logs aren't confused.
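
A minimal sketch of items 1 and 2, under stated assumptions: the `Evaluation` struct, its field names, and `render_grouped` are hypothetical and are not taken from the codebase; the real record in `src/graph.rs` may look different. It shows each score carrying the parent's `loop_iteration` and the grouped-by-iteration rendering from the first display option.

```rust
use std::collections::BTreeMap;

/// Hypothetical evaluation record; the real struct and field names may differ.
#[derive(Debug, Clone)]
struct Evaluation {
    /// Which companion task produced the score, e.g. "flip" or "eval".
    source: String,
    score: f64,
    /// Parent task's loop iteration when this run happened (assumed 1-based here).
    loop_iteration: u32,
}

/// Group scores by iteration and render them on one line, e.g.
/// "Iteration 1: flip 0.04 / eval 0.04 | Iteration 2: flip 0.65 / eval 0.74".
fn render_grouped(evals: &[Evaluation]) -> String {
    // BTreeMap keeps iterations in ascending order.
    let mut by_iter: BTreeMap<u32, Vec<&Evaluation>> = BTreeMap::new();
    for e in evals {
        by_iter.entry(e.loop_iteration).or_default().push(e);
    }
    by_iter
        .iter()
        .map(|(iter, group)| {
            // Within an iteration, scores keep their recorded order.
            let scores: Vec<String> = group
                .iter()
                .map(|e| format!("{} {:.2}", e.source, e.score))
                .collect();
            format!("Iteration {}: {}", iter, scores.join(" / "))
        })
        .collect::<Vec<_>>()
        .join(" | ")
}
```

Because the iteration is stored on the record itself, the TUI and `wg show` could share one formatter like this, which would keep the two outputs consistent.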

### Likely files to touch
- Wherever eval/flip records are stored on the task struct (probably `src/graph.rs`)
- The TUI detail view renderer (likely `src/tui/detail.rs` or similar)
- `src/commands/show.rs` (or wherever `wg show` formats evaluations)
- The agency pipeline task — wherever `.flip-*` / `.evaluate-*` write their score back

### Out of scope
- Restyling the detail panel beyond what's needed to show iteration label
- Changing how cycles iterate or how flip/eval are scheduled
- Auto-archiving stale scores

## Validation

- [ ] Failing test first: a task with 2 completed iterations, each producing distinct flip scores, renders both in detail view (or current-only with explicit iteration label) — never just a single unlabeled score
- [ ] Failing test: `wg show <task>` includes iteration label on each Evaluation entry
- [ ] Manual smoke: on the existing create-agents-md cycle, after iteration 2 completes, detail view shows iteration 2's flip score (or both iterations clearly labeled)
- [ ] No regression for non-cycle tasks (single iteration → no extra label noise, or label says 'Iteration 1' uniformly — pick one)
- [ ] `cargo build` + `cargo test` pass
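
The checks above could be exercised against helpers shaped roughly like the following sketch. Everything here is an assumption for illustration: the `Evaluation` fields, `format_eval_line`, and `current_iteration_evals` are hypothetical names, and the sketch picks the "no extra label noise" option for non-cycle tasks, which is only one of the two choices the checklist leaves open. The `[iter N]` label shape follows the wording in the task log.

```rust
/// Hypothetical evaluation record; field names are assumptions.
struct Evaluation {
    source: &'static str,
    score: f64,
    /// Parent task's loop iteration when the score was produced.
    loop_iteration: u32,
}

/// Format one evaluation line. Cycle tasks get an "[iter N]" prefix;
/// non-cycle tasks stay unlabeled (one of the two options in the checklist).
fn format_eval_line(e: &Evaluation, in_cycle: bool) -> String {
    if in_cycle {
        format!("[iter {}] {} {:.2}", e.loop_iteration, e.source, e.score)
    } else {
        format!("{} {:.2}", e.source, e.score)
    }
}

/// Select only the evaluations from the highest recorded iteration,
/// for a current-only view. Returns an empty Vec when there are no scores.
fn current_iteration_evals(evals: &[Evaluation]) -> Vec<&Evaluation> {
    let latest = evals.iter().map(|e| e.loop_iteration).max();
    evals
        .iter()
        .filter(|e| Some(e.loop_iteration) == latest)
        .collect()
}
```

A test for the first check would build two iterations with distinct flip scores and assert that the rendered line for the current iteration carries its label, so a single unlabeled score can never pass.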

Depends on

done .assign-tui-detail-view

Required by

abandoned smoke-iter-test

Log

2026-04-28T21:34:06.187058437+00:00 Task paused
2026-04-28T21:34:10.660616793+00:00 Task published
2026-04-28T21:34:25.760917389+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=graph, reason=Implementation task requiring careful modification of TUI display, graph storage, and CLI output across multiple components with backward compatibility — Careful Programmer's attention to testing and correctness is essential.
2026-04-28T21:34:26.867341960+00:00 Spawned by coordinator --executor claude --model opus
2026-04-28T21:34:32.238921024+00:00 Starting work — investigating how flip/eval scores are stored and displayed
2026-04-28T21:51:30.854703246+00:00 Task unclaimed: agent 'agent-942' (PID 514709) process exited
2026-04-28T21:51:41.196172259+00:00 Spawned by coordinator --executor claude --model opus
2026-04-28T22:13:21.641617668+00:00 Validated: cargo build + cargo test pass (lib + show + evaluation_recording + integration_agency_*); pre-existing test_global_config_path and ResumeConfig failures unaffected. Live smoke: synthetic iter1+iter2 evals render with [iter N] labels in wg show.
2026-04-28T22:13:44.212186083+00:00 Committed: 1df735918 — staged 23 files by name (no -A)
2026-04-28T22:14:09.740292723+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
2026-04-28T22:16:33.274393919+00:00 PendingEval → Done (evaluator passed; downstream unblocks)