implement-failed-pending

Implement: failed → pending-eval → (rescued-done | failed) state machine + orange color

Metadata

Status: done
Assigned: agent-1148
Agent identity: f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e
Model: claude:sonnet
Created: 2026-04-29T17:12:11.220574277+00:00
Started: 2026-04-29T17:24:44.684820489+00:00
Completed: 2026-04-29T18:08:55.199710354+00:00
Tags: priority-high, fix, agency, tui, state-machine, eval-scheduled
Eval score: 0.78
└ blocking impact: 0.75
└ completeness: 0.85
└ coordination overhead: 0.80
└ correctness: 0.85
└ downstream usability: 0.80
└ efficiency: 0.80
└ intent fidelity: 0.66
└ style adherence: 0.85

Description

Implement the state machine + visual treatment chosen in design-failed-pending. Read that task's log via wg show design-failed-pending for the chosen approach, schema changes, color RGB values, and smoke scenarios.

Validation

  • Failing tests written first (TDD)
  • State machine: agent exits without wg done AND output exists → task enters pending-eval (from-failure variant) instead of terminal failed
  • Eval verdict positive → task transitions to done (potentially with 'rescued' marker per design decision)
  • Eval verdict negative → task transitions to failed (terminal)
  • TUI viz: failed-pending-eval state renders in the orange/yellow-red color from the design
  • TUI detail view: shows 'failed pending evaluation' label so user understands the in-flight state
  • Cycle tasks: an iteration N that is rescued to done correctly unblocks dispatch of iteration N+1 (exactly as a cleanly completed iteration would)
  • No regression: tasks that fail in genuinely-broken ways (cargo build error, OOM, signal kill) still go terminal-failed without eval consultation, per design
  • Live smoke: reproduce the autohaiku scenario: codex agent exits without wg done, output is acceptable, evaluator approves → task lands in done, NOT failed
  • Counter-smoke: same shape but agent output is bad → evaluator rejects → task lands in failed
  • cargo build + cargo test pass with no regressions
  • Permanent smoke scenario added under tests/smoke/scenarios/ with this task id in owners
  • cargo install --path . was run before claiming done
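
The transitions listed above can be sketched as a small Rust state machine. This is an illustrative sketch only, not the code that landed: the type and function names (Status, AgentExit, Verdict, on_agent_exit, on_eval_verdict, unblocks_next_iteration) are hypothetical, the rescued flag stands in for the "rescued marker" left to the design decision, and sending an exit with no output straight to failed is an assumption the checklist does not spell out.

```rust
/// Hypothetical task status, including the new in-flight state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    /// Agent exited without `wg done` but left output; awaiting the evaluator.
    FailedPendingEval,
    /// Terminal success. `rescued` is an assumed marker for eval-rescued tasks.
    Done { rescued: bool },
    /// Terminal failure.
    Failed,
}

/// Hypothetical classification of how the agent process ended.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentExit {
    /// Agent called `wg done` itself.
    MarkedDone,
    /// Agent exited without `wg done`; `has_output` records whether output exists.
    NoDoneSignal { has_output: bool },
    /// Genuinely broken run: build error, OOM, signal kill.
    HardFailure,
}

/// Evaluator verdict on a task in `FailedPendingEval`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Approve,
    Reject,
}

/// Decide the status right after the agent exits.
pub fn on_agent_exit(exit: AgentExit) -> Status {
    match exit {
        AgentExit::MarkedDone => Status::Done { rescued: false },
        // Missing `wg done` but output exists: defer to the evaluator.
        AgentExit::NoDoneSignal { has_output: true } => Status::FailedPendingEval,
        // Assumption: with no output there is nothing to evaluate.
        AgentExit::NoDoneSignal { has_output: false } => Status::Failed,
        // Hard failures stay terminal without consulting the evaluator.
        AgentExit::HardFailure => Status::Failed,
    }
}

/// Apply an evaluator verdict; only `FailedPendingEval` reacts to it.
pub fn on_eval_verdict(status: Status, verdict: Verdict) -> Status {
    match (status, verdict) {
        (Status::FailedPendingEval, Verdict::Approve) => Status::Done { rescued: true },
        (Status::FailedPendingEval, Verdict::Reject) => Status::Failed,
        // Terminal states ignore late verdicts.
        (other, _) => other,
    }
}

/// A rescued iteration unblocks the next cycle iteration like a clean one.
pub fn unblocks_next_iteration(status: Status) -> bool {
    matches!(status, Status::Done { .. })
}
```

In this sketch, the live smoke path is on_agent_exit(AgentExit::NoDoneSignal { has_output: true }) followed by on_eval_verdict(.., Verdict::Approve), which ends in Done { rescued: true }; the counter-smoke path uses Verdict::Reject and ends in Failed.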

Depends on

Required by

Log