Metadata
| Status | done |
|---|---|
| Assigned | agent-158 |
| Agent identity | f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e |
| Created | 2026-04-26T22:44:51.965153186+00:00 |
| Started | 2026-04-26T23:01:44.632631960+00:00 |
| Completed | 2026-04-26T23:27:00.889482821+00:00 |
| Tags | eval-scheduled |
| Eval score | 0.82 |
| └ blocking impact | 0.85 |
| └ completeness | 0.80 |
| └ coordination overhead | 0.85 |
| └ correctness | 0.85 |
| └ downstream usability | 0.80 |
| └ efficiency | 0.82 |
| └ intent fidelity | 0.84 |
| └ style adherence | 0.78 |
Description
After a credit exhaustion or other systemic failure, many tasks end up Failed. Recovery today is manual: loop in the shell over wg list --status failed, wg retry each user task, wg abandon each agency followup, and optionally edit the per-task model first. This works but is obscure.
User wants a first-class command for this:
'find all fukt. maybe change the config of them to use a diff endpoint optionally. and then just restart them'
Spec
wg recover [OPTIONS] — surveys failed tasks and resets them in one operation.
| Flag | Behavior |
|---|---|
| --dry-run (default if no other flags) | Print what would happen: how many failed, how many user vs agency, what would be retried/abandoned, no changes |
| --yes | Execute the plan |
| --filter <expr> | Only act on tasks matching: status=failed (default), or by tag, by id-prefix, by attempt-count, by error-pattern |
| --set-model <model> | Edit failed user-tasks to use this model BEFORE retry (e.g. --set-model 'openrouter:anthropic/claude-sonnet-4-6' to switch to a different provider) |
| --set-endpoint <url> | Same idea for endpoint URL |
| --keep-agency | Don't abandon .evaluate-* / .flip-* / .assign-* followups (default: abandon them so they regenerate fresh from the parent) |
| --max-attempts <N> | Skip tasks that already have attempt-count >= N (default: 5; protects against infinite retry loops) |
| --reason <msg> | Tag retried tasks with a recovery-reason log entry (for audit trail) |
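The flag table implies a per-task decision: retry, abandon, keep, or skip. A minimal Rust sketch of that decision follows. It is not the committed implementation: `FailedTask`, `Action`, `is_agency_followup`, and `plan` are hypothetical names, the id `.evaluate-tui-log-view` is made up to match the `.evaluate-*` pattern, and applying the `--max-attempts` check only to user tasks is an assumption the spec does not settle.

```rust
// Hypothetical task record; the real wg task model is not shown in this spec.
#[derive(Debug, Clone, PartialEq)]
struct FailedTask {
    id: String,
    attempts: u32,
}

#[derive(Debug, PartialEq)]
enum Action {
    Retry,   // user task: reset and run again
    Abandon, // agency followup: drop it so it regenerates from the parent
    Keep,    // agency followup left alone because --keep-agency was passed
    Skip,    // attempt-count >= --max-attempts
}

/// Agency followups are recognised by the id prefixes named in the spec.
fn is_agency_followup(id: &str) -> bool {
    [".evaluate-", ".flip-", ".assign-"]
        .iter()
        .any(|prefix| id.starts_with(prefix))
}

/// Decide what to do with one failed task.
/// Assumption: the max-attempts check applies to user tasks only.
fn plan(task: &FailedTask, max_attempts: u32, keep_agency: bool) -> Action {
    if is_agency_followup(&task.id) {
        if keep_agency {
            Action::Keep
        } else {
            Action::Abandon
        }
    } else if task.attempts >= max_attempts {
        Action::Skip
    } else {
        Action::Retry
    }
}

fn main() {
    let tasks = vec![
        FailedTask { id: "tui-log-view".into(), attempts: 1 },
        FailedTask { id: ".evaluate-tui-log-view".into(), attempts: 1 },
        FailedTask { id: "endpoint-inheritance-opt".into(), attempts: 5 },
    ];
    // Defaults from the flag table: --max-attempts 5, followups abandoned.
    for t in &tasks {
        println!("{:<28} {:?}", t.id, plan(t, 5, false));
    }
}
```

Keeping the decision in a pure function means a dry run and an execute run can share the same plan and differ only in whether the actions are applied.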
Example flows
Credit exhaustion, switch endpoint, retry everything:
wg recover --filter status=failed --set-model 'openrouter:anthropic/claude-sonnet-4-6' --reason 'credit-exhaustion-2026-04-26' --yes
Just see what's failed:
wg recover --dry-run
Retry only the recent failures (skip exhausted-retry tasks):
wg recover --max-attempts 3 --yes
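The only filter syntax the spec shows verbatim is status=failed. The sketch below assumes every filter takes the same key=value shape. The keys `tag`, `id-prefix`, `attempts`, and `error` are guesses based on the flag description, and `Filter` and `parse_filter` are hypothetical names rather than the shipped API.

```rust
// Hypothetical filter model; only `status=failed` appears literally in the spec.
#[derive(Debug, PartialEq)]
enum Filter {
    Status(String),
    Tag(String),
    IdPrefix(String),
    Attempts(u32),
    ErrorContains(String),
}

/// Parse a single `key=value` filter expression.
/// Returns a human-readable error message for malformed input.
fn parse_filter(expr: &str) -> Result<Filter, String> {
    let (key, value) = expr
        .split_once('=')
        .ok_or_else(|| format!("expected key=value, got '{}'", expr))?;
    match key.trim() {
        "status" => Ok(Filter::Status(value.to_string())),
        "tag" => Ok(Filter::Tag(value.to_string())),
        "id-prefix" => Ok(Filter::IdPrefix(value.to_string())),
        "attempts" => value
            .parse::<u32>()
            .map(Filter::Attempts)
            .map_err(|e| format!("invalid attempt count '{}': {}", value, e)),
        "error" => Ok(Filter::ErrorContains(value.to_string())),
        other => Err(format!("unknown filter key '{}'", other)),
    }
}

fn main() {
    // The form used in the example flows above.
    println!("{:?}", parse_filter("status=failed"));
    // A malformed expression is reported instead of being silently ignored.
    println!("{:?}", parse_filter("bogus"));
}
```

Rejecting unknown keys up front matters more than usual here, because a mistyped filter that matched every task would retry far more than intended once --yes is passed.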
Output
Always prints a structured summary, regardless of dry-run vs execute:
=== wg recover: 33 failed tasks ===
User tasks (will retry): 11
Agency followups (will abandon, auto-recreate): 22
Skipped (max-attempts exceeded): 0
User-task changes:
endpoint-inheritance-opt attempt #2 model→ openrouter:...
tui-log-view attempt #2 model→ openrouter:...
...
Apply with --yes
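The header and count lines of the summary above could be rendered from three counters, as in the following sketch. `Summary` and `render_summary` are hypothetical names. The sketch assumes the total is the sum of the three counts (which matches 33 = 11 + 22 + 0 in the sample) and that the "Apply with --yes" trailer appears only on a dry run. It omits the per-task "User-task changes" listing.

```rust
/// Hypothetical counts produced by the planning step; field names are illustrative.
struct Summary {
    retry: usize,
    abandon: usize,
    skipped: usize,
}

/// Render the count lines of the summary in the format shown in the spec.
fn render_summary(s: &Summary, dry_run: bool) -> String {
    // Assumption: every failed task lands in exactly one of the three buckets.
    let total = s.retry + s.abandon + s.skipped;
    let mut out = format!("=== wg recover: {} failed tasks ===\n", total);
    out.push_str(&format!("User tasks (will retry): {}\n", s.retry));
    out.push_str(&format!(
        "Agency followups (will abandon, auto-recreate): {}\n",
        s.abandon
    ));
    out.push_str(&format!("Skipped (max-attempts exceeded): {}\n", s.skipped));
    if dry_run {
        // Assumption: the hint is only useful when nothing was applied.
        out.push_str("Apply with --yes\n");
    }
    out
}

fn main() {
    let s = Summary { retry: 11, abandon: 22, skipped: 0 };
    print!("{}", render_summary(&s, true));
}
```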
Out of scope
- Recovering from non-failure states (Abandoned, etc.) — that's manual triage
- Cross-repo recovery — single repo at a time
- Auto-detecting credit-exhaustion patterns and recovering automatically — separate future feature
Validation
- Failing tests first: test_recover_dry_run_lists_failures, test_recover_yes_resets_user_tasks, test_recover_yes_abandons_agency_followups, test_recover_set_model_edits_before_retry, test_recover_max_attempts_skips_exhausted
- Implementation makes tests pass
- cargo build + cargo test pass with no regressions
- Manual smoke: in a scratch dir with mixed failed tasks, wg recover --dry-run accurately reports; --yes executes; final state has 0 failed user tasks and 0 failed agency followups
Depends on
Required by
- (none)
Log
- 2026-04-26T22:44:51.956963443+00:00 Task paused
- 2026-04-26T22:44:52.101707576+00:00 Task published
- 2026-04-26T23:01:43.837317789+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer role matches CLI implementation work; Careful tradeoff suits recovery logic requiring correctness/reliability; 52 prior tasks indicate relevant experience.
- 2026-04-26T23:01:44.632639123+00:00 Spawned by coordinator --executor claude --model opus
- 2026-04-26T23:01:54.892256902+00:00 Starting work: implementing wg recover command for batch recovery from mass-failure
- 2026-04-26T23:25:58.442661864+00:00 Implemented wg recover. 15/15 unit tests pass; smoke-tested in /tmp/wg-recover-smoke: dry-run lists failures, --yes resets user tasks + abandons followups, --set-model edits before retry, --max-attempts skips exhausted, filters work (status/id-prefix/error/attempts)
- 2026-04-26T23:26:26.198409502+00:00 Committed: 933ce1ce0 — pushed to remote
- 2026-04-26T23:27:00.889492770+00:00 Task marked as done