wg-recover-clean — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-158`
Agent identity	`f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e`
Created	2026-04-26T22:44:51.965153186+00:00
Started	2026-04-26T23:01:44.632631960+00:00
Completed	2026-04-26T23:27:00.889482821+00:00
Tags	`eval-scheduled`
Eval score	0.82
└ blocking impact	0.85
└ completeness	0.80
└ coordination overhead	0.85
└ correctness	0.85
└ downstream usability	0.80
└ efficiency	0.82
└ intent fidelity	0.84
└ style adherence	0.78

## Description

After a credit exhaustion or other systemic failure, many tasks end up in the Failed state. Recovery today is manual: a shell loop over `wg list --status failed` that runs `wg retry` on each user task and `wg abandon` on each agency followup, optionally editing the per-task model first. It works, but it is obscure.

The user wants a first-class command for this:

> 'find all fukt. maybe change the config of them to use a diff endpoint optionally. and then just restart them'

### Spec

`wg recover [OPTIONS]` — surveys failed tasks and resets them in one operation.

| Flag | Behavior |
|------|----------|
| `--dry-run` (default if no other flags) | Print what would happen: how many failed, how many user vs agency, what would be retried/abandoned, no changes |
| `--yes` | Execute the plan |
| `--filter <expr>` | Only act on tasks matching: status=failed (default), or by tag, by id-prefix, by attempt-count, by error-pattern |
| `--set-model <model>` | Edit failed user-tasks to use this model BEFORE retry (e.g. `--set-model 'openrouter:anthropic/claude-sonnet-4-6'` to switch to a different provider) |
| `--set-endpoint <url>` | Same idea for endpoint URL |
| `--keep-agency` | Don't abandon `.evaluate-*` / `.flip-*` / `.assign-*` followups (default: abandon them so they regenerate fresh from the parent) |
| `--max-attempts <N>` | Skip tasks that already have attempt-count >= N (default: 5; protects against infinite retry loops) |
| `--reason <msg>` | Tag retried tasks with a recovery-reason log entry (for audit trail) |
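
The per-task decision implied by `--keep-agency` and `--max-attempts` can be sketched as below. This is a minimal illustration, not the committed implementation: `Action`, `FailedTask`, `is_agency_followup`, and `plan_action` are hypothetical names, and the sketch assumes that `--max-attempts` is checked only for user tasks and that `--keep-agency` leaves followups untouched.

```rust
// Hypothetical sketch: these types and functions are illustrative and are
// not taken from the wg codebase.

/// What a recovery pass would do with one failed task.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Action {
    /// User task: reset it so it runs again.
    Retry,
    /// Agency followup: abandon it so it regenerates from its parent.
    Abandon,
    /// Agency followup kept because --keep-agency was given.
    Leave,
    /// User task whose attempt count already reached --max-attempts.
    Skip,
}

/// Minimal view of a failed task (assumed fields).
pub struct FailedTask<'a> {
    pub id: &'a str,
    pub attempts: u32,
}

/// Id prefixes the spec lists as agency followups.
const AGENCY_PREFIXES: [&str; 3] = [".evaluate-", ".flip-", ".assign-"];

pub fn is_agency_followup(id: &str) -> bool {
    AGENCY_PREFIXES.iter().any(|prefix| id.starts_with(prefix))
}

pub fn plan_action(task: &FailedTask<'_>, max_attempts: u32, keep_agency: bool) -> Action {
    if is_agency_followup(task.id) {
        if keep_agency {
            Action::Leave
        } else {
            Action::Abandon
        }
    } else if task.attempts >= max_attempts {
        // Mirrors "attempt-count >= N" from the flag table.
        Action::Skip
    } else {
        Action::Retry
    }
}
```

Keeping the decision in a pure function like this would let dry-run and `--yes` share one plan, with only the execution step differing.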

### Example flows

**Credit exhaustion, switch provider via `--set-model`, retry everything:**
```
wg recover --filter status=failed --set-model 'openrouter:anthropic/claude-sonnet-4-6' --reason 'credit-exhaustion-2026-04-26' --yes
```

**Just see what's failed:**
```
wg recover --dry-run
```

**Retry only the recent failures (skip exhausted-retry tasks):**
```
wg recover --max-attempts 3 --yes
```
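
The `--filter status=failed` argument in the first flow is a `key=value` expression. A parser for such expressions might look like the sketch below. Only the `status=failed` form appears in the spec; the other key names (`tag`, `id-prefix`, `attempts`, `error`) and the `Filter` and `parse_filter` names are assumptions made for illustration.

```rust
// Hypothetical sketch: only `status=...` is confirmed by the spec. The other
// key names are placeholders and may differ from the real command.

#[derive(Debug, PartialEq, Eq)]
pub enum Filter {
    Status(String),
    Tag(String),
    IdPrefix(String),
    Attempts(u32),
    ErrorPattern(String),
}

/// Parse a single `key=value` filter expression.
pub fn parse_filter(expr: &str) -> Result<Filter, String> {
    // Split on the first '=' only, so values may themselves contain '='.
    let (key, value) = expr
        .split_once('=')
        .ok_or_else(|| format!("expected key=value, got '{}'", expr))?;

    match key.trim() {
        "status" => Ok(Filter::Status(value.to_string())),
        "tag" => Ok(Filter::Tag(value.to_string())),
        "id-prefix" => Ok(Filter::IdPrefix(value.to_string())),
        "attempts" => value
            .parse::<u32>()
            .map(Filter::Attempts)
            .map_err(|e| format!("bad attempt count '{}': {}", value, e)),
        "error" => Ok(Filter::ErrorPattern(value.to_string())),
        other => Err(format!("unknown filter key '{}'", other)),
    }
}
```

Rejecting unknown keys up front, rather than silently matching nothing, seems the safer choice for a command that resets tasks in bulk.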

### Output

Always prints a structured summary, in both dry-run and execute modes:
```
=== wg recover: 33 failed tasks ===
User tasks (will retry): 11
Agency followups (will abandon, auto-recreate): 22
Skipped (max-attempts exceeded): 0

User-task changes:
  endpoint-inheritance-opt    attempt #2  model→ openrouter:...
  tui-log-view                attempt #2  model→ openrouter:...
  ...

Apply with --yes
```
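
The summary header could be rendered from three counts, as in the sketch below. `Summary` and `render_header` are hypothetical names. The sketch assumes that the total in the banner is the sum of the three counts (consistent with 11 + 22 + 0 = 33 in the sample) and that the `Apply with --yes` hint is printed only in dry-run mode. The per-task "User-task changes" section is left out.

```rust
// Hypothetical sketch: struct and function names are illustrative only.

/// Counts shown in the summary header.
pub struct Summary {
    pub retry: usize,
    pub abandon: usize,
    pub skipped: usize,
}

/// Render the header lines of the summary, plus the dry-run hint.
pub fn render_header(s: &Summary, dry_run: bool) -> String {
    // Assumption: every failed task falls into exactly one bucket.
    let total = s.retry + s.abandon + s.skipped;

    let mut out = String::new();
    out.push_str(&format!("=== wg recover: {} failed tasks ===\n", total));
    out.push_str(&format!("User tasks (will retry): {}\n", s.retry));
    out.push_str(&format!(
        "Agency followups (will abandon, auto-recreate): {}\n",
        s.abandon
    ));
    out.push_str(&format!("Skipped (max-attempts exceeded): {}\n", s.skipped));

    if dry_run {
        // Assumption: the hint is only useful when nothing was applied.
        out.push_str("\nApply with --yes\n");
    }
    out
}
```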

### Out of scope

- Recovering from non-failure states (Abandoned, etc.) — that's manual triage
- Cross-repo recovery — single repo at a time
- Auto-detecting credit-exhaustion patterns and recovering automatically — separate future feature

## Validation

- [ ] Failing tests first: `test_recover_dry_run_lists_failures`, `test_recover_yes_resets_user_tasks`, `test_recover_yes_abandons_agency_followups`, `test_recover_set_model_edits_before_retry`, `test_recover_max_attempts_skips_exhausted`
- [ ] Implementation makes tests pass
- [ ] `cargo build` + `cargo test` pass with no regressions
- [ ] Manual smoke: in a scratch dir with mixed failed tasks, `wg recover --dry-run` accurately reports; `--yes` executes; final state has 0 failed user tasks and 0 failed agency followups

Depends on

done .assign-wg-recover-clean

Required by

(none)

Log

2026-04-26T22:44:51.956963443+00:00 Task paused
2026-04-26T22:44:52.101707576+00:00 Task published
2026-04-26T23:01:43.837317789+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer role matches CLI implementation work; Careful tradeoff suits recovery logic requiring correctness/reliability; 52 prior tasks indicate relevant experience.
2026-04-26T23:01:44.632639123+00:00 Spawned by coordinator --executor claude --model opus
2026-04-26T23:01:54.892256902+00:00 Starting work: implementing wg recover command for batch recovery from mass-failure
2026-04-26T23:25:58.442661864+00:00 Implemented wg recover. 15/15 unit tests pass; smoke-tested in /tmp/wg-recover-smoke: dry-run lists failures, --yes resets user tasks + abandons followups, --set-model edits before retry, --max-attempts skips exhausted, filters work (status/id-prefix/error/attempts)
2026-04-26T23:26:26.198409502+00:00 Committed: 933ce1ce0 — pushed to remote
2026-04-26T23:27:00.889492770+00:00 Task marked as done