remove-validation-cli — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-92`
Agent identity	`f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e`
Created	2026-04-26T13:03:27.081375627+00:00
Started	2026-04-26T19:43:08.589328667+00:00
Completed	2026-04-26T20:04:51.331372524+00:00
Tags	`eval-scheduled`
Eval score	0.05
└ blocking impact	0.05
└ completeness	0.00
└ coordination overhead	0.15
└ correctness	0.00
└ downstream usability	0.00
└ efficiency	0.10
└ intent fidelity	0.04
└ style adherence	0.00

## Description

User decision: hard validation gates always get fucked up; agency evaluator (auto_evaluate + FLIP) handles validation by reading the `## Validation` section of task descriptions. Remove the `--validation` CLI flag and all references that suggest using it.

### Files to clean (from `grep -rn -- '--validation' src/`)

- `src/cli.rs:237,241,453` — `--validation` flag definition on `add` and `edit` commands. Remove the flag entirely. If feasible behind a deprecation cycle, keep accepting it for one release with a warning that it is a no-op.
- `src/config.rs:2715` — comment 'New tasks should use --validation=llm instead.' — remove or rewrite to point at the agency-evaluator pattern.
- `src/service/executor.rs:239,242,408,487,490,723,760,763,2871-2874` — agent-prompt text + tests asserting the prompt mentions `--validation`. Replace with: 'put validation criteria in the `## Validation` section of the task description; the agency evaluator scores against it.'
- `src/commands/edit.rs:269` — error message that suggests `--validation=llm`. Drop the suggestion.
- `src/commands/quickstart.rs:212,214` — VALIDATION section in quickstart output. Rewrite to describe the agency evaluator path with no flag.
- `wg add` runtime hint that prints '2. Use --validation=llm for an LLM verification gate at completion time' (likely in the same area). Remove option 2 entirely; only option 1 (`## Validation` in `-d`) remains.
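
The one-release deprecation path described for `src/cli.rs` could look roughly like the sketch below. It is a stdlib-only illustration, not the actual `wg` argument parser: `strip_deprecated_flags` and the warning wording are hypothetical, and the two extra flag names are taken from the implementation note in this task's log. It also assumes that in the space-separated form (`--flag value`) the next token not starting with `-` is the flag's value.

```rust
/// Flags kept as warned no-ops for one release (names taken from this task's log).
const DEPRECATED_FLAGS: [&str; 3] = ["--validation", "--validator-agent", "--validator-model"];

/// Hypothetical helper, not the actual `wg` code: drops deprecated flags and
/// their values from `args`, returning the remaining arguments and one warning
/// per dropped flag.
pub fn strip_deprecated_flags(args: &[String]) -> (Vec<String>, Vec<String>) {
    let mut kept = Vec::new();
    let mut warnings = Vec::new();
    let mut iter = args.iter().peekable();
    while let Some(arg) = iter.next() {
        // `--flag=value` and bare `--flag` both reduce to the flag name.
        let name = arg.split('=').next().unwrap_or(arg.as_str());
        if !DEPRECATED_FLAGS.iter().any(|flag| *flag == name) {
            kept.push(arg.clone());
            continue;
        }
        // Assumption: in the `--flag value` form, the next non-dash token is the value.
        let has_separate_value =
            !arg.contains('=') && iter.peek().map_or(false, |next| !next.starts_with('-'));
        if has_separate_value {
            iter.next();
        }
        warnings.push(format!(
            "warning: {name} is deprecated and has no effect; \
             put criteria in the `## Validation` section of the task description"
        ));
    }
    (kept, warnings)
}
```

A caller would print each warning to stderr and hand the remaining arguments to the normal parser, so `wg add 'test' --validation=llm` proceeds as if the flag had not been passed.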

### What stays

- `PendingValidation` task status itself stays — it can still be entered via the agency-driven path (low FLIP score, eval gate, manual flag).
- `wg approve` / `wg reject` stay — the human-review path off PendingValidation is unchanged.
- Anything in the agency evaluator code that scores `## Validation` sections stays.
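
For orientation, reading a `## Validation` section out of a task description can be sketched as below. This is only an illustration of the idea: the function name `validation_section` is hypothetical, and the real agency evaluator may locate or score the section differently. The sketch assumes the section runs from its heading to the next `#` or `##` heading, so `###` subsections stay inside it.

```rust
/// Hypothetical helper, not the agency evaluator's code: returns the body of
/// the `## Validation` section of a task description, if the heading exists.
pub fn validation_section(description: &str) -> Option<String> {
    let mut lines = description.lines();
    // Advance past the heading line; bail out with None if it is absent.
    lines.by_ref().find(|line| line.trim() == "## Validation")?;
    // Collect until the next heading of the same or a higher level.
    let body: Vec<&str> = lines
        .take_while(|line| !(line.starts_with("## ") || line.starts_with("# ")))
        .collect();
    Some(body.join("\n").trim().to_string())
}
```

A description without the heading yields `None`, which a caller could treat as "no criteria supplied".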

### Out of scope

- Removing PendingValidation as a status (not asked).
- Removing legacy `--verify` (already removed; only doc text remained and was cleaned in a prior turn).

## Validation

- [ ] Failing tests first:
  - `test_cli_add_no_validation_flag`: `wg add --help` no longer mentions `--validation`; passing `--validation=llm` errors with 'unknown flag' OR (if deprecation kept) prints a one-line warning and is otherwise a no-op
  - `test_quickstart_no_validation_flag`: `wg quickstart` output mentions `## Validation` section but not the `--validation` flag
  - `test_executor_prompt_no_validation_flag`: agent prompts assembled in `build_prompt` (or wherever) do not contain the string '--validation'
- [ ] Implementation removes the flag and rewrites prompt/quickstart text per spec
- [ ] `cargo build` + `cargo test` pass with no regressions
- [ ] Manual smoke: `wg add 'test' --validation=llm` either errors cleanly or warns + ignores; `wg quickstart` output describes only the agency-evaluator path; `wg add 'test'` followed by inspecting the dispatched agent's prompt shows no `--validation` mention
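
The three tests above share one core assertion, which can be sketched as a small text check. The helper below is hypothetical and stdlib-only; `text` stands in for captured `wg add --help` output, `wg quickstart` output, or an assembled agent prompt, and how that text is captured in the real test suite is not specified here.

```rust
/// Hypothetical check, not part of `wg`: fails if `text` mentions the removed
/// flag, and optionally requires a pointer to the `## Validation` section.
pub fn check_no_validation_flag(text: &str, expect_section_hint: bool) -> Result<(), String> {
    if text.contains("--validation") {
        return Err("text still mentions the --validation flag".to_string());
    }
    if expect_section_hint && !text.contains("## Validation") {
        return Err("text does not point at the `## Validation` section".to_string());
    }
    Ok(())
}
```

Help output would be checked with `expect_section_hint` set to `false`, while quickstart output and agent prompts would set it to `true`, since those are the places the spec asks to describe the `## Validation` path.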

Depends on

done .assign-remove-validation-cli

Required by

(none)

Log

2026-04-26T13:03:27.081206128+00:00 Task paused
2026-04-26T16:03:14.701283261+00:00 Task published
2026-04-26T16:14:03.395026810+00:00 Spawned by coordinator --executor native --model claude-opus-4-6
2026-04-26T16:14:03.420921452+00:00 Task marked as failed: Agent exited with code 1
2026-04-26T16:17:15.865315989+00:00 Task reset for retry from failed (attempt #2)
2026-04-26T18:59:15.542969030+00:00 Spawned by coordinator --executor native --model opus
2026-04-26T18:59:15.567052630+00:00 Task marked as failed: Agent exited with code 1
2026-04-26T19:10:28.311441194+00:00 Task reset for retry from failed (attempt #3)
2026-04-26T19:10:30.732581880+00:00 Spawned by coordinator --executor native --model opus
2026-04-26T19:10:30.761514161+00:00 Task marked as failed: Agent exited with code 1
2026-04-26T19:43:06.336238674+00:00 Task reset for retry from failed (attempt #4)
2026-04-26T19:43:08.589332634+00:00 Spawned by coordinator --executor claude --model opus
2026-04-26T19:43:17.744882766+00:00 Starting removal of --validation flag from CLI/prompts/docs
2026-04-26T20:03:51.640033453+00:00 Implementation complete: hidden --validation/--validator-agent/--validator-model on add+edit (deprecation, no-op + warning); removed --validation references from quickstart, executor prompts, decomposition templates, spawn/context, coordinator_prompt_fallback, config comment. Added integration_remove_validation_cli.rs with 5 tests (all pass).
2026-04-26T20:04:05.048100041+00:00 Validated: cargo build pass; full cargo test 2818 pass / 1 fail (provenance_full_lifecycle, pre-existing per main, unrelated); 5/5 new tests pass; wg add --help has no --validation; wg quickstart describes ## Validation section path with no flag; wg add --validation=llm prints deprecation warning and proceeds as no-op.
2026-04-26T20:04:40.705935651+00:00 Committed: 05fde8aba — pushed to remote
2026-04-26T20:04:51.331386750+00:00 Task marked as done