impl-agency-scope-rules — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-2426`
Agent identity	`3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3`
Model	`claude:opus`
Created	2026-05-04T17:41:42.429620224+00:00
Started	2026-05-04T19:50:51.646288712+00:00
Completed	2026-05-04T20:22:09.867267030+00:00
Tags	`agency,sync,impl`, `eval-scheduled`
Eval score	0.72
└ blocking impact	0.80
└ completeness	0.70
└ constraint fidelity	0.85
└ coordination overhead	0.75
└ correctness	0.76
└ downstream usability	0.66
└ efficiency	0.80
└ intent fidelity	0.80
└ style adherence	0.75

Description

Implement the scope+composition decisions from research-agency-scope-rules. Two likely deliverables:

1. Add primitive `scope` field (`task` | `meta:assigner` | `meta:evaluator` | `meta:evolver` | `meta:agent_creator`) populated on import; thread it through the composer in src/agency/prompt.rs so that, e.g., `.evaluate-*` task selection biases toward `scope=meta:evaluator` primitives.
2. Add `~/.agency/composition-rules.csv` watched overlay: parser, file-watcher, integration with the assigner. It caps `max_role_components` / `max_desired_outcomes` / `max_trade_off_configs` per `agent_type`.
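
A minimal sketch of the first deliverable, assuming the scope values parse from plain strings and that "biases toward" means preferred ordering rather than a hard filter. The `Scope` enum, the `Primitive` struct, and `bias_toward_scope` are hypothetical names for illustration; the real `scope` field lives in src/agency/types.rs and may be shaped differently. Treating an empty value as `task` follows the backwards-compat item in the validation list.

```rust
use std::str::FromStr;

/// Scope of a primitive. Variants mirror the values listed in the description.
/// `Task` is the default so primitives without a scope keep working.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Scope {
    #[default]
    Task,
    MetaAssigner,
    MetaEvaluator,
    MetaEvolver,
    MetaAgentCreator,
}

impl FromStr for Scope {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            // Assumption: an empty column on import means the legacy default.
            "" | "task" => Ok(Scope::Task),
            "meta:assigner" => Ok(Scope::MetaAssigner),
            "meta:evaluator" => Ok(Scope::MetaEvaluator),
            "meta:evolver" => Ok(Scope::MetaEvolver),
            "meta:agent_creator" => Ok(Scope::MetaAgentCreator),
            other => Err(format!("unknown scope: {other}")),
        }
    }
}

/// Hypothetical stand-in for a primitive; only the fields the sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct Primitive {
    pub name: String,
    pub scope: Scope,
}

/// Moves primitives whose scope equals `wanted` to the front.
/// The sort is stable, so relative order within each group is preserved.
pub fn bias_toward_scope(mut prims: Vec<Primitive>, wanted: Scope) -> Vec<Primitive> {
    // `false` sorts before `true`, so matching primitives come first.
    prims.sort_by_key(|p| p.scope != wanted);
    prims
}
```

Under these assumptions, an `.evaluate-*` task would call `bias_toward_scope(prims, Scope::MetaEvaluator)` before selection, and `task`-scoped primitives would remain available as fallbacks.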

File scope

src/agency/prompt.rs (composer scope-aware selection)
src/agency/run_mode.rs (functional-agent dispatch)
src/commands/assign.rs (composition-rules consumption)
src/agency/store.rs (composition-rules.csv reader)
tests/integration_agency_scope_rules.rs

Do NOT touch:

src/agency/types.rs (owned by impl-agency-schema-fields — scope field added there)
src/agency/hash.rs
src/commands/agency_import.rs (owned by impl-agency-csv-roundtrip)

Validation

Failing test written first: test_evaluator_composition_prefers_meta_evaluator_scope
composition-rules.csv parsed; cap fields actually constrain selection at assignment time
File-watch semantics verified (reload after edit without daemon restart)
Backwards-compat: existing primitives without `scope` field default to `task`
cargo build + cargo test pass
Live smoke: write a composition-rules.csv with `assigner,*,2,1,1,true,` and confirm the next `.assign-*` task respects the cap
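
The smoke-test row can be read as one rule with three caps. The sketch below is an illustration only: the column order is an assumption inferred from the order in which the caps are named in the description, and the `selector` and `flag` fields are hypothetical names for the `*` and `true` columns, whose meaning this page does not specify. The trailing empty column is ignored. `within_role_cap` mirrors the behaviour reported in the log (a candidate with more role components than the cap is dropped; no rule means no filtering).

```rust
/// One row of composition-rules.csv, under the assumed column order:
/// agent_type, selector, max_role_components, max_desired_outcomes,
/// max_trade_off_configs, flag, (trailing column ignored).
#[derive(Debug, Clone, PartialEq)]
pub struct CompositionRule {
    pub agent_type: String,
    /// Hypothetical name for the second column (`*` in the smoke-test row).
    pub selector: String,
    pub max_role_components: usize,
    pub max_desired_outcomes: usize,
    pub max_trade_off_configs: usize,
    /// Hypothetical name for the boolean column; its meaning is not documented here.
    pub flag: bool,
}

/// Parses a single comma-separated rule line. No quoting or escaping is handled.
pub fn parse_rule(line: &str) -> Result<CompositionRule, String> {
    let cols: Vec<&str> = line.split(',').map(str::trim).collect();
    if cols.len() < 6 {
        return Err(format!("expected at least 6 columns, got {}", cols.len()));
    }
    // Helper that parses one numeric cap column and reports its index on failure.
    let cap = |i: usize| {
        cols[i]
            .parse::<usize>()
            .map_err(|e| format!("column {i}: {e}"))
    };
    Ok(CompositionRule {
        agent_type: cols[0].to_string(),
        selector: cols[1].to_string(),
        max_role_components: cap(2)?,
        max_desired_outcomes: cap(3)?,
        max_trade_off_configs: cap(4)?,
        flag: cols[5]
            .parse::<bool>()
            .map_err(|e| format!("column 5: {e}"))?,
    })
}

/// Returns true when a candidate with `role_component_count` components is allowed.
/// With no rule loaded (for example, no file present) nothing is filtered.
pub fn within_role_cap(rule: Option<&CompositionRule>, role_component_count: usize) -> bool {
    match rule {
        Some(r) => role_component_count <= r.max_role_components,
        None => true,
    }
}
```

With the row `assigner,*,2,1,1,true,`, a candidate carrying 7 role components fails `within_role_cap`, while the same candidate passes when no rule is loaded.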

Depends on

done .assign-impl-agency-scope-rules

Required by

Log

2026-05-04T17:41:42.398864026+00:00 Task paused
2026-05-04T18:36:41.360611657+00:00 Task published
2026-05-04T18:45:28.158092169+00:00 Spawned by coordinator --executor codex --model gpt-5.5
2026-05-04T18:45:39.295070982+00:00 Evaluator starting review of completed implementation
2026-05-04T18:46:29.954003684+00:00 Evaluation finding: branch has 0 commits ahead of main; required implementation/test files show no composition-rules overlay and no integration_agency_scope_rules.rs
2026-05-04T18:48:07.796255660+00:00 Task marked as failed: Evaluation score 0.02: no implementation commits or source diff are present on this branch; required composition-rules overlay/parser/watcher/assigner integration and requested integration test are absent. Existing scope filtering appears pre-existing and only partially related.
2026-05-04T19:50:29.379180593+00:00 RETRY GUIDANCE 2026-05-04: prior attempt (agent-2387, codex:gpt-5.5) failed with 0.02 eval score. Eval finding: 'no implementation commits or source diff are present on this branch; required composition-rules overlay/parser/watcher/assigner integration and requested integration test are absent. Existing scope filtering appears pre-existing and only partially related.'
ROOT CAUSE PATTERN: agent read existing scope-filtering code, concluded the work was 'already done', and marked the task complete without a diff. This is the SAME failure mode as fix-supervisor-restart-backoff (also 0.04 eval score, also no commits, also marked done).
MODEL SWAP: claude:opus instead of codex:gpt-5.5 for this retry, trading Codex's bias toward 'verify it's already there' for opus's bias toward 'design + write'.
STRICT SCOPE REMINDERS: The research task (research-agency-scope-rules) identified specific deliverables. Read it via `wg show research-agency-scope-rules` and follow its proposal LITERALLY. The 'existing scope filtering' is NOT the deliverable; these pieces are missing:
- Composition-rules overlay (likely a new module / new code)
- Parser for the rules format
- Watcher / dispatcher integration so rules apply at task assignment
- Integration with assigner (agency assignment uses these rules)
- New integration test (`tests/integration_agency_scope_rules.rs` per the eval's complaint)
If existing code is partially related, EXTEND or REFACTOR it; do not silently assume it's complete.
VALIDATION REQUIREMENTS (stronger this time):
- [ ] git diff main..HEAD must show NEW commits (verify with `git log main..HEAD` showing 1+ commits)
- [ ] git diff main..HEAD must show file additions / modifications matching the research's proposal; paste the diff stat in the task log
- [ ] New integration test file present and passing
- [ ] Eval score >= 0.7 on retry (vs 0.02 prior)
- [ ] Standard validation: cargo build + cargo test pass; cargo install --path . was run
- [ ] Call `wg done` AFTER verifying the diff is real, not before
DO NOT: claim work is done if existing code 'looks like' it covers the request. If the code DOES cover the request fully, post evidence (file:line citations) and explicitly say 'no new implementation needed; existing X at Y:Z covers all requirements'. That's an acceptable outcome but requires explicit articulation. Retry now.
2026-05-04T19:50:43.268930480+00:00 Task reset for retry from failed (attempt #2)
2026-05-04T19:50:51.646293451+00:00 Spawned by coordinator --executor claude --model opus
2026-05-04T19:51:30.350965724+00:00 Starting fresh impl: scope field on primitives + composition-rules.csv overlay parser/watcher/cap enforcement
2026-05-04T20:20:02.118040969+00:00 Implementation complete: composition_rules module (parser + mtime watcher + caps); typed scope field wired in run_mode.rs (component_scope/outcome_scope/tradeoff_scope + filter_components_by_required_scope); special_agent starter primitives now tagged with meta:assigner/meta:evaluator/meta:evolver/meta:agent_creator scopes; resolve_all_components_for_scope threaded into evaluate.rs evaluator identity; assign.rs auto-assign applies cap from ~/.agency/composition-rules.csv. Live smoke confirmed: cap=2 drops Default Assigner (7 comps), cap=10 lets it through, no file = no filter.
2026-05-04T20:21:49.014697626+00:00 Committed: 38c8f23a3 — pushed to remote
2026-05-04T20:22:09.867275096+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
2026-05-04T20:23:45.973179247+00:00 PendingEval → Done (evaluator passed; downstream unblocks)
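
The "mtime watcher" named in the 20:20:02 log entry can be pictured roughly as below. This is a sketch under stated assumptions, not the code from commit 38c8f23a3: `WatchedFile` and its `get` method are hypothetical names, and the approach (compare the file's modification time on each access and re-read only when it changes) is one common way to get "reload after edit without daemon restart". A missing file yields `None`, matching the "no file = no filter" observation.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::time::SystemTime;

/// Caches a file's contents and re-reads them only when the mtime changes.
pub struct WatchedFile {
    path: PathBuf,
    last_mtime: Option<SystemTime>,
    contents: Option<String>,
}

impl WatchedFile {
    pub fn new(path: impl Into<PathBuf>) -> Self {
        WatchedFile {
            path: path.into(),
            last_mtime: None,
            contents: None,
        }
    }

    /// Returns the current contents, or `None` when the file does not exist.
    /// Other I/O errors are propagated to the caller.
    pub fn get(&mut self) -> io::Result<Option<&str>> {
        // Look up the current modification time; absence is not an error.
        let mtime = match fs::metadata(&self.path) {
            Ok(meta) => Some(meta.modified()?),
            Err(e) if e.kind() == io::ErrorKind::NotFound => None,
            Err(e) => return Err(e),
        };
        // Reload only when the file appeared, disappeared, or was modified.
        if mtime != self.last_mtime {
            self.contents = match mtime {
                Some(_) => Some(fs::read_to_string(&self.path)?),
                None => None,
            };
            self.last_mtime = mtime;
        }
        Ok(self.contents.as_deref())
    }
}
```

One caveat of this design: two writes that land within the filesystem's timestamp granularity can share an mtime, so the second edit may go unnoticed until the file is touched again. Comparing file size or a content hash alongside the mtime would narrow that window.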