design-review-agency

Design review: agency-pipeline constraint-fidelity lint (agent-337)

Metadata

Status: done
Assigned: agent-168
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Created: 2026-04-26T23:13:10.252404610+00:00
Started: 2026-04-26T23:29:19.969788272+00:00
Completed: 2026-04-26T23:48:32.965567447+00:00
Tags: eval-scheduled
Eval score: 0.91
  └ blocking impact: 0.90
  └ completeness: 0.95
  └ coordination overhead: 0.90
  └ correctness: 0.95
  └ downstream usability: 0.88
  └ efficiency: 0.80
  └ intent fidelity: 0.37
  └ style adherence: 0.90

Description

Archive ref refs/archive/wg/agent-337/agency-pipeline-should adds a constraint-fidelity lint to the agency evaluation pipeline:

  • New file: src/agency/constraint_fidelity.rs (+543 lines)
  • Modifies: src/agency/mod.rs, src/agency/prompt.rs, src/agency/types.rs, src/commands/evaluate.rs, src/commands/show.rs
  • Adds tests in: tests/integration_agency.rs, tests/integration_verify_first.rs, tests/prompt_snapshots.rs, tests/test_prompt_from_components.rs
  • Total: +762 lines across 10 files

The cherry-pick-valuable task explicitly flagged this as needing design review, not blind cherry-pick. The audit doc docs/audit-unmerged-branches-2026-04-26.md should be consulted.

Required scope of review

  1. Read the diff: git diff $(git merge-base main refs/archive/wg/agent-337/agency-pipeline-should)..refs/archive/wg/agent-337/agency-pipeline-should
  2. Review the new constraint_fidelity.rs module — what does it do, how does it integrate with existing FLIP scoring, does it duplicate any existing logic?
  3. Check whether main has since added similar functionality (the agency module evolved during the wave-1 evaluation work)
  4. Decide: cherry-pick as-is / cherry-pick with modifications / redesign / drop
  5. If accepting, do the cherry-pick (+ resolve any conflicts) and add tests

Validation

  • Design decision documented in this task description
  • If proceeding: cherry-pick committed, cargo build + cargo test pass with no NEW failures (pre-existing fix-wg-done failures excluded)
  • If dropping: rationale documented; archive ref left in place for future re-evaluation

Design Decision: CHERRY-PICK AS-IS (committed 7b47846c9)

Source ref: refs/archive/wg/agent-337/agency-pipeline-should (commit 04a8d0c6e)

What it adds

  • New module src/agency/constraint_fidelity.rs (+543 lines, 19 unit tests)
  • Deterministic regex-based lint that detects orchestrator-fabricated gating constraints in task descriptions:
    • prohibition (do not, don't, must not, shall not, should not + verb)
    • absolute_never (never + verb)
    • gating_action (leave as draft, wait for review, drafts only, require approval, etc.)
    • restrictive_conditional (only if/when/after, not until)
  • Heuristic find_originating_user_message() that scans <dir>/chat/*/inbox.jsonl for user-role messages within a 60s-before / 600s-after window of task creation
  • Concept-anchoring with adjacency: e.g. "draft" anchors "publishing" constraints, "gating" anchors "restriction"
  • Wires constraint_fidelity as a deterministic dimension into the agency evaluation pipeline (parallel to intent_fidelity/FLIP):
    • New eval_source::CONSTRAINT_FIDELITY constant
    • Two new fields on EvaluatorInput: constraint_fidelity_score, constraint_fidelity_unanchored
    • Mechanically injected into evaluator prompt with explicit "do not score this yourself" note
    • Surfaced in wg show output (with yellow flag if score < 0.5)
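
The category list and the message-window heuristic above can be sketched as follows. This is a simplified illustration, not the code in constraint_fidelity.rs: it uses plain substring matching instead of regexes, does not check for a following verb, and the names ConstraintKind, classify, and in_window are hypothetical. Treating the 60s / 600s bounds as inclusive is also an assumption.

```rust
/// Hypothetical names mirroring the four categories listed in the review.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum ConstraintKind {
    Prohibition,
    AbsoluteNever,
    GatingAction,
    RestrictiveConditional,
}

/// Classify one sentence by substring matching on its lowercased form.
/// Assumption: the real lint is regex-based and stricter; this sketch only
/// looks for the trigger phrase and returns the first category that matches.
fn classify(sentence: &str) -> Option<ConstraintKind> {
    let s = sentence.to_lowercase();
    const GATING: [&str; 4] =
        ["leave as draft", "wait for review", "drafts only", "require approval"];
    const CONDITIONAL: [&str; 4] = ["only if ", "only when ", "only after ", "not until "];
    const PROHIBITION: [&str; 5] =
        ["do not ", "don't ", "must not ", "shall not ", "should not "];

    if GATING.iter().any(|p| s.contains(p)) {
        Some(ConstraintKind::GatingAction)
    } else if CONDITIONAL.iter().any(|p| s.contains(p)) {
        Some(ConstraintKind::RestrictiveConditional)
    } else if PROHIBITION.iter().any(|p| s.contains(p)) {
        Some(ConstraintKind::Prohibition)
    } else if s.contains("never ") {
        Some(ConstraintKind::AbsoluteNever)
    } else {
        None
    }
}

/// True if a user message falls in the window from 60 seconds before to
/// 600 seconds after task creation (both timestamps in Unix seconds).
fn in_window(task_created_unix: i64, msg_unix: i64) -> bool {
    let delta = msg_unix - task_created_unix;
    (-60..=600).contains(&delta)
}
```

With these definitions, classify("Leave as draft until the owner signs off") yields Some(ConstraintKind::GatingAction), while an ordinary instruction such as "Add tests for the parser" yields None.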

Why cherry-pick (vs redesign or drop)

  1. No overlap with main: the agency module evolved during the wave-1 evaluation work, but intent_fidelity/FLIP measures the self-consistency of the description against agent behavior. Both inherit any fabricated constraint, so they still look consistent. Constraint-fidelity is the missing complementary lens.
  2. Zero conflicts: git merge-tree --no-messages main 04a8d0c6e produced no conflict markers; cherry-pick auto-merged cleanly.
  3. Real failure mode this targets: the existing memory note feedback_autopoietic_means_publish.md documents that the orchestrator was historically inserting "leave as draft" gating that wasn't in the user's request. This lint is purpose-built to catch exactly that.
  4. Modest, well-isolated: +762 lines, 543 of them in one new module. The 17 call-site additions to EvaluatorInput constructors are mechanical (constraint_fidelity_score: None, constraint_fidelity_unanchored: None).
  5. Free at runtime: pure-regex deterministic lint, no LLM call.
  6. Generous false-positive avoidance: test_normal_task_description_no_false_positives, test_validation_section_not_flagged, and test_should_not_match_only_alone all cover common-case task descriptions.
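
To illustrate why the call-site additions are mechanical, here is a minimal sketch of two optional fields defaulting to None and of a flag rendered below 0.5. The struct is a hypothetical, trimmed-down stand-in for EvaluatorInput: only the two field names come from this review, while the field types, the render_constraint_fidelity helper, and the output format are assumptions.

```rust
/// Hypothetical stand-in for EvaluatorInput. Only the two field names are
/// taken from the review; their types are assumed for illustration.
#[derive(Debug, Default)]
struct EvaluatorInput {
    constraint_fidelity_score: Option<f64>,
    constraint_fidelity_unanchored: Option<Vec<String>>,
}

/// Render one summary line, or None when the lint did not run.
/// Assumption: the real `wg show` output colors the line yellow; this
/// sketch appends a plain-text marker instead.
fn render_constraint_fidelity(input: &EvaluatorInput) -> Option<String> {
    let score = input.constraint_fidelity_score?;
    let unanchored = input
        .constraint_fidelity_unanchored
        .as_ref()
        .map_or(0, |v| v.len());
    let flag = if score < 0.5 { " [flagged]" } else { "" };
    Some(format!(
        "constraint_fidelity: {:.2} ({} unanchored){}",
        score, unanchored, flag
    ))
}
```

An existing constructor that does not run the lint can pass None for both fields (here via EvaluatorInput::default()), in which case nothing is rendered.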

Known limitations (acceptable for initial deployment, can calibrate later)

  • The generic_gating benefit-of-the-doubt path is permissive: if the user message contains ANY restriction language ("don't deploy until tests pass"), all constraints in the description get auto-anchored. This will under-report fabrication when the user gave one constraint but the orchestrator added five. Calibration data from production runs can tighten this.
  • find_originating_user_message is best-effort and untested at integration level (only the lint logic has unit tests). It returns None if the chat dir layout shifts or the task was created via CLI without chat context, in which case the lint degrades gracefully to standalone mode.
  • Standalone-mode score floor of 0.1 means 6+ unanchored constraints all score the same.
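
The under-reporting in the first limitation can be made concrete with a small sketch. The permissive function follows the behavior described above (any restriction language in the user message anchors every constraint). The strict variant is a purely hypothetical alternative added for contrast, not something the branch implements, and the marker list and all function names are assumptions.

```rust
/// Illustrative restriction phrases; not the module's actual patterns.
const RESTRICTION_MARKERS: [&str; 4] = ["don't ", "do not ", "not until ", "only after "];

fn user_has_restriction_language(user_message: &str) -> bool {
    let m = user_message.to_lowercase();
    RESTRICTION_MARKERS.iter().any(|p| m.contains(p))
}

/// Permissive rule as described in the review: if the user message contains
/// any restriction language, every constraint counts as anchored.
fn unanchored_permissive(constraints: &[&str], user_message: &str) -> usize {
    if user_has_restriction_language(user_message) {
        0
    } else {
        constraints.len()
    }
}

/// Hypothetical stricter rule: a constraint is anchored only if one of its
/// words longer than four bytes also appears in the user message.
fn unanchored_strict(constraints: &[&str], user_message: &str) -> usize {
    let m = user_message.to_lowercase();
    constraints
        .iter()
        .filter(|c| {
            let lc = c.to_lowercase();
            !lc.split_whitespace().any(|w| w.len() > 4 && m.contains(w))
        })
        .count()
}
```

For the user message "don't deploy until tests pass" and a description carrying that constraint plus five added ones ("leave as draft", "wait for review", "require approval", "never publish", "drafts only"), the permissive rule reports 0 unanchored constraints, while the stricter sketch reports 5.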

Validation

  • cargo build: passes (warnings only, all pre-existing)
  • cargo test --lib: 1971/1971 pass
  • cargo test --lib agency::constraint_fidelity: 19/19 pass
  • cargo test --test integration_agency: 5/5 pass
  • cargo test --test integration_verify_first: 8/8 pass
  • cargo test --test prompt_snapshots snapshot_evaluator: 4/4 pass
  • cargo test --test test_prompt_from_components: 13/13 pass
  • The 5 snapshot_build_prompt_* failures in prompt_snapshots.rs are PRE-EXISTING on main (verified by running tests on HEAD~1 before this cherry-pick); they involve build_prompt rendering, not the evaluator prompt that agent-337 modifies.

Commit

7b47846c9 feat: add constraint-fidelity lint to agency evaluation pipeline
Pushed to origin/wg/agent-168/design-review-agency.

Depends on

Required by

Log