design-agency-sync

Design: agency sync with agentbureau/agency — autopoietic spark, fan out the alignment work

Metadata

Status: done
Assigned: agent-2342
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Model: claude:opus
Created: 2026-05-04T17:28:58.772994839+00:00
Started: 2026-05-04T17:31:42.322841408+00:00
Completed: 2026-05-04T17:48:28.614914201+00:00
Tags: priority-high, design, autopoietic, agency, sync, eval-scheduled
Eval score: 0.93
└ blocking impact: 0.94
└ completeness: 0.96
└ coordination overhead: 0.92
└ correctness: 0.94
└ downstream usability: 0.97
└ efficiency: 0.89
└ intent fidelity: 0.88
└ style adherence: 0.93

Description

Workgraph's agency system (roles, tradeoffs, agents, evaluation, evolve loop) needs deep alignment with the agentbureau/agency repo on GitHub. The user wants an exact match in the agent definition pipeline. Asymmetry between the two systems is producing drift in how agents are defined and evolved.

User direct quote 2026-05-04: 'we should have an agency sync task with https://github.com/agentbureau/agency we want to exactly match it in the agent definition pipeline. needs deep alignment. assign codex:gpt-5.5 to everything. or at least, we should spawn an autopoietic spark that does the fanout.'

This task is the autopoietic spark. It follows the same pattern as design-nex-chat: the design investigates, identifies deltas, then FILES the fan-out subgraph (--paused) of research + impl + synthesis tasks. The chat agent releases the subgraph on the next user prompt by calling wg publish <root> --wcc.

Investigation areas

1. Read the agentbureau/agency repo

Start at https://github.com/agentbureau/agency. Map the agent definition pipeline:

  • How are roles defined? Schema, fields, validation
  • How are tradeoffs / motivations defined?
  • How are agents (role + tradeoff) instantiated?
  • How are evaluations performed?
  • How is the evolve loop structured?
  • Persistence format, file layout, content-hash identity, federation primitives
  • Any concrete CSV / YAML / TOML / JSON schemas the project ships
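One thing worth pinning down while mapping the pipeline is how each system derives a content-hash identity, since two systems only federate cleanly if the same primitive yields the same ID. The sketch below is a minimal illustration, assuming SHA-256 over a canonical JSON encoding; neither system's actual scheme is confirmed here. (The 64-hex-character agent identity in this task's metadata has the length of a SHA-256 digest, but that is only suggestive.) The role fields are hypothetical.

```python
import hashlib
import json


def content_hash(primitive: dict) -> str:
    """Return a hex SHA-256 digest of a canonical JSON encoding of `primitive`.

    ASSUMPTION: canonical form = sorted keys, compact separators, UTF-8.
    The real canonicalization used by either system must be read from source.
    """
    canonical = json.dumps(
        primitive, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical role definitions; field names are illustrative only.
role_a = {"name": "reviewer", "description": "Reviews diffs", "skills": ["rust", "testing"]}
role_b = {"skills": ["rust", "testing"], "description": "Reviews diffs", "name": "reviewer"}

# Key order does not affect the ID, but any change to content does.
same_id = content_hash(role_a) == content_hash(role_b)
changed_id = content_hash(role_a) != content_hash({**role_a, "name": "Reviewer"})
print(same_id, changed_id)
```

If the two systems canonicalize differently (key order, whitespace, which fields are hashed), identical primitives would get different IDs, which is exactly the kind of delta the investigation should record.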

2. Compare with workgraph's current agency

Workgraph's agency system already exists (per CLAUDE.md):

  • Roles, tradeoffs, agents in .wg/agency/
  • FLIP scoring, evaluation pipeline
  • Federation via content-hash IDs

Diff each:

  • Where do field names differ?
  • Where do file formats differ?
  • Where are concepts in one system but not the other?
  • Where are the workgraph implementations stricter or looser than agentbureau/agency expects?
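The diff questions above can be answered mechanically once both schemas are written down as field-to-type maps. The following is a rough sketch of that comparison; the field names and type strings are hypothetical placeholders, not taken from either codebase.

```python
def diff_fields(workgraph: dict, upstream: dict) -> dict:
    """Compare two schemas given as {field_name: type_string} maps."""
    wg_keys, up_keys = set(workgraph), set(upstream)
    shared = wg_keys & up_keys
    return {
        # Concepts present in one system but not the other.
        "only_in_workgraph": sorted(wg_keys - up_keys),
        "only_in_upstream": sorted(up_keys - wg_keys),
        # Same field name, different declared type (stricter or looser).
        "type_mismatch": sorted(k for k in shared if workgraph[k] != upstream[k]),
    }


# Hypothetical role schemas for illustration only.
wg_role = {"name": "str", "description": "str", "skills": "list[str]", "desired_outcome": "str"}
up_role = {"name": "str", "description": "str", "skills": "str", "outcome": "str"}

report = diff_fields(wg_role, up_role)
for category, fields in report.items():
    print(category, fields)
```

A report like this per primitive (role, tradeoff, agent, evaluation) gives each research task a concrete starting delta to cite.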

3. Identify deep-alignment delta

Per the user's 'deep alignment' framing: aim to make workgraph's pipeline either a strict superset or an exact match of agentbureau/agency's. With a superset, agentbureau/agency primitives drop into workgraph unchanged; with an exact match, primitives are interchangeable in both directions.

Note any places where workgraph has GOOD reasons to diverge (e.g., domain-specific evaluation thresholds). Document those as intentional, not as drift.
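The superset-versus-exact-match decision, and the split between intentional divergence and drift, can be expressed as simple set operations. This is a sketch under the assumption that each system's primitives can be reduced to a set of field names; `eval_threshold` is a made-up example of a documented divergence.

```python
def classify_alignment(workgraph: set[str], upstream: set[str]) -> str:
    """Classify workgraph's field set relative to upstream's."""
    if workgraph == upstream:
        return "exact match"
    if workgraph > upstream:  # proper superset: everything upstream has, plus more
        return "strict superset"
    return "divergent"


def split_delta(workgraph: set[str], upstream: set[str], intentional: set[str]) -> dict:
    """Partition differing fields into documented divergences and unexplained drift."""
    delta = workgraph ^ upstream  # fields present in exactly one system
    return {
        "intentional": sorted(delta & intentional),
        "drift": sorted(delta - intentional),
    }


# Hypothetical field sets for illustration only.
upstream_fields = {"name", "description", "skills"}
workgraph_fields = {"name", "description", "skills", "eval_threshold"}

status = classify_alignment(workgraph_fields, upstream_fields)
delta = split_delta(workgraph_fields, upstream_fields, intentional={"eval_threshold"})
print(status, delta)
```

Anything that lands in `drift` is a candidate for a research task; anything in `intentional` should carry a written rationale.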

Subgraph the design must file (autopoietic deliverable)

After investigation, file these subgraph tasks via wg add --paused --no-place:

Research fan-out (parallel, all --model claude:opus)

  • 1+ research tasks per major delta area (schemas, evaluation, evolve loop, federation, etc.)
  • Each posts findings via wg log with file:line citations + concrete fix proposal

Implementation fan-out (parallel where possible, all --model codex:gpt-5.5)

  • 1 impl task per research item
  • Each implements the alignment for its area
  • Cross-tested via an integration smoke test

Cross-model peer review (parallel after each impl, all --model claude:opus)

  • 1 peer-review task per impl, mirroring the design-nex-chat pattern
  • Reviewer reads the impl's diff + smoke output, posts concur/concern verdict

Integration impl (single, --model codex:gpt-5.5)

  • Wires all the area-fixes together; ensures workgraph's agency exactly matches agentbureau/agency's pipeline shape

Fan-in synthesis (single, --model claude:opus)

  • Verifies cross-area composition
  • Runs an end-to-end smoke: a workgraph project's agency primitives are byte-for-byte loadable by agentbureau/agency tooling (or vice versa, per the chosen alignment direction)
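One way to make "byte-for-byte loadable" testable is a round-trip check: parse a persisted primitive, re-serialize it, and require identical bytes. The sketch below uses Python's `json` module as a stand-in for both systems' loaders and assumes a compact, sorted-key JSON persistence format; the real smoke test would call each project's actual loader and format.

```python
import json


def dump_canonical(obj) -> bytes:
    """Serialize with a fixed layout (ASSUMED: sorted keys, compact separators)."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def round_trips(raw: bytes) -> bool:
    """True if parsing `raw` and re-serializing reproduces exactly the same bytes."""
    return dump_canonical(json.loads(raw)) == raw


# Hypothetical agent primitive for illustration only.
agent = {"role": "reviewer", "tradeoff": "thorough"}

canonical_bytes = dump_canonical(agent)
pretty_bytes = json.dumps(agent, indent=2).encode("utf-8")

print(round_trips(canonical_bytes))  # same layout on both sides
print(round_trips(pretty_bytes))     # same content, different layout
```

The second case is the interesting one: content that is semantically equal but laid out differently fails a byte-level check, so the synthesis task needs to decide whether byte equality or semantic equality is the acceptance bar.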

Deliverable

wg log entry on this task with:

  • Investigation findings (current state of both systems + delta map)
  • Subgraph shape decided (specific task list with names + dependencies + rationale for parallel-vs-serial)
  • All sub-tasks filed via wg add --paused --no-place
  • Final note: 'subgraph filed, ready for publish — chat agent should run wg publish <root> --wcc to release'

Validation

  • agentbureau/agency repo investigated, key primitives documented
  • Delta map produced (current workgraph vs target alignment)
  • Subgraph filed: research + impl + peer-review + integration + synthesis tasks all present
  • Subgraph dependencies wired correctly (--after chains for the right ordering)
  • Per-task model assigned per the pattern (research=opus, impl=codex:gpt-5.5, peer-review=opus, integration=codex:gpt-5.5, synthesis=opus)
  • Subgraph remains paused (--paused) — chat agent releases via wg publish in next user turn
  • No source / doc modifications outside filing the subgraph
  • Design doc posted via wg log
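Several of the validation items above (all task kinds present, model per kind, everything paused, dependencies well-formed) can be checked against a plain description of the filed subgraph. The sketch below does not call wg; it validates an in-memory task map. The task IDs, the dict layout, and the exact dependency ordering between peer review and integration are assumptions for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Model assignment pattern stated in this task.
EXPECTED_MODEL = {
    "research": "claude:opus",
    "impl": "codex:gpt-5.5",
    "peer-review": "claude:opus",
    "integration": "codex:gpt-5.5",
    "synthesis": "claude:opus",
}

# Hypothetical subgraph for a single delta area; "after" lists prerequisite task IDs.
tasks = {
    "research-schemas": {"kind": "research", "model": "claude:opus", "after": [], "paused": True},
    "impl-schemas": {"kind": "impl", "model": "codex:gpt-5.5", "after": ["research-schemas"], "paused": True},
    "review-schemas": {"kind": "peer-review", "model": "claude:opus", "after": ["impl-schemas"], "paused": True},
    "integrate": {"kind": "integration", "model": "codex:gpt-5.5", "after": ["review-schemas"], "paused": True},
    "synthesize": {"kind": "synthesis", "model": "claude:opus", "after": ["integrate"], "paused": True},
}


def validate(tasks: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the checks pass."""
    problems = []
    present = {t["kind"] for t in tasks.values()}
    problems += [f"missing {kind} task" for kind in EXPECTED_MODEL if kind not in present]
    for task_id, t in tasks.items():
        if t["model"] != EXPECTED_MODEL.get(t["kind"]):
            problems.append(f"{task_id}: unexpected model {t['model']}")
        if not t["paused"]:
            problems.append(f"{task_id}: not paused")
        problems += [f"{task_id}: unknown dependency {d}" for d in t["after"] if d not in tasks]
    try:
        # static_order() raises CycleError if the dependencies are not a DAG.
        list(TopologicalSorter({k: v["after"] for k, v in tasks.items()}).static_order())
    except CycleError:
        problems.append("dependency cycle detected")
    return problems


problems = validate(tasks)
print(problems or "subgraph checks passed")
```

Running the same checks against the real filed subgraph would require exporting it from wg in some structured form, which is outside what this sketch assumes.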

Coordinate

This is unrelated to the in-flight README chain. Different code surface, different concern. Should run in parallel with that chain without merge conflicts (agency code vs README docs).

Process note

Same autopoietic pattern as design-nex-chat. Worth extracting as a wg func once it lands — 'design-and-fanout' becomes a reusable function for any 'investigate then build a subgraph' pattern. Out of scope for this task.

Depends on

Required by

Log