Metadata
| Status | done |
|---|---|
| Assigned | agent-1214 |
| Agent identity | f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e |
| Model | claude:sonnet |
| Created | 2026-04-30T02:05:06.393358906+00:00 |
| Started | 2026-04-30T02:14:16.803892361+00:00 |
| Completed | 2026-04-30T02:24:45.099086657+00:00 |
| Tags | fix, docs, audit, eval-scheduled |
| Eval score | 0.89 |
| └ blocking impact | 0.95 |
| └ completeness | 0.88 |
| └ coordination overhead | 0.90 |
| └ correctness | 0.95 |
| └ downstream usability | 0.85 |
| └ efficiency | 0.90 |
| └ intent fidelity | 0.81 |
| └ style adherence | 0.95 |
Description
KEY_DOCS.md is the docs index. Audit for: dead links (refs to removed files), non-archive docs not indexed, descriptions stale relative to file content, design vs contributor-only categorization.
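The first two checks (dead links and unindexed non-archive docs) reduce to a set comparison. A minimal sketch, assuming the index mentions docs as repo-relative paths ending in `.md` and that archived docs live under `docs/archive/`; the function name, the regex, and the prefix are illustrative assumptions, not part of wg:

```python
import re

# Assumption: the index refers to docs by repo-relative paths ending in ".md".
PATH_RE = re.compile(r"[\w./-]+\.md\b")


def audit_index(index_text, existing_paths, archive_prefix="docs/archive/"):
    """Compare paths mentioned in the index against files that exist.

    Returns (dead, unindexed):
      dead      - paths the index mentions that are not in existing_paths
      unindexed - existing non-archive paths the index never mentions
    """
    indexed = set(PATH_RE.findall(index_text))
    existing = set(existing_paths)
    dead = sorted(indexed - existing)
    unindexed = sorted(
        p for p in existing - indexed if not p.startswith(archive_prefix)
    )
    return dead, unindexed
```

The caller would supply `existing_paths` from a listing of the repository. Stale descriptions and categorization still need a human or agent read; this only covers the mechanical part of the audit.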
Baseline
Determine baseline = last commit that touched this file by a human (heuristic: commits NOT authored by an agent — agent commits have agent IDs in commit messages; human commits do not). If unclear, fall back to 2026-04-12 (date of prior doc-sync-audit doc).
Audit scope = ALL changes to wg behavior / commands / config / state-machine / etc. SINCE that baseline. Not just today. Look at git log between baseline and HEAD.
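The baseline heuristic can be sketched as a scan over the file's commits, newest first. This is an illustration under assumptions: agent commits are recognized here by a trailing parenthetical ID in the subject line, and the `Commit` type, regex, and function name are hypothetical. A human subject that happens to end in a parenthetical would be misclassified, which is why the date fallback exists.

```python
import re
from dataclasses import dataclass

# Assumption: agent commits end with a parenthetical ID, e.g. "sync index (some-task)".
AGENT_SUFFIX = re.compile(r"\([A-Za-z0-9._-]+\)\s*$")
FALLBACK_DATE = "2026-04-12"  # date of the prior doc-sync-audit doc


@dataclass
class Commit:
    sha: str
    date: str      # ISO date, e.g. "2026-04-10"
    subject: str   # first line of the commit message


def pick_baseline(commits_newest_first):
    """Return ("commit", sha) for the newest human commit, else ("date", fallback)."""
    for commit in commits_newest_first:
        if not AGENT_SUFFIX.search(commit.subject):
            return ("commit", commit.sha)
    return ("date", FALLBACK_DATE)
```

The commit list would come from the git history of the audited file; how it is collected is left to the caller.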
Output (no source/doc changes)
Post findings via `wg log` as a structured list of deltas. Each delta = one specific update needed in the audited file. Format:
- Section X: <what is currently said vs what should be said, with citation>
- Missing: <feature/command/concept not mentioned>
- Stale:
- Inconsistent: <conflicts with another doc — note which>
The synthesis task (doc-sync-audit) reads all audit logs and applies updates. This task DOES NOT MODIFY FILES.
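One way to keep the delta list uniform for the synthesis task is to build it from typed entries before posting. A minimal sketch; the `Delta` class and `render_deltas` helper are hypothetical, and the exact arguments `wg log` expects are not shown here:

```python
from dataclasses import dataclass

# The four entry kinds named in the format above.
KINDS = {"Section", "Missing", "Stale", "Inconsistent"}


@dataclass
class Delta:
    kind: str          # one of KINDS
    detail: str        # what is said vs what should be said, with citation
    section: str = ""  # only used when kind == "Section"

    def render(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown delta kind: {self.kind}")
        label = f"Section {self.section}" if self.kind == "Section" else self.kind
        return f"- {label}: {self.detail}"


def render_deltas(deltas):
    """Join deltas into one message body, one entry per line."""
    return "\n".join(d.render() for d in deltas)
```

The rendered string is what would be posted as the log message; rejecting unknown kinds early keeps malformed entries out of the log the synthesis task reads.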
Validation
- Baseline commit identified (cite SHA or fall back to 2026-04-12 with reasoning)
- Delta list posted via wg log with structured entries
- No source / doc modifications — audit only
Depends on
Required by
Log
- 2026-04-30T02:05:06.374837289+00:00 Task paused
- 2026-04-30T02:07:26.877693740+00:00 Task published
- 2026-04-30T02:08:01.599429934+00:00 Lightweight assignment: agent=Default Evaluator (31847164), exec_mode=light, context_scope=task, reason=Audit tasks are fundamentally evaluations requiring documentation analysis and structured assessment; Evaluator has proven performance (0.86 score, 377 tasks) in evaluation work with read-only file access.
- 2026-04-30T02:14:16.803897981+00:00 Spawned by coordinator --executor claude --model sonnet
- 2026-04-30T02:14:30.293940861+00:00 Starting audit of docs/KEY_DOCS.md — checking messages, finding baseline commit
- 2026-04-30T02:23:32.059057756+00:00 Baseline identified: commit 95641d34c (2026-04-12), the last doc-sync agent commit for KEY_DOCS.md. All agent commits identified by parenthetical task-ID suffix. 552 commits since baseline. Audit scope is 2026-04-12 to HEAD.
- 2026-04-30T02:24:26.503688083+00:00 AUDIT FINDINGS — docs/KEY_DOCS.md

  Baseline: commit 95641d34c (2026-04-12, last doc-sync agent commit to KEY_DOCS.md)
  Audit scope: 2026-04-12 → HEAD (552 commits)

  ---

  ## Dead Links

  All 4 existing dead links (docs/plans/federation-and-distributed-sync.md, docs/reports/bug-report-dynamic-model-list-browsing.md, docs/research/primitive-pool-sync.md, docs/research/ranked-model-list.md) are already correctly marked with strikethrough in KEY_DOCS. No NEW dead links found.

  ---

  ## Missing: New files not indexed (since baseline)

  ### Embedded Documentation (high priority — agent entry points)

  - **Missing:** `AGENTS.md` — AI coding assistant contract (GitHub Copilot, Cursor, Claude Code non-wg context); contains chat-agent role contract, task-description requirements, smoke gate rules, new dispatcher/chat-agent/worker-agent glossary. Comparable to CLAUDE.md in role. Audience: AI agents.

  ### User-Facing Documentation

  - **Missing:** `docs/config-canonical.md` — Canonical config key audit: what to keep, what's stale, what each key does. Audience: Operators, contributors.
  - **Missing:** `docs/config-ux-design.md` — Config UX design: `wg config init`, `wg setup` improvements, migration strategy, TUI integration deferral. CLAUDE.md now refers to this as the canonical config reference. Audience: Users, operators, contributors.

  ### Design Documents (new, 2026-04-13 to 2026-04-29)

  - **Missing:** `docs/design/sessions-as-identity.md` — Unified handler/session model: claude, codex, nex as a common "session" abstraction | Design
  - **Missing:** `docs/design/sessions-as-identity-rollout.md` — Staged rollout plan for sessions-as-identity | Design
  - **Missing:** `docs/design/native-executor-run-loop.md` — Nex unified run loop + interruptibility + compaction design | Design
  - **Missing:** `docs/design/nex-as-coordinator.md` — Nex = task = evaluate = coordinate unification design | Design
  - **Missing:** `docs/design/nex-executor-improvements.md` — wg nex executor improvements toward self-bootstrapping | Design
  - **Missing:** `docs/design/nex-web-access-status-2026-04-16.md` — Nex web access status + next steps (2026-04-16) | Reference
  - **Missing:** `docs/design/verify-deprecation-plan.md` — --verify deprecation plan (--verify now errors; use ## Validation) | Implemented
  - **Missing:** `docs/design/llm-verification-gate.md` — LLM-based verification gate design (replaces --verify) | Design
  - **Missing:** `docs/design/model-config-propagation.md` — Model/config propagation from seed tasks to subgraphs | Design
  - **Missing:** `docs/design/pdf-binary-failure-handling.md` — PDF/binary attachment failure handling design | Implemented
  - **Missing:** `docs/design/chat-agent-persistence.md` — Chat agent persistence: tmux wrapper vs codex fix vs custom detach | Design
  - **Missing:** `docs/design/external-executor-class.md` — External executor class: claude, codex, amplifier as uniform family | Design
  - **Missing:** `docs/design/unified-path-forward.md` — Unified path forward plan (post-tool-fragmentation hardening) | Reference
  - **Missing:** `docs/design/tool-scoping-for-agents-research.md` — Tool scoping for native-executor agents research | Research
  - **Missing:** `docs/design-actor-driven-cleanup.md` — Actor-driven worktree cleanup: status-as-cleanup-intent design | Design
  - **Missing:** `docs/design-cow-worktrees.md` — Copy-on-write worktrees for agent isolation | Design
  - **Missing:** `docs/design-merge-task.md` — .merge-* tasks: visible, retryable, batchable merge surface | Design
  - **Missing:** `docs/design-agency-tasks-on-claude.md` — Migrate .evaluate/.flip/.assign tasks to claude CLI path | Reference
  - **Missing:** `docs/design-named-profiles.md` — Named profiles for runtime model/endpoint switching | Design (implemented in wg profile)
  - **Missing:** `docs/archival-design.md` — Archival behavior + cross-graph overlay design | Design

  ### Research Documents (new)

  - **Missing:** `docs/research/agent-lifecycle-and-kill-mechanics.md` — Agent lifecycle and kill mechanics research | Research
  - **Missing:** `docs/research/agent-wg-awareness.md` — Agent self-discovery of wg (for nex/Harbor context) | Research
  - **Missing:** `docs/research/context-window-determination.md` — Context length determination for OAI-compatible endpoints | Research
  - **Missing:** `docs/research/eval-wait-points.md` — When should agents wait on evaluation before terminating? | Research
  - **Missing:** `docs/research/model-propagation-subgraphs.md` — Model propagation within subgraphs | Research
  - **Missing:** `docs/research/qwen3-nex-config.md` — Qwen3 endpoint configuration for wg nex | Research
  - **Missing:** `docs/research/shell-verify-vs-llm-eval-gap.md` — Shell --verify vs LLM agency evaluation gap analysis | Research
  - **Missing:** `docs/research/thin-wrapper-executors-2026-04.md` — Thin-wrapper executors for wg nex (recommendation: pty-wrap OAI CLI) | Research
  - **Missing:** `docs/research/tui-detail-audit.md` — TUI 1:Detail view + evaluation visibility audit | Research
  - **Missing:** `docs/research/tui-evaluation-color-state.md` — TUI evaluation-state coloring (pink vs chartreuse) research | Research
  - **Missing:** `docs/research/verify-deprecation-survey.md` — --verify deprecation survey (factual basis for deprecation) | Research
  - **Missing:** `docs/research/README.md` — Research directory README (audience: contributors only) | Reference

  ### Designs Directory

  - **Missing:** `docs/designs/README.md` — Designs directory README (audience: contributors only) | Reference

  ### Plan Documents

  - **Missing:** `docs/plan-of-attack-wg-nex.md` — Plan of attack: wg nex + TerminalBench + Harbor (2026-04-21) | Reference

  ### Audit / Report Documents

  - **Missing:** `docs/audit-recovery-docs-2026-04-27.md` — Audit: recovery + outage workflows in agent-visible docs | Current
  - **Missing:** `docs/audit-unmerged-branches-2026-04-25.md` — Audit: unmerged agent branches (2026-04-25, read-only) | Current
  - **Missing:** `docs/audit-unmerged-branches-2026-04-26.md` — Audit: unmerged agent branches (2026-04-26, execution + root-cause) | Current
  - **Missing:** `docs/tui-coordinator-resumption-findings.md` — TUI coordinator resumption investigation findings | Current
  - **Missing:** `docs/codex-handler-merge-bug.md` — Bug: wg done silently drops staged-uncommitted worktree changes | Current
  - **Missing:** `docs/triage-wg-nex-small-model-reports-2026-04-27.md` — Triage: 3 small-model bug reports from wg nex (Qwen3) | Current

  ### Archive Section

  - **Missing:** `docs/archive/2026-04-17-rescued/` — Rescued engineering work from accidentally-lost commit eef15157; INDEX.md explains provenance. Should be added to Archive section.

  ---

  ## Stale: Descriptions that need updating

  - **Section User-Facing, docs/AGENT-SERVICE.md:** currently "Service daemon architecture: coordinator tick, dispatch cycle, agent lifecycle". Since baseline: `--executor` flag removed from `wg service start`/`wg service reload` (now derived from model spec), `wg service interrupt-coordinator` added, coordinator persistence (no-duplicate-on-restart) documented. Suggest: "Service daemon architecture: coordinator tick, dispatch cycle, agent lifecycle, model-spec handler routing".
  - **Section User-Facing, docs/AGENT-GUIDE.md:** currently "How spawned agents should think about task graphs: patterns, structures, anti-patterns". Since baseline: `wg retry` now also handles in-progress hung agents (SIGTERM → SIGKILL → reset); `wg agents kill` documented. Suggest: "How spawned agents should think about task graphs: patterns, structures, retry/kill mechanics, anti-patterns".
  - **Section User-Facing, docs/AGENT-LIFECYCLE.md:** currently "Hardened agent lifecycle: spawn, work, complete, die — full state machine". Since baseline: target-dir reaper section added (3 cleanup paths: inline at exit, coordinator periodic, wg sweep --reap-targets); coordinator persistence section added (section 10). Still accurate overall.
  - **Section User-Facing, docs/models.md:** currently "Model, endpoint, and API key management guide". CLAUDE.md now refers users to docs/config-ux-design.md as the canonical config reference, which is more up-to-date for config UX. Consider noting docs/config-ux-design.md as the companion/successor doc for config UX topics.
  - **Section Design, docs/design/native-executor.md:** status "Design". New docs (docs/design/native-executor-run-loop.md, docs/design/nex-as-coordinator.md, docs/design/nex-executor-improvements.md) are the current design documents. docs/design/native-executor.md may be superseded. Status should be noted as "Superseded" or "Historical".

  ---

  ## Inconsistent: Conflicts with other docs

  - **Inconsistent (AGENTS.md vs code):** AGENTS.md line "Use `--verify` to attach machine-checkable criteria" in the task-description template is wrong — `--verify` now errors out at runtime (see src/commands/add.rs: "--verify is deprecated and no longer accepted"). AGENTS.md template should use `## Validation` section instead. This conflicts with docs/design/verify-deprecation-plan.md and the actual binary behavior.
  - **Inconsistent (docs/COMMANDS.md vs code):** COMMANDS.md still documents `--verify <CRITERIA>` for wg add (line 55) and wg edit (line 154) as valid flags, and shows `wg add ... --verify "..."` examples (line 108). But --verify now errors. This conflicts with actual binary behavior and with docs/design/verify-deprecation-plan.md.
  - **Inconsistent (KEY_DOCS vs COMMANDS.md):** KEY_DOCS describes COMMANDS.md as "Complete CLI command reference with examples" — but COMMANDS.md is missing `wg secret` (new credential management command in src/commands/secret_cmd.rs) and `wg nex` (interactive multi-turn REPL command in src/commands/nex.rs). The "complete" claim is no longer accurate for these commands.

  ---

  ## Categorization

  - `docs/config-canonical.md` and `docs/config-ux-design.md` belong in **User-Facing Documentation** (operators), not "Other Documentation", since they serve users configuring wg.
  - `docs/designs/README.md` and `docs/research/README.md` are navigational READMEs; they belong in their respective sections (Design Documents and Research Documents).
  - `AGENTS.md` belongs in **Embedded Documentation** alongside `CLAUDE.md`, not "Other Documentation".
  - `docs/design-named-profiles.md` (root-level) is separate from `docs/research/profile-research.md` (already indexed) and `docs/plans/provider-profiles.md` (already indexed). It's the implementation design doc and should be in Design Documents.
- 2026-04-30T02:24:36.351358469+00:00 Validation: (1) Baseline commit 95641d34c identified and cited. (2) Delta list posted via wg log with structured entries covering dead links, missing, stale, inconsistent, and categorization issues. (3) No source or doc files modified — audit only.
- 2026-04-30T02:24:45.099114068+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
- 2026-04-30T02:26:09.113677011+00:00 PendingEval → Done (evaluator passed; downstream unblocks)