merge-archive-into-graph

Merge archived tasks back into graph.jsonl so wg html renders them

Metadata

Status: done
Assigned: agent-621
Agent identity: f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e
Created: 2026-05-01T20:51:59.350609177+00:00
Started: 2026-05-01T20:52:29.446390349+00:00
Completed: 2026-05-01T21:06:08.328565287+00:00
Tags: merge, wg-recovery, eval-scheduled
Eval score: 0.84
└ blocking impact: 0.95
└ completeness: 0.95
└ constraint fidelity: 0.70
└ coordination overhead: 0.90
└ correctness: 0.91
└ downstream usability: 0.93
└ efficiency: 0.87
└ intent fidelity: 0.81
└ style adherence: 0.90

Description

Goal

The 113 user-facing project tasks (TUBB8/OR4F deep research, copy-number-weighted ORA arc, pipeline steps, syntheses, doc fixes) were auto-archived at 2026-05-01T19:52Z and now live in .wg/archive.jsonl. They are invisible to wg html / wg viz because those tools only read graph.jsonl. Goal: get them back into graph.jsonl so wg html shows the full research dependency graph.

Why not use wg archive --undo / wg archive restore

Both are buggy in this install:

  • wg archive --undo reproducibly corrupts archive.jsonl (3 malformed lines per invocation, records concatenated without newlines). Verified twice in a row.
  • wg archive restore claims success but the task never appears in wg list/show. Daemon log shows tasks_ready=3, spawned=0 plus broken-pipe errors — likely stale daemon in-memory state.
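
The corruption pattern described above (records concatenated without newlines) can be detected, and in simple cases split apart, with the standard library alone. The following is a minimal sketch, not the method used to produce the recovered archive.jsonl; the function names find_malformed_lines and split_concatenated are hypothetical helpers introduced for illustration.

```python
import json


def find_malformed_lines(path):
    """Return 1-based numbers of non-blank lines that are not exactly one JSON value."""
    bad = []
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, 1):
            if not line.strip():
                continue  # blank lines are ignored, not reported
            try:
                json.loads(line)
            except json.JSONDecodeError:
                # Covers both broken JSON and "Extra data" from fused records.
                bad.append(lineno)
    return bad


def split_concatenated(line):
    """Decode every JSON value found back-to-back on one physical line."""
    decoder = json.JSONDecoder()
    text = line.strip()
    records, pos = [], 0
    while pos < len(text):
        obj, pos = decoder.raw_decode(text, pos)
        records.append(obj)
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
    return records
```

split_concatenated only helps when the fused records are each intact; a line truncated mid-record still raises json.JSONDecodeError and needs manual repair.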

Do the merge directly via JSONL file manipulation instead.

Inputs (all in /moosefs/erikg/phrs/.wg/)

  • archive.jsonl (113 valid task records, all kind:task) — recovered version, verified 0 parse errors
  • graph.jsonl (457 system task records — .coordinator-/.assign-/.evaluate-/.flip-)
  • archive-last-batch.json (canonical list of the 113 archived IDs)
  • Existing backups: archive.jsonl.recovered, graph.jsonl.bak-2026-05-01, archive.jsonl.broken-2026-05-01, archive.jsonl.corrupt-pre-swap, archive.jsonl.corrupt-undo-2

Procedure

  1. wg service stop (daemon won't see external file edits while running)
  2. cp graph.jsonl graph.jsonl.pre-merge-$(date -u +%Y%m%dT%H%M%SZ)
  3. cp archive.jsonl archive.jsonl.pre-merge-$(date -u +%Y%m%dT%H%M%SZ)
  4. Python script:
    • Read each line of archive.jsonl as JSON
    • For each record, drop archive-only bookkeeping keys: unplaced, last_resurrected_at, resurrection_count, session_id, triage_count
    • Append the cleaned record to graph.jsonl as one JSON object per line, terminated by \n
    • Use json.dumps(obj, separators=(',',':')) to keep records compact and free of embedded newlines
  5. Verify integrity: every line of graph.jsonl is now valid JSON (json.loads each line); total task count is 570 (457 + 113); all 113 IDs from archive-last-batch.json appear in graph.jsonl
  6. Truncate archive.jsonl to empty, or rename it to archive.jsonl.merged-out-$(date -u +%Y%m%dT%H%M%SZ) so the filename carries the same UTC timestamp format as the backups; handle archive-last-batch.json the same way
  7. wg service start — wait until 'wg service status' reports running
  8. Run: wg html --out /moosefs/erikg/phrs/public (no --since filter; we want the full graph)
  9. Spot-check the rendered HTML: grep for these IDs in public/index.html or task pages: investigate-great-rgreat, integrate-final-recommendations, executive-summary-combined, deep-research-tubb8, fact-check-validate
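
Steps 4 and 5 above can be sketched as follows. This is one possible shape for the script under stated assumptions, not the script the agent actually ran: the function names are hypothetical, and the record ID field is assumed to be called "id" (the task does not name it), so id_key may need changing.

```python
import json
from pathlib import Path

# Archive-only bookkeeping keys listed in step 4.
ARCHIVE_ONLY_KEYS = frozenset(
    {"unplaced", "last_resurrected_at", "resurrection_count", "session_id", "triage_count"}
)


def load_jsonl(path):
    """Parse a JSONL file, failing loudly with file and line on the first bad record."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, 1):
            if not line.strip():
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{lineno}: {exc}") from exc
    return records


def merge_archive_into_graph(archive_path, graph_path):
    """Append cleaned archive records to the graph file; return how many were appended."""
    # Parse everything first so a bad archive line aborts before the graph is touched.
    cleaned = [
        {k: v for k, v in rec.items() if k not in ARCHIVE_ONLY_KEYS}
        for rec in load_jsonl(archive_path)
    ]
    graph = Path(graph_path)
    # Guard against fusing the first appended record onto an unterminated last line.
    needs_newline = (
        graph.exists() and graph.stat().st_size > 0
        and not graph.read_bytes().endswith(b"\n")
    )
    with graph.open("a", encoding="utf-8") as out:
        if needs_newline:
            out.write("\n")
        for rec in cleaned:
            # Compact separators; json.dumps escapes newlines inside strings.
            out.write(json.dumps(rec, separators=(",", ":")) + "\n")
    return len(cleaned)


def verify_merge(graph_path, expected_ids, id_key="id"):
    """Return (record_count, sorted list of expected IDs missing from the graph)."""
    records = load_jsonl(graph_path)  # raises if any line is invalid JSON
    present = {rec.get(id_key) for rec in records}
    return len(records), sorted(set(expected_ids) - present)
```

With the inputs described above, a successful run would have merge_archive_into_graph return 113 and verify_merge return a count of 570 with an empty missing list. The expected IDs come from archive-last-batch.json, whose exact layout is not described here, so loading it is left to the caller.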

Acceptance

  • wg list --status done returns >= 113 user-facing (non-dot-prefixed) tasks
  • wg show investigate-great-rgreat returns the task body (currently 'Task not found')
  • wg html --out ./public completes without error and public/index.html exists
  • The HTML index references at least 113 of the previously-archived task IDs
  • archive.jsonl is empty (or moved aside)
  • All pre-merge backup files preserved on disk
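
The HTML criterion above can be checked mechanically rather than by eye. The sketch below counts how many of the previously archived IDs occur as plain substrings in a rendered page; ids_referenced is a hypothetical helper, and it assumes the IDs appear verbatim in the HTML (for example in link targets or titles), which is not confirmed by this task.

```python
from pathlib import Path


def ids_referenced(html_path, task_ids):
    """Return the set of task IDs that occur verbatim in the given HTML file."""
    text = Path(html_path).read_text(encoding="utf-8", errors="replace")
    # Plain substring match: an ID that is a prefix of another ID can
    # produce a false positive, so treat the result as a lower-effort check.
    return {task_id for task_id in task_ids if task_id in text}
```

Comparing len(ids_referenced("public/index.html", archived_ids)) against 113 covers the index criterion; the set difference against archived_ids names any IDs that still need a manual look.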

Side investigation (log findings to task log; do NOT block acceptance on these)

  • Check the wg config for any auto-archive / GC policy that fired today at 19:52 unprompted. Look at: wg config get, .wg/config.toml. Note any threshold settings. Goal: identify whether this can be disabled.
  • Check the daemon log for broken-pipe errors and tasks_ready/spawned mismatch: tail /moosefs/erikg/phrs/.wg/service/daemon.log
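
For the daemon-log check above, a small scan can stand in for reading the tail by hand. This is a sketch under assumptions: the daemon log format is not documented in this task, so the patterns are guesses based only on the tasks_ready=3, spawned=0 fragment and the broken-pipe wording quoted earlier, and scan_daemon_log is a hypothetical helper.

```python
import re

# Assumed patterns; adjust to the actual daemon.log line format.
BROKEN_PIPE = re.compile(r"broken[ -]?pipe", re.IGNORECASE)
READY_SPAWNED = re.compile(r"tasks_ready=(\d+).*?spawned=(\d+)")


def scan_daemon_log(lines):
    """Return 1-based line numbers of broken-pipe errors and ready-but-unspawned ticks."""
    broken_pipe, stalled = [], []
    for lineno, line in enumerate(lines, 1):
        if BROKEN_PIPE.search(line):
            broken_pipe.append(lineno)
        match = READY_SPAWNED.search(line)
        # A tick with ready tasks but nothing spawned is the mismatch of interest.
        if match and int(match.group(1)) > 0 and int(match.group(2)) == 0:
            stalled.append(lineno)
    return {"broken_pipe": broken_pipe, "ready_but_not_spawned": stalled}
```

The function accepts any iterable of lines, so an open file handle on /moosefs/erikg/phrs/.wg/service/daemon.log can be passed directly.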

Depends on

Required by

Log