audit-canonical-materials

Audit canonical materials vs ABSTRACT.md; produce REWRITE_PLAN.md (Nature companion + BoG-this-week)

Metadata

Status: done
Assigned: agent-747
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Created: 2026-05-05T23:11:13.132779574+00:00
Started: 2026-05-05T23:28:15.271074092+00:00
Completed: 2026-05-05T23:44:08.486594050+00:00
Tags: paper-prep, audit, rewrite-anchor, eval-scheduled
Eval score: 0.88
└ blocking impact: 0.88
└ completeness: 0.90
└ constraint fidelity: 0.85
└ coordination overhead: 0.86
└ correctness: 0.92
└ downstream usability: 0.87
└ efficiency: 0.82
└ intent fidelity: 0.87
└ style adherence: 0.88

Description

PRECONDITION: park-off-target task has completed. paper_prep/_brainstorming/ contains the off-target materials; paper_prep/synthesis/ now contains only canonical-or-undecided materials; paper_prep/synthesis/ABSTRACT.md is the anchoring document.

ANCHOR: paper_prep/synthesis/ABSTRACT.md — read this BEFORE doing anything else. Every classification, recommendation, and task-decomposition decision must serve the claims in that abstract.

VENUE & TIMELINE CONSTRAINTS (these change what counts as 'good'):

  • Target venue: NATURE, as a companion to the HPRC v2 main paper (pre-arranged for review alongside HPRC v2). This is non-negotiable framing for the introduction.
  • Submission target: ~2-3 weeks from today.
  • Lead author Erik Garrison is giving a talk at Biology of Genomes (BoG) THIS WEEK based on this work — figure improvements and synthesis material that strengthen the talk are HIGH PRIORITY and should be on a separate priority lane in the rewrite plan.
  • Nature formatting requirements (the rewrite plan must respect these — look up current Nature 'Article' format if uncertain): abstract length cap, main-text length cap, structured methods at the end (not interleaved), Extended Data figure conventions, reference cap, formatted display items. The plan does NOT need to reformat existing material in this task — it just needs to flag where current materials violate Nature constraints so future tasks address it.

KEY CLAIMS FROM ABSTRACT (the rewrite plan must show explicitly which task produces which claim):

  • C1. HPRC v2 companion-paper framing: must be in intro, not buried.
  • C2. Implicit pangenome graph: reference-free, all-to-all, ~12% pairwise sampling, no chromosomal partitioning. Methods + 1 figure or panel showing the approach.
  • C3. 466 near-complete haplotype assemblies from HPRC v2: dataset description.
  • C4. Genome-wide identity survey: extended (10s–100s kb) interchromosomal homology at nearly all subtelomeres, comparable to PAR2.
  • C5. NJ-tree cladistic analysis: expected (Xp/Yp, Xq/Yq via PARs, acrocentrics) + novel (10p–18p; the {22q,21q,19q,1q,13q,17q} clade; 4q–10q DUX4 with copy-number diversity; large moderate-similarity clade).
  • C6. PCA + community detection on similarity matrix → subtelomere clustering across human populations.
  • C7. Hi-C 3D maps testing nuclear-envelope-proximity recombination hypothesis.
  • C8. Synthesizing thesis: ongoing recombination shapes subtelomeres → 'concerted evolution and unorthodox recombination'.

PROCESS FAILURE TO FIX: The previously-rendered MANUSCRIPT_DRAFT.pdf passed acceptance ('≥5 pages, file=PDF') but contained NO visible figures — useless to a human reader. Future render tasks MUST verify figures appear inline (e.g., by counting embedded image objects or visually screenshotting page-by-page).
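The image-object count mentioned above could be approximated with a check along these lines. This is a minimal sketch, not the project's render pipeline: it scans the raw PDF bytes for image XObject dictionaries (`/Subtype /Image`) using only the standard library. The function names and the `min_images` threshold are hypothetical. The heuristic does not see inline images or figures embedded as vector drawings, so a page-by-page screenshot remains the stronger check.

```python
import re
from pathlib import Path

# Matches the dictionary entry that marks an image XObject, e.g. "/Subtype /Image".
# Heuristic only: inline images and vector (Form XObject) figures are not counted.
_IMAGE_XOBJECT = re.compile(rb"/Subtype\s*/Image\b")


def count_image_objects(pdf_bytes: bytes) -> int:
    """Count image XObject dictionaries found in raw PDF bytes."""
    return len(_IMAGE_XOBJECT.findall(pdf_bytes))


def render_has_figures(pdf_path: str, min_images: int) -> bool:
    """Return True if the file looks like a PDF and embeds at least min_images images.

    min_images is a hypothetical threshold: the number of figures the
    manuscript is expected to show inline.
    """
    data = Path(pdf_path).read_bytes()
    return data.startswith(b"%PDF-") and count_image_objects(data) >= min_images
```

A render task could then fail its acceptance step whenever `render_has_figures("MANUSCRIPT_DRAFT.pdf", min_images=N)` is false, with N set to the expected number of inline figures.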

TASK — produce three deliverables under paper_prep/synthesis/:

Deliverable 1: AUDIT_REPORT.md

Sections in order:

## 1. Figures audit (PRIORITY: Erik needs this for BoG this week)

For each fig{1..4} and ed{1..8} directory under paper_prep/figures/:
- Open the figure assets, captions, source scripts, READMEs, notes; actually look at what the figure shows, not just the directory name.
- For each figure, one row:
  Figure | What it actually shows | Maps to abstract claim (C1-C8)? | Aligned? (Y / partial / N) | Action: KEEP / MINOR-REVISE / REDO / SCRAP | Reasoning | BoG-talk value (high/med/low/none)

End with: list of figures missing for the abstract's claims (e.g., is there an identity heatmap for C4? a NJ tree for C5? a PCA + communities figure for C6? a Hi-C map for C7?).

## 2. Synthesis docs audit

For each .md file under paper_prep/synthesis/ (except ABSTRACT.md):
  Filename | Topic actually covered | Aligned with abstract? (Y/partial/N) | Salvageable content (specific section refs) | Reason for verdict

## 3. Data & code assets audit

For each abstract claim C1-C8, list specific data files / scripts / pipelines in the repo that support it, OR mark 'MISSING — needs to be produced.' Look for: HPRC v2 dataset path, implicit pangenome graph alignments (.paf, .gfa, .vcf, etc.), similarity matrix, NJ tree files, PCA/community detection outputs, Hi-C matrices.

## 4. Brainstorming inventory (one-pager)

Brief survey of paper_prep/_brainstorming/: what's there, what (if anything) might plausibly serve as supplementary material or footnote-level evidence for any C-claim. Default expectation: nothing in _brainstorming/ becomes canonical.

## 5. Process-failure note

Document the figures-not-rendered-in-PDF issue. Recommend the fix for future render tasks.
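The figures table in section 1 is easy to leave incomplete, so a coverage check may help. The sketch below assumes the directory names are literally fig1 to fig4 and ed1 to ed8 (the expansion of fig{1..4} and ed{1..8}) and that each table row carries the figure id in its first pipe-separated cell. Both function names are hypothetical, and the row layout is an assumption about how the report will be written.

```python
def expected_figures():
    """Figure ids named in the task: fig{1..4} and ed{1..8}."""
    return [f"fig{i}" for i in range(1, 5)] + [f"ed{i}" for i in range(1, 9)]


def figures_missing_from_report(report_text):
    """Return expected figure ids that have no table row in the report.

    Assumption: a row looks like '| fig1 | ... |' or 'fig1 | ...', i.e. the
    figure id is the first pipe-separated cell.
    """
    found = set()
    for line in report_text.splitlines():
        if "|" not in line:
            continue  # not a table row
        first_cell = line.strip().strip("|").split("|")[0].strip().lower()
        found.add(first_cell)
    return [fig for fig in expected_figures() if fig not in found]
```

An empty list from `figures_missing_from_report(open("paper_prep/synthesis/AUDIT_REPORT.md").read())` would indicate that every expected figure has a row.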

Deliverable 2: REWRITE_PLAN.md

Structure:

## Constraints (top of file)
- Target: Nature, companion to HPRC v2, submission ~2-3 weeks
- Nature format limits (cite current Nature Article guidelines)
- BoG talk this week: figure-improvement tasks tagged 'BoG-priority'

## Lane A: BoG-this-week (highest priority, deliverable within days)
Tasks for figure improvements that materially strengthen the talk. Should be a small set (5-10 tasks max).

## Lane B: Manuscript-for-Nature (priority, 2-3 week horizon)
Tasks for: methods writing, results sections (one per major claim), intro with HPRC-v2 companion framing, discussion with concerted-evolution thesis, references, Nature-format assembly, render-with-figures-verified.

## Total task count
Constraint: 25-60 tasks total across both lanes. Stop at 60. If your plan would exceed 60, prune.

## Task entry template (use for every task):
### TASK-NN:
Lane: A or B
Inputs: <specific files / data sources>
Output: <specific filename + format>
Acceptance: <one checkable condition; for any render task, MUST include 'figures visible inline at expected positions, verified by image-object count >= N or page screenshots'>
Depends on: <prior TASK-NN ids, or 'none'>

## Specific tasks that MUST appear:
- A task that establishes HPRC v2 companion-paper framing in the intro
- A task (or tasks) that write the implicit-pangenome-graph methods section (~12% pairwise sampling explanation)
- One task per abstract claim C4-C7 producing the corresponding results subsection
- A figure-render task that includes inline-figure verification

Deliverable 3: Commit message 'docs: AUDIT_REPORT + REWRITE_PLAN anchored on canonical ABSTRACT.md'

ACCEPTANCE:

  • paper_prep/synthesis/AUDIT_REPORT.md exists with all 5 sections; all figures + all synthesis .md files are in the tables.
  • paper_prep/synthesis/REWRITE_PLAN.md exists with both lanes; total task count between 25 and 60; every task has all 5 fields filled in; the explicitly-required tasks listed above are present.
  • Single commit with the specified message.
  • wg artifact records both files.
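The REWRITE_PLAN.md acceptance conditions (task count between 25 and 60, all five fields filled in for every task) are mechanical enough to check with a script. The following is a sketch under stated assumptions: task headings look like `### TASK-01: ...`, and each field sits on its own line as `Field: value`. The function name `validate_plan` and the message wording are hypothetical. It does not check that the explicitly required tasks are present, which still needs a human read.

```python
import re

# The five fields of the task entry template.
REQUIRED_FIELDS = ("Lane", "Inputs", "Output", "Acceptance", "Depends on")

# Assumed heading shape: '### TASK-<digits>' at the start of a line.
_TASK_HEADING = re.compile(r"^### (TASK-\d+)\b.*$", re.MULTILINE)


def validate_plan(text, min_tasks=25, max_tasks=60):
    """Return a list of problems found in the plan text (empty list means OK)."""
    problems = []
    headings = list(_TASK_HEADING.finditer(text))
    if not min_tasks <= len(headings) <= max_tasks:
        problems.append(
            f"task count {len(headings)} outside {min_tasks}-{max_tasks}"
        )
    for i, match in enumerate(headings):
        # A task body runs from its heading to the next task heading (or EOF).
        end = headings[i + 1].start() if i + 1 < len(headings) else len(text)
        body = text[match.end():end]
        for field in REQUIRED_FIELDS:
            # The field must start a line and have non-empty text on that line.
            pattern = rf"^{re.escape(field)}:[ \t]*\S"
            if not re.search(pattern, body, re.MULTILINE):
                problems.append(f"{match.group(1)}: missing or empty '{field}'")
    return problems
```

Running `validate_plan(open("paper_prep/synthesis/REWRITE_PLAN.md").read())` before committing would surface empty fields or an out-of-range task count early.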

DO NOT in this task:

  • Rewrite any section of the manuscript
  • Redo any figure
  • Run any data analysis
  • Move any file (already done in park-off-target)
  • Decompose figure rewrites into individual sub-figure tasks beyond what's needed for BoG lane — Lane B figure work can use one umbrella task per figure with the action verdict from the audit

Depends on

Required by

Log