improve-bog-slides

Improve BoG slides v2: add Hi-C/3D, pedigree, gene enrichment; synthesize better

Metadata

Status: abandoned ‖ paused
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Created: 2026-05-06T00:16:08.488356314+00:00
Started: 2026-05-06T00:16:51.301140516+00:00
Tags: bog-slides, bog-this-week, lane-a, eval-scheduled
Failure reason: wrong shape: monolithic; refanned out as one task per slide

Description

GOAL: Produce a concrete, ready-to-build improvement plan for Erik Garrison's Biology of Genomes (BoG) talk THIS WEEK. The existing draft slide deck (slides/20260204_Subtelomics_overview_EG.pdf, 10 slides covering IMPG + identity heatmaps + community/PCA clustering) is solid on methodology and the cladistic findings but is MISSING three categories of material the lead author wants integrated, and needs better narrative synthesis throughout.

LEAD-AUTHOR DIRECTIVES (verbatim):

'really i want to hit the slides harder. add new material we have (hi-c! 3d coordinates, gene enrichment analyses, pedigree plots) and synthesize things better'

CONTEXT:

  • Talk: Biology of Genomes (BoG), THIS WEEK. Likely a 15-min slot (per the prior TALK_OUTLINE_15MIN.md scope; confirm if you find a different number elsewhere). Speaker: Erik Garrison.
  • The existing slide deck is a v1 draft from 2026-02-04. Erik is treating it as the working base.
  • The work is a companion paper to HPRC v2 (Nature; submission in ~2-3 weeks). The talk is the public preview, NOT the paper itself.
  • Pedigree work is canonical (NOT off-target): the abstract title 'concerted evolution and unorthodox recombination' is supported by pedigree analysis as direct empirical proof of ongoing exchange. End-to-end report chapter 14 (pedigree_recombination, 556 lines, second-longest chapter) is the substrate.
  • Gene enrichment is OK to mention in the talk as a fun aside / breadth slide, even though it is NOT in the canonical Nature manuscript. Different audience, different framing — slides ≠ paper.

INPUTS (read all of these):

Anchor:
  - paper_prep/synthesis/ABSTRACT.md (canonical claims C1-C8)

Existing deck (THE BASE WE ARE IMPROVING):
  - slides/20260204_Subtelomics_overview_EG.summary.md (8 KB; ChatGPT-extracted text from PDF, slide-by-slide)
  - slides/20260204_Subtelomics_overview_EG.pdf (3.3 MB; actual deck)

Substrate for missing content:
  - end-to-end-report/report/05_hic_validation.md (688 lines; Hi-C 3D. This is the longest chapter and your primary source for Hi-C/3D slides)
  - end-to-end-report/report/14_pedigree_recombination.md (556 lines; pedigree, the C8 evidence)
  - end-to-end-report/report/03_gene_enrichment.md (157 lines; gene enrichment material)
  - end-to-end-report/report/08_mouse.md (259 lines; mouse meiotic Hi-C, may serve a 3D-context slide)
  - end-to-end-report/report/07_integrated.md (110 lines; integrated synthesis, useful for the closer slide)
  - end-to-end-report/pedigree-plots/ (figure assets including ceph1463-hifiasm/.untangle.pdf)

Existing figure assets to reuse (may map directly to new slides):
  - paper_prep/figures/fig3/figure_fig3.{pdf,png} (HG002 Pore-C contact matrix + convergent evidence forest plot)
  - paper_prep/figures/ed5/figure_ed5.{pdf,png} (Hi-C multi-resolution + exclusion controls; a strong rigorous-defense visual)
  - paper_prep/figures/ed8/figure_ed8.{pdf,png} (causal feedback loop + D4Z4-CTCF-lamin tethering)
  - paper_prep/figures/fig1/figure_fig1.{pdf,png} (Erik already references the underlying analyses on slide 4)

Gene enrichment material (off-spec for paper, OK for talk):
  - paper_prep/_brainstorming/gene_copy_summary.csv, all_gene_copies_by_arm.csv, genome_wide_gene_copies.csv
  - paper_prep/_brainstorming/Figure1_GSEA_BP_vertical.pdf, Figure_GSEA_MF_vertical.pdf
  - paper_prep/brainstorming/or4f.md, deep_research_dux4_frg2.md

The crosswalk task currently running (read-end-to / agent-750) will produce paper_prep/synthesis/CROSSWALK.md, likely before this task needs it. If CROSSWALK.md exists when you start, read it for additional methodology / corrections context. If it doesn't, proceed without it.

DELIVERABLE: slides/SLIDES_v2_PLAN.md with these sections:

1. Existing 10-slide audit

For each slide 1-10 in the existing deck, one row:
  Slide # | Title | What it shows | Keep / Revise / Cut for v2 | Specific revisions needed (if any)
Be concrete. Examples:
  - Slide 8/9 PCA: confirm whether this is true PCA or MDS/PCoA mislabeled; if MDS, recommend label fix to align with abstract claim C6 OR run actual PCA.
  - Slide 10 Communities: annotate the C1-C15 community labels with the abstract's named clades (C1=DUX4, C2=10p-18p, C6={22q,21q,19q,1q,13q,17q}, C7=acrocentrics, C14=Xq/Yq PAR2, C15=Xp/Yp PAR1). This is a quick high-value revision.

2. Proposed new slides (slide 11 onwards) — at least 4 slide concepts covering:

a) **Hi-C / 3D coordinates** — bring in the spatial nuclear-organisation evidence. Recommend specific panels from end-to-end-report ch.05 (Pore-C contact matrix HG002, multi-resolution Mantel ED5b, exclusion-control robustness, within-vs-between community contact). Decide: 1 slide or 2-3 slides? This is C7 evidence and is currently absent from the deck.
b) **Pedigree as proof of ongoing recombination** — bring in WashU 3-gen T2T pedigree (CEPH1463) untangle ribbons from end-to-end-report ch.14 + pedigree-plots/. This is the C8 'unorthodox recombination' evidence. Likely 1 strong slide, possibly with a build / animation suggestion.
c) **Gene enrichment / DUX4-OR4F biology aside** — a fun-breadth slide on the gene-family copy-number diversity story. Keep it brief (1 slide) and frame as 'and the biology is interesting too' rather than core argument.
d) **Synthesis closer** — a final slide tying together: methodology (IMPG/all-vs-all/12% connectivity) → empirical (extended subtelomere homology, named clades) → mechanism (Hi-C 3D / nuclear envelope) → proof (pedigree exchange) → biology (DUX4 copy diversity) → thesis ('concerted evolution and unorthodox recombination'). 1 slide.
For each new slide:
  ### Proposed slide N: <Title>
  Anchor claim: C1-C8 reference
  Bullets: 3-5 bullets max, conference-style
  Primary figure: which file path? Existing or to-be-generated? If to-be-generated, propose a script.
  Speaker notes: 100-200 words of what Erik says
  Time budget: in seconds (talk total ~15 min)

3. Narrative arc upgrade

A 1-page narrative document that lays out the story Erik tells from slide 1 to closer. Identify:
  - The hook (slides 1-3): why subtelomeres? why now? why HPRC v2?
  - The methods reveal (slides 3-4): IMPG + 12% connectivity argument, in language a mixed-genomics audience parses fast
  - The empirical core (slides 5-10): heatmaps → communities → cladistic structure
  - The mechanism (new slides 11-13): Hi-C 3D / nuclear envelope
  - The proof (new slide 14): pedigree as direct evidence of ongoing exchange
  - The aside (new slide 15): gene-family biology
  - The closer (new slide 16): synthesis + thesis
Include 1-2 transition sentences between sections.

4. Figure-asset checklist

A bullet list of every figure used in v2, with: source file path | exists on disk? (Y/N) | if N, the script/command needed to generate it. If anything is missing, FLAG it loudly so a follow-up task can be dispatched to produce it.
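The exists-on-disk column can be filled in mechanically. A minimal sketch, assuming the script runs from the repository root and that the brace patterns in the INPUTS list are expanded by hand; the function name `asset_checklist` and the exact bullet layout are illustrative, not an existing script:

```python
from pathlib import Path


def asset_checklist(paths, root="."):
    """Return one checklist bullet per figure path: path | exists? | action."""
    bullets = []
    for rel in paths:
        if (Path(root) / rel).is_file():
            bullets.append(f"- {rel} | Y | -")
        else:
            # Missing assets are flagged loudly so a follow-up task can be dispatched.
            bullets.append(f"- {rel} | N | **MISSING: generation script needed**")
    return bullets


if __name__ == "__main__":
    # Paths copied from the INPUTS list; only the .pdf variants are checked here.
    figures = [
        "paper_prep/figures/fig3/figure_fig3.pdf",
        "paper_prep/figures/ed5/figure_ed5.pdf",
        "paper_prep/figures/ed8/figure_ed8.pdf",
        "paper_prep/figures/fig1/figure_fig1.pdf",
    ]
    print("\n".join(asset_checklist(figures)))
```

The output can be pasted into section 4 as-is; the generation command for any `N` row still has to be written by hand.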

5. Time budget table

Slide # | Title | Target seconds | Cumulative time (target ≤ 15 min, leaving 1-2 min buffer for Q&A overflow)
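The cumulative column and the ≤ 15 min check are simple running-sum arithmetic. A rough sketch, assuming a 15-minute slot; the slide titles and per-slide seconds below are placeholders for illustration, not proposed values:

```python
def budget_table(slides, limit_s=15 * 60):
    """Build table rows with a running total.

    slides: list of (slide_number, title, target_seconds).
    Returns (rows, total_seconds, within_limit).
    """
    rows = []
    total = 0
    for number, title, seconds in slides:
        total += seconds
        # Cumulative time is shown as minutes:seconds.
        rows.append(f"{number} | {title} | {seconds} | {total // 60}:{total % 60:02d}")
    return rows, total, total <= limit_s


if __name__ == "__main__":
    # Placeholder entries only; real titles and timings come from the plan.
    example = [(1, "Title", 30), (2, "Why subtelomeres", 60), (3, "IMPG", 75)]
    rows, total, ok = budget_table(example)
    print("Slide # | Title | Target seconds | Cumulative time")
    print("\n".join(rows))
    print(f"Total: {total} s, within limit: {ok}")
```

To reserve the 1-2 min buffer explicitly, pass a lower limit such as `limit_s=13 * 60`.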

6. Risks / open questions for Erik

Anything you're uncertain about that needs Erik to confirm before he builds the deck (e.g., 'are slides 8/9 actual PCA or MDS?'; 'should DUX4 aside be cut if time runs short?'; 'is there a Lamin B1 / LAD overlay we can show to make the C7 envelope hypothesis more direct?').

ACCEPTANCE:

  • slides/SLIDES_v2_PLAN.md exists with all 6 sections
  • At least 4 new-slide concepts proposed (Hi-C, pedigree, gene-enrichment, synthesis)
  • Every figure referenced has a path AND an exists-or-generation-script status
  • Time budget sums to ≤ 15 min (or whatever talk slot is confirmed)
  • Single commit: 'docs: SLIDES_v2 plan for BoG (Hi-C + pedigree + gene-enrichment + synthesis)'
  • wg artifact records SLIDES_v2_PLAN.md
  • If any new figure is to-be-generated, the generation script is also produced (in slides/scripts/ or referenced from existing scripts) so Erik can run it himself.
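The first acceptance item can be spot-checked before committing. A minimal sketch, assuming the plan reuses the section titles from the DELIVERABLE list verbatim; the helper name `missing_sections` is hypothetical, and plain substring matching will not catch a title that appears only in body text:

```python
from pathlib import Path

# Section titles as worded in the DELIVERABLE list (assumed to be reused verbatim).
REQUIRED_SECTIONS = [
    "Existing 10-slide audit",
    "Proposed new slides",
    "Narrative arc upgrade",
    "Figure-asset checklist",
    "Time budget table",
    "Risks / open questions",
]


def missing_sections(plan_text):
    """Return the required section titles that do not appear in the plan text."""
    lowered = plan_text.lower()
    return [title for title in REQUIRED_SECTIONS if title.lower() not in lowered]


if __name__ == "__main__":
    plan = Path("slides/SLIDES_v2_PLAN.md")
    if not plan.is_file():
        print("FAIL: slides/SLIDES_v2_PLAN.md does not exist")
    else:
        missing = missing_sections(plan.read_text(encoding="utf-8"))
        print("OK: all 6 sections present" if not missing else f"FAIL: missing {missing}")
```

The remaining acceptance items (figure status, time budget, commit message, wg artifact) still need their own checks.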

DO NOT in this task:

  • Build an actual new slide deck (no .pptx, .key, no typst/beamer .pdf — Erik builds the deck himself from the plan)
  • Modify the existing 20260204_Subtelomics_overview_EG.{summary.md,pdf} files
  • Run any heavy data analysis (use existing analysis outputs; flag if something needs new compute)
  • Move any file outside slides/
  • Touch the running crosswalk task (read-end-to / agent-750) — it operates independently

Depends on

Required by

Log