chm13-phr-ucsc-browser-suite

Render CHM13 PHR UCSC browser slide suite

Metadata

Status: done
Assigned: agent-1053
Agent identity: 46f6237a65ec4f1002c4d3fb201dc8633638d0947c276be7008c227e1051ba5e
Created: 2026-05-07T14:17:09.315273421+00:00
Started: 2026-05-07T14:17:56.388889513+00:00
Completed: 2026-05-07T14:32:01.047972794+00:00
Tags: ucsc, chm13, phr, slides, review-zoom, eval-scheduled
Eval score: 0.95
└ blocking impact: 0.95
└ completeness: 0.96
└ constraint fidelity: 0.70
└ coordination overhead: 0.93
└ correctness: 0.97
└ downstream usability: 0.95
└ efficiency: 0.90
└ intent fidelity: 0.90
└ style adherence: 0.94

Description

Build a standalone CHM13/hs1 UCSC Genome Browser slide suite for PHR review, separate from the existing BoG/review-zoom decks.

User intent:

  • Replace/supplement the current synthetic gene-structure views because the gene structures looked too short/wrong.
  • Use real UCSC Genome Browser screenshots/images on CHM13/hs1.
  • Show the PHR region as the existing BED track and include the chromosome end in every view, so it is visually clear that PHRs are subtelomeric regions but not the telomere itself.
  • Use a 1.5x zoom-out around each PHR, anchored at the terminal chromosome end.
  • Produce one large browser image per slide so the PDF can be used for manual copy/paste into the final slide deck.

Inputs and repo context:

  • PHR intervals: chm13.phrs.bed.
  • Expected/all-arm audit context: paper_prep/figures/fig1/architecture_per_arm.tsv.
  • UCSC session/example known to work from browser:
    • db=hub_3671779_hs1
    • hgsid=3966979908_lGks6rs34CqmdawD8iwY2YCYKVd3
    • user's chr9q target window: chr9:150279748-150617247
  • Note: chm13.phrs.bed currently appears to contain 37 CHM13 intervals, while the architecture table has 41 arms. Do not silently claim 41 CHM13 browser panels unless the input BED actually contains 41 intervals. Create an audit/manifest that explicitly reports any arms missing from the CHM13 BED.
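
The arm audit described in the note above could be sketched as follows. This is only an illustration: the arm-label format (for example "chr9q") and the function names are assumptions, and how arm labels are read from architecture_per_arm.tsv and chm13.phrs.bed is left to the caller because the column layout of both files is not specified here.

```python
import re


def arm_sort_key(arm):
    """Sort labels such as 'chr2p' by chromosome number, then by arm."""
    m = re.fullmatch(r"chr(\w+?)([pq])", arm)
    if m is None:
        return (2, 0, arm, "")          # unrecognized labels sort last
    chrom, side = m.groups()
    if chrom.isdigit():
        return (0, int(chrom), "", side)
    return (1, 0, chrom, side)          # chrX, chrY after the autosomes


def audit_arms(expected_arms, bed_arms):
    """Compare arms expected by the architecture table with arms in the BED.

    Both arguments are iterables of hypothetical labels like 'chr9q'.
    """
    expected, observed = set(expected_arms), set(bed_arms)
    return {
        "n_expected": len(expected),
        "n_bed": len(observed),
        "missing_from_bed": sorted(expected - observed, key=arm_sort_key),
        "unexpected_in_bed": sorted(observed - expected, key=arm_sort_key),
    }


if __name__ == "__main__":
    # Toy labels only; these are not the real contents of either file.
    report = audit_arms(["chr1p", "chr1q", "chr10q", "chr2p", "chrXq"],
                        ["chr1p", "chr2p"])
    print(report)
```

Reporting both directions (missing and unexpected) keeps the manifest honest if the BED ever contains an arm the architecture table does not list.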

Implementation guidance:

  • Use real UCSC browser image output, not a recreated local gene cartoon.
  • Fetch UCSC HTML with a browser-like user agent, e.g. curl -L -A 'Mozilla/5.0' ...; plain curl may hit a bot/Cloudflare-style page.
  • Use pix=1800 or similar high-resolution browser width.
  • Download/read hs1 chromosome sizes from UCSC, e.g. https://hgdownload.soe.ucsc.edu/goldenPath/hs1/bigZips/hs1.chrom.sizes or UCSC API.
  • For each row in chm13.phrs.bed, compute:
    • inclusive PHR length = end0 - start0 + 1
    • view length = ceil(1.5 * inclusive_PHR_length)
    • terminal side, inferred from the smaller terminal gap: p if start0 is the smaller gap, q if chrom_size - end0 is the smaller gap
    • p-arm view: start1=1, end1=min(chrom_size, view_length)
    • q-arm view: start1=max(1, chrom_size - view_length + 1), end1=chrom_size
  • The chr9q row from chm13.phrs.bed should reproduce the user’s intended 1.5x terminal-anchored browser window: chr9:150279748-150617247.
  • Verify UCSC HTML shows the chm13.phrs.bed custom track for each fetched page.
  • UCSC emits browser track images as sprite PNGs under its trash directory. Extract the relevant ../trash/hgt/...png and ../trash/hgtSide/...png paths from the HTML, crop each to the visible browser panel using the CSS offsets/dimensions, then join the side-label image and the data image into a single panel PNG.
  • ImageMagick is available when /home/erikg/micromamba/bin is on PATH.
  • Typst is available at /home/erikg/.local/bin/typst.

Suggested output layout:

  • slides/chm13-phr-ucsc-browser/CHM13_PHR_UCSC_browser_suite.pdf
  • slides/chm13-phr-ucsc-browser/chm13_phr_ucsc_browser_suite.typ
  • slides/chm13-phr-ucsc-browser/manifest.tsv
  • slides/chm13-phr-ucsc-browser/README.md or RENDER_NOTES.md
  • slides/chm13-phr-ucsc-browser/_scripts/ for render scripts
  • slides/chm13-phr-ucsc-browser/_assets/ucsc/ for browser panel PNGs and optional cached HTML/sprites

Keep the existing slides/v2-review-zoom/ deck untouched. This is a separate supporting presentation.

Related future note (do not implement here; document it only if that is trivial): the within-population variation plot may need a better metric, likely the distance to the nearest PHR in the same superpopulation or population rather than the distance to the centroid.

Validation

  • slides/chm13-phr-ucsc-browser/CHM13_PHR_UCSC_browser_suite.pdf exists and renders successfully.
  • The deck has one main browser-image slide per interval in chm13.phrs.bed; any additional title or audit slide is explicitly labeled. Report the exact slide count.
  • Every browser panel image is nonblank and high enough resolution for slide copy/paste.
  • Every main browser slide includes the chromosome end in the browser window: p-arm windows start at coordinate 1, q-arm windows end at the chromosome size.
  • The chr9q browser window is chr9:150279748-150617247.
  • manifest.tsv records label, chrom, arm, PHR BED start/end, inclusive PHR bp, chromosome size, browser start/end, terminal gap, image path, and UCSC URL.
  • The manifest/audit explicitly explains the 37-vs-41 issue if chm13.phrs.bed still has 37 rows.
  • UCSC HTML confirms the chm13.phrs.bed track is visible for every fetched panel.
  • git diff --check passes.
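
The chromosome-end check in the list above could be automated against manifest.tsv with a sketch like the one below. The column names (label, arm, chrom_size, browser_start, browser_end) are assumptions; the task lists which fields the manifest records but not the exact header names.

```python
import csv
import io


def read_manifest(text):
    """Parse tab-separated manifest text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def chromosome_end_problems(rows):
    """Return one message per row whose window misses the chromosome end."""
    problems = []
    for row in rows:
        label = row["label"]
        start = int(row["browser_start"])
        end = int(row["browser_end"])
        size = int(row["chrom_size"])
        if row["arm"] == "p":
            if start != 1:
                problems.append(f"{label}: p-arm window starts at {start}, not 1")
        elif row["arm"] == "q":
            if end != size:
                problems.append(f"{label}: q-arm window ends at {end}, not {size}")
        else:
            problems.append(f"{label}: unknown arm {row['arm']!r}")
    return problems


if __name__ == "__main__":
    # Toy manifest; the second row is deliberately wrong to show a failure.
    text = (
        "label\tarm\tchrom_size\tbrowser_start\tbrowser_end\n"
        "chr9q\tq\t150617247\t150279748\t150617247\n"
        "toy_p\tp\t1000\t5\t300\n"
    )
    for message in chromosome_end_problems(read_manifest(text)):
        print(message)
```

An empty result means every p-arm window starts at coordinate 1 and every q-arm window ends at the chromosome size.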

Depends on

Required by

Log