fix-fig5-candidate-panel-coordinates

Fix Fig5 candidate panel genomic coordinates

Metadata

Status: done
Assigned: agent-2576
Agent identity: 289ccc9f03fc7c121a5ab8d685ffd018371bcdac67ceab1d50b03e7347d29155
Created: 2026-06-19T21:03:43.624977489+00:00
Started: 2026-06-19T21:05:52.822779055+00:00
Completed: 2026-06-19T21:13:29.403108128+00:00
Tags: pedigree, figure, coordinates, sweepga, eval-scheduled
Eval score: 0.84
└ blocking impact: 0.95
└ completeness: 0.74
└ constraint fidelity: 0.55
└ coordination overhead: 0.92
└ correctness: 0.83
└ downstream usability: 0.84
└ efficiency: 0.90
└ intent fidelity: 0.62
└ style adherence: 0.90

Description

Correct the coordinate system in the Fig5 PAR1/PHR candidate panel asset pack produced by fig5-par1-phr-candidate-panels.

Problem: The current figure labels and summaries use local 0-500 kb extracted-window offsets as the main coordinate system. These offsets are internal to the extraction step; as manuscript figure labels they are hard to read and scientifically hazardous, because a reader can mistake them for genomic positions. The visual should show real genomic coordinates parsed from the source sequence names, and should use a consistent physical scale or a clearly identical tick interval across panels.

Source task/output:

  • Predecessor task: fig5-par1-phr-candidate-panels
  • Predecessor commit/branch if needed: commit b6e039d, branch wg/agent-2572/fig5-par1-phr-candidate-panels
  • Existing output directory: paper_prep/_brainstorming/fig5_par1_phr_candidate_panels/

Required correction:

  1. Parse genomic intervals from query names such as:
    • PAN027#2#chrX.paternal:12265-512264_chrX_parm
    • PAN027#2#chr9.paternal:135704825-136204824_chr9_qarm
    • PAN028#1#chr3.haplotype1:199233840-199733839_chr3_qarm
  2. Convert local patch coordinates to source assembly genomic coordinates for display:
    • For local half-open interval [a,b) within query name chr:start-end, display genomic interval chr:(start+a)-(start+b) using a clearly documented 0-based half-open or 1-based closed convention. Use one convention consistently in axis labels, callouts, and panel_event_summary.tsv.
    • Example: local 446944-472441 in PAN027#2#chr9.paternal:135704825-136204824_chr9_qarm becomes approximately chr9:136,151,769-136,177,266 in native PAN027 paternal assembly coordinates if using 0-based half-open labels.
  3. Do not call the coordinates CHM13 unless the script actually uses a real CHM13 projection/liftover table. First audit whether existing inputs already represent CHM13-projected coordinates or native sample assembly coordinates. If no real CHM13 projection is available from existing files, label the figure as native assembly coordinates, not CHM13. Add a short README note explaining this. If a reliable existing CHM13 projection is available without heavy computation, use it and document the source file.
  4. Update the plotted x-axis labels and callouts so the viewer sees normal chromosome coordinates, not 0, 250 kb, 500 kb local offsets. Retain local offsets only in a secondary/audit column if useful.
  5. Use the same physical scale across event panels where possible. If panels must have different genomic spans for legibility, use the same tick spacing and state each displayed genomic window explicitly. Avoid misleading zooms where a 140 kb PAR1 block and a 40 kb autosomal block look equivalent without scale cues.
  6. If target/donor-side genomic coordinates can be recovered from the existing sweepGA/native PAF target fields, add donor genomic interval columns to panel_event_summary.tsv. If not, keep donor labels as donor arms/haplotypes and document that target coordinate recovery was not done in this lightweight presentation correction.
  7. Regenerate fig5_par1_phr_candidate_panels.pdf, .svg, panel_event_summary.tsv, and README.
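
Steps 1, 2, and 5 can be sketched as follows. This is a minimal illustration, not the script used for the figure: the function names (`parse_window`, `local_to_genomic`, `format_interval`, `shared_ticks`) are hypothetical, and the name layout is inferred only from the three example query names above. The sketch treats the window start in the name as a plain additive offset, exactly as the worked chr9 example does; whether that start is 0-based or 1-based in the source names is an assumption that still needs to be audited and documented in the README.

```python
import re

# Assumed layout, inferred from the examples in this ticket:
#   <sample>#<hap>#<chrom>.<label>:<start>-<end>_<suffix>
NAME_RE = re.compile(r"^(?P<seq>[^:]+):(?P<start>\d+)-(?P<end>\d+)_(?P<suffix>.+)$")


def parse_window(query_name):
    """Split a query name into sequence name, chromosome, and window bounds."""
    m = NAME_RE.match(query_name)
    if m is None:
        raise ValueError(f"unrecognised query name: {query_name!r}")
    seq = m.group("seq")
    # "PAN027#2#chr9.paternal" -> "chr9"
    chrom = seq.split("#")[-1].split(".")[0]
    return {
        "seq": seq,
        "chrom": chrom,
        "start": int(m.group("start")),
        "end": int(m.group("end")),
    }


def local_to_genomic(window_start, a, b):
    """Shift a local [a, b) interval by the window start (additive offset only)."""
    return window_start + a, window_start + b


def format_interval(chrom, start, end):
    """Render an interval with thousands separators for axis labels and callouts."""
    return f"{chrom}:{start:,}-{end:,}"


def shared_ticks(lo, hi, step):
    """Tick positions at multiples of `step` inside [lo, hi].

    Using one `step` for every panel keeps tick spacing identical across panels.
    """
    first = -(-lo // step) * step  # round lo up to the next multiple of step
    return list(range(first, hi + 1, step))


if __name__ == "__main__":
    w = parse_window("PAN027#2#chr9.paternal:135704825-136204824_chr9_qarm")
    g_start, g_end = local_to_genomic(w["start"], 446944, 472441)
    print(format_interval(w["chrom"], g_start, g_end))
    print(shared_ticks(g_start, g_end, 10_000))
```

Keeping the conversion in one small function means the axis labels, callouts, and panel_event_summary.tsv can all call the same code path, which is one way to keep a single convention across all three.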

Constraints:

  • Do not edit submission/paper.tex, the submitted Fig5 asset, or bibliography files.
  • Do not run heavy untangle/sweepGA jobs on the head node. Use existing PAF/TSV outputs only.
  • Keep the interpretation cautious: PAR1 is a positive control; autosomal events are candidate PHR exchange patches.
  • Commit changes with project convention: feat: fix-fig5-candidate-panel-coordinates (agent-NNN).

Validation:

  • Main figure x-axes use genomic coordinates parsed/projected from source names, not local 0-500 kb offsets.
  • README states whether coordinates are native assembly coordinates or CHM13-projected, with provenance.
  • panel_event_summary.tsv includes both local offsets and displayed genomic coordinates for each plotted interval.
  • Panel scaling is consistent or explicitly annotated so relative event lengths are not visually misleading.
  • PDF/SVG regenerate cleanly and remain legible.
  • Submitted manuscript and submitted Fig5 asset are untouched.
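
The third validation item can be checked mechanically. The sketch below assumes panel_event_summary.tsv has columns named `query_name`, `local_start`, `local_end`, `genomic_start`, and `genomic_end`; these column names are hypothetical and should be replaced with whatever the regenerated file actually uses. It also assumes the displayed coordinates follow the additive rule from the required correction (window start plus local offset), so it would need adjusting if a 1-based closed convention is chosen.

```python
import csv
import io
import re

# Pulls the window start out of names like "...:135704825-136204824_chr9_qarm".
WINDOW_RE = re.compile(r":(\d+)-(\d+)_")


def window_start(query_name):
    m = WINDOW_RE.search(query_name)
    if m is None:
        raise ValueError(f"no window found in {query_name!r}")
    return int(m.group(1))


def find_mismatches(tsv_text):
    """Return (line_number, expected, observed) for rows whose genomic
    interval is not window_start + local offsets."""
    bad = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(reader, start=2):
        start = window_start(row["query_name"])
        expected = (start + int(row["local_start"]), start + int(row["local_end"]))
        observed = (int(row["genomic_start"]), int(row["genomic_end"]))
        if expected != observed:
            bad.append((line_no, expected, observed))
    return bad


if __name__ == "__main__":
    header = "query_name\tlocal_start\tlocal_end\tgenomic_start\tgenomic_end\n"
    name = "PAN027#2#chr9.paternal:135704825-136204824_chr9_qarm"
    good = f"{name}\t446944\t472441\t136151769\t136177266\n"
    # A row that still carries local offsets in the genomic columns.
    stale = f"{name}\t446944\t472441\t446944\t472441\n"
    print(find_mismatches(header + good + stale))
```

A row that still reports local offsets in the genomic columns is flagged with its line number, which makes it easy to confirm that no local 0-500 kb values leaked into the displayed coordinates.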

Depends on

Required by

Messages (2)

  1. #1 user, 2026-06-19T21:04:58.076441899+00:00 (delivered)
    Author clarification: the plotted intervals should come from the strict primary path, not permissive secondary/multihit patch calls. Use the conservative nb:i:1 plus sweepGA 1:1 no-scaffold output as the drawing source: either paper_prep/_brainstorming/fig5_sweepga_1to1_redraw/conservative_segments.tsv or the corresponding *.native_nb1.sweepga_1to1_noscaffold.paf. Treat /moosefs/.../patches.tsv as annotation/provenance only for community labels, score summaries, and interpretive labels. If the figure uses words like 'secondary fragment', make clear this means a smaller side fragment within the single selected 1:1 path, not a secondary alignment. Do not plot nth-best/multimap alternatives or non-primary PAF rows.
  2. #2 fix-fig5-candidate-panel-coordinates, 2026-06-19T21:06:00.346727351+00:00 (read)
    Acknowledged — I will use conservative nb:i:1 / sweepGA 1:1 no-scaffold segments as the drawing source, keep patches.tsv as annotation/provenance only, and avoid plotting secondary/multimap alternatives.
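
One lightweight way to audit the drawing source described in message #1 is to confirm that every row of the PAF being plotted carries the `nb:i:1` tag. The sketch below relies only on the general PAF layout (12 mandatory tab-separated columns followed by optional `TAG:TYPE:VALUE` fields); the helper names are hypothetical, and the meaning of `nb` is taken from the author clarification rather than checked here.

```python
def paf_tags(line):
    """Parse the optional TAG:TYPE:VALUE fields that follow the 12 mandatory PAF columns."""
    fields = line.rstrip("\n").split("\t")
    tags = {}
    for field in fields[12:]:
        parts = field.split(":", 2)  # value may itself contain ':'
        if len(parts) == 3:
            tags[parts[0]] = (parts[1], parts[2])
    return tags


def keep_nb1(lines):
    """Keep only PAF rows tagged nb:i:1; rows without the tag are dropped."""
    return [ln for ln in lines if paf_tags(ln).get("nb") == ("i", "1")]


if __name__ == "__main__":
    # Synthetic rows for illustration only; not taken from the project data.
    base = ["q", "500000", "0", "1000", "+", "t", "900000", "10", "1010", "990", "1000", "60"]
    rows = [
        "\t".join(base + ["nb:i:1"]),
        "\t".join(base + ["nb:i:2"]),
        "\t".join(base),  # no optional tags at all
    ]
    kept = keep_nb1(rows)
    print(len(kept), "of", len(rows), "rows carry nb:i:1")
```

If the count of kept rows differs from the total, the input is not the conservative file and should not be used for plotting.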

Log