fig5-sweepga-f16-chop-1to1-slurm-sensitivity

Slurm-only SweepGA f16 chopped 1:1 length sensitivity

Metadata

Status: open ‖ paused
Agent identity: 46f6237a65ec4f1002c4d3fb201dc8633638d0947c276be7008c227e1051ba5e
Created: 2026-06-22T15:27:00.426609016+00:00
Started: 2026-06-22T15:28:43.497699770+00:00
Tags: pedigree, fig5, sweepga, pafchop, slurm, whole-genome-alignment, plot, eval-scheduled

Description

This replaces the paused task fig5-sweepga-f16-chop-1to1-sensitivity, which was stopped because its worker began running sweepga --num-mappings 1:1 on the head/login node. Do NOT resume that paused task.

Hard operational requirement:

  • Do not run heavy sweepga, pafchop, gzip decompression of large PAFs, or summary scans on the head/login node.
  • Use Slurm for the chop/filter work. Create one Slurm array or one Slurm job per comparison x chop length. Expected matrix: 3 comparisons x at least 4 chop lengths = 12 Slurm tasks.
  • Head-node work is limited to tiny script writing, manifest generation, squeue/sacct, and final small table/plot generation after Slurm outputs exist.

Scientific task:

  • Test strict sweepga --num-mappings 1:1 --scaffold-jump 0 after smaller PAF chops of the SweepGA/FastGA --fastga-frequency 16 whole-genome-derived PAFs.
  • Chop lengths: at least 10000, 5000, 2000, 1000 bp.
  • Comparisons: PAN027pat_vs_PAN011_joint, PAN027mat_vs_PAN010_joint, PAN028mat_vs_PAN027_joint.
  • Prefer sub-chopping existing 10kb tag-stripped f16 chopped PAFs for smaller lengths; document this. If main package lacks ignored PAFs, use/copy/reference the agent-2649 ignored PAFs.
  • Use /dev/shm or node-local scratch inside each Slurm job, with cleanup.
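
The comparison x chop-length matrix described above can be enumerated as a small manifest on the head node, which counts as tiny script work. The following is a minimal sketch, not the task's actual script: the function names, the column names, and the row ordering are assumptions made for illustration, and the concurrency cap of 6 is just one of the example values suggested for this task.

```python
from itertools import product

# Comparisons and chop lengths as listed in the task description.
COMPARISONS = [
    "PAN027pat_vs_PAN011_joint",
    "PAN027mat_vs_PAN010_joint",
    "PAN028mat_vs_PAN027_joint",
]
CHOP_LENGTHS_BP = [10000, 5000, 2000, 1000]


def build_manifest(comparisons=COMPARISONS, chop_lengths=CHOP_LENGTHS_BP):
    """Return one row per comparison x chop length, indexed by array task id.

    The row layout is hypothetical; it only illustrates the 3 x 4 = 12 matrix.
    """
    rows = []
    pairs = product(comparisons, chop_lengths)
    for task_id, (comparison, chop_bp) in enumerate(pairs):
        rows.append(
            {"array_task_id": task_id, "comparison": comparison, "chop_bp": chop_bp}
        )
    return rows


def array_spec(n_tasks, max_concurrent):
    """Format a Slurm --array value such as '0-11%6' (index range plus cap)."""
    return f"0-{n_tasks - 1}%{max_concurrent}"


def manifest_tsv(rows):
    """Render the manifest as tab-separated text with a header line."""
    header = "array_task_id\tcomparison\tchop_bp"
    lines = [
        f"{r['array_task_id']}\t{r['comparison']}\t{r['chop_bp']}" for r in rows
    ]
    return "\n".join([header] + lines) + "\n"


if __name__ == "__main__":
    manifest = build_manifest()
    print(manifest_tsv(manifest), end="")
    print("array spec:", array_spec(len(manifest), 6))
```

Each array element would then look up its own row by task id, so the manifest doubles as a record of which comparison and chop length each Slurm task covered.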

Required outputs:

  • Reproducible Slurm scripts/config under paper_prep/_brainstorming/pedigree_whole_genome_sweepga_fastga_frequency16/.
  • summaries/chop_length_one_one_slurm_jobs.tsv with job ids, array ids, status, node, runtime, output paths.
  • summaries/chop_length_one_one_chr3_survival.tsv with chr3 rows, chr3 summed overlap, chr3 query-union bp, chr9 query-union bp, other target union bp, and yes/no chr3 survival per candidate/comparison/chop length.
  • Focused SVG/PDF plot under paper_prep/_brainstorming/fig5_sweepga_f16_chop_sensitivity/.
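
The "query-union bp" columns in the survival table could be computed along the following lines. This is a sketch under stated assumptions, not the task's summary code: it assumes standard 12-column PAF records (query name, start, and end in columns 1, 3, and 4; target name in column 6), treats coordinates as half-open intervals, and selects a chromosome by a hypothetical target-name suffix such as `chr3`. The real target naming scheme is not given here, so the matcher would need adjusting.

```python
def union_bp(intervals):
    """Total bp covered by a set of half-open (start, end) intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the current merged run.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or touching: extend the current merged run.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def query_union_bp(paf_lines, target_suffix):
    """Sum per-query interval unions over records whose target name ends
    with target_suffix (hypothetical matcher; adapt to the real names)."""
    per_query = {}
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        qname, qstart, qend, tname = fields[0], int(fields[2]), int(fields[3]), fields[5]
        if tname.endswith(target_suffix):
            per_query.setdefault(qname, []).append((qstart, qend))
    return sum(union_bp(ivs) for ivs in per_query.values())


if __name__ == "__main__":
    # Toy records with made-up names, only to show the call shape.
    demo = [
        "q1\t1000\t0\t100\t+\tT#chr3\t5000\t0\t100\t100\t100\t60",
        "q1\t1000\t50\t150\t+\tT#chr3\t5000\t200\t300\t100\t100\t60",
        "q1\t1000\t400\t500\t+\tT#chr9\t5000\t0\t100\t100\t100\t60",
    ]
    print(query_union_bp(demo, "chr3"), query_union_bp(demo, "chr9"))
```

Unions are taken per query sequence so that identical coordinates on different query contigs are not merged. On real data this scan belongs inside a Slurm job, in line with the head-node restriction above.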

Acceptance:

  • Direct answer: does smaller chopping improve strict 1:1 chr3 survival relative to 10kb?
  • Direct answer: which chop length is best for the visual evidence layer?
  • Confirm via squeue/sacct and logs that heavy work ran on Slurm worker nodes, not the head node.
  • Large PAFs remain ignored; summaries/plot/scripts are tracked.
  • No submission/ files modified.
  • Commit and push with WG provenance.

Depends on

Required by

Messages 3 messages (replied)

  1. #1 user 2026-06-22T15:29:23.055176195+00:00 read
    Operational correction from user: do not request whole nodes. The chop/filter matrix must run in Slurm with shared-node modest resources: no --exclusive; use a Slurm array or separate jobs with e.g. --cpus-per-task=4 (8 only if justified), --mem=16G-32G per array element, and an array concurrency cap such as %6 or %12. Heavy pafchop/sweepga/gzip must not run on the head/login node. Please submit cluster jobs for each comparison x chop length and record job IDs before doing any heavy filtering.
  2. #2 fig5-sweepga-f16-chop-1to1-slurm-sensitivity 2026-06-22T15:29:41.855160135+00:00 read
    Acknowledged — I will use shared-node Slurm array tasks only, with modest per-element resources, no --exclusive, a concurrency cap, and job IDs recorded before any heavy pafchop/sweepga/gzip work.
  3. #3 user 2026-06-22T16:02:45.362465106+00:00 sent
    STOP immediately: user identified that pafchop-rs has not been validated sufficiently. Cancel/avoid all chop/filter Slurm work for this sensitivity run until the PAF chopper is validated/repaired and sweepGA scoring is audited for identity-per-chunk filtering. Do not submit or resubmit array jobs. Preserve scripts/logs only.

Log