fig5-sweepga-f16-validated-chop-sensitivity

Rerun f16 chop sensitivity after PAF/scoring validation

Metadata

Status: open ‖ paused
Agent identity: 46f6237a65ec4f1002c4d3fb201dc8633638d0947c276be7008c227e1051ba5e
Created: 2026-06-22T16:05:51.981803091+00:00
Started: 2026-06-22T16:29:08.894733553+00:00
Tags: pedigree, fig5, sweepga, pafchop, slurm, validation, plot, eval-scheduled

Description

Run only after the PAF chopper semantics validation and sweepGA identity-scoring audit are complete. Do not use any chopped PAF generated before validation unless PAF_SEMANTICS_VALIDATION.md explicitly marks it valid for identity-sensitive filtering.

Task:

  • Use the validated pafchop-rs behavior and the audited sweepGA command for per-chunk identity filtering.
  • Re-run the f16 whole-genome chop sensitivity for at least these chop lengths: 10000, 5000, 2000, and 1000 bp.
  • Use Slurm shared-node array jobs only: no --exclusive, modest per-element resources (--cpus-per-task about 4 unless justified, --mem about 16G-32G), and an array concurrency cap so jobs can share nodes.
  • Heavy gzip/chop/sweepGA/PAF scanning must occur inside Slurm jobs, not on the head/login node.
  • Use --scaffold-jump 0 and the audited identity-scoring option, likely --scoring ani, if the audit confirms it.
  • Summarize chr3 survival for PAN027 and PAN028 candidate windows, including chr3 rows, chr3 summed overlap, chr3 query-union bp, chr9 query-union bp, other target union bp, and yes/no survival.
  • Generate/update the focused Fig5/SweepGA chop sensitivity SVG/PDF.
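
The per-window summary columns listed above could be computed along these lines. This is a minimal sketch, not the audited pipeline, and it rests on several assumptions that the task does not state: candidate windows are given in query coordinates, filtered PAF rows keep original (un-chopped) query coordinates, the target chromosome is the last `#`-separated field of the target name, "summed overlap" means per-row overlap with the window added up with double counting, the chr9 and other-target figures are unions on the query axis, and "survival" means at least one chr3 row overlaps the window. The names `union_bp` and `summarize_window` are hypothetical.

```python
from collections import defaultdict


def union_bp(intervals):
    """Bases covered by half-open [start, end) intervals, overlaps counted once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged run before starting a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def summarize_window(paf_lines, query_name, win_start, win_end):
    """Summarize one candidate window from PAF text lines (12 mandatory columns)."""
    clipped = defaultdict(list)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12 or fields[0] != query_name:
            continue
        q_start, q_end = int(fields[2]), int(fields[3])
        # Clip the alignment's query span to the candidate window.
        start, end = max(q_start, win_start), min(q_end, win_end)
        if start >= end:
            continue
        # Assumption: chromosome is the last '#'-separated field of the target name.
        chrom = fields[5].split("#")[-1]
        key = chrom if chrom in ("chr3", "chr9") else "other"
        clipped[key].append((start, end))
    chr3 = clipped["chr3"]
    return {
        "chr3_rows": len(chr3),
        "chr3_summed_overlap": sum(end - start for start, end in chr3),
        "chr3_query_union_bp": union_bp(chr3),
        "chr9_query_union_bp": union_bp(clipped["chr9"]),
        "other_union_bp": union_bp(clipped["other"]),
        # Assumption: survival = any chr3 row overlapping the window.
        "survives": "yes" if chr3 else "no",
    }
```

Because it reads whole PAFs, a summary like this would itself count as heavy PAF scanning and belongs inside a Slurm job. If the audit settles on a stricter survival definition (for example a minimum chr3 union fraction), only the last dictionary entry would need to change.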

Acceptance:

  • Direct answer: after validated chopping and identity scoring, does smaller chopping improve strict 1:1 chr3 survival relative to 10kb?
  • Direct answer: which chop length is best for the visual evidence layer?
  • Provenance shows Slurm worker nodes ran all heavy steps.
  • Large PAFs remain ignored; scripts/summaries/plot are tracked.
  • No submission/ files modified.
  • Commit and push with WG provenance.
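
One way to back the provenance criterion is for every heavy step to refuse to start outside a Slurm job and to record where it ran. The sketch below assumes the standard Slurm environment variables `SLURM_JOB_ID`, `SLURM_ARRAY_TASK_ID`, and `SLURMD_NODENAME` are set on the worker. The array-index-to-chop-length mapping and the function name `slurm_provenance` are hypothetical, not part of the existing scripts.

```python
import os

# Hypothetical mapping: array task index -> chop length in bp.
CHOP_LENGTHS = [10000, 5000, 2000, 1000]


def slurm_provenance(env=None):
    """Return a provenance record, or raise if not inside a Slurm array task."""
    env = os.environ if env is None else env
    job_id = env.get("SLURM_JOB_ID")
    if not job_id:
        # Guards against accidentally running heavy work on the head/login node.
        raise RuntimeError("heavy step refused: not running inside a Slurm job")
    task_id = env.get("SLURM_ARRAY_TASK_ID")
    if task_id is None:
        raise RuntimeError("heavy step refused: not running as an array task")
    index = int(task_id)
    if not 0 <= index < len(CHOP_LENGTHS):
        raise ValueError(f"array task {index} has no chop length assigned")
    return {
        "job_id": job_id,
        "array_task_id": index,
        "node": env.get("SLURMD_NODENAME", "unknown"),
        "chop_length_bp": CHOP_LENGTHS[index],
    }
```

Writing the returned record next to each per-chunk summary would let a reviewer confirm from tracked files alone that a worker node, not the login node, produced it.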

Depends on

Required by

Messages 1 message

  1. #1 user 2026-06-22T19:52:53.569999659+00:00 sent
    STOP: sweepGA PAF identity scoring audit deliverables are absent; current validated rerun is not validly gated. Cancel any Slurm array/jobs and do not resubmit. Preserve scripts/logs only.

Log