fig5-whole-genome-minimap2-asm5-allchains

Fig5 whole-genome minimap2 asm5 all-chain chr3 homology

Metadata

Status: done
Assigned: agent-2648
Agent identity: 46f6237a65ec4f1002c4d3fb201dc8633638d0947c276be7008c227e1051ba5e
Created: 2026-06-21T13:32:31.239006100+00:00
Started: 2026-06-21T16:04:45.373436981+00:00
Completed: 2026-06-21T19:00:07.810554987+00:00
Tags: pedigree, fig5, minimap2, whole-genome-alignment, chr3-homology, local-homology, eval-scheduled
Eval score: 0.81
└ blocking impact: 0.74
└ completeness: 0.76
└ constraint fidelity: 0.85
└ coordination overhead: 0.93
└ correctness: 0.78
└ downstream usability: 0.92
└ efficiency: 0.66
└ intent fidelity: 0.89
└ style adherence: 0.94

Description

Motivation: Use minimap2 as an independent, source-built local-homology aligner for the Fig5 pedigree chr9/chr3 question. The updated wfmash -p 95 run recovers chr3 target homology consistent with the curated subtelomeric PGGB graph, whereas the default sweepGA/FastGA raw PAF emitted no chr3 rows. This task tests whether a conventional local assembly-to-assembly mapper recovers the same chr3 support when it is not subject to FastGA's low k-mer occurrence cap.

Binary installed for this task:

  • Required binary: /home/erikg/bin/minimap2
  • Realpath at task creation: /export/local/home/erikg/bin/minimap2-v2.31-r1302
  • Version: 2.31-r1302
  • sha256: 5a0e9d6b351f1aa5d11a5067bd29a33bc50abe70c51fc9be9e1899ec1643c949
  • Source checkout: /home/erikg/minimap2, tag v2.31, commit 3c28777e7e2dcc90f825de1b9f17a89cca7d4452.

Task: Run full whole-genome minimap2 alignments for the same three joint-parent Fig5 comparisons used by the updated wfmash and sweepGA packages:

  • PAN027pat_vs_PAN011_joint
  • PAN027mat_vs_PAN010_joint
  • PAN028mat_vs_PAN027_joint

Use the same full whole-genome child-haplotype query FASTAs and joint-parent target FASTAs from paper_prep/_brainstorming/pedigree_whole_genome_sweepga_updated_bin/inputs/ or regenerate them identically from that package's manifest/scripts if needed. Do not use chromosome/window-only FASTAs.

Primary minimap2 command shape, preserving child as PAF query and joint parent as PAF target:

/home/erikg/bin/minimap2 -x asm5 -c --eqx -P --q-occ-frac=0 -t <threads> TARGET.fa QUERY.fa | pigz -p <threads> > OUT.paf.gz

Rationale:

  • -x asm5 is the high-identity assembly-to-assembly preset; its minimizer occurrence limits are much less stringent than FastGA's default low k-mer occurrence cap.
  • -P retains all chains and avoids primary/secondary best-chain suppression, which is what we want for local homology evidence.
  • --q-occ-frac=0 disables query-side high-occurrence minimizer filtering for this sensitivity check.
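  • -c --eqx preserves CIGAR evidence in PAF: -c writes the base-level CIGAR in the cg:Z: tag, and --eqx records matches and mismatches as =/X rather than M.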
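
The command shape and flag rationale above can be turned into a small dry-run helper that prints, rather than executes, one pipeline per comparison. This is a sketch under assumptions: the function name minimap2_asm5_cmd, the FASTA file suffixes, and the output file names are placeholders and are not taken from the input package.

```shell
# Sketch only: minimap2_asm5_cmd is a hypothetical helper name.

minimap2_asm5_cmd() {
    # $1 = target FASTA (joint parent), $2 = query FASTA (child haplotype),
    # $3 = output PAF (.gz), $4 = thread count
    printf '%s -x asm5 -c --eqx -P --q-occ-frac=0 -t %s %s %s | pigz -p %s > %s\n' \
        /home/erikg/bin/minimap2 "$4" "$1" "$2" "$4" "$3"
}

# Dry run over the three comparisons. The *.target.fa / *.query.fa names
# are placeholders, NOT the real file names in the inputs directory.
INPUTS=paper_prep/_brainstorming/pedigree_whole_genome_sweepga_updated_bin/inputs
for cmp in PAN027pat_vs_PAN011_joint PAN027mat_vs_PAN010_joint PAN028mat_vs_PAN027_joint; do
    minimap2_asm5_cmd "$INPUTS/$cmp.target.fa" "$INPUTS/$cmp.query.fa" \
        "$cmp.asm5.allchains.paf.gz" 16
done
```

Printing the command first keeps the exact invocation available for the command logs before anything is submitted.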

If asm5 + -P --q-occ-frac=0 gives no chr3 support or looks pathologically sparse, optionally run a small sensitivity row with -x asm20 -c --eqx -P --q-occ-frac=0 for the same inputs. Keep asm5 as the primary result.

Output package: Create paper_prep/_brainstorming/pedigree_whole_genome_minimap2_asm5_allchains/ with README, config, scripts, logs, and summaries, plus the paths and checksums of the raw PAF files, which themselves stay ignored. Required summaries:

  • summaries/minimap2_binary.tsv
  • summaries/slurm_jobs.tsv
  • summaries/paf_file_summary.tsv
  • summaries/candidate_window_support.tsv
  • summaries/minimap2_chr3_support_summary.tsv
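
One way a row of summaries/paf_file_summary.tsv could be derived is sketched below. The column set (file, record count, distinct target names, summed query span) is an assumption for illustration; the task does not specify the actual columns, and paf_summary_row is a hypothetical helper name.

```shell
# Sketch only: paf_summary_row and its output columns are hypothetical.

paf_summary_row() {
    # $1 = path to a gzip-compressed PAF
    # PAF columns used: 3 = query start, 4 = query end, 6 = target name.
    # Emits: file <TAB> records <TAB> distinct targets <TAB> summed query span (bp)
    gzip -dc "$1" | awk -F'\t' -v f="$1" '
        { n++; qbp += $4 - $3; if (!($6 in seen)) { seen[$6] = 1; nt++ } }
        END { printf "%s\t%d\t%d\t%d\n", f, n, nt, qbp }'
}

# Usage (output path is a placeholder):
#   paf_summary_row PAN027pat_vs_PAN011_joint.asm5.allchains.paf.gz >> summaries/paf_file_summary.tsv
```

The summed query span counts overlapping chains more than once, which is expected with -P because all chains are retained.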

Acceptance:

  • All three full whole-genome minimap2 jobs complete through Slurm or failures are diagnosed with logs and exact next commands.
  • Exact command logs prove /home/erikg/bin/minimap2, version 2.31-r1302, -x asm5, -P, --q-occ-frac=0, and full whole-genome FASTA inputs were used.
  • The package gives a direct yes/no answer: does minimap2 emit chr3 target rows overlapping the PAN027 and PAN028 chr9 candidate windows?
  • Compare briefly to updated wfmash-positive and sweepGA/FastGA-default-negative evidence.
  • No submission/ files are modified and no Fig5 schematic is created.
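
The yes/no acceptance question can be answered with a simple PAF filter. The sketch below assumes the candidate windows are given in query (child) coordinates and that sequence names contain the chromosome label as a substring; the function name chr3_rows_in_window and the window coordinates in the usage comment are placeholders.

```shell
# Sketch only: chr3_rows_in_window is a hypothetical helper name.

chr3_rows_in_window() {
    # $1 = query-name substring (e.g. the chr9 label), $2 = window start,
    # $3 = window end, $4 = target-name substring (e.g. chr3).
    # Reads PAF on stdin. PAF intervals are 0-based half-open, so a row
    # overlaps the window when qstart < window_end and qend > window_start.
    # Substring matching is loose: "chr3" would also match a name such as
    # "chr3_random", so tighten the pattern to the real naming scheme.
    awk -F'\t' -v q="$1" -v ws="$2" -v we="$3" -v t="$4" '
        index($1, q) && index($6, t) && ($3 + 0) < (we + 0) && ($4 + 0) > (ws + 0)'
}

# Usage (file name and window coordinates are placeholders):
#   n=$(gzip -dc OUT.paf.gz | chr3_rows_in_window chr9 WINDOW_START WINDOW_END chr3 | wc -l)
#   [ "$n" -gt 0 ] && echo yes || echo no
```

Keeping the matched rows, not just the count, lets the same output feed summaries/candidate_window_support.tsv.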

Depends on

Required by

Log