Metadata
| Status | open |
|---|---|
| Created | 2026-06-27T15:46:22.357517260+00:00 |
| Tags | eval-scheduled, fig5 |
Description
Recover from the premature failure of finalize-fig5-raw. Do not rerun WFMASH, SweepGA/FastGA, minimap2, seqwish, odgi, or any alignment. Use the existing 2 kb IMPG Slurm arrays submitted by fig5-raw-manymany-impg-similarity-2kb-sharded: 1706840,1706841,1706842,1706843,1706844,1706845.
Important paths:
- Live shard outputs/logs are under /moosefs/erikg/phrs/.wg-worktrees/agent-2837/paper_prep/_brainstorming/fig5_raw_manymany_impg_similarity_2kb_sharded/
- Main target path is /moosefs/erikg/phrs/paper_prep/_brainstorming/fig5_raw_manymany_impg_similarity_2kb_sharded/
First check Slurm with squeue/sacct. If any task of arrays 1706840-1706845 is still RUNNING or PENDING, do not run finalization against incomplete shards. Instead, log the exact state of each array and create a new delayed follow-up task; do not mark this as a pipeline/data failure.
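The gate above can be sketched as a small state classifier. This is only an illustration, not part of the existing scripts: the helper names (`parse_sacct`, `classify`, `query_sacct`) and the decision labels are hypothetical, and it assumes the state is read with `sacct -n -P -X -o JobID,State`, which prints one `JobID|State` line per array task.

```python
import subprocess

ARRAY_IDS = ["1706840", "1706841", "1706842", "1706843", "1706844", "1706845"]
# Slurm states that mean "not terminal yet" for the purposes of this gate.
ACTIVE = {"PENDING", "RUNNING", "COMPLETING", "CONFIGURING", "SUSPENDED", "REQUEUED"}


def parse_sacct(text):
    """Parse `JobID|State` lines into (job_id, state) pairs."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        job_id, state = line.split("|")[:2]
        # "CANCELLED by 1234" is reduced to "CANCELLED".
        state = state.split()[0] if state.strip() else "UNKNOWN"
        rows.append((job_id, state))
    return rows


def classify(rows, array_ids=ARRAY_IDS):
    """Return (decision, offending_rows) for the finalization gate."""
    active = [r for r in rows if r[1] in ACTIVE]
    if active:
        return "wait", active  # log state, create delayed follow-up
    seen = {job_id.split("_")[0] for job_id, _ in rows}
    missing = [(a, "MISSING") for a in array_ids if a not in seen]
    if missing:
        return "unknown", missing  # sacct reported nothing for an array
    failed = [r for r in rows if r[1] != "COMPLETED"]
    if failed:
        return "diagnose", failed  # terminal but unsuccessful shards
    return "finalize", []


def query_sacct(array_ids=ARRAY_IDS):
    """Run sacct for the arrays; requires a host with Slurm client tools."""
    cmd = ["sacct", "-n", "-P", "-X", "-o", "JobID,State", "-j", ",".join(array_ids)]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
```

On a Slurm login node, `classify(parse_sacct(query_sacct()))` would give the decision together with the rows that caused it, which can be copied into the task log as the exact state.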
Once all six arrays are terminal and successful, normalize tmp shard filenames if needed. Then either run scripts/finalize_2kb_sharded_impg.py directly against the live agent-2837 output tree, or first sync the live outputs into the main target tree and finalize there. Preserve the all-hit assembled outputs for audit. For plotting summaries/tracks, select the single best similarity/support hit per 2 kb query window with deterministic tie-breaking: highest similarity/ANI/support score first, then longest aligned/support length, then stable lexical order of target coordinates. Document the exact rule in REPORT.md.
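A minimal sketch of the best-hit rule follows. The field names (`query_name`, `query_start`, `query_end`, `target_name`, `target_start`, `target_end`, `similarity`, `aligned_len`) are hypothetical and may not match the real IMPG similarity columns. It also assumes one reading of "stable lexical target coordinates": target name compared as a string, then target start and end compared numerically.

```python
def best_hit_key(hit):
    """Sort key: smaller is better, so score and length are negated."""
    return (
        -hit["similarity"],    # 1. highest similarity/ANI/support score
        -hit["aligned_len"],   # 2. longest aligned/support length
        hit["target_name"],    # 3. stable target order: name, start, end
        hit["target_start"],
        hit["target_end"],
    )


def select_best_hits(hits):
    """Keep exactly one hit per query window, independent of input order."""
    best = {}
    for hit in hits:
        window = (hit["query_name"], hit["query_start"], hit["query_end"])
        current = best.get(window)
        if current is None or best_hit_key(hit) < best_hit_key(current):
            best[window] = hit
    return best
```

Because the key is a total order over the listed fields, the selected hit does not depend on shard order or on the order in which records are read, which is what makes the rule reproducible for REPORT.md. The all-hit records are left untouched; only the plotting tables would be reduced this way.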
Validation:
- All 906 shard rows in manifests/shard_completion_manifest.tsv are OK, or any failed Slurm shard is diagnosed concretely with log paths.
- Six assembled compressed outputs exist, one per method x comparison.
- Assembled all-hit outputs preserve complete IMPG similarity records for audit.
- Plotting tables reduce to one best hit per 2 kb window: per_window_target_similarity_support.tsv and full_genome_target_pattern_tracks.tsv.
- Summary tables include top_interchromosomal_targets.tsv, all_interchromosomal_targets.tsv, chr9q_chr3q_windows.tsv, par_controls.tsv, acrocentric_controls.tsv.
- REPORT.md records final Slurm state, live/source paths, output paths, row counts, and best-hit tie-breaking.
- Commit and push changes, then report whether this supersedes failed finalize-fig5-raw.
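Two of the checks above (the shard manifest and the one-hit-per-window plotting tables) could be verified with something like the following sketch. It assumes both files are tab-separated with a header row. The column names `status`, `query_name`, `query_start`, and `query_end` are hypothetical and should be replaced with the actual headers.

```python
import csv
from collections import Counter


def summarize_manifest(handle, expected_rows=906, status_col="status"):
    """Return (passed, status_counts, non_ok_rows) for a manifest TSV."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    counts = Counter(row[status_col] for row in rows)
    not_ok = [row for row in rows if row[status_col] != "OK"]
    # Pass only if the row count matches and every row is OK.
    passed = len(rows) == expected_rows and not not_ok
    return passed, counts, not_ok


def duplicate_windows(handle, key_cols=("query_name", "query_start", "query_end")):
    """Return query windows that appear more than once in a plotting TSV."""
    counts = Counter(
        tuple(row[c] for c in key_cols)
        for row in csv.DictReader(handle, delimiter="\t")
    )
    return sorted(key for key, n in counts.items() if n > 1)
```

For example, opening manifests/shard_completion_manifest.tsv and passing the file handle to `summarize_manifest` would yield the row counts to record in REPORT.md, and any rows returned in `not_ok` point at the shards whose Slurm logs need a concrete diagnosis. An empty list from `duplicate_windows` is consistent with one best hit per 2 kb window.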
Depends on
Required by
- (none)
Log
- 2026-06-27T15:55:48.676502991+00:00 Slurm dependency finalizer submitted as job 1706861 with --dependency=afterany:1706840:1706841:1706842:1706843:1706844:1706845. This job owns tmp shard normalization, finalizer execution, and rsync from live agent-2837 tree into main scratch tree. When this WG task wakes, first inspect sacct/logs for 1706861 and harvest/validate outputs; do not duplicate final assembly if 1706861 completed successfully.