redo-sweepga-paf-filter-identity-scoring-audit — octopus01:/moosefs/erikg/phrs

Metadata

Status	done
Assigned	`agent-2671`
Agent identity	`3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3`
Created	2026-06-22T19:55:05.683616485+00:00
Started	2026-06-22T19:56:31.296357774+00:00
Completed	2026-06-22T20:06:05.941224315+00:00
Tags	`sweepga`, `paf`, `validation`, `scoring`, `fig5`, `eval-scheduled`
Eval score	0.94
└ blocking impact	0.96
└ completeness	0.95
└ constraint fidelity	0.85
└ coordination overhead	0.90
└ correctness	0.96
└ downstream usability	0.94
└ efficiency	0.90
└ intent fidelity	0.83
└ style adherence	0.94

Description

Replacement for the invalid task audit-sweepga-paf-filter-identity-scoring, which was marked done without producing the required audit deliverables.

Goal: prove exactly how the currently installed /home/erikg/.cargo/bin/sweepga scores and filters PAF records when applying --num-mappings 1:1 / many:many, especially whether --scoring ani ranks by per-chunk identity rather than raw length or length*identity.
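The three candidate ranking rules named in the goal can only be told apart by a fixture on which they disagree. The sketch below is a minimal illustration, not sweepGA's implementation: the `Chunk` class, the `RULES` table, and the chunk names are hypothetical, and the rules are simply the three hypotheses from the goal statement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    name: str
    matches: int    # PAF column 10: number of matching bases
    block_len: int  # PAF column 11: alignment block length

    @property
    def identity(self) -> float:
        return self.matches / self.block_len


# Hypothetical ranking rules taken from the goal statement; which one
# sweepGA actually applies is what the audit has to establish.
RULES = {
    "identity": lambda c: c.identity,
    "length": lambda c: c.block_len,
    "length*identity": lambda c: c.block_len * c.identity,
}


def winners(chunks):
    """Return the name of the top-ranked chunk under each candidate rule."""
    return {rule: max(chunks, key=key).name for rule, key in RULES.items()}


# Unequal-length competitors chosen so that every rule picks a different chunk.
fixture = [
    Chunk("short_clean", matches=990, block_len=1000),   # 99% identity
    Chunk("mid_good", matches=3600, block_len=4000),     # 90% identity
    Chunk("long_noisy", matches=3000, block_len=5000),   # 60% identity
]
result = winners(fixture)
for rule, name in result.items():
    print(f"{rule}\t{name}")
```

Because each rule selects a different chunk here, the observed survivor of a 1:1 filter on such a fixture would identify which rule is in effect.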

Required work:

- Inspect the actual sweepGA binary/help/version and source if available; record the executable path and version/hash where possible.
- Build minimal synthetic PAF fixtures with equal-length and unequal-length competing chunks, with recomputed col10/col11 identity fields and cg:Z tags where sweepGA expects them.
- Run sweepGA locally only on tiny fixtures, never on whole-genome data from the head node.
- Explicitly test default scoring and --scoring ani behavior for --num-mappings 1:1 and many:many if supported.
- Confirm whether filtering is per PAF row/chunk, whether any trivial merging/chaining occurs inside sweepGA before scoring, and which PAF columns/tags affect the score.
- Write the required Markdown audit and TSV summary. The task is not complete without both deliverables.
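A synthetic fixture of the kind described above could be generated along these lines. This is a sketch under stated assumptions: alignments are gapless, the cg:Z CIGAR uses =/X operators with all matches written before all mismatches, and the helper names `paf_line` and `paf_identity` are hypothetical. Which tags sweepGA actually reads is for the audit to determine.

```python
def paf_line(qname, qlen, qstart, tname, tlen, tstart,
             matches, mismatches, strand="+", mapq=60):
    """Build one 12-column PAF record plus a cg:Z tag (gapless alignment assumed)."""
    block_len = matches + mismatches  # column 11, recomputed from the CIGAR counts
    cigar = f"{matches}=" + (f"{mismatches}X" if mismatches else "")
    fields = [
        qname, qlen, qstart, qstart + block_len, strand,
        tname, tlen, tstart, tstart + block_len,
        matches,      # column 10: matching bases
        block_len,    # column 11: alignment block length
        mapq,
        f"cg:Z:{cigar}",
    ]
    return "\t".join(str(f) for f in fields)


def paf_identity(line):
    """Identity as column 10 divided by column 11."""
    cols = line.split("\t")
    return int(cols[9]) / int(cols[10])


# Equal-length competitors: same query interval, different targets and identity.
equal_len = [
    paf_line("q1", 10000, 0, "tA", 10000, 0, matches=990, mismatches=10),
    paf_line("q1", 10000, 0, "tB", 10000, 0, matches=950, mismatches=50),
]

# Unequal-length competitors: a short clean chunk against a longer noisier one.
unequal_len = [
    paf_line("q2", 10000, 0, "tA", 10000, 0, matches=990, mismatches=10),
    paf_line("q2", 10000, 0, "tB", 10000, 0, matches=4500, mismatches=500),
]

for line in equal_len + unequal_len:
    print(f"{paf_identity(line):.3f}\t{line}")
```

Deriving column 11 and the CIGAR from the same match and mismatch counts keeps col10/col11 and the cg:Z tag mutually consistent, so a disagreement between them cannot confound the test.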

Acceptance criteria:

- SWEEPGA_PAF_FILTER_IDENTITY_AUDIT.md states the exact safe command line to use for the f16 chopped rerun, or states that no safe command exists.
- sweepga_paf_filter_identity_audit.tsv has one row per synthetic fixture/test with command, expected winner, observed winner, pass/fail, and interpretation.
- The audit explicitly says whether the downstream Fig5 f16 validated chop rerun may proceed.
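The TSV layout required by the second criterion might be produced as in the sketch below. The column names, the extra `fixture` column, and the helpers `audit_row` and `write_audit_tsv` are assumptions made for illustration; the row values are placeholders, not results from the audit.

```python
import csv
import io

# Assumed header; the criteria name the fields but not their exact spelling.
COLUMNS = ["fixture", "command", "expected_winner",
           "observed_winner", "status", "interpretation"]


def audit_row(fixture, command, expected, observed, interpretation):
    """Derive PASS/FAIL from the comparison instead of entering it by hand."""
    status = "PASS" if expected == observed else "FAIL"
    return [fixture, command, expected, observed, status, interpretation]


def write_audit_tsv(rows, handle):
    """Write the header and one row per fixture/test as tab-separated text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)


# Placeholder rows only: the command strings and winners are illustrative.
rows = [
    audit_row("equal_len", "<command under test>", "tA", "tA",
              "placeholder interpretation"),
    audit_row("unequal_len", "<command under test>", "tA", "tB",
              "placeholder interpretation"),
]

buffer = io.StringIO()
write_audit_tsv(rows, buffer)
print(buffer.getvalue(), end="")
```

Computing the status column from the expected and observed winners means a row cannot read PASS while its two winner fields disagree.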

Depends on

Required by

done .flip-redo-sweepga-paf-filter-identity-scoring-audit

Log

2026-06-22T19:55:42.174543138+00:00 Lightweight assignment: agent=Default Evaluator (31847164), exec_mode=full, context_scope=task, reason=The Default Evaluator is the best fit for a correctness-critical audit that needs both command-line validation and final Markdown/TSV reporting.
2026-06-22T19:56:31.296361541+00:00 Spawned by coordinator --executor codex --model gpt-5.5
2026-06-22T19:56:56.982326692+00:00 Starting audit: inspecting installed sweepGA, repository context, and prior pafchop validation artifacts
2026-06-22T19:58:00.486202307+00:00 Creating tiny synthetic PAF fixtures for scoring, num-mappings, and identity-field precedence tests
2026-06-22T20:03:46.537212954+00:00 Validated: deliverables present; synthetic audit TSV has all PASS rows; audit states safe rerun command and Fig5 proceed condition
2026-06-22T20:05:18.263633810+00:00 Committed: 444a470 — pushed to remote
2026-06-22T20:06:05.941234624+00:00 Task pending eval (agent reported done; awaiting `.evaluate-*` to score)
2026-06-22T20:15:32.373247387+00:00 PendingEval → Done (evaluator passed; downstream unblocks)