review-extract-and

Review: extract and document copy-aware enrichment findings

Metadata

Status: done
Assigned: agent-408
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Created: 2026-04-01T19:18:02.456614117+00:00
Started: 2026-04-01T19:19:27.614081062+00:00
Completed: 2026-04-01T19:22:47.968814550+00:00
Tags: review, critical, eval-scheduled
Eval score: 0.49
└ blocking impact: 0.35
└ completeness: 0.60
└ coordination overhead: 0.50
└ correctness: 0.35
└ downstream usability: 0.30
└ efficiency: 0.70
└ intent fidelity: 0.64
└ style adherence: 0.80

Description

Goal

Read ALL the copy-number-aware enrichment output files and produce a clear, detailed summary of what the copy-aware analysis actually found. This is the foundation for updating all other documents.

Files to read

  • copy_number_vs_standard_ora_comparison.csv
  • copy_weighted_vs_deduplicated_comparison.csv
  • copy_weighted_functional_analysis.csv
  • copy_number_aware_enrichment_results.csv
  • phr_copy_weighted_enrichment.csv
  • copy_weighted_go_enrichment.R — the actual R script (to understand methodology)
  • copy_number_enrichment.py — the Python implementation
  • gene_copy_summary.csv — the copy counts
  • gene_copy_background_analysis.csv — genome-wide background
  • ora_comparison_results.csv
  • Any other CSV/results files in the working directory related to copy-weighted analysis

Questions to answer with SPECIFIC NUMBERS

  1. What was the genome-wide background? How many total gene copies? How many unique genes?
  2. For each GO term tested: what was the standard ORA p-value vs copy-weighted p-value?
  3. Which terms got STRONGER with copy awareness? By how much?
  4. Which terms got WEAKER or disappeared?
  5. Did any NEW terms appear that weren't in the standard analysis?
  6. What is the copy bias? Which gene families have the highest copy counts in PHRs vs genome-wide?
  7. Is the olfactory signal now dominant? How does it compare to splicing?
  8. What about GTP binding (3 families with 18 copies each, p effectively 0)?
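
For orientation, the standard versus copy-weighted comparison behind questions 2 to 4 can be sketched as two one-sided hypergeometric tests that differ only in what is counted: unique genes or gene copies. This is a minimal sketch under that assumption, not the method in copy_weighted_go_enrichment.R or copy_number_enrichment.py, which must be read to confirm the actual approach. The function name `hypergeom_upper_tail` and all counts below are hypothetical and are not taken from the CSVs.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    lo = max(k, 0)
    hi = min(successes, draws)
    total = comb(population, draws)
    hits = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(lo, hi + 1)
    )
    return hits / total


# Hypothetical toy counts, NOT taken from the CSVs.
# Standard ORA: each gene is counted once.
standard_p = hypergeom_upper_tail(k=3, population=1000, successes=20, draws=50)
# Copy-weighted (assumed): every copy is counted, so all four counts grow.
copy_weighted_p = hypergeom_upper_tail(k=54, population=3000, successes=120, draws=400)

print(f"standard p = {standard_p:.3g}, copy-weighted p = {copy_weighted_p:.3g}")
```

Counting copies as independent draws tends to shrink p-values for high-copy families, so a very small copy-weighted p-value is worth checking against the deduplicated result before it is reported as a finding.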

Output

Log a COMPLETE structured summary with all numbers. This will be used by other tasks to update documents.

Save as copy_aware_findings_summary.md with:

  • Full comparison table (GO term | standard p | copy-weighted p | direction)
  • Gene family copy count table
  • Top 3 key findings with specific numbers
  • Methodology description (1 paragraph)
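
The direction column of the comparison table could be derived as in the sketch below. It assumes that "stronger" means a smaller copy-weighted p-value than the standard one, and that a term missing from one analysis is represented as `None`. The helper names `direction` and `table_row`, the labels, and the number format are illustrative choices, not part of the task specification.

```python
def direction(standard_p, copy_p, rel_tol=1e-9):
    """Classify how a term's p-value changed under copy weighting."""
    if standard_p is None and copy_p is None:
        raise ValueError("term is missing from both analyses")
    if standard_p is None:
        return "new"   # only present in the copy-weighted results
    if copy_p is None:
        return "lost"  # only present in the standard results
    if abs(copy_p - standard_p) <= rel_tol * max(standard_p, copy_p):
        return "unchanged"
    return "stronger" if copy_p < standard_p else "weaker"


def fmt(p):
    """Format a p-value in scientific notation, or n/a when missing."""
    return "n/a" if p is None else f"{p:.2e}"


def table_row(term, standard_p, copy_p):
    """Render one row: GO term | standard p | copy-weighted p | direction."""
    return f"| {term} | {fmt(standard_p)} | {fmt(copy_p)} | {direction(standard_p, copy_p)} |"


# Hypothetical term and values, for illustration only.
print(table_row("GO:EXAMPLE", 0.05, 0.001))
```

Keeping the classification in one function makes it straightforward to apply the same rule to every row, so the table and the key findings cannot disagree about which terms strengthened or weakened.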

Validation

  • Every comparison has actual p-values from the CSVs
  • All copy counts are from the data files, not approximated
  • The summary is self-contained (someone reading it can understand the full picture)

Depends on

Required by

Log