clean-commit-stage

Clean commit: stage important outputs, commit, push

Metadata

Status: done
Assigned: agent-439
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Created: 2026-04-01T19:19:03.371163937+00:00
Started: 2026-04-02T01:32:36.664170081+00:00
Completed: 2026-04-02T01:34:48.310396464+00:00
Tags: git, commit, eval-scheduled
Tokens: 0 in / 0 out
Eval score: 0.60
└ hallucination rate: 0.10
└ requirement coverage: 0.35
└ semantic match: 0.90
└ specificity match: 0.25

Description

Goal

Make a clean git commit containing only the important output files from this analysis, then push it. Exclude intermediate and temporary files.

Files to INCLUDE (commit these):

Core data outputs

  • chm13.phrs.bed — extracted PHR intervals
  • chm13.phrs.no_acro.bed — non-acrocentric PHR intervals
  • phrs.no_acro.coding_gene_names.txt — protein-coding gene list
  • gene_copy_summary.csv — copy counts per gene family
  • all_gene_copies_by_arm.csv — every gene copy with location
  • enriched_genes_detailed_map.csv — gene-to-arm-community mapping

Enrichment results

  • phr_GO_BP_enrichment.csv, phr_GO_MF_enrichment.csv — full gene set
  • phr_no_acro_GO_BP_enrichment.csv, phr_no_acro_GO_MF_enrichment.csv — non-acrocentric gene set
  • phr_coding_only_GO_BP_enrichment.csv, phr_coding_only_GO_MF_enrichment.csv — protein-coding genes only
  • phr_copy_weighted_enrichment.csv — copy-aware results
  • copy_weighted_vs_deduplicated_comparison.csv — comparison
  • copy_weighted_functional_analysis.csv
  • copy_number_vs_standard_ora_comparison.csv
  • comparison_table.csv — PHR vs Angela comparison

Reports and documents

  • phr_gene_enrichment_report.md — main report
  • phr_gene_enrichment_synthesis.md — synthesis
  • copy_aware_findings_summary.md — copy-aware findings
  • enriched_genes_per_arm.md — per-arm summary
  • validation_report.md — validation results
  • workgraph_failure_modes_feedback.md — system feedback

Key R scripts

  • copy_weighted_go_enrichment.R
  • copy_weighted_enrichment.R

Files to EXCLUDE (do NOT commit):

  • .gitignore, CLAUDE.md, OCTOPUS_CLUSTER.md (config files, already tracked or not needed)
  • *.RData, *.rds files (R binary data)
  • debug_*.R, simple_test_*.R, test_*.txt (debug/test files)
  • verify_*.sh (verification scripts)
  • install_packages.R (setup)
  • gprofiler_request.json, gene_list_for_gprofiler.txt (intermediate)
  • Redundant synthesis documents (commit only the final versions listed under "Reports and documents")
  • *.py helper scripts (intermediate extraction scripts)
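
The include and exclude rules above can be checked mechanically before anything is staged. The following is a minimal Python sketch, not the method the task prescribes. It assumes every file sits in the repository root (so bare file names are enough), and the names `INCLUDE`, `EXCLUDE_PATTERNS`, `is_excluded`, and `select_files` are hypothetical helpers introduced here for illustration. The allowlist is authoritative; the exclude patterns act only as a second safety check.

```python
from fnmatch import fnmatchcase

# Allowlist copied from the "Files to INCLUDE" section.
INCLUDE = {
    # Core data outputs
    "chm13.phrs.bed",
    "chm13.phrs.no_acro.bed",
    "phrs.no_acro.coding_gene_names.txt",
    "gene_copy_summary.csv",
    "all_gene_copies_by_arm.csv",
    "enriched_genes_detailed_map.csv",
    # Enrichment results
    "phr_GO_BP_enrichment.csv",
    "phr_GO_MF_enrichment.csv",
    "phr_no_acro_GO_BP_enrichment.csv",
    "phr_no_acro_GO_MF_enrichment.csv",
    "phr_coding_only_GO_BP_enrichment.csv",
    "phr_coding_only_GO_MF_enrichment.csv",
    "phr_copy_weighted_enrichment.csv",
    "copy_weighted_vs_deduplicated_comparison.csv",
    "copy_weighted_functional_analysis.csv",
    "copy_number_vs_standard_ora_comparison.csv",
    "comparison_table.csv",
    # Reports and documents
    "phr_gene_enrichment_report.md",
    "phr_gene_enrichment_synthesis.md",
    "copy_aware_findings_summary.md",
    "enriched_genes_per_arm.md",
    "validation_report.md",
    "workgraph_failure_modes_feedback.md",
    # Key R scripts
    "copy_weighted_go_enrichment.R",
    "copy_weighted_enrichment.R",
}

# Patterns copied from the "Files to EXCLUDE" section.
EXCLUDE_PATTERNS = [
    ".gitignore", "CLAUDE.md", "OCTOPUS_CLUSTER.md",
    "*.RData", "*.rds",
    "debug_*.R", "simple_test_*.R", "test_*.txt",
    "verify_*.sh",
    "install_packages.R",
    "gprofiler_request.json", "gene_list_for_gprofiler.txt",
    "*.py",
]


def is_excluded(name):
    """True if the file name matches any exclude pattern (case-sensitive)."""
    return any(fnmatchcase(name, pattern) for pattern in EXCLUDE_PATTERNS)


def select_files(candidates):
    """Split candidate file names into (to_stage, skipped), both sorted.

    A file is staged only if it is on the allowlist and matches no
    exclude pattern; everything else is skipped.
    """
    to_stage, skipped = [], []
    for name in candidates:
        if name in INCLUDE and not is_excluded(name):
            to_stage.append(name)
        else:
            skipped.append(name)
    return sorted(to_stage), sorted(skipped)
```

Passing the working-directory listing (for example from `os.listdir(".")`) to `select_files` yields the names that are safe to hand to `git add`. An allowlist is used rather than exclusion alone because any file not named in the task, such as an unlisted synthesis draft, is then skipped by default.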

Commit message

'feat: PHR gene enrichment analysis with copy-number-aware methodology

Complete pipeline: PHR interval extraction → gene intersection → GO enrichment (standard + protein-coding-only + copy-number-aware). Key finding: copy-aware analysis dramatically strengthens olfactory receptor and immune gene signals, revealing that standard ORA underestimates enrichment in multi-copy subtelomeric regions.'
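
One way to turn the message above into stage, commit, and push commands is sketched below in Python. This is an illustration under stated assumptions, not the command sequence the agent actually ran: `build_commands` and `run_commands` are hypothetical helper names, the file list is supplied by the caller, and a bare `git push` assumes the current branch already has an upstream configured. Passing `-m` twice to `git commit` makes the subject and body separate paragraphs.

```python
import subprocess

# Subject and body copied from the commit message in the task description.
SUBJECT = "feat: PHR gene enrichment analysis with copy-number-aware methodology"
BODY = (
    "Complete pipeline: PHR interval extraction → gene intersection → "
    "GO enrichment (standard + protein-coding-only + copy-number-aware). "
    "Key finding: copy-aware analysis dramatically strengthens olfactory "
    "receptor and immune gene signals, revealing that standard ORA "
    "underestimates enrichment in multi-copy subtelomeric regions."
)


def build_commands(files, subject=SUBJECT, body=BODY):
    """Return the git argv lists for staging, committing, and pushing."""
    if not files:
        raise ValueError("no files to stage")
    return [
        # "--" stops option parsing, so odd file names cannot be read as flags.
        ["git", "add", "--", *files],
        # Each -m becomes its own paragraph in the commit message.
        ["git", "commit", "-m", subject, "-m", body],
        # Assumption: an upstream is already set for the current branch.
        ["git", "push"],
    ]


def run_commands(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(argv, check=True)
```

Calling `run_commands(build_commands(["chm13.phrs.bed"]))` only prints the three commands; `dry_run=False` executes them and stops at the first failure. Staging by explicit file name, rather than `git add .` or `git add -A`, keeps debug and intermediate files out of the commit even when they are present in the working tree.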

Validation

  • Only important files are staged (not debug/intermediate)
  • Commit succeeds
  • Push succeeds
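
The first check can be automated by comparing the staged file names with the expected list before committing. The sketch below is a hedged illustration in Python: `staged_files` and `check_staged` are hypothetical names, and the expected list is whatever the caller passes in. It relies on `git diff --cached --name-only`, which prints one staged path per line. Commit and push success can be confirmed from the exit codes of `git commit` and `git push`.

```python
import subprocess


def staged_files():
    """Return the paths currently staged in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if git exits non-zero
    )
    return [line for line in result.stdout.splitlines() if line]


def check_staged(staged, expected):
    """Compare staged paths with the expected ones.

    Returns (unexpected, missing):
      unexpected: staged but not expected (e.g. debug or temp files)
      missing:    expected but not staged
    Both lists are sorted; both are empty when the stage is clean.
    """
    staged_set, expected_set = set(staged), set(expected)
    unexpected = sorted(staged_set - expected_set)
    missing = sorted(expected_set - staged_set)
    return unexpected, missing
```

Inside the repository, `check_staged(staged_files(), expected)` returning `([], [])` means exactly the intended files are staged. A non-empty `unexpected` list is the signal to run `git restore --staged <file>` on the offending paths before committing.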

Depends on

Required by

Log