research-snrnp-assembly

Research: snRNP assembly & spliceosome in subtelomeric PHRs

Metadata

Status: done
Assigned: agent-45
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Created: 2026-03-31T22:27:05.567731884+00:00
Started: 2026-03-31T22:28:16.603053782+00:00
Completed: 2026-03-31T22:31:24.695769715+00:00
Tags: research, fan-out, eval-scheduled
Eval score: 0.83
└ blocking impact: 0.90
└ completeness: 0.76
└ coordination overhead: 0.90
└ correctness: 0.85
└ downstream usability: 0.80
└ efficiency: 0.92
└ intent fidelity: 0.81
└ style adherence: 0.85

Description

Goal

Investigate the biological significance of snRNP/spliceosomal gene enrichment in subtelomeric PHRs.

Context

Our PHR-only GO enrichment found that 8 out of 22 query genes are involved in spliceosomal snRNP assembly. The top hits are:

  • Formation of quadruple SL/U4/U5/U6 snRNP (p=9.9e-4)
  • Spliceosomal tri-snRNP complex assembly (p=1.1e-3)
  • U4 snRNA binding (MF, p=9.1e-5) — strongest single hit
  • mRNA splicing via spliceosome
  • Protein-RNA complex assembly/organization

This signal persists (and strengthens) when acrocentric PHRs are excluded, so it is not driven by the acrocentric chromosomes alone and appears to be genome-wide.

Questions to answer

  1. Which 8 genes are these? Read from phrs.no_acro.gene_names.txt and cross-reference with the g:Profiler results or the GFF3 to identify the specific genes annotated to snRNP/spliceosome GO terms. Check the CSV files for gene lists within each term.
  2. What do these genes do? For each of the 8, briefly describe: gene name, full name, function, known disease associations, whether it's protein-coding or pseudogene.
  3. Why would spliceosomal genes cluster in subtelomeric PHRs? Are these known to be in multi-copy gene families? Are they pseudogenes of spliceosomal components? (snRNA genes like U4, U5, U6 are known to exist in tandem arrays — are the 8 genes related to snRNA pseudogenes?)
  4. Is there precedent in the literature for spliceosomal component genes being enriched in subtelomeric regions?
  5. Connection to T2T/CHM13: Were any of these genes newly resolved by the T2T assembly? Check the 'Novel T2T' sheet from Angela's xlsx, if accessible.
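
For question 1, the cross-reference between the gene list and the enrichment results could be sketched roughly as below. This is a minimal sketch under several assumptions, not the project's actual pipeline: it assumes the g:Profiler results were exported as CSV with `term_name` and `intersections` columns (the latter holding a comma-separated gene list per term), that gene names in both files use the same identifiers, and that the keyword list is a reasonable way to pick out snRNP/spliceosome terms. The function name `spliceosomal_genes` and the `KEYWORDS` tuple are hypothetical.

```python
import csv

# Assumed keywords for selecting snRNP/spliceosome GO terms; adjust them to
# the term names that actually appear in the results.
KEYWORDS = ("snrnp", "snrna", "spliceosom", "splicing")


def spliceosomal_genes(gene_names, gprofiler_csv, keywords=KEYWORDS):
    """Map each PHR gene to the matching term names it is annotated to.

    gene_names: iterable of gene names (e.g. an open text file, one per line).
    gprofiler_csv: iterable of CSV lines with 'term_name' and 'intersections'
        columns (assumed export layout, not verified against the real files).
    """
    wanted = {g.strip() for g in gene_names if g.strip()}
    hits = {}
    for row in csv.DictReader(gprofiler_csv):
        term = row["term_name"]
        # Skip terms that do not look snRNP/spliceosome related.
        if not any(k in term.lower() for k in keywords):
            continue
        # Record every listed gene that is also in the PHR gene list.
        for gene in row["intersections"].split(","):
            gene = gene.strip()
            if gene in wanted:
                hits.setdefault(gene, []).append(term)
    return hits
```

To use it, open `phrs.no_acro.gene_names.txt` and the exported results CSV (its filename is not given here) and pass both file objects to `spliceosomal_genes`. The keys of the returned dictionary are the candidate genes, which should number 8 if the assumptions above hold.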

Output

Log a structured summary with gene identities, functions, and a 2-paragraph interpretation of why this signal exists and what it means for the PHR paper.

Validation

  • All 8 genes are identified by name
  • Each gene has a brief functional annotation
  • The biological interpretation addresses whether these are functional genes vs pseudogenes
  • Literature context is provided

Depends on

Required by

Log