Metadata
| Status | abandoned |
|---|---|
| Created | 2026-05-01T22:08:23.157722705+00:00 |
| Tags | grant, urgent, review, v3 |
Description
Erik wants an independent review of workgraph_google_application_FINAL_v3.md from a posture closer to an actual Google.org program officer reading a stack of 200 applications looking for reasons to cut. The auto-eval scored v3 at 0.78 LLM / 0.95 FLIP: passing, but the LLM dimension dropped from the ~0.94 average earlier in the day, suggesting weaknesses the grader sensed but didn't articulate.
Critical addition: web-link verification is part of the review. The application references poietic.life, GitHub repos, the WorkGraph docs site, and cited papers. If any of those are broken, mismatched, or weaker than what the application implies, a reviewer notices. This must be checked.
What to read
- workgraph_google_application_FINAL_v3.md: committed at 70c8e7f on worktree branch wg/agent-77/v3-assemble-stitch. Use `git show 70c8e7f:workgraph_google_application_FINAL_v3.md` to read it.
- workgraph_google_application_FINAL_v2.md (post-audit-fix on main): for comparison
- ~/poietic.life/notes/v3-spine-brief.md: the intended frame
- ~/poietic.life/notes/v3-assembly-summary-20260501.md: what the assembler did
- CLAUDE.md: project context, key narrative decisions, attribution rules
What to do
Part 1: Hostile reviewer pass on v3
Adopt the posture of a Google.org program officer who must cut 95%+ of applications. Read v3 looking for reasons to discount it. Specifically:
- Overclaim hunt. Where does v3 promise more than it can deliver in 36 months? Flag exact section and quote.
- Vague-claim hunt. Where does v3 use language that sounds important but doesn't commit to anything verifiable? ("reliable", "careful", "auditable" are tells if not operationally defined.)
- Authority gaps. Where does v3 invoke founder track records that aren't actually relevant to the proposed work?
- Internal inconsistencies. Do §17 (approach) and §26 (track record) and §29 (theory of change) tell the same story? Where do they drift?
- The 'where's the science?' test. A reviewer used to seeing scientific deliverables may bounce off infrastructure-as-deliverable framing. Has §30's mitigation actually answered this objection or just acknowledged it?
- Comparison to Liverpool Hive Mind. Does §28 land the complementary positioning, or does it accidentally invite a 'didn't we already fund this' read?
Identify the 3 weakest sections and write specific surgical fix proposals for each ("in §X, replace 'Y' with 'Z' because..."). Identify the 3 strongest sections; these stay untouched.
Part 2: v2 vs v3 honest comparison
Re-do the v1-vs-v2 style comparison but for v2 vs v3. Six dimensions:
- Translational impact (Google.org cares about real-world benefit)
- Defensibility under expert review
- Authenticity to founders' track record
- Demonstration credibility
- Risk of falling apart under scrutiny
- Fit to Google's stated priorities (Functional Genomics framing)
State which is stronger on each dimension. Recommend v2 OR v3 OR a fold-in. Be willing to recommend v2 if v3 has regressions v2 didn't.
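The per-dimension verdicts above feed one overall recommendation. As a bookkeeping sketch only (the dimension keys are paraphrased from this task, and the "any split verdict suggests a fold-in" decision rule is my assumption, not a rule the task states):

```python
# Six comparison dimensions, paraphrased from the task description.
DIMENSIONS = (
    "translational impact",
    "defensibility under expert review",
    "authenticity to founders' track record",
    "demonstration credibility",
    "risk under scrutiny",
    "fit to Google's stated priorities",
)

def recommend(winners):
    """Suggest 'v2', 'v3', or 'fold-in' from per-dimension winners.

    winners: dict mapping each dimension to 'v2', 'v3', or 'tie'.
    Assumed rule: if each version wins at least one dimension, fold the
    stronger parts together rather than pick a loser wholesale.
    """
    v2_wins = sum(1 for w in winners.values() if w == "v2")
    v3_wins = sum(1 for w in winners.values() if w == "v3")
    if v2_wins and v3_wins:
        return "fold-in"
    if v2_wins > v3_wins:
        return "v2"
    if v3_wins > v2_wins:
        return "v3"
    return "fold-in"  # all ties: neither version dominates
```

The reviewer still writes the prose verdict; this only keeps the tally honest when the six answers are in.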
Part 3: Web-link verification
For EVERY URL in v3, verify it:
- Resolves (200 status, not 404 / dead / parked).
- Loads content that actually matches what v3 implies.
- Is not weaker than the application implies.
Specifically check:
- poietic.life: does the landing page deliver on the v3 framing? Public benefit statement present? Founder bios consistent?
- github.com/orgs/poietic-pbc: repos visible? Look credible? Anything stale or embarrassing (the deep-research-competition KRAS scaffold)?
- github.com/graphwork/workgraph: actively developed? README delivers what v3 implies? Recent commits?
- graphwork.github.io: docs site loads? Substantive?
- Any cited paper DOIs / arXiv links / PubMed links: resolve correctly? Cite the right thing?
For each URL, log: URL, status, brief assessment (matches application / weaker than application / mismatch / dead). Flag mismatches as MUST FIX.
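The task mandates WebFetch for the content-match judgment, so the sketch below covers only the mechanical "resolves" check and the table row format. Function names and the `link-check` user agent are mine, not part of the task tooling:

```python
import urllib.request
import urllib.error

def fetch_status(url, timeout=15):
    """Return the HTTP status code for url, or None if the request fails outright."""
    req = urllib.request.Request(url, headers={"User-Agent": "link-check/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, e.g. 404 on a dead page
    except urllib.error.URLError:
        return None      # DNS failure, refused connection, timeout

def status_bucket(status):
    """Coarse triage. A 200 only proves the page resolves; whether the content
    matches, is weaker than implied, or mismatches still needs a human read."""
    if status is None or status >= 400:
        return "dead"
    return "resolves"

def table_row(url, status, assessment, action):
    """One row of the review note's verification table."""
    return f"| {url} | {status} | {assessment} | {action} |"
```

Note that parked domains typically return 200, so "resolves" is necessary but nowhere near sufficient; the matches / weaker / mismatch call stays manual.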
Output
Write ~/poietic.life/notes/v3-standout-review-20260501.md with sections:
- Headline verdict (one paragraph): would you fund this if you were the program officer? Why or why not?
- Three weakest sections with surgical fixes
- Three strongest sections (don't break them)
- v2 vs v3 honest comparison (six dimensions + recommendation)
- Web-link verification table (URL | status | assessment | action)
- MUST FIX before submit (consolidated punch list, distinguishing content fixes from manual Erik-only steps)
- Optional improvements if time permits (deeper revisions, not blockers)
Cap: 1500 words total. Be terse and concrete.
Log a one-paragraph summary on this task with `wg log`.
Constraints
- Adopt actual hostile-reviewer posture, not 'mostly positive with minor notes.' If v3 has real weaknesses, name them.
- For web-link verification, USE WebFetch on each URL. Do not assume.
- No em-dashes (CLAUDE.md style rule).
- Do not modify the v3 application file. Output is the review note only.
- If v2 is stronger overall, say so. The point is honest critique, not v3 advocacy.
Validation
- All seven listed inputs read
- Three weakest sections identified with surgical fixes
- Three strongest sections identified
- v2 vs v3 six-dimension comparison with verdict
- Every URL in v3 verified via WebFetch
- Web-link table includes status + assessment for each URL
- MUST FIX punch list distinguishes content fixes from Erik-only steps
- Review note at ~/poietic.life/notes/v3-standout-review-20260501.md
- Under 1500 words
Depends on
- (none)
Required by
- (none)
Log
- 2026-05-01T22:08:23.155716617+00:00 Task paused
- 2026-05-01T22:08:48.369435959+00:00 Task abandoned