thin-wrapper-impl — Workgraph live mirror

Metadata

Status	done
Assigned	`agent-187`
Agent identity	`f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e`
Created	2026-04-26T17:29:32.008025729+00:00
Started	2026-04-27T00:28:22.506159685+00:00
Completed	2026-04-27T00:38:05.925097160+00:00
Tags	`eval-scheduled`
Tokens	2165805 in / 12522 out
Eval score	0.88
└ blocking impact	0.90
└ completeness	1.00
└ constraint fidelity	0.55
└ coordination overhead	0.85
└ correctness	0.95
└ downstream usability	0.90
└ efficiency	0.90
└ intent fidelity	0.80
└ style adherence	0.90

Description

Harden the existing codex executor (src/commands/codex_handler.rs) so it works reliably against custom OAI-compatible endpoints — specifically the user's repro target https://lambda01.tail334fe6.ts.net:30000 running model qwen3-coder.

Per Phase 1 research (docs/research/thin-wrapper-executors-2026-04.md), the right architecture is already in place: per-turn codex exec spawn with replayed conversation history. The remaining gap is configuration plumbing — wg session config (model + endpoint + api key) must flow through into codex's model_providers config so codex talks to the user's endpoint instead of api.openai.com.
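
The per-turn replay idea can be illustrated with a small sketch. It is not taken from codex_handler.rs: the `Role` and `Turn` types, the `render_replay_prompt` function, and the `User:` / `Assistant:` transcript layout are all hypothetical, chosen only to show how earlier turns might be flattened into the prompt for one fresh `codex exec` invocation.

```rust
/// Who produced a turn. Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Role {
    User,
    Assistant,
}

/// One entry of stored conversation history. Hypothetical type.
#[derive(Debug, Clone, PartialEq)]
pub struct Turn {
    pub role: Role,
    pub text: String,
}

/// Flatten all prior turns plus the new user message into a single prompt.
/// Each spawn starts with no memory, so the whole history is replayed.
/// The "User:" / "Assistant:" labels are an assumed layout, not a codex format.
pub fn render_replay_prompt(history: &[Turn], new_message: &str) -> String {
    let mut out = String::new();
    for turn in history {
        let label = match turn.role {
            Role::User => "User",
            Role::Assistant => "Assistant",
        };
        out.push_str(label);
        out.push_str(": ");
        out.push_str(&turn.text);
        out.push_str("\n\n");
    }
    out.push_str("User: ");
    out.push_str(new_message);
    out
}
```

Because each turn is a separate process, the handler stays stateless between turns; the cost is that prompt size grows with the length of the conversation.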

File scope (no overlap with siblings)

src/commands/codex_handler.rs (primary)
src/commands/init.rs / src/commands/setup.rs (only the codex-route writer pieces, if needed)
src/service/executor.rs (only if codex executor env var passing needs adjustment)
tests/codex_handler_oai_compat.rs (new file)

Implementation guidance

1. Read WG_MODEL and WG_ENDPOINT (and api key from existing session config) on handler startup.
2. Write a per-session ~/.codex/config.toml overlay (or use codex exec --config model_provider=...,base_url=...) so the spawned codex CLI talks to the configured endpoint.
3. Verify api_key is passed via env to the spawned codex process (do not log it).
4. Match claude_handler's error surface: any spawn/exit-code error becomes a clear handler.log line + outbox error reply.
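
The guidance above could be wired roughly as follows. This is a minimal sketch under stated assumptions, not the code on main: `required_env`, `config_overrides`, `build_codex_command`, `run_turn`, the provider id `wg_custom`, and the env var name `WG_API_KEY` are hypothetical names chosen for illustration. The `model_providers.<id>.*` keys follow the codex config.toml layout as commonly documented and should be checked against the installed CLI version.

```rust
use std::process::Command;

/// Hypothetical provider id used only in this sketch.
const PROVIDER_ID: &str = "wg_custom";
/// Hypothetical env var that the provider entry tells codex to read the key from.
const API_KEY_ENV: &str = "WG_API_KEY";

/// Read a required variable such as WG_MODEL or WG_ENDPOINT.
pub fn required_env(name: &str) -> Result<String, String> {
    std::env::var(name).map_err(|_| format!("{name} is not set"))
}

/// Quote a value as a TOML basic string (escapes backslash and double quote).
fn toml_str(s: &str) -> String {
    format!("\"{}\"", s.replace('\\', "\\\\").replace('"', "\\\""))
}

/// Build `key=value` overrides, one per `--config` flag.
/// The key names are assumptions about codex's config schema.
pub fn config_overrides(model: &str, endpoint: &str) -> Vec<String> {
    vec![
        format!("model={}", toml_str(model)),
        format!("model_provider={}", toml_str(PROVIDER_ID)),
        format!("model_providers.{}.name={}", PROVIDER_ID, toml_str("wg custom endpoint")),
        format!("model_providers.{}.base_url={}", PROVIDER_ID, toml_str(endpoint)),
        format!("model_providers.{}.env_key={}", PROVIDER_ID, toml_str(API_KEY_ENV)),
    ]
}

/// Assemble the per-turn `codex exec` command.
/// The API key travels only through the child's environment, never through argv.
pub fn build_codex_command(model: &str, endpoint: &str, api_key: &str, prompt: &str) -> Command {
    let mut cmd = Command::new("codex");
    cmd.arg("exec");
    for kv in config_overrides(model, endpoint) {
        cmd.arg("--config").arg(kv);
    }
    cmd.arg(prompt);
    cmd.env(API_KEY_ENV, api_key);
    cmd
}

/// Run one turn and map both failure modes to a single error string that a
/// caller could write to handler.log and send as an outbox error reply.
/// The message deliberately omits the environment, so the key is not logged.
pub fn run_turn(mut cmd: Command) -> Result<String, String> {
    let output = cmd
        .output()
        .map_err(|e| format!("codex spawn failed: {e}"))?;
    if !output.status.success() {
        return Err(format!(
            "codex exited with {}: {}",
            output.status,
            String::from_utf8_lossy(&output.stderr).trim()
        ));
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Passing overrides on the command line avoids touching the user's ~/.codex/config.toml. The overlay-file route named in step 2 would carry the same keys in a file instead.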

Validation

Failing test written first: tests/codex_handler_oai_compat.rs::test_codex_handler_uses_custom_base_url. It spawns codex_handler against a stub HTTP server (wiremock or hyper-based), verifies that requests actually hit the stub URL rather than api.openai.com, and replays 5 turns.
Implementation makes the test pass.
cargo build + cargo test pass with no regressions.
codex_handler successfully spawns the codex CLI binary in a real run with WG_MODEL=qwen3-coder and a custom WG_ENDPOINT (gated on the codex CLI being installed; emit a clear SKIP if it is not).
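
The stub-server check can be sketched without extra dependencies. The task names wiremock or hyper; this example swaps in a plain `std::net::TcpListener` so that it stays self-contained. `Captured`, `spawn_stub`, and `handle` are hypothetical names, and the `{}` JSON reply is a placeholder rather than a valid model response.

```rust
use std::io::{BufRead, BufReader, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::sync::mpsc;
use std::thread;

/// What the stub remembers about each request it served.
#[derive(Debug)]
pub struct Captured {
    pub request_line: String,
    pub authorization: Option<String>,
}

/// Bind an ephemeral local port, serve `expected` connections on a
/// background thread, and report each captured request over a channel.
/// Returns the base URL to hand to the handler as its endpoint.
pub fn spawn_stub(expected: usize) -> std::io::Result<(String, mpsc::Receiver<Captured>)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let base_url = format!("http://{}", listener.local_addr()?);
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for stream in listener.incoming().take(expected) {
            if let Ok(stream) = stream {
                if let Ok(captured) = handle(stream) {
                    let _ = tx.send(captured);
                }
            }
        }
    });
    Ok((base_url, rx))
}

/// Parse just enough HTTP/1.1 to record the request line and bearer header,
/// drain the body, and answer 200 with a placeholder JSON object.
fn handle(mut stream: TcpStream) -> std::io::Result<Captured> {
    let mut reader = BufReader::new(stream.try_clone()?);
    let mut request_line = String::new();
    reader.read_line(&mut request_line)?;

    let mut authorization = None;
    let mut content_length = 0usize;
    loop {
        let mut line = String::new();
        // Stop at EOF or at the blank line that ends the headers.
        if reader.read_line(&mut line)? == 0 || line.trim_end().is_empty() {
            break;
        }
        if let Some((name, value)) = line.split_once(':') {
            match name.trim().to_ascii_lowercase().as_str() {
                "authorization" => authorization = Some(value.trim().to_string()),
                "content-length" => content_length = value.trim().parse().unwrap_or(0),
                _ => {}
            }
        }
    }

    // Read the declared body so the client finishes writing before we reply.
    let mut body = vec![0u8; content_length];
    reader.read_exact(&mut body)?;

    let reply = "{}";
    write!(
        stream,
        "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        reply.len(),
        reply
    )?;
    Ok(Captured {
        request_line: request_line.trim_end().to_string(),
        authorization,
    })
}
```

A test would pass the returned base URL as WG_ENDPOINT, run the turns, and then assert on the received `Captured` values: any request recorded by the stub went to the custom endpoint, and the `authorization` field shows whether the bearer token arrived.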

Implement directly — do not decompose further.

Depends on

done .assign-thin-wrapper-impl

Required by

(none)

Log

2026-04-26T17:33:26.650019570+00:00 Lightweight assignment: agent=Careful Programmer (f5143935), exec_mode=full, context_scope=task, reason=Careful Programmer is the only role fit for implementation; Careful tradeoff matches hardening/correctness-critical work; 33 completed tasks show experience with similar executor/integration work.
2026-04-26T17:33:26.824313329+00:00 Spawned by coordinator --executor claude --model opus
2026-04-26T17:33:39.162011591+00:00 Starting work — reading codex_handler.rs and Phase 1 research doc
2026-04-26T18:03:29.957565426+00:00 Task pending LLM gate validation
2026-04-27T00:27:33.130330271+00:00 Task rejected (1/3): Cannot smoke-test against lambda01: agency-picks-claude bug (merged but ineffective) routes EVERY task to executor=claude regardless of -x codex config. Just smoke'd in /tmp scratch dir: wg init -x codex -m qwen3-coder -e https://lambda01... → task spawned with Executor: claude, Model: qwen3-coder → claude CLI 404 → failed. Until agency-picks-claude-2 lands and actually works, codex executor work cannot be evaluated on its own merit.
2026-04-27T00:28:22.506164364+00:00 Spawned by coordinator --executor claude --model opus
2026-04-27T00:28:31.361205774+00:00 Starting work on codex_handler hardening for custom OAI-compat endpoints
2026-04-27T00:37:48.410846145+00:00 Validation results: codex_oai_compat unit tests (4) pass; live integration test test_codex_handler_uses_custom_base_url passes (codex CLI installed, request reached stub URL with bearer token); 1982 lib tests pass; cargo build clean. Implementation already on main from prior commit 3923c0bf2: codex_handler.rs L342-354 wires codex_oai_compat::endpoint_url_from_env + config_overrides + api_key_from_env. Pre-existing test breakages in integration_resume/integration_dual_executor (from a22c7c90f wg-nex-resume-311 ResumeConfig field additions) and integration_chat (from --executor-required init change) are unrelated to thin-wrapper-impl scope.
2026-04-27T00:38:05.925110936+00:00 Task pending LLM gate validation
2026-04-27T01:51:24.458671852+00:00 Migrated PendingValidation → Done (deprecate-pending-validation): agency `.evaluate-*` is now the dependency-unblock gate. To force re-spawn instead, run `wg reject <task>`.