research-current-openai

Research: current OpenAI/codex model tiers and claude-equivalent mapping (2026-04-28)

Metadata

Status: done
Assigned: agent-993
Agent identity: f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e
Model: claude:opus
Created: 2026-04-28T22:38:43.509006827+00:00
Started: 2026-04-28T22:45:28.078440627+00:00
Completed: 2026-04-28T22:48:32.656875426+00:00
Tags: priority-high, research, codex, config, eval-scheduled
Eval score: 0.94
└ blocking impact: 0.95
└ completeness: 0.95
└ constraint fidelity: 0.70
└ coordination overhead: 0.95
└ correctness: 0.95
└ downstream usability: 0.92
└ efficiency: 0.90
└ intent fidelity: 0.80
└ style adherence: 0.95

Description

Workgraph's codex-cli route currently hardcodes codex:o1-pro for everything. There's no haiku/sonnet/opus-equivalent tier breakdown for codex agents, and the agency pipeline (.evaluate-*, .flip-*, .assign-*) silently falls back to claude:haiku because [models.*] per-role overrides aren't seeded.
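
The silent fallback described above can be pictured with a minimal Python sketch. It is illustrative only: the function name `resolve_model`, the dictionary layout, and the lookup order are assumptions, not Workgraph's actual resolution code. Only the role names and the `claude:haiku` fallback come from this task.

```python
# Hypothetical sketch of per-role model resolution; not Workgraph's real code.
FALLBACK_MODEL = "claude:haiku"  # the fallback named in this task description


def resolve_model(role: str, overrides: dict) -> str:
    """Return the per-role override if one is seeded, else the fallback."""
    return overrides.get(role, FALLBACK_MODEL)


# With no [models.*] overrides seeded, every agency role lands on the fallback.
unseeded = {}
for role in ("evaluator", "flip", "assigner"):
    print(role, "->", resolve_model(role, unseeded))

# Once one role is seeded, only that role stops falling back.
# The value below is a placeholder, not a real model name.
seeded = {"evaluator": "codex:<cheap-tier model>"}
print("evaluator ->", resolve_model("evaluator", seeded))
print("flip ->", resolve_model("flip", seeded))
```

The point of the sketch is that nothing fails when an override is missing, which is why the problem goes unnoticed until someone inspects which model actually ran.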

Research the current state of OpenAI/codex models as of 2026-04-28 and produce the mapping needed to fix this.

What to find out

Use WebSearch / WebFetch (do NOT guess from training-cutoff knowledge — models have moved fast). Verify against:

  • OpenAI's official model list (platform.openai.com/docs/models or equivalent)
  • The codex CLI documentation for what models it supports
  • Recent (2026) pricing pages

Produce a mapping table in the task log with these columns:

| Tier role | Claude model | Codex/OpenAI equivalent (2026-04-28) | Rationale | $/1M input | $/1M output |
|---|---|---|---|---|---|
| cheap/fast (haiku) | claude:haiku | codex:??? | ... | ... | ... |
| balanced (sonnet) | claude:sonnet | codex:??? | ... | ... | ... |
| heavy (opus) | claude:opus | codex:??? | ... | ... | ... |

Specifically answer:

  1. What are the current top-tier (reasoning-heavy) codex models in 2026-04?
  2. What is the cheapest codex model suitable for short, structured outputs (the role agency uses haiku for: scoring, FLIP, assignment verdicts)?
  3. What is the mid-tier balanced codex model for normal worker tasks?
  4. What model name strings does the codex CLI handler accept? (Check src/dispatch/handler_for_model.rs or equivalent if uncertain — but only the registry, not full source review.)
  5. Is codex:o1-pro (the current default) actually still a current model name in 2026-04, or has it been deprecated/superseded?
  6. The user mentioned "gpt-5.5" — does this exist? If not, what's closest?
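
For questions 4 and 5, it may help to separate the route prefix from the model name before comparing against the registry. The sketch below assumes a `<route>:<model>` shape, inferred only from the strings that appear in this task (`codex:o1-pro`, `claude:haiku`); the helper `split_model_string` is hypothetical and does not reflect what the codex CLI handler actually accepts.

```python
# Hypothetical helper: split a route-prefixed model string such as "codex:o1-pro".
# The "<route>:<model>" shape is inferred from this task, not from the registry.
def split_model_string(value: str) -> tuple:
    # partition splits on the first colon only, so a model name that itself
    # contains a colon would stay intact in the second part.
    route, sep, model = value.partition(":")
    if not sep or not route or not model:
        raise ValueError(f"expected '<route>:<model>', got {value!r}")
    return route, model


print(split_model_string("codex:o1-pro"))
print(split_model_string("claude:haiku"))
```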

Validation

  • Mapping table with all 3 tiers populated, posted to task log via wg log
  • Each codex model name is verified against an authoritative 2026 source (cite URL + date in the log)
  • Cost figures included (or marked "could not verify" — never invented)
  • Concrete recommendation: which model strings should the fix task write into [models.evaluator] / [models.assigner] / [models.flip] / [agent] / [dispatcher] in the codex-cli route
  • Note any gotchas (rate limits, regional availability, codex CLI compat) that affect the recommendation
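
The first three validation criteria can be checked mechanically before the table is posted. The following is a minimal sketch under stated assumptions: the `MappingRow` fields, the `validate` function, and the `cost_cell` helper are hypothetical names introduced for illustration, and the tier labels are copied from the mapping table template in this task.

```python
from dataclasses import dataclass
from typing import List, Optional

# Tier labels as written in the mapping table template.
REQUIRED_TIERS = ("cheap/fast (haiku)", "balanced (sonnet)", "heavy (opus)")


@dataclass
class MappingRow:
    tier: str
    claude_model: str
    codex_model: str                # still "codex:???" until verified
    source_url: Optional[str]       # authoritative 2026 source
    source_date: Optional[str]      # date the source was checked
    input_cost: Optional[float]     # $/1M input; None means "could not verify"
    output_cost: Optional[float]    # $/1M output; None means "could not verify"


def validate(rows: List[MappingRow]) -> List[str]:
    """Return a list of problems; an empty list means the table passes."""
    problems = []
    by_tier = {row.tier: row for row in rows}
    for tier in REQUIRED_TIERS:
        row = by_tier.get(tier)
        if row is None:
            problems.append(f"{tier}: missing row")
            continue
        if "???" in row.codex_model or not row.codex_model.startswith("codex:"):
            problems.append(f"{tier}: codex model not populated")
        if not (row.source_url and row.source_date):
            problems.append(f"{tier}: no cited URL + date")
    return problems


def cost_cell(value: Optional[float]) -> str:
    """Render a cost, never inventing a figure for an unverified one."""
    return "could not verify" if value is None else f"{value:.2f}"


# An unfilled template row fails on both the model name and the citation.
template = MappingRow("heavy (opus)", "claude:opus", "codex:???", None, None, None, None)
for problem in validate([template]):
    print(problem)
print(cost_cell(template.input_cost))
```

Representing an unverified cost as `None` rather than `0.0` keeps "could not verify" distinguishable from a real price.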

Depends on

Required by

Log