smoke-test-confirm

Smoke test: confirm codex agent reads AGENTS.md on startup

Metadata

Status: done
Assigned: agent-945
Agent identity: f51439356729d112a6c404803d88015d5b44832c6c584c62b96732b63c2b0c7e
Model: codex:gpt-5.5
Created: 2026-04-28T20:03:11.828113493+00:00
Started: 2026-04-28T21:41:05.240062371+00:00
Completed: 2026-04-28T21:41:44.460746094+00:00
Tags: eval-scheduled
Eval score: 0.17
└ blocking impact: 0.10
└ completeness: 0.00
└ constraint fidelity: 0.70
└ coordination overhead: 0.20
└ correctness: 0.00
└ downstream usability: 0.15
└ efficiency: 0.10
└ intent fidelity: 0.41
└ style adherence: 0.00

Description

Smoke verification that the codex CLI handler actually loads AGENTS.md (created by the prior task) and that its content reaches the agent's working context.

Task for the agent

In a brief log message, answer two questions:

  1. Do you see a file called AGENTS.md in your working directory? If so, what is the FIRST heading inside it?
  2. According to AGENTS.md, what command should you use for task management instead of built-in subagent / TaskCreate tools?

If you read the file correctly, both answers will match CLAUDE.md's content, since AGENTS.md is a copy of it:

  • First heading should be # Workgraph
  • Tool answer should reference the wg CLI

If AGENTS.md is missing or you can't read it, say so plainly. Don't fabricate content.
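
As a rough illustration of how the two expected answers could be derived from the file itself, here is a minimal Python sketch. The function names (`first_heading`, `mentions_wg_cli`) are hypothetical and not part of the wg CLI or the codex handler; the sketch assumes AGENTS.md uses Markdown `#` headings and that a whole-word `wg` counts as a reference to the CLI.

```python
import re
from pathlib import Path
from typing import Optional

# Built this way so the sketch does not embed a literal fence marker.
FENCE = "`" * 3


def first_heading(markdown_text: str) -> Optional[str]:
    """Return the first '#'-style heading line outside fenced code, or None."""
    in_fence = False
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            # Toggle on every fence line so '#' comments in code are skipped.
            in_fence = not in_fence
            continue
        if not in_fence and stripped.startswith("#"):
            return stripped
    return None


def mentions_wg_cli(markdown_text: str) -> bool:
    """Assumption: a whole-word 'wg' is treated as a reference to the wg CLI."""
    return re.search(r"\bwg\b", markdown_text) is not None


if __name__ == "__main__":
    path = Path("AGENTS.md")
    if not path.exists():
        # Mirrors the task's instruction: say so plainly, do not fabricate.
        print("AGENTS.md is missing")
    else:
        text = path.read_text(encoding="utf-8")
        print("First heading:", first_heading(text))
        print("Mentions wg CLI:", mentions_wg_cli(text))
```

Run from the agent's working directory, this would print the first heading and whether the file mentions `wg`, or report plainly that the file is absent.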

After answering, run wg done. No code changes are required for this task; it is purely an information-flow check.

Why this matters

Right now we don't know whether the codex handler is wiring AGENTS.md into the agent's prompt at all. If the agent answers correctly, we have evidence the handler works. If it answers incorrectly or claims no file exists, we have a concrete bug to file (codex handler not loading AGENTS.md).

Validation

  • Agent log contains both answers (first heading + wg-cli mention)
  • Answers are correct relative to CLAUDE.md content (the source of AGENTS.md's bytes)
  • Task reaches done status
  • No code mutations made — diff against main should be empty (or just the cleanup-pending marker)
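
The checklist above could be approximated with a small pure function like the following. This is only a sketch under stated assumptions: `validate_smoke_test` is a hypothetical helper, the log, status, and diff are passed in as plain strings (how they are collected is not specified here), and the cleanup-pending marker exception is not modeled because its format is not given.

```python
import re
from typing import Dict


def validate_smoke_test(log_text: str, status: str, diff_text: str) -> Dict[str, bool]:
    """Evaluate each validation bullet against already-collected inputs.

    Assumptions (not confirmed by the task description):
      - the heading answer is detected by the word 'Workgraph' in the log
      - the tool answer is detected by a whole-word 'wg' in the log
      - an empty diff string means no code mutations
    """
    return {
        "log_has_heading": "Workgraph" in log_text,
        "log_mentions_wg": re.search(r"\bwg\b", log_text) is not None,
        "status_is_done": status.strip().lower() == "done",
        "diff_is_empty": diff_text.strip() == "",
    }


if __name__ == "__main__":
    example_log = "AGENTS.md found. First heading: # Workgraph. Use the wg CLI."
    results = validate_smoke_test(example_log, "done", "")
    for name, passed in results.items():
        print(f"{name}: {'ok' if passed else 'FAILED'}")
```

Keeping the checks separate makes it easy to see which bullet failed. Correctness relative to CLAUDE.md still needs a comparison against that file's actual content, which this sketch does not perform.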

Depends on

Required by

Log