verify-fix-nex

Verify: fix-nex-chat-mirror end-to-end — smoke test the A fix

Metadata

Status: done
Assigned: agent-2109
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Model: codex:gpt-5.5
Created: 2026-05-03T21:30:47.630194808+00:00
Started: 2026-05-04T01:41:37.050985498+00:00
Completed: 2026-05-04T01:48:19.153139648+00:00
Tags: priority-critical, verify, smoke, nex, chat, eval-scheduled
Eval score: 0.95
└ blocking impact: 0.92
└ completeness: 1.00
└ constraint fidelity: 0.85
└ coordination overhead: 0.98
└ correctness: 0.95
└ downstream usability: 0.95
└ efficiency: 0.90
└ intent fidelity: 0.85
└ style adherence: 0.95

Description

Validation gate before B can run. Verify that the narrow fix from fix-nex-chat-mirror actually works end-to-end by running the simulated-human-in-TUI smoke test against the live tailnet endpoint.

What to verify

The full canonical user flow on lambda01/qwen3-coder-30b:

  1. pkill -f 'wg tui'; cargo install --path . && wg tui (fresh process on a fresh binary; pkill exits non-zero when no TUI is running, so it must not gate the install with &&)
  2. Open new-chat dialog
  3. Pick nex executor, model=qwen3-coder-30b, endpoint=https://lambda01.tail334fe6.ts.net:30000
  4. Click Launch
  5. Type 'hi' in the new chat tab
  6. ASSERT: response arrives within 30s
  7. Type a follow-up question
  8. ASSERT: response continues coherently
  9. Exit TUI (Ctrl+C or quit)
  10. Restart TUI
  11. ASSERT: nex chat tab is reattached, prior conversation visible
  12. Send another message
  13. ASSERT: response continues from prior context
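
The timed assertion in step 6 can be sketched as a polling check over captured pane text. This is a minimal sketch, not the smoke-tui-nex-end-to-end harness itself: the function name wait_for_text, the tmux session name wg-smoke, and the reply pattern are all hypothetical, and it assumes the TUI runs inside a tmux session so that tmux capture-pane -p can read the screen.

```shell
#!/usr/bin/env bash
# wait_for_text TIMEOUT_SECS PATTERN CMD [ARGS...]
# Runs CMD about once per second until its output matches PATTERN
# (extended regex) or TIMEOUT_SECS have elapsed.
# Returns 0 on a match, 1 on timeout.
wait_for_text() {
  local timeout=$1 pattern=$2
  shift 2
  local deadline=$(( SECONDS + timeout ))
  while :; do
    if "$@" 2>/dev/null | grep -Eq -- "$pattern"; then
      return 0
    fi
    if (( SECONDS >= deadline )); then
      return 1
    fi
    sleep 1
  done
}

# Hypothetical usage for steps 5 and 6 (session name and pattern are assumptions):
#   tmux new-session -d -s wg-smoke 'wg tui'
#   tmux send-keys -t wg-smoke 'hi' Enter
#   wait_for_text 30 'REPLY_PATTERN' tmux capture-pane -p -t wg-smoke \
#     || echo "FAIL: step 6, no response within 30s"
```

The pattern should match something that only appears in the model's reply, since the typed 'hi' is itself echoed into the pane and would otherwise satisfy the check immediately.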

If ALL of these pass: A is verified. B can proceed.

If ANY of these fail: A is not actually fixed; do NOT advance to B. Document the specific failure mode in the task log, then decide whether to re-attempt A or escalate.
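
The all-or-nothing gate can be expressed as a small aggregation over the four assertions (steps 6, 8, 11, and 13). The sketch below is illustrative only; the function name verdict and the STEP:ok / STEP:fail result format are hypothetical conventions, not part of any existing harness.

```shell
#!/usr/bin/env bash
# verdict RESULT...
# Each RESULT has the form "STEP:ok" or "STEP:fail" (hypothetical format).
# Prints PASS and returns 0 only if every result is ok; otherwise prints
# the failing step numbers and returns 1.
verdict() {
  local r
  local failed=()
  for r in "$@"; do
    if [[ ${r#*:} != ok ]]; then
      failed+=("${r%%:*}")
    fi
  done
  if (( ${#failed[@]} == 0 )); then
    echo "PASS"
    return 0
  fi
  echo "FAIL steps: ${failed[*]}"
  return 1
}
```

For example, verdict 6:ok 8:ok 11:fail 13:ok reports step 11 as the failure and returns non-zero, which a wrapper script could use to block B from being scheduled.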

Test mechanism

Use the simulated-human smoke harness from smoke-tui-nex-end-to-end (filed earlier) if it exists and is functional. If not, file a manual-verification result with screenshot/text-capture evidence.

Validation

  • All 13 steps above executed against the user's real endpoint
  • Each step's expected outcome confirmed (with capture: tmux pane text, daemon log excerpt, etc.)
  • Verdict: PASS (advances to B) OR FAIL (with specific failure mode documented)
  • No source / doc modifications — verification only
  • cargo install --path . was run before testing, and the installed binary was confirmed to include fix-nex-chat-mirror's commit
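
One way to collect the per-step captures listed above is to write each step's output to its own file as the run proceeds. This is a sketch under assumptions: the function name capture_step, the evidence directory, and the step-N.txt file naming are hypothetical, and the tmux command in the usage comment assumes the TUI runs in a session named wg-smoke.

```shell
#!/usr/bin/env bash
# capture_step DIR STEP CMD [ARGS...]
# Runs CMD and stores its stdout in DIR/step-STEP.txt, preceded by a
# header line carrying the step number and a UTC timestamp.
# Prints the path of the evidence file on success.
capture_step() {
  local dir=$1 step=$2
  shift 2
  mkdir -p "$dir" || return 1
  local out="$dir/step-$step.txt"
  {
    printf '# step %s captured %s\n' "$step" "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    "$@"
  } > "$out" || return 1
  echo "$out"
}

# Hypothetical usage (session name is an assumption):
#   capture_step evidence 6 tmux capture-pane -p -t wg-smoke
```

Keeping one file per step makes it straightforward to attach the relevant excerpt to the task log when a specific assertion fails.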

Depends on

Required by

Log