design-secure-credential

Design: secure credential storage for API keys (replace env vars)

Metadata

Status: done
Assigned: agent-1067
Agent identity: 3184716484e6f0ea08bb13539daf07686ee79d440505f1fdf2de0357707034c3
Model: claude:opus
Created: 2026-04-29T02:23:45.584213501+00:00
Started: 2026-04-29T02:30:03.590623373+00:00
Completed: 2026-04-29T02:32:16.286538535+00:00
Tags: priority-high, design, secrets, security, eval-scheduled
Tokens: 22637 in / 8142 out
Eval score: 0.40
└ blocking impact: 0.45
└ completeness: 0.35
└ constraint fidelity: 0.85
└ coordination overhead: 0.40
└ correctness: 0.30
└ downstream usability: 0.25
└ efficiency: 0.40
└ intent fidelity: 0.60
└ style adherence: 0.35

Description

Profiles currently reference API keys via env var names (api_key_env = "OPENROUTER_API_KEY"). This is brittle and leaky:

  • Env vars can be read from the process table (ps e, /proc/PID/environ) by the same user and by root (security)
  • Lost on shell restart (UX)
  • Propagated to every subprocess (broader leak surface)
  • Land in shell history if set inline
  • Multi-machine sync is manual

User asked: 'we should have reasonable storage not in env vars for the keys! help figure that out.'

Goal

Pick the storage approach for API keys + the resolver semantics. Hand off to the implementation task.

Options to evaluate

A. OS keyring (default for desktops/macOS)

  • Crate: keyring — wraps macOS Keychain, Linux secret-service / kwallet, Windows Credential Manager
  • Pros: standard, encrypted at rest, OS handles unlock, no plaintext on disk, no env leak
  • Cons: flaky on headless Linux (no D-Bus); not git-syncable; per-machine

B. Plaintext file at ~/.wg/secrets.toml (mode 0600)

  • Pros: dead simple, works headless (CI, servers, containers), inspectable
  • Cons: anyone with read access to the user account can grab the file
  • Better than env var (file perms vs. process listing) but not encrypted at rest
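
The file-permission guarantee in option B can be sketched as follows. This is a minimal, Unix-only sketch using only the standard library; the function names write_secrets_file and read_secrets_file are hypothetical, and the file's TOML layout is left unspecified here.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Create (or truncate) the secrets file with mode 0600 from the start,
/// so a newly created file is never group- or world-readable.
pub fn write_secrets_file(path: &Path, contents: &str) -> io::Result<()> {
    let mut f = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600)
        .open(path)?;
    // mode() only applies when the file is created; tighten an existing file too.
    f.set_permissions(fs::Permissions::from_mode(0o600))?;
    f.write_all(contents.as_bytes())
}

/// Refuse to read a secrets file that group or others can access.
pub fn read_secrets_file(path: &Path) -> io::Result<String> {
    let mode = fs::metadata(path)?.permissions().mode() & 0o777;
    if mode & 0o077 != 0 {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} has mode {:o}; expected 0600", path.display(), mode),
        ));
    }
    fs::read_to_string(path)
}
```

Rejecting a too-permissive file on read (rather than silently fixing it) mirrors what ssh does for private keys and makes an accidental chmod visible to the user.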

C. age-encrypted file at ~/.wg/secrets.age

  • Pros: encrypted at rest, the encrypted form is git-syncable, agent-style unlock works
  • Cons: age key management itself becomes a secret-storage problem (turtles); needs unlock per-process
  • Crate: age or rage

D. Pass-through to external secret manager

  • 1Password CLI: op://vault/item/field URL syntax that wg resolves at runtime via op read
  • pass: pass:openrouter/api-key resolves via pass show
  • gnome-keyring, Bitwarden, etc.
  • Pros: user's existing password manager; defers complexity to a tool that's already secure
  • Cons: assumes user has one configured + authed

E. Hybrid resolver (RECOMMENDED PATH)

Resolve in this order, first hit wins:

  1. Inline literal (only allowed for testing — warn loudly)
  2. Pass-through URL (op://..., pass:...) → call external tool
  3. OS keyring entry (default for new keys via wg secret set)
  4. Env var fallback (current behavior, deprecated with warning)
  5. Plaintext file fallback at ~/.wg/secrets.toml (only if explicitly enabled — secrets.allow_plaintext = true in config)

This means: existing env-var workflows keep working (back-compat); new users get keyring by default; power users can wire up their password manager; CI/headless can opt into the plaintext file.
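
The first-hit-wins order above can be sketched in Rust as follows. This is not wg's implementation: the Backend trait, MapBackend, Resolver, and Source names are all hypothetical, and in-memory maps stand in for the real pass-through, keyring, environment, and file backends.

```rust
use std::collections::HashMap;

/// Which step of the resolution order produced the value.
#[derive(Debug, PartialEq)]
pub enum Source {
    InlineLiteral,
    PassThrough,
    Keyring,
    EnvVar,
    PlaintextFile,
}

/// Hypothetical backend interface: look a name up, return None on a miss.
pub trait Backend {
    fn lookup(&self, name: &str) -> Option<String>;
}

/// In-memory stand-in so the sketch runs without a real keyring or file.
pub struct MapBackend(pub HashMap<String, String>);

impl Backend for MapBackend {
    fn lookup(&self, name: &str) -> Option<String> {
        self.0.get(name).cloned()
    }
}

pub struct Resolver<'a> {
    pub inline: Option<&'a str>,
    pub pass_through: &'a dyn Backend,
    pub keyring: &'a dyn Backend,
    pub env: &'a dyn Backend,
    pub plaintext: &'a dyn Backend,
    /// Mirrors secrets.allow_plaintext in config.
    pub allow_plaintext: bool,
}

impl<'a> Resolver<'a> {
    /// Try each source in order and return the first hit.
    pub fn resolve(&self, name: &str) -> Option<(Source, String)> {
        if let Some(v) = self.inline {
            eprintln!("warning: inline literal secret in use (testing only)");
            return Some((Source::InlineLiteral, v.to_string()));
        }
        if let Some(v) = self.pass_through.lookup(name) {
            return Some((Source::PassThrough, v));
        }
        if let Some(v) = self.keyring.lookup(name) {
            return Some((Source::Keyring, v));
        }
        if let Some(v) = self.env.lookup(name) {
            eprintln!("warning: env var fallback is deprecated");
            return Some((Source::EnvVar, v));
        }
        if self.allow_plaintext {
            if let Some(v) = self.plaintext.lookup(name) {
                return Some((Source::PlaintextFile, v));
            }
        }
        None
    }
}
```

Returning the Source alongside the value lets callers such as wg secret backend show report where a key actually came from.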

Forks to resolve

Profile schema

Today: api_key_env = "OPENROUTER_API_KEY"

Options:

  • F1) Replace with api_key_ref = "keyring:openrouter" (URI-style)
  • F2) Add api_key = "keyring:openrouter" and deprecate api_key_env
  • F3) Keep api_key_env as the primary field but make its value a URI (env: prefix opt-in)

Recommend F1 — clean break, more explicit, supports all backends via URI scheme.
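
A URI-style api_key_ref could be parsed along these lines. This is a sketch: SecretRef and parse_secret_ref are hypothetical names, and the scheme set (keyring:, env:, pass:, op://) is taken from the examples in this document rather than from a finalized spec.

```rust
/// Parsed form of an api_key_ref value.
#[derive(Debug, PartialEq)]
pub enum SecretRef {
    Keyring(String),
    Env(String),
    /// Holds the full op://vault/item/field URL, passed to `op read` unchanged.
    OnePassword(String),
    Pass(String),
}

pub fn parse_secret_ref(raw: &str) -> Result<SecretRef, String> {
    // op:// is handled first because it uses URL syntax, not scheme:name.
    if raw.starts_with("op://") {
        return Ok(SecretRef::OnePassword(raw.to_string()));
    }
    let (scheme, rest) = raw
        .split_once(':')
        .ok_or_else(|| format!("missing scheme in secret ref '{}'", raw))?;
    if rest.is_empty() {
        return Err(format!("empty name in secret ref '{}'", raw));
    }
    match scheme {
        "keyring" => Ok(SecretRef::Keyring(rest.to_string())),
        "env" => Ok(SecretRef::Env(rest.to_string())),
        "pass" => Ok(SecretRef::Pass(rest.to_string())),
        other => Err(format!("unknown secret scheme '{}'", other)),
    }
}
```

Rejecting a bare value such as "OPENROUTER_API_KEY" (no scheme) is what makes F1 a clean break: an old-style env var name cannot be silently misread as a keyring entry.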

Command surface

  • wg secret set <name> — prompts for value (echo off), stores in chosen backend
  • wg secret get <name> — redacted by default; --reveal to actually print (with warning)
  • wg secret list — names only, no values
  • wg secret rm <name>
  • wg secret backend show — show which backend(s) are active + reachable
  • wg secret backend set <keyring|plaintext|...> — change default for new wg secret set

Profile integration

  • wg profile create openrouter -m openrouter:... --secret openrouter → sets api_key_ref = "keyring:openrouter"
  • wg profile use <name> checks the referenced secret is reachable BEFORE switching the daemon. Print actionable hint if missing: "profile 'openrouter' references secret 'openrouter' but no entry found in keyring. Run: wg secret set openrouter"
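
The check-before-switch behavior of wg profile use might be structured like this. The function use_profile and its two closure parameters are hypothetical; the closures stand in for the real backend lookup and the real daemon switch.

```rust
/// Verify the referenced secret is reachable, and only then switch profiles.
/// `exists` stands in for a backend lookup; `switch` for the daemon switch.
pub fn use_profile<E, S>(
    profile: &str,
    secret: &str,
    exists: E,
    switch: S,
) -> Result<(), String>
where
    E: Fn(&str) -> bool,
    S: FnOnce(&str),
{
    if !exists(secret) {
        // Actionable hint: name the profile, the secret, and the fix.
        return Err(format!(
            "profile '{}' references secret '{}' but no entry found in keyring. \
             Run: wg secret set {}",
            profile, secret, secret
        ));
    }
    switch(profile);
    Ok(())
}
```

Because the switch closure runs only after the check passes, a missing secret leaves the daemon on its previous profile.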

Failure modes to spec

  • Headless Linux (no D-Bus, no keyring service): fall back to the plaintext-file backend if secrets.allow_plaintext = true; otherwise fail with a clear setup hint explaining how to opt in
  • Locked keychain (macOS, GNOME): the unlock prompt happens once per worker spawn — is that acceptable, or do we keep an in-memory cache for the dispatcher's lifetime?
  • Profile switch races: keyring lookup is async; what if wg profile use returns success but the secret is unreachable when the daemon next spawns a worker? (Pre-flight check, then fail fast on use.)
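
If the locked-keychain question is answered with an in-memory cache, it could be as small as the sketch below. SecretCache is a hypothetical type; the fetch closure stands in for the keyring lookup that may trigger an unlock prompt. Note the trade-off: cached values stay in the dispatcher's memory for its lifetime, and this sketch does not zero them on drop.

```rust
use std::collections::HashMap;

/// Per-dispatcher cache so the backend is hit once per secret,
/// not once per worker spawn.
pub struct SecretCache {
    entries: HashMap<String, String>,
}

impl SecretCache {
    pub fn new() -> Self {
        SecretCache { entries: HashMap::new() }
    }

    /// Return the cached value, calling `fetch` only on the first request.
    /// A failed fetch is not cached, so the next call retries.
    pub fn get_or_fetch<F>(&mut self, name: &str, fetch: F) -> Result<&str, String>
    where
        F: FnOnce(&str) -> Result<String, String>,
    {
        if !self.entries.contains_key(name) {
            let value = fetch(name)?;
            self.entries.insert(name.to_string(), value);
        }
        Ok(self.entries[name].as_str())
    }

    /// Drop a cached value, e.g. after `wg secret set` or `wg secret rm`.
    pub fn invalidate(&mut self, name: &str) {
        self.entries.remove(name);
    }
}
```

Not caching failures also covers the profile-switch race: a secret that was unreachable at one spawn is looked up again at the next, and the worker spawn can fail fast with the lookup error.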

Validation

  • Design doc posted via wg log
  • Storage approach chosen with rationale (probably E — hybrid resolver)
  • Schema decision (F1/F2/F3) made and justified
  • Full command surface specified
  • Migration path for users on existing api_key_env config (deprecation warning + automatic migration command)
  • Failure modes documented (headless, locked keychain, missing secret)
  • Smoke scenario list: set/get/list/rm + profile-with-secret happy path + missing-secret error path

Depends on

Required by

Log