hermes - ✅(Solved) Fix [Bug][auth] Refresh token rotation not persisted — RT reuse triggers session revocation across all profiles (v0.11.0) [1 pull requests, 2 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#15099 (fetched 2026-04-25 06:24:36)
Comments: 2
Participants: 2
Timeline events: 11
Reactions: 0
Timeline (top): labeled ×4, referenced ×3, commented ×2, closed ×1

Fix Action

Fix / Workaround

  1. Re-OAuthed all 4 profiles via hermes auth add nous --type oauth --no-browser
  2. Patched our custom self-heal hook to fall back to expires_at when obtained_at is missing
  3. Restarted all 4 gateways; verified /v1/chat/completions returns HTTP 200 and /api/oauth/token mints fresh tokens successfully


PR fix notes

PR #15111: fix(nous-oauth): preserve obtained_at in pool + actionable message on RT reuse

Description (problem / solution / changelog)

Closes #15099.

What this PR does

Two narrow Nous OAuth fixes, both motivated by @camelludo's #15099 report. After tracing the refresh-token persistence path end-to-end I couldn't find the 'client silently drops rotated RT' bug the report hypothesised — the pool + singleton write-back machinery is actually correct. What IS broken is narrower, and the report gives us enough evidence to fix both issues completely.

Fix 1: _seed_from_singletons() preserves mint/refresh timestamps

When the pool is seeded from the providers.nous singleton, the upsert payload was missing obtained_at, agent_key_obtained_at, expires_in, agent_key_expires_in, agent_key_id, and agent_key_reused. Fresh credentials showed up with obtained_at: None, which breaks freshness-sensitive consumers (self-heal hooks, pool pruning by age) — they treat just-minted credentials as older than they are. @camelludo explicitly called this out as the 'secondary issue' in the report.
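As a rough illustration of Fix 1, the sketch below copies the mint/refresh metadata into a pool upsert payload. The field names come from the PR description; the function name `build_seed_payload` and the dict layout are assumptions for illustration, not the actual code in agent/credential_pool.py.

```python
from typing import Any, Dict

# Field names are taken from the PR description; everything else is hypothetical.
PRESERVED_FIELDS = (
    "obtained_at",
    "agent_key_obtained_at",
    "expires_in",
    "agent_key_expires_in",
    "agent_key_id",
    "agent_key_reused",
)


def build_seed_payload(singleton: Dict[str, Any]) -> Dict[str, Any]:
    """Build a pool upsert payload from a providers.nous-style singleton dict."""
    payload = {
        "access_token": singleton.get("access_token"),
        "refresh_token": singleton.get("refresh_token"),
        "expires_at": singleton.get("expires_at"),
    }
    # Copy metadata only when the singleton actually has a value, so an
    # existing pool value is never overwritten with an explicit None.
    # "is not None" keeps legitimate falsy values such as agent_key_reused=False.
    for field in PRESERVED_FIELDS:
        if singleton.get(field) is not None:
            payload[field] = singleton[field]
    return payload
```

With a payload built this way, a consumer that sorts by obtained_at sees the real mint time instead of None.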

Fix 2: Actionable message on invalid_grant: refresh token reuse detected

The Nous Portal server implements OAuth 2.1 RT rotation and returns invalid_grant: Refresh token reuse detected when a rotated RT gets used a second time. In real-world reports this almost always means an external process (monitoring script, custom self-heal hook, another install sharing ~/.hermes/auth.json) called POST /api/oauth/token with Hermes's RT without persisting the rotated value back. The generic reuse message gave users no clue about the external-process cause, so they reported it as a Hermes persistence bug.

The PR detects that specific error signature and rewrites it to:

Nous Portal detected refresh-token reuse and revoked this session. This usually means an external process (monitoring script, custom self-heal hook, or another Hermes install sharing ~/.hermes/auth.json) called POST /api/oauth/token with Hermes's refresh token without persisting the rotated token back. Nous refresh tokens are single-use — only Hermes may call the refresh endpoint. For health checks, use hermes auth status instead. Re-authenticate with: hermes auth add nous

Generic invalid_grant errors (e.g. Refresh session has been revoked) keep their original server descriptions untouched.
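The matching rule described above can be sketched as follows. This is a minimal sketch under stated assumptions: `describe_refresh_error` is a hypothetical helper, the actionable text is abbreviated from the message quoted above, and the real check in hermes_cli/auth.py may match the error differently.

```python
# Substring used to recognise the reuse signature (compared case-insensitively).
REUSE_MARKER = "refresh token reuse detected"

# Abbreviated form of the rewritten message quoted in the PR description.
ACTIONABLE_MESSAGE = (
    "Nous Portal detected refresh-token reuse and revoked this session. "
    "Nous refresh tokens are single-use; only Hermes may call the refresh "
    "endpoint. For health checks, use hermes auth status instead. "
    "Re-authenticate with: hermes auth add nous"
)


def describe_refresh_error(error: str, description: str) -> str:
    """Return the message to show the user for a failed refresh (hypothetical helper)."""
    if error == "invalid_grant" and REUSE_MARKER in (description or "").lower():
        return ACTIONABLE_MESSAGE
    # Every other error keeps the server's own description untouched.
    return description
```

Matching on both the error code and the description keeps other invalid_grant responses, such as a plain revocation, on their original wording.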

Note on the reporter's primary diagnosis

@camelludo's report hypothesised that Hermes wasn't persisting the rotated RT. I traced this carefully and the persistence is working correctly:

  • refresh_nous_oauth_pure() persists the new RT into state (line 2118)
  • resolve_nous_runtime_credentials() holds _auth_store_lock() across the HTTP call and _persist_state()s immediately after rotation (lines 2359, 2426)
  • pool._refresh_entry() calls _sync_device_code_entry_to_auth_store() so the pool and singleton stay in sync

Their own comment 2 describes the actual trigger: a systemd-timer-triggered aliveness script POSTing to /api/oauth/token with each profile's RT, explicitly discarding the response. That's RT reuse — their script consumed RT1, got RT2, threw RT2 away, and the gateway's next refresh with RT1 triggered the OAuth server's reuse detection. Fix 2 makes that specific failure mode unmistakable in the error message so future reports can self-diagnose.
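The sequence described above can be reproduced with a toy in-memory model. `ToyRotatingServer` is an assumption made up for this sketch and is not Nous's server logic; it only mimics single-use refresh tokens with reuse detection.

```python
import secrets


class ToyRotatingServer:
    """In-memory stand-in for an OAuth server with RT rotation (illustrative only)."""

    def __init__(self) -> None:
        self.current = secrets.token_hex(8)
        self.retired = set()
        self.revoked = False

    def refresh(self, rt: str) -> str:
        if self.revoked:
            raise PermissionError("invalid_grant: Refresh session has been revoked")
        if rt in self.retired:
            # A retired token came back: treat it as theft and kill the session.
            self.revoked = True
            raise PermissionError("invalid_grant: Refresh token reuse detected")
        if rt != self.current:
            raise PermissionError("invalid_grant: unknown refresh token")
        self.retired.add(rt)
        self.current = secrets.token_hex(8)
        return self.current


def simulate() -> list:
    server = ToyRotatingServer()
    gateway_rt = server.current   # RT1, as stored by the gateway
    server.refresh(gateway_rt)    # external script consumes RT1 and discards RT2
    errors = []
    for _ in range(2):            # the gateway's next two refresh attempts
        try:
            gateway_rt = server.refresh(gateway_rt)
        except PermissionError as exc:
            errors.append(str(exc))
    return errors
```

In this model the gateway's first attempt reports reuse and the second reports a revoked session, which matches the one-reuse-then-revoked pattern in the failure matrix.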

Validation

  • tests/agent/test_credential_pool.py — 45/45 pass (1 new)
  • tests/hermes_cli/test_auth_nous_provider.py — 12/12 pass (2 new)
  • E2E-verified: seeded singleton with obtained_at → pool entry keeps it; fake 400-with-reuse response → actionable rewritten message; fake 400-without-reuse → original description unchanged

Changed files

  • agent/credential_pool.py (modified, +12/-0)
  • hermes_cli/auth.py (modified, +22/-0)
  • tests/agent/test_credential_pool.py (modified, +69/-0)
  • tests/hermes_cli/test_auth_nous_provider.py (modified, +80/-0)

Bug Description

On Hermes Agent v0.11.0 (SHA acdcb167), the Nous Portal refresh-token chain dies within 4–10 days on every profile, and the failure mode indicates the client is not persisting the rotated refresh_token. Nous's OAuth server treats reuse of a previously-rotated RT as a token-theft signal and revokes the entire session chain.

Today I diagnosed this incident on 4 profiles simultaneously (Pedro / Selim / Omar / Atlas) — 7 credentials total across their pools, all failing. Below is the reproduction evidence.

Environment

  • Hermes Agent v0.11.0 (2026.4.23), SHA acdcb167 (0 commits behind origin/main as of this writing)
  • Ubuntu 22.04, Python 3.11.15, running as 4-profile setup (root + 3 sub-profiles under /root/.hermes/profiles/*)
  • Account sub: cmnit8tqn000cl704ac8x2jn8 (Scale tier, $50/mo, subscription_tier: 3)
  • OAuth flow used: hermes auth add nous --type oauth --no-browser (device-code variant)
  • All 4 profiles re-OAuthed across 2026-04-13 → 2026-04-15; all 7 pool entries dead by 2026-04-24
  • Distinct from #14435 (that's Tool Gateway server-side provisioning; this is client-side RT rotation)

7-row failure matrix (from live POST /api/oauth/token with grant_type=refresh_token)

| profile | pool # | cred age (h) | server response |
|---|---|---|---|
| pedro | 0 | 116.8 | invalid_grant: Refresh session has been revoked |
| selim | 0 | 99.0 | invalid_grant: Refresh token reuse detected; please re-authenticate |
| selim | 1 | 3.0 | invalid_grant: Refresh session has been revoked |
| omar | 0 | 116.8 | invalid_grant: Refresh session has been revoked |
| omar | 1 | 99.0 | invalid_grant: Refresh session has been revoked |
| atlas | 0 | 116.8 | invalid_grant: Refresh session has been revoked |
| atlas | 1 | 99.0 | invalid_grant: Refresh session has been revoked |

The Selim #0 response — "Refresh token reuse detected" — is the smoking gun. Nous's server is telling us the client tried to use an RT that had already been rotated and retired. The other 6 rows are the consequence: after the server detected RT reuse, the entire session chain was revoked, cascading through every subsequent refresh attempt.

The OAuth 2.1 spec (section 6.1) says: "The authorization server MUST… issue a new refresh token, in which case it MUST revoke the previous refresh token after a new refresh token is issued." This is what Nous is implementing. For clients to coexist with this, they must persist the newly-issued RT from each refresh response and discard the old one.
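A minimal sketch of what "persist the new RT and discard the old one" can look like on the client side. The flat JSON layout and the name `persist_rotated_token` are assumptions for illustration; this is not the schema of Hermes's auth.json nor its actual write path.

```python
import json
import os
import tempfile
from pathlib import Path


def persist_rotated_token(store_path: Path, token_response: dict) -> dict:
    """Write a refresh response back to a JSON store, replacing the old RT."""
    state = json.loads(store_path.read_text()) if store_path.exists() else {}
    state["access_token"] = token_response["access_token"]
    # A rotating server returns a new refresh_token; keep the stored one
    # only when the response did not include a replacement.
    if token_response.get("refresh_token"):
        state["refresh_token"] = token_response["refresh_token"]
    # Write to a temp file in the same directory, then rename atomically,
    # so a crash mid-write cannot leave a truncated store behind.
    fd, tmp_name = tempfile.mkstemp(dir=str(store_path.parent), suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_name, store_path)
    return state
```

Any process that calls the refresh endpoint needs to go through a write-back like this; one that discards the response retires the stored token without replacing it.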

Why the v0.10.0 "real fix" commits did not prevent this

The v0.9.0 → v0.10.0 release notes and my upgrade tracking pointed to three commits as the auth persistence fix:

  • 32cea0c0 — dashboard showing Nous as 'not connected' false-negative (#9261)
  • c7fece1f — normalize Nous device-code pool to avoid duplicates
  • c096a693 — mirror Nous OAuth credentials to providers.nous on CLI login

These address pool dedup and provider-pointer mirroring. None of them touch the code path that handles the refresh_response → write back the NEW refresh_token to auth.json. If that code path silently drops the new RT, every subsequent refresh uses the now-retired old RT, which Nous treats as reuse.

Request: can someone on the Hermes team confirm whether _refresh_nous_access_token (or equivalent) is persisting the refresh_token field from the response JSON back into credential_pool.nous[active].refresh_token? If it's only persisting access_token, that's the bug.

Secondary issue discovered in the same diagnosis

hermes auth add nous --type oauth --no-browser on v0.11.0 writes pool entries without obtained_at or agent_key_obtained_at populated (fields show None in the saved JSON). Other fields like expires_at, access_token, refresh_token, label, agent_key are all populated correctly. This is a regression compared to older credentials in the same pool — which have obtained_at set to a real ISO 8601 timestamp.

Downstream impact: any tool that sorts/prunes pool entries using obtained_at as the freshness signal will silently treat fresh credentials as "oldest" and may evict them. I hit this myself in a custom self-heal hook that uses obtained_at to pick the freshest pool entry — fresh credentials were being auto-pruned on every gateway restart.
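A hook that ranks entries by age can guard against a missing obtained_at by falling back to expires_at. The sketch below assumes pool entries are plain dicts with ISO 8601 strings; the helper names are hypothetical and not part of Hermes.

```python
from datetime import datetime, timezone
from typing import Optional


def _parse(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string, treating naive timestamps as UTC."""
    if not value:
        return None
    parsed = datetime.fromisoformat(value)
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


def freshness_key(entry: dict) -> datetime:
    # Prefer obtained_at; fall back to expires_at when it is missing.
    # The fallback is approximate: expires_at is later than the mint time,
    # so it slightly favours entries that lack obtained_at.
    stamp = _parse(entry.get("obtained_at")) or _parse(entry.get("expires_at"))
    return stamp or datetime.min.replace(tzinfo=timezone.utc)


def pick_freshest(entries: list) -> dict:
    """Return the entry with the most recent freshness timestamp."""
    return max(entries, key=freshness_key)
```

Entries with neither field sort last, so they are never preferred over an entry that carries a timestamp.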

Suggested fix: ensure the code path that constructs the pool entry in auth_add_command / _nous_device_code_login populates obtained_at and agent_key_obtained_at with the current UTC timestamp when the mint completes.
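The suggested stamping step could look roughly like this. `stamp_mint_time` is a hypothetical helper, not a function in hermes_cli/auth.py, and the entry is assumed to be a plain dict.

```python
from datetime import datetime, timezone
from typing import Optional


def stamp_mint_time(entry: dict, now: Optional[datetime] = None) -> dict:
    """Return a copy of entry with missing mint timestamps set to the current UTC time."""
    now = now or datetime.now(timezone.utc)
    stamped = dict(entry)
    for field in ("obtained_at", "agent_key_obtained_at"):
        # Fill only absent or None values; an existing timestamp is kept.
        if stamped.get(field) is None:
            stamped[field] = now.isoformat()
    return stamped
```

Accepting an optional `now` keeps the helper deterministic in tests.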

What we did to unblock ourselves

  1. Re-OAuthed all 4 profiles via hermes auth add nous --type oauth --no-browser
  2. Patched our custom self-heal hook to fall back to expires_at when obtained_at is missing
  3. Restarted all 4 gateways; verified /v1/chat/completions returns HTTP 200 and /api/oauth/token mints fresh tokens successfully

Working state confirmed as of 2026-04-24.

Reproducer steps

  1. Start with a clean Hermes v0.11.0 setup, any subscription tier
  2. Run hermes auth add nous --type oauth on any profile, approve via browser
  3. Leave the gateway running for 4–10 days with moderate traffic (any channel that triggers _refresh_nous_access_token a few times)
  4. Eventually POST /api/oauth/token grant_type=refresh_token will return invalid_grant: Refresh token reuse detected — typically around the mark when the 2nd or 3rd refresh happens
  5. All subsequent calls cascade to Refresh session has been revoked

Willing to help

Happy to share full JWT payloads, OAuth response bodies, or patch a candidate fix against hermes_cli/auth.py if someone on the Nous team can point me at the right function. The fact that this hit 4 independent profiles within 10 days on the same account suggests a deterministic bug, not a flaky edge case.

Tagging for visibility: @someone-on-auth (please re-tag the right maintainer).

extent analysis

TL;DR

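According to PR #15111, Hermes already persists the rotated refresh_token; the reuse error was traced to an external aliveness script that POSTed to /api/oauth/token with each profile's RT and discarded the response. The PR fixes two narrower problems: pool entries seeded from the providers.nous singleton now keep obtained_at and related fields, and the reuse error is rewritten into an actionable message.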

Guidance

  • Check for any external process (monitoring script, custom self-heal hook, or another install sharing ~/.hermes/auth.json) that calls POST /api/oauth/token with Hermes's refresh token without persisting the rotated token back.
  • For health checks, use hermes auth status instead of calling the refresh endpoint directly.
  • Use a build that includes PR #15111 so that pool entries seeded from the providers.nous singleton keep obtained_at and agent_key_obtained_at.
  • After a reuse-triggered revocation, re-authenticate with hermes auth add nous.

Example

No code example is provided as the issue does not include the relevant code snippets.

Notes

The invalid_grant: Refresh token reuse detected error means a previously rotated refresh token was presented to the server again. PR #15111 traced the persistence path in Hermes and found it correct, so the evidence points to a process outside Hermes reusing the token rather than credential_pool.nous[active] failing to update.

Recommendation

Stop any script that refreshes tokens outside Hermes, then re-authenticate each affected profile. For the secondary issue, patching a custom self-heal hook to fall back to expires_at when obtained_at is missing, as described in the issue, prevents fresh credentials from being evicted until a build with PR #15111 is installed.
