hermes - [Bug]: _is_entitlement_failure over-matches xAI 'bad-credentials' 403: long-running TUI sessions can't auto-refresh stale OAuth tokens





Summary

_is_entitlement_failure in run_agent.py over-matches on xAI Grok 403 responses, so a legitimate "OAuth access token failed validation" error is misclassified as an unsubscribed-account entitlement failure. The defensive guard against entitlement refresh loops (the existing test references issue #26847) then suppresses the refresh-on-401 path in both cases, leaving long-running TUI sessions stuck on a stale token with no recovery.

Workaround: exit and reopen the TUI — the startup refresh path bypasses the broken classifier.

Repro

  1. Open a Hermes TUI session against provider/xai-oauth (SuperGrok).
  2. Let it sit idle long enough that the access token goes stale by xAI's server-side criteria (in my case, ~22 hours; can happen sooner if xAI rotates session-side).
  3. Send a request.
  4. xAI returns HTTP 403 with this body:
{
  "code": "The caller does not have permission to execute the specified operation",
  "error": "The OAuth2 access token could not be validated. [WKE=unauthenticated:bad-credentials]"
}
  5. Hermes logs "Non-retryable client error" and surfaces it to the user. No refresh attempt happens, even though the credential pool's _refresh_entry for this provider works fine (proven by opening a new TUI session: the startup-resolve path refreshes successfully).

Expected

The [WKE=unauthenticated:bad-credentials] suffix unambiguously indicates this is a credential-validation failure, not an entitlement failure. Hermes should:

  • Call _recover_with_credential_pool → try_refresh_current() → _swap_credential
  • Retry the request with the refreshed token
  • Either succeed (the typical case after a stale token) or, if the refresh itself fails terminally, fall through to the existing terminal-quarantine path
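The expected flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not Hermes's actual recovery code: FakePool, fake_send, and send_with_recovery are hypothetical names, try_refresh_current() is assumed to return a bool, and the credential swap that the report attributes to _swap_credential is folded into the fake pool's refresh for brevity.

```python
from typing import Callable, Tuple

BAD_CREDENTIALS_BODY = {
    "code": "The caller does not have permission to execute the specified operation",
    "error": "The OAuth2 access token could not be validated. "
             "[WKE=unauthenticated:bad-credentials]",
}


class FakePool:
    """Hypothetical stand-in for the credential pool (assumed interface)."""

    def __init__(self, refresh_succeeds: bool) -> None:
        self.refresh_succeeds = refresh_succeeds
        self.token = "stale"
        self.quarantined = False
        self.refresh_calls = 0

    def try_refresh_current(self) -> bool:
        # Assumption: returns True when a fresh token was obtained and swapped in.
        self.refresh_calls += 1
        if self.refresh_succeeds:
            self.token = "fresh"
        return self.refresh_succeeds


def fake_send(token: str) -> Tuple[int, dict]:
    """Simulated backend: rejects the stale token with the 403 body from the repro."""
    if token == "stale":
        return 403, BAD_CREDENTIALS_BODY
    return 200, {"ok": True}


def send_with_recovery(send: Callable[[str], Tuple[int, dict]], pool: FakePool) -> Tuple[int, dict]:
    status, body = send(pool.token)
    is_bad_credentials = (
        status == 403 and "[WKE=unauthenticated:" in str(body.get("error", ""))
    )
    if not is_bad_credentials:
        return status, body
    # Refresh at most once so a persistent 403 cannot cause a refresh loop.
    if not pool.try_refresh_current():
        pool.quarantined = True  # placeholder for the terminal-quarantine path
        return status, body
    # Retry exactly once with the refreshed token.
    return send(pool.token)
```

With FakePool(refresh_succeeds=True) the retry goes through after a single refresh; with refresh_succeeds=False the original 403 is returned and the pool entry is marked quarantined.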

Actual

_is_entitlement_failure returns True because the response body matches its substring heuristic on "caller does not have permission". Recovery then short-circuits and returns False, so the error surfaces as non-retryable.
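A small sketch of why the heuristic cannot separate the two cases. naive_is_entitlement_failure is a hypothetical stand-in written for illustration; the real function in run_agent.py may be structured differently. Only the two response bodies come from this report, and the entitlement error text is elided there, so it stays elided here.

```python
CODE_TEXT = "The caller does not have permission to execute the specified operation"

entitlement_body = {
    "code": CODE_TEXT,
    "error": "... active Grok subscription. Manage at https://grok.com",
}
bad_credentials_body = {
    "code": CODE_TEXT,
    "error": "The OAuth2 access token could not be validated. "
             "[WKE=unauthenticated:bad-credentials]",
}


def naive_is_entitlement_failure(body: dict, status_code: int) -> bool:
    """Hypothetical substring heuristic, assumed to resemble the current behaviour."""
    if status_code != 403:
        return False
    # Flatten every field into one lowercase string and search for the phrase.
    text = " ".join(str(value) for value in body.values()).lower()
    return "caller does not have permission" in text


# Both bodies carry the same "code" text, so both are classified as entitlement.
print(naive_is_entitlement_failure(entitlement_body, 403))      # True
print(naive_is_entitlement_failure(bad_credentials_body, 403))  # True
```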

Root cause

xAI's API returns the same code field text for two distinct conditions:

| Condition | code (same) | error field (the disambiguator) |
| --- | --- | --- |
| Entitlement (account isn't SuperGrok-subscribed) | "The caller does not have permission to execute the specified operation" | "... active Grok subscription. Manage at https://grok.com" (or similar entitlement language) |
| Bad credentials (access token failed validation) | "The caller does not have permission to execute the specified operation" | "The OAuth2 access token could not be validated. [WKE=unauthenticated:bad-credentials]" |

The existing tests in tests/run_agent/test_codex_xai_oauth_recovery.py cover the entitlement case correctly (test_is_entitlement_failure_matches_real_xai_bodies), but there's no test case for the bad-credentials variant, so nothing catches the classifier treating both identically.

The [WKE=unauthenticated:bad-credentials] suffix is xAI's authoritative disambiguator. Hermes currently ignores it.
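One way to read the disambiguator is to parse it out of the error field. The sketch below assumes the marker always has the shape [WKE=<category>:<reason>], which is inferred from the single example in this report; xAI may emit other shapes. parse_wke is a hypothetical helper, not an existing Hermes function.

```python
import re
from typing import Optional, Tuple

# Assumed marker shape: "[WKE=<category>:<reason>]" (inferred from one sample).
_WKE_RE = re.compile(r"\[WKE=([a-z_-]+):([a-z_-]+)\]", re.IGNORECASE)


def parse_wke(error_text: Optional[str]) -> Optional[Tuple[str, str]]:
    """Return (category, reason) from a WKE marker, or None when it is absent."""
    match = _WKE_RE.search(error_text or "")
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


print(parse_wke(
    "The OAuth2 access token could not be validated. "
    "[WKE=unauthenticated:bad-credentials]"
))  # ('unauthenticated', 'bad-credentials')
print(parse_wke("... active Grok subscription. Manage at https://grok.com"))  # None
```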

Proposed fixes (escalating, pick one)

  1. Tightest — In _is_entitlement_failure, check the body's error field first: if it contains [WKE=unauthenticated: (or specifically [WKE=unauthenticated:bad-credentials]), return False immediately. Refresh path then handles it.

  2. Pragmatic — Require BOTH the entitlement keyword AND the absence of "OAuth2 access token could not be validated" before classifying as entitlement.

  3. Safest — When the WKE suffix says unauthenticated, attempt refresh-once before classifying. The existing loop-protection still kicks in on the second 403 if refresh didn't actually help.

Fix #1 is mechanical and matches the explicit disambiguator xAI sends. Recommended.
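A minimal sketch of fix #1, assuming the classifier receives the parsed body dict and the status code (as in the suggested tests). The function name, the 403-only guard, and the constants are illustrative assumptions rather than the actual run_agent.py code.

```python
_ENTITLEMENT_PHRASE = "caller does not have permission"
_UNAUTHENTICATED_MARKER = "[wke=unauthenticated:"


def is_entitlement_failure(body: object, status_code: int) -> bool:
    """Hypothetical classifier illustrating fix #1."""
    # Assumption: only 403 responses with a dict body are candidates.
    if status_code != 403 or not isinstance(body, dict):
        return False
    error_text = str(body.get("error") or "").lower()
    # Fix #1: the explicit disambiguator wins over the substring heuristic,
    # so the refresh path gets a chance to handle the failure.
    if _UNAUTHENTICATED_MARKER in error_text:
        return False
    code_text = str(body.get("code") or "").lower()
    return _ENTITLEMENT_PHRASE in code_text or _ENTITLEMENT_PHRASE in error_text
```

Checking the marker before the phrase keeps the existing entitlement behaviour intact for bodies without a WKE suffix, so the loop protection referenced by issue #26847 should still apply to real entitlement failures.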

Test additions

Suggested cases for tests/run_agent/test_codex_xai_oauth_recovery.py:

def test_is_entitlement_failure_false_for_bad_credentials_wke_suffix():
    """403 with WKE=unauthenticated:bad-credentials is auth failure, not entitlement."""
    from run_agent import AIAgent
    assert not AIAgent._is_entitlement_failure(
        {
            "code": "The caller does not have permission to execute the specified operation",
            "error": "The OAuth2 access token could not be validated. [WKE=unauthenticated:bad-credentials]",
        },
        403,
    )

def test_recover_with_credential_pool_refreshes_on_xai_bad_credentials_403():
    """A bad-credentials 403 from xai-oauth must trigger refresh."""
    # Same scaffolding as test_recover_with_credential_pool_still_refreshes_genuine_auth_failure,
    # but with status_code=403 and the bad-credentials error body. Should call try_refresh_current().

Impact

  • Any long-running TUI / chat session against provider/xai-oauth will eventually 403 once the token goes stale, and the user has to exit/reopen to recover.
  • Bridge adapters (Discord, Telegram, etc.) appear unaffected in practice because their process lifecycle / proactive refresh cadence keeps tokens fresh enough that the reactive-recovery path is rarely exercised. But they're vulnerable to the same bug under the right timing.
  • Reproduced on two independent installations of Hermes against two separate SuperGrok-active xAI OAuth accounts — same exact symptom, same exact 403 body.

Environment

  • Hermes — recent v0.14.x snapshot (cloned source, current main)
  • Python 3.11.15 on Linux
  • provider/xai-oauth source manual:xai_pkce (not loopback_pkce, but the bug is upstream of the loopback-vs-manual distinction)
  • xAI Grok backend, grok-4.3 model, https://api.x.ai/v1
