hermes: Claude Code OAuth (Max/Pro plan) still hits the pay-per-token API endpoint and drains 'extra usage' credits instead of using subscription quota

Problem

When using Claude Code OAuth credentials (`sk-ant-oat01-...`) with a Claude Max plan subscription, Hermes still routes API calls through `https://api.anthropic.com/v1/messages`, the pay-per-token endpoint. This burns through the account's "extra usage" add-on credits rather than the included subscription quota.

Error received:

```
HTTP 400: You're out of extra usage. Add more at claude.ai/settings/usage and keep going.
```

This happens even though:

  • `hermes auth list` correctly shows `claude_code oauth` as linked
  • `hermes model` confirms "Claude Code credentials: ✓ (auto-detected)"
  • The OAuth token was freshly generated via `claude setup-token`
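
Since the credentials look healthy in every check above, the only visible signal is the response itself. Below is a minimal sketch of telling this failure apart from other HTTP 400 responses by matching the message text quoted in this report. The function name and the idea of matching on the message are assumptions for illustration, not part of Hermes or the Anthropic SDK, and the wording of the message could change.

```python
# Marker taken from the error text quoted in this report (lower-cased).
EXTRA_USAGE_MARKER = "out of extra usage"


def is_extra_usage_error(status_code: int, body: str) -> bool:
    """Return True when a response looks like the 'extra usage' exhaustion error.

    Hypothetical helper: it only inspects the status code and message text.
    """
    return status_code == 400 and EXTRA_USAGE_MARKER in body.lower()


reported = "You're out of extra usage. Add more at claude.ai/settings/usage and keep going."
print(is_extra_usage_error(400, reported))
```

A check like this could let a client report "subscription OAuth was billed as pay-per-token" instead of surfacing a generic bad-request error.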

Expected behaviour

A Claude Max plan subscriber using Claude Code OAuth should consume their included subscription quota, not pay-per-token API credits. The whole point of using OAuth (vs an ANTHROPIC_API_KEY) is to leverage the subscription.

Root cause (from code review)

`agent/anthropic_adapter.py` (lines 700–761) builds an Anthropic SDK client using the OAuth token but sends requests to the standard API endpoint, which treats all tokens as pay-per-token regardless of how they were obtained.

An unused helper already exists: `run_oauth_setup_token()` in `anthropic_adapter.py:1146–1183` spawns the `claude` CLI subprocess to use Claude Code's credential store directly. This is the correct path for subscription-based auth, but it isn't wired into the main request flow.
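
To make the described behaviour concrete, here is a small model of it, not the actual adapter code. It assumes the `sk-ant-oat01-` prefix from this report is enough to recognise a Claude Code OAuth token; all function names are hypothetical.

```python
ANTHROPIC_MESSAGES_URL = "https://api.anthropic.com/v1/messages"
OAUTH_TOKEN_PREFIX = "sk-ant-oat01-"  # prefix quoted in this report


def credential_kind(token: str) -> str:
    """Classify a credential by prefix (assumption: the prefix is reliable)."""
    return "claude_code_oauth" if token.startswith(OAUTH_TOKEN_PREFIX) else "api_key"


def endpoint_as_reported(token: str) -> str:
    """Model of the reported flow: the credential kind never affects routing."""
    return ANTHROPIC_MESSAGES_URL


def billing_mismatch(token: str) -> bool:
    """True when a subscription credential is sent to the pay-per-token endpoint."""
    return (
        credential_kind(token) == "claude_code_oauth"
        and endpoint_as_reported(token) == ANTHROPIC_MESSAGES_URL
    )


for tok in ("sk-ant-oat01-example", "sk-ant-api-example"):
    print(credential_kind(tok), endpoint_as_reported(tok), billing_mismatch(tok))
```

Both credential kinds resolve to the same URL in this model, which is the mismatch the report describes.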

Proposed fix

In `hermes_cli/runtime_provider.py` (the credential resolution layer): when Claude Code credentials are detected, route inference through the `claude` CLI subprocess (the same pattern as `run_oauth_setup_token`) instead of making direct SDK calls to `api.anthropic.com`. That way requests consume subscription quota rather than API credits.
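
A rough sketch of what that routing might look like follows. It is not the Hermes implementation: every function name here is hypothetical, the `CLAUDE_CODE_OAUTH` label is an assumed stand-in for however Hermes identifies this auth method, and the sketch assumes the `claude` CLI is on `PATH` and accepts `-p` (non-interactive print mode) and `--model`. The subprocess runner is injectable so the logic can be exercised without the CLI installed.

```python
import subprocess
from typing import Callable

# Hypothetical label; Hermes' internal name for this auth method may differ.
CLAUDE_CODE_OAUTH = "claude_code_oauth"


def build_claude_command(prompt: str, model: str) -> list[str]:
    """Build a non-interactive `claude` invocation (assumed flags: -p, --model)."""
    return ["claude", "-p", prompt, "--model", model]


def run_via_claude_cli(
    prompt: str,
    model: str,
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout: float = 300.0,
) -> str:
    """Send one prompt through the CLI and return its stdout as the reply."""
    result = runner(
        build_claude_command(prompt, model),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"claude CLI failed ({result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout.strip()


def choose_transport(auth_method: str, cli_available: bool) -> str:
    """Prefer the CLI only for Claude Code OAuth; otherwise keep the SDK path."""
    if auth_method == CLAUDE_CODE_OAUTH and cli_available:
        return "claude_cli"
    return "anthropic_sdk"
```

In a real integration, `cli_available` could come from `shutil.which("claude") is not None`. A one-shot subprocess per request also gives up streaming and tool-call plumbing that the SDK path provides, so this is a starting point rather than a drop-in replacement.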

Environment

  • Hermes Agent v0.15.1 (2026.5.29)
  • macOS 15.x arm64
  • Claude Max plan (monthly subscription)
  • Model: claude-opus-4-8
  • Auth method: `claude_code oauth` (via `claude setup-token`)

Workarounds (none clean)

  • Switch to OpenRouter provider — bypasses the issue but loses native Anthropic routing
  • Add Anthropic API key separately — defeats the purpose of having Max plan OAuth
  • Wait for subscription quota reset — not viable for daily use
