hermes - 💡(How to fix) Fix Sonnet 4.6 / Opus 4.7 return generic 429 on Claude Max OAuth while Haiku 4.5 succeeds — appears to be new Anthropic-side enforcement beyond header inspection [2 comments, 2 participants]

hermes · 2026-04-28 22:53:23


NousResearch/hermes-agent#17169•Fetched 2026-04-29 06:36:57


Author: sean808080

Participants: alt-glitch, sean808080


The real claude CLI (v2.1.122) hits Sonnet successfully with the same OAuth token from the same machine in the same minute. So this is not a quota issue and not a token issue — it's something Anthropic is doing to distinguish real Claude Code from third-party clients on Sonnet/Opus.

Error Message

As of 2026-04-28, requests from Hermes to Sonnet 4.5/4.6 and Opus 4.6/4.7 using the Claude Max OAuth credential return HTTP 429 with the body {"type":"error","error":{"type":"rate_limit_error","message":"Error"},"request_id":"req_011CaWyhtmFS2ob5td4b1mmj"} and no anthropic-ratelimit-* response headers, only generic Cloudflare headers and x-should-retry: true. Haiku 4.5 succeeds normally on the same token, same code path, and same session.



Summary

Environment

Hermes Agent: latest (post hermes update 2026-04-28)
macOS arm64
Anthropic provider, OAuth subscription token (sk-ant-oat01-…)
Account: Claude Max 20x, organizationRateLimitTier default_claude_max_20x, hasExtraUsageEnabled: true
Anthropic status page: All Systems Operational at time of testing

Reproduction

Same OAuth token, two requests run within seconds of each other.

Hermes-shaped request (curl) — 429:

TOKEN=$(security find-generic-password -s "Claude Code-credentials" -w | jq -r .claudeAiOauth.accessToken)
curl -sD - -X POST 'https://api.anthropic.com/v1/messages?beta=true' \
  -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05,advisor-tool-2026-03-01,advanced-tool-use-2025-11-20,context-1m-2025-08-07,effort-2025-11-24,cache-diagnosis-2026-04-07" \
  -H "anthropic-dangerous-direct-browser-access: true" \
  -H "user-agent: claude-cli/2.1.122 (external, sdk-cli)" \
  -H "x-app: cli" \
  -H "x-claude-code-session-id: $(uuidgen)" \
  -H "x-client-request-id: $(uuidgen)" \
  -H "x-stainless-arch: arm64" \
  -H "x-stainless-lang: js" \
  -H "x-stainless-os: MacOS" \
  -H "x-stainless-package-version: 0.81.0" \
  -H "x-stainless-runtime: node" \
  -H "x-stainless-runtime-version: v24.3.0" \
  -d '{"model":"claude-sonnet-4-6","max_tokens":10,"system":"x","messages":[{"role":"user","content":"hi"}]}'

Result: HTTP 429, body {"type":"error","error":{"type":"rate_limit_error","message":"Error"},"request_id":"req_011CaWyhtmFS2ob5td4b1mmj"}. Response has no anthropic-ratelimit-* headers — only generic Cloudflare headers and x-should-retry: true.
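When triaging responses like the one above, it can help to separate a 429 that carries anthropic-ratelimit-* headers from one that does not. The sketch below is only an illustration: the function name and labels are hypothetical, the cf-ray value is a placeholder, and the idea that a genuine quota 429 would carry those headers is an assumption drawn from this report, not documented behavior.

```python
from typing import Mapping


def classify_response(status: int, headers: Mapping[str, str]) -> str:
    """Label a response by status and by presence of rate-limit headers.

    Assumption (from this report only): a quota-driven 429 would include
    anthropic-ratelimit-* headers, while the 429 seen here does not.
    """
    if status != 429:
        return "ok" if status < 400 else "other_error"
    # Header names are case-insensitive, so compare in lowercase.
    names = {name.lower() for name in headers}
    has_ratelimit = any(n.startswith("anthropic-ratelimit-") for n in names)
    return "quota_429" if has_ratelimit else "opaque_429"


# Headers shaped like the failing response: generic ones plus x-should-retry.
# The cf-ray value is a placeholder, not a real header value.
observed = {"x-should-retry": "true", "cf-ray": "placeholder"}
print(classify_response(429, observed))
```

Logging this label next to the request_id would make it easier to tell the two kinds of 429 apart in gateway logs.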

Real claude CLI — 200:

claude -p "hi" --model claude-sonnet-4-6
# → "<friendly assistant reply>"

Same token. Same model. Same machine. Within seconds.

The successful response shows the account has plenty of quota:

anthropic-ratelimit-unified-5h-utilization: 0.09
anthropic-ratelimit-unified-7d-utilization: 0.16
anthropic-ratelimit-unified-7d_sonnet-utilization: 0.20
anthropic-ratelimit-unified-overage-disabled-reason: org_level_disabled_until
anthropic-ratelimit-unified-overage-status: rejected

So overage is disabled at the org level, which is fine: at least 80% of base quota remains in every window (the highest utilization shown is 0.20, for 7d_sonnet). The 429 must therefore come from some other gate.
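The quota arithmetic can be checked with a short sketch. It assumes each *-utilization header is a fraction between 0 and 1, which is how the report reads them; the helper name is hypothetical.

```python
PREFIX = "anthropic-ratelimit-unified-"
SUFFIX = "-utilization"


def remaining_by_window(headers):
    """Return {window: remaining fraction} for each utilization header."""
    remaining = {}
    for name, value in headers.items():
        key = name.lower()
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            window = key[len(PREFIX):-len(SUFFIX)]
            # Assumption: utilization is a 0..1 fraction of the window's quota.
            remaining[window] = round(1.0 - float(value), 2)
    return remaining


# Header values copied from the successful response in this report.
headers = {
    "anthropic-ratelimit-unified-5h-utilization": "0.09",
    "anthropic-ratelimit-unified-7d-utilization": "0.16",
    "anthropic-ratelimit-unified-7d_sonnet-utilization": "0.20",
    "anthropic-ratelimit-unified-overage-disabled-reason": "org_level_disabled_until",
    "anthropic-ratelimit-unified-overage-status": "rejected",
}
left = remaining_by_window(headers)
print(left)                # {'5h': 0.91, '7d': 0.84, '7d_sonnet': 0.8}
print(min(left.values()))  # 0.8
```

The tightest window is 7d_sonnet, which is where the 80% figure comes from.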

What's identical between real-claude and Hermes-spoofed

Same OAuth token
Same ?beta=true URL
All anthropic-beta values match
anthropic-dangerous-direct-browser-access: true
user-agent: claude-cli/2.1.122 (external, sdk-cli)
All x-stainless-* values match (arch, lang, os, package-version 0.81.0, runtime, runtime-version v24.3.0)
x-app: cli
Synthetic x-client-request-id and x-claude-code-session-id UUIDs

What's different

Things real claude sends that Hermes doesn't:

Body shape: real claude includes metadata, output_config, thinking, context_management, diagnostics top-level fields. Hermes sends only model, messages, system, tools, max_tokens.
TLS fingerprint: curl/openssl vs Bun/Node — different JA3/JA4 likely.
Streaming: real claude uses stream: true always.
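The body-shape difference listed above can be made explicit by comparing top-level keys of two captured request bodies. This is a minimal sketch: the helper name is hypothetical, the values are placeholders, and it assumes the real client also sends the five fields Hermes sends, which the report does not confirm.

```python
def top_level_diff(reference: dict, candidate: dict):
    """Return (keys only in reference, keys only in candidate), sorted."""
    ref_keys, cand_keys = set(reference), set(candidate)
    return sorted(ref_keys - cand_keys), sorted(cand_keys - ref_keys)


# Field names are taken from the report; values are placeholders (None).
hermes_body = dict.fromkeys(["model", "messages", "system", "tools", "max_tokens"])
real_body = dict.fromkeys(
    list(hermes_body)  # assumption: the real client sends these too
    + ["metadata", "output_config", "thinking", "context_management",
       "diagnostics", "stream"]
)

missing, extra = top_level_diff(real_body, hermes_body)
print(missing)
print(extra)
```

Running the same comparison on full request dumps would show whether any other top-level fields differ.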

Hypothesis

Anthropic added a new enforcement layer for Sonnet/Opus on subscription OAuth, separate from the existing prompt-text content filter. It probably keys on either TLS fingerprint or required body fields (most likely the structured metadata.user_id / output_config / context_management fields that real Claude Code adds).

Affected

Hermes Anthropic native provider (agent/anthropic_adapter.py's build_anthropic_client)
Any user on Claude Max / Claude Pro OAuth selecting Sonnet 4.5+ or Opus 4.6+ as primary or fallback
Telegram, Discord, webui, api_server — all platforms

Workaround

Switch the primary model to claude-haiku-4-5-20251001 (Haiku is unaffected). Sonnet/Opus can stay in fallback chain but they'll always 429 until this is fixed.
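The effect of the model ordering can be sketched with a generic fallback loop. This is not Hermes's actual fallback implementation: `first_working_model` and the `send` callable are hypothetical, and `fake_send` only mimics the behavior described in this report.

```python
from typing import Callable, Sequence, Tuple


def first_working_model(
    models: Sequence[str],
    send: Callable[[str], Tuple[int, str]],
) -> Tuple[str, str]:
    """Try each model in order; move on only when a model returns 429."""
    for model in models:
        status, text = send(model)
        if status == 200:
            return model, text
        if status != 429:
            raise RuntimeError(f"{model} failed with HTTP {status}")
    raise RuntimeError("every model in the chain returned 429")


def fake_send(model: str) -> Tuple[int, str]:
    """Stand-in transport mimicking the report: only Haiku succeeds."""
    if "haiku" in model:
        return 200, "hello"
    return 429, "Error"


# Haiku first: answered on the first attempt.
chain = ["claude-haiku-4-5-20251001", "claude-sonnet-4-6"]
print(first_working_model(chain, fake_send))

# Sonnet first: one wasted 429 round trip before Haiku answers.
print(first_working_model(list(reversed(chain)), fake_send))
```

Either order ends on Haiku, but putting it first avoids a failed request on every turn.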

What might fix it

Have build_anthropic_client always emit the body fields that real Claude Code emits: metadata: {user_id: <hashed-account-uuid>}, output_config: {...}, thinking: {type: "adaptive"}, context_management: {...}, plus stream: true by default.
If the gate is TLS-level, the SDK already uses Node's https stack — should match. But if Anthropic is fingerprinting handshake details specific to Bun, that's harder.

Happy to provide gateway logs, full request dumps, or run additional diagnostics. Multiple request_ids above can be cross-referenced server-side.

extent analysis

TL;DR

Modify the Hermes request to include the missing body fields that the real Claude CLI sends, such as metadata, output_config, thinking, and context_management, to potentially bypass the rate limit error.

Guidance

Verify the hypothesis: Test the Hermes request with the additional body fields to see if it resolves the rate limit error.
Compare request differences: Review the request headers and bodies of both the Hermes and real Claude CLI requests to identify any other potential differences that could be contributing to the issue.
Check Anthropic documentation: Look for any updates or changes to the Anthropic API that may require additional fields or headers in requests.
Test with stream: true: Try setting stream: true in the Hermes request to see if it makes a difference, as the real Claude CLI always uses this setting.
Monitor response headers: Check the response headers for any clues about what might be causing the rate limit error, such as the anthropic-ratelimit-* headers.

