hermes - ✅(Solved) Fix OAuth Anthropic Max: <available_skills> system-prompt injection on every turn triggers "out of extra usage" 400 when skills/session_search toolsets are active [1 pull requests, 1 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#28902 · Fetched 2026-05-20 04:01:14
Comments: 1 · Participants: 2 · Timeline events: 7 · Reactions: 0
Timeline (top): labeled ×5, commented ×1, cross-referenced ×1

Root Cause

agent/prompt_builder.py:1175-1215 builds a ## Skills (mandatory) block plus a full <available_skills> catalog (every skill name + description) and injects it into the system prompt every turn, not just at session start. The _SKILLS_PROMPT_CACHE LRU on the same function caches the string, but the cached string is still attached to the system message on every request — the API payload size doesn't shrink.

With even a modest skill set (~50 skills) the block is ~2k tokens. With 80+ skills it crosses ~3k tokens of fixed per-turn overhead. Anthropic's OAuth Max pre-flight overage check appears to gate on aggregate request size; once you cross the threshold, every tool-bearing request is rejected before the model runs — same 400 as #28849, different trigger.
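To make the caching point concrete, here is a minimal sketch, not the actual hermes-agent code. The function names, the catalog contents, and the use of functools.lru_cache are assumptions for illustration; only the block layout (## Skills (mandatory) plus <available_skills>) comes from the issue. It shows that a cache saves rebuild work but leaves the per-request system prompt the same size.

```python
from functools import lru_cache


@lru_cache(maxsize=8)
def build_skills_block(skills: tuple[tuple[str, str], ...]) -> str:
    """Render the catalog once; later calls with the same skills hit the cache."""
    lines = ["## Skills (mandatory)", "<available_skills>"]
    lines += [f"- {name}: {desc}" for name, desc in skills]
    lines.append("</available_skills>")
    return "\n".join(lines)


def system_prompt_for_turn(base: str, skills: tuple[tuple[str, str], ...]) -> str:
    # The cached string is still concatenated into every request's system prompt.
    return base + "\n\n" + build_skills_block(skills)


# Hypothetical catalog: 80 skills with one short description each.
skills = tuple(
    (f"skill-{i:02d}", "One or two sentences describing when to use it.")
    for i in range(80)
)

base = "You are a helpful agent."
sizes = [len(system_prompt_for_turn(base, skills)) for _ in range(5)]
cache = build_skills_block.cache_info()

print(f"cache hits: {cache.hits}, misses: {cache.misses}")
print(f"system prompt chars per turn: {sizes}")
```

The cache reports hits on every turn after the first, yet each turn's system prompt has the same length, which is why caching alone does not help with a size-based pre-flight check.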

Fix Action

Fix / Workaround

Spun off from #28849 (which fixed the mcp_ tool-prefix path). With that patch applied, tool-bearing requests work, except when the skills or session_search toolsets are enabled. In that case the 400 "out of extra usage" comes back even though the Max quota is barely used (anthropic-ratelimit-unified-5h-utilization ≈ 0.1) and overage is disabled at org level, which is the default Max state.

Repro

  1. OAuth Max token, overage disabled (default).
  2. mcp_ prefix patch from #28849 applied (so baseline works).
  3. Enable skills toolset (or session_search, which pulls the same skills catalog into its description).
  4. Have ≥ ~50 skills installed under ~/.hermes/skills/ + bundled.
  5. Send any request with tools. → HTTP 400 "out of extra usage".

Environment

  • hermes-agent v0.14.0 (upstream 12c39830f)
  • anthropic SDK 0.87.0
  • Python 3.11
  • Token: Claude Code OAuth, Max 5x, overage disabled at org level
  • Affected models: all Claude models on OAuth Max path
  • #28849 patch applied locally

PR fix notes

PR #28929: fix(anthropic): emit skills catalog as names-only to avoid OAuth Max 400

Description (problem / solution / changelog)

Summary

The <available_skills> catalog injects every skill name + description into the system prompt on every turn. On Anthropic OAuth Max plans with overage disabled, this fixed per-turn overhead (~3k tokens with 80+ skills) triggers the 'out of extra usage' 400 whenever the skills or session_search toolsets are active.

Fix: emit skill names only. Descriptions are one skill_view() call away.

Related Issue

Fixes #28902

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • agent/prompt_builder.py:1187-1188: emit "- {name}" instead of "- {name}: {desc}" in the <available_skills> block. Descriptions are dropped from the per-turn system prompt injection.
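
As a rough illustration of the change described above, the following sketch renders the catalog both ways. The function render_available_skills, its names_only flag, and the two sample skills are hypothetical; the real code in agent/prompt_builder.py may be structured differently.

```python
def render_available_skills(skills: dict[str, str], names_only: bool = True) -> str:
    """Render the <available_skills> block, with or without descriptions."""
    lines = ["<available_skills>"]
    for name, desc in skills.items():
        # Names-only drops the description from the per-turn prompt.
        lines.append(f"- {name}" if names_only else f"- {name}: {desc}")
    lines.append("</available_skills>")
    return "\n".join(lines)


# Hypothetical skills, for illustration only.
skills = {
    "git-bisect": "Find the commit that introduced a regression.",
    "pdf-extract": "Pull text and tables out of PDF files.",
}

before = render_available_skills(skills, names_only=False)
after = render_available_skills(skills)

print(after)
print(f"chars before: {len(before)}, after: {len(after)}")
```

The saving scales with the number of skills and the length of each description, so the effect is largest for exactly the large catalogs that hit the 400.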

How to Test

  1. Enable skills toolset with >=50 skills installed
  2. Start a conversation with Anthropic OAuth Max (overage disabled)
  3. Previously: HTTP 400 "out of extra usage" on every tool-bearing request
  4. Now: requests succeed. Run hermes tools --skills test to verify skills are still loadable.

Changed files

  • agent/prompt_builder.py (modified, +1/-4)

From the original issue

Disabling skills/session_search (or pruning the catalog) restores 200 responses.

Suggested fixes (pick one or combine)

  1. Names-only mode: emit <available_skills> with skill names only, no descriptions. Descriptions move to skill_view(name) output (where they already exist).
  2. Config gate: display.skills_catalog_inline: false (default true) to opt out of inline injection; rely on skill_list() tool call when the model wants to enumerate.
  3. Lazy injection: inject the catalog only on the first turn of a session, then rely on the model's in-context memory + skill_list() for subsequent turns.
  4. Per-provider trim: when provider=anthropic and is_oauth=True, cap catalog at N skills or fall back to names-only automatically.

Option 1 is the cheapest and probably enough — descriptions are useful for the model deciding what to load, but a one-line name + category is often sufficient and skill_view is one tool call away.
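
Options 2 to 4 can be combined into a single decision function. The sketch below is one possible shape, under stated assumptions: CatalogSettings, catalog_mode, and the returned mode strings are hypothetical names, and the inline field merely mirrors the proposed display.skills_catalog_inline key rather than an existing setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogSettings:
    inline: bool = True             # option 2: opt out of inline injection
    first_turn_only: bool = False   # option 3: inject on the first turn only
    oauth_names_only: bool = True   # option 4: trim for Anthropic OAuth


def catalog_mode(settings: CatalogSettings, provider: str,
                 is_oauth: bool, turn_index: int) -> str:
    """Return 'none', 'names_only', or 'full' for the current request."""
    if not settings.inline:
        return "none"
    if settings.first_turn_only and turn_index > 0:
        return "none"
    if settings.oauth_names_only and provider == "anthropic" and is_oauth:
        return "names_only"
    return "full"


defaults = CatalogSettings()
print(catalog_mode(defaults, "anthropic", True, 0))
print(catalog_mode(defaults, "openai", False, 3))
print(catalog_mode(CatalogSettings(first_turn_only=True), "anthropic", True, 1))
```

Keeping the policy in one pure function makes it easy to unit-test each combination without building a real prompt.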

Related: #28849.

Happy to PR option 1 if it's the preferred direction.
