hermes: Custom provider max_output_tokens silently dropped by config.py normalizer, defaults to model minimum (2048)

Problem

When a custom provider is configured in config.yaml, the max_output_tokens key is silently dropped by the provider config normalizer in hermes_cli/config.py. Every API call then goes out with no max_tokens parameter, so the downstream endpoint falls back to its own default output limit (2048 tokens for Claude on Bedrock Access Gateway).

This causes consistent truncation errors on any response longer than about 2048 tokens, such as tool calls that write code files or long completions.

Root Cause

hermes_cli/config.py (around line 2552) defines a _KNOWN_KEYS set listing the keys the provider normalizer recognizes:

_KNOWN_KEYS = {
    "name", "api", "url", "base_url", "api_key", "key_env", "api_key_env",
    "api_mode", "transport", "model", "default_model", "models",
    "context_length", "rate_limit_delay",
    "request_timeout_seconds", "stale_timeout_seconds",
}

max_output_tokens is not in this set. The validator logs a warning (unknown config keys ignored: max_output_tokens) but does not strip the key from the raw entry. The function then builds a new normalized dict and copies only the keys it explicitly handles, so max_output_tokens never reaches normalized and _get_named_custom_provider returns an entry without it.

The runtime_provider.py patch (PR #19782) that reads entry.get("max_output_tokens") never sees the value because by the time it runs, the entry has already been normalized without it.
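
The drop can be reproduced in isolation with a simplified stand-in for the normalizer. This is a minimal sketch, not the actual code in hermes_cli/config.py: the function name normalize_entry, the trimmed _KNOWN_KEYS subset, and the sample entry (name, URL, values) are all assumptions made for illustration.

```python
import logging

logger = logging.getLogger("normalizer_sketch")

# Trimmed, illustrative subset of the real set (assumption).
_KNOWN_KEYS = {"name", "base_url", "api_key", "model", "context_length"}


def normalize_entry(entry: dict) -> dict:
    """Hypothetical stand-in for the provider normalizer."""
    unknown = sorted(set(entry) - _KNOWN_KEYS)
    if unknown:
        # The warning is emitted, but the raw entry is left untouched.
        logger.warning("unknown config keys ignored: %s", ", ".join(unknown))

    # A fresh dict is built from explicitly handled keys only, so any key
    # without its own copy step never reaches the caller.
    normalized = {}
    for key in ("name", "base_url", "api_key", "model"):
        value = entry.get(key)
        if isinstance(value, str) and value:
            normalized[key] = value

    context_length = entry.get("context_length")
    if isinstance(context_length, int) and context_length > 0:
        normalized["context_length"] = context_length
    return normalized


# Sample entry with placeholder values (assumption).
raw = {
    "name": "my-gateway",
    "base_url": "http://localhost:8080/v1",
    "context_length": 200000,
    "max_output_tokens": 64000,
}
result = normalize_entry(raw)
print(result.get("max_output_tokens"))  # None: the key was dropped
print("max_output_tokens" in raw)       # True: the raw entry still has it
```

Any code that reads from the normalized result rather than the raw entry sees None, which matches the behavior described for the runtime_provider.py patch.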

Why This Only Affects Custom Providers

Built-in providers (anthropic, openai, bedrock) have their output limits baked into Hermes's internal model registry. Custom providers have no registry entry — the only way to tell Hermes the output limit is via max_output_tokens in config. So this bug is invisible for standard providers and only surfaces for custom provider users.
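
The asymmetry can be pictured as a two-step lookup. The sketch below is an assumption about the general shape of such a lookup, not Hermes's actual code: _BUILTIN_OUTPUT_LIMITS, resolve_max_output_tokens, build_request_kwargs, the model name, and the 8192 figure are all hypothetical, and the real precedence between registry and config may differ.

```python
from typing import Optional

# Hypothetical registry of built-in limits. The provider name comes from the
# issue; the model name and the number are placeholders.
_BUILTIN_OUTPUT_LIMITS = {("anthropic", "example-model"): 8192}


def resolve_max_output_tokens(provider: str, model: str, entry: dict) -> Optional[int]:
    """Return the output limit from the registry, else from the provider entry."""
    builtin = _BUILTIN_OUTPUT_LIMITS.get((provider, model))
    if builtin is not None:
        return builtin
    # Custom providers have no registry entry, so config is the only source.
    return entry.get("max_output_tokens")


def build_request_kwargs(limit: Optional[int]) -> dict:
    """Include max_tokens only when a limit is known."""
    return {"max_tokens": limit} if limit is not None else {}


# Entry as returned by the normalizer before the fix: the key is missing.
normalized_entry = {"name": "my-gateway", "context_length": 200000}

builtin_kwargs = build_request_kwargs(
    resolve_max_output_tokens("anthropic", "example-model", {})
)
custom_kwargs = build_request_kwargs(
    resolve_max_output_tokens("my-gateway", "example-model", normalized_entry)
)
print(builtin_kwargs)  # {'max_tokens': 8192}
print(custom_kwargs)   # {}
```

With the key missing from the normalized entry, the custom provider request carries no max_tokens at all, and the endpoint applies its own default.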

Fix

Two changes to hermes_cli/config.py:

  1. Add max_output_tokens to _KNOWN_KEYS (suppresses the spurious warning)
  2. Copy it into normalized alongside context_length

# In _KNOWN_KEYS:
"context_length", "rate_limit_delay", "max_output_tokens",

# After context_length is copied into normalized:
max_output_tokens = entry.get("max_output_tokens")
if isinstance(max_output_tokens, int) and max_output_tokens > 0:
    normalized["max_output_tokens"] = max_output_tokens
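
The guard in the patch can be exercised on its own. This is a minimal sketch built around a hypothetical helper, copy_max_output_tokens, that is not part of Hermes. It also rejects bool values, which the patch as written would accept because bool is a subclass of int in Python. Whether Hermes wants that stricter check is an assumption.

```python
def copy_max_output_tokens(entry: dict, normalized: dict) -> dict:
    """Copy max_output_tokens into normalized if it is a positive integer."""
    value = entry.get("max_output_tokens")
    # bool is a subclass of int, so a YAML value of `true` would otherwise
    # pass the isinstance check and be treated as 1.
    if isinstance(value, int) and not isinstance(value, bool) and value > 0:
        normalized["max_output_tokens"] = value
    return normalized


print(copy_max_output_tokens({"max_output_tokens": 64000}, {}))    # {'max_output_tokens': 64000}
print(copy_max_output_tokens({"max_output_tokens": "64000"}, {}))  # {}
print(copy_max_output_tokens({"max_output_tokens": True}, {}))     # {}
print(copy_max_output_tokens({}, {}))                              # {}
```

Invalid values are skipped rather than raising, which mirrors how the patch leaves the key unset when the check fails.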

Verification

cd ~/.hermes/hermes-agent && venv/bin/python -c "
from hermes_cli.runtime_provider import _get_named_custom_provider
r = _get_named_custom_provider('your_provider_name')
print('max_output_tokens:', r.get('max_output_tokens'))
"
# Before fix: max_output_tokens: None
# After fix:  max_output_tokens: 64000 (the value set in config.yaml)

Related

Companion to #20975 (truncated tool call retry logic). That issue was a symptom; this is the root cause.
