hermes - ✅(Solved) Fix feat: custom_providers should support per-provider max_tokens override [1 pull request, 1 comment, 2 participants]

hermes • 2026-05-19 14:46:10


GitHub stats

NousResearch/hermes-agent#28782 • Fetched 2026-05-20 04:02:02

Author: pty819
Participants: alt-glitch, pty819
Timeline (top): labeled ×5, commented ×1, cross-referenced ×1

Root Cause

model.max_tokens in config.yaml is a global setting with no per-provider override. This is problematic when:

- A custom provider (e.g. Ark DeepSeek) needs an explicit max_tokens because auto-detection doesn't work
- Fallback providers (e.g. MiniMax, NVIDIA) should NOT inherit that same max_tokens value

Fix Action

PR fix notes

PR #28786: feat: per-provider max_tokens via custom_providers models.<model>.max_tokens

Repository: NousResearch/hermes-agent
Author: pty819
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/28786

Description (problem / solution / changelog)

Summary

Adds per-provider max_tokens support to custom_providers, mirroring the existing context_length pattern. This allows users to set a provider-scoped output-token cap without affecting fallback providers.

Problem

model.max_tokens in config.yaml is global: it applies to all providers, including fallbacks. There is no way to scope max_tokens to a specific provider, unlike context_length, which already supports per-provider overrides.

Changes

`hermes_cli/config.py`

- Adds get_custom_provider_max_tokens(), which mirrors get_custom_provider_context_length() exactly. It matches by base_url + model name and returns models.<model>.max_tokens if present and valid.
- Adds "max_tokens" to _KNOWN_KEYS in _normalize_custom_provider_entry so that a top-level max_tokens key is not flagged as unknown (a defensive measure for users who already had it at the top level).
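The lookup described above might look roughly like the following. This is a minimal sketch, not the code from the PR: the function name carries a `_sketch` suffix to make that clear, the signature is an assumption, the URL normalization (case and trailing slash) is an assumption, and "valid" is interpreted here as "a positive integer".

```python
from __future__ import annotations

from typing import Any, Optional


def _normalize_url(url: str) -> str:
    """Compare base URLs without regard to case or a trailing slash (assumed rule)."""
    return url.strip().rstrip("/").lower()


def get_custom_provider_max_tokens_sketch(
    custom_providers: list[dict[str, Any]],
    base_url: str,
    model: str,
) -> Optional[int]:
    """Return models.<model>.max_tokens for the matching provider, else None."""
    target = _normalize_url(base_url)
    for entry in custom_providers:
        if not isinstance(entry, dict):
            continue
        if _normalize_url(str(entry.get("base_url", ""))) != target:
            continue
        models = entry.get("models")
        if not isinstance(models, dict):
            continue
        model_cfg = models.get(model)
        if not isinstance(model_cfg, dict):
            continue
        value = model_cfg.get("max_tokens")
        # Accept only positive integers; bool is an int subclass, so exclude it.
        if isinstance(value, int) and not isinstance(value, bool) and value > 0:
            return value
    return None


providers = [
    {
        "name": "My Provider",
        "base_url": "https://example.com/v1",
        "model": "my-model",
        "models": {"my-model": {"context_length": 1000000, "max_tokens": 131072}},
    }
]
print(get_custom_provider_max_tokens_sketch(providers, "https://example.com/v1/", "my-model"))
```

Returning None for a missing or malformed value lets the caller fall through to whatever default the provider chooses, which matches the "if present and valid" wording.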

`agent/agent_init.py`

After the global model.max_tokens fallback (and before context_length resolution), adds a second fallback that checks custom_providers for a per-provider max_tokens when agent.max_tokens is still None.
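The resolution order described here can be sketched as a small function. The names `resolve_max_tokens`, `explicit`, and `provider_lookup` are hypothetical and stand in for whatever agent/agent_init.py actually uses; only the ordering (existing value, then global model.max_tokens, then per-provider) is taken from the description.

```python
from __future__ import annotations

from typing import Any, Callable, Optional


def resolve_max_tokens(
    explicit: Optional[int],
    config: dict[str, Any],
    provider_lookup: Callable[[], Optional[int]],
) -> Optional[int]:
    """Apply the fallback chain: existing value, global model.max_tokens, per-provider."""
    if explicit is not None:
        return explicit
    model_cfg = config.get("model")
    global_value = model_cfg.get("max_tokens") if isinstance(model_cfg, dict) else None
    if global_value is not None:
        return global_value
    # Second fallback: consulted only when nothing above produced a value.
    return provider_lookup()


# Per-provider value is used when no global value is set.
print(resolve_max_tokens(None, {"model": {}}, lambda: 131072))
# A global value still takes precedence over the per-provider one.
print(resolve_max_tokens(None, {"model": {"max_tokens": 4096}}, lambda: 131072))
```

If the order is as described, a global model.max_tokens would still win over the per-provider value, so scoping the cap to one provider would mean leaving model.max_tokens unset.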

Usage

custom_providers:
  - name: My Provider
    base_url: https://example.com/v1
    api_key: ...
    model: my-model
    api_mode: chat_completions
    models:
      my-model:
        context_length: 1000000
        max_tokens: 131072       # ← new per-provider field

Closes #28782

Changed files

agent/agent_init.py (modified, +17/-0)
hermes_cli/config.py (modified, +61/-1)

Original issue (NousResearch/hermes-agent#28782)

Problem

Currently, model.max_tokens in config.yaml is a global setting. When set, it applies to all providers including the fallback chain. There is no way to specify a per-provider max_tokens override, unlike context_length which already supports per-provider overrides via custom_providers[].models.<model>.context_length.

This is problematic when:

- A custom provider (e.g. Ark DeepSeek) needs an explicit max_tokens because auto-detection doesn't work
- Fallback providers (e.g. MiniMax, NVIDIA) should NOT inherit that same max_tokens value

Current workaround

Putting max_tokens under the model: section makes it global: every provider, including fallbacks, sends max_tokens=131072 in every API call. The only way to avoid this today is to leave max_tokens unset entirely and accept whatever default each provider chooses.
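To illustrate the leak, here is a toy model of request building with a global setting. `build_payload` and the model names in `chain` are hypothetical and are not taken from hermes-agent; the point is only that a global value carries no information about which provider it is sent to.

```python
from __future__ import annotations

from typing import Any, Optional


def build_payload(model: str, global_max_tokens: Optional[int]) -> dict[str, Any]:
    """Build a minimal chat-completions style request body."""
    payload: dict[str, Any] = {"model": model, "messages": []}
    if global_max_tokens is not None:
        # A global setting is attached regardless of the target provider.
        payload["max_tokens"] = global_max_tokens
    return payload


chain = ["primary-model", "fallback-a", "fallback-b"]  # hypothetical model names
with_global = [build_payload(m, 131072) for m in chain]
without_global = [build_payload(m, None) for m in chain]

print([p.get("max_tokens") for p in with_global])     # [131072, 131072, 131072]
print([p.get("max_tokens") for p in without_global])  # [None, None, None]
```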

Proposed solution

Add max_tokens support to custom_providers[].models.<model>.max_tokens, following the exact same pattern as the existing context_length override:

custom_providers:
  - name: My Provider
    base_url: https://...
    api_key: ...
    model: my-model
    models:
      my-model:
        context_length: 1000000
        max_tokens: 131072       # new field, per-provider

Implementation scope

- hermes_cli/config.py: add a get_custom_provider_max_tokens() function parallel to get_custom_provider_context_length().
- agent/agent_init.py: after the existing model.max_tokens fallback (around line 1166), add a second fallback that checks custom_providers for a per-provider max_tokens when agent.max_tokens is still None.
- hermes_cli/main.py: optionally update _save_custom_provider to save max_tokens into models.<model>.max_tokens.
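For the optional third item, saving the value could be as simple as a nested write that preserves sibling keys such as context_length. This is a sketch under assumptions: `save_model_max_tokens` is a hypothetical helper, and the real _save_custom_provider may have a different signature and persistence logic.

```python
from __future__ import annotations

from typing import Any, Optional


def save_model_max_tokens(
    entry: dict[str, Any], model: str, max_tokens: Optional[int]
) -> dict[str, Any]:
    """Write max_tokens under entry['models'][model] in place, keeping sibling keys."""
    if max_tokens is None:
        return entry  # nothing to save; leave the entry untouched
    models = entry.setdefault("models", {})
    model_cfg = models.setdefault(model, {})
    model_cfg["max_tokens"] = max_tokens
    return entry


entry = {
    "name": "My Provider",
    "base_url": "https://example.com/v1",
    "models": {"my-model": {"context_length": 1000000}},
}
save_model_max_tokens(entry, "my-model", 131072)
print(entry["models"]["my-model"])
```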

Priority

Medium. Not a bug (everything works without it), but a missing feature that causes real confusion: users who set model.max_tokens expecting it to affect only their primary provider may inadvertently pollute their fallback API calls.
