hermes - 💡(How to fix) Fix [Bug]: custom_providers unstable with Baidu Coding Plan — multi-model picker broken + wrong context lengths causing truncation [1 comments, 2 participants]

hermes · 2026-05-10 16:58:01


GitHub stats

NousResearch/hermes-agent#23318•Fetched 2026-05-11 03:30:02


Author

sunnysktsang

Participants

alex-pathcourse

sunnysktsang

Timeline (top)

labeled ×4, commented ×1




Bug Description

Baidu Qianfan Coding Plan provides an OpenAI-compatible endpoint with 7 curated models, launched February 2026 and explicitly designed for Claude Code, Cursor, and similar tools.

Hermes has no native provider for Baidu Coding Plan, forcing users into custom_providers — which breaks in two independent ways:

1. Multi-model picker broken: defining several models under `models:` in a single custom provider entry breaks the `/model` picker, which surfaces only one model (#20582). The workaround (one list entry per model) means 7 duplicate `base_url` + `api_key` blocks for Baidu Coding alone.
2. Wrong context lengths → output truncation → token waste loop: for a custom provider, context length resolution falls through to `DEFAULT_CONTEXT_LENGTHS` fuzzy matching (#12977). For Baidu Coding models, the catch-all values are wrong:

| Model | Actual (Baidu, tokens) | Catch-all (tokens) | Delta |
|---|---|---|---|
| glm-5.1 | 198,000 | 202,752 (`"glm"`) | +4,752 |
| deepseek-v4-flash | 1,000,000 | 128,000 (`"deepseek"`) | -872,000 |
| kimi-k2.5 | 256,000 | 262,144 (`"kimi"`) | +6,144 |
| minimax-m2.5 | 192,000 | 204,800 (`"minimax"`) | +12,800 |
| ernie-4.5-turbo | 128,000 | no match → 256K fallback | +128K |

Overstated values cause Hermes to send prompts exceeding the real context window. Baidu's API silently truncates mid-generation, producing incomplete outputs. The agent detects the truncation and restarts generation in the same session — burning tokens in a loop until the context limit is hit.

The understated deepseek-v4-flash value (128K vs real 1M) wastes 87% of the available window and triggers unnecessary trajectory compression.
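The deltas and the 87% figure can be re-derived from the numbers in the table. The following is a minimal sketch for checking that arithmetic only; the `WINDOWS` dictionary and both helper names are hypothetical and are not part of Hermes. ernie-4.5-turbo is left out because the issue gives its fallback only as "256K".

```python
# Context windows in tokens, copied from the table in this issue.
# "actual" is the limit reported for Baidu Coding Plan; "catch_all" is the
# value the issue says Hermes resolves for a custom provider.
WINDOWS = {
    "glm-5.1": {"actual": 198_000, "catch_all": 202_752},
    "deepseek-v4-flash": {"actual": 1_000_000, "catch_all": 128_000},
    "kimi-k2.5": {"actual": 256_000, "catch_all": 262_144},
    "minimax-m2.5": {"actual": 192_000, "catch_all": 204_800},
}


def delta(model: str) -> int:
    """Resolved value minus actual limit; positive means a prompt can overshoot."""
    w = WINDOWS[model]
    return w["catch_all"] - w["actual"]


def unused_fraction(model: str) -> float:
    """Share of the real window left unused when the resolved value is too low."""
    w = WINDOWS[model]
    return max(0, w["actual"] - w["catch_all"]) / w["actual"]


if __name__ == "__main__":
    for name in WINDOWS:
        print(f"{name}: delta={delta(name):+,} unused={unused_fraction(name):.1%}")
```

A positive delta corresponds to the over-send case described above, and a non-zero unused fraction corresponds to the understated deepseek-v4-flash case.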

Steps to Reproduce

Prerequisite: A Baidu Coding Plan API key (available at https://qianfan.baidubce.com).

Bug 1: Multi-model picker broken

1. Add the following to `~/.hermes/config.yaml`:

custom_providers:
  - display_name: Baidu Coding
    base_url: https://qianfan.baidubce.com/v2/coding
    api_key: ${BAIDU_CODING_API_KEY}
    models:
      - glm-5.1
      - deepseek-v4-flash
      - kimi-k2.5
      - minimax-m2.5
      - ernie-4.5-turbo
      - qwen3-coder-plus
      - qwen3-235b-a22b

2. Run `hermes chat`.
3. Type `/model` to open the model picker.
4. Observe: only one model from the list appears; the other 6 are not shown.

Bug 2: Wrong context lengths

1. Configure Baidu Coding as a single-model custom provider (workaround for Bug 1):

custom_providers:
  - display_name: Baidu Coding - deepseek-v4-flash
    base_url: https://qianfan.baidubce.com/v2/coding
    api_key: ${BAIDU_CODING_API_KEY}
    models:
      - deepseek-v4-flash

2. Run `hermes chat` and select deepseek-v4-flash.
3. Send a long prompt (~130K tokens).
4. Observe: Hermes caps the context at 128K and triggers compression (the "deepseek" catch-all in DEFAULT_CONTEXT_LENGTHS) instead of using the real 1M window.
5. Alternatively, select glm-5.1 and observe: Hermes over-sends up to 202,752 tokens → Baidu silently truncates at 198K → incomplete output → retry loop.

Expected Behavior

The /model picker should display all 7 models defined under a single custom provider entry.
Context lengths should match the provider's actual limits — not generic DEFAULT_CONTEXT_LENGTHS fuzzy matches that were designed for other endpoints (e.g., "deepseek" → 128K was set for older DeepSeek V2/V3 via non-Baidu endpoints, not the Coding Plan's 1M-window V4 Flash).

Actual Behavior

- Picker: only one model from a multi-model custom_providers entry is shown. The rest are silently dropped. (Tracked in #20582.)
- Context lengths: all five Baidu Coding models in the table above resolve to wrong values, either through DEFAULT_CONTEXT_LENGTHS fuzzy matching (step 8 of the resolution chain in agent/model_metadata.py) or through the default fallback (step 10). The most severe case is deepseek-v4-flash, which receives 128K instead of 1M and so wastes 87% of the context window. The second most severe is ernie-4.5-turbo, which has no match at all and falls through to the 256K default, double the real 128K, causing truncation loops.

Affected Component

CLI (interactive chat), Configuration (config.yaml, .env, hermes setup), Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

N/A (CLI only)

Debug Report

Not applicable — `hermes debug share` captures crashes, tracebacks, and environment details. This bug is a deterministic logic error in the model resolution code path (wrong values in `DEFAULT_CONTEXT_LENGTHS` fuzzy matching + picker flattening). No crash occurs; the symptoms are silent (wrong `context_length=` logged, missing picker entries).

The Root Cause Analysis section below provides the equivalent diagnostic: exact file paths, line numbers, and the resolution chain steps involved.

If a debug share is required to validate the fix, I can provide one after the native provider PR is submitted.

Operating System

Ubuntu 24.04 (reproduced in development environment; bug is OS-independent — it's a logic error in model resolution and picker code)

Python Version

3.11

Hermes Version

Hermes Agent v0.13.0 (2026.5.7)

Additional Logs / Traceback (optional)

No traceback — this is a silent logic error, not a crash. The symptoms are:
- Missing models in the `/model` picker (no error logged)
- Wrong context lengths applied (logged as `context_length=128000` for `deepseek-v4-flash` instead of `1000000`)
- Truncation loops manifest as repeated "continuing generation" messages with identical or degraded output

Root Cause Analysis (optional)

Two independent root causes:

Bug 1: Multi-model picker

In hermes_cli/models.py, the /model picker iterates _PROVIDER_MODELS[provider] for native providers. Custom providers are handled separately: each multi-model entry is flattened into a single display row, so when a custom provider lists several models under `models:`, only the first is surfaced in the picker. The custom:name:model triple syntax (line ~1486–1494) supports named custom providers, but the picker UI still shows one row per custom_providers entry rather than one row per model.

This is the same root cause as #20582.
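The behavior the picker is expected to have (one row per model rather than one row per entry) can be sketched as follows. This is an illustration of the idea only, not the code in hermes_cli/models.py: the function name `picker_rows`, the row shape, and the `CONFIG` constant are all assumptions made for this example.

```python
from typing import Iterable

# Hypothetical in-memory form of the custom_providers section from the
# reproduction steps; the key names mirror the YAML shown in this issue.
CONFIG = [
    {
        "display_name": "Baidu Coding",
        "base_url": "https://qianfan.baidubce.com/v2/coding",
        "api_key": "${BAIDU_CODING_API_KEY}",
        "models": [
            "glm-5.1",
            "deepseek-v4-flash",
            "kimi-k2.5",
            "minimax-m2.5",
            "ernie-4.5-turbo",
            "qwen3-coder-plus",
            "qwen3-235b-a22b",
        ],
    }
]


def picker_rows(custom_providers: Iterable[dict]) -> list[tuple[str, str]]:
    """Expand every entry into one (display_name, model) row per model."""
    rows: list[tuple[str, str]] = []
    for entry in custom_providers:
        name = entry.get("display_name", "custom")
        # An entry without models contributes no rows instead of raising.
        for model in entry.get("models") or []:
            rows.append((name, model))
    return rows


if __name__ == "__main__":
    for provider_name, model_name in picker_rows(CONFIG):
        print(f"{provider_name}: {model_name}")
```

With the configuration from Bug 1 this yields seven rows, one for each model, which is the outcome described under Expected Behavior.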

Bug 2: Wrong context lengths

In agent/model_metadata.py, the resolve_context_length() function walks a resolution chain whose steps are numbered 0–10. Custom providers have no provider-specific step, so their models fall through to:

Step 8 — DEFAULT_CONTEXT_LENGTHS fuzzy matching (substring match, longest key first):
- deepseek-v4-flash matches "deepseek" → 128,000 (set as a legacy fallback for older DeepSeek models, not the 1M-window V4 Flash on Baidu)
- glm-5.1 matches "glm" → 202,752 (Z.AI's actual value; Baidu Qianfan caps it at 198,000)
- kimi-k2.5 matches "kimi" → 262,144 (Moonshot's value; Baidu's Coding Plan variant is 256,000)
- minimax-m2.5 matches "minimax" → 204,800 (MiniMax's own API value; Baidu's variant is 192,000)
- ernie-4.5-turbo has no match in DEFAULT_CONTEXT_LENGTHS
Step 10 — default fallback of 256K (for ernie-4.5-turbo with no fuzzy match)
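The matching rule described for steps 8 and 10 (substring match, longest key first, then a default) can be sketched as below. This is a simplified stand-in, not the code in agent/model_metadata.py: `CATCH_ALLS` holds only the four values quoted in this issue, `FALLBACK` assumes that "256K" means 256,000, and the function name is hypothetical.

```python
# Illustrative subset of catch-all values quoted in this issue (tokens).
CATCH_ALLS = {
    "deepseek": 128_000,
    "glm": 202_752,
    "kimi": 262_144,
    "minimax": 204_800,
}

# Assumption: the issue says "256K"; the exact integer is not stated.
FALLBACK = 256_000


def fuzzy_context_length(
    model: str,
    table: dict[str, int] = CATCH_ALLS,
    fallback: int = FALLBACK,
) -> int:
    """Return the value of the longest key that is a substring of the model name."""
    lowered = model.lower()
    for key in sorted(table, key=len, reverse=True):
        if key in lowered:
            return table[key]
    # No key matched: use the default fallback.
    return fallback


if __name__ == "__main__":
    for name in ("deepseek-v4-flash", "glm-5.1", "ernie-4.5-turbo"):
        print(name, fuzzy_context_length(name))
```

Because the lookup is keyed on the model name alone, the endpoint serving the model plays no part in the result. In this sketch a longer key, if one were present in the table, would take precedence over a shorter catch-all; the sketch makes no claim about which keys the real table contains.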

The core issue: DEFAULT_CONTEXT_LENGTHS values are sourced from each model creator's own API (e.g., "deepseek" → 128K from DeepSeek V2/V3 docs), but the same model families served through Baidu Coding Plan can have different context windows. The fuzzy-match table is keyed on the model name alone, so it cannot distinguish deepseek-v4-flash on api.deepseek.com (1M) from the same model on Baidu Coding Plan (also 1M): the request is caught by the legacy "deepseek" → 128K catch-all before the specific "deepseek-v4-flash" → 1M entry.

Native providers solve this via provider-specific steps in the resolution chain (e.g., step 1b for Bedrock in agent/bedrock_adapter.py). Custom providers have no such step.

Proposed Fix (optional)

Add Baidu Coding Plan as a native provider (baidu-coding) that bypasses both bugs:

1. Provider plugin at plugins/model-providers/baidu-coding/: registers the provider and declares 7 curated Coding Plan models with correct context lengths.

2. Provider-scoped context length table: step 1c in the resolution chain (following the Bedrock step 1b pattern in agent/bedrock_adapter.py), with a static table in agent/baidu_coding_context.py:

   glm-5.1           → 198,000    (not 202,752)
   deepseek-v4-flash → 1,000,000  (not 128,000)
   kimi-k2.5         → 256,000    (not 262,144)
   minimax-m2.5      → 192,000    (not 204,800)
   ernie-4.5-turbo   → 128,000    (not 256K fallback)

3. Env vars: BAIDU_CODING_API_KEY (primary) / BAIDU_API_KEY (fallback), plus BAIDU_CODING_BASE_URL (overridable).
4. 7 curated model entries in _PROVIDER_MODELS["baidu-coding"]: Coding Plan models only, not the full Qianfan catalog.

This is the same approach used by all bundled providers: native registration gives correct picker display + correct context resolution + proper /doctor env checks.
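The provider-scoped step in the proposal could look roughly like the sketch below: a static table is consulted first, and only models missing from it fall through to the generic resolution. This is not the prototype mentioned in this issue. The names `BAIDU_CODING_CONTEXT` and `resolve_with_provider_table`, and the `generic_resolver` callback, are hypothetical; the five values are the ones listed in the proposal.

```python
from typing import Callable

# Context lengths in tokens, taken from the table in the proposed fix.
BAIDU_CODING_CONTEXT = {
    "glm-5.1": 198_000,
    "deepseek-v4-flash": 1_000_000,
    "kimi-k2.5": 256_000,
    "minimax-m2.5": 192_000,
    "ernie-4.5-turbo": 128_000,
}


def resolve_with_provider_table(
    provider: str,
    model: str,
    generic_resolver: Callable[[str], int],
) -> int:
    """Prefer the provider-scoped value; otherwise defer to the generic chain."""
    if provider == "baidu-coding":
        scoped = BAIDU_CODING_CONTEXT.get(model)
        if scoped is not None:
            return scoped
    # Other providers, or models not in the table, use the existing resolution.
    return generic_resolver(model)


if __name__ == "__main__":
    # Stand-in for the generic chain; always answers with the "deepseek" catch-all.
    legacy = lambda _model: 128_000
    print(resolve_with_provider_table("baidu-coding", "deepseek-v4-flash", legacy))
    print(resolve_with_provider_table("custom", "deepseek-v4-flash", legacy))
```

Placing the scoped lookup ahead of the generic chain keeps the name-based catch-alls untouched for every other provider while letting the same model name resolve differently depending on the endpoint.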

I have a working prototype of the baidu-coding provider — plugin registration, context length table, 7 curated models, and tests. Currently validating against the repo's test conventions before submitting as a PR.

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR


