hermes - 💡(How to fix) Fix [Bug]: LM Studio custom_providers per-model context_length broken in 0.14.0 — regressed to 64K

StepCodex · 2026-05-22T02:24:42Z




Bug Description

Per-model context_length under custom_providers is completely ignored when using LM Studio as a provider. This affects ALL models. After updating to 0.14.0 it regressed — previously showing 256K, now every single model shows 64K regardless of what context_length is set in config.yaml.

Config example:

custom_providers:
  - name: lmstudio-qwen3.6-35b-a3b
    base_url: http://localhost:1234/v1
    model: qwen3.6-35b-a3b
    models:
      qwen3.6-35b-a3b:
        context_length: 262144
  - name: lmstudio-nemotron-3-nano-4b
    base_url: http://localhost:1234/v1
    model: nemotron-3-nano-4b
    models:
      nemotron-3-nano-4b:
        context_length: 1048576

Both of these models, and every other configured model, show 64K.

(Note: even before the update the displayed value was wrong, but it was definitely better.)

The logs also contain this message: "Could not detect context length for model 'nvidia/nemotron-3-nano-4b' at http://127.0.0.1:1234/v1 — defaulting to 256,000 tokens (probe-down). Set model.context_length in config.yaml to override."
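One detail worth checking when triaging: the logged identifiers differ textually from the configured ones (`nvidia/nemotron-3-nano-4b` at `127.0.0.1` in the log, `nemotron-3-nano-4b` at `localhost` in the config). The report does not establish that this is related to the bug. The sketch below only illustrates the difference. Both helper functions are hypothetical and not part of Hermes, and treating `localhost` as `127.0.0.1` is an assumption made for the comparison.

```python
from urllib.parse import urlsplit


def normalize_model_id(model_id: str) -> str:
    """Hypothetical helper: drop a leading 'publisher/' prefix and lowercase."""
    return model_id.rsplit("/", 1)[-1].strip().lower()


def normalize_base_url(url: str) -> str:
    """Hypothetical helper: compare URLs with 'localhost' mapped to 127.0.0.1."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host == "localhost":  # assumption: both names point at the same server
        host = "127.0.0.1"
    port = f":{parts.port}" if parts.port else ""
    return f"{parts.scheme}://{host}{port}{parts.path.rstrip('/')}"


# (base_url, model) as written in config.yaml versus as printed in the log line.
configured = ("http://localhost:1234/v1", "nemotron-3-nano-4b")
logged = ("http://127.0.0.1:1234/v1", "nvidia/nemotron-3-nano-4b")

# A plain string comparison treats the two pairs as different.
exact_match = configured == logged

# After normalization the two pairs refer to the same endpoint and model name.
normalized_match = (
    normalize_base_url(configured[0]),
    normalize_model_id(configured[1]),
) == (
    normalize_base_url(logged[0]),
    normalize_model_id(logged[1]),
)

print(f"exact match: {exact_match}, normalized match: {normalized_match}")
```

If a per-model lookup were keyed on the exact strings, a difference like this could cause a miss. Whether Hermes does so is unknown from the report.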

Steps to Reproduce

1. Add custom_providers to ~/.hermes/config.yaml with per-model context_length for LM Studio models
2. Run hermes chat
3. Check context shown in the status bar at the bottom

Expected Behavior

Each model should use the context_length defined for it in custom_providers. Examples:

- qwen3.6-35b-a3b → 262,144 tokens
- glm-4.7-flash → 202,752 tokens
- nemotron-3-nano-4b → 1,048,576 tokens
- gemma-3-1b → 32,768 tokens
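The expected behavior can be stated as a small lookup. This is a minimal sketch, not Hermes code: `resolve_context_length`, `CUSTOM_PROVIDERS`, and `DEFAULT_CONTEXT_LENGTH` are hypothetical names, the list mirrors the two config entries from the report as parsed YAML would look, and the 64,000 fallback is simply the value the report observes in the status bar.

```python
from typing import Any

# Assumed fallback, taken from the value the report sees in the status bar.
DEFAULT_CONTEXT_LENGTH = 64_000

# The two custom_providers entries from the report, as plain Python data.
CUSTOM_PROVIDERS: list[dict[str, Any]] = [
    {
        "name": "lmstudio-qwen3.6-35b-a3b",
        "base_url": "http://localhost:1234/v1",
        "model": "qwen3.6-35b-a3b",
        "models": {"qwen3.6-35b-a3b": {"context_length": 262144}},
    },
    {
        "name": "lmstudio-nemotron-3-nano-4b",
        "base_url": "http://localhost:1234/v1",
        "model": "nemotron-3-nano-4b",
        "models": {"nemotron-3-nano-4b": {"context_length": 1048576}},
    },
]


def resolve_context_length(
    providers: list[dict[str, Any]],
    provider_name: str,
    model: str,
    default: int = DEFAULT_CONTEXT_LENGTH,
) -> int:
    """Return the configured per-model context_length, else the default."""
    for provider in providers:
        if provider.get("name") != provider_name:
            continue
        # Tolerate a missing or empty 'models' mapping.
        per_model = (provider.get("models") or {}).get(model) or {}
        value = per_model.get("context_length")
        if isinstance(value, int) and value > 0:
            return value
    return default


qwen = resolve_context_length(
    CUSTOM_PROVIDERS, "lmstudio-qwen3.6-35b-a3b", "qwen3.6-35b-a3b"
)
nemotron = resolve_context_length(
    CUSTOM_PROVIDERS, "lmstudio-nemotron-3-nano-4b", "nemotron-3-nano-4b"
)
print(qwen, nemotron)
```

Under this reading, the fallback applies only when no per-model value is configured. The report describes the fallback being applied even when a value is present.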

Actual Behavior

Every single model shows 64,000 tokens regardless of what's configured. The configured context_length values are completely ignored.

Affected Component

Configuration (config.yaml, .env, hermes setup)

Messaging Platform (if gateway-related)

N/A (CLI only)

Debug Report

https://paste.rs/4YutI
https://paste.rs/q99Vj

Operating System

Windows 11

Python Version

Python 3.14.0

Hermes Version

0.14.0 (2026.5.16)

Additional Logs / Traceback (optional)

Could not detect context length for model 'nvidia/nemotron-3-nano-4b' at http://127.0.0.1:1234/v1 — defaulting to 256,000 tokens (probe-down)

Root Cause Analysis (optional)

No response

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

- [ ] I'd like to fix this myself and submit a PR
