hermes - 💡(How to fix) Fix [Bug]: Fix: stale title generation request reloads unloaded Ollama model after model switch [1 participants]

hermes · 2026-05-02 23:57:34

GitHub stats

NousResearch/hermes-agent#19027 • Fetched 2026-05-03 04:52:51

Author

xy6859

Participants

xy6859

Timeline (top)

labeled ×4

Error Message

⚠ Auxiliary title generation failed: HTTP 400: Error code: 400 - {'error': 'Model unloaded.'}

Fix Action

Fix / Workaround

Patch file available: hermes-title-generator-fix.patch

---

Bug Description

Problem

When using Ollama with strict_single_load (only one model in memory at a time),
the background auto-title generation thread can cause a previously unloaded model
to be reloaded into memory.

Reproduction steps

1. Start a session with model A (e.g. Qwen3.6-27B-oQ8-fp16)
2. Hermes completes the first response and spawns a background thread for
   auto-title generation (captures main_runtime pointing to model A)
3. User immediately sends the next prompt, triggering a model switch to model B
4. Ollama's strict_single_load unloads model A
5. The background title thread still holds the old main_runtime snapshot and
   sends an HTTP request for model A → Ollama reloads model A into memory

Observed symptoms


[ollama-router] Switched general -> Qwen:Qwen3.6-27B-oQ8-fp16 (reason: default fast daily route; strategy: strict_single_load)
⚠ Auxiliary title generation failed: HTTP 400: Error code: 400 - {'error': 'Model unloaded.'}
[ollama-router] Switched general -> Qwen:Qwen3.6-27B-oQ8-fp16 (model reloaded again)


Both models end up in memory, defeating the purpose of strict_single_load and
wasting GPU VRAM.

Root cause

maybe_auto_title() is called with a snapshot of the current main_runtime
(model name, provider, base_url). The background thread holds this snapshot and
uses it unconditionally — it has no way to detect that the user has already
switched to a different model before its LLM call executes.
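
The race described above can be illustrated with a small, self-contained simulation. This is only a sketch: `fake_llm_call`, `title_worker`, `live_state`, and the model names are hypothetical stand-ins, not Hermes or Ollama code, and an event is used to force the ordering from the reproduction steps.

```python
import threading

# Hypothetical stand-ins: nothing here is Hermes or Ollama code.
requests_seen = []                      # models that received a request, in order
live_state = {"model": "model-A"}       # what the user is currently talking to


def fake_llm_call(model: str) -> None:
    """Record the request instead of performing real HTTP."""
    requests_seen.append(model)


def title_worker(snapshot: dict, go: threading.Event) -> None:
    go.wait()                           # delay until after the model switch
    fake_llm_call(snapshot["model"])    # the snapshot is used unconditionally


go = threading.Event()
snapshot = dict(live_state)             # captured when the thread is spawned
worker = threading.Thread(target=title_worker, args=(snapshot, go))
worker.start()

live_state["model"] = "model-B"         # next prompt arrives; model switches
fake_llm_call(live_state["model"])
go.set()                                # the title request finally fires
worker.join()

print(requests_seen)  # ['model-B', 'model-A']
```

The last request goes to the model that was already switched away from, which is the request that would cause the server to load it again.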

Fix

Add an optional runtime_validator callback to the title generation pipeline
(generate_title → auto_title_session → maybe_auto_title). All callers
(cli.py, gateway/run.py, tui_gateway/server.py, acp_adapter/server.py) now
capture the current model/provider at call time and pass a validator that checks
whether the live agent state still matches. If the model was switched, the
background thread skips the request silently — no HTTP call is made, so no
stale model reloads are triggered.
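
A minimal sketch of the check inside the pipeline, assuming the validator is a zero-argument callable that returns True while the runtime is still current. The function name `generate_title_sketch`, its `prompt` and `llm_call` parameters, and `fake_llm` are hypothetical; the real signatures in agent/title_generator.py are not shown in the issue.

```python
from typing import Callable, Optional


def generate_title_sketch(
    prompt: str,
    llm_call: Callable[[str], str],
    runtime_validator: Optional[Callable[[], bool]] = None,
) -> Optional[str]:
    """Stand-in for the title pipeline; the real signature is not shown in the issue."""
    # Check just before the request: a False result means the runtime is stale.
    if runtime_validator is not None and not runtime_validator():
        return None                     # skip silently, no HTTP call is made
    return llm_call(prompt)


calls = []


def fake_llm(prompt: str) -> str:
    """Hypothetical replacement for the real LLM request."""
    calls.append(prompt)
    return "Short title"


# Validator reports a model switch: the request is skipped.
skipped = generate_title_sketch("hello", fake_llm, runtime_validator=lambda: False)

# No validator passed: the call goes through as before.
default = generate_title_sketch("hello", fake_llm)
```

Placing the check immediately before the LLM call keeps the window between validation and the request as small as possible, although it does not close it entirely.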

Changes

- agent/title_generator.py — new runtime_validator parameter on
  generate_title(), auto_title_session(), and maybe_auto_title();
  validator is checked before the LLM call
- cli.py — captures _title_model / _title_provider and passes validator
- gateway/run.py — same pattern for gateway callers
- tui_gateway/server.py — same pattern; also adds missing failure_callback
  and main_runtime for parity with CLI
- acp_adapter/server.py — same pattern; also adds missing parameters for parity
- tests/agent/test_title_generator.py — 3 new tests covering the validator,
  plus updated existing assertions for the new parameter
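
On the caller side, the pattern might look roughly like the following. This is a sketch under assumptions: the `agent` object with `model` and `provider` attributes and the helper `make_runtime_validator` are hypothetical, and only the `_title_model` / `_title_provider` names come from the change list above.

```python
from types import SimpleNamespace
from typing import Callable


def make_runtime_validator(agent) -> Callable[[], bool]:
    """Build a validator bound to the model/provider seen at call time."""
    _title_model = agent.model          # captured now, compared later
    _title_provider = agent.provider

    def _validator() -> bool:
        # True only while the live agent still matches the captured values.
        return agent.model == _title_model and agent.provider == _title_provider

    return _validator


# Hypothetical live agent state.
agent = SimpleNamespace(model="model-A", provider="ollama")
validator = make_runtime_validator(agent)

before_switch = validator()             # runtime unchanged
agent.model = "model-B"                 # user switches models
after_switch = validator()              # captured values no longer match
```

The closure reads the live agent on every call, so the background thread sees the switch even though it was started with the old snapshot.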

Backward compatibility

All new parameters have default values of None. Callers that don't pass
runtime_validator get the original behavior unchanged. No breaking changes.

Testing

python -m pytest tests/agent/test_title_generator.py -v  # 22 passed
python -m pytest tests/test_tui_gateway_server.py -k "title" -v  # 13 passed
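
The actual tests are not reproduced in the issue. As a rough idea of what validator coverage could look like, here is a pytest-style sketch against a hypothetical stand-in function (`title_with_validator`); it is not the content of tests/agent/test_title_generator.py.

```python
from typing import Callable, Optional
from unittest.mock import Mock


def title_with_validator(
    llm_call: Callable[[], str],
    runtime_validator: Optional[Callable[[], bool]] = None,
) -> Optional[str]:
    """Minimal hypothetical stand-in for the function under test."""
    if runtime_validator is not None and not runtime_validator():
        return None
    return llm_call()


def test_skips_when_runtime_is_stale():
    llm = Mock(return_value="A title")
    assert title_with_validator(llm, runtime_validator=lambda: False) is None
    llm.assert_not_called()             # no request reaches the model


def test_calls_llm_when_runtime_matches():
    llm = Mock(return_value="A title")
    assert title_with_validator(llm, runtime_validator=lambda: True) == "A title"
    llm.assert_called_once()


def test_default_validator_keeps_original_behavior():
    llm = Mock(return_value="A title")
    assert title_with_validator(llm) == "A title"
    llm.assert_called_once()
```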

Patch file available: hermes-title-generator-fix.patch

Steps to Reproduce

Patch file available: hermes-title-generator-fix.patch

Expected Behavior

Fix: stale title generation request reloads unloaded Ollama model after model switch

Actual Behavior

Fix: stale title generation request reloads unloaded Ollama model after model switch

Affected Component

CLI (interactive chat)

Messaging Platform (if gateway-related)

No response

Debug Report

Fix: stale title generation request reloads unloaded Ollama model after model switch

Operating System

macOS 15.7.5

Python Version

3.11.15

Hermes Version

No response

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

No response

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR

extent analysis

TL;DR

Apply the provided patch file hermes-title-generator-fix.patch to add a runtime validator callback to the title generation pipeline.

Guidance

- Review the changes in agent/title_generator.py to understand how the runtime_validator parameter is used to prevent stale model reloads.
- Verify that the runtime_validator callback is correctly implemented in all callers, including cli.py, gateway/run.py, tui_gateway/server.py, and acp_adapter/server.py.
- Test the fix using the provided test commands: python -m pytest tests/agent/test_title_generator.py -v and python -m pytest tests/test_tui_gateway_server.py -k "title" -v.
- Ensure that the patch does not introduce any breaking changes, as the new parameters have default values of None.

Example

No code snippet is provided, as the fix is described in detail in the issue body.

Notes

The provided patch file and test results suggest that the fix is well-tested and should resolve the issue. However, it is essential to review the changes and test the fix in your specific environment to ensure compatibility.

Recommendation

Apply the provided patch file hermes-title-generator-fix.patch. It adds a validation step that lets the background title thread detect a model switch and skip its request, which prevents stale model reloads.
