hermes - [Bug]: Content-filter triggered stream stall (output new_sensitive) does not trigger fallback_providers

Summary

When the primary provider's output layer safety filter (e.g. MiniMax's output new_sensitive (1027)) terminates a streaming response mid-delivery, Hermes does not activate the configured fallback_providers chain. Instead it loops indefinitely retrying the same content against the same provider, which hits the same filter again.

This is a separate root cause from the stale-stream timeout case (#25689). That bug requires the stale detector to fire; here the filter terminates the stream while it still appears healthy, so the detector never fires.


Environment

  • Hermes: latest main (cea87d913)
  • Provider: minimax-portal/MiniMax-M2.7 via OpenRouter
  • Configured fallback chain: yes (primary → fallback → fallback-2)
  • Error code: output new_sensitive (1027) — provider-side content safety filter

Reproduction

  1. Configure minimax-portal/MiniMax-M2.7 as primary with a fallback_providers chain in config.yaml
  2. Send a tool-call that produces output content large enough to exceed the provider's output safety threshold (e.g. write_file with a ~17KB markdown file, patch with large context)
  3. Provider begins streaming; Hermes sees live chunks
  4. MiniMax output layer silently terminates the SSE stream
  5. Hermes receives a partial stream stub: finish_reason="length" + _dropped_tool_names
  6. Conversation loop sends a continuation prompt: "Do NOT retry the same large tool call. Break content into smaller tool calls."
  7. Model retries same content → hits same new_sensitive filter → loop

Expected: After N retries, the fallback provider should be activated.
Actual: Infinite retry loop against the same primary; the fallback chain never activates.


Root cause

Two problems compound:

1. Partial stream stub is invisible to fallback logic

In agent/chat_completion_helpers.py, when a stream delivers some chunks then is killed by the provider, the response is returned as a partial stub with finish_reason="length". This is treated by conversation_loop.py as a truncation event requiring continuation, not as a failure requiring fallback.
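The misrouting described above can be sketched in a few lines of Python. This is only an illustration under assumptions: `StubResponse` and `route` are hypothetical stand-ins, not the actual types or functions in agent/chat_completion_helpers.py or conversation_loop.py, and the "error" branch is invented for contrast.

```python
from dataclasses import dataclass, field


@dataclass
class StubResponse:
    """Hypothetical stand-in for the partial stub returned after a killed stream."""
    finish_reason: str
    dropped_tool_names: list = field(default_factory=list)


def route(response: StubResponse) -> str:
    """Toy routing: anything labeled 'length' is continued, never failed over."""
    if response.finish_reason == "length":
        return "continue"  # truncation path: send a continuation prompt
    if response.finish_reason == "error":
        return "fallback"  # failure path: try the next provider (assumed branch)
    return "done"


# A stream killed by the provider's filter arrives looking like a truncation.
killed = StubResponse(finish_reason="length", dropped_tool_names=["write_file"])
decision = route(killed)
print(decision)
```

Because the stub carries no marker that distinguishes "provider killed the stream" from "model ran out of tokens", a router of this shape has nothing to key a failover decision on.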

2. Continuation prompt retries the same cause

The new continuation prompt (introduced in cea87d913) instructs the model to retry the same content with smaller tool calls. But when the filter is content-specific (always triggered by the same content), retrying locally against the same provider is futile — it hits the same new_sensitive filter again.

The stale-detector path (_maybe_activate_fallback_on_stale_stream from #25789) is never entered: new_sensitive fires on an active, chunk-delivering stream, so no stale timeout accumulates.
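To illustrate why a gap-based detector stays silent here, the following Python sketch models stale detection as "some gap between stream events exceeds a threshold". The function name, the timestamps, and the 30-second threshold are all assumptions for illustration; the actual logic behind _maybe_activate_fallback_on_stale_stream is not shown in this report.

```python
def stale_detector_fires(chunk_times, end_time, stale_after):
    """Return True if any gap between consecutive stream events exceeds stale_after.

    chunk_times: arrival times of chunks in seconds (hypothetical data).
    end_time: time at which the stream closed or was abandoned.
    """
    events = list(chunk_times) + [end_time]
    gaps = [later - earlier for earlier, later in zip(events, events[1:])]
    return any(gap > stale_after for gap in gaps)


# Filter-terminated stream: chunks keep arriving, then the stream closes promptly.
filtered = stale_detector_fires([0.0, 0.5, 1.0, 1.5], end_time=1.7, stale_after=30.0)

# Genuinely stalled stream: chunks stop and nothing happens for a long time.
stalled = stale_detector_fires([0.0, 0.5], end_time=45.0, stale_after=30.0)

print(filtered, stalled)
```

In this model the filter-terminated stream never produces a long gap, so a detector of this kind has no opportunity to fire.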


Suggested fix

Option A: Classify new_sensitive as a failover signal

In agent/error_classifier.py, treat new_sensitive as FailoverReason.content_filter (or similar). The stream-stall detection path already exists; it just needs a sentinel that fires when this specific error code is observed, not just on stale timeout.
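A rough Python sketch of what Option A could look like follows. The enum members and `classify_stream_error` are hypothetical; the real FailoverReason in agent/error_classifier.py may have different members and a different classification entry point. Matching on the substring "new_sensitive" assumes the provider's error text is visible to the classifier.

```python
import enum
from typing import Optional


class FailoverReason(enum.Enum):
    # Hypothetical members; the real enum may be defined differently.
    stale_stream = "stale_stream"
    content_filter = "content_filter"


def classify_stream_error(message: str) -> Optional[FailoverReason]:
    """Map a provider error message to a failover reason, or None if unrecognized."""
    if "new_sensitive" in message.lower():
        return FailoverReason.content_filter
    return None


reason = classify_stream_error("output new_sensitive (1027)")
print(reason)
```

Returning a distinct reason, rather than reusing the stale-stream one, would let the caller skip same-provider retries for content-filter hits while keeping other retry behavior unchanged.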

Option B: Treat partial stream stub + _dropped_tool_names as a failover trigger

In conversation_loop.py, when a stub response has _dropped_tool_names and finish_reason="length", and the same tool was already retried N times, activate fallback instead of sending another continuation prompt.
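One possible shape for Option B is a small counter keyed on the dropped tool names. This is a sketch under assumptions: `ContinuationGuard`, its method, and the default of N = 2 are hypothetical and do not correspond to existing code in conversation_loop.py.

```python
from collections import Counter


class ContinuationGuard:
    """Hypothetical helper: cap continuation prompts per set of dropped tools."""

    def __init__(self, max_continuations=2):  # N = 2 is an assumed default
        self.max_continuations = max_continuations
        self.counts = Counter()

    def next_action(self, finish_reason, dropped_tool_names):
        # Only partial stubs with dropped tool calls are counted.
        if finish_reason != "length" or not dropped_tool_names:
            return "proceed"
        key = tuple(sorted(dropped_tool_names))
        self.counts[key] += 1
        if self.counts[key] > self.max_continuations:
            return "fallback"  # stop continuing; hand over to the fallback chain
        return "continue"  # send another continuation prompt


guard = ContinuationGuard(max_continuations=2)
actions = [guard.next_action("length", ["write_file"]) for _ in range(3)]
print(actions)
```

Unlike Option A, this approach does not need the provider's error code at all, so it would also bound the loop when the stream is terminated without any error text.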

Option C: Sentinel in chat completion client

In agent/chat_completion_helpers.py, when new_sensitive is detected in the stream error, surface it as a classified error so the caller can route it to the fallback logic directly.
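Option C might be sketched as a dedicated exception raised while consuming the stream. Everything here is hypothetical: `ContentFilterError`, `consume_stream`, `complete_with_fallback`, and the dict-shaped events are illustrative stand-ins, and the sketch assumes the new_sensitive error is actually observable in the stream, which the reproduction steps suggest may not always be the case.

```python
class ContentFilterError(Exception):
    """Hypothetical classified error for provider-side content filtering."""


def consume_stream(events):
    """Join chunk text; raise a classified error if the stream reports new_sensitive."""
    chunks = []
    for event in events:
        if event.get("type") == "error":
            message = str(event.get("message", ""))
            if "new_sensitive" in message:
                raise ContentFilterError(message)
            raise RuntimeError(message)
        chunks.append(event["text"])
    return "".join(chunks)


def complete_with_fallback(providers):
    """Try each (name, events) pair in order; skip providers that hit the filter."""
    for name, events in providers:
        try:
            return name, consume_stream(events)
        except ContentFilterError:
            continue  # the same content would hit the same filter; move on
    raise RuntimeError("all providers exhausted")


providers = [
    ("primary", [{"type": "chunk", "text": "par"},
                 {"type": "error", "message": "output new_sensitive (1027)"}]),
    ("fallback", [{"type": "chunk", "text": "ok"}]),
]
result = complete_with_fallback(providers)
print(result)
```

Raising at the point of detection means the partial text from the primary is discarded instead of being returned as a finish_reason="length" stub, so the continuation path is never entered.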


Related issues

  • #25689 — stale stream timeout fallback gap (different root cause — stale detector fires there; it does NOT fire here)
  • #22277 — fallback chain not activated on stream-stall timeouts (same symptom, different trigger)
  • #25789 — PR fix for #25689 (stale-detector path; does not cover content-filter path)

Impact

Any user whose primary provider has provider-side output content filtering (common for Chinese model providers) is affected. With new_sensitive triggering on large tool-call output, the fallback chain is effectively dead for those providers — users must manually /reset and switch providers.
