hermes - [Bug]: Content-filter triggered stream stall (output new_sensitive) does not trigger fallback_providers

StepCodex · 2026-05-26T05:35:51Z


Error Message

Error code: output new_sensitive (1027), a provider-side content safety filter.

Root Cause

Two problems compound:

1. Partial stream stub is invisible to fallback logic

In agent/chat_completion_helpers.py, when a stream delivers some chunks then is killed by the provider, the response is returned as a partial stub with finish_reason="length". This is treated by conversation_loop.py as a truncation event requiring continuation, not as a failure requiring fallback.

2. Continuation prompt retries the same cause

The new continuation prompt (introduced in cea87d913) instructs the model to retry the same content with smaller tool calls. But when the filter is content-specific (always triggered by the same content), retrying locally against the same provider is futile — it hits the same new_sensitive filter again.

The stale-detector path (_maybe_activate_fallback_on_stale_stream from #25789) is never entered: new_sensitive fires on an active, chunk-delivering stream, so no stale timeout accumulates.
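The routing described above can be pictured with a toy model. This is only a sketch of the behaviour the issue reports, not Hermes's actual code: `StubResponse`, `decide_next_step`, and the `stale_timeout_fired` flag are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StubResponse:
    """Hypothetical stand-in for the partial stream stub described in the issue."""
    finish_reason: str
    dropped_tool_names: list = field(default_factory=list)
    stale_timeout_fired: bool = False  # assumed flag for the stale-detector path


def decide_next_step(resp: StubResponse) -> str:
    """Toy model of the reported routing (illustrative only)."""
    if resp.stale_timeout_fired:
        return "fallback"   # only the stale-detector path reaches fallback
    if resp.finish_reason == "length":
        return "continue"   # stub is read as truncation, so a continuation prompt is sent
    return "done"


# A filter-killed stream: chunks were flowing, so no stale timeout ever fired.
filtered = StubResponse(finish_reason="length", dropped_tool_names=["write_file"])
steps = [decide_next_step(filtered) for _ in range(5)]
```

In this model every attempt resolves to a continuation, which matches the reported loop: nothing in the stub distinguishes a filter kill from an ordinary length truncation.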

Summary

When the primary provider's output-layer safety filter (e.g. MiniMax's output new_sensitive (1027)) terminates a streaming response mid-delivery, Hermes does not activate the configured fallback_providers chain. Instead, it loops indefinitely, retrying the same content against the same provider and hitting the same filter each time.

This is a separate root cause from the stale-stream timeout case (#25689). That bug requires the stale detector to fire; this one is triggered while the stream still appears healthy.

Environment

Hermes: latest main (cea87d913)
Provider: minimax-portal/MiniMax-M2.7 via OpenRouter
Configured fallback chain: yes (primary → fallback → fallback-2)
Error code: output new_sensitive (1027) — provider-side content safety filter

Reproduction

1. Configure minimax-portal/MiniMax-M2.7 as primary with a fallback_providers chain in config.yaml
2. Send a tool-call that produces output content large enough to exceed the provider's output safety threshold (e.g. write_file with a ~17KB markdown file, patch with large context)
3. Provider begins streaming; Hermes sees live chunks
4. MiniMax output layer silently terminates the SSE stream
5. Hermes receives a partial stream stub: finish_reason="length" + _dropped_tool_names
6. Conversation loop sends a continuation prompt: "Do NOT retry the same large tool call. Break content into smaller tool calls."
7. Model retries same content → hits same new_sensitive filter → loop

Expected: After N retries, the fallback provider should be activated.

Actual: Infinite retry loop against the same primary; the fallback chain never activates.

Suggested fix

Option A: Classify new_sensitive as a failover signal

In agent/error_classifier.py, treat new_sensitive as FailoverReason.content_filter (or similar). The stream-stall detection path already exists; it just needs a sentinel that fires when this specific error code is observed, not just on stale timeout.
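A minimal sketch of what such a classification could look like. The real contents of agent/error_classifier.py are not shown in this issue, so the enum members, the `classify_stream_error` helper, and the assumption that the marker appears in the error text are all hypothetical; only the name `FailoverReason.content_filter` comes from the suggestion itself.

```python
import enum
from typing import Optional


class FailoverReason(enum.Enum):
    # Hypothetical members; the real enum in Hermes may differ.
    stale_stream = "stale_stream"
    content_filter = "content_filter"


# Marker assumed to appear in the provider's error text.
_CONTENT_FILTER_MARKER = "new_sensitive"


def classify_stream_error(message: str) -> Optional[FailoverReason]:
    """Return a failover reason if the error text looks like a content-filter kill."""
    if _CONTENT_FILTER_MARKER in message.lower():
        return FailoverReason.content_filter
    return None  # unknown errors are left to the existing handling
```

Matching on the marker string rather than the bare numeric code avoids false positives from unrelated messages that happen to contain "1027".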

Option B: Treat partial stream stub + _dropped_tool_names as a failover trigger

In conversation_loop.py, when a stub response has _dropped_tool_names and finish_reason="length", and the same tool was already retried N times, activate fallback instead of sending another continuation prompt.
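One way to picture the retry budget is a small per-tool counter. This is a sketch under assumptions: `DroppedToolTracker`, `should_fall_back`, and the default of two retries are hypothetical and do not come from conversation_loop.py.

```python
from collections import Counter


class DroppedToolTracker:
    """Counts stub responses per dropped tool and signals when to fall back."""

    def __init__(self, max_retries: int = 2):
        self.max_retries = max_retries  # hypothetical default for "N"
        self._counts = Counter()

    def should_fall_back(self, finish_reason: str, dropped_tool_names: list) -> bool:
        # Only partial stubs with dropped tool calls count toward the budget.
        if finish_reason != "length" or not dropped_tool_names:
            return False
        self._counts.update(set(dropped_tool_names))
        # The first stub is the original attempt; fall back once retries are exhausted.
        return any(self._counts[name] > self.max_retries for name in dropped_tool_names)

    def reset(self) -> None:
        """Clear the counters, e.g. after a tool call finally succeeds."""
        self._counts.clear()


tracker = DroppedToolTracker(max_retries=2)
results = [tracker.should_fall_back("length", ["write_file"]) for _ in range(3)]
```

With `max_retries=2`, the original attempt and two retries get continuation prompts, and the third stub for the same tool signals fallback.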

Option C: Sentinel in chat completion client

In agent/chat_completion_helpers.py, when new_sensitive is detected in the stream error, surface it as a classified error so the caller can route it to the fallback logic directly.
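A rough sketch of such a sentinel, assuming the provider error is observable in the stream at all (the reproduction describes the termination as silent, so this may not hold in every case). `ContentFilterError`, `guard_stream`, and the dict-shaped chunks with an "error" key are hypothetical stand-ins, not the structures used in agent/chat_completion_helpers.py.

```python
from typing import Iterable, Iterator


class ContentFilterError(Exception):
    """Hypothetical classified error the caller could route to fallback logic."""


def guard_stream(chunks: Iterable[dict]) -> Iterator[str]:
    """Yield chunk text, raising a classified error if a filter marker is seen."""
    for chunk in chunks:
        error = chunk.get("error")
        if error and "new_sensitive" in str(error):
            # Raise instead of returning a partial stub with finish_reason="length".
            raise ContentFilterError(str(error))
        yield chunk.get("content", "")


# Usage: text received before the filter fires is kept; the error is then surfaced.
received = []
raised = False
try:
    for text in guard_stream([{"content": "a"}, {"error": "output new_sensitive (1027)"}]):
        received.append(text)
except ContentFilterError:
    raised = True
```

Raising at the client boundary means the caller no longer has to infer a filter kill from the shape of a stub response.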

Related issues

#25689 — stale stream timeout fallback gap (different root cause — stale detector fires there; it does NOT fire here)
#22277 — fallback chain not activated on stream-stall timeouts (same symptom, different trigger)
#25789 — PR fix for #25689 (stale-detector path; does not cover content-filter path)

Impact

Any user whose primary provider has provider-side output content filtering (common for Chinese model providers) is affected. With new_sensitive triggering on large tool-call output, the fallback chain is effectively dead for those providers — users must manually /reset and switch providers.
