hermes - 💡(How to fix) Fix Custom fallback providers fail silently when they don't support SSE streaming

hermes2026-05-07 21:29:21

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

Root Cause

Root cause

Fix Action

Fix / Workaround

Workaround

Code Example

fallback_providers:
  - provider: custom
    model: my-model
    base_url: http://my-proxy:8080/v1

RAW_BUFFERClick to expand / collapse

Describe the bug

When a custom fallback provider returns a non-streaming JSON response to a stream=True request, the OpenAI SDK's streaming parser receives zero chunks. This causes:

content_parts stays empty → full_content = "".join([]) or None = None
Response is flagged as "empty" → retry loop → fallback cascade
The provider's valid response is silently discarded

This affects any custom provider that doesn't implement SSE streaming (e.g., lightweight proxies, self-hosted endpoints, Vertex AI REST API).

To reproduce

Configure a custom fallback provider that returns valid JSON but not SSE:

fallback_providers:
  - provider: custom
    model: my-model
    base_url: http://my-proxy:8080/v1

Primary provider hits rate limit → Hermes falls back to custom provider
Custom provider returns valid {"choices": [...]} JSON
Hermes logs: ⚠️ Empty response from model — retrying (1/3)
After 3 retries: cascades to next fallback or gives up

Root cause

run_agent.py line ~8089: _use_streaming = True is unconditional — there's no per-provider or per-fallback streaming toggle. The comment says "Always prefer the streaming path" for health-monitoring benefits, but this assumption breaks custom providers.

When client.chat.completions.create(stream=True) receives a JSON response instead of SSE, the SDK's Stream iterator yields zero chunks. The streaming response builder at line ~5040 produces full_content = None with no tool calls → flagged as invalid.

Expected behavior

Either:

(A) Add a per-provider config flag to disable streaming: fallback_providers: [{provider: custom, model: x, base_url: y, stream: false}]
(B) Detect non-SSE responses in the streaming path and fall back to non-streaming parsing
(C) Document that custom providers MUST support SSE streaming

Workaround

We built a lightweight proxy (~200 lines Python) that translates OpenAI streaming requests to Vertex AI's native streamGenerateContent?alt=sse endpoint and converts the chunks back to OpenAI chat.completion.chunk format. Happy to contribute this as a reference implementation or built-in adapter.

Environment

Hermes version: 0.8.0
Provider: custom (Vertex AI via proxy)
Platform: Docker (gateway mode)
OS: macOS (Apple Silicon)

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #orchestration issue #cache issue #memory leak #API versioning

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

hermes - 💡(How to fix) Fix Custom fallback providers fail silently when they don't support SSE streaming

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Code Example

Still need to ship something?

TRENDING

hermes - 💡(How to fix) Fix Custom fallback providers fail silently when they don't support SSE streaming

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Code Example

Still need to ship something?

RELATED_DISCOVERY

TRENDING