openclaw - ✅(Solved) Fix [Bug]: Streaming 4xx responses lose their JSON error body, hiding provider error detail (e.g. Gemini 400) [1 pull requests, 1 comments, 2 participants]

GitHub stats
openclaw/openclaw#78180 · Fetched 2026-05-06 06:16:16
Comments: 1 · Participants: 2 · Timeline: 2 (commented ×1, cross-referenced ×1) · Reactions: 2

Provider HTTP errors with Content-Type: text/event-stream (e.g. Google Gemini's streamGenerateContent) bubble up with their JSON error body stripped, so the thrown error degrades to just a status code (Google Generative AI API error (400)) with no [code=…] or message detail.

Error Message

Observed in the gateway error log (detail missing):

error=Google Generative AI API error (400)

Expected, as seen on builds prior to ~2026-04-30:

error=Google Generative AI API error (400): * GenerateContentRequest.contents: contents is not specified [code=INVALID_ARGUMENT]

A direct curl against the same endpoint with the same payload returns the JSON body { "error": { "code": 400, "message": "...", "status": "INVALID_ARGUMENT" } }, so the detail is lost inside the runtime: the response body is emptied before createProviderHttpError reads it.

Root Cause

sanitizeOpenAISdkSseResponse in src/agents/provider-transport-fetch.ts wraps every response whose Content-Type contains text/event-stream, then drops any chunk that does not have a data: prefix (its purpose is to strip event-only / blank-data keepalives so the OpenAI SDK's stream parser does not JSON.parse them).

For non-OK responses, that wrapper is wrong:

  • The Google transport (and any other transport using buildGuardedModelFetch) calls if (!response.ok) throw await createProviderHttpError(response, …) — error responses never go through the SDK's SSE parser.
  • Gemini's streamGenerateContent returns the JSON { "error": { … } } body for 4xx with Content-Type: text/event-stream retained.
  • The sanitizer sees no data: lines, drops the whole body, and extractProviderErrorDetail reads an empty string. The thrown error keeps only (${status}).
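
The effect of a "keep only data: lines" filter on a plain JSON error body can be illustrated with a small sketch. The helper name keepDataLines is hypothetical and works on whole strings for brevity; the real sanitizer operates on stream chunks and its code is not shown on this page.

```typescript
// Hypothetical stand-in for the sanitizer's filtering step: keep only the
// lines that start with "data:". Assumption: line-based filtering on a string.
function keepDataLines(body: string): string {
  return body
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .join("\n");
}

// A normal SSE stream with a keepalive-style event line and one data frame.
const sseBody = 'event: ping\n\ndata: {"text":"hi"}\n';

// A Gemini-style JSON error body: no line carries a "data:" prefix.
const jsonErrorBody =
  '{\n  "error": {\n    "code": 400,\n    "status": "INVALID_ARGUMENT"\n  }\n}';

console.log(JSON.stringify(keepDataLines(sseBody)));       // "data: {\"text\":\"hi\"}"
console.log(JSON.stringify(keepDataLines(jsonErrorBody))); // ""
```

The second call returns an empty string, which is the state the error-detail extractor ends up reading on the regressed builds.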

Fix Action

Fix / Workaround

Last known good: pre-sanitizeOpenAISdkSseResponse builds (logs from 2026-04-29 still carried full Gemini detail, builds from 2026-05-02 onward do not). First known bad: any build that ships the current sanitizeOpenAISdkSseResponse (present on main). Workaround: switch agent primary to a non-streaming-Content-Type provider (e.g. deepseek/*) until the fix lands.

PR fix notes

PR #78183: fix(transport): preserve JSON error body on non-OK SSE responses

Description (problem / solution / changelog)

Summary

  • Problem: Provider HTTP errors with Content-Type: text/event-stream (e.g. Google Gemini's streamGenerateContent) bubble up with their JSON error body silently stripped, so createProviderHttpError only surfaces the bare status code.
  • Why it matters: Operators see Google Generative AI API error (400) in logs and have no way to learn the actual upstream complaint (INVALID_ARGUMENT, schema field rejected, function-call ordering, etc.) without bypassing the runtime. Failover and triage fly blind.
  • What changed: sanitizeOpenAISdkSseResponse now returns the original response when !response.ok, so the existing error-detail extractor sees the JSON body intact. Plus a regression test in src/agents/provider-transport-fetch.test.ts.
  • What did NOT change (scope boundary): Successful (response.ok) SSE streams still pass through the sanitizer; the keepalive-frame stripping for the OpenAI SDK's parser is unchanged. No other transports, no plugin, no public surface.
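
The shape of the change can be sketched as below. This is only an illustration under stated assumptions: ResponseLike, keepDataLines, and sanitizeSseResponse are hypothetical names, and the body is modelled as a string rather than a stream so the example stays self-contained. The actual diff in src/agents/provider-transport-fetch.ts is not reproduced on this page.

```typescript
// Minimal structural stand-in for a Fetch API Response (assumption made so
// the sketch is self-contained; the real code wraps a streaming body).
interface ResponseLike {
  readonly ok: boolean;
  readonly status: number;
  readonly contentType: string;
  readonly body: string;
}

// Hypothetical stand-in for the keepalive stripping: keep only "data:" lines.
function keepDataLines(body: string): string {
  return body
    .split("\n")
    .filter((line) => line.startsWith("data:"))
    .join("\n");
}

// Sketch of the guarded sanitizer: non-OK responses are returned untouched.
function sanitizeSseResponse(response: ResponseLike): ResponseLike {
  if (!response.contentType.includes("text/event-stream")) {
    return response; // not an SSE response, nothing to sanitize
  }
  if (!response.ok) {
    return response; // error path: keep the JSON body for the error extractor
  }
  return { ...response, body: keepDataLines(response.body) };
}

const errorBody =
  '{"error":{"code":400,"message":"...","status":"INVALID_ARGUMENT"}}';
const errorResponse: ResponseLike = {
  ok: false,
  status: 400,
  contentType: "text/event-stream",
  body: errorBody,
};
const okResponse: ResponseLike = {
  ok: true,
  status: 200,
  contentType: "text/event-stream",
  body: 'event: ping\ndata: {"text":"hi"}',
};

const afterError = sanitizeSseResponse(errorResponse);
const afterOk = sanitizeSseResponse(okResponse);
console.log(afterError.body === errorBody); // true
console.log(afterOk.body);                  // data: {"text":"hi"}
```

The 400 response keeps ok === false, status === 400, and its original body, which mirrors what the regression test asserts; the 200 response is still filtered as before.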

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #78180
  • Related #
  • This PR fixes a bug or regression

Real behavior proof (required for external PRs)

External contributors must show after-fix evidence from a real OpenClaw setup.

  • Behavior or issue addressed: Google Gemini 4xx responses returned via streamGenerateContent lose their JSON { "error": { ... } } body before createProviderHttpError reads it, leaving only Google Generative AI API error (400) with no [code=...] detail.
  • Real environment tested: macOS 26.4.1 (arm64), Node 25.9.0, OpenClaw npm global install at ~/.openclaw, Gemini agent on google/gemini-3.1-pro-preview over Feishu DM.
  • Exact steps or command run after this patch:
    pnpm install --frozen-lockfile
    node scripts/run-vitest.mjs run --config test/vitest/vitest.agents-core.config.ts src/agents/provider-transport-fetch.test.ts
  • Evidence after fix (terminal capture):
     RUN  v4.1.5 /private/tmp/oc-pr/openclaw
    
     Test Files  1 passed (1)
          Tests  23 passed (23)
       Start at  09:26:04
       Duration  463ms
    Including the new regression case "preserves the JSON error body on non-OK SSE responses so providers can surface error detail".
  • Observed result after fix: The wrapped Response returned from buildGuardedModelFetch for a 400 with Content-Type: text/event-stream keeps its body, so await response.text() returns the original { "error": { "code": 400, "message": "...", "status": "INVALID_ARGUMENT" } } payload that extractProviderErrorDetail needs.
  • What was not tested: Live failover under sustained Gemini errors and non-Google streaming providers that emit JSON-only 4xx (e.g. some OpenAI-compatible deployments) — the fix is provider-agnostic but I have not exercised every transport.
  • Before evidence: With the same test, on main at 538605ff before this patch:
     FAIL  preserves the JSON error body on non-OK SSE responses ...
    AssertionError: expected '' to be '{"error":{"code":400,"message":"GenerateContentRequest.contents: contents is not specified","status":"INVALID_ARGUMENT"}}' // Object.is equality
    Real-environment log line that triggered this investigation, from ~/.openclaw/logs/gateway.err.log:
    2026-04-29T20:42:29.278+08:00 [agent/embedded] embedded run agent end:
      runId=2cfe0ade-... isError=true model=gemini-3.1-pro-preview provider=google
      error=Google Generative AI API error (400): * GenerateContentRequest.contents: contents is not specified [code=INVALID_ARGUMENT]   ← old, with detail
    
    2026-05-05T21:17:39.591+08:00 [agent/embedded] embedded run agent end:
      runId=d4c75e90-... isError=true model=gemini-3.1-pro-preview provider=google
      error=Google Generative AI API error (400)                                                                                       ← regressed, detail gone

Root Cause (if applicable)

  • Root cause: sanitizeOpenAISdkSseResponse (in src/agents/provider-transport-fetch.ts) wraps every response whose Content-Type contains text/event-stream and drops any chunk that lacks a data: prefix. Its job is to strip event-only / blank-data keepalives so the OpenAI SDK's stream parser does not JSON.parse them. Non-OK responses never go through that parser — transports throw via createProviderHttpError, which reads the raw body. Google's streamGenerateContent returns 4xx with the JSON { "error": { ... } } body but keeps Content-Type: text/event-stream, so the sanitizer drops it and the thrown error degrades to just the status code.
  • Missing detection / guardrail: No existing test covered the "non-OK + text/event-stream" combination; only the happy-path keepalive-stripping cases were exercised.
  • Contributing context (if known): The error-extractor extractProviderErrorDetail is shared by the Anthropic / Google / OpenAI-compat transports, so any provider that streams JSON 4xx with this content type was affected the same way. Logs from ~/.openclaw/logs/gateway.err.log show the detail string disappeared on live agents around 2026-04-30, matching the rollout window of the sanitizer.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/provider-transport-fetch.test.ts, new case "preserves the JSON error body on non-OK SSE responses so providers can surface error detail".
  • Scenario the test should lock in: Given a fetchWithSsrFGuard mock that returns status: 400, content-type: text/event-stream, body: <JSON>, the response returned by buildGuardedModelFetch(...) must keep response.ok === false, response.status === 400, and await response.text() equal to the original JSON body.
  • Why this is the smallest reliable guardrail: It exercises the exact seam where the bug hides (sanitizeOpenAISdkSseResponse) without spinning up any transport, runtime, or provider, and matches the structure of the existing OK-path SSE tests in the same file.
  • Existing test that already covers this (if any): None. The existing SSE tests in this file all use 200 responses, so the non-OK (!response.ok) branch was never tested.
  • If no new test is added, why not: N/A.

User-visible / Behavior Changes

None for the happy path. For failed requests on streaming endpoints, thrown errors regain their full provider detail (e.g. Google Generative AI API error (400): GenerateContentRequest.contents: contents is not specified [code=INVALID_ARGUMENT]).

Diagram (if applicable)

Before:
provider 4xx (content-type: text/event-stream, body: JSON error)
  -> sanitizeOpenAISdkSseResponse wraps body
     -> drops every chunk without `data:` prefix (JSON has none)
        -> response.body becomes empty
           -> createProviderHttpError reads "" -> throws "(400)" with no detail

After:
provider 4xx (content-type: text/event-stream, body: JSON error)
  -> sanitizeOpenAISdkSseResponse sees !response.ok -> returns original response
     -> createProviderHttpError reads JSON body
        -> throws "(400): <message> [code=<status>]"
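
To make the difference between the two paths concrete, here is a rough sketch of how an error string of the form "(400): <message> [code=<status>]" could be derived from the body. formatProviderError is a hypothetical helper written for this illustration; the real createProviderHttpError and extractProviderErrorDetail signatures are not shown in this report and may behave differently.

```typescript
// Hypothetical formatter (assumption): builds the error string from the raw
// response body, falling back to the bare status when no detail is found.
function formatProviderError(label: string, status: number, rawBody: string): string {
  const bare = `${label} (${status})`;
  try {
    const parsed = JSON.parse(rawBody);
    const message =
      typeof parsed?.error?.message === "string" ? parsed.error.message.trim() : "";
    const code =
      typeof parsed?.error?.status === "string" ? parsed.error.status : "";
    if (!message) {
      return bare;
    }
    return code ? `${bare}: ${message} [code=${code}]` : `${bare}: ${message}`;
  } catch {
    // Empty or non-JSON body: only the status code is left.
    return bare;
  }
}

const label = "Google Generative AI API error";
const geminiBody =
  '{"error":{"code":400,"message":"* GenerateContentRequest.contents: contents is not specified\\n","status":"INVALID_ARGUMENT"}}';

// "After" path: the JSON body reaches the formatter intact.
const withDetail = formatProviderError(label, 400, geminiBody);
console.log(withDetail);

// "Before" path: the sanitizer emptied the body, so only the status remains.
const withoutDetail = formatProviderError(label, 400, "");
console.log(withoutDetail);
```

With the intact body the output matches the "old, with detail" log line; with an empty body it degrades to the bare "(400)" form seen in the regressed logs.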

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A.

Repro + Verification

Environment

  • OS: macOS 26.4.1 (arm64)
  • Runtime/container: Node v25.9.0, pnpm 10.33.2
  • Model/provider: google/gemini-3.1-pro-preview via https://generativelanguage.googleapis.com/v1beta/.../streamGenerateContent
  • Integration/channel (if any): Feishu DM (incidental — happens on any channel)
  • Relevant config (redacted): ~/.openclaw/openclaw.json agent with model.primary = "google/gemini-3.1-pro-preview"; default Google route, no proxy.

Steps

  1. Start any agent whose primary is a Gemini model on the v1beta streaming endpoint.
  2. Trigger a 4xx (e.g. send a turn whose history contains a stale function_call block, or temporarily set tools to one with an unsupported field like additionalProperties).
  3. Observe gateway.err.log.

Expected

error=Google Generative AI API error (400): <message> [code=INVALID_ARGUMENT]

Actual (before this PR)

error=Google Generative AI API error (400) — no detail.

Actual (after this PR)

Full detail string is restored. Verified via the new unit test (terminal output above) and via a direct curl against the same Gemini endpoint, which shows that Gemini does return a JSON body with Content-Type: text/event-stream for 4xx:

$ curl -i -X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-pro-preview:streamGenerateContent?alt=sse&key=…' \
    -H 'Content-Type: application/json' \
    -d '{"contents":[],"systemInstruction":{"parts":[{"text":"x"}]}}'

HTTP/2 400
content-type: text/event-stream

{
  "error": {
    "code": 400,
    "message": "* GenerateContentRequest.contents: contents is not specified\n",
    "status": "INVALID_ARGUMENT"
  }
}

Evidence

  • Failing test/log before + passing after (see "Real behavior proof" section)
  • Trace/log snippets (gateway.err.log lines from a real install)
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios:
    1. New unit test in provider-transport-fetch.test.ts fails on main (assertion: empty body) and passes after this patch (full JSON body preserved).
    2. Full agents-core Vitest scope (273 files / 3519 tests) passes locally with this patch applied.
    3. pnpm tsgo:core clean. oxlint and oxfmt --check clean for both touched files.
    4. Direct curl against Gemini v1beta streamGenerateContent confirms the upstream actually returns a JSON error body with Content-Type: text/event-stream (i.e. the regression precondition is real, not a synthetic test fixture).
  • Edge cases checked: Mixed body (e.g. real data: lines plus JSON error) cannot occur on the error path (Gemini either streams success frames or returns a JSON error body, never both); successful streaming responses still go through the sanitizer untouched, verified by the existing two text/event-stream happy-path tests still passing.
  • What you did not verify: Production failover behavior under sustained Gemini error rates; non-Google streaming providers that may emit JSON 4xx with text/event-stream (Anthropic/OpenAI typically emit application/json for 4xx, so they should not have been affected, but I have not exercised every transport).

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A.

Risks and Mitigations

  • Risk: A future provider could ship malformed SSE error frames (mixed JSON + data: lines) where consumers had relied on the sanitizer normalizing them.
    • Mitigation: Out of scope for this PR — the error path explicitly does not parse SSE; if such a provider emerges, its transport-specific error reader can normalize the body itself. The sanitizer's contract is now tightly scoped to OK responses, which matches the comment that already explained its purpose.

Built with Claude Code (AI-assisted). The fix and tests were authored in collaboration with the model; I reviewed and ran the verification listed above on my own machine.

Changed files

  • src/agents/provider-transport-fetch.test.ts (modified, +40/-0)
  • src/agents/provider-transport-fetch.ts (modified, +6/-1)

Code Example

2026-04-29T20:42:29.278+08:00 [agent/embedded] embedded run agent end: runId=2cfe0ade-854e-4dc8-a221-09676281681e
  isError=true model=gemini-3.1-pro-preview provider=google
  error=Google Generative AI API error (400): * GenerateContentRequest.contents: contents is not specified [code=INVALID_ARGUMENT]   ← old, with detail

2026-05-05T21:17:39.591+08:00 [agent/embedded] embedded run agent end: runId=d4c75e90-26e7-49f6-865b-3db3a28f2ec5
  isError=true model=gemini-3.1-pro-preview provider=google
  error=Google Generative AI API error (400)   ← new, detail gone

---

$ curl -i -X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-pro-preview:streamGenerateContent?alt=sse&key=…' \
    -H 'Content-Type: application/json' -d '{"contents":[],"systemInstruction":{"parts":[{"text":"x"}]}}'

HTTP/2 400
content-type: text/event-stream

{
  "error": {
    "code": 400,
    "message": "* GenerateContentRequest.contents: contents is not specified\n",
    "status": "INVALID_ARGUMENT"
  }
}

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Provider HTTP errors with Content-Type: text/event-stream (e.g. Google Gemini's streamGenerateContent) bubble up with their JSON error body stripped, so the thrown error degrades to just a status code (Google Generative AI API error (400)) with no [code=…] or message detail.

Steps to reproduce

  1. Configure any agent with a google/gemini-* primary that hits https://generativelanguage.googleapis.com/v1beta/.../streamGenerateContent (e.g. gemini-3.1-pro-preview).
  2. Make any request that produces a 4xx from Gemini (e.g. an empty contents payload, a stale function_call in history, an unsupported tool schema field).
  3. Observe [agent/embedded] embedded run agent end: ... isError=true ... error=Google Generative AI API error (400) in the gateway error log — body detail is missing.
  4. Curl the same endpoint directly with the same payload and the JSON body { "error": { "code": 400, "message": "...", "status": "INVALID_ARGUMENT" } } is clearly returned.

Expected behavior

createProviderHttpError should surface the JSON body's error.message / error.status, e.g. Google Generative AI API error (400): GenerateContentRequest.contents: contents is not specified [code=INVALID_ARGUMENT]. This was the observed behavior on builds prior to ~2026-04-30 and is what extractProviderErrorDetail is designed to do.

Actual behavior

The thrown error is just Google Generative AI API error (400) with no detail, because the response body has been silently emptied before createProviderHttpError reads it.

OpenClaw version

2026.5.4 (also reproduces on main at e6f5f569 and 538605ff)

Operating system

macOS 26.4.1 (arm64) — root cause is platform-independent

Install method

npm global

Model

google/gemini-3.1-pro-preview (any Gemini model on the v1beta streamGenerateContent endpoint reproduces this; same shape can affect any provider that returns a JSON 4xx body alongside a streaming Content-Type)

Provider / routing chain

openclaw → google generativelanguage.googleapis.com/v1beta:streamGenerateContent

Additional provider/model setup details

Default Gemini route, no custom proxy. The same regression makes Anthropic/OpenAI streaming 4xx detail less reliable in any setup that reuses buildGuardedModelFetch and produces a JSON body with a streaming content-type.

Logs, screenshots, and evidence

2026-04-29T20:42:29.278+08:00 [agent/embedded] embedded run agent end: runId=2cfe0ade-854e-4dc8-a221-09676281681e
  isError=true model=gemini-3.1-pro-preview provider=google
  error=Google Generative AI API error (400): * GenerateContentRequest.contents: contents is not specified [code=INVALID_ARGUMENT]   ← old, with detail

2026-05-05T21:17:39.591+08:00 [agent/embedded] embedded run agent end: runId=d4c75e90-26e7-49f6-865b-3db3a28f2ec5
  isError=true model=gemini-3.1-pro-preview provider=google
  error=Google Generative AI API error (400)                                                                                       ← new, detail gone

Direct API call returning a JSON body for the streaming endpoint:

$ curl -i -X POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-3.1-pro-preview:streamGenerateContent?alt=sse&key=…' \
    -H 'Content-Type: application/json' -d '{"contents":[],"systemInstruction":{"parts":[{"text":"x"}]}}'

HTTP/2 400
content-type: text/event-stream

{
  "error": {
    "code": 400,
    "message": "* GenerateContentRequest.contents: contents is not specified\n",
    "status": "INVALID_ARGUMENT"
  }
}

Root cause

sanitizeOpenAISdkSseResponse in src/agents/provider-transport-fetch.ts wraps every response whose Content-Type contains text/event-stream, then drops any chunk that does not have a data: prefix (its purpose is to strip event-only / blank-data keepalives so the OpenAI SDK's stream parser does not JSON.parse them).

For non-OK responses, that wrapper is wrong:

  • The Google transport (and any other transport using buildGuardedModelFetch) calls if (!response.ok) throw await createProviderHttpError(response, …) — error responses never go through the SDK's SSE parser.
  • Gemini's streamGenerateContent returns the JSON { "error": { … } } body for 4xx with Content-Type: text/event-stream retained.
  • The sanitizer sees no data: lines, drops the whole body, and extractProviderErrorDetail reads an empty string. The thrown error keeps only (${status}).

Impact and severity

Affected: every agent on a Google Gemini primary (and any future provider that emits JSON 4xx with a streaming content-type). Severity: high for debuggability — the actual upstream complaint (e.g. "function call turn must come immediately after a user turn or after a function response turn", quota exhaustion, unsupported field in a tool schema) is invisible to operators and downstream alerting. Frequency: deterministic — every non-OK response on a streaming endpoint loses its body. Consequence: failovers and bug reports both fly blind; the only signal left is the bare HTTP status code.

Additional information

Last known good: pre-sanitizeOpenAISdkSseResponse builds (logs from 2026-04-29 still carried full Gemini detail, builds from 2026-05-02 onward do not). First known bad: any build that ships the current sanitizeOpenAISdkSseResponse (present on main). Workaround: switch agent primary to a non-streaming-Content-Type provider (e.g. deepseek/*) until the fix lands.

Fix in progress as a PR — preserve the original response when !response.ok so the existing error-detail extractor sees the JSON body untouched. Adds a regression test in src/agents/provider-transport-fetch.test.ts.

extent analysis

TL;DR

The most likely fix is to modify the sanitizeOpenAISdkSseResponse function to preserve the original response body when the response is not OK, allowing the error detail extractor to access the JSON body.

Guidance

  • Locate sanitizeOpenAISdkSseResponse in src/agents/provider-transport-fetch.ts.
  • Add an early return so the function hands back the original response, unsanitized, when !response.ok.
  • Leave createProviderHttpError and extractProviderErrorDetail unchanged; per PR #78183 the existing extractor already handles the JSON body once it arrives intact.
  • Verify with a non-OK text/event-stream response (for example the regression case in src/agents/provider-transport-fetch.test.ts) that the thrown error carries the message and [code=…] detail.

Example

// Inside sanitizeOpenAISdkSseResponse (sketch; the actual function body in
// src/agents/provider-transport-fetch.ts is not shown on this page).
if (!response.ok) {
  // Non-OK responses never reach the SDK's SSE parser. Return the original
  // response so createProviderHttpError can read the JSON body intact.
  return response;
}
// OK responses: continue with the existing keepalive-frame sanitizing.
//...

Notes

The fix is confined to sanitizeOpenAISdkSseResponse; createProviderHttpError does not need to change. Returning the original response when it is not OK lets the existing error-detail extractor read the JSON body.

Recommendation

Once a build containing the fix from PR #78183 is available, upgrade to it. Until then, apply the workaround from the issue: switch the agent primary to a provider that does not return errors with a streaming Content-Type (e.g. deepseek/*). If you can patch locally, the early return in sanitizeOpenAISdkSseResponse restores the full error detail.
