openclaw - ✅(Solved) Fix [Bug]: xAI/openai-responses crashes with 422 when tool results include image blocks from read(image) [1 pull requests, 1 comments, 2 participants]

swilson2020 · 2026-03-30T23:10:44Z

GitHub stats

openclaw/openclaw#57981•Fetched 2026-04-08 01:55:18

Author

swilson2020

Participants

obviyus

swilson2020

Timeline (top)

labeled ×2, closed ×1, commented ×1, cross-referenced ×1

In OpenClaw 2026.3.28, worker sessions using xAI via the openai-responses API path crash with:

422 Failed to deserialize the JSON body into the target type: input: data did not match any variant of untagged enum ModelInput

Error Message

Next assistant turn fails with xAI 422 ModelInput deserialization error.
Session logs show the worker succeeds through read tool calls, then immediately dies with the 422 xAI error.

Root Cause

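The xAI provider uses the OpenAI Responses transport, and the payload builder can emit image-bearing tool results as array-valued function_call_output.output ([input_text, input_image]). xAI rejects that shape with:

422 Failed to deserialize the JSON body into the target type: input: data did not match any variant of untagged enum ModelInput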

Fix Action

Fixed

Fixed by PR: fix(xai): normalize image tool results for responses (https://github.com/openclaw/openclaw/pull/58017)

PR fix notes

PR #58017: fix(xai): normalize image tool results for responses

Repository: openclaw/openclaw
Author: neeravmakwana
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/58017

Description (problem / solution / changelog)

Summary

Problem: xAI openai-responses requests could replay image-bearing tool results as array-valued function_call_output.output, which xAI rejects with a 422 deserialization error.
Why it matters: read(image) and similar tool flows could succeed on the tool call itself, then crash on the next model turn instead of continuing the session.
What changed: the xAI stream payload compatibility wrapper now rewrites array-valued function_call_output items into string outputs and, when the model explicitly supports image input, emits the image blocks as a following user message instead.
What did NOT change (scope boundary): core transcript structures, non-xAI providers, and upstream pi-ai response conversion logic.
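
The rewrite described under "What changed" can be sketched roughly as follows. This is a minimal illustration, not the code from extensions/xai/stream.ts: the type names, `normalizeToolOutputsForXai`, and the `modelSupportsImages` flag are assumptions made for the example, and the `input_text` / `input_image` part shapes are assumed to follow the OpenAI Responses format.

```typescript
// Hypothetical, simplified shapes for Responses-style input items.
type TextPart = { type: "input_text"; text: string };
type ImagePart = { type: "input_image"; image_url: string };
type OutputPart = TextPart | ImagePart;

type ToolOutputItem = {
  type: "function_call_output";
  call_id: string;
  output: string | OutputPart[];
};
type UserMessageItem = { type: "message"; role: "user"; content: OutputPart[] };
type InputItem = ToolOutputItem | UserMessageItem;

// Rewrites array-valued tool outputs into strings. When the model accepts
// images, the image parts are replayed in a user message right after.
function normalizeToolOutputsForXai(
  items: InputItem[],
  modelSupportsImages: boolean,
): InputItem[] {
  const result: InputItem[] = [];
  for (const item of items) {
    if (item.type === "function_call_output" && Array.isArray(item.output)) {
      const texts: string[] = [];
      const images: ImagePart[] = [];
      for (const part of item.output) {
        if (part.type === "input_text") {
          texts.push(part.text);
        } else {
          images.push(part);
        }
      }
      // The tool output itself becomes a plain string.
      result.push({ ...item, output: texts.join("\n") });
      // Images are only forwarded when the model declares image input support.
      if (modelSupportsImages && images.length > 0) {
        result.push({ type: "message", role: "user", content: images });
      }
    } else {
      // String outputs and ordinary messages pass through unchanged.
      result.push(item);
    }
  }
  return result;
}
```

With `modelSupportsImages` set to false, the same input yields only the string-valued tool output, which matches the edge case listed under Human Verification.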

Change Type (select all)

Bug fix

Scope (select all touched areas)

Integrations

Linked Issue/PR

Closes #57981
Related #57981
This PR fixes a bug or regression

Root Cause / Regression History (if applicable)

Root cause: the xAI provider uses the OpenAI Responses transport, and the current payload builder can emit image-bearing tool results as function_call_output.output = [input_text, input_image]; xAI rejects that shape.
Missing detection / guardrail: the xAI compatibility wrapper stripped unsupported reasoning/schema fields but did not normalize image-bearing tool-result payloads.
Prior context (git blame, prior PR, issue, or refactor if known): issue #57981 captured the failing payload shape and 422 symptom for read(image) flows.
Why this regressed now: the provider path preserved structured image tool results into the Responses payload, but xAI's endpoint is stricter than direct OpenAI here.
If unknown, what was ruled out: verified current code still reproduces the array-valued function_call_output.output payload; this is not already normalized in the current xAI wrapper layer.
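
As an illustration of the guardrail that was missing, the check below scans a Responses-style `input` array for the shape the root cause describes: a `function_call_output` item whose `output` is an array rather than a string. `findArrayValuedToolOutputs` and `LooseInputItem` are hypothetical names for this sketch, not OpenClaw APIs.

```typescript
// Deliberately loose item type: only the fields the check needs.
type LooseInputItem = { type: string; call_id?: string; output?: unknown };

// Returns the call_id (or a positional label when call_id is absent) of every
// tool output whose `output` is an array instead of a string.
function findArrayValuedToolOutputs(input: LooseInputItem[]): string[] {
  const offenders: string[] = [];
  input.forEach((item, index) => {
    if (item.type === "function_call_output" && Array.isArray(item.output)) {
      offenders.push(item.call_id ?? `index ${index}`);
    }
  });
  return offenders;
}
```

A wrapper could run such a check just before sending and either normalize the offending items or fail with a local, more readable error than the provider's 422.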

Regression Test Plan (if applicable)

Coverage level that should have caught this: Unit test
Target test or file: extensions/xai/stream.test.ts
Scenario the test should lock in: image-bearing tool results sent through the xAI openai-responses wrapper must become string function_call_output.output plus a following user image message.
Why this is the smallest reliable guardrail: the bug lives inside the xAI payload compatibility wrapper, so a focused wrapper test exercises the exact failing shape without depending on live provider calls.
Existing test that already covers this (if any): none for this payload shape.
If no new test is added, why not: N/A

User-visible / Behavior Changes

xAI-backed sessions can continue after tool results that include image blocks instead of failing on the follow-up turn with a 422.

Diagram (if applicable)

Before:
[tool returns text + image blocks] -> [function_call_output.output contains input_image] -> [xAI 422]

After:
[tool returns text + image blocks] -> [function_call_output.output becomes text] -> [image blocks replayed as user message] -> [session continues]

Security Impact (required)

New permissions/capabilities? (Yes/No) No
Secrets/tokens handling changed? (Yes/No) No
New/changed network calls? (Yes/No) No
Command/tool execution surface changed? (Yes/No) No
Data access scope changed? (Yes/No) No
If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

OS: macOS (local verification) and issue repro from Ubuntu in #57981
Runtime/container: local repo checkout
Model/provider: xAI / openai-responses
Integration/channel (if any): agent session / tool replay
Relevant config (redacted): xAI provider using https://api.x.ai/v1

Steps

1. Configure an xAI model on the openai-responses path.
2. Produce a tool result with text plus image blocks, such as read on a PNG.
3. Replay that result into the next model turn.

Expected

The follow-up request stays xAI-compatible and the session continues.

Actual

Before this change, the request can include array-valued function_call_output.output with input_image, which causes xAI to fail with a 422.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

Verified scenarios: reproduced the current payload shape locally, added a regression test for the xAI wrapper, ran pnpm test -- extensions/xai/stream.test.ts, ran pnpm test:extension xai, and ran pnpm build.
Edge cases checked: tool results with text plus image blocks; models that do not explicitly advertise image input keep a string-only tool output.
What you did not verify: live xAI API execution in this environment.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? (Yes/No) Yes
Config/env changes? (Yes/No) No
Migration needed? (Yes/No) No
If yes, exact upgrade steps: N/A

Risks and Mitigations

Risk: xAI models without explicit image input support could still receive invalid image blocks.
- Mitigation: the wrapper only forwards image blocks when the resolved model explicitly declares image input support; otherwise it keeps a string-only tool output.

Notes

AI-assisted: yes.
pnpm check currently stops on unrelated existing tsgo errors in extensions/diffs/src/language-hints.test.ts on top of origin/main; the touched xAI lane and build both passed.

Made with Cursor

Changed files

extensions/xai/stream.test.ts (modified, +174/-0)
extensions/xai/stream.ts (modified, +81/-0)

Bug type

Regression (worked before, now fails)

Beta release blocker

Summary

In OpenClaw 2026.3.28, worker sessions using xAI via the openai-responses API path crash with:

422 Failed to deserialize the JSON body into the target type: input: data did not match any variant of untagged enum ModelInput

Steps to reproduce

1. Configure OpenClaw 2026.3.28 with an xAI/Grok model on the openai-responses path.
2. Start a session with a prompt like: "Read this PNG file with the read tool, then summarize it."
3. Call read on any local PNG path.
4. Observe that the tool result returns an image block.
5. The next assistant turn fails with the xAI 422 ModelInput deserialization error.

Expected behavior

One of the following should happen:

OpenClaw should convert image tool results into an xAI-compatible input shape before forwarding them, or
OpenClaw should down-convert incompatible image tool results to text placeholders/metadata for xAI models that cannot accept the internal image block format, or
OpenClaw should detect incompatibility earlier and avoid sending invalid model input to xAI.
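
For the second option, a text placeholder could carry basic metadata in place of the pixels. The sketch below assumes the internal block shape reported in this issue's logs (`type: "image"`, base64 `data`, `mimeType`); `describeImageBlock` and `base64ByteLength` are hypothetical helpers, not OpenClaw functions.

```typescript
type InternalImageBlock = { type: "image"; data: string; mimeType: string };

// Computes the decoded byte length of a padded base64 string without decoding it.
function base64ByteLength(b64: string): number {
  let padding = 0;
  if (b64.slice(-2) === "==") {
    padding = 2;
  } else if (b64.slice(-1) === "=") {
    padding = 1;
  }
  return Math.floor((b64.length * 3) / 4) - padding;
}

// Builds a text stand-in such as "[image omitted: image/png, 3 bytes]".
function describeImageBlock(block: InternalImageBlock): string {
  return `[image omitted: ${block.mimeType}, ${base64ByteLength(block.data)} bytes]`;
}
```

The placeholder keeps the transcript readable for models that cannot accept image input, at the cost of the model never seeing the image itself.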

Actual behavior

The session crashes with xAI 422 after image tool results are reintroduced into model input.

OpenClaw version

2026.3.28

Operating system

Ubuntu

Install method

npm

Model

xai grok 4.1 fast

Provider / routing chain

openclaw -> xAI API

Additional provider/model setup details

No response

Logs, screenshots, and evidence

1. `openclaw status` shows OpenClaw 2026.3.28 and hook sessions using `grok-4-1-fast`.
2. Installed runtime code indicates xAI provider is built with `openai-responses`:
   - file: `dist/provider-catalog-*.js`
   - function: `buildXaiProvider(api = "openai-responses")`
3. Read-image tool results are normalized/sanitized but still preserved as structured image blocks with base64 payloads:
   - file: `dist/auth-profiles-*.js`
   - `createOpenClawReadTool()` returns `sanitizeToolResultImages(await normalizeReadImageResult(...))`
   - file: `dist/tool-images-*.js`
   - sanitized image blocks remain shaped like:
     - `type: "image"`
     - `data: <base64>`
     - `mimeType: "image/png"` or similar
4. Session logs show the worker succeeds through `read` tool calls, then immediately dies with the 422 xAI error.
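
The sanitized block shape listed in item 3 could be mapped onto a Responses-style image part along these lines. This is a sketch under assumptions: the `input_image` part with a data-URL `image_url` is assumed to follow the OpenAI Responses format, `toInputImagePart` is a hypothetical helper, and whether a given xAI model accepts the result in a user message depends on its image input support.

```typescript
type SanitizedImageBlock = { type: "image"; data: string; mimeType: string };
type InputImagePart = { type: "input_image"; image_url: string };

// Wraps the base64 payload in a data URL so the image travels inline.
function toInputImagePart(block: SanitizedImageBlock): InputImagePart {
  return {
    type: "input_image",
    image_url: `data:${block.mimeType};base64,${block.data}`,
  };
}
```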

Impact and severity

Breaks any xAI-backed workflow that reads local image files through tools and then continues agent execution.

In my case, it blocked several production workflows after upgrading to 2026.3.28.

Additional information

Last known good version: 2026.3.24.

extent analysis

Fix Plan

To resolve the issue, OpenClaw needs to handle image tool results before sending them to the xAI API, either by converting image blocks to text placeholders or by converting them to a compatible input shape. PR #58017 does this in the xAI stream payload compatibility wrapper (extensions/xai/stream.ts).

Step-by-Step Solution

1. Identify the function responsible for sending data to the xAI API: locate the buildXaiProvider function in dist/provider-catalog-*.js and find the code that sends the request on the openai-responses path.
2. Add a preprocessing step for image tool results: before sending the data to the xAI API, check whether a tool result contains an image block. If it does, convert it to a text placeholder or a compatible input shape.
3. Implement the conversion logic: create a new function, e.g., convertImageBlockToText, that takes the image block as input and returns a text placeholder or a compatible input shape.

Example code:

// dist/provider-catalog-*.js
// Illustrative sketch only: sendDataToXai and its parameters are hypothetical.
function buildXaiProvider(api = "openai-responses") {
  // ...
  const sendDataToXai = async (url, payload) => {
    // Flatten array-valued tool outputs, the shape xAI rejects with a 422
    const input = payload.input.map((item) =>
      item.type === "function_call_output" && Array.isArray(item.output)
        ? { ...item, output: item.output.map((part) => convertImageBlockToText(part)).join("\n") }
        : item
    );
    // Send the preprocessed payload to the xAI endpoint URL (not the api kind)
    const response = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ ...payload, input }),
    });
    // ...
  };
  // ...
}

// dist/tool-images-*.js
function convertImageBlockToText(part) {
  // Keep text parts as-is; replace anything else with a placeholder, e.g. "[IMAGE: image/png]"
  if (part.type === "input_text") return part.text;
  return `[IMAGE: ${part.mimeType ?? "unknown type"}]`;
}

Verification

To verify that the fix worked, follow these steps:

Restart the OpenClaw service: Restart the OpenClaw service to apply the changes.
Test the workflow: Test the workflow that was previously failing, and verify that it now completes successfully.
Check the logs: Check the logs to ensure that the xAI API is receiving the preprocessed data and responding correctly.

Extra Tips

Make sure to test the fix thoroughly to ensure that it works for all possible image tool results and xAI API requests.
Consider adding additional logging or monitoring to detect any future issues with image tool results or xAI API requests.

#api #agent execution #runtime error #dependency conflict #serialization error
