codex - 💡(How to fix) Fix code-mode rollout: exec_command stdout framing leaks into input_image base64, corrupting Responses API requests

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Error Message

codex doctor subcommand is not present in v0.125.0 (codex doctorError: stdin is not a terminal; --help shows no doctor verb).

  • The user-facing error ("Something went wrong... try again, or /new to start a fresh session") is generic and gives no actionable signal pointing at the rollout file.

Root Cause

Because ASCII whitespace is stripped but multi-byte ellipsis is preserved, a base64-canonicalization step is running after contamination but before rollout-write. (Standard base64 canonicalizers strip ASCII whitespace and leave non-ASCII characters in place to fail loudly downstream — exactly what we see.)

Fix Action

Fix / Workaround

The end-user sees a generic "Something went wrong while processing your request. Please try again, or use /new to start a fresh session." on every reply. They cannot recover without either /new (losing thread state) or hand-patching the rollout JSONL file.

  • Permanent thread breakage — every subsequent reply fails until the rollout file is patched by hand.
  • The user-facing error ("Something went wrong... try again, or /new to start a fresh session") is generic and gives no actionable signal pointing at the rollout file.
  • Affects any code-mode session that combines screenshots + exec_command (a very common combination — that's the whole code-mode use case).
  1. Strict type-tagged buffer in code-mode assembly so exec stdout cannot share serialization framing with input_image. Root-cause fix.
  2. Validate input_image.image_url at rollout-write time (cheap regex / base64 round-trip) and refuse to write a corrupted entry. Cheap defensive guard that would at least fail loudly.
  3. Run a sanitizer at the Responses API serialization layer before sending — strip any non-base64 prefix that has crept in. (This is the workaround OpenClaw has shipped downstream — see below.)

Code Example

Invalid 'input[N].output[K].image_url'. Expected a base64-encoded data URL with an image MIME type (e.g. 'data:image/png;base64,aW1nIGJ5dGVzIGhlcmU='), but got <malformed payload>.

---

"image_url":"data:image/png;base64,Totaloutputlines:1<base64 chars>…52309tokenstruncated…<rest of base64>"
RAW_BUFFERClick to expand / collapse

What version of Codex CLI is running?

codex-cli 0.125.0 (Mach-O binary at /opt/homebrew/bin/codex).

What subscription do you have?

ChatGPT-OAuth (backend: api.chatgpt.com/backend-api/codex/...).

Which model were you using?

gpt-5.2-codex (code-mode tool execution path).

What platform is your computer?

Darwin 25.4.0 arm64 arm (macOS 26 Tahoe, Apple Silicon).

What terminal emulator and version are you using (if applicable)?

Not applicable — codex is invoked headlessly as a subprocess by a parent harness (OpenClaw 2026.5.20 codex-app-server-real-home shim), not from an interactive terminal.

Codex doctor report

not available

codex doctor subcommand is not present in v0.125.0 (codex doctorError: stdin is not a terminal; --help shows no doctor verb).

What issue are you seeing?

Summary: In the code-mode tool execution path, exec_command stdout framing text leaks INTO an input_image.image_url base64 payload. The corrupted entry is then persisted to the rollout JSONL, and every subsequent API call on that thread fails with:

Invalid 'input[N].output[K].image_url'. Expected a base64-encoded data URL with an image MIME type (e.g. 'data:image/png;base64,aW1nIGJ5dGVzIGhlcmU='), but got <malformed payload>.

The end-user sees a generic "Something went wrong while processing your request. Please try again, or use /new to start a fresh session." on every reply. They cannot recover without either /new (losing thread state) or hand-patching the rollout JSONL file.

Concrete corruption observed in a real rollout JSONL (line 4774):

"image_url":"data:image/png;base64,Totaloutputlines:1<base64 chars>…52309tokenstruncated…<rest of base64>"

Two things to notice:

  1. Total output lines:1 — ASCII whitespace stripped, but the literal marker text appears at the start of the base64 payload, after the , of the data URL. This is Codex's own internal "Total output lines: N" framing header from the exec-output formatter.
  2. …52309tokenstruncated… — multi-byte ellipsis (U+2026 → ) preserved, ASCII whitespace stripped. This is Codex's own format_truncation_marker text from the truncation utility.

Because ASCII whitespace is stripped but multi-byte ellipsis is preserved, a base64-canonicalization step is running after contamination but before rollout-write. (Standard base64 canonicalizers strip ASCII whitespace and leave non-ASCII characters in place to fail loudly downstream — exactly what we see.)

This is a structural corruption (Codex's own framing text injected into a user-content base64 payload), not just a volume/poisoning issue like #18629 (which I checked — that issue is about valid-but-oversized inline base64 images; this issue is about malformed base64 produced by Codex itself).

What steps can reproduce the bug?

I do not have a deterministic local reproducer — this is observational evidence from a production session (code-mode session that combined exec_command calls with screenshots). The corrupted rollout entries were captured after the failure surfaced.

What I can describe is the conditions under which the corruption appeared:

  1. Run a code-mode session with both exec_command tool calls and screenshot-producing tool calls (browser/Computer Use) interleaved.
  2. Let the session run long enough that exec_command output triggers Codex's truncation path (format_truncation_marker fires → "…N tokens truncated…" header in the output stream).
  3. At some point, an input_image.image_url payload in the same buffer flush gets the framing text injected at its start, instead of the framing being prepended to its sibling function_call_output.
  4. The corrupt JSONL line is persisted.
  5. On the next thread continuation, the Responses API rejects the full input array with Invalid input[N].output[K].image_url, and the thread is permanently broken.

I'm flagging this as observational because (a) the corruption shape is unambiguous (Codex's own framing strings appear inside what should be opaque base64), and (b) the failure mode at the API layer is consistent with the corruption shape. If a code-reproducer is required to triage, please let me know and I can attempt to construct one — but the source-side fix should be obvious from the framing-string match.

Probable source

Three Codex emitters produce the marker strings observed in the corruption:

  • codex-rs/core/src/tools/code_mode/mod.rs::prepend_script_status — produces the "Script completed\nWall time X seconds\nOutput:\n" framing.
  • codex-rs/core/src/tools/mod.rs::format_exec_output_for_model / formatted_truncate_text — produces the "Total output lines: N" header.
  • codex-rs/utils/string/src/truncate.rs::format_truncation_marker — produces the "…N tokens truncated…" marker (multi-byte ellipsis).

The hypothesis is that the code-mode tool-execution buffer assembly is treating exec stdout and input_image payloads as part of the same byte stream when serializing for rollout-write, so the framing prepended for stdout ends up prefixing the next data URL in the buffer instead of being scoped to its own function_call_output item.

What is the expected behavior?

input_image.image_url payloads must be opaque base64 (after the data:image/png;base64, prefix) with no Codex-internal framing text injected into them. Specifically:

  • prepend_script_status, format_exec_output_for_model, and format_truncation_marker outputs must be strictly scoped to text/exec output items — they must never reach an input_image value.
  • Rollout-write should validate input_image.image_url (cheap regex: ^data:image/[a-z]+;base64,[A-Za-z0-9+/=]+$, or a base64 round-trip) and refuse to persist a malformed entry. This would at least fail the write loudly instead of corrupting the thread silently.

Additional information

Severity: HIGH.

  • Permanent thread breakage — every subsequent reply fails until the rollout file is patched by hand.
  • The user-facing error ("Something went wrong... try again, or /new to start a fresh session") is generic and gives no actionable signal pointing at the rollout file.
  • Affects any code-mode session that combines screenshots + exec_command (a very common combination — that's the whole code-mode use case).

Suggested fixes (in order of robustness):

  1. Strict type-tagged buffer in code-mode assembly so exec stdout cannot share serialization framing with input_image. Root-cause fix.
  2. Validate input_image.image_url at rollout-write time (cheap regex / base64 round-trip) and refuse to write a corrupted entry. Cheap defensive guard that would at least fail loudly.
  3. Run a sanitizer at the Responses API serialization layer before sending — strip any non-base64 prefix that has crept in. (This is the workaround OpenClaw has shipped downstream — see below.)

Downstream workaround already shipped:

A defensive scanner watchdog at the rollout-file layer scrubs corrupted base64 image_url payloads within ~5 minutes of corruption. Repo: https://github.com/EthanSK/dot-claude (script: ~/.claude/scripts/openclaw-rollout-corruption-scanner.sh). This works around the symptom but does not fix the root cause in Codex itself — once a rollout file is corrupted, the user has already had their thread break once.

Related issues (checked — none of these cover the framing-leak corruption specifically):

  • #18629 — desktop threads poisoned by inline base64 tool images leading to {"detail":"Bad Request"}. Different bug: that issue is about valid base64 images being too large / too many, causing replay failures. This issue is about malformed base64 (Codex framing text injected mid-payload).
  • #22603 — image-heavy threads slow to open. Performance issue, different bug.
  • #16605 — "got empty base64-encoded bytes". Different shape (empty, not framing-contaminated).
  • #4132 (closed) — VS Code extension injecting empty image_url blocks. Different surface (extension vs Rust CLI) and different shape (empty vs contaminated).

Downstream issue filed against the harness layer for tracking purposes: openclaw/openclaw#86878 — the OpenClaw team's view is that the bug is in Codex itself, not in their shim (they just exec codex as an opaque subprocess and persist whatever rollout JSONL it produces).

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

codex - 💡(How to fix) Fix code-mode rollout: exec_command stdout framing leaks into input_image base64, corrupting Responses API requests