openclaw - ✅ (Solved) Fix: Codex-native vision turns can stall when inbound images are present and the dynamic image tool remains exposed [1 pull request, 1 comment, 1 participant]

openclaw, 2026-04-11 23:56:13

GitHub stats

openclaw/openclaw#65050 • Fetched 2026-04-12 13:25:49

Author

zhulijin1991

Participants

zhulijin1991

Timeline (top)

cross-referenced ×2, labeled ×2, commented ×1

When a Codex-native run already contains inbound images and the selected model supports vision, the turn can still expose the dynamic image tool. In that situation the model may call image again instead of answering from the already-provided image input, and the run can stall with no reply delivered.
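The filtering idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `DynamicTool` and `TurnContext` shapes, the `filterToolsForTurn` function, and the `IMAGE_TOOL_NAME` constant are all hypothetical names chosen for the example, and the real types in extensions/codex/src/app-server/run-attempt.ts may look quite different.

```typescript
// Hypothetical shapes for illustration; the real types in run-attempt.ts may differ.
interface DynamicTool {
  name: string;
}

interface TurnContext {
  inboundImageCount: number; // images already attached to the turn input
  modelSupportsVision: boolean; // whether the selected model accepts image input
}

// Assumed tool name, taken from the issue text ("the dynamic image tool").
const IMAGE_TOOL_NAME = "image";

// Hide the dynamic image tool only when the model can already see the inbound images.
function filterToolsForTurn(tools: DynamicTool[], turn: TurnContext): DynamicTool[] {
  const hasNativeVisionInput = turn.inboundImageCount > 0 && turn.modelSupportsVision;
  if (!hasNativeVisionInput) {
    return tools; // no images, or no vision support: keep the catalog unchanged
  }
  return tools.filter((tool) => tool.name !== IMAGE_TOOL_NAME);
}

// Usage: a vision turn with one inbound image no longer sees the image tool.
const catalog: DynamicTool[] = [{ name: "image" }, { name: "search" }];
const visible = filterToolsForTurn(catalog, { inboundImageCount: 1, modelSupportsVision: true });
console.log(visible.map((tool) => tool.name));
```

Keeping the tool when the model lacks vision support, or when the turn carries no images, means only the redundant case is affected.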

I also found a related image fallback mismatch: bare image-model overrides (for example gpt-5.4-mini) currently inherit DEFAULT_PROVIDER instead of the provider configured in agents.defaults.imageModel.primary. If the configured image model lives under openai-codex/..., a bare override can resolve back to plain openai/....
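A rough sketch of the provider inheritance described above is shown here. It is an assumption-laden illustration rather than the actual logic in src/agents/model-fallback.ts: `parseModelRef` and `resolveImageModelOverride` are hypothetical helpers, the value `"openai"` for `DEFAULT_PROVIDER` is inferred from the issue text, and `openai-codex/example-model` is a placeholder because the issue does not name the configured model.

```typescript
// Assumed value; the issue only says bare overrides resolve back to plain openai/....
const DEFAULT_PROVIDER = "openai";

interface ModelRef {
  provider: string;
  model: string;
}

// Split "provider/model" into its parts; a bare id yields no provider.
function parseModelRef(ref: string): { provider?: string; model: string } {
  const slash = ref.indexOf("/");
  if (slash <= 0) {
    return { model: ref };
  }
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

// Resolve an image-model override. A bare id inherits the provider of the
// configured primary image model and only then falls back to DEFAULT_PROVIDER.
function resolveImageModelOverride(override: string, configuredPrimary?: string): ModelRef {
  const parsed = parseModelRef(override);
  if (parsed.provider) {
    return { provider: parsed.provider, model: parsed.model };
  }
  const inherited = configuredPrimary ? parseModelRef(configuredPrimary).provider : undefined;
  return { provider: inherited ?? DEFAULT_PROVIDER, model: parsed.model };
}

// Usage: the bare override stays under the configured provider prefix.
// "openai-codex/example-model" is a placeholder for agents.defaults.imageModel.primary.
const resolved = resolveImageModelOverride("gpt-5.4-mini", "openai-codex/example-model");
console.log(`${resolved.provider}/${resolved.model}`);
```

An override that already carries a provider prefix is left untouched, so explicit choices still win over the inherited provider.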

Fix / Workaround

Observed on 2026.4.10. I also checked the current source, and the same behavior still exists in extensions/codex/src/app-server/run-attempt.ts and src/agents/model-fallback.ts. A local patch that filters out the image tool when vision input is already present, and that inherits the configured image-model provider for bare overrides, resolves the issue.

PR fix notes

PR #65061: fix(codex): avoid image tool loops on vision turns and respect image-model providers

Repository: openclaw/openclaw
Author: zhulijin1991
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/65061

Description (problem / solution / changelog)

Summary

hide the dynamic image tool on Codex-native turns that already include inbound images and use a vision-capable model
inherit the configured agents.defaults.imageModel.primary provider when resolving bare image-model override ids
add focused regression tests for both paths

Problem

Codex-native image turns can stall when inbound images are already present in the turn input but the dynamic image tool is still exposed. Separately, bare overrides such as gpt-5.4-mini can resolve under the wrong provider when the configured image model lives under a non-default provider prefix.

Closes #65050

Testing

pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/run-attempt.vision-tools.test.ts
pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/model-fallback.image-provider.test.ts
pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/run-attempt.test.ts
pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/model-fallback.test.ts

Changed files

extensions/codex/src/app-server/run-attempt.ts (modified, +20/-2)
extensions/codex/src/app-server/run-attempt.vision-tools.test.ts (added, +20/-0)
src/agents/model-fallback.image-provider.test.ts (added, +28/-0)
src/agents/model-fallback.ts (modified, +20/-1)

Bug type

Regression (worked before, now fails)

Beta release blocker

Summary

Steps to reproduce

Configure OpenClaw to use a Codex-native harness with a vision-capable Codex model.
Send a message with an image attachment through any channel that forwards the image into the embedded run.
Observe that the turn input already contains the inbound image.
Observe that the dynamic tool catalog still exposes image.
When the model emits an image tool call, the run can stall instead of replying.

Expected behavior

A Codex-native vision turn that already has inbound image input should answer directly from that input and should not re-expose the dynamic image tool for the same turn. Bare image-model overrides should also inherit the configured image-model provider.

Actual behavior

The run can remain stuck after emitting a dynamic image tool call, so channels show typing and then silence. Separately, image-model overrides such as gpt-5.4-mini can resolve under the wrong provider when agents.defaults.imageModel.primary is configured under openai-codex/....

OpenClaw version

2026.4.10 (reproduced), current main still contains the same code paths

Operating system

macOS 15.x

Install method

npm global

Model

codex/gpt-5.4

Provider / routing chain

openclaw -> codex app-server

Additional provider/model setup details

No private config is required to reproduce this. The key condition is: a vision-capable Codex model, inbound images already present in the turn input, and the dynamic tool catalog still exposing image.

Logs, screenshots, and evidence

Impact and severity

Affected: Codex-native image turns across channels
Severity: High (stalls image replies)
Frequency: Reproduced consistently under the conditions above
Consequence: the run can hang until the channel typing TTL expires

Additional information

extent analysis

TL;DR

Filtering the image tool when vision input is already present and inheriting the configured image-model provider for bare overrides may resolve the issue.

Guidance

Verify that the issue is caused by the dynamic image tool being exposed when inbound images are already present in the turn input.
Check extensions/codex/src/app-server/run-attempt.ts to see whether the image tool is still exposed when the turn already carries inbound images, and src/agents/model-fallback.ts to see whether bare image-model overrides fall back to DEFAULT_PROVIDER.
Consider applying a local patch that filters out the image tool when vision input is already present and that inherits the configured image-model provider for bare overrides.
Test the fix by sending a message with an image attachment and verifying that the run replies instead of stalling, so the channel no longer shows typing followed by silence.

Notes

The issue seems to be specific to Codex-native image turns with vision-capable models, and the fix may need to be applied to the extensions/codex/src/app-server/run-attempt.ts and src/agents/model-fallback.ts files.

Recommendation

Apply a workaround by filtering the image tool when vision input is already present and inheriting the configured image-model provider for bare overrides, as this has been shown to resolve the issue in local testing.
