openclaw - ✅ (Solved) Codex-native vision turns can stall when inbound images are present and the dynamic image tool remains exposed [1 pull request, 1 comment, 1 participant]

GitHub stats
openclaw/openclaw#65050 · Fetched 2026-04-12 13:25:49
Comments: 1 · Participants: 1 · Timeline events: 5 · Reactions: 0
Timeline (top): cross-referenced ×2, labeled ×2, commented ×1

When a Codex-native run already contains inbound images and the selected model supports vision, the turn can still expose the dynamic image tool. In that situation the model may call image again instead of answering from the already-provided image input, and the run can stall with no reply delivered.

I also found a related image fallback mismatch: bare image-model overrides (for example gpt-5.4-mini) currently inherit DEFAULT_PROVIDER instead of the provider configured in agents.defaults.imageModel.primary. If the configured image model lives under openai-codex/..., a bare override can resolve back to plain openai/....

Fix / Workaround

Observed on 2026.4.10. I also checked the current source and the same behavior still exists in extensions/codex/src/app-server/run-attempt.ts and src/agents/model-fallback.ts. A local patch that filters image when vision input is already present, and that inherits the configured image-model provider for bare overrides, resolves the issue.

PR fix notes

PR #65061: fix(codex): avoid image tool loops on vision turns and respect image-model providers

Description (problem / solution / changelog)

Summary

  • hide the dynamic image tool on Codex-native turns that already include inbound images and use a vision-capable model
  • inherit the configured agents.defaults.imageModel.primary provider when resolving bare image-model override ids
  • add focused regression tests for both paths
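
The first bullet can be illustrated with a minimal sketch. It is not the code from PR #65061: the `DynamicTool` and `TurnContext` shapes, the `filterDynamicTools` helper, and the `search` tool are hypothetical names introduced here, and the tool name `image` is assumed from the issue text. The idea is only that the image tool is dropped from the catalog when the turn already carries inbound images and the model can read them directly.

```typescript
// Hypothetical shapes for illustration; the real types in run-attempt.ts may differ.
interface DynamicTool {
  name: string;
}

interface TurnContext {
  inboundImageCount: number; // images already attached to the turn input
  modelSupportsVision: boolean; // whether the selected model accepts image input
}

// Assumed tool name, taken from the issue text ("the dynamic image tool").
const IMAGE_TOOL_NAME = "image";

// Return the tool catalog for this turn, dropping the image tool when the
// model can already see the inbound images directly.
function filterDynamicTools(tools: DynamicTool[], turn: TurnContext): DynamicTool[] {
  const hasNativeVisionInput = turn.inboundImageCount > 0 && turn.modelSupportsVision;
  if (!hasNativeVisionInput) {
    return tools;
  }
  return tools.filter((tool) => tool.name !== IMAGE_TOOL_NAME);
}

// "search" is a made-up second tool so the effect of the filter is visible.
const catalog: DynamicTool[] = [{ name: "image" }, { name: "search" }];

const visionTurn = filterDynamicTools(catalog, { inboundImageCount: 1, modelSupportsVision: true });
const textTurn = filterDynamicTools(catalog, { inboundImageCount: 0, modelSupportsVision: true });

console.log(visionTurn.map((t) => t.name)); // image tool removed
console.log(textTurn.map((t) => t.name)); // catalog unchanged
```

Under these assumptions the filter is a no-op for text-only turns and for models without vision support, so the image tool stays available wherever the model cannot read the image itself.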

Problem

Codex-native image turns can stall when inbound images are already present in the turn input but the dynamic image tool is still exposed. Separately, bare overrides such as gpt-5.4-mini can resolve under the wrong provider when the configured image model lives under a non-default provider prefix.

Closes #65050

Testing

  • pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/run-attempt.vision-tools.test.ts
  • pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/model-fallback.image-provider.test.ts
  • pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.extensions.config.ts extensions/codex/src/app-server/run-attempt.test.ts
  • pnpm exec node --no-maglev node_modules/vitest/vitest.mjs run --config test/vitest/vitest.agents.config.ts src/agents/model-fallback.test.ts

Changed files

  • extensions/codex/src/app-server/run-attempt.ts (modified, +20/-2)
  • extensions/codex/src/app-server/run-attempt.vision-tools.test.ts (added, +20/-0)
  • src/agents/model-fallback.image-provider.test.ts (added, +28/-0)
  • src/agents/model-fallback.ts (modified, +20/-1)

Raw issue body

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

When a Codex-native run already contains inbound images and the selected model supports vision, the turn can still expose the dynamic image tool. In that situation the model may call image again instead of answering from the already-provided image input, and the run can stall with no reply delivered.

I also found a related image fallback mismatch: bare image-model overrides (for example gpt-5.4-mini) currently inherit DEFAULT_PROVIDER instead of the provider configured in agents.defaults.imageModel.primary. If the configured image model lives under openai-codex/..., a bare override can resolve back to plain openai/....

Steps to reproduce

  1. Configure OpenClaw to use a Codex-native harness with a vision-capable Codex model.
  2. Send a message with an image attachment through any channel that forwards the image into the embedded run.
  3. Observe that the turn input already contains the inbound image.
  4. Observe that the dynamic tool catalog still exposes image.
  5. When the model emits an image tool call, the run can stall instead of replying.

Expected behavior

A Codex-native vision turn that already has inbound image input should answer directly from that input and should not re-expose the dynamic image tool for the same turn. Bare image-model overrides should also inherit the configured image-model provider.

Actual behavior

The run can remain stuck after emitting a dynamic image tool call, so channels show typing and then silence. Separately, image-model overrides such as gpt-5.4-mini can resolve under the wrong provider when agents.defaults.imageModel.primary is configured under openai-codex/....
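
The provider mismatch can be sketched as follows. This is an illustration under stated assumptions, not the logic in src/agents/model-fallback.ts: `providerOf` and `resolveImageModelOverride` are hypothetical helpers, and the value `"openai"` for `DEFAULT_PROVIDER` is inferred from the report that bare overrides resolve back to plain openai/.... The configured primary is kept elided exactly as in the issue, since only its provider prefix matters here.

```typescript
// Assumed value: the issue only says bare overrides fall back to plain "openai/...".
const DEFAULT_PROVIDER = "openai";

// Extract the provider prefix from a "provider/model" reference, if present.
function providerOf(modelRef: string): string | undefined {
  const slash = modelRef.indexOf("/");
  return slash > 0 ? modelRef.slice(0, slash) : undefined;
}

// Resolve an image-model override. Qualified overrides are kept as-is; bare ids
// inherit the provider of the configured primary image model when it has one,
// and fall back to DEFAULT_PROVIDER only when no primary is configured.
function resolveImageModelOverride(override: string, configuredPrimary?: string): string {
  if (providerOf(override) !== undefined) {
    return override;
  }
  const inherited = configuredPrimary ? providerOf(configuredPrimary) : undefined;
  return `${inherited ?? DEFAULT_PROVIDER}/${override}`;
}

// Stand-in for agents.defaults.imageModel.primary; the model part stays elided.
const primary = "openai-codex/...";

const inheritedRef = resolveImageModelOverride("gpt-5.4-mini", primary);
const defaultRef = resolveImageModelOverride("gpt-5.4-mini");
const qualifiedRef = resolveImageModelOverride("openai/gpt-5.4-mini", primary);

console.log(inheritedRef); // openai-codex/gpt-5.4-mini
console.log(defaultRef); // openai/gpt-5.4-mini
console.log(qualifiedRef); // openai/gpt-5.4-mini
```

The reported bug corresponds to always taking the `DEFAULT_PROVIDER` branch for bare ids, so that `gpt-5.4-mini` resolves to `openai/gpt-5.4-mini` even when the configured primary lives under `openai-codex/...`.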

OpenClaw version

2026.4.10 (reproduced), current main still contains the same code paths

Operating system

macOS 15.x

Install method

npm global

Model

codex/gpt-5.4

Provider / routing chain

openclaw -> codex app-server

Additional provider/model setup details

No private config is required to reproduce this. The key condition is: a vision-capable Codex model, inbound images already present in the turn input, and the dynamic tool catalog still exposing image.

Logs, screenshots, and evidence

Impact and severity

  • Affected: Codex-native image turns across channels
  • Severity: High (stalls image replies)
  • Frequency: Reproduced consistently under the conditions above
  • Consequence: the run can hang until the channel typing TTL expires

Additional information

Observed on 2026.4.10. I also checked the current source and the same behavior still exists in extensions/codex/src/app-server/run-attempt.ts and src/agents/model-fallback.ts. A local patch that filters image when vision input is already present, and that inherits the configured image-model provider for bare overrides, resolves the issue.

extent analysis

TL;DR

Filtering the image tool when vision input is already present and inheriting the configured image-model provider for bare overrides may resolve the issue.

Guidance

  • Verify that the issue is caused by the dynamic image tool being exposed when inbound images are already present in the turn input.
  • Check the extensions/codex/src/app-server/run-attempt.ts and src/agents/model-fallback.ts files to ensure they are handling vision-capable Codex models correctly.
  • Consider applying a local patch to filter image when vision input is already present and to inherit the configured image-model provider for bare overrides.
  • Test the fix by sending a message with an image attachment and verifying that the run does not stall and the channel does not show typing and then silence.

Example

The issue itself does not include a code snippet or diff.

Notes

The issue is specific to Codex-native image turns with vision-capable models. The fix touches extensions/codex/src/app-server/run-attempt.ts and src/agents/model-fallback.ts, the two source files modified in PR #65061.

Recommendation

Apply a workaround by filtering the image tool when vision input is already present and inheriting the configured image-model provider for bare overrides, as this has been shown to resolve the issue in local testing.
