openclaw - 💡(How to fix) Fix `image` media-understanding tool can bypass configured Codex image route via model overrides and direct OpenAI auto-selection



Summary

On a Codex subscription/OAuth deployment, the agent repeatedly failed when calling the built-in image media-understanding tool. The main agent model runs successfully through the Codex harness/OAuth path, but image tool calls were routed to direct OpenAI Responses and failed because no direct OPENAI_API_KEY was configured.

This appears to be two related issues:

  1. The image tool can honor model overrides such as gpt-5.5, gpt-4.1, or openai/gpt-4o and route them to direct OpenAI instead of the configured Codex image-understanding route.
  2. When no explicit image route is set, media-understanding auto-selection prefers direct openai/gpt-5.4-mini before Codex-capable routes, even on Codex-OAuth-only deployments.

A local prompt workaround reduced the failures, but this seems like behavior OpenClaw should guard against upstream.

Environment

  • OpenClaw version: 2026.5.22 (a374c3a)
  • Platform: macOS
  • Auth mode: OpenAI Codex subscription/OAuth, not direct OpenAI API key
  • Main agent model: openai/gpt-5.5 through Codex OAuth/harness
  • No direct OPENAI_API_KEY configured

Relevant installed paths from the affected machine:

CLI: /Users/<user>/.npm-global/bin/openclaw
Package: /Users/<user>/.npm-global/lib/node_modules/openclaw
Config: ~/.openclaw/openclaw.json
Agent dir: ~/.openclaw/agents/main/agent

Observed failure

The failing tool was OpenClaw’s built-in media-understanding tool named exactly image, not image_generate.

Trajectory records looked like:

{
  "type": "tool.result",
  "provider": "openai",
  "modelId": "gpt-5.4-mini",
  "modelApi": "openai-responses",
  "data": {
    "name": "image",
    "success": false
  }
}

The underlying error text in those tool results was:

No API key found for provider "openai". You are authenticated with OpenAI Codex OAuth; OpenAI agent model runs use openai/gpt-* through the Codex runtime. Set OPENAI_API_KEY only for direct OpenAI API-key surfaces.

There were 63 failed media-understanding tool results with this API-key error:

62 tool.result failures for image
 1 tool.result failure for pdf

All used:

provider: openai
modelApi: openai-responses
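
A per-tool tally like the one above can be reproduced from trajectory records with a few lines of TypeScript. This is only a sketch: the record type mirrors the fields visible in the excerpt above, while `countFailuresByTool` and the sample array are hypothetical and not part of OpenClaw.

```typescript
// Shape taken from the trajectory excerpt above; any other fields are ignored.
interface ToolResultRecord {
  type: string;
  provider: string;
  modelId: string;
  modelApi: string;
  data: { name: string; success: boolean };
}

// Hypothetical helper: count failed tool.result records per tool name.
function countFailuresByTool(records: ToolResultRecord[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const record of records) {
    if (record.type !== "tool.result" || record.data.success) continue;
    counts[record.data.name] = (counts[record.data.name] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample data, not the real trajectory.
const sampleRecords: ToolResultRecord[] = [
  { type: "tool.result", provider: "openai", modelId: "gpt-5.4-mini", modelApi: "openai-responses", data: { name: "image", success: false } },
  { type: "tool.result", provider: "openai", modelId: "gpt-5.4-mini", modelApi: "openai-responses", data: { name: "image", success: false } },
  { type: "tool.result", provider: "openai", modelId: "gpt-5.4-mini", modelApi: "openai-responses", data: { name: "pdf", success: false } },
  { type: "tool.result", provider: "openai", modelId: "gpt-5.4-mini", modelApi: "openai-responses", data: { name: "image", success: true } },
];

const failureCounts = countFailuresByTool(sampleRecords);
console.log(failureCounts);
```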

Across these 63 failed calls (62 image, 1 pdf), the agent supplied these model argument values:

gpt-5.5               31
gpt-4.1               12
gpt-4o-mini            7
gpt-4o                 3
openai/gpt-4.1         2
openai/gpt-4o          2
gpt-4.1-mini           1
gpt-5.2                1
gpt-5.4                1
gpt-5.4-mini           1
openai/gpt-5.4-mini    1
openai/gpt-5.5         1

This suggests the model repeatedly tried to “helpfully” pick an OpenAI vision model through the optional image.model argument, which bypassed the configured/authenticated Codex path.

Current relevant config

This deployment should use Codex OAuth for text and image understanding, while image generation can remain on OpenAI Images:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai/gpt-5.5",
        "fallbacks": ["openai/gpt-5.4", "openai/gpt-5.4-mini"]
      },
      "imageModel": {
        "primary": "codex/gpt-5.5"
      },
      "imageGenerationModel": {
        "primary": "openai/gpt-image-2"
      }
    }
  }
}

After setting agents.defaults.imageModel.primary = "codex/gpt-5.5":

openclaw config validate
Config valid: ~/.openclaw/openclaw.json

openclaw gateway restart
Restarted LaunchAgent: gui/501/ai.openclaw.gateway

openclaw gateway health
Gateway Health
OK (70ms)
Slack: configured

openclaw models status
Config        : ~/.openclaw/openclaw.json
Agent dir     : ~/.openclaw/agents/main/agent
Default       : openai/gpt-5.5
Fallbacks (2) : openai/gpt-5.4, openai/gpt-5.4-mini
Image model   : codex/gpt-5.5
Image fallbacks (0): -
Aliases (2)   : gpt -> openai/gpt-5.4, gpt-mini -> openai/gpt-5.4-mini
Configured models (4): openai/gpt-5.5, openai/gpt-5.4, openai/gpt-5.4-mini, openai/gpt-image-2

Auth overview
Auth store    : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env     : off
Providers w/ OAuth/tokens (1): openai-codex (1)
- codex effective=synthetic:codex-app-server | synthetic=plugin-owned | source=codex-app-server
- openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | synthetic=plugin-owned | source=codex-app-server

Runtime auth
- openai via codex uses openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | status=usable

Local workaround

Adding guidance to the agent’s TOOLS.md reduced the repeated failures:

This agent runs on the Codex harness with Codex OAuth, not direct OpenAI API-key auth.

If an image is attached to the current message or otherwise already visible in the main Codex turn, analyze it directly without calling the `image` tool.

Use the `image` tool only for image files or URLs that are not already visible in context. When calling `image`, omit `model` so OpenClaw uses the configured Codex image-understanding route.

This is only a prompt workaround. The runtime should ideally prevent accidental direct OpenAI routing on Codex-OAuth-only installs.

Relevant source findings

image tool runtime description

The built-in tool is named image and exposes an optional model parameter.

Source behavior in src/agents/tools/image-tool.ts:

return {
  label: "Image",
  name: "image",
  description,
  parameters: Type.Object({
    prompt: Type.Optional(Type.String()),
    image: Type.Optional(Type.String({ description: "One image path/URL." })),
    images: Type.Optional(
      Type.Array(Type.String(), {
        description: "Image paths/URLs; maxImages default 20.",
      }),
    ),
    model: Type.Optional(Type.String()),
    maxBytesMb: Type.Optional(Type.Number()),
    maxImages: Type.Optional(Type.Number()),
  }),
  ...
}

The description variants include:

Analyze images with vision model. Use image for one path/URL, images for max 20. Only use this tool when images were NOT already provided; prompt images already visible.

or:

Analyze images with configured image model. Use image for one path/URL, images for max 20. Prompt says what to inspect.

The system prompt only lists the compact summary:

- image: Analyze an image with the configured image model
- image_generate: Generate images with the configured image-generation model

Direct OpenAI media-understanding is prioritized before openai-codex

The OpenAI media-understanding provider metadata has direct OpenAI ahead of openai-codex:

"mediaUnderstandingProviderMetadata": {
  "openai": {
    "capabilities": ["image", "audio"],
    "defaultModels": {
      "image": "gpt-5.4-mini",
      "audio": "gpt-4o-transcribe"
    },
    "autoPriority": {
      "image": 10,
      "audio": 10
    }
  },
  "openai-codex": {
    "capabilities": ["image", "audio"],
    "defaultModels": {
      "image": "gpt-5.5",
      "audio": "gpt-4o-transcribe"
    },
    "autoPriority": {
      "image": 20,
      "audio": 20
    }
  }
}

resolveAutoMediaKeyProviders() sorts ascending by autoPriority, so direct openai wins before openai-codex.
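
The ordering effect can be illustrated with a small sketch. It is not the actual `resolveAutoMediaKeyProviders()` implementation: `orderProviders`, `priorityOf`, and the `hasUsableAuth` callback are hypothetical names, and only the metadata values are copied from the excerpt above. The sketch shows how an ascending sort puts direct `openai` first, and how an auth-aware filter (one possible guard) would drop it on a Codex-OAuth-only install.

```typescript
type Capability = "image" | "audio";

interface ProviderMeta {
  capabilities: Capability[];
  defaultModels: Partial<Record<Capability, string>>;
  autoPriority: Partial<Record<Capability, number>>;
}

// Values copied from the metadata excerpt above.
const metadata: Record<string, ProviderMeta> = {
  openai: {
    capabilities: ["image", "audio"],
    defaultModels: { image: "gpt-5.4-mini", audio: "gpt-4o-transcribe" },
    autoPriority: { image: 10, audio: 10 },
  },
  "openai-codex": {
    capabilities: ["image", "audio"],
    defaultModels: { image: "gpt-5.5", audio: "gpt-4o-transcribe" },
    autoPriority: { image: 20, audio: 20 },
  },
};

// Providers without an explicit priority sort last.
function priorityOf(meta: ProviderMeta, capability: Capability): number {
  return meta.autoPriority[capability] ?? Number.MAX_VALUE;
}

// Hypothetical helper: ascending autoPriority, optionally skipping
// providers that have no usable auth.
function orderProviders(
  all: Record<string, ProviderMeta>,
  capability: Capability,
  hasUsableAuth?: (providerId: string) => boolean,
): string[] {
  return Object.keys(all)
    .filter((id) => all[id].capabilities.some((c) => c === capability))
    .filter((id) => (hasUsableAuth ? hasUsableAuth(id) : true))
    .sort((a, b) => priorityOf(all[a], capability) - priorityOf(all[b], capability));
}

// Reported behavior: direct openai is tried first.
const currentOrder = orderProviders(metadata, "image");
// Possible guard: no OPENAI_API_KEY, so direct openai is filtered out.
const authAwareOrder = orderProviders(metadata, "image", (id) => id !== "openai");
console.log(currentOrder, authAwareOrder);
```

Filtering before sorting keeps the priority table unchanged for installs that do have a direct API key.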

codex media-understanding provider is the bounded app-server path

The Codex media-understanding provider is separate and uses provider id codex:

export function buildCodexMediaUnderstandingProvider(...): MediaUnderstandingProvider {
  return {
    id: CODEX_PROVIDER_ID,
    capabilities: ["image"],
    defaultModels: { image: DEFAULT_CODEX_IMAGE_MODEL },
    describeImage: async (req) => describeCodexImages(...),
    describeImages: async (req) => describeCodexImages(req, options),
    extractStructured: async (req) => extractCodexStructured(req, options),
  };
}

It runs a bounded Codex app-server turn with text+image modalities.

The docs also say:

agents.defaults.imageModel follows the same prefix split. Use openai/gpt-* for the normal OpenAI route and codex/gpt-* only when image understanding should run through a bounded Codex app-server turn. Do not use openai-codex/gpt-*; doctor rewrites that legacy prefix to openai/gpt-*.

Additional observation: explicit Codex probe error

With imageModel set to codex/gpt-5.5, an explicit CLI probe still failed with a raw reconnecting message:

openclaw infer image describe \
  --file /path/to/image.png \
  --model codex/gpt-5.5 \
  --json \
  --timeout-ms 60000

Output:

Error: Reconnecting... 2/5

Recent logs around the probe:

2026-05-27T04:24:29.509Z info gateway {"subsystem":"gateway"} http server listening (8 plugins: brave, browser, codex, llm-task, memory-core, openai, slack, slack-subagent-card; 5.6s)
2026-05-27T04:24:29.751Z info gateway {"subsystem":"gateway"} gateway ready
2026-05-27T04:24:32.745Z info Gateway Health
2026-05-27T04:24:32.752Z info OK (70ms)
2026-05-27T04:24:34.051Z info gateway {"subsystem":"gateway"} provider auth state pre-warmed in 3277ms eventLoopMax=993.0ms
2026-05-27T04:24:42.859Z error Error: Reconnecting... 2/5
2026-05-27T04:24:51.364Z info Gateway Health
2026-05-27T04:24:51.378Z info OK (63ms)
2026-05-27T04:26:04.621Z error Error: Reconnecting... 2/5

This may be a separate issue, but the error should be more actionable if the bounded Codex image route cannot run.
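
As a rough illustration of a more actionable error, the raw message could be wrapped with the surface, provider, and a suggested next step. Everything in this sketch is hypothetical: `RouteContext`, `wrapRouteError`, and the next-step text are placeholders, not OpenClaw APIs or official guidance.

```typescript
// Hypothetical context describing where the failure happened.
interface RouteContext {
  surface: string;  // e.g. the CLI command or tool name
  provider: string; // e.g. "codex"
  model: string;    // e.g. "gpt-5.5"
  nextStep: string; // placeholder guidance for the user
}

// Wrap a raw failure so the message says what failed, where, and what to try.
function wrapRouteError(err: unknown, ctx: RouteContext): Error {
  const raw = err instanceof Error ? err.message : String(err);
  return new Error(
    `${ctx.surface} failed on ${ctx.provider}/${ctx.model}: ${raw}\n` +
      `Next step: ${ctx.nextStep}`,
  );
}

const wrappedProbeError = wrapRouteError(new Error("Reconnecting... 2/5"), {
  surface: "infer image describe",
  provider: "codex",
  model: "gpt-5.5",
  nextStep: "placeholder: check the Codex app-server connection and retry",
});
console.log(wrappedProbeError.message);
```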

Expected behavior

On a Codex-OAuth-only deployment:

  1. If agents.defaults.imageModel.primary = "codex/gpt-5.5" is configured, the image tool should use that route by default.
  2. The optional image.model argument should not accidentally bypass the configured Codex image route when the model supplies bare OpenAI-looking values like gpt-5.5 or gpt-4.1.
  3. If the model override is incompatible with available auth, OpenClaw should reject it with an actionable error instead of attempting direct OpenAI Responses and producing repeated API-key failures.
  4. Auto media-understanding selection should not prefer direct openai/gpt-5.4-mini when direct OpenAI API-key auth is absent and Codex OAuth/app-server auth is available.
  5. openclaw infer image describe --model codex/gpt-5.5 should either work or explain why the Codex app-server image route cannot run.

Possible upstream fixes

A robust fix could include:

  • Resolve bare image.model overrides relative to the configured image provider first. For example, when agents.defaults.imageModel.primary = "codex/gpt-5.5", model: "gpt-5.5" should prefer codex/gpt-5.5, not direct openai/gpt-5.5.
  • Make image.model overrides auth-aware. If direct openai lacks API-key auth but Codex image routing is configured, either route to codex/* where compatible or reject with a clear message.
  • Improve the image tool description to say that agents should usually omit model, and that Codex OAuth image understanding uses codex/gpt-*.
  • Update auto provider ordering/selection so direct OpenAI is not selected on Codex-OAuth-only installs without OPENAI_API_KEY.
  • Improve the Codex image probe error so Error: Reconnecting... 2/5 is wrapped with surface, provider, and next-step context.
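
The first two ideas above could look roughly like the following sketch. It assumes a hypothetical `resolveImageModel` helper that receives the tool's optional `model` argument, the configured `agents.defaults.imageModel.primary` value, and a list of providers with usable auth; none of these names come from the OpenClaw source.

```typescript
interface ResolveInput {
  override?: string;         // the tool's optional `model` argument
  configuredPrimary: string; // e.g. "codex/gpt-5.5"
  usableProviders: string[]; // providers with usable auth (assumed input)
}

// Split "provider/model" into its parts; bare ids have no provider.
function splitRef(ref: string): { provider?: string; model: string } {
  const slash = ref.indexOf("/");
  return slash === -1
    ? { model: ref }
    : { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

function resolveImageModel(input: ResolveInput): string {
  if (!input.override) return input.configuredPrimary;

  const configured = splitRef(input.configuredPrimary);
  const requested = splitRef(input.override);
  // Bare ids such as "gpt-5.5" inherit the configured image provider.
  const provider = requested.provider ?? configured.provider;

  if (!provider || input.usableProviders.indexOf(provider) === -1) {
    throw new Error(
      `image.model "${input.override}" needs provider "${provider ?? "unknown"}", ` +
        `which has no usable auth. Omit model to use ${input.configuredPrimary}.`,
    );
  }
  return `${provider}/${requested.model}`;
}

// Bare override on a Codex-only install resolves to the Codex route.
console.log(
  resolveImageModel({
    override: "gpt-5.5",
    configuredPrimary: "codex/gpt-5.5",
    usableProviders: ["codex"],
  }),
);
```

In this sketch an explicit `openai/gpt-4o` override is rejected up front instead of reaching direct OpenAI Responses. A real implementation would also need to check that the inherited provider actually supports the requested model id, which the sketch does not attempt.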

Why this matters

Without the prompt workaround, the agent repeatedly called image with model overrides and generated 63 failed tool results. The failures were not normal text-agent turns and not image generation. They were media-understanding tool calls escaping to direct OpenAI Responses despite the deployment being authenticated through Codex OAuth.
