openclaw - ✅(Solved) Fix [Bug]: models.mode=merge may incorrectly drop image attachments when one provider is missing apiKey [2 pull requests, 1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#70557Fetched 2026-04-24 05:56:31
View on GitHub
Comments
1
Participants
2
Timeline
8
Reactions
0
Timeline (top)
referenced ×3cross-referenced ×2labeled ×2commented ×1

When models.mode is set to "merge" and multiple providers are configured, image upload may fail unexpectedly if one of the providers is missing the apiKey field in both openclaw.json and agents/main/agent/models.json, even though another provider is fully configured and declares image support. This issue does not occur when both providers are configured correctly with apiKey.

Root Cause

When models.mode is set to "merge" and multiple providers are configured, image upload may fail unexpectedly if one of the providers is missing the apiKey field in both openclaw.json and agents/main/agent/models.json, even though another provider is fully configured and declares image support. This issue does not occur when both providers are configured correctly with apiKey.

Fix Action

Fixed

PR fix notes

PR #70816: fix(agents): preserve explicit input modalities through provider merge (#70557)

Description (problem / solution / changelog)

Summary

  • Problem: With models.mode=merge, the provider-model merge unconditionally overwrites a user's explicit input: ["text", "image"] with the implicit catalog's input, stripping vision capability from vision-capable models.
  • Why it matters: The generated agents/main/agent/models.json then advertises the model as text-only, so resolveGatewayModelSupportsImages returns false and parseMessageWithAttachments silently drops every image attachment with parseMessageWithAttachments: N attachment(s) dropped — model does not support images.
  • What changed: In mergeProviderModels, honor an explicit input override via the same "key" in obj pattern already used for reasoning. User-set input wins; if the user omits input, the implicit refresh still applies.
  • What did NOT change (scope boundary): No changes to resolveGatewayModelSupportsImages, the gateway catalog load path, plugin discovery, or any other merge field (contextWindow, maxTokens, reasoning, headers).

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #70557
  • Related #39785, #59282 (same one-line fix, both missing complete regression coverage)
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: src/agents/models-config.merge.ts line 104 (input: implicitModel.input) unconditionally overwrote the explicit model's input with the implicit catalog value. The adjacent reasoning field already uses "reasoning" in explicitModel ? explicitModel.reasoning : implicitModel.reasoning to honor explicit overrides; input was never given the same treatment. Behavior was introduced in commit 15e32c7341 ("refresh Moonshot Kimi vision capabilities") to keep stale user vision flags refreshed, but the asymmetry silently strips user-configured capabilities that are more accurate than the bundled catalog snapshot.
  • Missing detection / guardrail: The existing unit test at src/agents/models-config.merge.test.ts asserted the buggy behavior (input: ["text"] after merge), locking the regression in place.
  • Contributing context (if known): Reporter misattributed the symptom to the missing apiKey on one provider. apiKey is coincidental — the merge overwrite runs regardless of auth status.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/models-config.merge.test.ts
  • Scenario the test should lock in: Explicit input: ["text", "image"] + implicit input: ["text"] → merged keeps ["text", "image"]; plus refresh-when-omitted and explicit-empty-array cases.
  • Why this is the smallest reliable guardrail: mergeProviderModels is a pure helper — a direct unit test on it is sufficient and deterministic.
  • Existing test that already covers this (if any): The pre-existing "refreshes implicit model metadata while preserving explicit reasoning overrides" test asserted the buggy behavior; updated in this PR.
  • If no new test is added, why not: N/A — three new tests added.

User-visible / Behavior Changes

  • When a user configures input: ["text", "image"] on a model in models.providers.<id>.models[], that setting is now preserved through the merge in models.mode=merge (and the generated models.json), instead of being silently replaced by the built-in default. Image attachments are no longer dropped for vision-capable models whose bundled catalog entry omits image.
  • Users who rely on the refresh behavior (explicit omits input) are unaffected: the implicit value still wins when the explicit key is absent.

Diagram (if applicable)

Before (mode=merge):
[user openclaw.json: input: ["text","image"]]
  -> mergeProviderModels: input = implicitModel.input (e.g. ["text"])
  -> models.json advertises text-only
  -> resolveGatewayModelSupportsImages returns false
  -> parseMessageWithAttachments drops image(s)

After:
[user openclaw.json: input: ["text","image"]]
  -> mergeProviderModels: "input" in explicitModel -> keep ["text","image"]
  -> models.json advertises image capability
  -> resolveGatewayModelSupportsImages returns true
  -> image attachments flow through to the provider

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Windows 11 (per issue), reproduction is config-driven so OS-agnostic
  • Runtime/container: Node 22+
  • Model/provider: qwen/qwen3.6-plus with two configured providers
  • Integration/channel (if any): WebChat / Gateway
  • Relevant config (redacted):
    {
      "models": {
        "mode": "merge",
        "providers": {
          "providerA": { "models": [{ "id": "qwen3.6-plus", "input": ["text","image"] }] },
          "providerB": { "apiKey": "...", "models": [{ "id": "qwen3.6-plus", "input": ["text","image"] }] }
        }
      }
    }

Steps

  1. Configure models.mode=merge with two providers as above (one missing apiKey, one valid, both declaring input: ["text","image"]).
  2. Start the gateway and open WebChat.
  3. Send a text-only message — works.
  4. Send a message with an image attachment.

Expected

  • Image attachment is delivered to the vision-capable provider/model.

Actual (before fix)

  • Gateway logs parseMessageWithAttachments: 1 attachment(s) dropped — model does not support images; the image is silently dropped.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

The existing unit test at src/agents/models-config.merge.test.ts asserted input: ["text"] after merging explicit ["image"] over implicit ["text"] — that assertion (the bug, locked in) is updated to input: ["image"] in this PR. Three new tests cover the issue scenario, the refresh path, and the explicit-empty-array edge case.

  • pnpm test src/agents/models-config.merge.test.ts → 14/14 pass
  • pnpm test src/agents/models-config.preserves-explicit-reasoning-override.test.ts → 2/2 pass (sibling pattern unchanged)
  • pnpm tsgo → pass
  • pnpm format:check → pass
  • pnpm lint:core → pass

Human Verification (required)

  • Verified scenarios:
    • Explicit input: ["text","image"] + implicit input: ["text"] → merged keeps ["text","image"].
    • Explicit omits input + implicit input: ["text","image"] → merged gets ["text","image"] (refresh still works).
    • Explicit input: [] + implicit input: ["text","image"] → merged keeps [] (user intent honored).
  • Edge cases checked:
    • reasoning precedence unchanged (sibling test still green).
    • Provider with no implicit entry: unchanged (early-return at line 80).
    • Empty implicit models list: unchanged (early-return at line 51).
  • What you did not verify:
    • End-to-end WebChat image upload against a live Qwen provider — the bug is fully characterized by the unit-level merge path plus the downstream resolveGatewayModelSupportsImages contract, and no gateway code is changed.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: A user who relied on the merge stripping an incorrect explicit input back to the implicit default now keeps their (incorrect) config.
    • Mitigation: Explicit config always wins is consistent with every other user-override field (reasoning, contextWindow, maxTokens). Users can fix a wrong input by editing or removing the field; the refresh path still applies when input is absent.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/models-config.merge.test.ts (modified, +159/-2)
  • src/agents/models-config.merge.ts (modified, +51/-2)
  • src/agents/models-config.plan.ts (modified, +6/-1)

PR #70824: fix: merge mode no longer drops image attachments when one provider lacks apiKey

Description (problem / solution / changelog)

Summary

  • Problem: In models.mode=merge, image attachments are silently dropped when one of the configured providers is missing apiKey, even though another fully-configured provider declares input: ["text", "image"].
  • Why it matters: Users lose image capability with no warning. Gateway log says "attachment(s) dropped — model does not support images" despite a valid image-capable provider being present.
  • What changed: (1) resolveGatewayModelSupportsImages now falls back to checking all providers for image capability when the provider-specific lookup fails. (2) mergeProviderModels now respects explicit input overrides (consistent with existing reasoning behavior).
  • What did NOT change (scope boundary): No changes to model routing, provider auth, catalog loading, attachment parsing, or config schema. Only the image capability check and the input array merge expression.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #70557
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause (Bug 1 — cross-provider): resolveGatewayModelSupportsImages (src/gateway/session-utils.ts:1126) uses catalog.find() with exact provider + model match. In merge mode, when the resolved modelRef uses Provider A (missing apiKey, incomplete catalog entry with input: ["text"]), the function returns false without checking Provider B's entry (which declares input: ["text", "image"]). parseMessageWithAttachments then drops all image attachments.
  • Root cause (Bug 2 — merge input): mergeProviderModels (src/agents/models-config.merge.ts:104) unconditionally sets input: implicitModel.input, discarding the user's explicit input array. This is inconsistent with reasoning on the same line, which checks "reasoning" in explicitModel before falling back to implicit. If a user explicitly declares input: ["text", "image"] in their model config, the merge silently replaces it with the implicit catalog's value.
  • Missing detection / guardrail: No test asserted cross-provider image capability resolution. No test asserted that explicit input survives the merge (existing test expected implicit to always win).
  • Contributing context: The resolveGatewayModelSupportsImages function already has legacy safety shims for Microsoft Foundry and Claude CLI models — the cross-provider case was simply not covered. The merge function handles reasoning, contextWindow, contextTokens, and maxTokens with explicit-first logic, but input was the sole outlier using unconditional implicit override.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file:
    • src/gateway/session-utils.test.ts — 3 new tests for cross-provider image fallback
    • src/agents/models-config.merge.test.ts — 1 new test for implicit input fallback + 1 updated test for explicit input override
  • Scenario the test should lock in:
    • Cross-provider: Provider A has input: ["text"], Provider B has input: ["text", "image"] for same model → supportsImages returns true
    • Cross-provider (no entry): specified provider has no catalog entry, but another provider does with image support → returns true
    • Cross-provider (negative): no provider has image support → returns false
    • Merge: explicit input: ["text", "image"] survives merge (not overwritten by implicit input: ["text"])
    • Merge: when explicit model omits input, implicit input is used as fallback
  • Why this is the smallest reliable guardrail: Both bugs are pure data transformation — catalog lookup and config merge — with no side effects. Unit tests on the functions are sufficient.
  • Existing test that already covers this (if any): None for cross-provider. One existing test for merge input (updated — previously expected implicit to always win).

User-visible / Behavior Changes

  • Image attachments are no longer silently dropped in models.mode=merge when one provider is missing apiKey but another fully-configured provider declares image support.
  • User-configured input arrays in model config are now preserved through the merge (previously always overwritten by implicit catalog).

Diagram (if applicable)

flowchart TD
    A["User sends image\n(models.mode=merge)"] --> B["resolveGatewayModelSupportsImages()"]
    B --> C{"Provider A entry\nhas input: image?"}
    C -- "Yes" --> D["return true ✅"]
    C -- "No" --> E{"Legacy shim\n(Foundry/Claude)?"}
    E -- "Yes" --> D
    E -- "No" --> F{"🆕 Cross-provider fallback:\nANY catalog entry\nfor same model\nhas input: image?"}
    F -- "Yes" --> D
    F -- "No" --> G["return false ❌\n→ attachments dropped"]

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Windows 11 (reporter) / Linux (dev)
  • Runtime/container: Node 22+, pnpm
  • Model/provider: qwen/qwen3.6-plus via manual multi-provider config
  • Relevant config: models.mode set to "merge", Provider A missing apiKey, Provider B fully configured with input: ["text", "image"]

Steps

  1. Configure models.mode=merge with two providers for the same model
  2. Provider A: no apiKey in openclaw.json or agents/main/agent/models.json
  3. Provider B: valid apiKey, model declares input: ["text", "image"]
  4. Start OpenClaw, send a message with an image attachment

Expected

  • Image attachment is processed normally (Provider B is valid and image-capable)

Actual

  • Image attachment dropped. Gateway logs: parseMessageWithAttachments: 1 attachment(s) dropped — model does not support images

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

4 new tests + 1 updated test covering both bugs. All 73 tests pass across both affected test files (61 session-utils + 12 merge). 0 lint errors. 0 type errors (pnpm tsgo green).

Human Verification (required)

  • Verified scenarios: Both fixes verified via fresh unit test runs. Cross-provider fallback tested with: hit (another provider has image), miss (no provider has image), no-entry (specified provider absent from catalog). Merge input tested with: explicit present (wins), explicit absent (implicit fallback). Full test suite for both files: 73/73 pass.
  • Edge cases checked: No provider specified (existing behavior preserved — provider-agnostic lookup). Model not in catalog at all (existing behavior preserved — returns false). Legacy shims (Foundry/Claude CLI) still take priority over cross-provider fallback.
  • What you did not verify: Full e2e wizard flow with actual multi-provider merge config on a live instance.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Risks and Mitigations

  • Risk: Cross-provider fallback might allow images for a model where the resolved provider does not actually support them (e.g., model supports images on Provider B but not Provider A, and request routes to Provider A).
    • Mitigation: This is a better failure mode than silently dropping images — Provider A would return an API error (visible to user) rather than silently discarding content. The routing layer should ideally select Provider B for image requests, and that is a separate concern from capability detection.
  • Risk: Respecting explicit input override in merge could let users misconfigure a model's capabilities (e.g., declaring image support for a text-only model).
    • Mitigation: This matches the existing reasoning behavior — explicit config is the user's deliberate override. The implicit catalog remains the fallback when input is not explicitly set.

AI-assisted PR (OpenCode + Claude). All changes tested via unit tests. I traced both bugs through the full code path (resolveGatewayModelSupportsImagesparseMessageWithAttachments → dropped) and understand the fix at each level.

Changed files

  • src/agents/models-config.merge.test.ts (modified, +38/-3)
  • src/agents/models-config.merge.ts (modified, +1/-1)
  • src/gateway/session-utils.subagent.test.ts (modified, +26/-1)
  • src/gateway/session-utils.test.ts (modified, +63/-0)
  • src/gateway/session-utils.ts (modified, +28/-0)
RAW_BUFFERClick to expand / collapse

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

When models.mode is set to "merge" and multiple providers are configured, image upload may fail unexpectedly if one of the providers is missing the apiKey field in both openclaw.json and agents/main/agent/models.json, even though another provider is fully configured and declares image support. This issue does not occur when both providers are configured correctly with apiKey.

Steps to reproduce

Assume there are two providers configured under models.mode=merge:

  • Provider A:
    • missing apiKey in openclaw.json
    • also missing apiKey in agents/main/agent/models.json
  • Provider B:
    • fully configured
    • has valid apiKey
    • model config includes input: ["text", "image"]

Under this setup:

  1. Start OpenClaw normally.
  2. Send a text-only message, this works correctly.
  3. Send a message with an image attachment, the image is dropped.

Expected behavior

As long as there is at least one valid merged provider/model that:

  • is fully configured
  • is available for routing/selection
  • and explicitly supports input: ["text", "image"]

image attachments should not be dropped.

At minimum, the missing apiKey on another provider should not cause the merged image capability detection to degrade to “model does not support images” if a valid image-capable provider is present.

Actual behavior

Text chat works normally, but image attachments are dropped. Gateway logs show: parseMessageWithAttachments: 1 attachment(s) dropped — model does not support images

OpenClaw version

2026.4.20

Operating system

Windows 11

Install method

No response

Model

qwen/qwen3.6-plus

Provider / routing chain

Manually

Additional provider/model setup details

Model routing is configured manually via local config files, not via UI.

Relevant files:

  • openclaw.json
  • agents/main/agent/models.json

Setup details:

  • models.mode is set to "merge"
  • multiple providers are configured
  • one provider does not have apiKey in openclaw.json
  • the same provider also does not have apiKey in agents/main/agent/models.json
  • another provider is fully configured and its model declares input: ["text", "image"]

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

The issue can be resolved by ensuring all providers have the apiKey field configured in both openclaw.json and agents/main/agent/models.json, or by modifying the logic to ignore providers without apiKey when detecting image support.

Guidance

  • Verify that all providers have the apiKey field in both openclaw.json and agents/main/agent/models.json to ensure consistent configuration.
  • Check the model routing configuration to ensure that the fully configured provider with image support is correctly prioritized or selected when models.mode is set to "merge".
  • Consider modifying the parseMessageWithAttachments function to ignore providers without apiKey when detecting image support, allowing the merged model to use the capabilities of fully configured providers.
  • Review the logs to confirm that the issue is resolved after applying the above steps.

Example

No code snippet is provided as the issue does not specify the exact implementation details of the parseMessageWithAttachments function or the model routing logic.

Notes

The provided information suggests that the issue is related to the configuration and merging of providers, but the exact implementation details are not specified. The suggested steps are based on the assumption that the apiKey field is required for provider configuration and that the model routing logic can be modified to ignore providers without apiKey.

Recommendation

Apply workaround: Modify the logic to ignore providers without apiKey when detecting image support, allowing the merged model to use the capabilities of fully configured providers. This approach allows for a temporary fix without requiring a full reconfiguration of all providers.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

As long as there is at least one valid merged provider/model that:

  • is fully configured
  • is available for routing/selection
  • and explicitly supports input: ["text", "image"]

image attachments should not be dropped.

At minimum, the missing apiKey on another provider should not cause the merged image capability detection to degrade to “model does not support images” if a valid image-capable provider is present.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix [Bug]: models.mode=merge may incorrectly drop image attachments when one provider is missing apiKey [2 pull requests, 1 comments, 2 participants]