openclaw - ✅(Solved) Fix Bug: GLM provider does not register media-understanding capability despite image input config [1 pull request, 1 comment, 2 participants]

GitHub stats

openclaw/openclaw#51392 (fetched 2026-04-08 01:11:49)

  • Comments: 1
  • Participants: 2
  • Timeline events: 8
  • Reactions: 0
  • Timeline (top): referenced ×3, cross-referenced ×2, closed ×1, commented ×1

Error Message

  1. Error: "No media-understanding provider registered for glm"

Fix Action

Fixed

PR fix notes

PR #51418: fix(media-understanding): auto-register image capability

Description (problem / solution / changelog)

Summary

  • Problem: When a GLM model (e.g. glm-4.6v) is configured with input: ["text", "image"] in models.providers.glm, the media-understanding capability is not registered for the "glm" provider, causing the image tool to fail with "No media-understanding provider registered for glm".
  • Why it matters: Users configuring custom providers (e.g. GLM via OpenAI-compatible API) with image-capable models cannot use the image tool or agents.defaults.imageModel.
  • What changed: In buildMediaUnderstandingRegistry, auto-register a media-understanding provider (with describeImage/describeImages delegating to the generic pi-ai runtime) for any config provider that has at least one model with input including "image", when that provider is not already registered by a plugin.
  • What did NOT change: Plugin-owned providers, audio/video capabilities, or the generic describeImageWithModel implementation.
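
The "What changed" bullet above can be sketched roughly as follows. This is an illustration written in Python, not the PR's TypeScript code: the function name `auto_register_image_providers`, the dict-based registry shape, and the `describe_image_generic` placeholder are all assumptions made for this example.

```python
from typing import Any, Callable, Dict


def describe_image_generic(provider_id: str) -> Callable[[bytes], str]:
    """Hypothetical stand-in for delegating to a generic image-description runtime."""
    def describe(image: bytes) -> str:
        return f"[{provider_id}] described {len(image)} bytes"
    return describe


def auto_register_image_providers(
    config: Dict[str, Any],
    registry: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Add a registry entry for each config provider with an image-capable model.

    Providers already in the registry (e.g. plugin-owned) are left untouched.
    """
    providers = config.get("models", {}).get("providers", {})
    for provider_id, provider_cfg in providers.items():
        if provider_id in registry:
            continue  # never overwrite an existing registration
        models = provider_cfg.get("models", [])
        if any("image" in m.get("input", []) for m in models):
            registry[provider_id] = {
                "id": provider_id,
                "capabilities": ["image"],
                "describe_image": describe_image_generic(provider_id),
            }
    return registry


# Usage: "glm" has an image-capable model, "textonly" does not,
# and "zai" stands in for a provider already registered by a plugin.
config = {
    "models": {
        "providers": {
            "glm": {"models": [{"id": "glm-4.6v", "input": ["text", "image"]}]},
            "textonly": {"models": [{"id": "t1", "input": ["text"]}]},
        }
    }
}
registry = auto_register_image_providers(config, {"zai": {"id": "zai"}})
print(sorted(registry))  # ['glm', 'zai']
```

The sketch skips providers that are already registered and providers that only declare text models, mirroring the edge cases listed under Human Verification.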

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #51392
  • Related #

User-visible / Behavior Changes

When a config provider (e.g. models.providers.glm) defines models with input: ["text", "image"], the image tool and agents.defaults.imageModel now work for that provider without requiring a separate plugin to register media-understanding.

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)

Repro + Verification

Environment

  • OS: Any
  • Runtime/container: Node
  • Model/provider: glm with input: ["text","image"]
  • Relevant config: models.providers.glm.models with image input

Steps

  1. Configure glm provider with model having input: ["text","image"]
  2. Set agents.defaults.imageModel to glm/glm-4.6v
  3. Call the image tool

Expected

Image tool succeeds; no "No media-understanding provider registered for glm" error.

Actual

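Before the fix: the image tool fails with "No media-understanding provider registered for glm". After the fix: the image tool succeeds.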

Evidence

  • Failing test/log before + passing after
  • Regression test in providers/index.test.ts for auto-registration

Human Verification (required)

  • Verified scenarios: Added unit test; logic follows existing patterns (zai provider, describeImageWithModel).
  • Edge cases checked: Providers with only text models are not registered; already-registered providers are not overwritten.
  • What you did not verify: End-to-end with real GLM API (no auth).

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)

Failure Recovery (if this breaks)

  • How to disable/revert: Revert commit.
  • Files/config to restore: src/media-understanding/providers/index.ts, index.test.ts

Risks and Mitigations

None. The change only adds providers to the registry when config explicitly declares image-capable models; existing auth and model resolution paths are unchanged.

Changed files

  • src/media-understanding/providers/index.test.ts (modified, +24/-0)
  • src/media-understanding/providers/index.ts (modified, +26/-0)

Code Example

{
  "models": {
    "providers": {
      "glm": {
        "models": [{
          "id": "glm-4.6v",
          "input": ["text", "image"]
        }]
      }
    }
  },
  "agents": {
    "defaults": {
      "imageModel": { "primary": "glm/glm-4.6v" }
    }
  }
}

Bug Description

When a GLM model (e.g. glm-4.6v) is configured with input: ["text", "image"], the OpenClaw provider does not automatically register the media-understanding capability. This causes the image tool to fail with:

No media-understanding provider registered for glm

Reproduction

  1. Configure a GLM model with input: ["text", "image"] as agents.defaults.imageModel
  2. Call the image tool with any image
  3. Error: "No media-understanding provider registered for glm"

Expected Behavior

When a model's input array contains "image", the provider should auto-register the media-understanding capability, enabling the image tool.

Config

{
  "models": {
    "providers": {
      "glm": {
        "models": [{
          "id": "glm-4.6v",
          "input": ["text", "image"]
        }]
      }
    }
  },
  "agents": {
    "defaults": {
      "imageModel": { "primary": "glm/glm-4.6v" }
    }
  }
}
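
To make the link between the two config sections concrete, here is a small sketch that checks whether the `imageModel` reference points at a model that declares image input. It assumes the reference has the form `provider/model-id` (as `glm/glm-4.6v` suggests). The helper name `image_model_declares_image_input` is hypothetical and not part of OpenClaw.

```python
import json
from typing import Any, Dict

CONFIG_TEXT = """
{
  "models": {
    "providers": {
      "glm": {
        "models": [{"id": "glm-4.6v", "input": ["text", "image"]}]
      }
    }
  },
  "agents": {
    "defaults": {
      "imageModel": {"primary": "glm/glm-4.6v"}
    }
  }
}
"""


def image_model_declares_image_input(config: Dict[str, Any]) -> bool:
    """Return True if the configured image model lists "image" in its input."""
    ref = config["agents"]["defaults"]["imageModel"]["primary"]
    # Assumption: the reference is "<provider>/<model id>", split on the first "/".
    provider_id, _, model_id = ref.partition("/")
    provider = config.get("models", {}).get("providers", {}).get(provider_id, {})
    return any(
        m.get("id") == model_id and "image" in m.get("input", [])
        for m in provider.get("models", [])
    )


config = json.loads(CONFIG_TEXT)
print(image_model_declares_image_input(config))  # True
```

A check like this only confirms that the config is self-consistent. Before the fix, the image tool still failed for such a config because no media-understanding provider was registered for `glm`.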

extent analysis

Fix Plan

To fix the issue, the media-understanding capability must be registered automatically whenever a configured provider (here, glm) has at least one model whose input includes "image".

Code Changes

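This could be done by having the provider inspect each model's input configuration and register the capability when needed. The snippet below is illustrative Python-style pseudocode only; per PR #51418, the actual change lives in buildMediaUnderstandingRegistry in src/media-understanding/providers/index.ts: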

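# openclaw_provider.py (illustrative file name)

def register_capabilities(self, model):
    # Register media-understanding when the model accepts image input.
    if "image" in model.get("input", []):
        self.register_capability("media-understanding")

# In the model configuration loop (inside the same provider class)
for model in models:
    # ... existing code ...
    self.register_capabilities(model)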

Alternatively, the image tool could check whether the media-understanding capability is registered before attempting to use it. Note that this only makes the failure earlier and clearer; it does not by itself enable the image tool:

# image_tool.py

def __init__(self, model):
    if "image" in model["input"] and not self.is_capability_registered("media-understanding"):
        raise ValueError("Media-understanding capability not registered for model")
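
For context, the reported error is the kind of message a registry lookup produces when it finds no entry for the requested provider. The sketch below illustrates that lookup with a hypothetical `get_media_provider` helper and a plain dict registry. It is not OpenClaw's code, only an illustration of why an unregistered provider id leads to this message.

```python
from typing import Any, Dict


def get_media_provider(registry: Dict[str, Any], provider_id: str) -> Any:
    """Return the registered provider, or raise with the message from the issue."""
    try:
        return registry[provider_id]
    except KeyError:
        raise LookupError(
            f"No media-understanding provider registered for {provider_id}"
        ) from None


# Only "zai" is registered here, so looking up "glm" fails.
registry = {"zai": object()}
try:
    get_media_provider(registry, "glm")
except LookupError as err:
    message = str(err)
    print(message)  # No media-understanding provider registered for glm
```

Seen this way, the fix belongs on the registration side: once `glm` is in the registry, the lookup succeeds and no change to the image tool is needed.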

Configuration Changes

No configuration changes are required. The fix relies solely on code modifications.

Verification

To verify that the fix worked, follow these steps:

  • Configure a GLM model with input: ["text", "image"] as agents.defaults.imageModel
  • Call the image tool with any image
  • The image tool should no longer raise the "No media-understanding provider registered for glm" error

Extra Tips

  • Make sure to test the fix with different model configurations to ensure that the media-understanding capability is registered correctly in all cases.
  • Consider adding additional logging or debugging statements to help diagnose any future issues related to capability registration.
