Fix / Workaround

OpenClaw should inspect the input capability array of the resolved primary model before dispatching. If the message contains attachment types the primary model does not declare support for, it should automatically re-route that single turn to a configured multimodal fallback, then resume the primary model on subsequent text-only turns.

[Feature] Add MiMo-V2.5 to Xiaomi catalog + automatic multimodal routing when DeepSeek V4-Pro is primary model

Summary

Two related asks in one issue:

1. Add xiaomi/mimo-v2.5 to the built-in Xiaomi provider catalog. It was released on April 22, 2026 and supersedes mimo-v2-omni, which it outperforms on the multimodal benchmarks cited below.
2. Auto-route to a multimodal model when the primary model is text-only and the incoming message contains image, video, or audio attachments, so users running DeepSeek V4-Pro as their default never have to switch models manually.

Why MiMo-V2.5 instead of MiMo-V2-Omni

Property	`mimo-v2-omni`	`mimo-v2.5`
Parameters	—	310B total / 15B active (sparse MoE)
Modalities	text, image	text, image, video, audio
Context window	262,144	1,048,576
Max output	32,000	131,072
Reasoning	✅	✅
Input price	—	$0.40 / 1M tokens
Output price	—	$2.00 / 1M tokens
API model id	`mimo-v2-omni`	`mimo-v2.5`
Base URL	`https://api.xiaomimimo.com/v1`	same

MiMo-V2.5 trains its image, video, and audio encoders jointly from the start rather than bolting them on as adapters. It also beats MiMo-V2-Omni on Video-MME (87.7) and MMMU-Pro (77.9), and runs at roughly half the cost of MiMo-V2-Pro.

Requested catalog entry

Add to src/providers/xiaomi/catalog.ts (or equivalent):

{
 id: "mimo-v2.5",
 name: "Xiaomi MiMo V2.5",
 reasoning: true,
 input: ["text", "image", "video", "audio"],
 cost: { input: 0.40, output: 2.00, cacheRead: 0.04, cacheWrite: 0 },
 contextWindow: 1_048_576,
 maxTokens: 131_072,
},

The existing mimo-v2-omni entry should be kept for backward compatibility but marked as superseded in the docs.
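
To make the pricing fields in the requested entry concrete, here is a minimal TypeScript sketch of a per-request cost estimate. It assumes the `cost` values are USD per 1M tokens (consistent with the $0.40 / $2.00 figures in the comparison table) and that cached prompt tokens are billed at the `cacheRead` rate instead of the `input` rate. The `ModelCost` type and `estimateCostUsd` helper are hypothetical names for illustration, not part of OpenClaw.

```typescript
// Hypothetical shape mirroring the `cost` object in the requested catalog entry.
// Assumption: every value is USD per 1,000,000 tokens.
interface ModelCost {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

const mimoV25Cost: ModelCost = { input: 0.40, output: 2.00, cacheRead: 0.04, cacheWrite: 0 };

// Estimate the cost of one request. `cachedInputTokens` is the part of the
// prompt assumed to be served from cache and billed at the cacheRead rate.
function estimateCostUsd(
  cost: ModelCost,
  inputTokens: number,
  outputTokens: number,
  cachedInputTokens = 0,
): number {
  const freshInputTokens = inputTokens - cachedInputTokens;
  return (
    (freshInputTokens * cost.input +
      cachedInputTokens * cost.cacheRead +
      outputTokens * cost.output) /
    1_000_000
  );
}

// 100k prompt tokens, 2k output tokens, nothing cached.
console.log(estimateCostUsd(mimoV25Cost, 100_000, 2_000).toFixed(4)); // prints "0.0440"
```

Under these assumptions, a 100k-token multimodal prompt with a 2k-token reply costs about $0.044, which is the figure a re-routed turn would add on top of the primary model's usage.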

Requested behavior: automatic multimodal routing

Problem

When a user sets deepseek/deepseek-v4-pro (or any text-only model) as primary:

// openclaw.config.json
{
  "agents": {
    "defaults": {
      "model": { "primary": "deepseek/deepseek-v4-pro" }
    }
  }
}

…and then attaches an image, video, or audio file to a message, OpenClaw currently either errors out or strips the attachment silently.

Proposed behavior

Suggested config surface:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "deepseek/deepseek-v4-pro",
        "multimodalFallback": "xiaomi/mimo-v2.5"
      }
    }
  }
}

Routing rules

Primary model supports	Message contains	Action
`["text"]`	text only	dispatch to primary ✅
`["text"]`	image / video / audio	re-route turn to `multimodalFallback`
`["text", "image"]`	image	dispatch to primary ✅
`["text", "image"]`	video or audio	re-route turn to `multimodalFallback`
`["text", "image", "video", "audio"]`	anything	dispatch to primary ✅
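
The routing table above can be sketched as a small pre-dispatch check. This is a minimal TypeScript illustration, not OpenClaw's actual implementation: the `Modality`, `ModelRef`, and `ModelConfig` types and the `selectModelForTurn` function are hypothetical names, and the behavior when no fallback is configured (stay on the primary) is an assumption.

```typescript
// Hypothetical types; OpenClaw's real internals may differ.
type Modality = "text" | "image" | "video" | "audio";

interface ModelRef {
  id: string;        // e.g. "deepseek/deepseek-v4-pro"
  input: Modality[]; // declared input capabilities, as in the catalog entry
}

interface ModelConfig {
  primary: ModelRef;
  multimodalFallback?: ModelRef;
}

// Decide which model handles a single turn, before any request is sent.
function selectModelForTurn(config: ModelConfig, messageModalities: Modality[]): ModelRef {
  const supported = new Set<Modality>(config.primary.input);
  const unsupported = messageModalities.filter((m) => !supported.has(m));

  // The primary declares support for everything in the message: no re-routing.
  if (unsupported.length === 0) return config.primary;

  // Assumption: with no fallback configured, existing behavior is left unchanged.
  if (!config.multimodalFallback) return config.primary;

  // Re-route this turn only; the next turn is evaluated from scratch.
  return config.multimodalFallback;
}

const routingConfig: ModelConfig = {
  primary: { id: "deepseek/deepseek-v4-pro", input: ["text"] },
  multimodalFallback: { id: "xiaomi/mimo-v2.5", input: ["text", "image", "video", "audio"] },
};

console.log(selectModelForTurn(routingConfig, ["text", "video"]).id); // prints "xiaomi/mimo-v2.5"
console.log(selectModelForTurn(routingConfig, ["text"]).id);          // prints "deepseek/deepseek-v4-pro"
```

Because the decision is made per turn and carries no state, resuming on the primary model for the next text-only message falls out of the design without extra bookkeeping.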

The multimodal fallback response should be injected into the conversation history as an assistant message attributed to the primary model alias, so the context is seamless.
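
One way the history injection could look, as a rough TypeScript sketch. The `HistoryMessage` shape and the `appendFallbackResponse` helper are hypothetical, and the `servedBy` field is an extra assumption added here only so the real source of the response stays traceable for debugging.

```typescript
// Hypothetical message shape; field names are illustrative only.
interface HistoryMessage {
  role: "user" | "assistant";
  content: string;
  model?: string;    // alias the message is attributed to in the conversation
  servedBy?: string; // assumption: the model that actually produced the text
}

// Append the fallback's response as an assistant message attributed to the
// primary model alias. Returns a new array and leaves the input untouched.
function appendFallbackResponse(
  history: HistoryMessage[],
  responseText: string,
  primaryAlias: string,
  fallbackId: string,
): HistoryMessage[] {
  return [
    ...history,
    { role: "assistant", content: responseText, model: primaryAlias, servedBy: fallbackId },
  ];
}

const before: HistoryMessage[] = [{ role: "user", content: "What happens in this clip?" }];
const after = appendFallbackResponse(
  before,
  "The clip shows a product demo.",
  "deepseek/deepseek-v4-pro",
  "xiaomi/mimo-v2.5",
);
console.log(after[1].model); // prints "deepseek/deepseek-v4-pro"
```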

Why not just use `failover`?

Failover fires only after a request has errored. Multimodal routing needs to fire before the request is sent, based on inspecting the input types, which is a different code path.

Minimal reproducible config (for testing)

{
  "env": {
    "DEEPSEEK_API_KEY": "sk-...",
    "XIAOMI_API_KEY": "xm-..."
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "deepseek/deepseek-v4-pro",
        "multimodalFallback": "xiaomi/mimo-v2.5"
      }
    }
  },
  "models": {
    "mode": "merge",
    "providers": {
      "deepseek": {
        "baseUrl": "https://api.deepseek.com/v1",
        "api": "openai-completions",
        "apiKey": "***"
      },
      "xiaomi": {
        "baseUrl": "https://api.xiaomimimo.com/v1",
        "api": "openai-completions",
        "apiKey": "***",
        "models": [
          {
            "id": "mimo-v2.5",
            "name": "Xiaomi MiMo V2.5",
            "reasoning": true,
            "input": ["text", "image", "video", "audio"],
            "cost": { "input": 0.40, "output": 2.00, "cacheRead": 0.04, "cacheWrite": 0 },
            "contextWindow": 1048576,
            "maxTokens": 131072
          }
        ]
      }
    }
  }
}
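
For testing against this config, the router first has to resolve a `provider/model-id` reference to its declared `input` array. The TypeScript sketch below shows one possible lookup over the user-supplied `providers` block. `parseModelRef` and `declaredInputs` are hypothetical helpers, and treating a missing entry as "defer to the built-in catalog" is an assumption based on `"mode": "merge"`.

```typescript
type Modality = "text" | "image" | "video" | "audio";

// Only the fields this sketch needs from a provider block.
interface ProviderConfig {
  models?: { id: string; input: Modality[] }[];
}

// Split "provider/model-id" at the first slash.
function parseModelRef(ref: string): { provider: string; modelId: string } {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`Invalid model reference: ${ref}`);
  }
  return { provider: ref.slice(0, slash), modelId: ref.slice(slash + 1) };
}

// Return the declared inputs for a reference, or undefined when the model is
// not listed under the user-supplied providers (assumed to come from the
// built-in catalog in that case).
function declaredInputs(
  providers: Record<string, ProviderConfig>,
  ref: string,
): Modality[] | undefined {
  const { provider, modelId } = parseModelRef(ref);
  return providers[provider]?.models?.find((m) => m.id === modelId)?.input;
}

// Mirrors the "providers" block of the config above, reduced to model lists.
const providers: Record<string, ProviderConfig> = {
  deepseek: {},
  xiaomi: { models: [{ id: "mimo-v2.5", input: ["text", "image", "video", "audio"] }] },
};

console.log(declaredInputs(providers, "xiaomi/mimo-v2.5"));
console.log(declaredInputs(providers, "deepseek/deepseek-v4-pro")); // prints undefined
```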

Expected behavior when a user sends a message with a video attached:

1. Turn is dispatched to xiaomi/mimo-v2.5
2. Response is returned normally in the same conversation thread
3. Next text-only message resumes on deepseek/deepseek-v4-pro

References

MiMo-V2.5 official release: https://mimo.xiaomi.com/mimo-v2-5/
MiMo-V2.5 on HuggingFace: https://huggingface.co/XiaomiMiMo/MiMo-V2.5
MiMo-V2.5 on OpenRouter: https://openrouter.ai/xiaomi/mimo-v2.5
DeepSeek V4 API docs: https://api-docs.deepseek.com/news/news260424
Existing Xiaomi provider docs: https://docs.openclaw.ai/providers/xiaomi
Related issue: #54367 (Native Xiaomi MiMo Ecosystem Integration)

Checklist

Add mimo-v2.5 to built-in Xiaomi catalog (input: ["text","image","video","audio"])
Add mimo-v2.5 to onboarding model selection UI
Implement multimodalFallback key in agent model config
Pre-dispatch input-type inspection (not error-triggered failover)
Inject fallback response into history under primary model alias
Update https://docs.openclaw.ai/providers/xiaomi with V2.5 catalog row
Add migration note: V2.5 supersedes V2-Omni for new installs
