hermes - ✅(Solved) Fix [models.dev] mimo-v2.5-pro incorrectly marked as attachment:true — mimo-2.5-pro is text-only, mimo-2.5 is the omnimodal model [1 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#18884 (fetched 2026-05-03 04:53:44)
Comments: 0
Participants: 1
Timeline: 5
Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×1

Fix Action

Workaround

Users can set agent.image_input_mode: text in config.yaml to force the text pipeline (vision_analyze pre-analysis) instead of native image attachment.

PR fix notes

PR #18889: fix(models-dev): let modalities.input take precedence over attachment flag

Description (problem / solution / changelog)

Problem

The models.dev data marks mimo-v2.5-pro on xiaomi-token-plan-cn with attachment: true, but the model's modalities.input only contains ["text"]. mimo-v2.5-pro is a text-only model — the actual omnimodal model is mimo-2.5.

The current code in get_model_capabilities() uses OR logic:

supports_vision = bool(entry.get("attachment", False)) or "image" in input_mods

This means attachment: true overrides the explicit modalities.input, causing image_routing.py to select native mode and send base64 images to a text-only model that silently drops them.

Fix

When modalities.input is explicitly provided and non-empty, it takes precedence over the attachment flag:

if input_mods:
    supports_vision = "image" in input_mods
else:
    supports_vision = bool(entry.get("attachment", False))

Applied to both get_model_capabilities() and ModelEntry.supports_vision().
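The difference between the two rules can be seen on the entry from #18884. The sketch below is a minimal, self-contained illustration rather than the repository's code: the helper names `supports_vision_or` and `supports_vision_precedence` are hypothetical, and it assumes `input_mods` is read from `entry["modalities"]["input"]`, which the PR excerpt does not show.

```python
from typing import Any


def _input_mods(entry: dict[str, Any]) -> list[str]:
    # Assumption: input_mods comes from entry["modalities"]["input"].
    return list((entry.get("modalities") or {}).get("input") or [])


def supports_vision_or(entry: dict[str, Any]) -> bool:
    """Old rule: attachment flag OR-ed with the modalities list."""
    input_mods = _input_mods(entry)
    return bool(entry.get("attachment", False)) or "image" in input_mods


def supports_vision_precedence(entry: dict[str, Any]) -> bool:
    """New rule: a non-empty modalities.input wins over attachment."""
    input_mods = _input_mods(entry)
    if input_mods:
        return "image" in input_mods
    return bool(entry.get("attachment", False))


# Entry as reported for xiaomi-token-plan-cn / mimo-v2.5-pro.
token_plan_entry = {"attachment": True, "modalities": {"input": ["text"]}}

print(supports_vision_or(token_plan_entry))          # True  (the bug)
print(supports_vision_precedence(token_plan_entry))  # False (after the fix)
```

Falling back to `attachment` only when `modalities.input` is missing or empty keeps older entries without a modalities block working as before.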

Test

Added test_modalities_input_text_only_overrides_attachment covering the exact scenario from #18884.
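The actual test body is not reproduced on this page. As a rough idea of what such a test might check, here is a table-driven sketch; `supports_vision` is a hypothetical stand-in for the patched logic, and the extra cases beyond the #18884 scenario are assumptions about sensible edge cases, not cases known to be in the PR.

```python
def supports_vision(entry: dict) -> bool:
    # Hypothetical stand-in for the patched logic; not the repository's function.
    input_mods = (entry.get("modalities") or {}).get("input") or []
    if input_mods:
        return "image" in input_mods
    return bool(entry.get("attachment", False))


CASES = [
    # (entry, expected supports_vision)
    ({"attachment": True, "modalities": {"input": ["text"]}}, False),  # scenario from #18884
    ({"attachment": False, "modalities": {"input": ["text", "image"]}}, True),
    ({"attachment": True}, True),                                # no modalities: fall back
    ({"attachment": True, "modalities": {"input": []}}, True),   # empty list: fall back
    ({}, False),                                                 # nothing declared
]


def test_modalities_input_precedence() -> None:
    for entry, expected in CASES:
        assert supports_vision(entry) is expected, entry


test_modalities_input_precedence()
```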

Fixes

Fixes #18884

Changed files

  • agent/models_dev.py (modified, +14/-4)
  • tests/agent/test_models_dev.py (modified, +23/-0)

Code Example

xiaomi-token-plan-cn / mimo-v2.5-pro:

"mimo-v2.5-pro": {
  "attachment": true,
  "modalities": {
    "input": ["text"]
  }
}

---

xiaomi (main API) / mimo-v2.5-pro:

"mimo-v2.5-pro": {
  "attachment": true,
  "modalities": {
    "input": ["text", "image", "audio", "video", "pdf"]
  }
}

Problem

The models.dev data marks mimo-v2.5-pro with attachment: true on the xiaomi-token-plan-cn provider, but mimo-2.5-pro is a text-only model. The actual omnimodal model is mimo-2.5 (supports image/audio/video/pdf).

This causes image_routing.py to choose "native" mode for mimo-v2.5-pro, sending base64 images directly to the model. The model silently drops the image data since it does not support multimodal input, and the user receives a response with no image understanding.

Evidence

Fresh data from https://models.dev/api.json:

xiaomi-token-plan-cn / mimo-v2.5-pro:

"mimo-v2.5-pro": {
  "attachment": true,
  "modalities": {
    "input": ["text"]
  }
}

xiaomi-token-plan-cn / mimo-v2.5 (if listed — this is the actual multimodal model): The attachment: true flag should be on mimo-2.5, not mimo-2.5-pro.

xiaomi (main API) / mimo-v2.5-pro:

"mimo-v2.5-pro": {
  "attachment": true,
  "modalities": {
    "input": ["text", "image", "audio", "video", "pdf"]
  }
}

The main xiaomi API entry lists full modalities for mimo-v2.5-pro, which may be correct for that endpoint. But the xiaomi-token-plan-cn entry only has ["text"] in modalities.input yet still has attachment: true.
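The inconsistency described above (attachment: true next to a modalities.input list without "image") can be detected mechanically. The following is a small sketch under stated assumptions: the function name `find_attachment_mismatches` is hypothetical, the catalog is assumed to be shaped as provider → "models" → model id → entry, and the sample data is limited to the two entries quoted in this issue rather than fetched from the live API.

```python
from typing import Any, Iterator


def find_attachment_mismatches(catalog: dict[str, Any]) -> Iterator[tuple[str, str]]:
    """Yield (provider, model) pairs whose attachment flag contradicts modalities.input."""
    for provider_id, provider in catalog.items():
        # Assumption: each provider holds its models under a "models" mapping.
        for model_id, entry in (provider.get("models") or {}).items():
            input_mods = (entry.get("modalities") or {}).get("input") or []
            if entry.get("attachment") and input_mods and "image" not in input_mods:
                yield provider_id, model_id


# The two entries quoted in the issue, in the assumed catalog shape.
sample = {
    "xiaomi-token-plan-cn": {
        "models": {
            "mimo-v2.5-pro": {"attachment": True, "modalities": {"input": ["text"]}}
        }
    },
    "xiaomi": {
        "models": {
            "mimo-v2.5-pro": {
                "attachment": True,
                "modalities": {"input": ["text", "image", "audio", "video", "pdf"]},
            }
        }
    },
}

print(list(find_attachment_mismatches(sample)))
# [('xiaomi-token-plan-cn', 'mimo-v2.5-pro')]
```

Only the Token Plan entry is flagged; the main xiaomi entry is internally consistent, whether or not its modalities are correct for that endpoint.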

Model clarification

Model        | Type      | Vision                     | Tool calling | Reasoning
mimo-2.5-pro | Text-only |                            |              |
mimo-2.5     | Omnimodal | ✅ (image/audio/video/pdf) |              |

The Token Plan API (token-plan-cn.xiaomimimo.com) provides access to both models, but they have different capabilities. The current data incorrectly treats mimo-2.5-pro as multimodal.

Impact

In agent/image_routing.py, decide_image_input_mode() checks supports_vision, which get_model_capabilities() derives from the models.dev entry (the attachment flag OR-ed with modalities.input). With attachment: true on mimo-2.5-pro, the router selects "native" mode and sends images as base64 content parts. The model ignores the image data, so the image is silently lost and the user gets a response with no image understanding.
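To make the failure path concrete, here is a simplified sketch of how a capability flag and the config override might combine into a routing decision. It is not the real decide_image_input_mode(): the function name `pick_image_input_mode`, its signature, and the treatment of an unset config value are all assumptions for illustration.

```python
from typing import Optional


def pick_image_input_mode(supports_vision: bool, configured_mode: Optional[str] = None) -> str:
    """Hypothetical router: return "native" or "text"."""
    if configured_mode == "text":
        # Workaround path: agent.image_input_mode: text forces the text pipeline.
        return "text"
    # Otherwise trust the capability flag resolved from models.dev data.
    return "native" if supports_vision else "text"


# Wrong capability data (supports_vision=True for a text-only model) leads to native mode.
print(pick_image_input_mode(supports_vision=True))                          # native
# Corrected capability data routes through the text pipeline.
print(pick_image_input_mode(supports_vision=False))                         # text
# The config override wins even when the capability flag is wrong.
print(pick_image_input_mode(supports_vision=True, configured_mode="text"))  # text
```

Because the router has no way to tell that the model dropped the image, a wrong capability flag fails silently, which is why the fix targets how supports_vision is computed.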

Suggested fix

  1. Set attachment: false for mimo-2.5-pro on xiaomi-token-plan-cn (and other Token Plan providers)
  2. Ensure mimo-2.5 (if listed) has attachment: true with full modalities
  3. Alternatively, if the main xiaomi API entry also has mimo-2.5-pro incorrectly marked, fix that too — the modalities.input field already correctly shows ["text"] for Token Plan, so attachment should be false to match

Workaround

Users can set agent.image_input_mode: text in config.yaml to force the text pipeline (vision_analyze pre-analysis) instead of native image attachment.

extent analysis

TL;DR

Update the models.dev data to set attachment: false for mimo-v2.5-pro on xiaomi-token-plan-cn to correctly reflect its text-only capabilities.

Guidance

  • Verify the models.dev data for xiaomi-token-plan-cn and other Token Plan providers to ensure mimo-v2.5-pro has attachment: false.
  • Check if mimo-2.5 is listed and has attachment: true with full modalities to support multimodal input.
  • Consider updating the main xiaomi API entry for mimo-v2.5-pro to reflect its correct capabilities, if necessary.
  • As a temporary workaround, users can set agent.image_input_mode: text in config.yaml to force the text pipeline.

Example

No code snippet is needed for the data fix, which only involves updating the models.dev data. The code-side fix in PR #18889 is described under "PR fix notes".

Notes

The suggested fix assumes that the models.dev data is the source of the issue. If the problem persists after updating the data, further investigation may be necessary.

Recommendation

Apply the workaround by setting agent.image_input_mode: text in config.yaml until the models.dev data can be updated, as this will ensure that images are not silently dropped by the model.
