hermes: vision_analyze fails with kimi-coding provider: native vision fast path blocked [1 pull request]

Error Message

<pre>Error analyzing image: No LLM provider configured for task=vision provider=auto. Run: hermes setup</pre>

Root Cause

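Two issues chain together: the <code>kimi-coding</code> provider is missing from the native vision allow-list, and the configured model name <code>kimi-for-coding</code> cannot be resolved in models.dev.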

Fix Action

Fixed

Bug Description

When using the <code>kimi-coding</code> provider with the model <code>kimi-for-coding</code> (Kimi K2.6), the <code>vision_analyze</code> tool fails with:

<pre>Error analyzing image: No LLM provider configured for task=vision provider=auto. Run: hermes setup</pre>

This happens even though Kimi K2.6 supports multimodal/vision natively.

Root Cause Analysis

Two issues chain together:

1. Provider not in native vision allow-list

<code>_supports_media_in_tool_results()</code> in <code>tools/vision_tools.py</code> does not include <code>kimi-coding</code>, so the native fast path is skipped.

2. Config model name not resolvable in models.dev

The config uses the friendly name <code>kimi-for-coding</code>, but the models.dev cache stores the canonical ID <code>k2p6</code>. <code>_find_model_entry()</code> cannot bridge this gap, so <code>get_model_capabilities()</code> returns <code>None</code>.

This causes <code>decide_image_input_mode()</code> to fall back to <code>'text'</code> mode, which triggers the legacy auxiliary-LLM path. Since no auxiliary vision provider is configured, it fails.
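
The failure chain can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Hermes code: the function signatures, the <code>'native'</code> mode name, the <code>vision</code> capability key, and the use of <code>RuntimeError</code> are all hypothetical. Only the <code>'text'</code> fallback and the error message come from this report.

```python
def decide_image_input_mode(provider_allowed, capabilities):
    """Pick 'native' only when both checks pass; otherwise fall back to 'text'."""
    if not provider_allowed:
        return "text"
    if capabilities is None or not capabilities.get("vision"):
        return "text"
    return "native"


def analyze(mode, aux_provider=None):
    """Route the image according to the chosen mode."""
    if mode == "native":
        return "image passed to main model"
    # Legacy path: needs a separately configured auxiliary vision provider.
    if aux_provider is None:
        raise RuntimeError(
            "No LLM provider configured for task=vision provider=auto. "
            "Run: hermes setup"
        )
    return f"image described by {aux_provider}"


# Both root causes present: provider not allowed, capabilities unresolved.
mode = decide_image_input_mode(provider_allowed=False, capabilities=None)
try:
    analyze(mode)
    failed = False
except RuntimeError:
    failed = True
```

In this sketch either root cause alone is enough to force <code>'text'</code> mode, which is why both parts of the fix are needed.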

Reproduction Steps

  1. Set config: <code>model.default: kimi-for-coding</code>, <code>model.provider: kimi-coding</code>
  2. Send an image to the agent
  3. Agent calls <code>vision_analyze</code>
  4. Error: <code>No LLM provider configured for task=vision</code>

Expected Behavior

<code>vision_analyze</code> should use the native fast path, returning a multimodal tool-result envelope with the base64-encoded image, allowing the main model (Kimi K2.6) to see the pixels directly.
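
As a rough illustration of what such an envelope might look like, the sketch below base64-encodes image bytes into a JSON-serializable structure. The envelope shape and every field name (<code>type</code>, <code>content</code>, <code>mime_type</code>, <code>data</code>) are assumptions; the actual format Hermes uses is not shown in this report.

```python
import base64
import json


def build_image_tool_result(image_bytes, mime_type="image/png"):
    """Wrap raw image bytes in a hypothetical multimodal tool-result envelope."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "tool_result",
        "content": [
            {"type": "image", "mime_type": mime_type, "data": encoded},
        ],
    }


# The 8-byte PNG signature stands in for a real image file here.
envelope = build_image_tool_result(b"\x89PNG\r\n\x1a\n")
payload = json.dumps(envelope)  # the envelope is JSON-serializable
```

Because the image travels inside the tool result itself, no auxiliary vision provider is involved on this path.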

Environment

  • Provider: kimi-coding
  • Model: kimi-for-coding (Kimi K2.6)
  • Hermes version: main @ e85592591

Proposed Fix

  1. Add <code>kimi-coding</code>, <code>kimi-coding-cn</code>, <code>kimi</code>, <code>moonshot</code> to <code>_supports_media_in_tool_results()</code>
  2. Add <code>_MODEL_ALIASES</code> mapping in <code>_find_model_entry()</code> to bridge friendly config names to canonical models.dev IDs

I have a branch ready with the fix.
