hermes - ✅ (Solved) fix: remove kimi-coding from _PROVIDERS_WITHOUT_VISION (vision now supported upstream) [1 pull request, 2 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#18990 · Fetched 2026-05-03 04:53:04
Comments: 2 · Participants: 2 · Timeline events: 8 · Reactions: 0
Timeline (top): labeled ×5, commented ×2, cross-referenced ×1

PR #17451 (commit ff687c0, merged ~3 days ago) added kimi-coding to _PROVIDERS_WITHOUT_VISION in agent/auxiliary_client.py based on Kimi documentation stating that api.kimi.com/coding does not accept image input (issue #17076).

Direct API testing shows this is no longer true. Both Kimi Coding endpoints (/v1/messages and /v1/chat/completions) now successfully process image input. The hardcoded deny-list introduced by #17451 now causes a regression: users with kimi-coding as their main provider and no fallback vision backend get:

No LLM provider configured for task=vision provider=auto. Run: hermes setup

instead of functional native vision.

PR fix notes

PR #18999: fix(vision): remove kimi-coding from _PROVIDERS_WITHOUT_VISION deny-list

Description (problem / solution / changelog)

Summary

Removes kimi-coding and kimi-coding-cn from _PROVIDERS_WITHOUT_VISION since Kimi Coding now supports vision upstream.

Closes #18990

Problem

PR #17451 added kimi-coding and kimi-coding-cn to _PROVIDERS_WITHOUT_VISION because Kimi's documentation at the time described api.kimi.com/coding as having no image-input capability (issue #17076). Kimi has since added vision support: the Messages API (/v1/messages) accepts content blocks of type: "image" (Anthropic wire format), and the Chat Completions API (/v1/chat/completions) accepts image_url blocks (OpenAI wire format).

This deny-list entry now blocks legitimate vision requests for Kimi Coding users: resolve_vision_provider_client() hits the guard, skips kimi-coding, falls through to unavailable aggregators, and returns None.

Changes

  1. agent/auxiliary_client.py: Emptied _PROVIDERS_WITHOUT_VISION frozenset (was {"kimi-coding", "kimi-coding-cn"}, now frozenset()). Removed the elif main_provider in _PROVIDERS_WITHOUT_VISION: branch (13 lines) from resolve_vision_provider_client() — dead code now that the set is empty. Updated comment block to reference #18990.

  2. tests/agent/test_auxiliary_client.py: Updated test class names, removed two tests that verified kimi-coding was skipped (no longer applicable), kept the explicit-override test, updated skip-set assertion to verify frozenset().

Verification

  • No other references to kimi-coding vision blocking exist in the codebase
  • The else branch in resolve_vision_provider_client() now handles kimi-coding normally — calling resolve_provider_client() which works correctly for vision-capable providers

Changed files

  • agent/auxiliary_client.py (modified, +6/-23)
  • tests/agent/test_auxiliary_client.py (modified, +10/-80)

Draft: GitHub Issue / PR — Re-enable vision for Kimi Coding provider

Background: what #17451 fixed and why it's now stale

Timeline | Event
~4 days ago | Issue #17076 opened: user reports 404 on vision with kimi-coding
~3 days ago | PR #17451 merged (ff687c0): adds kimi-coding to _PROVIDERS_WITHOUT_VISION
2026-05-03 | Direct curl testing confirms both endpoints accept images (see below)

The #17451 fix was correct at the time based on Kimi's documented behavior. However, Kimi has since enabled multimodal input on the /coding endpoint. The deny-list now blocks a working provider instead of a broken one.

Affected users

Anyone using kimi-coding as their only configured provider with auxiliary.vision.provider: auto (default). Users who configured OpenRouter/Anthropic/Nous as fallback are unaffected.

Steps to reproduce

  1. Configure Hermes with only kimi-coding provider (no OpenRouter, Anthropic, Nous).
  2. Ensure auxiliary.vision.provider is auto (default).
  3. Enable vision toolset: hermes tools enable vision
  4. Start fresh session and attach an image:
    /image ~/some-photo.jpg
  5. Observe: No LLM provider configured for task=vision provider=auto

Expected behavior

auto resolution should route vision through kimi-coding when it is the main provider and the API accepts image input.

Actual behavior

resolve_vision_provider_client() hits the _PROVIDERS_WITHOUT_VISION guard introduced by #17451, skips kimi-coding, falls through to unavailable aggregators, and returns None.
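
The fall-through described above can be illustrated with a minimal sketch. This is not the Hermes implementation: pick_vision_provider is a hypothetical, simplified stand-in for resolve_vision_provider_client() that returns a provider name rather than a client, and the "openrouter" identifier is assumed purely for illustration.

```python
from typing import FrozenSet, Optional, Sequence

# Deny-list as it stood after #17451, and as proposed by this fix.
DENY_LIST_BEFORE: FrozenSet[str] = frozenset({"kimi-coding", "kimi-coding-cn"})
DENY_LIST_AFTER: FrozenSet[str] = frozenset()


def pick_vision_provider(
    main_provider: str,
    deny_list: FrozenSet[str],
    available_aggregators: Sequence[str] = (),
) -> Optional[str]:
    """Hypothetical stand-in for resolve_vision_provider_client().

    Returns the name of the provider that would serve vision, or None
    when nothing is available.
    """
    if main_provider in deny_list:
        # Guard: skip the main provider and fall through to the aggregators.
        return available_aggregators[0] if available_aggregators else None
    # No guard hit: the main provider handles vision itself.
    return main_provider


if __name__ == "__main__":
    # Only kimi-coding configured, stale deny-list: nothing resolves.
    print(pick_vision_provider("kimi-coding", DENY_LIST_BEFORE))
    # A fallback aggregator (assumed name) hides the problem.
    print(pick_vision_provider("kimi-coding", DENY_LIST_BEFORE, ["openrouter"]))
    # Empty deny-list: kimi-coding serves vision natively.
    print(pick_vision_provider("kimi-coding", DENY_LIST_AFTER))
```

Under these assumptions, the sketch also shows why users with a fallback backend never notice the regression: the guard silently reroutes them instead of returning None.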

Verification via direct API calls

Test 1 — Messages API (Anthropic wire, /v1/messages)

curl -H "Authorization: Bearer $KIMI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "kimi-for-coding",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "iVBORw0KGgo..."}},
        {"type": "text", "text": "What color is this?"}
      ]
    }],
    "max_tokens": 100
  }' \
  https://api.kimi.com/coding/v1/messages

Result: HTTP 200, correct image analysis returned.

Test 2 — Chat Completions (OpenAI wire, /v1/chat/completions)

curl -H "Authorization: Bearer $KIMI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "User-Agent: claude-code/0.1.0" \
  -d '{
    "model": "kimi-for-coding",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,..."}},
        {"type": "text", "text": "What color is this?"}
      ]
    }],
    "max_tokens": 100
  }' \
  https://api.kimi.com/coding/v1/chat/completions

Result: HTTP 200, image analysis + reasoning_content returned.

Note: Hermes auxiliary client already injects User-Agent: claude-code/0.1.0 for api.kimi.com endpoints (agent/auxiliary_client.py:1108–1109 and :1921–1922), so the Chat Completions path is already correctly configured.
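
For readers who prefer Python over curl, the two request bodies above can be sketched as follows. The helper names (messages_payload, chat_completions_payload, build_chat_request) are hypothetical and not part of Hermes; the field layout, endpoint URL, and headers are copied from the two curl tests. The request is only constructed here, not sent.

```python
import base64
import json
import urllib.request
from typing import Any, Dict

MODEL = "kimi-for-coding"  # model name used in both curl tests


def messages_payload(image: bytes, media_type: str, prompt: str) -> Dict[str, Any]:
    """Anthropic wire format (/v1/messages): an `image` block with a base64 `source`."""
    data = base64.b64encode(image).decode("ascii")
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image",
                 "source": {"type": "base64", "media_type": media_type, "data": data}},
                {"type": "text", "text": prompt},
            ],
        }],
        "max_tokens": 100,
    }


def chat_completions_payload(image: bytes, media_type: str, prompt: str) -> Dict[str, Any]:
    """OpenAI wire format (/v1/chat/completions): an `image_url` block with a data URL."""
    data = base64.b64encode(image).decode("ascii")
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": f"data:{media_type};base64,{data}"}},
                {"type": "text", "text": prompt},
            ],
        }],
        "max_tokens": 100,
    }


def build_chat_request(api_key: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) the equivalent of the Test 2 curl call."""
    return urllib.request.Request(
        "https://api.kimi.com/coding/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "User-Agent": "claude-code/0.1.0",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the actual call. The main point of the sketch is the difference between the two image blocks: a source object on the Messages path versus a data URL on the Chat Completions path.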

Proposed fix

Remove kimi-coding and kimi-coding-cn from _PROVIDERS_WITHOUT_VISION and update the comment to reflect current API reality.

Patch

--- a/agent/auxiliary_client.py
+++ b/agent/auxiliary_client.py
@@ -250,13 +250,12 @@ _PROVIDER_VISION_MODELS: Dict[str, str] = {
 # it must skip straight to the aggregator chain instead of returning a client
 # that will 404 on every vision request.
 #
-# kimi-coding / kimi-coding-cn: the Kimi Coding Plan routes through
-# api.kimi.com/coding (Anthropic Messages wire) which Kimi's own docs
-# describe as having no image_in capability. Vision lives on the separate
-# Kimi Platform (api.moonshot.ai, OpenAI-wire, pay-as-you-go).  See #17076.
+# NOTE: Previously included kimi-coding / kimi-coding-cn based on outdated
+# documentation. Direct API testing (2026-05-03) confirms both the Kimi Coding
+# Messages API (/v1/messages) and Chat Completions (/v1/chat/completions)
+# endpoints accept image input with proper User-Agent. The auxiliary client
+# already sends `User-Agent: claude-code/0.1.0` for api.kimi.com requests.
 _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
-    "kimi-coding",
-    "kimi-coding-cn",
 })

Risk assessment

  • Zero regression risk for non-Kimi users — only removes entries from a static deny-list; no other code paths touched.
  • No risk for Kimi users — verified live that the API accepts multimodal input; the auxiliary client already sets the required User-Agent header.
  • Reverts an outdated guard from #17451 — restores the pre-#17451 behavior, which is now correct again because the upstream API has changed.

Related

  • Issue #17076 — original bug report (404 on vision with kimi-coding)
  • PR #17451 (ff687c0) — added the deny-list (correct at the time, now stale)

Suggested issue title

fix: remove kimi-coding from _PROVIDERS_WITHOUT_VISION — vision now supported upstream

Suggested labels

type/bug, comp/cli, provider/kimi, regression

Checklist for PR

  • Patch applied to agent/auxiliary_client.py
  • No new tests required (removal from static deny-list)
  • Existing vision auto-detection tests pass
  • Verified with live kimi-for-coding API key (both Messages and Chat Completions endpoints)

extent analysis

TL;DR

Remove kimi-coding and kimi-coding-cn from _PROVIDERS_WITHOUT_VISION in agent/auxiliary_client.py to fix the regression caused by the outdated deny-list.

Guidance

  • Verify that the Kimi Coding API endpoints (/v1/messages and /v1/chat/completions) accept image input by running the provided curl tests.
  • Remove the kimi-coding and kimi-coding-cn entries from the _PROVIDERS_WITHOUT_VISION set in agent/auxiliary_client.py.
  • Update the comment above the _PROVIDERS_WITHOUT_VISION set to reflect the current API reality.
  • Test the fix by configuring Hermes with only kimi-coding as the provider and verifying that vision works as expected.

Example

The provided patch shows the exact changes needed to agent/auxiliary_client.py:

--- a/agent/auxiliary_client.py
+++ b/agent/auxiliary_client.py
@@ -250,13 +250,12 @@ _PROVIDER_VISION_MODELS: Dict[str, str] = {
 ...
 _PROVIDERS_WITHOUT_VISION: frozenset = frozenset({
-    "kimi-coding",
-    "kimi-coding-cn",
 })

Notes

This fix assumes that the Kimi Coding API will continue to accept image input. If the API changes again, the deny-list may need to be updated accordingly.

Recommendation

Apply the fix: remove kimi-coding and kimi-coding-cn from _PROVIDERS_WITHOUT_VISION. The Kimi Coding API now accepts multimodal input, and the auxiliary client already sets the required User-Agent header.
