hermes - ✅(Solved) Fix Custom Kimi-compatible endpoint with api_mode=anthropic_messages fails after tool call when thinking is enabled [3 pull requests, 1 comment, 2 participants]

GitHub stats
NousResearch/hermes-agent#17057 · Fetched 2026-04-29 06:37:31
Comments: 1 · Participants: 2 · Timeline events: 8 · Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×3, commented ×1

When Hermes is configured with:

model:
  default: kimi-2.6
  provider: custom
  base_url: http://<custom-endpoint>
  api_mode: anthropic_messages

agent:
  reasoning_effort: medium

the next request after a tool call can fail with:

HTTP 400: thinking is enabled but reasoning_content is missing in assistant tool call message at index 2

Error Message

Hermes sends Anthropic-style thinking, but prior assistant tool-call history does not include reasoning_content, so the upstream endpoint rejects the request.

Root Cause

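Hermes already omits the Anthropic thinking parameter for Kimi's official /coding endpoint, because Kimi validates the message history when thinking is enabled and requires every assistant tool-call message to carry OpenAI-style reasoning_content, which the Anthropic path never populates.

That guard in agent/anthropic_adapter.py only matched the literal api.kimi.com/coding URL. A proxied or self-hosted Kimi deployment with a different base_url therefore still received thinking and returned the 400 above on the turn after any tool call.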

Fix Action

Fixed

PR fix notes

PR #17065: fix(anthropic): suppress thinking on Kimi-compatible custom endpoints (#17057)

Description (problem / solution / changelog)

Summary

`build_anthropic_kwargs` skips Anthropic `thinking` for Kimi's official `/coding` endpoint via `_is_kimi_coding_endpoint(base_url)`, because Kimi validates the message history when thinking is enabled and requires every assistant tool-call to carry OpenAI-style `reasoning_content` — which the Anthropic path never populates. Without that suppression, the turn after any tool call fails with:

HTTP 400: thinking is enabled but reasoning_content is missing in assistant tool call message at index N

That guard only matched the literal `api.kimi.com/coding` URL. Users running Hermes against a proxied or self-hosted Kimi deployment (same wire protocol, different `base_url`) hit the same 400 and are forced to disable `reasoning_effort` as a workaround. Reported in #17057.
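
The gap described above can be illustrated with a minimal sketch. This is not the code from `agent/anthropic_adapter.py`: the function bodies, the name `build_kwargs_sketch`, and the `budget_tokens` value (borrowed from the request shape in the issue) are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def is_kimi_coding_endpoint(base_url: Optional[str]) -> bool:
    """Hypothetical stand-in for the URL-only guard described above."""
    return bool(base_url) and "api.kimi.com/coding" in base_url.lower()


def build_kwargs_sketch(base_url: Optional[str], model: str,
                        reasoning_enabled: bool) -> Dict[str, Any]:
    """Assemble request kwargs, attaching `thinking` unless the guard fires."""
    kwargs: Dict[str, Any] = {"model": model}
    if reasoning_enabled and not is_kimi_coding_endpoint(base_url):
        # Illustrative budget only; the real value depends on reasoning_effort.
        kwargs["thinking"] = {"type": "enabled", "budget_tokens": 8000}
    return kwargs


official = build_kwargs_sketch("https://api.kimi.com/coding", "kimi-2.6", True)
proxied = build_kwargs_sketch("http://kimi-proxy.local", "kimi-2.6", True)
print("thinking" in official)  # False: the literal URL matches the guard
print("thinking" in proxied)   # True: same model, but the proxy URL slips past
```

With a URL-only check, the proxied request still carries `thinking`, which is the condition the upstream endpoint rejects after a tool call.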

Fix

Add `_is_kimi_anthropic_compat(base_url, model)` that returns True for either:

  • the official `api.kimi.com/coding` URL (existing behaviour, preserved), or
  • any non-empty `base_url` paired with a Kimi-family model name (`"kimi" in model.lower()`)

Native Anthropic (no `base_url`) is unaffected — both signals are required, so a fictional model name containing "kimi" on a direct Anthropic call still gets thinking. Non-Kimi models on custom endpoints (MiniMax, etc.) keep their thinking parameter.
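
The two conditions listed above can be sketched as a single predicate. This is a reading of the PR description, not the merged implementation; the un-prefixed function name and the substring test on the URL are assumptions.

```python
from typing import Optional


def is_kimi_anthropic_compat(base_url: Optional[str], model: Optional[str]) -> bool:
    """Return True when Anthropic `thinking` should be suppressed (sketch)."""
    url = (base_url or "").lower()
    # Signal 1: the official Kimi /coding endpoint (existing behaviour).
    if "api.kimi.com/coding" in url:
        return True
    # Signal 2: any non-empty base_url paired with a Kimi-family model name.
    return bool(url) and "kimi" in (model or "").lower()


for base_url, model in [
    ("http://kimi-proxy.local", "kimi-2.6"),
    ("http://anthropic-proxy.local", "MiniMax-M2.7"),
    (None, "kimi-2.6"),
]:
    print(base_url, model, is_kimi_anthropic_compat(base_url, model))
```

Requiring a non-empty `base_url` for the model-name signal is what keeps direct Anthropic calls unaffected, as the paragraph above notes.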

Behaviour matrix

| base_url | model | Before | After |
| --- | --- | --- | --- |
| `api.kimi.com/coding` | kimi-* | thinking omitted | thinking omitted (unchanged) |
| `http://kimi-proxy.local` | kimi-2.6 | thinking sent → 400 | thinking omitted ✓ |
| `http://anthropic-proxy.local` | MiniMax-M2.7 | thinking sent | thinking sent (unchanged) |
| `None` (native Anthropic) | claude-sonnet-4-* | thinking sent | thinking sent (unchanged) |
| `api.kimi.com/v1` | kimi-k2.5 | thinking sent (paranoid path) | thinking omitted (now consistent) |

Test plan

  • `tests/agent/test_kimi_coding_anthropic_thinking.py` — 14 tests pass (10 existing + 5 new in `TestKimiCustomEndpointSkipsAnthropicThinking` + 1 retargeted)
  • `tests/agent/test_anthropic_adapter.py` (kimi/thinking subset) — 14 tests pass

The `test_kimi_root_endpoint_unaffected` paranoid test is retargeted to assert the new (consistent) behaviour: if a Kimi-family model reaches `build_anthropic_kwargs`, suppress thinking regardless of whether the URL is `/coding` or `/v1`. The wire failure mode follows the protocol, not the URL path.

Fixes #17057

Changed files

  • agent/anthropic_adapter.py (modified, +35/-1)
  • tests/agent/test_kimi_coding_anthropic_thinking.py (modified, +75/-7)

PR #1: fix: resolve 7 identified issues [automated]

Description (problem / solution / changelog)

Summary

This PR fixes 7 issues identified in the NousResearch/hermes-agent repository.


Issues Fixed

1. #17086 - custom endpoint with api_mode=anthropic_messages failed with 404

File: agent/auxiliary_client.py
Problem: When provider=custom was combined with api_mode=anthropic_messages and a base_url ending in /anthropic, the function _resolve_provider_client() converted the URL to the /v1/messages format, causing a 404 on third-party Anthropic-compatible providers.
Fix: The /anthropic path is kept when api_mode=anthropic_messages and the base_url already ends in /anthropic.


2. #17076 - kimi-coding vision broken (404 on image analysis)

File: agent/auxiliary_client.py
Problem: kimi-coding was not listed in _PROVIDER_VISION_MODELS, so auxiliary.vision.provider: auto could not detect the vision model available for Kimi.
Fix: Added kimi-coding and kimi-coding-cn to the vision model map.


3. #17080 - hermes profile create --clone copied platform-exclusive credentials

File: hermes_cli/profiles.py
Problem: The profile clone copied TELEGRAM_BOT_TOKEN, DISCORD_BOT_TOKEN, and WEIXIN_TOKEN verbatim. When two profiles start at the same time with the same token, the platform adapter fails while acquiring its lock.
Fix: Defines _EXCLUSIVE_PLATFORM_KEYS and _EXCLUSIVE_PLATFORM_CONFIG_PATHS. After a clone, exclusive credentials are commented out in .env and platform entries are set to enabled: false in config.yaml.
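
As an illustration of the .env half of this fix, here is a minimal sketch that comments out exclusive credentials in cloned .env text. It is not the code from hermes_cli/profiles.py; the function name is hypothetical, and the key set contains only the three tokens named above.

```python
# Keys named in the PR description; the real _EXCLUSIVE_PLATFORM_KEYS may differ.
EXCLUSIVE_PLATFORM_KEYS = {"TELEGRAM_BOT_TOKEN", "DISCORD_BOT_TOKEN", "WEIXIN_TOKEN"}


def comment_out_exclusive_keys(env_text: str) -> str:
    """Prefix KEY=value lines for exclusive credentials with '# '."""
    out = []
    for line in env_text.splitlines():
        # Everything before the first '=' is treated as the variable name.
        key = line.split("=", 1)[0].strip()
        out.append("# " + line if key in EXCLUSIVE_PLATFORM_KEYS else line)
    return "\n".join(out)


cloned = comment_out_exclusive_keys("OPENAI_API_KEY=abc\nTELEGRAM_BOT_TOKEN=123")
print(cloned)
```

The sketch handles plain `KEY=value` lines only; forms such as `export KEY=value` would need extra parsing.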


4. #17054 - Slack manifest rejected names with an underscore

File: hermes_cli/commands.py
Problem: _sanitize_slack_name() converted command names such as _reload_mcp for Slack but did not remove the leading underscore, so the manifest was rejected.
Fix: Added a check that skips names beginning with an underscore before they are added to the list of slash commands.
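
The skip described in this fix amounts to a small filter. The sketch below assumes a plain list of command names and a hypothetical function name; the actual code in hermes_cli/commands.py is not shown in the PR text.

```python
from typing import Iterable, List


def slack_command_names(names: Iterable[str]) -> List[str]:
    """Drop names with a leading underscore before building the manifest."""
    return [name for name in names if not name.startswith("_")]


print(slack_command_names(["help", "_reload_mcp", "status"]))
```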


5. #17057 - custom Kimi-compatible endpoint failed after a tool call with thinking enabled

File: run_agent.py
Problem: _needs_kimi_tool_reasoning() only checked the official hostnames (api.kimi.com, moonshot.ai, moonshot.cn). Custom Kimi-compatible endpoints were not detected.
Fix: The check was broadened to detect custom endpoints by model family (name contains "kimi" or "k2") in addition to the hostname.
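
A minimal sketch of the broadened check, assuming it combines a hostname match with a substring test on the model name. The function signature and the subdomain handling are assumptions; only the hostnames and the "kimi"/"k2" substrings come from the description above.

```python
from typing import Optional
from urllib.parse import urlparse

OFFICIAL_KIMI_HOSTS = ("api.kimi.com", "moonshot.ai", "moonshot.cn")


def needs_kimi_tool_reasoning(base_url: Optional[str], model: Optional[str]) -> bool:
    """Hostname match OR Kimi-family model name (illustrative only)."""
    # urlparse only yields a hostname when the URL includes a scheme.
    host = (urlparse(base_url or "").hostname or "").lower()
    official = any(host == h or host.endswith("." + h) for h in OFFICIAL_KIMI_HOSTS)
    name = (model or "").lower()
    return official or "kimi" in name or "k2" in name


print(needs_kimi_tool_reasoning("http://proxy.local", "kimi-2.6"))
print(needs_kimi_tool_reasoning("http://proxy.local", "MiniMax-M2.7"))
```

A substring test such as "k2" is broad and would also match unrelated model names that happen to contain those characters, which is worth keeping in mind when reading this approach.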


6. #17049 - UnicodeDecodeError in the Windows process scan (wmic)

File: hermes_cli/gateway.py
Problem: wmic emitted output in the local Windows encoding (cp1252/utf-16), causing UnicodeDecodeError and AttributeError during parsing.
Fix: Parsing now decodes with errors=ignore, so invalid bytes are discarded.
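
The effect of decoding with errors="ignore" can be seen in a short sketch. The UTF-8 target encoding and the function name are assumptions; the PR text only states that errors=ignore is used.

```python
def decode_process_output(raw: bytes) -> str:
    """Decode subprocess output, dropping bytes that are not valid UTF-8."""
    return raw.decode("utf-8", errors="ignore")


# b"\xe9" is "é" in cp1252 but is not a valid UTF-8 sequence on its own.
print(decode_process_output(b"caf\xe9 ok"))
```

The trade-off is that non-ASCII characters from the local code page are silently lost rather than raising an exception.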


7. #17052 - stale reasoning reused when the current turn has no reasoning_content

File: run_agent.py
Problem: Assistant messages with tool_calls and reasoning_content were improperly reused in turns that had no reasoning_content, causing confabulations on providers such as Qwen3.6:27b via Ollama.
Fix: The replay loop now detects when the current message is an assistant message with tool_calls but no reasoning_content, and clears msg[reasoning_content] and msg[reasoning] to avoid propagating reasoning state from earlier turns.


Modified Files

  • agent/auxiliary_client.py - #17086, #17076
  • hermes_cli/profiles.py - #17080
  • hermes_cli/commands.py - #17054
  • run_agent.py - #17057, #17052
  • hermes_cli/gateway.py - #17049

Notes

  • This PR contains 8 commits (7 issues + 1 upstream security fix related to secret redaction)
  • All commits were made on the fix-7-issues-clean branch
  • No intermediate push was made; there was a single push at the end
  • Upstream commits included to keep the full history: #16843, #17041, #17039

Changed files

  • .gitignore (modified, +1/-0)
  • Dockerfile (modified, +6/-2)
  • acp_adapter/entry.py (modified, +11/-0)
  • acp_adapter/server.py (modified, +28/-1)
  • agent/anthropic_adapter.py (modified, +134/-74)
  • agent/auxiliary_client.py (modified, +325/-53)
  • agent/bedrock_adapter.py (modified, +41/-3)
  • agent/context_compressor.py (modified, +113/-5)
  • agent/credential_pool.py (modified, +82/-4)
  • agent/credential_sources.py (modified, +0/-1)
  • agent/error_classifier.py (modified, +32/-0)
  • agent/gemini_cloudcode_adapter.py (modified, +0/-2)
  • agent/gemini_schema.py (modified, +1/-1)
  • agent/google_code_assist.py (modified, +0/-1)
  • agent/google_oauth.py (modified, +3/-3)
  • agent/image_routing.py (added, +236/-0)
  • agent/memory_manager.py (modified, +113/-5)
  • agent/model_metadata.py (modified, +56/-21)
  • agent/nous_rate_guard.py (modified, +144/-1)
  • agent/onboarding.py (added, +191/-0)
  • agent/prompt_builder.py (modified, +38/-0)
  • agent/redact.py (modified, +13/-6)
  • agent/shell_hooks.py (modified, +7/-2)
  • agent/skill_commands.py (modified, +2/-2)
  • agent/title_generator.py (modified, +39/-5)
  • agent/transports/anthropic.py (modified, +1/-7)
  • agent/transports/chat_completions.py (modified, +74/-0)
  • agent/transports/codex.py (modified, +1/-3)
  • cli-config.yaml.example (modified, +28/-8)
  • cli.py (modified, +522/-195)
  • cron/jobs.py (modified, +34/-5)
  • cron/scheduler.py (modified, +39/-5)
  • docker/entrypoint.sh (modified, +9/-7)
  • flake.nix (modified, +1/-0)
  • gateway/channel_directory.py (modified, +67/-14)
  • gateway/config.py (modified, +84/-3)
  • gateway/display_config.py (modified, +3/-1)
  • gateway/mirror.py (modified, +57/-11)
  • gateway/pairing.py (modified, +2/-1)
  • gateway/platforms/__init__.py (modified, +2/-0)
  • gateway/platforms/base.py (modified, +233/-16)
  • gateway/platforms/discord.py (modified, +18/-24)
  • gateway/platforms/email.py (modified, +3/-0)
  • gateway/platforms/feishu_comment.py (modified, +0/-1)
  • gateway/platforms/helpers.py (modified, +11/-2)
  • gateway/platforms/matrix.py (modified, +493/-47)
  • gateway/platforms/mattermost.py (modified, +0/-1)
  • gateway/platforms/qqbot/adapter.py (modified, +2/-7)
  • gateway/platforms/slack.py (modified, +753/-70)
  • gateway/platforms/telegram.py (modified, +138/-14)
  • gateway/platforms/weixin.py (modified, +26/-3)
  • gateway/platforms/yuanbao.py (added, +4754/-0)
  • gateway/platforms/yuanbao_media.py (added, +645/-0)
  • gateway/platforms/yuanbao_proto.py (added, +1209/-0)
  • gateway/platforms/yuanbao_sticker.py (added, +558/-0)
  • gateway/run.py (modified, +1143/-283)
  • gateway/runtime_footer.py (added, +150/-0)
  • gateway/session.py (modified, +16/-21)
  • gateway/stream_consumer.py (modified, +110/-0)
  • gateway/whatsapp_identity.py (modified, +21/-1)
  • hermes_cli/auth.py (modified, +40/-4)
  • hermes_cli/azure_detect.py (modified, +1/-1)
  • hermes_cli/backup.py (modified, +272/-1)
  • hermes_cli/banner.py (modified, +0/-1)
  • hermes_cli/claw.py (modified, +67/-6)
  • hermes_cli/commands.py (modified, +119/-5)
  • hermes_cli/config.py (modified, +322/-29)
  • hermes_cli/debug.py (modified, +13/-7)
  • hermes_cli/dingtalk_auth.py (modified, +0/-1)
  • hermes_cli/doctor.py (modified, +11/-1)
  • hermes_cli/env_loader.py (modified, +2/-1)
  • hermes_cli/fallback_cmd.py (added, +361/-0)
  • hermes_cli/gateway.py (modified, +47/-12)
  • hermes_cli/hooks.py (modified, +1/-2)
  • hermes_cli/main.py (modified, +691/-58)
  • hermes_cli/model_catalog.py (added, +329/-0)
  • hermes_cli/model_switch.py (modified, +55/-6)
  • hermes_cli/models.py (modified, +251/-43)
  • hermes_cli/nous_subscription.py (modified, +16/-8)
  • hermes_cli/oneshot.py (modified, +28/-11)
  • hermes_cli/platforms.py (modified, +1/-0)
  • hermes_cli/plugins.py (modified, +14/-0)
  • hermes_cli/plugins_cmd.py (modified, +0/-1)
  • hermes_cli/profiles.py (modified, +199/-4)
  • hermes_cli/providers.py (modified, +26/-0)
  • hermes_cli/runtime_provider.py (modified, +100/-14)
  • hermes_cli/setup.py (modified, +69/-16)
  • hermes_cli/skills_hub.py (modified, +230/-20)
  • hermes_cli/slack_cli.py (added, +152/-0)
  • hermes_cli/status.py (modified, +3/-2)
  • hermes_cli/timeouts.py (modified, +4/-4)
  • hermes_cli/tips.py (modified, +2/-3)
  • hermes_cli/tools_config.py (modified, +173/-4)
  • hermes_cli/web_server.py (modified, +11/-14)
  • hermes_cli/webhook.py (modified, +2/-2)
  • hermes_logging.py (modified, +3/-4)
  • hermes_state.py (modified, +578/-164)
  • model_tools.py (modified, +45/-10)
  • nix/checks.nix (modified, +30/-3)
  • nix/hermes-agent.nix (added, +186/-0)

PR #17090: fix: resolve 7 identified issues [automated]

Description (problem / solution / changelog)

Same description as PR #1 above: the seven issue fixes, the modified files, and the notes are identical.

Changed files

  • agent/auxiliary_client.py (modified, +36/-0)
  • agent/redact.py (modified, +6/-3)
  • gateway/run.py (modified, +10/-4)
  • hermes_cli/commands.py (modified, +3/-0)
  • hermes_cli/gateway.py (modified, +22/-8)
  • hermes_cli/main.py (modified, +18/-7)
  • hermes_cli/profiles.py (modified, +141/-2)
  • run_agent.py (modified, +33/-9)
  • tools/terminal_tool.py (modified, +10/-2)

Original issue text

Title

custom Kimi-compatible endpoint with api_mode=anthropic_messages fails after tool call when thinking is enabled

Summary

When Hermes is configured with:

model:
  default: kimi-2.6
  provider: custom
  base_url: http://<custom-endpoint>
  api_mode: anthropic_messages

agent:
  reasoning_effort: medium

the next request after a tool call can fail with:

HTTP 400: thinking is enabled but reasoning_content is missing in assistant tool call message at index 2

Reproduction

  1. Use a custom Kimi-compatible endpoint with api_mode: anthropic_messages.
  2. Enable reasoning, for example agent.reasoning_effort: medium.
  3. Trigger any tool call.
  4. Let Hermes continue to the next model request.

Observed behavior

Hermes sends Anthropic-style thinking, but prior assistant tool-call history does not include reasoning_content, so the upstream endpoint rejects the request.

Example request shape:

{
  "messages": [
    { "role": "user", "content": "..." },
    {
      "role": "assistant",
      "content": [
        { "type": "tool_use", "name": "skill_view", "id": "..." }
      ]
    },
    {
      "role": "user",
      "content": [
        { "type": "tool_result", "tool_use_id": "...", "content": "..." }
      ]
    }
  ],
  "thinking": { "type": "enabled", "budget_tokens": 8000 }
}
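
To make the rejected shape concrete, the following sketch scans a payload for assistant tool-call messages that lack reasoning_content while thinking is enabled. It approximates the upstream validation as described in the error message; the function name and the exact rule are assumptions, not the endpoint's actual logic.

```python
from typing import Any, Dict, List


def tool_calls_missing_reasoning(payload: Dict[str, Any]) -> List[int]:
    """Return indices of assistant tool_use messages without reasoning_content."""
    if payload.get("thinking", {}).get("type") != "enabled":
        return []
    missing = []
    for index, message in enumerate(payload.get("messages", [])):
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        blocks = content if isinstance(content, list) else []
        has_tool_use = any(
            isinstance(block, dict) and block.get("type") == "tool_use"
            for block in blocks
        )
        if has_tool_use and not message.get("reasoning_content"):
            missing.append(index)
    return missing


payload = {
    "messages": [
        {"role": "user", "content": "..."},
        {"role": "assistant",
         "content": [{"type": "tool_use", "name": "skill_view", "id": "..."}]},
        {"role": "user",
         "content": [{"type": "tool_result", "tool_use_id": "...", "content": "..."}]},
    ],
    "thinking": {"type": "enabled", "budget_tokens": 8000},
}
print(tool_calls_missing_reasoning(payload))
```

The indices here are relative to this `messages` list and need not match the index in the upstream error, which is computed on the server's own view of the history.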

Relevant code

There is already a regression test for this failure mode on official Kimi endpoints:

  • tests/agent/test_kimi_coding_anthropic_thinking.py

And the current special-case appears to be URL-specific in:

  • agent/anthropic_adapter.py

Problem

The current guard seems to only match official Kimi URLs such as https://api.kimi.com/coding....

It does not cover custom or proxied Kimi-compatible endpoints, even when they have the same behavior and reject Anthropic thinking unless assistant tool-call history includes reasoning_content.

Expected behavior

Hermes should suppress Anthropic thinking for Kimi-compatible custom endpoints as well, not only for the official Kimi URL pattern.

Suggested fix

Broaden the current Kimi special-case beyond strict URL matching. For example:

  • detect Kimi-compatible custom endpoints by model family plus api_mode=anthropic_messages
  • or use capability-based detection instead of hostname-only detection

Extent analysis

TL;DR

The issue can be fixed by broadening the Kimi special-case to detect Kimi-compatible custom endpoints beyond strict URL matching.

Guidance

  • Modify the agent/anthropic_adapter.py file to detect Kimi-compatible custom endpoints by checking the model family and api_mode instead of relying on a specific URL pattern.
  • Update the regression test in tests/agent/test_kimi_coding_anthropic_thinking.py to cover custom Kimi-compatible endpoints.
  • Consider using capability-based detection instead of hostname-only detection to make the special-case more robust.
  • Verify that the fix works by testing the custom endpoint with api_mode=anthropic_messages and checking that Anthropic thinking is suppressed correctly.

Example

No code snippet is provided as the issue does not contain enough information to generate a specific code example.

Notes

The suggested fix may require additional changes to the agent/anthropic_adapter.py file and the regression test to ensure that the special-case is correctly applied to custom Kimi-compatible endpoints.

Recommendation

PR #17065 implements the suggested fix in agent/anthropic_adapter.py by detecting Kimi-compatible custom endpoints. On builds without that change, the workaround mentioned in the PR is to disable reasoning_effort.
