hermes - ✅(Solved) Fix Custom Kimi-compatible endpoint with api_mode=anthropic_messages fails after tool call when thinking is enabled [3 pull requests, 1 comment, 2 participants]

Q: Expected behavior

Hermes should suppress Anthropic `thinking` for Kimi-compatible custom endpoints as well, not only for the official Kimi URL pattern.

hermes · 2026-04-28 14:54:02

NousResearch/hermes-agent#17057•Fetched 2026-04-29 06:37:31

Author

imkenf

Participants

0xsir0000

imkenf

Timeline (top)

labeled ×4 · cross-referenced ×3 · commented ×1

When Hermes is configured with:

model:
  default: kimi-2.6
  provider: custom
  base_url: http://<custom-endpoint>
  api_mode: anthropic_messages

agent:
  reasoning_effort: medium

the next request after a tool call can fail with:

HTTP 400: thinking is enabled but reasoning_content is missing in assistant tool call message at index 2

Error Message

Hermes sends Anthropic-style thinking, but prior assistant tool-call history does not include reasoning_content, so the upstream endpoint rejects the request.

Root Cause

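Hermes omits Anthropic `thinking` only when `base_url` matches the official `api.kimi.com/coding` URL. Kimi validates the message history whenever thinking is enabled and requires every assistant tool-call message to carry OpenAI-style `reasoning_content`, which the Anthropic path never populates.

A proxied or self-hosted Kimi-compatible endpoint speaks the same wire protocol under a different `base_url`, so the URL guard does not fire, `thinking` is still sent, and the request that follows a tool call is rejected with:

HTTP 400: thinking is enabled but reasoning_content is missing in assistant tool call message at index 2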

Fix Action

Fixed

Fixed by PR: fix(anthropic): suppress thinking on Kimi-compatible custom endpoints (#17057) (https://github.com/NousResearch/hermes-agent/pull/17065)
Fixed by PR: fix: resolve 7 identified issues [automated] (https://github.com/Sldark23/hermes-agent/pull/1)
Fixed by PR: fix: resolve 7 identified issues [automated] (https://github.com/NousResearch/hermes-agent/pull/17090)

PR fix notes

PR #17065: fix(anthropic): suppress thinking on Kimi-compatible custom endpoints (#17057)

Repository: NousResearch/hermes-agent
Author: 0xsir0000
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/17065

Description (problem / solution / changelog)

Summary

`build_anthropic_kwargs` skips Anthropic `thinking` for Kimi's official `/coding` endpoint via `_is_kimi_coding_endpoint(base_url)`, because Kimi validates the message history when thinking is enabled and requires every assistant tool-call to carry OpenAI-style `reasoning_content` — which the Anthropic path never populates. Without that suppression, the turn after any tool call fails with:

HTTP 400: thinking is enabled but reasoning_content is missing in assistant tool call message at index N

That guard only matched the literal `api.kimi.com/coding` URL. Users running Hermes against a proxied or self-hosted Kimi deployment (same wire protocol, different `base_url`) hit the same 400 and are forced to disable `reasoning_effort` as a workaround. Reported in #17057.

Fix

Add `_is_kimi_anthropic_compat(base_url, model)` that returns True for either:

the official `api.kimi.com/coding` URL (existing behaviour, preserved), or
any non-empty `base_url` paired with a Kimi-family model name (`"kimi" in model.lower()`)

Native Anthropic (no `base_url`) is unaffected — both signals are required, so a fictional model name containing "kimi" on a direct Anthropic call still gets thinking. Non-Kimi models on custom endpoints (MiniMax, etc.) keep their thinking parameter.
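The rule described above can be sketched as follows. This is a minimal illustration rather than the code from the PR: the function names drop the leading underscore to mark them as stand-ins, and the case-insensitive substring match on `api.kimi.com/coding` is an assumption about how the official-URL check works.

```python
from typing import Optional


def is_kimi_coding_endpoint(base_url: Optional[str]) -> bool:
    """Stand-in for the existing guard: official Kimi /coding URL only.

    Assumption: a case-insensitive substring match; the real check may differ.
    """
    return bool(base_url) and "api.kimi.com/coding" in base_url.lower()


def is_kimi_anthropic_compat(base_url: Optional[str], model: Optional[str]) -> bool:
    """Return True when Anthropic `thinking` should be omitted."""
    if is_kimi_coding_endpoint(base_url):
        return True  # existing behaviour, preserved
    # Both signals are required: a non-empty base_url AND a Kimi-family model.
    return bool(base_url) and "kimi" in (model or "").lower()


if __name__ == "__main__":
    print(is_kimi_anthropic_compat("http://kimi-proxy.local", "kimi-2.6"))           # True
    print(is_kimi_anthropic_compat("http://anthropic-proxy.local", "MiniMax-M2.7"))  # False
    print(is_kimi_anthropic_compat(None, "kimi-lookalike"))                          # False
```

Requiring both signals keeps a direct Anthropic call (no `base_url`) on the thinking path even if its model name happens to contain "kimi".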

Behaviour matrix

base_url	model	Before	After
`api.kimi.com/coding`	kimi-*	thinking omitted	thinking omitted (unchanged)
`http://kimi-proxy.local`	kimi-2.6	thinking sent → 400	thinking omitted ✓
`http://anthropic-proxy.local`	MiniMax-M2.7	thinking sent	thinking sent (unchanged)
`None` (native Anthropic)	claude-sonnet-4-*	thinking sent	thinking sent (unchanged)
`api.kimi.com/v1`	kimi-k2.5	thinking sent (paranoid path)	thinking omitted (now consistent)

Test plan

`tests/agent/test_kimi_coding_anthropic_thinking.py` — 14 tests pass (10 existing + 5 new in `TestKimiCustomEndpointSkipsAnthropicThinking` + 1 retargeted)
`tests/agent/test_anthropic_adapter.py` (kimi/thinking subset) — 14 tests pass

The `test_kimi_root_endpoint_unaffected` paranoid test is retargeted to assert the new (consistent) behaviour: if a Kimi-family model reaches `build_anthropic_kwargs`, suppress thinking regardless of whether the URL is `/coding` or `/v1`. The wire failure mode follows the protocol, not the URL path.
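The retargeted assertion can be pictured with a small, self-contained sketch. `build_kwargs_sketch` is a hypothetical, much-simplified stand-in for `build_anthropic_kwargs`; its signature, the inline detection rule, and the test names are assumptions made for illustration and are not taken from the repository.

```python
from typing import Any, Dict, Optional


def build_kwargs_sketch(
    model: str,
    base_url: Optional[str],
    thinking_budget: Optional[int],
) -> Dict[str, Any]:
    """Hypothetical stand-in: builds request kwargs, omitting `thinking` for Kimi."""
    kwargs: Dict[str, Any] = {"model": model, "messages": []}
    url = (base_url or "").lower()
    # Assumed rule: official /coding URL, or any custom URL with a Kimi-family model.
    kimi_compat = "api.kimi.com/coding" in url or (bool(url) and "kimi" in model.lower())
    if thinking_budget is not None and not kimi_compat:
        kwargs["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
    return kwargs


def test_kimi_model_suppresses_thinking_on_any_path() -> None:
    # Same expectation for /coding and /v1: the failure follows the protocol, not the path.
    for url in ("https://api.kimi.com/coding", "https://api.kimi.com/v1"):
        assert "thinking" not in build_kwargs_sketch("kimi-k2.5", url, 8000)


def test_non_kimi_custom_endpoint_keeps_thinking() -> None:
    kwargs = build_kwargs_sketch("MiniMax-M2.7", "http://anthropic-proxy.local", 8000)
    assert kwargs["thinking"] == {"type": "enabled", "budget_tokens": 8000}


if __name__ == "__main__":
    test_kimi_model_suppresses_thinking_on_any_path()
    test_non_kimi_custom_endpoint_keeps_thinking()
```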

Fixes #17057

Changed files

agent/anthropic_adapter.py (modified, +35/-1)
tests/agent/test_kimi_coding_anthropic_thinking.py (modified, +75/-7)

PR #1: fix: resolve 7 identified issues [automated]

Repository: Sldark23/hermes-agent
Author: Sldark23
State: closed | merged: False
Link: https://github.com/Sldark23/hermes-agent/pull/1

Description (problem / solution / changelog)

Summary

This PR fixes 7 issues identified in the NousResearch/hermes-agent repository.

Issues Fixed

1. #17086 - custom endpoint with api_mode=anthropic_messages failed with 404

File: agent/auxiliary_client.py
Problem: With provider=custom, api_mode=anthropic_messages, and a base_url ending in /anthropic, the function _resolve_provider_client() converted the URL to the /v1/messages format, causing a 404 on third-party Anthropic-compatible providers.
Fix: The /anthropic path is kept when api_mode=anthropic_messages and the base_url already ends in /anthropic.

2. #17076 - kimi-coding vision broken (404 on image analysis)

File: agent/auxiliary_client.py
Problem: kimi-coding was not listed in _PROVIDER_VISION_MODELS, so auxiliary.vision.provider: auto could not detect the vision model available for Kimi.
Fix: Added kimi-coding and kimi-coding-cn to the vision model map.

3. #17080 - hermes profile create --clone copied platform-exclusive credentials

File: hermes_cli/profiles.py
Problem: Cloning a profile copied TELEGRAM_BOT_TOKEN, DISCORD_BOT_TOKEN, and WEIXIN_TOKEN verbatim. When two profiles start at the same time with the same token, the platform adapter fails while acquiring its lock.
Fix: Defines _EXCLUSIVE_PLATFORM_KEYS and _EXCLUSIVE_PLATFORM_CONFIG_PATHS. After a clone, exclusive credentials are commented out in .env and platform entries are set to enabled: false in config.yaml.

4. #17054 - Slack manifest rejected names with an underscore

File: hermes_cli/commands.py
Problem: _sanitize_slack_name() converted command names such as _reload_mcp for Slack but did not strip the leading underscore, so the manifest was rejected.
Fix: Added a check that skips names starting with an underscore before they are added to the list of slash commands.

5. #17057 - custom Kimi-compatible endpoint failed after a tool call with thinking enabled

File: run_agent.py
Problem: _needs_kimi_tool_reasoning() only checked the official hostnames (api.kimi.com, moonshot.ai, moonshot.cn). Custom Kimi-compatible endpoints were not detected.
Fix: The check now also detects custom endpoints by model family (name contains "kimi" or "k2"), in addition to the hostname.

6. #17049 - UnicodeDecodeError in the Windows process scan (wmic)

File: hermes_cli/gateway.py
Problem: wmic emitted output in the local Windows encoding (cp1252/utf-16), causing UnicodeDecodeError and AttributeError during parsing.
Fix: Parsing now decodes with errors=ignore, discarding invalid bytes.

7. #17052 - stale reasoning reused when the current turn has no reasoning_content

File: run_agent.py
Problem: Assistant messages with tool_calls and reasoning_content were improperly reused in turns that had no reasoning_content, causing confabulations on providers such as Qwen3.6:27b via Ollama.
Fix: The replay loop now detects when the current message is an assistant message with tool_calls but no reasoning_content, and clears msg[reasoning_content] and msg[reasoning] so that reasoning state from earlier turns is not carried over.
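Item 5 in the list above takes a different route from PR #17065: it widens a hostname check in run_agent.py instead of the URL guard in the Anthropic adapter. A rough sketch of such a check is shown below. The function body is an assumption reconstructed from the description only; in particular, the suffix match on hostnames and the way the two signals are combined are guesses, not the PR's code.

```python
from typing import Optional
from urllib.parse import urlparse

# Hostnames named in the PR description.
OFFICIAL_KIMI_HOSTS = ("api.kimi.com", "moonshot.ai", "moonshot.cn")


def needs_kimi_tool_reasoning(base_url: Optional[str], model: Optional[str]) -> bool:
    """Hypothetical stand-in for _needs_kimi_tool_reasoning()."""
    host = (urlparse(base_url or "").hostname or "").lower()
    # Assumption: exact host or any subdomain of an official host counts as official.
    official = any(host == h or host.endswith("." + h) for h in OFFICIAL_KIMI_HOSTS)
    name = (model or "").lower()
    # Model-family signal from the description: name contains "kimi" or "k2".
    return official or "kimi" in name or "k2" in name


if __name__ == "__main__":
    print(needs_kimi_tool_reasoning("https://api.moonshot.cn/v1", "some-model"))      # True
    print(needs_kimi_tool_reasoning("http://kimi-proxy.local", "kimi-2.6"))           # True
    print(needs_kimi_tool_reasoning("http://anthropic-proxy.local", "MiniMax-M2.7"))  # False
```

In this sketch the model name alone is enough to trigger the check, and a bare "k2" substring may match unrelated model names, so the rule is broader than the two-signal test in PR #17065.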

Modified Files

agent/auxiliary_client.py - #17086, #17076
hermes_cli/profiles.py - #17080
hermes_cli/commands.py - #17054
run_agent.py - #17057, #17052
hermes_cli/gateway.py - #17049

Notes

This PR contains 8 commits (7 issues + 1 upstream security fix related to secret redaction)
All commits were made on the fix-7-issues-clean branch
No intermediate pushes were made; there was a single push at the end
Upstream commits included to keep the history complete: #16843, #17041, #17039

Changed files

.gitignore (modified, +1/-0)
Dockerfile (modified, +6/-2)
acp_adapter/entry.py (modified, +11/-0)
acp_adapter/server.py (modified, +28/-1)
agent/anthropic_adapter.py (modified, +134/-74)
agent/auxiliary_client.py (modified, +325/-53)
agent/bedrock_adapter.py (modified, +41/-3)
agent/context_compressor.py (modified, +113/-5)
agent/credential_pool.py (modified, +82/-4)
agent/credential_sources.py (modified, +0/-1)
agent/error_classifier.py (modified, +32/-0)
agent/gemini_cloudcode_adapter.py (modified, +0/-2)
agent/gemini_schema.py (modified, +1/-1)
agent/google_code_assist.py (modified, +0/-1)
agent/google_oauth.py (modified, +3/-3)
agent/image_routing.py (added, +236/-0)
agent/memory_manager.py (modified, +113/-5)
agent/model_metadata.py (modified, +56/-21)
agent/nous_rate_guard.py (modified, +144/-1)
agent/onboarding.py (added, +191/-0)
agent/prompt_builder.py (modified, +38/-0)
agent/redact.py (modified, +13/-6)
agent/shell_hooks.py (modified, +7/-2)
agent/skill_commands.py (modified, +2/-2)
agent/title_generator.py (modified, +39/-5)
agent/transports/anthropic.py (modified, +1/-7)
agent/transports/chat_completions.py (modified, +74/-0)
agent/transports/codex.py (modified, +1/-3)
cli-config.yaml.example (modified, +28/-8)
cli.py (modified, +522/-195)
cron/jobs.py (modified, +34/-5)
cron/scheduler.py (modified, +39/-5)
docker/entrypoint.sh (modified, +9/-7)
flake.nix (modified, +1/-0)
gateway/channel_directory.py (modified, +67/-14)
gateway/config.py (modified, +84/-3)
gateway/display_config.py (modified, +3/-1)
gateway/mirror.py (modified, +57/-11)
gateway/pairing.py (modified, +2/-1)
gateway/platforms/__init__.py (modified, +2/-0)
gateway/platforms/base.py (modified, +233/-16)
gateway/platforms/discord.py (modified, +18/-24)
gateway/platforms/email.py (modified, +3/-0)
gateway/platforms/feishu_comment.py (modified, +0/-1)
gateway/platforms/helpers.py (modified, +11/-2)
gateway/platforms/matrix.py (modified, +493/-47)
gateway/platforms/mattermost.py (modified, +0/-1)
gateway/platforms/qqbot/adapter.py (modified, +2/-7)
gateway/platforms/slack.py (modified, +753/-70)
gateway/platforms/telegram.py (modified, +138/-14)
gateway/platforms/weixin.py (modified, +26/-3)
gateway/platforms/yuanbao.py (added, +4754/-0)
gateway/platforms/yuanbao_media.py (added, +645/-0)
gateway/platforms/yuanbao_proto.py (added, +1209/-0)
gateway/platforms/yuanbao_sticker.py (added, +558/-0)
gateway/run.py (modified, +1143/-283)
gateway/runtime_footer.py (added, +150/-0)
gateway/session.py (modified, +16/-21)
gateway/stream_consumer.py (modified, +110/-0)
gateway/whatsapp_identity.py (modified, +21/-1)
hermes_cli/auth.py (modified, +40/-4)
hermes_cli/azure_detect.py (modified, +1/-1)
hermes_cli/backup.py (modified, +272/-1)
hermes_cli/banner.py (modified, +0/-1)
hermes_cli/claw.py (modified, +67/-6)
hermes_cli/commands.py (modified, +119/-5)
hermes_cli/config.py (modified, +322/-29)
hermes_cli/debug.py (modified, +13/-7)
hermes_cli/dingtalk_auth.py (modified, +0/-1)
hermes_cli/doctor.py (modified, +11/-1)
hermes_cli/env_loader.py (modified, +2/-1)
hermes_cli/fallback_cmd.py (added, +361/-0)
hermes_cli/gateway.py (modified, +47/-12)
hermes_cli/hooks.py (modified, +1/-2)
hermes_cli/main.py (modified, +691/-58)
hermes_cli/model_catalog.py (added, +329/-0)
hermes_cli/model_switch.py (modified, +55/-6)
hermes_cli/models.py (modified, +251/-43)
hermes_cli/nous_subscription.py (modified, +16/-8)
hermes_cli/oneshot.py (modified, +28/-11)
hermes_cli/platforms.py (modified, +1/-0)
hermes_cli/plugins.py (modified, +14/-0)
hermes_cli/plugins_cmd.py (modified, +0/-1)
hermes_cli/profiles.py (modified, +199/-4)
hermes_cli/providers.py (modified, +26/-0)
hermes_cli/runtime_provider.py (modified, +100/-14)
hermes_cli/setup.py (modified, +69/-16)
hermes_cli/skills_hub.py (modified, +230/-20)
hermes_cli/slack_cli.py (added, +152/-0)
hermes_cli/status.py (modified, +3/-2)
hermes_cli/timeouts.py (modified, +4/-4)
hermes_cli/tips.py (modified, +2/-3)
hermes_cli/tools_config.py (modified, +173/-4)
hermes_cli/web_server.py (modified, +11/-14)
hermes_cli/webhook.py (modified, +2/-2)
hermes_logging.py (modified, +3/-4)
hermes_state.py (modified, +578/-164)
model_tools.py (modified, +45/-10)
nix/checks.nix (modified, +30/-3)
nix/hermes-agent.nix (added, +186/-0)

PR #17090: fix: resolve 7 identified issues [automated]

Repository: NousResearch/hermes-agent
Author: Sldark23
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/17090

Description (problem / solution / changelog)

Summary

This PR fixes 7 issues identified in the NousResearch/hermes-agent repository.

Issues Fixed

1. #17086 - custom endpoint with api_mode=anthropic_messages failed with 404

2. #17076 - kimi-coding vision broken (404 on image analysis)

3. #17080 - hermes profile create --clone copied platform-exclusive credentials

4. #17054 - Slack manifest rejected names with an underscore

5. #17057 - custom Kimi-compatible endpoint failed after a tool call with thinking enabled

6. #17049 - UnicodeDecodeError in the Windows process scan (wmic)

7. #17052 - stale reasoning reused when the current turn has no reasoning_content

Modified Files

agent/auxiliary_client.py - #17086, #17076
hermes_cli/profiles.py - #17080
hermes_cli/commands.py - #17054
run_agent.py - #17057, #17052
hermes_cli/gateway.py - #17049

Notes

This PR contains 8 commits (7 issues + 1 upstream security fix related to secret redaction)
All commits were made on the fix-7-issues-clean branch
No intermediate pushes were made; there was a single push at the end
Upstream commits included to keep the history complete: #16843, #17041, #17039

Changed files

agent/auxiliary_client.py (modified, +36/-0)
agent/redact.py (modified, +6/-3)
gateway/run.py (modified, +10/-4)
hermes_cli/commands.py (modified, +3/-0)
hermes_cli/gateway.py (modified, +22/-8)
hermes_cli/main.py (modified, +18/-7)
hermes_cli/profiles.py (modified, +141/-2)
run_agent.py (modified, +33/-9)
tools/terminal_tool.py (modified, +10/-2)

Title

custom Kimi-compatible endpoint with api_mode=anthropic_messages fails after tool call when thinking is enabled

Summary

When Hermes is configured with:

model:
  default: kimi-2.6
  provider: custom
  base_url: http://<custom-endpoint>
  api_mode: anthropic_messages

agent:
  reasoning_effort: medium

the next request after a tool call can fail with:

HTTP 400: thinking is enabled but reasoning_content is missing in assistant tool call message at index 2

Reproduction

1. Use a custom Kimi-compatible endpoint with api_mode: anthropic_messages.
2. Enable reasoning, for example agent.reasoning_effort: medium.
3. Trigger any tool call.
4. Let Hermes continue to the next model request.

Observed behavior

Hermes sends Anthropic-style thinking, but prior assistant tool-call history does not include reasoning_content, so the upstream endpoint rejects the request.

Example request shape:

{
  "messages": [
    { "role": "user", "content": "..." },
    {
      "role": "assistant",
      "content": [
        { "type": "tool_use", "name": "skill_view", "id": "..." }
      ]
    },
    {
      "role": "user",
      "content": [
        { "type": "tool_result", "tool_use_id": "...", "content": "..." }
      ]
    }
  ],
  "thinking": { "type": "enabled", "budget_tokens": 8000 }
}
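To check a captured request for this condition before sending it, a small helper along the following lines may be useful. It is a debugging sketch only: the function name is hypothetical, and it assumes `reasoning_content` would appear as a top-level key on the assistant message, which the report does not confirm.

```python
from typing import Any, Dict, List


def tool_calls_missing_reasoning(payload: Dict[str, Any]) -> List[int]:
    """Return positions of assistant tool-call messages lacking reasoning_content.

    Only meaningful when `thinking` is enabled; otherwise returns an empty list.
    """
    thinking = payload.get("thinking") or {}
    if thinking.get("type") != "enabled":
        return []
    missing: List[int] = []
    for index, message in enumerate(payload.get("messages", [])):
        if message.get("role") != "assistant":
            continue
        content = message.get("content")
        blocks = content if isinstance(content, list) else []
        has_tool_use = any(
            isinstance(block, dict) and block.get("type") == "tool_use" for block in blocks
        )
        # Assumption: reasoning_content, if present, is a top-level message key.
        if has_tool_use and not message.get("reasoning_content"):
            missing.append(index)
    return missing


if __name__ == "__main__":
    request = {
        "messages": [
            {"role": "user", "content": "..."},
            {"role": "assistant", "content": [{"type": "tool_use", "name": "skill_view", "id": "..."}]},
            {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "...", "content": "..."}]},
        ],
        "thinking": {"type": "enabled", "budget_tokens": 8000},
    }
    print(tool_calls_missing_reasoning(request))  # [1]
```

The positions returned are indexes into this payload's `messages` list; the index reported by the upstream endpoint is computed on its side and need not match.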

Relevant code

There is already a regression test for this failure mode on official Kimi endpoints:

tests/agent/test_kimi_coding_anthropic_thinking.py

And the current special-case appears to be URL-specific in:

agent/anthropic_adapter.py

Problem

The current guard seems to only match official Kimi URLs such as https://api.kimi.com/coding....

It does not cover custom or proxied Kimi-compatible endpoints, even when they have the same behavior and reject Anthropic thinking unless assistant tool-call history includes reasoning_content.

Expected behavior

Hermes should suppress Anthropic thinking for Kimi-compatible custom endpoints as well, not only for the official Kimi URL pattern.

Suggested fix

Broaden the current Kimi special-case beyond strict URL matching. For example:

detect Kimi-compatible custom endpoints by model family plus api_mode=anthropic_messages
or use capability-based detection instead of hostname-only detection
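The second option, capability-based detection, could look roughly like the sketch below. Everything in it is hypothetical: Hermes is not known to have an `EndpointCapabilities` type, a capability registry, or a `requires_reasoning_content` flag. The point is only to show the decision keyed on a declared endpoint property instead of a hostname or model name.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class EndpointCapabilities:
    """Hypothetical per-endpoint capability record."""

    # True when the endpoint rejects `thinking` unless assistant tool-call
    # history carries reasoning_content.
    requires_reasoning_content: bool = False


# Hypothetical registry keyed by base_url (trailing slash stripped).
CAPABILITIES: Dict[str, EndpointCapabilities] = {
    "http://kimi-proxy.local": EndpointCapabilities(requires_reasoning_content=True),
}


def should_send_thinking(base_url: Optional[str]) -> bool:
    """Send `thinking` unless the endpoint is declared to need reasoning_content."""
    key = (base_url or "").rstrip("/")
    caps = CAPABILITIES.get(key, EndpointCapabilities())
    return not caps.requires_reasoning_content


if __name__ == "__main__":
    print(should_send_thinking("http://kimi-proxy.local/"))      # False
    print(should_send_thinking("http://anthropic-proxy.local"))  # True
```

A declared capability avoids guessing from names, at the cost of requiring the user or a probe step to populate the registry.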

extent analysis

TL;DR

The issue can be fixed by broadening the Kimi special-case to detect Kimi-compatible custom endpoints beyond strict URL matching.

Guidance

Modify the agent/anthropic_adapter.py file to detect Kimi-compatible custom endpoints by checking the model family and api_mode instead of relying on a specific URL pattern.
Update the regression test in tests/agent/test_kimi_coding_anthropic_thinking.py to cover custom Kimi-compatible endpoints.
Consider using capability-based detection instead of hostname-only detection to make the special-case more robust.
Verify that the fix works by testing the custom endpoint with api_mode=anthropic_messages and checking that Anthropic thinking is suppressed correctly.

Notes

The suggested fix may require additional changes to the agent/anthropic_adapter.py file and the regression test to ensure that the special-case is correctly applied to custom Kimi-compatible endpoints.

Recommendation

Broaden the Kimi guard in agent/anthropic_adapter.py so that it also detects Kimi-compatible custom endpoints; the issue is specific to such endpoints, and the suggested fix is a reasonable solution. Until a fix is merged, disabling reasoning_effort avoids the HTTP 400.
