hermes - ✅(Solved) Fix Kimi Coding: reasoning_config ignored due to _is_kimi_coding guard — thinking parameter never sent [1 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#18823 · Fetched 2026-05-03 04:54:03
Comments: 0 · Participants: 1 · Timeline: 5 · Reactions: 0
Timeline (top): labeled ×4, cross-referenced ×1

When using kimi-for-coding with anthropic_messages transport and reasoning_effort: medium (or any reasoning config), the thinking parameter is never sent to the Kimi API. This happens because build_anthropic_kwargs() has an unconditional guard (not _is_kimi_coding) that skips the thinking block for all Kimi endpoints.

Error Message

The guard was added to avoid HTTP 400 errors of the form "thinking is enabled but reasoning_content is missing in assistant tool call message". Because it applies to every Kimi endpoint, it also blocks reasoning in simple non-tool conversations where that error would not occur.

Root Cause

agent/anthropic_adapter.py:1869-1870:

_is_kimi_coding = _is_kimi_family_endpoint(base_url, model)
if reasoning_config and isinstance(reasoning_config, dict) and not _is_kimi_coding:
    # ^^^ this branch is skipped for Kimi

The comment says this avoids "thinking is enabled but reasoning_content is missing in assistant tool call message" errors. However, this blanket ban prevents ALL reasoning, even for simple non-tool conversations where the error would not occur.
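The effect of the guard can be reproduced in isolation. The following is a minimal sketch, not the adapter's actual code: `_is_kimi_family_endpoint` is stubbed with a hypothetical substring check (its real matching rules are not shown in the issue), and `build_kwargs_sketch` is an illustrative stand-in for build_anthropic_kwargs() reduced to the guard. The 8000-token budget for medium effort is the value quoted in the issue.

```python
def _is_kimi_family_endpoint(base_url, model):
    # Hypothetical stand-in: the real helper's matching rules are not shown in the issue.
    text = f"{base_url or ''} {model or ''}".lower()
    return "kimi" in text or "moonshot" in text


def build_kwargs_sketch(model, base_url, reasoning_config):
    """Illustrative stand-in for build_anthropic_kwargs(), reduced to the guard."""
    kwargs = {"model": model, "messages": [], "max_tokens": 100}
    _is_kimi_coding = _is_kimi_family_endpoint(base_url, model)
    if reasoning_config and isinstance(reasoning_config, dict) and not _is_kimi_coding:
        # This branch is skipped for every Kimi-family endpoint, including /coding.
        kwargs["thinking"] = {"type": "enabled", "budget_tokens": 8000}
    return kwargs


cfg = {"enabled": True, "effort": "medium"}
kimi = build_kwargs_sketch("kimi-for-coding", "https://api.kimi.com/coding", cfg)
other = build_kwargs_sketch("some-other-model", "https://example.invalid", cfg)
print(sorted(kimi))   # the "thinking" key is absent
print(sorted(other))  # the "thinking" key is present
```

With the same reasoning_config, only the non-Kimi call ends up with a thinking entry, which matches the debug log reported in the issue.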

Fix Action

Fixed

PR fix notes

PR #18898: Enable thinking/reasoning for kimi-for-coding via anthropic_messages

Description (problem / solution / changelog)

Summary

Enable reasoning/thinking for kimi-for-coding when using anthropic_messages transport. Closes #18823.

Problem

build_anthropic_kwargs() unconditionally skips the thinking parameter for all Kimi endpoints because of the not _is_kimi_coding guard (line ~1869). This prevents reasoning from working even though Kimi's /coding endpoint fully supports Anthropic-style thinking with budget_tokens and returns signed thinking blocks.

Additionally, convert_messages_to_anthropic() strips all signed thinking blocks (including Kimi's own) because _preserve_unsigned_thinking assumes only unsigned blocks should be kept. Kimi generates and validates its own signatures; stripping them causes HTTP 400 on replay when tool-call messages lack reasoning_content.

Solution

Patch 1 — Narrow Kimi guard to non-/coding endpoints (anthropic_adapter.py)

Change the blanket not _is_kimi_coding guard (which matched ALL Kimi-family endpoints including Moonshot and custom proxies) to a precise not _is_kimi_non_coding check:

_is_kimi_non_coding = _is_kimi_family_endpoint(base_url, model) and not _is_kimi_coding_endpoint(base_url)
if reasoning_config and isinstance(reasoning_config, dict) and not _is_kimi_non_coding:

Rationale: Kimi /coding supports thinking parameter and validates its own signatures server-side. The original guard blocked reasoning for ALL Kimi-family endpoints, including /coding where it actually works. The new guard:

  • Allows thinking for api.kimi.com/coding (tested via direct API calls + pytest)
  • Preserves the old skip behavior for api.kimi.com/v1, Moonshot, custom proxies, and any other Kimi-family endpoint that does NOT match /coding
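The behavior described in the two bullets above can be checked with a small truth table. This is a sketch under stated assumptions: both helper functions are hypothetical stand-ins (the real `_is_kimi_family_endpoint` and `_is_kimi_coding_endpoint` are not shown in the PR text), `thinking_allowed` is an illustrative name, and the proxy and model names in the loop are placeholders.

```python
from urllib.parse import urlparse


def _is_kimi_family_endpoint(base_url, model):
    # Hypothetical stand-in for the real helper.
    text = f"{base_url or ''} {model or ''}".lower()
    return "kimi" in text or "moonshot" in text


def _is_kimi_coding_endpoint(base_url):
    # Hypothetical stand-in: only api.kimi.com with a /coding path counts.
    parsed = urlparse(base_url or "")
    path = parsed.path.rstrip("/")
    is_coding_path = path == "/coding" or path.startswith("/coding/")
    return parsed.hostname == "api.kimi.com" and is_coding_path


def thinking_allowed(base_url, model, reasoning_config):
    """Mirror of the narrowed guard from Patch 1."""
    _is_kimi_non_coding = (
        _is_kimi_family_endpoint(base_url, model)
        and not _is_kimi_coding_endpoint(base_url)
    )
    return (
        bool(reasoning_config)
        and isinstance(reasoning_config, dict)
        and not _is_kimi_non_coding
    )


cfg = {"enabled": True, "effort": "medium"}
for url, model in [
    ("https://api.kimi.com/coding", "kimi-for-coding"),
    ("https://api.kimi.com/v1", "kimi-for-coding"),
    ("https://proxy.example.invalid/v1", "kimi-for-coding"),
    ("https://api.anthropic.com", "some-claude-model"),
]:
    print(url, thinking_allowed(url, model, cfg))
```

Only the /coding route and the non-Kimi endpoint pass the guard; the /v1 host and the proxied Kimi model keep the old skip behavior.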

Patch 2 — Preserve Kimi-signed thinking blocks (anthropic_adapter.py)

In the _preserve_unsigned_thinking branch, keep signed blocks when the endpoint is Kimi /coding:

if b.get("signature") or b.get("data"):
    if _is_kimi_coding_endpoint(base_url):
        new_content.append(b)  # Kimi validates its own signatures
    continue  # DeepSeek/others: strip (can't validate Anthropic sigs)

Rationale: Kimi /coding generates ~12946-char signatures and validates them server-side. Using _is_kimi_coding_endpoint(base_url) (not the broader _is_kimi_family_endpoint) keeps the scope minimal and zero-risk for DeepSeek, Moonshot, or any other provider. Unsigned blocks are still preserved for all providers.
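To see how the branch treats signed and unsigned blocks, the snippet above can be wrapped in a self-contained loop. This is a sketch only: the surrounding loop, the `filter_thinking_blocks` name, and the block-type check are assumptions added for illustration, and `_is_kimi_coding_endpoint` is a hypothetical stand-in for the real helper.

```python
def _is_kimi_coding_endpoint(base_url):
    # Hypothetical stand-in for the real helper.
    return (base_url or "").rstrip("/").startswith("https://api.kimi.com/coding")


def filter_thinking_blocks(content, base_url):
    """Sketch of the _preserve_unsigned_thinking branch after Patch 2."""
    new_content = []
    for b in content:
        # Assumption: only thinking-type blocks are subject to filtering.
        if b.get("type") not in ("thinking", "redacted_thinking"):
            new_content.append(b)
            continue
        if b.get("signature") or b.get("data"):
            if _is_kimi_coding_endpoint(base_url):
                new_content.append(b)  # Kimi validates its own signatures
            continue  # other providers: signed blocks are dropped
        new_content.append(b)  # unsigned blocks are kept for all providers
    return new_content


blocks = [
    {"type": "thinking", "thinking": "step 1", "signature": "sig"},
    {"type": "thinking", "thinking": "step 2"},
    {"type": "text", "text": "hello"},
]
kimi = filter_thinking_blocks(blocks, "https://api.kimi.com/coding")
other = filter_thinking_blocks(blocks, "https://api.deepseek.com")
print(len(kimi), len(other))
```

On the /coding endpoint all three blocks survive; for any other base URL the signed block is dropped while the unsigned block and the text block remain.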

Note on the original comment

The codebase currently contains the following comment (lines ~1854-1865):

# We have to skip `thinking` for kimi because it doesn't work
# due to Kimi using anthropic as an intermediary and not supporting
# thinking signatures.  While DeepSeek will validate thinking signatures
# for DeepSeek reasoning models, Kimi does not support it, leading to
# HTTP 400 errors when thinking is enabled but reasoning_content is
# missing.  Kimi's reasoning is driven server-side on the /coding route,
# so skip Anthropic's thinking parameter entirely for that host.

This comment accurately described the original problem: unsigned Anthropic signatures caused HTTP 400 on tool-call replays because reasoning_content was missing. However, the comment became outdated after upstream commit 76edc40ab added _copy_reasoning_content_for_api() to inject reasoning_content: " " for Kimi tool-call messages. The comment now incorrectly justifies disabling reasoning entirely, even for simple conversations where the signature issue does not apply.

This PR addresses the root cause: by preserving Kimi's own signed blocks (Patch 2) and allowing the thinking parameter (Patch 1), the adapter no longer triggers the missing-reasoning_content error that the original comment warned about.

Recommended comment update (included in this PR):

# We used to skip `thinking` for all Kimi endpoints because unsigned
# Anthropic signatures triggered HTTP 400 on tool-call replays.
# This is now fixed upstream: run_agent.py commit `76edc40ab`
# injects `reasoning_content: " "` for Kimi tool-call messages,
# and `convert_messages_to_anthropic` preserves Kimi's own signed
# thinking blocks. Therefore we allow the `thinking` parameter
# for Kimi /coding when `reasoning_config` is provided.

What already works in upstream

run_agent.py commit 76edc40ab (2026-04-30) added _needs_kimi_tool_reasoning() and _copy_reasoning_content_for_api() which inject reasoning_content: " " for Kimi tool-call messages. This PR unblocks the adapter side so those fixes actually receive thinking responses to echo back.
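The upstream behavior described above can be pictured roughly as follows. The bodies of _needs_kimi_tool_reasoning() and _copy_reasoning_content_for_api() are not shown in the PR, so this sketch uses different, hypothetical function names and assumes OpenAI-style messages with a `tool_calls` field; only the injected value `" "` comes from the PR text.

```python
import copy


def needs_tool_reasoning_placeholder(message):
    # Hypothetical predicate: an assistant message that carries tool calls
    # but has no reasoning_content to echo back.
    return (
        message.get("role") == "assistant"
        and bool(message.get("tool_calls"))
        and not message.get("reasoning_content")
    )


def copy_with_reasoning_placeholder(messages):
    """Return a deep copy with reasoning_content set to " " where it is missing."""
    patched = copy.deepcopy(messages)
    for message in patched:
        if needs_tool_reasoning_placeholder(message):
            message["reasoning_content"] = " "
    return patched


history = [
    {"role": "user", "content": "List the files"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "call_1"}]},
    {"role": "assistant", "content": "Done", "reasoning_content": "checked output"},
]
patched = copy_with_reasoning_placeholder(history)
print(patched[1]["reasoning_content"] == " ")
print("reasoning_content" in history[1])
```

Working on a copy keeps the stored conversation untouched, so the placeholder only exists in the payload sent to the API.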

Testing performed

Manual / CLI tests

20+ scenarios tested across two sessions (2026-05-02):

Scenario | Status
Simple reasoning, no tools |
Tool call + reasoning in live CLI |
Chain of 4 tool calls |
Long session (~15 messages) |
compress + reasoning | ✅ (5 tests)
resume (hermes -c) + reasoning |
reasoning_effort: low (budget_tokens: 4000) |
show_reasoning: false |
streaming: true |
Cross-provider (RouterAI / qwen3.6-plus) | ✅ Partial (no regression observed)

Unit tests (tests/agent/test_kimi_coding_anthropic_thinking.py)

pytest 19/19 passed (2026-05-02).

Test | Status | Note
test_kimi_coding_endpoint_includes_thinking | 🆕 Modified | Renamed from test_kimi_coding_endpoint_omits_thinking; assertion flipped from "not in" to "in" because /coding now receives thinking
test_kimi_coding_preserves_signed_thinking_blocks | 🆕 New | Signed thinking blocks with signature survive convert_messages_to_anthropic on /coding
test_kimi_coding_thinking_with_tool_call_replay | 🆕 New | reasoning_content + tool call round-trip works without losing the thinking block
test_kimi_coding_with_explicit_disabled_also_omits | ✅ Unchanged | enabled: false still skips thinking
test_non_kimi_third_party_still_gets_thinking (MiniMax) | ✅ Unchanged | Non-Kimi third-party still receives thinking
test_native_anthropic_still_gets_thinking | ✅ Unchanged | Native Anthropic still receives thinking
test_kimi_root_endpoint_via_anthropic_transport_omits_thinking (api.kimi.com/v1) | ✅ Unchanged | Non-/coding Kimi host still skips thinking
test_kimi_family_custom_endpoint_omits_thinking (7 param combos: proxy, Moonshot, etc.) | ✅ Unchanged | Custom / proxied Kimi-family endpoints still skip thinking
test_custom_endpoint_non_kimi_model_keeps_thinking | ✅ Unchanged | Non-Kimi model on custom proxy still gets thinking
test_kimi_family_replay_preserves_unsigned_thinking | ✅ Unchanged | Unsigned reasoning_content blocks still survive for all Kimi-family endpoints

Known limitations

  • DeepSeek/Anthropic regression: Zero risk. Patch 2 uses _is_kimi_coding_endpoint(base_url) which matches only https://api.kimi.com/coding. DeepSeek (api.deepseek.com), Moonshot, Anthropic, and all other providers fall through to the original behavior — signed blocks are stripped as before.
  • high effort and enabled: false not explicitly tested (low effort thoroughly validated).

Checklist

  • Kimi /coding direct API accepts thinking parameter (curl verified)
  • Signed blocks survive round-trip (Kimi validates its own signatures)
  • Tool-call chains work without HTTP 400
  • compress and resume preserve reasoning state
  • Zero-risk for non-Kimi providers (scope limited to _is_kimi_coding_endpoint)
  • Unit tests updated for convert_messages_to_anthropic() — pytest 19/19 passed

Changed files

  • agent/anthropic_adapter.py (modified, +29/-20)
  • tests/agent/test_kimi_coding_anthropic_thinking.py (modified, +91/-22)


Kimi Coding: reasoning_config ignored due to _is_kimi_coding guard — thinking parameter never sent

Summary

When using kimi-for-coding with anthropic_messages transport and reasoning_effort: medium (or any reasoning config), the thinking parameter is never sent to the Kimi API. This happens because build_anthropic_kwargs() has an unconditional guard (not _is_kimi_coding) that skips the thinking block for all Kimi endpoints.

Expected Behavior

With reasoning_effort: medium in config, API requests to Kimi Coding should include thinking: {type: "enabled", budget_tokens: 8000} (or adaptive thinking for supported models), and the response should contain reasoning blocks.
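The expected mapping from config to request payload can be sketched as below. This is an illustration, not the adapter's logic: `thinking_from_reasoning_config` and `KNOWN_BUDGETS` are hypothetical names, and only the two budgets that appear in the issue and PR text (low: 4000, medium: 8000) are included. Other effort levels and adaptive thinking are left out because their values are not documented here.

```python
# Budgets taken from the issue and PR text; other effort levels are not
# documented there, so they are deliberately omitted.
KNOWN_BUDGETS = {"low": 4000, "medium": 8000}


def thinking_from_reasoning_config(reasoning_config):
    """Return a thinking payload, or None when reasoning is off or the effort is unknown."""
    if not isinstance(reasoning_config, dict):
        return None
    if reasoning_config.get("enabled") is False:
        return None
    budget = KNOWN_BUDGETS.get(reasoning_config.get("effort"))
    if budget is None:
        return None
    return {"type": "enabled", "budget_tokens": budget}


print(thinking_from_reasoning_config({"enabled": True, "effort": "medium"}))
print(thinking_from_reasoning_config({"enabled": False, "effort": "medium"}))
```

Returning None for unknown efforts keeps the sketch from inventing a budget; a real implementation would need the project's full effort table.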

Actual Behavior

The thinking parameter is stripped. Only model, messages, max_tokens are sent. Response contains only text blocks, no reasoning.

Evidence

1. Debug log from build_anthropic_kwargs

[anthropic_adapter] build_anthropic_kwargs: model=kimi-for-coding, base_url=https://api.kimi.com/coding, _is_kimi_coding=True, reasoning_config={'enabled': True, 'effort': 'medium'}, kwargs_keys=['model', 'messages', 'max_tokens'], thinking=None, output_config=None

2. Actual HTTP request (from anthropic SDK debug)

POST https://api.kimi.com/coding/v1/messages
{"max_tokens": 100, "messages": [{"role": "user", "content": "Say hello"}], "model": "kimi-for-coding"}

No thinking, no output_config.

3. Kimi API DOES support thinking

Direct curl test confirms:

curl -H "Authorization: Bearer $KEY" -H "anthropic-version: 2023-06-01" \
  -d '{"model":"kimi-for-coding","messages":[{"role":"user","content":"Say hello"}],"max_tokens":100,"thinking":{"type":"enabled","budget_tokens":8000}}' \
  https://api.kimi.com/coding/v1/messages

Returns HTTP 200 with thinking block containing reasoning_content.
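For scripted checks, the same request can be assembled in Python with only the standard library. This sketch builds the request without sending it; `build_kimi_request` is a hypothetical helper name, the explicit Content-Type header is an addition not present in the curl command, and the API key is read from the same KEY environment variable with a placeholder fallback.

```python
import json
import os
import urllib.request


def build_kimi_request(api_key, budget_tokens=8000):
    """Build (but do not send) the same request as the curl command above."""
    body = {
        "model": "kimi-for-coding",
        "messages": [{"role": "user", "content": "Say hello"}],
        "max_tokens": 100,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }
    return urllib.request.Request(
        "https://api.kimi.com/coding/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "anthropic-version": "2023-06-01",
            # Assumption: JSON content type; the curl command does not set it.
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_kimi_request(os.environ.get("KEY", "placeholder-key"))
payload = json.loads(req.data)
print(req.get_method(), req.full_url)
print(payload["thinking"])
# To send it, pass req to urllib.request.urlopen() and inspect the JSON
# response for a thinking block.
```

Keeping construction separate from sending makes it easy to assert on the payload in a test before spending an API call.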

Root Cause

agent/anthropic_adapter.py:1869-1870:

_is_kimi_coding = _is_kimi_family_endpoint(base_url, model)
if reasoning_config and isinstance(reasoning_config, dict) and not _is_kimi_coding:
    # ^^^ this branch is skipped for Kimi

The comment says this avoids "thinking is enabled but reasoning_content is missing in assistant tool call message" errors. However, this blanket ban prevents ALL reasoning, even for simple non-tool conversations where the error would not occur.

Proposed Fix

Make the guard conditional — only skip thinking when there are prior assistant tool-call messages in the conversation history that lack reasoning_content. For fresh conversations or non-tool turns, allow thinking.

extent analysis

TL;DR

Modify the _is_kimi_coding guard in agent/anthropic_adapter.py to conditionally skip thinking based on conversation history.

Guidance

  • Review the conversation history to determine if there are prior assistant tool-call messages lacking reasoning_content before skipping the thinking block.
  • Update the build_anthropic_kwargs() function to include the thinking parameter when the conversation history allows it.
  • Verify the fix by checking the HTTP requests sent to the Kimi API for the presence of the thinking parameter.
  • Test the modified code with different conversation scenarios to ensure the thinking block is included correctly.

Example

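def has_tool_call_without_reasoning(conversation_history):
    """Return True if any assistant tool-call message lacks reasoning_content."""
    for message in conversation_history:
        if message.get('role') != 'assistant':
            continue
        # Assumption: tool calls live in an OpenAI-style 'tool_calls' field.
        # The original draft read message['content']['tool'], which raises
        # AttributeError whenever content is a string or None.
        if message.get('tool_calls') and not message.get('reasoning_content'):
            return True
    return False

if reasoning_config and isinstance(reasoning_config, dict):
    if not has_tool_call_without_reasoning(conversation_history):
        # Include the thinking block (8000 is the medium-effort budget from the issue)
        kwargs['thinking'] = {'type': 'enabled', 'budget_tokens': 8000}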

Notes

The proposed fix assumes that the conversation history is accessible and can be checked for prior assistant tool-call messages. The exact implementation may vary depending on the specific requirements and constraints of the project.

Recommendation

Apply the workaround by modifying the _is_kimi_coding guard to conditionally skip thinking based on conversation history, as this allows for more flexible and accurate control over when the thinking block is included.
