hermes - ✅(Solved) Fix DeepSeek /anthropic (V4 thinking): stripped thinking blocks cause HTTP 400 on replay [1 pull requests, 2 comments, 3 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#16748Fetched 2026-04-28 06:51:00
View on GitHub
Comments
2
Participants
3
Timeline
7
Reactions
0
Author
Timeline (top)
labeled ×4commented ×2cross-referenced ×1

When using DeepSeek via the Anthropic Messages-compatible endpoint (https://api.deepseek.com/anthropic) with thinking-capable models (e.g. deepseek-v4-pro), Hermes can trigger HTTP 400 from DeepSeek with an error equivalent to:

The content[].thinking in the thinking mode must be passed back to the API.

Error Message

When using DeepSeek via the Anthropic Messages-compatible endpoint (https://api.deepseek.com/anthropic) with thinking-capable models (e.g. deepseek-v4-pro), Hermes can trigger HTTP 400 from DeepSeek with an error equivalent to:

Root Cause

Root cause (hypothesis)

Fix Action

Fix / Workaround

Workarounds (today)

Happy to help validate a patch or add a regression test if you want a contributor to pick this up.

PR fix notes

PR #16781: fix: preserve DeepSeek thinking blocks on Anthropic replay (#16748)

Description (problem / solution / changelog)

Summary

When using DeepSeek via their Anthropic Messages-compatible endpoint (api.deepseek.com/anthropic) with thinking-capable models, Hermes strips ALL thinking blocks from assistant history on replay, causing DeepSeek to return HTTP 400 because it expects thinking context from prior turns.

Root Cause

_is_third_party_anthropic_endpoint() returns True for api.deepseek.com (it's not anthropic.com), causing the thinking-block stripping logic to remove all thinking/redacted_thinking blocks unconditionally. DeepSeek's endpoint, unlike Azure/Bedrock/MiniMax, requires these blocks for reasoning continuity.

Fix

Added _is_deepseek_anthropic_endpoint() detector and a dedicated branch in the thinking-block processing logic that:

  • Strips signed Anthropic blocks (DeepSeek can't validate signatures)
  • Preserves unsigned thinking blocks synthesised from reasoning_content

This follows the exact same pattern as the Kimi exemption (added via issue #13848) and does not change behavior for any other third-party endpoint.

File changed: agent/anthropic_adapter.py (+29 lines)

Key Code Changes

  1. New detector function:
def _is_deepseek_anthropic_endpoint(base_url):
    return "api.deepseek.com" in normalized
  1. New branch in thinking-block logic (after Kimi, before generic third-party):
elif _is_deepseek:
    # Strip signed blocks, preserve unsigned ones for replay

Safety

  • Kimi and DeepSeek are the only third-party endpoints that get special treatment
  • Azure, Bedrock, MiniMax, and all other third-party endpoints are unaffected
  • Direct Anthropic API behavior is completely unchanged

Fixes #16748

Changed files

  • agent/anthropic_adapter.py (modified, +29/-0)
RAW_BUFFERClick to expand / collapse

Summary

When using DeepSeek via the Anthropic Messages-compatible endpoint (https://api.deepseek.com/anthropic) with thinking-capable models (e.g. deepseek-v4-pro), Hermes can trigger HTTP 400 from DeepSeek with an error equivalent to:

The content[].thinking in the thinking mode must be passed back to the API.

Root cause (hypothesis)

In agent/anthropic_adapter.py, convert_messages_to_anthropic() classifies api.deepseek.com as a generic third-party Anthropic endpoint (_is_third_party_anthropic_endpoint → true because the host is not anthropic.com).

For third-party hosts, the thinking-block pass strips all thinking / redacted_thinking blocks from assistant history (see the branch that builds stripped content for _is_third_party).

DeepSeek V4 thinking mode appears to require those thinking content blocks to round-trip on subsequent turns (similar in spirit to Kimi /coding, which already has a dedicated preserve path).

So: Hermes strips required blocks → next request is invalid → 400.

Expected behavior

For DeepSeek’s /anthropic route, assistant message history should preserve model-native thinking blocks across turns (at minimum the same class of handling as Kimi’s /coding endpoint: strip only Anthropic-signed blocks that third parties cannot validate, keep unsigned/native thinking).

Optionally, Hermes should also avoid stacking Anthropic-native thinking / output_config request knobs on DeepSeek’s proxy if DeepSeek drives thinking server-side (parallel to the existing Kimi /coding skip in build_anthropic_kwargs).

Actual behavior

Thinking blocks are removed from replayed assistant turns → DeepSeek rejects the follow-up request with HTTP 400.

Repro (high level)

  1. Configure primary model:
    • provider: deepseek
    • model: deepseek-v4-pro (or another thinking-enabled SKU)
    • base_url: https://api.deepseek.com/anthropic
  2. Enable Hermes reasoning / extended thinking path (e.g. agent.reasoning_effort: xhigh — may contribute to entering thinking mode).
  3. Run a multi-turn session with tool calls (or any path that replays assistant history).
  4. Observe HTTP 400 complaining that content[].thinking must be passed back.

Suggested fix direction

  1. Add an endpoint detector for DeepSeek Anthropic compat (host contains api.deepseek.com and path contains /anthropic).
  2. In convert_messages_to_anthropic, treat that endpoint like Kimi /coding for thinking-block preservation (do not use the "strip all thinking for all third-party messages" path).
  3. In build_anthropic_kwargs, consider skipping Hermes-injected Anthropic thinking / output_config for that host (same rationale as Kimi /coding).

Workarounds (today)

  • Use a non-thinking model variant if available, or
  • Set agent.reasoning_effort: none (may reduce pressure but may not fully address model-driven thinking), or
  • Avoid the /anthropic route if an OpenAI-compatible DeepSeek route avoids this contract.

References

  • Related class of issue: DeepSeek tool-call turns needing reasoning_content padding (run_agent.py mentions DeepSeek/Kimi tool reasoning requirements).
  • Kimi special-case in convert_messages_to_anthropic is a good template for DeepSeek /anthropic.

Happy to help validate a patch or add a regression test if you want a contributor to pick this up.

extent analysis

TL;DR

To fix the HTTP 400 error with DeepSeek's Anthropic endpoint, preserve model-native thinking blocks across turns by treating the DeepSeek Anthropic compat endpoint like Kimi's /coding endpoint.

Guidance

  • Identify the DeepSeek Anthropic compat endpoint by checking if the host contains api.deepseek.com and the path contains /anthropic.
  • Modify convert_messages_to_anthropic to preserve thinking blocks for this endpoint, similar to the Kimi /coding special case.
  • Consider skipping Hermes-injected Anthropic thinking / output_config for the DeepSeek Anthropic compat endpoint in build_anthropic_kwargs.
  • As a temporary workaround, use a non-thinking model variant, set agent.reasoning_effort: none, or avoid the /anthropic route if possible.

Example

No code snippet is provided as the issue does not contain sufficient information to generate a specific example.

Notes

The suggested fix direction is based on the hypothesis that the root cause of the issue is the stripping of thinking blocks from assistant history. The actual implementation may vary depending on the specific requirements of the DeepSeek Anthropic endpoint.

Recommendation

Apply the workaround by using a non-thinking model variant or setting agent.reasoning_effort: none until a patch can be implemented to preserve thinking blocks for the DeepSeek Anthropic compat endpoint. This is because the workaround can reduce the pressure on the system and may partially address the issue, but a proper fix is needed to fully resolve the problem.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

For DeepSeek’s /anthropic route, assistant message history should preserve model-native thinking blocks across turns (at minimum the same class of handling as Kimi’s /coding endpoint: strip only Anthropic-signed blocks that third parties cannot validate, keep unsigned/native thinking).

Optionally, Hermes should also avoid stacking Anthropic-native thinking / output_config request knobs on DeepSeek’s proxy if DeepSeek drives thinking server-side (parallel to the existing Kimi /coding skip in build_anthropic_kwargs).

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING