hermes - ✅(Solved) Fix DeepSeek /anthropic (V4 thinking): stripped thinking blocks cause HTTP 400 on replay [1 pull requests, 2 comments, 3 participants]

sadiuysal · 2026-04-28T00:50:45Z

[hermes] When using DeepSeek via the Anthropic Messages-compatible endpoint https://api.deepseek.com/anthropic with thinking-capable models e.g. deepseek-v4-pr… When using DeepSeek via the Anthropic Messages-compatible endpoint (`https://api.deepseek.com/anthropic`) with thinking-capable models (e.g. `deepseek-v4-pro`), Hermes can trigger **HTTP 400** from DeepSeek with an error equivalent to: > The `content[].thinking` in the thinking mode must be passed back to the API. # PR #16781: fix: preserve DeepSeek thinking blocks on Anthropic replay (#16748) - Repository: NousResearch/hermes-agent - Author: vominh1919 - State: open | merged: False - Link: https://github.com/NousResearch/hermes-agent/pull/16781 ## Description (problem / solution / changelog) ## Summary When using DeepSeek via their Anthropic Messages-compatible endpoint (`api.deepseek.com/anthropic`) with thinking-capable models, Hermes strips ALL thinking blocks from assistant history on replay, causing DeepSeek to return HTTP 400 because it expects thinking context from prior turns. ## Root Cause `_is_third_party_anthropic_endpoint()` returns `True` for `api.deepseek.com` (it's not `anthropic.com`), causing the thinking-block stripping logic to remove all thinking/redacted_thinking blocks unconditionally. DeepSeek's endpoint, unlike Azure/Bedrock/MiniMax, requires these blocks for reasoning continuity. ## Fix Added `_is_deepseek_anthropic_endpoint()` detector and a dedicated branch in the thinking-block processing logic that: - **Strips** signed Anthropic blocks (DeepSeek can't validate signatures) - **Preserves** unsigned thinking blocks synthesised from `reasoning_content` This follows the exact same pattern as the Kimi exemption (added via issue #13848) and does not change behavior for any other third-party endpoint. **File changed:** `agent/anthropic_adapter.py` (+29 lines) ## Key Code Changes 1. New detector function: ```python def _is_deepseek_anthropic_endpoint(base_url): return "api.deepseek.com" in normalized ``` 2. New branch in thinking-block logic (after Kimi, before generic third-party): ```python elif _is_deepseek: # Strip signed blocks, preserve unsigned ones for replay ``` ## Safety - Kimi and DeepSeek are the only third-party endpoints that get special treatment - Azure, Bedrock, MiniMax, and all other third-party endpoints are unaffected - Direct Anthropic API behavior is completely unchanged Fixes #16748 ## Changed files - `agent/anthropic_adapter.py` (modified, +29/-0) ## Fix / Workaround ## Workarounds (today) Happy to help validate a patch or add a regression test if you want a contributor to pick this up. ## Summary When using DeepSeek via the Anthropic Messages-compatible endpoint (`https://api.deepseek.com/anthropic`) with thinking-capable models (e.g. `deepseek-v4-pro`), Hermes can trigger **HTTP 400** from DeepSeek with an error equivalent to: > The `content[].thinking` in the thinking mode must be passed back to the API. ## Root cause (hypothesis) In `agent/anthropic_adapter.py`, `convert_messages_to_anthropic()` classifies `api.deepseek.com` as a **generic third-party Anthropic endpoint** (`_is_third_party_anthropic_endpoint` → true because the host is not `anthropic.com`). For third-party hosts, the thinking-block pass **strips all `thinking` / `redacted_thinking` blocks** from assistant history (see the branch that builds `stripped` content for `_is_third_party`). DeepSeek V4 **thinking mode** appears to require those **`thinking` content blocks to round-trip** on subsequent turns (similar in spirit to Kimi `/coding`, which already has a dedicated preserve path). So: **Hermes strips required blocks → next request is invalid → 400.** ## Expected behavior For DeepSeek’s `/anthropic` route, assistant message history should **preserve** model-native thinking blocks across turns (at minimum the same class of handling as Kimi’s `/coding` endpoint: strip only Anthropic-signed blocks that third parties cannot validate, keep unsigned/native thinking). Optionally, Hermes should also avoid stacking **Anthropic-native `thinking` / `output_config`** request knobs on DeepSeek’s proxy if DeepSeek drives thinking server-side (parallel to the existing Kimi `/coding` skip in `build_anthropic_kwargs`). ## Actual behavior Thinking blocks are removed from replayed assistant turns → DeepSeek rejects the follow-up request with HTTP 400. ## Repro (high level) 1. Configure primary model: - `provider: deepseek` - `model: deepseek-v4-pro` (or another thinking-enabled SKU) - `base_url: https://api.deepseek.com/anthropic` 2. Enable Hermes reasoning / extended thinking path (e.g. `agent.reasoning_effort: xhigh` — may contribute to entering thinking mode). 3. Run a multi-turn session with tool calls (or any path that replays assistant history). 4. Observe HTTP 400 complaining that `content[].thinking` must be passed back. ## Suggested fix direction 1. Add an endpoint detector for **DeepSeek Anthropic compat** (host contains `api.deepseek.com` and

hermes2026-04-28 00:50:45

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#16748•Fetched 2026-04-28 06:51:00

View on GitHub

Comments

Participants

Timeline

Reactions

Author

Participants

Timeline (top)

labeled ×4commented ×2cross-referenced ×1

When using DeepSeek via the Anthropic Messages-compatible endpoint (https://api.deepseek.com/anthropic) with thinking-capable models (e.g. deepseek-v4-pro), Hermes can trigger HTTP 400 from DeepSeek with an error equivalent to:

The content[].thinking in the thinking mode must be passed back to the API.

Error Message

Root Cause

Root cause (hypothesis)

Fix Action

Fix / Workaround

Workarounds (today)

Happy to help validate a patch or add a regression test if you want a contributor to pick this up.

PR fix notes

PR #16781: fix: preserve DeepSeek thinking blocks on Anthropic replay (#16748)

Repository: NousResearch/hermes-agent
Author: vominh1919
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/16781

Description (problem / solution / changelog)

Summary

When using DeepSeek via their Anthropic Messages-compatible endpoint (api.deepseek.com/anthropic) with thinking-capable models, Hermes strips ALL thinking blocks from assistant history on replay, causing DeepSeek to return HTTP 400 because it expects thinking context from prior turns.

Root Cause

_is_third_party_anthropic_endpoint() returns True for api.deepseek.com (it's not anthropic.com), causing the thinking-block stripping logic to remove all thinking/redacted_thinking blocks unconditionally. DeepSeek's endpoint, unlike Azure/Bedrock/MiniMax, requires these blocks for reasoning continuity.

Fix

Added _is_deepseek_anthropic_endpoint() detector and a dedicated branch in the thinking-block processing logic that:

Strips signed Anthropic blocks (DeepSeek can't validate signatures)
Preserves unsigned thinking blocks synthesised from reasoning_content

This follows the exact same pattern as the Kimi exemption (added via issue #13848) and does not change behavior for any other third-party endpoint.

File changed: agent/anthropic_adapter.py (+29 lines)

Key Code Changes

New detector function:

def _is_deepseek_anthropic_endpoint(base_url):
    return "api.deepseek.com" in normalized

New branch in thinking-block logic (after Kimi, before generic third-party):

elif _is_deepseek:
    # Strip signed blocks, preserve unsigned ones for replay

Safety

Kimi and DeepSeek are the only third-party endpoints that get special treatment
Azure, Bedrock, MiniMax, and all other third-party endpoints are unaffected
Direct Anthropic API behavior is completely unchanged

Fixes #16748

Changed files

agent/anthropic_adapter.py (modified, +29/-0)

RAW_BUFFERClick to expand / collapse

Summary

The content[].thinking in the thinking mode must be passed back to the API.

Root cause (hypothesis)

In agent/anthropic_adapter.py, convert_messages_to_anthropic() classifies api.deepseek.com as a generic third-party Anthropic endpoint (_is_third_party_anthropic_endpoint → true because the host is not anthropic.com).

For third-party hosts, the thinking-block pass strips all thinking / redacted_thinking blocks from assistant history (see the branch that builds stripped content for _is_third_party).

DeepSeek V4 thinking mode appears to require those thinking content blocks to round-trip on subsequent turns (similar in spirit to Kimi /coding, which already has a dedicated preserve path).

So: Hermes strips required blocks → next request is invalid → 400.

Expected behavior

For DeepSeek’s /anthropic route, assistant message history should preserve model-native thinking blocks across turns (at minimum the same class of handling as Kimi’s /coding endpoint: strip only Anthropic-signed blocks that third parties cannot validate, keep unsigned/native thinking).

Optionally, Hermes should also avoid stacking Anthropic-native thinking / output_config request knobs on DeepSeek’s proxy if DeepSeek drives thinking server-side (parallel to the existing Kimi /coding skip in build_anthropic_kwargs).

Actual behavior

Thinking blocks are removed from replayed assistant turns → DeepSeek rejects the follow-up request with HTTP 400.

Repro (high level)

Configure primary model:
- provider: deepseek
- model: deepseek-v4-pro (or another thinking-enabled SKU)
- base_url: https://api.deepseek.com/anthropic
Enable Hermes reasoning / extended thinking path (e.g. agent.reasoning_effort: xhigh — may contribute to entering thinking mode).
Run a multi-turn session with tool calls (or any path that replays assistant history).
Observe HTTP 400 complaining that content[].thinking must be passed back.

Suggested fix direction

Add an endpoint detector for DeepSeek Anthropic compat (host contains api.deepseek.com and path contains /anthropic).
In convert_messages_to_anthropic, treat that endpoint like Kimi /coding for thinking-block preservation (do not use the "strip all thinking for all third-party messages" path).
In build_anthropic_kwargs, consider skipping Hermes-injected Anthropic thinking / output_config for that host (same rationale as Kimi /coding).

Workarounds (today)

Use a non-thinking model variant if available, or
Set agent.reasoning_effort: none (may reduce pressure but may not fully address model-driven thinking), or
Avoid the /anthropic route if an OpenAI-compatible DeepSeek route avoids this contract.

References

Related class of issue: DeepSeek tool-call turns needing reasoning_content padding (run_agent.py mentions DeepSeek/Kimi tool reasoning requirements).
Kimi special-case in convert_messages_to_anthropic is a good template for DeepSeek /anthropic.

Happy to help validate a patch or add a regression test if you want a contributor to pick this up.

extent analysis

TL;DR

To fix the HTTP 400 error with DeepSeek's Anthropic endpoint, preserve model-native thinking blocks across turns by treating the DeepSeek Anthropic compat endpoint like Kimi's /coding endpoint.

Guidance

Identify the DeepSeek Anthropic compat endpoint by checking if the host contains api.deepseek.com and the path contains /anthropic.
Modify convert_messages_to_anthropic to preserve thinking blocks for this endpoint, similar to the Kimi /coding special case.
Consider skipping Hermes-injected Anthropic thinking / output_config for the DeepSeek Anthropic compat endpoint in build_anthropic_kwargs.
As a temporary workaround, use a non-thinking model variant, set agent.reasoning_effort: none, or avoid the /anthropic route if possible.

Example

No code snippet is provided as the issue does not contain sufficient information to generate a specific example.

Notes

The suggested fix direction is based on the hypothesis that the root cause of the issue is the stripping of thinking blocks from assistant history. The actual implementation may vary depending on the specific requirements of the DeepSeek Anthropic endpoint.

Recommendation

Apply the workaround by using a non-thinking model variant or setting agent.reasoning_effort: none until a patch can be implemented to preserve thinking blocks for the DeepSeek Anthropic compat endpoint. This is because the workaround can reduce the pressure on the system and may partially address the issue, but a proper fix is needed to fully resolve the problem.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

FAQ

Expected behavior

#api #agent execution #callback error #memory management #API rate limit

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

hermes - ✅(Solved) Fix DeepSeek /anthropic (V4 thinking): stripped thinking blocks cause HTTP 400 on replay [1 pull requests, 2 comments, 3 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Root cause (hypothesis)

Fix Action

Fix / Workaround

Workarounds (today)

PR fix notes

PR #16781: fix: preserve DeepSeek thinking blocks on Anthropic replay (#16748)

Description (problem / solution / changelog)

Summary

Root Cause

Fix

Key Code Changes

Safety

Changed files

Summary

Root cause (hypothesis)

Expected behavior

Actual behavior

Repro (high level)

Suggested fix direction

Workarounds (today)

References

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING