hermes - ✅(Solved) Fix [Bug] Multi-turn history loses thinking/redacted_thinking blocks -- raw Anthropic content array not preserved as source of truth [2 pull requests, 2 comments, 2 participants]

alexshultz · 2026-04-30T09:15:00Z



GitHub stats

NousResearch/hermes-agent#17861•Fetched 2026-05-01 05:55:28


Author

alexshultz

Participants

alexshultz

teknium1

Timeline (top)

labeled ×4, commented ×2, cross-referenced ×2, mentioned ×1

_build_assistant_message() in run_agent.py stores the raw Anthropic SDK content list in msg["content"], but convert_messages_to_anthropic() then strips thinking/redacted_thinking blocks from all but the latest assistant turn when talking to a direct Anthropic endpoint. Reconstruction from reasoning_details (a separate preserved field) has known edge cases -- see #11096 Bug 2 (trailing thinking block left as final block poisons session).

The practical result: sessions using extended thinking on multi-turn platforms (Discord gateway, Telegram, long CLI sessions) accumulate malformed history and eventually trigger HTTP 400.

Root Cause

This issue was filed by Hermes (a Hermes Agent instance running on claude-sonnet-4-6), on behalf of Alex Shultz (Discord: AlexTuba). Root cause was identified via codebase inspection during a live debugging session.


PR fix notes

PR #17884: fix(anthropic): strip thinking blocks from all assistant turns to prevent stale signature HTTP 400 (#17861)

Repository: NousResearch/hermes-agent
Author: luyao618
State: closed | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/17884

Description (problem / solution / changelog)

Summary

Fixes #17861 — multi-turn extended-thinking sessions (Discord, Telegram, long CLI) crash with HTTP 400 "Invalid signature in thinking block" because persisted thinking blocks accumulate stale signatures after session serialisation, context compression, or message truncation.

Root Cause

convert_messages_to_anthropic() preserved signed thinking/redacted_thinking blocks on the latest assistant turn for reasoning continuity. But Anthropic signs these blocks against the full turn content — any upstream mutation (compression, truncation, merging) invalidates the signature, causing HTTP 400 on the next API call.

Fix (Option A from the issue)

Strip ALL thinking/redacted_thinking blocks from every assistant message — including the latest turn — when targeting direct Anthropic endpoints. Anthropic explicitly allows omitting thinking blocks; the model re-thinks from scratch.

Trade-off: No cross-turn reasoning continuity (model re-thinks each turn). This was already unreliable due to the signature invalidation issue, so practical impact is minimal.
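The strip-all behaviour described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `strip_thinking_blocks`, the assumption that content blocks are plain dicts, and the placeholder text used when a turn contained only thinking blocks are all hypothetical.

```python
from typing import Any

THINKING_TYPES = {"thinking", "redacted_thinking"}


def strip_thinking_blocks(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of `messages` with thinking blocks removed from every assistant turn."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if msg.get("role") == "assistant" and isinstance(content, list):
            kept = [
                block
                for block in content
                if not (isinstance(block, dict) and block.get("type") in THINKING_TYPES)
            ]
            if not kept:
                # Assumption of this sketch: avoid sending an empty content list
                # by substituting a placeholder text block.
                kept = [{"type": "text", "text": "(empty)"}]
            # Build a new dict so the stored history is left untouched.
            msg = {**msg, "content": kept}
        cleaned.append(msg)
    return cleaned
```

Because the function returns new message dicts, the persisted session history keeps its thinking blocks; only the outgoing request omits them.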

Changes

agent/anthropic_adapter.py: Unified the thinking block stripping to apply to ALL assistant turns on direct Anthropic (previously only non-latest were stripped). Removed the now-unused last_assistant_idx variable. The Kimi/DeepSeek unsigned-thinking preservation path is unchanged.
tests/agent/test_anthropic_adapter.py: Updated 7 tests in TestThinkingBlockSignatureManagement and 1 in TestConvertMessages to reflect the new strip-all behavior.

Testing

All 120 non-OAuth anthropic adapter tests pass
All 14 thinking-specific tests pass
Kimi/DeepSeek endpoint behavior unchanged (unsigned blocks still preserved)

Impact

Extended thinking sessions on multi-turn platforms (Discord/Telegram gateway, long CLI) no longer accumulate invalid signatures
Sessions no longer become unrecoverable after several turns
The runtime signature recovery path in run_agent.py (strip reasoning_details on 400) remains as a safety net but should no longer be needed
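The safety net mentioned in the last item (strip `reasoning_details` on a 400, then retry) might look roughly like the sketch below. The exception class, the `send` callable, and the function name are hypothetical stand-ins; the actual recovery path in `run_agent.py` is not shown in this issue.

```python
from typing import Any, Callable


class InvalidSignatureError(Exception):
    """Hypothetical stand-in for an HTTP 400 'Invalid signature in thinking block'."""


def send_with_signature_recovery(
    send: Callable[[list[dict[str, Any]]], Any],
    messages: list[dict[str, Any]],
) -> Any:
    """Try the request once; on a signature error, drop reasoning_details and retry once."""
    try:
        return send(messages)
    except InvalidSignatureError:
        # Remove the field the stale thinking blocks would be rebuilt from.
        recovered = [
            {key: value for key, value in msg.items() if key != "reasoning_details"}
            for msg in messages
        ]
        return send(recovered)
```

The retry happens only once, so a second failure propagates to the caller instead of looping.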

Changed files

agent/anthropic_adapter.py (modified, +10/-33)
tests/agent/test_anthropic_adapter.py (modified, +31/-38)

PR #17981: fix(anthropic): preserve raw content blocks as source of truth (#17861)

Repository: NousResearch/hermes-agent
Author: Linux2010
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/17981

Description (problem / solution / changelog)

What broke

Multi-turn sessions with extended thinking on direct Anthropic endpoints eventually hit HTTP 400 "Invalid signature in thinking block". Sessions become unrecoverable without manually deleting the .jsonl from sessions/.

Root cause

Anthropic signs thinking blocks against the full turn content. Any upstream mutation — context compression, session truncation, orphan stripping, message merging — invalidates those signatures. The previous reconstruction from reasoning_details has edge cases that fail to preserve exact block interleaving and cryptographic signatures.

Why this fix is minimal

Serialize the raw SDK content blocks as stable dicts at storage time (_build_assistant_message) into a _raw_anthropic_content field. In convert_messages_to_anthropic(), when _raw_anthropic_content is present, use it directly as the source of truth — preserving exact block interleaving, cryptographic signatures, and block ordering without relying on reconstruction.

Kimi and DeepSeek paths unchanged. Non-Anthropic providers unaffected.

What I tested

Added 4 new tests in test_anthropic_adapter.py:

test_raw_anthropic_content_used_directly — thinking blocks with signatures preserved
test_raw_anthropic_content_with_tool_use — tool_use blocks included in raw content
test_raw_anthropic_content_preserves_ordering — block ordering (thinking → text → redacted → text)
test_multi_turn_raw_content_preserves_all_signatures — multi-turn signature preservation

Existing tests for Kimi/DeepSeek legacy fallback path still pass (no _raw_anthropic_content → falls back to reasoning_details reconstruction).

What I intentionally did not change

Kimi /coding endpoint thinking block handling
DeepSeek /anthropic endpoint thinking block handling
reasoning_details storage/preservation
Non-Anthropic provider code paths

Changed files

agent/anthropic_adapter.py (modified, +37/-23)
agent/auxiliary_client.py (modified, +4/-0)
agent/context_compressor.py (modified, +4/-2)
agent/model_metadata.py (modified, +63/-348)
gateway/platforms/signal.py (modified, +2/-0)
gateway/run.py (modified, +1541/-5874)
hermes_cli/gateway.py (modified, +11/-0)
run_agent.py (modified, +36/-0)
tests/agent/test_anthropic_adapter.py (modified, +109/-0)
tests/agent/test_context_compressor.py (modified, +26/-0)
tests/agent/test_model_metadata_local_ctx.py (modified, +84/-0)
tests/gateway/test_signal.py (modified, +210/-0)
tests/hermes_cli/test_gateway_python_path.py (added, +152/-0)
tests/tools/test_browser_cdp_tool.py (modified, +15/-6)
tests/tools/test_llm_content_none_guard.py (modified, +24/-278)
tools/browser_cdp_tool.py (modified, +2/-3)


Filed by

Summary

The practical result: sessions using extended thinking on multi-turn platforms (Discord gateway, Telegram, long CLI sessions) accumulate malformed history and eventually trigger HTTP 400.

Root cause (code)

Storage (run_agent.py ~line 6469):

msg = {
    "role": "assistant",
    "content": assistant_message.content or "",   # SDK list of blocks -- ThinkingBlock, TextBlock, etc.
    "reasoning": reasoning_text,
    "finish_reason": finish_reason,
}

The raw SDK content array (which includes interleaved thinking/redacted_thinking blocks with their cryptographic signature fields) is stored -- but as live Pydantic SDK objects, not as stable serialised dicts.
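To illustrate the difference between live objects and stable dicts, here is a small sketch. `FakeThinkingBlock` is a hypothetical stand-in for an SDK block (only a `model_dump()` method is assumed), and `to_stable_dicts` is an illustrative helper name, not something from the Hermes codebase.

```python
import json


class FakeThinkingBlock:
    """Hypothetical stand-in for a live SDK block object exposing model_dump()."""

    def __init__(self, thinking: str, signature: str) -> None:
        self.type = "thinking"
        self.thinking = thinking
        self.signature = signature

    def model_dump(self) -> dict:
        return {"type": self.type, "thinking": self.thinking, "signature": self.signature}


def to_stable_dicts(blocks: list) -> list[dict]:
    """Convert each block to a plain dict; blocks that are already dicts pass through."""
    return [block if isinstance(block, dict) else block.model_dump() for block in blocks]


live = [FakeThinkingBlock("step 1", "sig-abc")]

# A live object cannot be written to a JSON session file as-is.
try:
    json.dumps(live)
    serialisable = True
except TypeError:
    serialisable = False

# Plain dicts survive a JSON round trip with the signature field intact.
stable = to_stable_dicts(live)
round_tripped = json.loads(json.dumps(stable))
```

The point of the sketch is only that the signature field must reach storage byte-for-byte; how Hermes actually persists sessions is outside what this issue shows.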

Consumption (agent/anthropic_adapter.py ~line 1120):

When convert_messages_to_anthropic() rebuilds history for the next API call, it strips thinking blocks from all assistant turns except the latest on a direct Anthropic endpoint. The intent is to avoid sending unsigned or stale blocks, but it relies on downstream reconstruction from reasoning_details -- which has edge cases (see #11096).

Anthropic's rule: "If you send thinking blocks back, they must be unmodified."
The corollary: you are always allowed to not send them back (Option A below).

Relationship to existing issues

#11096 Bug 2 (OPEN) -- describes the symptom: trailing thinking block as final content block after compression/truncation causes HTTP 400. Proposes a guard patch at convert_messages_to_anthropic(). That patch treats the symptom; this issue identifies the structural gap underneath it.
#16748 (CLOSED) -- covered the third-party/DeepSeek variant where blocks were stripped intentionally because the host is not anthropic.com and blocks can't be validated. This issue is about the direct Anthropic path.
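For context, a symptom-level guard of the kind #11096 is said to propose could be sketched as below. The actual patch is not quoted here, so the function name and the exact behaviour (dropping every trailing thinking block) are assumptions.

```python
THINKING_TYPES = {"thinking", "redacted_thinking"}


def drop_trailing_thinking(content: list[dict]) -> list[dict]:
    """Return a copy of `content` with thinking blocks removed from the end only."""
    trimmed = list(content)
    # Pop from the tail until the final block is something other than thinking.
    while trimmed and trimmed[-1].get("type") in THINKING_TYPES:
        trimmed.pop()
    # Note: a turn made up solely of thinking blocks comes back empty,
    # which the caller would still need to handle.
    return trimmed
```

A guard like this leaves interior thinking blocks in place, which is why it addresses the trailing-block symptom but not stale signatures elsewhere in the history.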

Option A (immediate mitigation -- simpler)

Strip all thinking/redacted_thinking blocks from assistant messages before sending to Anthropic. Never send them back. Anthropic explicitly allows this.

Cost: No reasoning continuity across turns -- the model re-thinks from scratch on every turn.
Benefit: Eliminates the 400 completely. One targeted change in convert_messages_to_anthropic().

This is the right call for any deployment currently being killed by this 400.

Option B (proper fix -- more work)

Store the raw Anthropic content array as a stable serialised hidden field at the point where the SDK response is received and converted, before _build_assistant_message() finalises the dict. The exact capture point should be determined by whoever implements this -- the goal is a _raw_anthropic_content field (or equivalent) on the stored message dict that contains the content blocks as plain dicts (.model_dump() or equivalent), not live SDK objects.

Then in convert_messages_to_anthropic(), when _raw_anthropic_content is present, use it directly as the source of truth instead of reconstructing from content + reasoning_details.

This preserves exact block interleaving and cryptographic signatures, removes all reconstruction edge cases, and restores reasoning continuity across turns.
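The consumption side of Option B can be sketched as follows. The function name is hypothetical, and the fallback shown here simply drops thinking blocks; the real adapter is described as reconstructing from `reasoning_details` instead, which this sketch does not attempt.

```python
import copy
from typing import Any

THINKING_TYPES = {"thinking", "redacted_thinking"}


def assistant_content_for_api(msg: dict[str, Any]) -> list[dict[str, Any]]:
    """Pick the content blocks to send for one stored assistant message."""
    raw = msg.get("_raw_anthropic_content")
    if raw:
        # Source of truth: return the stored blocks unmodified and in order.
        # deepcopy keeps later mutations of the request away from stored history.
        return copy.deepcopy(raw)

    # Fallback for messages stored without the raw field (assumption of this sketch).
    content = msg.get("content")
    if isinstance(content, str):
        return [{"type": "text", "text": content}] if content else []
    return [
        block
        for block in (content or [])
        if isinstance(block, dict) and block.get("type") not in THINKING_TYPES
    ]
```

Under this design, any step that rewrites an assistant turn (compression, truncation, merging) would also need to drop or refresh `_raw_anthropic_content` for that turn, otherwise the raw field and the visible content can drift apart.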

Impact

Extended thinking sessions on multi-turn platforms (Discord/Telegram gateway especially)
The 400 is silent for the first several turns, then the accumulated malformed history kills the session
Session unrecoverable without manually deleting the .jsonl from sessions/

Environment

Hermes v0.9+ with extended thinking enabled (claude-sonnet-4-6 or later)
Direct Anthropic endpoint (not third-party)
Multi-turn gateway sessions

extent analysis

TL;DR

Strip all thinking/redacted_thinking blocks from assistant messages before sending to Anthropic to immediately mitigate the HTTP 400 issue.

Guidance

Identify the point where the SDK response is received and converted to store the raw Anthropic content array as a stable serialised hidden field, such as _raw_anthropic_content.
In convert_messages_to_anthropic(), use the stored _raw_anthropic_content as the source of truth instead of reconstructing from content + reasoning_details to preserve exact block interleaving and cryptographic signatures.
Consider implementing Option A as an immediate mitigation to strip all thinking/redacted_thinking blocks, which eliminates the 400 but loses reasoning continuity across turns.
Review the impact on extended thinking sessions on multi-turn platforms, such as Discord and Telegram gateways, and the environment requirements, including Hermes v0.9+ with extended thinking enabled.

Example

# In _build_assistant_message()
content_blocks = assistant_message.content or []
msg = {
    "role": "assistant",
    "content": assistant_message.content or "",
    "reasoning": reasoning_text,
    "finish_reason": finish_reason,
    # content is a list of SDK blocks, so dump each block; the list itself has no model_dump()
    "_raw_anthropic_content": [block.model_dump() for block in content_blocks],
}

Notes

The choice between Option A and Option B depends on the priority of preserving reasoning continuity across turns versus immediately mitigating the HTTP 400 issue.

Recommendation

Apply Option A as an immediate mitigation to strip all thinking/redacted_thinking blocks, as it is a simpler and more straightforward solution that eliminates the 400 issue, even though it loses reasoning continuity across turns.
