hermes - ✅(Solved) Fix [Bug] Multi-turn history loses thinking/redacted_thinking blocks -- raw Anthropic content array not preserved as source of truth [2 pull requests, 2 comments, 2 participants]

GitHub stats
NousResearch/hermes-agent#17861 (fetched 2026-05-01 05:55:28)
Comments: 2 · Participants: 2 · Timeline events: 11 · Reactions: 0
Timeline (top): labeled ×4, commented ×2, cross-referenced ×2, mentioned ×1

_build_assistant_message() in run_agent.py stores the raw Anthropic SDK content list in msg["content"], but convert_messages_to_anthropic() then strips thinking/redacted_thinking blocks from all but the latest assistant turn when talking to a direct Anthropic endpoint. Reconstruction from reasoning_details (a separate preserved field) has known edge cases -- see #11096 Bug 2 (trailing thinking block left as final block poisons session).

The practical result: sessions using extended thinking on multi-turn platforms (Discord gateway, Telegram, long CLI sessions) accumulate malformed history and eventually trigger HTTP 400.


Filed by

This issue was filed by Hermes (a Hermes Agent instance running on claude-sonnet-4-6), on behalf of Alex Shultz (Discord: AlexTuba). Root cause was identified via codebase inspection during a live debugging session.

PR fix notes

PR #17884: fix(anthropic): strip thinking blocks from all assistant turns to prevent stale signature HTTP 400 (#17861)

Description (problem / solution / changelog)

Summary

Fixes #17861 — multi-turn extended-thinking sessions (Discord, Telegram, long CLI) crash with HTTP 400 "Invalid signature in thinking block" because persisted thinking blocks accumulate stale signatures after session serialisation, context compression, or message truncation.

Root Cause

convert_messages_to_anthropic() preserved signed thinking/redacted_thinking blocks on the latest assistant turn for reasoning continuity. But Anthropic signs these blocks against the full turn content — any upstream mutation (compression, truncation, merging) invalidates the signature, causing HTTP 400 on the next API call.

Fix (Option A from the issue)

Strip ALL thinking/redacted_thinking blocks from every assistant message — including the latest turn — when targeting direct Anthropic endpoints. Anthropic explicitly allows omitting thinking blocks; the model re-thinks from scratch.

Trade-off: No cross-turn reasoning continuity (model re-thinks each turn). This was already unreliable due to the signature invalidation issue, so practical impact is minimal.

Changes

  • agent/anthropic_adapter.py: Unified the thinking block stripping to apply to ALL assistant turns on direct Anthropic (previously only non-latest were stripped). Removed the now-unused last_assistant_idx variable. The Kimi/DeepSeek unsigned-thinking preservation path is unchanged.
  • tests/agent/test_anthropic_adapter.py: Updated 7 tests in TestThinkingBlockSignatureManagement and 1 in TestConvertMessages to reflect the new strip-all behavior.

Testing

  • All 120 non-OAuth anthropic adapter tests pass
  • All 14 thinking-specific tests pass
  • Kimi/DeepSeek endpoint behavior unchanged (unsigned blocks still preserved)

Impact

  • Extended thinking sessions on multi-turn platforms (Discord/Telegram gateway, long CLI) no longer accumulate invalid signatures
  • Sessions no longer become unrecoverable after several turns
  • The runtime signature recovery path in run_agent.py (strip reasoning_details on 400) remains as a safety net but should no longer be needed

Changed files

  • agent/anthropic_adapter.py (modified, +10/-33)
  • tests/agent/test_anthropic_adapter.py (modified, +31/-38)

PR #17981: fix(anthropic): preserve raw content blocks as source of truth (#17861)

Description (problem / solution / changelog)

What broke

Multi-turn sessions with extended thinking on direct Anthropic endpoints eventually hit HTTP 400 "Invalid signature in thinking block". Sessions become unrecoverable without manually deleting the .jsonl from sessions/.

Root cause

Anthropic signs thinking blocks against the full turn content. Any upstream mutation — context compression, session truncation, orphan stripping, message merging — invalidates those signatures. The previous reconstruction from reasoning_details has edge cases that fail to preserve exact block interleaving and cryptographic signatures.

Why this fix is minimal

Serialize the raw SDK content blocks as stable dicts at storage time (_build_assistant_message) into a _raw_anthropic_content field. In convert_messages_to_anthropic(), when _raw_anthropic_content is present, use it directly as the source of truth — preserving exact block interleaving, cryptographic signatures, and block ordering without relying on reconstruction.

Kimi and DeepSeek paths unchanged. Non-Anthropic providers unaffected.

What I tested

Added 4 new tests in test_anthropic_adapter.py:

  • test_raw_anthropic_content_used_directly — thinking blocks with signatures preserved
  • test_raw_anthropic_content_with_tool_use — tool_use blocks included in raw content
  • test_raw_anthropic_content_preserves_ordering — block ordering (thinking → text → redacted → text)
  • test_multi_turn_raw_content_preserves_all_signatures — multi-turn signature preservation

Existing tests for Kimi/DeepSeek legacy fallback path still pass (no _raw_anthropic_content → falls back to reasoning_details reconstruction).

What I intentionally did not change

  • Kimi /coding endpoint thinking block handling
  • DeepSeek /anthropic endpoint thinking block handling
  • reasoning_details storage/preservation
  • Non-Anthropic provider code paths

Changed files

  • agent/anthropic_adapter.py (modified, +37/-23)
  • agent/auxiliary_client.py (modified, +4/-0)
  • agent/context_compressor.py (modified, +4/-2)
  • agent/model_metadata.py (modified, +63/-348)
  • gateway/platforms/signal.py (modified, +2/-0)
  • gateway/run.py (modified, +1541/-5874)
  • hermes_cli/gateway.py (modified, +11/-0)
  • run_agent.py (modified, +36/-0)
  • tests/agent/test_anthropic_adapter.py (modified, +109/-0)
  • tests/agent/test_context_compressor.py (modified, +26/-0)
  • tests/agent/test_model_metadata_local_ctx.py (modified, +84/-0)
  • tests/gateway/test_signal.py (modified, +210/-0)
  • tests/hermes_cli/test_gateway_python_path.py (added, +152/-0)
  • tests/tools/test_browser_cdp_tool.py (modified, +15/-6)
  • tests/tools/test_llm_content_none_guard.py (modified, +24/-278)
  • tools/browser_cdp_tool.py (modified, +2/-3)

Root cause (code)

Storage (run_agent.py ~line 6469):

msg = {
    "role": "assistant",
    "content": assistant_message.content or "",   # SDK list of blocks -- ThinkingBlock, TextBlock, etc.
    "reasoning": reasoning_text,
    "finish_reason": finish_reason,
}

The raw SDK content array (which includes interleaved thinking/redacted_thinking blocks with their cryptographic signature fields) is stored -- but as live Pydantic SDK objects, not as stable serialised dicts.
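To make the distinction between live SDK objects and stable dicts concrete, here is a minimal sketch using only the standard library. `FakeThinkingBlock` and `to_plain_blocks` are hypothetical names introduced for illustration; the real SDK classes are Pydantic models, and the dataclass below only imitates their `model_dump()` method.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FakeThinkingBlock:
    """Hypothetical stand-in for an SDK thinking block (not the real class)."""
    thinking: str
    signature: str
    type: str = "thinking"

    def model_dump(self) -> dict:
        # Imitates the Pydantic-style method name mentioned in the issue.
        return asdict(self)


def to_plain_blocks(content) -> list:
    """Convert a list of block objects (or dicts) into plain dicts.

    Non-list content (for example a plain string or None) yields an empty list.
    """
    if not isinstance(content, list):
        return []
    return [b if isinstance(b, dict) else b.model_dump() for b in content]


live = [FakeThinkingBlock(thinking="step 1", signature="sig-abc")]

# A live object cannot be written to a JSON session file directly.
try:
    json.dumps(live)
    live_is_serialisable = True
except TypeError:
    live_is_serialisable = False

# Plain dicts survive a JSON round trip with the signature field intact.
restored = json.loads(json.dumps(to_plain_blocks(live)))
```

The point of the sketch is only that the signature survives persistence once the blocks are plain data; how Hermes actually persists sessions is not shown here.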

Consumption (agent/anthropic_adapter.py ~line 1120):

When convert_messages_to_anthropic() rebuilds history for the next API call, it strips thinking blocks from all assistant turns except the latest on a direct Anthropic endpoint. The intent is to avoid sending unsigned or stale blocks, but it relies on downstream reconstruction from reasoning_details -- which has edge cases (see #11096).

Anthropic's rule: "If you send thinking blocks back, they must be unmodified."
The corollary: you are always allowed to not send them back (Option A below).


Relationship to existing issues

  • #11096 Bug 2 (OPEN) -- describes the symptom: trailing thinking block as final content block after compression/truncation causes HTTP 400. Proposes a guard patch at convert_messages_to_anthropic(). That patch treats the symptom; this issue identifies the structural gap underneath it.
  • #16748 (CLOSED) -- covered the third-party/DeepSeek variant where blocks were stripped intentionally because the host is not anthropic.com and blocks can't be validated. This issue is about the direct Anthropic path.

Option A (immediate mitigation -- simpler)

Strip all thinking/redacted_thinking blocks from assistant messages before sending to Anthropic. Never send them back. Anthropic explicitly allows this.

Cost: No reasoning continuity across turns -- the model re-thinks from scratch on every turn.
Benefit: Eliminates the 400 completely. One targeted change in convert_messages_to_anthropic().

This is the right call for any deployment currently being killed by this 400.
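A minimal sketch of what Option A could look like, assuming the history is a list of dicts whose assistant `content` is a list of plain block dicts. `strip_thinking_blocks` and the placeholder text are hypothetical; this is not the code from `convert_messages_to_anthropic()`.

```python
THINKING_TYPES = {"thinking", "redacted_thinking"}


def strip_thinking_blocks(messages: list) -> list:
    """Return a copy of the history with thinking blocks removed
    from every assistant turn, including the latest one."""
    cleaned = []
    for msg in messages:
        content = msg.get("content")
        if msg.get("role") != "assistant" or not isinstance(content, list):
            # User turns and plain-string content pass through untouched.
            cleaned.append(msg)
            continue
        kept = [
            block for block in content
            if not (isinstance(block, dict) and block.get("type") in THINKING_TYPES)
        ]
        if not kept:
            # Assumption: an assistant turn with no blocks left is undesirable,
            # so a placeholder text block is substituted (hypothetical choice).
            kept = [{"type": "text", "text": "(no visible output)"}]
        cleaned.append({**msg, "content": kept})
    return cleaned


history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": [
        {"type": "thinking", "thinking": "...", "signature": "sig-1"},
        {"type": "text", "text": "hello"},
    ]},
    {"role": "assistant", "content": [
        {"type": "redacted_thinking", "data": "opaque"},
    ]},
]
result = strip_thinking_blocks(history)
```

The function builds new message dicts rather than mutating the stored history, so the persisted session keeps whatever it had and only the outgoing request is affected.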


Option B (proper fix -- more work)

Store the raw Anthropic content array as a stable serialised hidden field at the point where the SDK response is received and converted, before _build_assistant_message() finalises the dict. The exact capture point should be determined by whoever implements this -- the goal is a _raw_anthropic_content field (or equivalent) on the stored message dict that contains the content blocks as plain dicts (.model_dump() or equivalent), not live SDK objects.

Then in convert_messages_to_anthropic(), when _raw_anthropic_content is present, use it directly as the source of truth instead of reconstructing from content + reasoning_details.

This preserves exact block interleaving and cryptographic signatures, removes all reconstruction edge cases, and restores reasoning continuity across turns.
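The consumption side of Option B might be sketched as follows. `assistant_content_for_api` is a hypothetical helper, and the fallback branch is a simplified placeholder for the existing reconstruction path, whose details are not shown in this issue.

```python
import copy

THINKING_TYPES = {"thinking", "redacted_thinking"}


def assistant_content_for_api(msg: dict) -> list:
    """Pick the content blocks to send back for one stored assistant message."""
    raw = msg.get("_raw_anthropic_content")
    if isinstance(raw, list) and raw:
        # Source of truth: return the stored blocks unmodified, in their
        # original order. A deep copy keeps later edits off the stored record.
        return copy.deepcopy(raw)

    # Fallback (assumption): no raw field, so send only non-thinking content.
    content = msg.get("content")
    if isinstance(content, list):
        return [
            b for b in content
            if isinstance(b, dict) and b.get("type") not in THINKING_TYPES
        ]
    return [{"type": "text", "text": content}] if content else []


stored = {
    "role": "assistant",
    "content": "answer part 1 answer part 2",
    "_raw_anthropic_content": [
        {"type": "thinking", "thinking": "plan", "signature": "sig-1"},
        {"type": "text", "text": "answer part 1"},
        {"type": "redacted_thinking", "data": "opaque"},
        {"type": "text", "text": "answer part 2"},
    ],
}
blocks = assistant_content_for_api(stored)
legacy = assistant_content_for_api({"role": "assistant", "content": "plain reply"})
```

Returning the raw list as-is is what keeps the interleaving (thinking, text, redacted_thinking, text) and the signature fields exactly as received. This only helps if nothing upstream rewrites `_raw_anthropic_content` during compression or truncation.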


Impact

  • Extended thinking sessions on multi-turn platforms (Discord/Telegram gateway especially)
  • The failure is latent for the first several turns; once enough malformed history accumulates, the HTTP 400 kills the session
  • Session unrecoverable without manually deleting the .jsonl from sessions/

Environment

  • Hermes v0.9+ with extended thinking enabled (claude-sonnet-4-6 or later)
  • Direct Anthropic endpoint (not third-party)
  • Multi-turn gateway sessions

extent analysis

TL;DR

Strip all thinking/redacted_thinking blocks from assistant messages before sending to Anthropic to immediately mitigate the HTTP 400 issue.

Guidance

  • Identify the point where the SDK response is received and converted to store the raw Anthropic content array as a stable serialised hidden field, such as _raw_anthropic_content.
  • In convert_messages_to_anthropic(), use the stored _raw_anthropic_content as the source of truth instead of reconstructing from content + reasoning_details to preserve exact block interleaving and cryptographic signatures.
  • Consider implementing Option A as an immediate mitigation to strip all thinking/redacted_thinking blocks, which eliminates the 400 but loses reasoning continuity across turns.
  • Review the impact on extended thinking sessions on multi-turn platforms, such as Discord and Telegram gateways, and the environment requirements, including Hermes v0.9+ with extended thinking enabled.

Example

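# In _build_assistant_message()
# content is a list of SDK blocks, and a list has no .model_dump() of its own,
# so each block is dumped individually (string content yields an empty list).
content = assistant_message.content
raw_blocks = [
    b.model_dump() if hasattr(b, "model_dump") else b
    for b in (content if isinstance(content, list) else [])
]
msg = {
    "role": "assistant",
    "content": assistant_message.content or "",
    "reasoning": reasoning_text,
    "finish_reason": finish_reason,
    "_raw_anthropic_content": raw_blocks,  # plain dicts: a stable serialised hidden field
}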

Notes

The choice between Option A and Option B is a trade-off: Option A stops the HTTP 400 immediately but drops reasoning continuity across turns, while Option B keeps continuity at the cost of more implementation work.

Recommendation

Apply Option A as an immediate mitigation: stripping all thinking/redacted_thinking blocks is the simpler change and eliminates the 400, at the cost of reasoning continuity across turns.
