openclaw - ✅(Solved) Fix Anthropic thinking block 'signature' field lost during session persistence — causes API rejection [2 pull requests, 1 participants]

Q: Expected behavior

The `signature` field (and any other fields) on `thinking` / `redacted_thinking` content blocks should be preserved exactly as received from the API when serializing to the session file.

openclaw2026-03-12 05:02:13

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#43691•Fetched 2026-04-08 00:17:07

View on GitHub

Comments

Participants

Timeline

Reactions

Author

PenguinMiaou

Participants

PenguinMiaou

Timeline (top)

cross-referenced ×3

Error Message

After ~15+ assistant turns with thinking, the session starts failing with the above error

Root Cause

Anthropic API returns thinking blocks with a signature field that must be passed back verbatim. During session persistence (writing to .jsonl), the signature field is stripped or not captured. Inspecting the session file confirms all thinking blocks have no signature:

type=thinking thinking_len=224 signature=(none)
type=thinking thinking_len=562 signature=(none)
...

This causes any subsequent API call using those messages to fail, since the thinking blocks are considered "modified".

Fix Action

Fix / Workaround

Once the session accumulates enough messages with thinking blocks, all further interactions in that session fail permanently
/new and /reset commands may not properly clear the corrupted session
The only workaround is manually deleting the session JSONL file

PR fix notes

PR #44009: fix(agents): exclude synthetic assistant transcript mirrors from Anthropic replay

Repository: openclaw/openclaw
Author: Haohao-end
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/44009

Description (problem / solution / changelog)

Summary

Fixes Anthropic extended-thinking replay failures by deduplicating synthetic delivery-mirror assistant transcript turns only when they are true duplicates of an adjacent real assistant reply.

Although the issue initially appeared to be a persistence/signature-loss problem, the session JSONL round-trip already preserved thinking signatures correctly. The actual bug was in replay assembly: a synthetic OpenClaw delivery-mirror assistant turn could be replayed alongside the original assistant reply, which polluted replay history and caused Anthropic reasoning replay to reject it as modified thinking content.

What changed

replace target-based delivery-mirror removal with transcript-based deduplication
keep the mirror marker narrow:
- role === "assistant"
- provider === "openclaw"
- model === "delivery-mirror"
only drop a mirror when it is a true adjacent duplicate:
- it follows a real assistant turn
- the previous assistant turn is not from openclaw
- the normalized visible assistant text matches
preserve delivery-mirror turns when they are the only persisted assistant history
preserve non-mirror OpenClaw assistant transcript entries such as gateway-injected
add regression coverage for:
- session persistence fidelity of thinking / redacted-thinking content
- final Anthropic replay payload assembly
- dropping duplicated mirror turns that shadow a real assistant reply
- preserving mirror-only assistant history
- preserving gateway-injected assistant entries
- intentional abort handling in the payload-capture test helper

Why this fix is safe

This change is intentionally narrow.

It no longer relies on replay target heuristics to remove mirrored assistant history. Instead, it removes only transcript entries that are structurally identifiable as duplicate delivery-mirror shadows of an adjacent real assistant reply. Mirror-only assistant history remains intact.

Validation

pnpm test src/agents/pi-embedded-runner.thinking-signature-persistence.test.ts src/agents/pi-embedded-runner.anthropic-thinking-replay.test.ts
pnpm build
pnpm check

Closes #43691

Changed files

src/agents/pi-embedded-runner.anthropic-thinking-replay.test.ts (added, +351/-0)
src/agents/pi-embedded-runner.thinking-signature-persistence.test.ts (added, +294/-0)
src/agents/pi-embedded-runner/google.ts (modified, +69/-1)

PR #55: feat: foundation layer — trogon-nats, trogon-mcp, trogon-agent-core, acp-telemetry

Repository: TrogonStack/trogonai
Author: jramirezhdez02
State: closed | merged: False
Link: https://github.com/TrogonStack/trogonai/pull/55

Description (problem / solution / changelog)

What

Adds the shared infrastructure that the ACP-over-NATS stack sits on. No product behavior yet — this establishes the primitives that the bridge and runner crates build on.

Crates

`trogon-nats`

Wraps async-nats with connection management, auth, messaging utilities, and a test mock.

auth.rs — NatsAuth enum with 5 variants: Credentials (file path), NKey, UserPassword, Token, None. NatsConfig reads from environment with explicit priority order: NATS_CREDS > NATS_NKEY > NATS_USER+ NATS_PASSWORD > NATS_TOKEN > no auth. Supports comma-separated NATS_URL for multi-server clusters.

connect.rs — connect(config, timeout) -> Result<Client, ConnectError>. Uses retry_on_initial_connect() so async_nats::connect() returns immediately and the handshake runs in the background. A oneshot channel catches the first meaningful event: Connected → Ok, authorization violation → Err (fail fast, no point retrying), unreachable server → return the client and let the reconnect loop continue. Exponential backoff capped at 30 s.

client.rs — four traits over async_nats::Client: SubscribeClient, RequestClient, PublishClient, FlushClient. All bridge and runner code is generic over these traits, which makes unit testing with mocks possible without spawning a real NATS server.

messaging.rs — request / request_with_timeout: serialize request, inject W3C trace context headers, send NATS request, deserialize response. publish: same but fire-and-forget with optional flush. RetryPolicy (no-retries or standard 3×, exponential 50 ms base) and FlushPolicy wrap both operations. Builder pattern for PublishOptions.

mocks.rs — AdvancedMockNatsClient: subscribe queues (each inject_messages() call pushes a new UnboundedReceiver that the next subscribe() call pops), request responses keyed by subject, fail_next_* controls for publish, flush, request. All tests in the bridge and runner use this instead of spawning NATS.

`trogon-mcp`

client.rs — McpClient: HTTP JSON-RPC 2.0 client for a single MCP server. Three methods: initialize() (handshake), list_tools() -> Vec<McpTool>, call_tool(name, args) -> String. Uses a global atomic counter for JSON-RPC ids. McpTool carries name, description, and inputSchema (raw serde_json::Value passed straight through to the Anthropic tool definition). On isError: true the call returns Err(text) so the agent loop can handle tool failures.

`trogon-agent-core`

tools/mod.rs — ToolDef (Anthropic tool definition: name, description, JSON Schema, optional cache_control for prompt caching on the last tool). ToolContext carries a reqwest::Client and proxy_url. dispatch_tool is a no-op stub — real dispatch happens in the runner via mcp_dispatch.

agent_loop.rs — AgentLoop: the Anthropic messages API call-loop. Fields: http_client, proxy_url, anthropic_token, anthropic_base_url (optional override), anthropic_extra_headers, model, max_iterations, thinking_budget, tool_context, MCP tool defs + dispatch list, permission_checker (optional gate called before each tool execution).

run(history, system_prompt, event_tx) streams events back through a channel: TextDelta, ToolUse, ToolResult, ModeChanged, Usage. Loop: POST to messages API → if end_turn emit final text and return → if tool_use check permission gate, dispatch (MCP first, then built-in), emit ToolResult, append to history, loop. Stops at max_iterations.

Wire types: Message (role + Vec<ContentBlock>), ContentBlock (Text, Image, Thinking, ToolUse, ToolResult), ImageSource (Base64 or Url), ToolResult. PermissionChecker trait for async approval callbacks.

`trogon-std`

Testable abstractions over OS APIs so the crates above can be tested without touching the real filesystem or environment.

fs/ — traits: ReadFile, WriteFile, CreateDirAll, OpenAppendFile, ExistsFile. Two impls: SystemFs (real OS calls) and MemFs (in-memory HashMap<PathBuf, String> used in tests).

env/ — ReadEnv trait with SystemEnv and InMemoryEnv impls. InMemoryEnv wraps a HashMap behind a Mutex — tests set variables without touching std::env.

time/ — GetNow and GetElapsed traits. SystemClock uses std::time::Instant. MockClock lets tests advance time manually.

json.rs — JsonSerialize trait + StdJsonSerialize impl. FailNextSerialize fails the Nth call (used to test error paths without mocking the entire serializer).

args.rs — args() helper that returns std::env::args() as a Vec<String> skipping the binary name.

`acp-telemetry`

OpenTelemetry setup for all ACP binaries.

lib.rs — init_logger(service_name, acp_prefix, env, fs): sets up a tracing subscriber with three layers: JSON to stderr, JSON to a log file (path: ACP_LOG_DIR env var or platform data dir), and OTel traces+metrics+logs if the OTel exporter initializes successfully. Falls back gracefully to stderr-only if OTel is unavailable. shutdown_otel() flushes and shuts down all three providers. meter(name) returns a named opentelemetry::metrics::Meter.

service_name.rs — ServiceName enum (AcpNatsWs, AcpNatsStdio, AcpRunner) used to name log files and OTel resources.

trace.rs / metric.rs / log.rs — each wraps an OnceLock for its SdkProvider, plus init_provider, force_flush, and shutdown. The signal.rs module provides a cross-platform wait_for_shutdown_signal that catches SIGTERM and SIGINT.

How to review

trogon-nats/src/connect.rs — the startup handshake and backoff logic
trogon-nats/src/messaging.rs — the retry/flush machinery used by the bridge
trogon-agent-core/src/agent_loop.rs — the Anthropic streaming loop

cargo test -p trogon-nats -p trogon-mcp -p trogon-agent-core -p trogon-std -p acp-telemetry

## Changed files

- `.github/workflows/ci-rust.yml` (modified, +2/-0)
- `rsworkspace/Cargo.lock` (modified, +1626/-102)
- `rsworkspace/crates/acp-telemetry/src/lib.rs` (modified, +38/-0)
- `rsworkspace/crates/trogon-agent-core/Cargo.toml` (added, +20/-0)
- `rsworkspace/crates/trogon-agent-core/build.rs` (added, +7/-0)
- `rsworkspace/crates/trogon-agent-core/src/agent_loop.rs` (added, +1178/-0)
- `rsworkspace/crates/trogon-agent-core/src/lib.rs` (added, +4/-0)
- `rsworkspace/crates/trogon-agent-core/src/tools/mod.rs` (added, +65/-0)
- `rsworkspace/crates/trogon-agent-core/tests/agent_loop_integration.rs` (added, +1059/-0)
- `rsworkspace/crates/trogon-mcp/Cargo.toml` (added, +17/-0)
- `rsworkspace/crates/trogon-mcp/src/client.rs` (added, +145/-0)
- `rsworkspace/crates/trogon-mcp/src/lib.rs` (added, +19/-0)
- `rsworkspace/crates/trogon-mcp/tests/mcp_client.rs` (added, +322/-0)
- `rsworkspace/crates/trogon-nats/Cargo.toml` (modified, +1/-0)
- `rsworkspace/crates/trogon-nats/src/auth.rs` (modified, +10/-0)
- `rsworkspace/crates/trogon-nats/src/connect.rs` (modified, +218/-28)
- `rsworkspace/crates/trogon-nats/tests/connect_integration.rs` (added, +194/-0)
- `rsworkspace/crates/trogon-nats/tests/messaging_integration.rs` (added, +152/-0)
- `rsworkspace/crates/trogon-std/src/fs/system.rs` (modified, +38/-0)

Code Example

LLM request rejected: messages.31.content.2: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.

---

type=thinking thinking_len=224 signature=(none)
type=thinking thinking_len=562 signature=(none)
...

RAW_BUFFERClick to expand / collapse

Problem

When using Anthropic models with extended thinking enabled, the signature field on thinking content blocks is not preserved when messages are serialized to the session JSONL file. When the conversation history is sent back to the Anthropic API in subsequent turns, the API rejects the request:

LLM request rejected: messages.31.content.2: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.

Root cause

type=thinking thinking_len=224 signature=(none)
type=thinking thinking_len=562 signature=(none)
...

This causes any subsequent API call using those messages to fail, since the thinking blocks are considered "modified".

Impact

Once the session accumulates enough messages with thinking blocks, all further interactions in that session fail permanently
/new and /reset commands may not properly clear the corrupted session
The only workaround is manually deleting the session JSONL file

Steps to reproduce

Configure an agent with anthropic/claude-sonnet-4-6 (or any Anthropic model with thinking)
Have a multi-turn conversation in a Feishu group (or any channel)
After ~15+ assistant turns with thinking, the session starts failing with the above error

Expected behavior

The signature field (and any other fields) on thinking / redacted_thinking content blocks should be preserved exactly as received from the API when serializing to the session file.

Environment

OpenClaw version: 2026.3.8
Model: anthropic/claude-sonnet-4-6
Provider: anthropic
Compaction mode: safeguard

extent analysis

Fix Plan

To preserve the signature field on thinking content blocks, we need to modify the serialization process. Here are the steps:

Modify the serialize_message function to include the signature field:

def serialize_message(message):
    # ...
    if 'thinking' in message:
        thinking_block = message['thinking']
        serialized_message['thinking'] = {
            'thinking_len': thinking_block['thinking_len'],
            'signature': thinking_block['signature']  # Add this line
        }
    # ...
    return serialized_message

Update the write_session_to_jsonl function to use the modified serialize_message function:

def write_session_to_jsonl(session):
    # ...
    for message in session['messages']:
        serialized_message = serialize_message(message)
        # ...
        jsonl_file.write(json.dumps(serialized_message) + '\n')
    # ...

Ensure that the signature field is properly deserialized when reading from the session JSONL file:

def deserialize_message(serialized_message):
    # ...
    if 'thinking' in serialized_message:
        thinking_block = serialized_message['thinking']
        message['thinking'] = {
            'thinking_len': thinking_block['thinking_len'],
            'signature': thinking_block['signature']  # Add this line
        }
    # ...
    return message

Verification

To verify that the fix worked, you can:

Inspect the session JSONL file to ensure that the signature field is present for thinking blocks
Test a multi-turn conversation with thinking blocks and verify that the session does not fail with the "LLM request rejected" error

Extra Tips

Make sure to update the serialize_message and deserialize_message functions to handle both thinking and redacted_thinking blocks
Consider adding additional logging or debugging statements to ensure that the signature field is being properly preserved and passed back to the Anthropic API.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

FAQ

Expected behavior

The signature field (and any other fields) on thinking / redacted_thinking content blocks should be preserved exactly as received from the API when serializing to the session file.

#api #ssr #installation #tensor shape #conversation history #file not found #serialization error #model compatibility

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

openclaw - ✅(Solved) Fix Anthropic thinking block 'signature' field lost during session persistence — causes API rejection [2 pull requests, 1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Fix Action

Fix / Workaround

PR fix notes

PR #44009: fix(agents): exclude synthetic assistant transcript mirrors from Anthropic replay

Description (problem / solution / changelog)

Summary

What changed

Why this fix is safe

Validation

Changed files

PR #55: feat: foundation layer — trogon-nats, trogon-mcp, trogon-agent-core, acp-telemetry

Description (problem / solution / changelog)

What

Crates

trogon-nats

trogon-mcp

trogon-agent-core

trogon-std

acp-telemetry

How to review

Code Example

Problem

Root cause

Impact

Steps to reproduce

Expected behavior

Environment

extent analysis

Fix Plan

Verification

Extra Tips

FAQ

Expected behavior

Still need to ship something?

RELATED_DISCOVERY

TRENDING

`trogon-nats`

`trogon-mcp`

`trogon-agent-core`

`trogon-std`

`acp-telemetry`