openclaw - ✅(Solved) Fix [Bug]: Reasoning model thinking blocks (<thinking> tags) in conversation history cause HTTP 400 on GitHub Copilot provider [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#81520 (fetched 2026-05-14 03:31:16)
Comments: 1 · Participants: 2 · Timeline events: 9 · Reactions: 2
Timeline (top): cross-referenced ×2, labeled ×2, mentioned ×2, subscribed ×2

When a reasoning model generates <thinking> blocks, they get stored in conversation history and sent back to GitHub Copilot on the next turn, which rejects them with HTTP 400 — breaking every multi-turn conversation.

Error Message

HTTP 400 from the GitHub Copilot API on the second turn of a conversation. The first message works every time; the second fails without exception.

Root Cause

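The Copilot replay policy reused the shared dropThinkingBlocks helper, which intentionally preserves thinking blocks on the latest assistant turn so that direct Anthropic / Bedrock can replay the signed payload. The Copilot transport exposes no signed-thinking replay protocol, so it rejects any persisted thinking / redacted_thinking block with HTTP 400, whichever assistant turn it sits on.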

Fix Action

Fixed by PR #81534 (details below).

PR fix notes

PR #81534: fix(github-copilot): strip thinking blocks from latest assistant turn (#81520)

Description (problem / solution / changelog)

Summary

GitHub Copilot's Claude proxy returns HTTP 400 on every multi-turn conversation with a reasoning model (claude-sonnet-4.6, claude-opus-4, …) because the assistant message from the prior turn still contains { type: "thinking", … } content blocks when the next request goes out. Unlike the direct Anthropic Messages API, the Copilot transport exposes no signed-thinking replay protocol, so any persisted thinking / redacted_thinking block is rejected — regardless of whether it sits on the latest or a prior assistant turn.

The Copilot replay policy was reusing the shared dropThinkingBlocks helper, which intentionally preserves thinking blocks on the latest assistant turn so direct Anthropic / Bedrock can replay the signed payload. That preservation rule is wrong for Copilot.

Fixes #81520.

Approach

Per the codex review on the issue, the fix is provider-scoped instead of changing global dropThinkingBlocks semantics:

  • New ProviderReplayPolicy.dropAllThinkingBlocks flag.
  • New dropAllThinkingBlocks(messages) sanitizer that strips thinking / redacted_thinking from every assistant turn (latest included). Thinking-only assistant turns become a single neutral-text placeholder so provider adapters that filter blank content arrays still see a valid turn.
  • Threaded through TranscriptPolicy, the replay-history pipeline (sanitizeSessionHistory), the per-request stream wrapper in run/attempt.ts, and the context-pruning token-estimate input.
  • buildGithubCopilotReplayPolicy now sets dropAllThinkingBlocks: true for Claude models (instead of dropThinkingBlocks: true). Direct Anthropic / Bedrock / OpenAI paths are untouched and keep their signed-thinking replay semantics.
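
The sanitizer described in the list above could look roughly like the following. This is a minimal sketch, not the code from the PR: the `Message` and `ContentBlock` shapes are simplified assumptions, and the placeholder text is invented for illustration (the PR only says "a single neutral-text placeholder").

```typescript
// Hypothetical, simplified shapes; the real OpenClaw message types are richer.
type ContentBlock = { type: string; [key: string]: unknown };
type Message = { role: "user" | "assistant"; content: ContentBlock[] };

// Assumed wording; the actual placeholder text is not given in the source.
const PLACEHOLDER_TEXT = "(reasoning omitted)";

const isThinking = (block: ContentBlock): boolean =>
  block.type === "thinking" || block.type === "redacted_thinking";

function dropAllThinkingBlocks(messages: Message[]): Message[] {
  let changed = false;
  const result = messages.map((message): Message => {
    if (message.role !== "assistant" || !message.content.some(isThinking)) {
      return message; // untouched turns keep their original reference
    }
    changed = true;
    const kept = message.content.filter((block) => !isThinking(block));
    // A thinking-only turn would otherwise have an empty content array.
    const content: ContentBlock[] =
      kept.length > 0 ? kept : [{ type: "text", text: PLACEHOLDER_TEXT }];
    return { ...message, content };
  });
  // Hand back the original array when nothing was stripped.
  return changed ? result : messages;
}
```

Returning the original array when no block was removed is one way to read the "original-reference contract" mentioned in the test notes; it lets callers detect a no-op with a plain identity check.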

Test changes

  • New unit tests for dropAllThinkingBlocks in src/agents/pi-embedded-runner/thinking.test.ts: latest-turn stripping, placeholder-on-empty, redacted_thinking coverage, original-reference contract.
  • Flipped the existing Copilot Claude cases in src/agents/pi-embedded-runner.sanitize-session-history.test.ts so they now assert the bug-free behaviour described in the issue (thinking dropped from the latest assistant turn for github-copilot Claude, including thinking + tool_use + text combinations).
  • Updated every TranscriptPolicy fixture / mock to include the new dropAllThinkingBlocks field so types/tests stay consistent.

Acceptance (per codex review)

  • pnpm test src/agents/pi-embedded-runner.sanitize-session-history.test.ts src/agents/pi-embedded-runner/thinking.test.ts extensions/github-copilot/provider-runtime.contract.test.ts extensions/github-copilot/stream.test.ts
  • pnpm check:changed

(Local pnpm not available in this environment per workspace constraints; tests will run in CI.)

Risk / scope

  • Behavior change is gated on the new dropAllThinkingBlocks flag and is only set for github-copilot + Claude models.
  • No change to direct Anthropic / Bedrock / OpenAI replay paths, which still rely on dropThinkingBlocks (preserve latest) or signed-thinking replay.
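
The gating described above can be pictured as follows. This is a sketch under stated assumptions: the policy interface is reduced to the two flags named in the PR, and `exampleCopilotReplayPolicy` and `isClaudeModel` (including the prefix check) are hypothetical stand-ins, not the real buildGithubCopilotReplayPolicy.

```typescript
// Reduced, hypothetical policy shape with only the two flags named in the PR.
interface ProviderReplayPolicy {
  dropThinkingBlocks?: boolean; // strip earlier turns, preserve the latest
  dropAllThinkingBlocks?: boolean; // strip every assistant turn
}

// Assumption: Claude models are recognised by a "claude" prefix in the model id.
function isClaudeModel(modelId: string): boolean {
  return modelId.toLowerCase().startsWith("claude");
}

// Illustrative stand-in for the Copilot policy builder.
function exampleCopilotReplayPolicy(modelId: string): ProviderReplayPolicy {
  // Only Copilot + Claude opts in; every other model keeps the default policy.
  return isClaudeModel(modelId) ? { dropAllThinkingBlocks: true } : {};
}
```

Because the flag is set only inside the Copilot policy, providers that rely on signed-thinking replay never see it and keep their existing behaviour.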

Changed files

  • extensions/github-copilot/replay-policy.ts (modified, +7/-1)
  • src/agents/pi-embedded-runner.anthropic-tool-replay.live.test.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner.sanitize-session-history.test.ts (modified, +23/-26)
  • src/agents/pi-embedded-runner/compact.hooks.harness.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner/extensions.ts (modified, +1/-1)
  • src/agents/pi-embedded-runner/replay-history.ts (modified, +6/-3)
  • src/agents/pi-embedded-runner/run.overflow-compaction.test.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner/run/attempt.test.ts (modified, +15/-0)
  • src/agents/pi-embedded-runner/run/attempt.tool-call-normalization.ts (modified, +2/-1)
  • src/agents/pi-embedded-runner/run/attempt.transcript-policy.test.ts (modified, +1/-0)
  • src/agents/pi-embedded-runner/run/attempt.ts (modified, +11/-5)
  • src/agents/pi-embedded-runner/thinking.test.ts (modified, +72/-0)
  • src/agents/pi-embedded-runner/thinking.ts (modified, +16/-0)
  • src/agents/runtime-plan/types.ts (modified, +1/-0)
  • src/agents/test-helpers/pi-embedded-runner-e2e-mocks.ts (modified, +1/-0)
  • src/agents/transcript-policy.test.ts (modified, +1/-0)
  • src/agents/transcript-policy.ts (modified, +14/-2)
  • src/plugins/types.ts (modified, +8/-0)

Original issue report

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

When a reasoning model generates <thinking> blocks, they get stored in conversation history and sent back to GitHub Copilot on the next turn, which rejects them with HTTP 400 — breaking every multi-turn conversation.

Steps to reproduce

  1. Configure an agent with claude-sonnet-4.6 or claude-opus-4 via the GitHub Copilot provider, with thinking enabled.
  2. Send any message — the model generates a <thinking> block, which gets stored in history.
  3. Send a follow-up in the same session.
  4. OpenClaw includes the {type: "thinking"} block from step 2 in the history sent to the API.
  5. GitHub Copilot returns HTTP 400.

First message works fine every time. It breaks on the second, without exception.
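
To make the failing request concrete, here is a rough sketch of what the history might look like when the follow-up is sent, together with a small checker. The message shapes, the sample text, and `findThinkingTurns` are illustrative assumptions, not OpenClaw or Copilot types.

```typescript
// Hypothetical, simplified message shapes.
type ContentBlock = { type: string; [key: string]: unknown };
type Message = { role: "user" | "assistant"; content: ContentBlock[] };

// History as it might look when the second request is built (sample values).
const secondTurnHistory: Message[] = [
  { role: "user", content: [{ type: "text", text: "First question" }] },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "..." }, // the block Copilot rejects
      { type: "text", text: "First answer" },
    ],
  },
  { role: "user", content: [{ type: "text", text: "Follow-up question" }] },
];

// Returns the indexes of assistant messages that still carry thinking blocks.
function findThinkingTurns(messages: Message[]): number[] {
  const indexes: number[] = [];
  messages.forEach((message, index) => {
    const hasThinking = message.content.some(
      (block) => block.type === "thinking" || block.type === "redacted_thinking",
    );
    if (message.role === "assistant" && hasThinking) indexes.push(index);
  });
  return indexes;
}
```

A check like this, run on the outgoing payload, would flag the turn-1 assistant message before the request reaches the provider.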

Expected behavior

Thinking blocks are internal model state — they should never appear in the history sent to the API. The context sanitization pipeline should strip them before any request goes out.

Actual behavior

HTTP 400 from the GitHub Copilot API on every second turn of any conversation with a reasoning model. The request payload shows the assistant message from turn 1 with a content array that includes {type: "thinking", thinking: "..."} — Copilot rejects this. Single-turn conversations are completely unaffected.

OpenClaw version

OpenClaw version: v2026.5.7

Operating system

Kali Linux

Install method

npm global (npm install -g openclaw)

Model

claude-sonnet-4.6 (also confirmed with claude-opus-4)

Provider / routing chain

OpenClaw → GitHub Copilot provider → Claude Sonnet 4.6

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Second message in any conversation returns a 400 from the Copilot API endpoint. Payload inspection shows the assistant message from turn 1 containing a content array with a {type: "thinking"} block. Single-turn conversations succeed without any issue.

Impact and severity

  • Affected: Any deployment using reasoning models (Claude Sonnet/Opus) with the GitHub Copilot provider
  • Severity: High — multi-turn conversations broken entirely
  • Frequency: Every 2nd message, 100% reproducible
  • Consequence: The bot can't hold a conversation past the first exchange

Additional information

The fix is in dist/model-context-tokens*.js, in the sanitizedWithProvider function before the final return. Filter out any assistant message content blocks with type === "thinking" or type === "redacted_thinking":

content = content.filter(b => b.type !== 'thinking' && b.type !== 'redacted_thinking')

One line, applied in the right place — fully resolves the issue in production.
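
For reference, the one-line filter can be wrapped in a small typed helper. This is a sketch only: `stripThinking` and the `ContentBlock` shape are hypothetical names, and it does not reproduce the surrounding sanitizedWithProvider function.

```typescript
// Hypothetical, simplified content-block shape.
type ContentBlock = { type: string; [key: string]: unknown };

// Same predicate as the one-liner above, applied to one message's content array.
function stripThinking(content: ContentBlock[]): ContentBlock[] {
  return content.filter(
    (b) => b.type !== "thinking" && b.type !== "redacted_thinking",
  );
}
```

Note that a turn containing only thinking blocks comes back as an empty array with this filter. The PR notes above cover that case by substituting a neutral-text placeholder, so adapters that filter blank content arrays still see a valid turn.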
