openclaw - ✅(Solved) Fix [Bug]: Reasoning model thinking blocks (<thinking> tags) in conversation history cause HTTP 400 on GitHub Copilot provider [1 pull requests, 1 comments, 2 participants]

warcold · 2026-05-13T19:59:14Z


GitHub stats

Issue: openclaw/openclaw#81520 (fetched 2026-05-14 03:31:16)
Author: warcold
Participants: clawsweeper[bot], warcold
Timeline (top): cross-referenced ×2, labeled ×2, mentioned ×2, subscribed ×2

When a reasoning model generates <thinking> blocks, they get stored in conversation history and sent back to GitHub Copilot on the next turn, which rejects them with HTTP 400 — breaking every multi-turn conversation.

Error Message

First message works fine every time. It breaks on the second, without exception.

Root Cause

Fix Action

Fixed

Fixed by PR: fix(github-copilot): strip thinking blocks from latest assistant turn (#81520) (https://github.com/openclaw/openclaw/pull/81534)

PR fix notes

PR #81534: fix(github-copilot): strip thinking blocks from latest assistant turn (#81520)

Repository: openclaw/openclaw
Author: SymbolStar
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/81534

Description (problem / solution / changelog)

Summary

GitHub Copilot's Claude proxy returns HTTP 400 on every multi-turn conversation with a reasoning model (claude-sonnet-4.6, claude-opus-4, …) because the assistant message from the prior turn still contains { type: "thinking", … } content blocks when the next request goes out. Unlike the direct Anthropic Messages API, the Copilot transport exposes no signed-thinking replay protocol, so any persisted thinking / redacted_thinking block is rejected — regardless of whether it sits on the latest or a prior assistant turn.

The Copilot replay policy was reusing the shared dropThinkingBlocks helper, which intentionally preserves thinking blocks on the latest assistant turn so direct Anthropic / Bedrock can replay the signed payload. That preservation rule is wrong for Copilot.
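
To make the "preserve latest" rule concrete, here is a minimal sketch of how such a helper could behave. The message and block shapes and the function name `dropThinkingPreserveLatest` are assumptions for illustration only; the actual shared helper in OpenClaw is not shown in this report and may differ.

```typescript
// Hypothetical shapes; the real OpenClaw types are not shown in the PR text.
type ContentBlock = { type: string; [key: string]: unknown };
type Message = { role: "user" | "assistant"; content: ContentBlock[] };

const isThinking = (b: ContentBlock): boolean =>
  b.type === "thinking" || b.type === "redacted_thinking";

// Strips thinking blocks from every assistant turn except the latest one.
function dropThinkingPreserveLatest(messages: Message[]): Message[] {
  let latestAssistant = -1;
  messages.forEach((m, i) => {
    if (m.role === "assistant") latestAssistant = i;
  });
  return messages.map((m, i) =>
    m.role !== "assistant" || i === latestAssistant
      ? m
      : { ...m, content: m.content.filter((b) => !isThinking(b)) },
  );
}

// Turn two of a conversation: the only assistant turn is also the latest one.
const history: Message[] = [
  { role: "user", content: [{ type: "text", text: "hi" }] },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "..." },
      { type: "text", text: "hello" },
    ],
  },
  { role: "user", content: [{ type: "text", text: "follow-up" }] },
];

const replayed = dropThinkingPreserveLatest(history);
// The thinking block survives, which is the payload Copilot rejects.
const survivingThinking = replayed[1].content.some(isThinking);
console.log(survivingThinking);
```

Under these assumptions the second request always carries the turn-one thinking block, which matches the "breaks on the second message" symptom in the report.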

Fixes #81520.

Approach

Per the codex review on the issue, the fix is provider-scoped instead of changing global dropThinkingBlocks semantics:

New ProviderReplayPolicy.dropAllThinkingBlocks flag.
New dropAllThinkingBlocks(messages) sanitizer that strips thinking / redacted_thinking from every assistant turn (latest included). Thinking-only assistant turns become a single neutral-text placeholder so provider adapters that filter blank content arrays still see a valid turn.
Threaded through TranscriptPolicy, the replay-history pipeline (sanitizeSessionHistory), the per-request stream wrapper in run/attempt.ts, and the context-pruning token-estimate input.
buildGithubCopilotReplayPolicy now sets dropAllThinkingBlocks: true for Claude models (instead of dropThinkingBlocks: true). Direct Anthropic / Bedrock / OpenAI paths are untouched and keep their signed-thinking replay semantics.
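
The sanitizer described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: the type shapes, the name `dropAllThinkingBlocksSketch`, and the placeholder text are hypothetical (the PR only says "neutral-text placeholder"), and returning the input array when nothing changes is one possible reading of the "original-reference contract" mentioned in the tests.

```typescript
// Hypothetical shapes; the real OpenClaw types are not shown in the PR text.
type ContentBlock = { type: string; [key: string]: unknown };
type Message = { role: "user" | "assistant"; content: ContentBlock[] };

// Assumed placeholder wording; the PR does not quote the actual text.
const PLACEHOLDER_BLOCK: ContentBlock = { type: "text", text: "[reasoning omitted]" };

function dropAllThinkingBlocksSketch(messages: Message[]): Message[] {
  const out = messages.map((m) => {
    if (m.role !== "assistant") return m;
    const kept = m.content.filter(
      (b) => b.type !== "thinking" && b.type !== "redacted_thinking",
    );
    // Nothing stripped: keep the original message object.
    if (kept.length === m.content.length) return m;
    // Thinking-only turn: substitute a single neutral text block so the
    // turn is still non-empty for adapters that drop blank content arrays.
    return { ...m, content: kept.length > 0 ? kept : [{ ...PLACEHOLDER_BLOCK }] };
  });
  // Hand back the original array when no message was modified.
  const changed = out.some((m, i) => m !== messages[i]);
  return changed ? out : messages;
}
```

Unlike a "preserve latest" helper, this version has no special case for the final assistant turn, so the input is never mutated and the output never contains a thinking block.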

Test changes

New unit tests for dropAllThinkingBlocks in src/agents/pi-embedded-runner/thinking.test.ts: latest-turn stripping, placeholder-on-empty, redacted_thinking coverage, original-reference contract.
Flipped the existing Copilot Claude cases in src/agents/pi-embedded-runner.sanitize-session-history.test.ts so they now assert the bug-free behaviour described in the issue (thinking dropped from the latest assistant turn for github-copilot Claude, including thinking + tool_use + text combinations).
Updated every TranscriptPolicy fixture / mock to include the new dropAllThinkingBlocks field so types/tests stay consistent.

Acceptance (per codex review)

pnpm test src/agents/pi-embedded-runner.sanitize-session-history.test.ts src/agents/pi-embedded-runner/thinking.test.ts extensions/github-copilot/provider-runtime.contract.test.ts extensions/github-copilot/stream.test.ts
pnpm check:changed

(Local pnpm not available in this environment per workspace constraints; tests will run in CI.)

Risk / scope

Behavior change is gated on the new dropAllThinkingBlocks flag and is only set for github-copilot + Claude models.
No change to direct Anthropic / Bedrock / OpenAI replay paths, which still rely on dropThinkingBlocks (preserve latest) or signed-thinking replay.
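
The gating described here can be sketched as a small policy object. Only the two flag names come from the PR description; the type name, both function names, and the substring check for Claude models are assumptions made for illustration.

```typescript
// Only the two flag names come from the PR text; the rest is hypothetical.
type ReplayPolicySketch = {
  dropThinkingBlocks?: boolean; // strip older turns, keep the latest one
  dropAllThinkingBlocks?: boolean; // strip every assistant turn
};

// Assumption: Claude models are recognised by a substring of the model id.
function buildCopilotReplayPolicySketch(modelId: string): ReplayPolicySketch {
  const isClaude = modelId.toLowerCase().includes("claude");
  return isClaude ? { dropAllThinkingBlocks: true } : {};
}

type ThinkingStripMode = "all" | "preserve-latest" | "none";

// The stricter flag wins if both are ever set on the same policy.
function thinkingStripMode(policy: ReplayPolicySketch): ThinkingStripMode {
  if (policy.dropAllThinkingBlocks) return "all";
  if (policy.dropThinkingBlocks) return "preserve-latest";
  return "none";
}

console.log(thinkingStripMode(buildCopilotReplayPolicySketch("claude-sonnet-4.6")));
```

Keeping the decision in a per-provider policy means providers that never set the new flag see no behavior change.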

Changed files

extensions/github-copilot/replay-policy.ts (modified, +7/-1)
src/agents/pi-embedded-runner.anthropic-tool-replay.live.test.ts (modified, +1/-0)
src/agents/pi-embedded-runner.sanitize-session-history.test.ts (modified, +23/-26)
src/agents/pi-embedded-runner/compact.hooks.harness.ts (modified, +1/-0)
src/agents/pi-embedded-runner/extensions.ts (modified, +1/-1)
src/agents/pi-embedded-runner/replay-history.ts (modified, +6/-3)
src/agents/pi-embedded-runner/run.overflow-compaction.test.ts (modified, +1/-0)
src/agents/pi-embedded-runner/run/attempt.test.ts (modified, +15/-0)
src/agents/pi-embedded-runner/run/attempt.tool-call-normalization.ts (modified, +2/-1)
src/agents/pi-embedded-runner/run/attempt.transcript-policy.test.ts (modified, +1/-0)
src/agents/pi-embedded-runner/run/attempt.ts (modified, +11/-5)
src/agents/pi-embedded-runner/thinking.test.ts (modified, +72/-0)
src/agents/pi-embedded-runner/thinking.ts (modified, +16/-0)
src/agents/runtime-plan/types.ts (modified, +1/-0)
src/agents/test-helpers/pi-embedded-runner-e2e-mocks.ts (modified, +1/-0)
src/agents/transcript-policy.test.ts (modified, +1/-0)
src/agents/transcript-policy.ts (modified, +14/-2)
src/plugins/types.ts (modified, +8/-0)


Bug type

Crash (process/app exits or hangs)

Beta release blocker

Summary

Steps to reproduce

1. Configure an agent with claude-sonnet-4.6 or claude-opus-4 via the GitHub Copilot provider, with thinking enabled.
2. Send any message. The model generates a <thinking> block, which gets stored in history.
3. Send a follow-up in the same session.
4. OpenClaw includes the {type: "thinking"} block from step 2 in the history sent to the API.
5. GitHub Copilot returns HTTP 400.

First message works fine every time. It breaks on the second, without exception.

Expected behavior

Thinking blocks are internal model state — they should never appear in the history sent to the API. The context sanitization pipeline should strip them before any request goes out.

Actual behavior

HTTP 400 from the GitHub Copilot API on every second turn of any conversation with a reasoning model. The request payload shows the assistant message from turn 1 with a content array that includes {type: "thinking", thinking: "..."} — Copilot rejects this. Single-turn conversations are completely unaffected.
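
For anyone inspecting a captured request, a small helper like the one below can confirm whether the outgoing history still carries thinking blocks. The payload shape and the name `findThinkingBlocks` are assumptions for illustration; the example history mirrors the turn-two request described above rather than a real capture.

```typescript
// Hypothetical wire shapes for a captured request body.
type ContentBlock = { type: string; [key: string]: unknown };
type WireMessage = { role: string; content: string | ContentBlock[] };

type ThinkingHit = { messageIndex: number; blockType: string };

// Lists every thinking / redacted_thinking block found on assistant messages.
function findThinkingBlocks(messages: WireMessage[]): ThinkingHit[] {
  const hits: ThinkingHit[] = [];
  messages.forEach((message, messageIndex) => {
    // Plain-string content has no typed blocks to inspect.
    if (message.role !== "assistant" || !Array.isArray(message.content)) return;
    for (const block of message.content) {
      if (block.type === "thinking" || block.type === "redacted_thinking") {
        hits.push({ messageIndex, blockType: block.type });
      }
    }
  });
  return hits;
}

// Illustrative turn-two history, shaped like the payload in the report.
const secondTurnRequest: WireMessage[] = [
  { role: "user", content: "first question" },
  {
    role: "assistant",
    content: [
      { type: "thinking", thinking: "..." },
      { type: "text", text: "first answer" },
    ],
  },
  { role: "user", content: "follow-up question" },
];

const offenders = findThinkingBlocks(secondTurnRequest);
console.log(offenders);
```

An empty result on a request that still fails would suggest the 400 has a different cause than the one reported here.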

OpenClaw version

OpenClaw version: v2026.5.7

Operating system

Kali Linux

Install method

npm global (npm install -g openclaw)

Model

claude-sonnet-4.6 (also confirmed with claude-opus-4)

Provider / routing chain

OpenClaw → GitHub Copilot provider → Claude Sonnet 4.6

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Second message in any conversation returns a 400 from the Copilot API endpoint. Payload inspection shows the assistant message from turn 1 containing a content array with a {type: "thinking"} block. Single-turn conversations succeed without any issue.

Impact and severity

Affected: Any deployment using reasoning models (Claude Sonnet/Opus) with the GitHub Copilot provider
Severity: High — multi-turn conversations broken entirely
Frequency: Every 2nd message, 100% reproducible
Consequence: The bot can't hold a conversation past the first exchange

Additional information

The fix is in dist/model-context-tokens*.js, in the sanitizedWithProvider function before the final return. Filter out any assistant message content blocks with type === "thinking" or type === "redacted_thinking":

content = content.filter(b => b.type !== 'thinking' && b.type !== 'redacted_thinking')

One line, applied in the right place — fully resolves the issue in production.
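
For context, the one-line filter can be wrapped as shown below. This is only a sketch: the message shape and the name `stripThinking` are assumptions, and it does not reproduce the sanitizedWithProvider function, whose body is not shown in the report.

```typescript
// Hypothetical shape for a stored assistant message.
type Block = { type: string; [key: string]: unknown };
type AssistantMessage = { role: "assistant"; content: string | Block[] };

function stripThinking(message: AssistantMessage): AssistantMessage {
  // Plain-string content carries no typed blocks, so leave it alone.
  if (!Array.isArray(message.content)) return message;
  // Same predicate as the one-liner in the report.
  const content = message.content.filter(
    (b) => b.type !== "thinking" && b.type !== "redacted_thinking",
  );
  return { ...message, content };
}

// A turn that contained nothing but reasoning ends up with no blocks at all.
const thinkingOnly = stripThinking({
  role: "assistant",
  content: [{ type: "thinking", thinking: "..." }],
});
console.log(thinkingOnly.content);
```

The empty content array in the last case is the situation the PR handles by substituting a neutral-text placeholder, so a bare filter may need a similar guard.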
