openclaw - 💡(How to fix) Fix [Bug]: Bodyless Venice/Anthropic 400 misclassified as context-overflow → Pi triggers compaction recovery loop on sessions with nothing to compact [1 comments, 2 participants]

openclaw · 2026-04-29 09:02:42

GitHub stats

openclaw/openclaw#74239•Fetched 2026-04-30 06:26:55

Author

axonrelaybot

Participants

axonrelaybot

clawsweeper[bot]

Timeline (top)

cross-referenced ×3, closed ×1, commented ×1

Error Message

When Venice returns a bodyless 400 on a turn (logged as "400 status code (no body)"), Pi's embedded agent runtime misclassifies it as a context-overflow event and triggers compaction recovery. If the session has no real conversation messages to summarize (e.g. a near-empty or freshly repaired session), the compaction safeguard writes an empty compaction boundary to "suppress re-trigger loop", but the boundary itself causes the next turn to fail the same way, creating a self-reinforcing loop.

This is distinct from #73963 (openai-codex metadata injection → 400) and #73879 (empty prompt → 400). The root error here is a Venice/provider-side rejection (possibly triggered by a poisoned transcript surviving auto-repair), but the compaction misclassification turns a single-turn failure into a session-destroying loop.

The error as logged:

09:04:50 [agent/embedded] embedded run agent end: runId=cc246c2e isError=true model=claude-sonnet-4-6 provider=venice error=400 status code (no body) rawError=400 status code (no body)

Root Cause

Pi's overflow detection in the embedded runtime treats any 400 API error as a potential context-overflow trigger. The overflow signatures list covers natural-language bodies such as request_too_large and context length exceeded, but a bare bodyless 400 falls through the string match and is still treated as "maybe overflow" rather than "hard failure".

A bodyless 400 cannot be a context overflow — real Anthropic/Venice overflow errors always include a body with a machine-readable code. A bodyless 400 is either:

A malformed request (wrong headers, corrupt transcript segment, unsupported field)
A transient provider-side error

Fix Action

Proposed fix: In Pi's overflow recovery path, require the error response to contain a recognized overflow signature string. A bare 400 status code (no body) should be surfaced as a hard turn failure (propagated to the user), not trigger compaction recovery. Optionally add a check: if compaction fires and finds no real conversation messages, abort rather than writing an empty boundary.
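
The classification rule in the proposed fix could look roughly like the TypeScript sketch below. It is an illustration under stated assumptions, not Pi's actual code: the ProviderError shape, the classifyProviderError name and the signature list are all hypothetical, and the list is seeded only with the two strings named in this issue.

```typescript
// Hypothetical error shape; Pi's real error type is not shown in the issue.
interface ProviderError {
  status: number; // carried for logging; classification below uses the body only
  body?: string;  // undefined or empty when the provider sent no body
}

type ErrorClass = "context-overflow" | "hard-failure";

// Illustrative list only: these two entries are the ones named in the issue.
const OVERFLOW_SIGNATURES: readonly string[] = [
  "request_too_large",
  "context length exceeded",
];

// Treat an error as overflow only when its body carries a known signature.
function classifyProviderError(err: ProviderError): ErrorClass {
  const body = (err.body ?? "").trim().toLowerCase();
  if (body.length === 0) {
    // Bodyless error: nothing identifies it as overflow, so fail the turn.
    return "hard-failure";
  }
  const matched = OVERFLOW_SIGNATURES.some((sig) => body.includes(sig));
  return matched ? "context-overflow" : "hard-failure";
}
```

With this rule a bodyless 400 never reaches compaction recovery; only a response whose body contains a listed signature does.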

Workaround

Manually rename the affected .jsonl to .jsonl.poisoned. OpenClaw will start a fresh session on the next message.
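
If the rename needs to be scripted, a small helper along these lines may be enough. This is a sketch: quarantineSession and RenameFn are hypothetical names, and the rename function is injected so the helper makes no assumption about where OpenClaw stores session files (the issue does not say).

```typescript
// Signature compatible with Node's fs.renameSync(oldPath, newPath).
type RenameFn = (from: string, to: string) => void;

// Rename a session transcript so it no longer ends in .jsonl.
// Returns the new path; the ".poisoned" suffix follows the workaround above.
function quarantineSession(sessionPath: string, rename: RenameFn): string {
  if (!sessionPath.endsWith(".jsonl")) {
    throw new Error(`not a .jsonl session file: ${sessionPath}`);
  }
  const target = `${sessionPath}.poisoned`;
  rename(sessionPath, target);
  return target;
}
```

In Node, renameSync from node:fs can be passed as the second argument.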

Environment

OpenClaw: 2026.4.25 (aa36ee6)
Model: venice/claude-sonnet-4-6 (Anthropic Claude Sonnet 4.6 via Venice proxy)
Context window: 200,000 tokens (as reported by Venice discovery)
Platform: macOS Darwin 25.4.0 arm64, Node v22.22.2
Compaction config: mode: safeguard, reserveTokens: 8000, keepRecentTokens: 20000, model: openai/gpt-4o-mini

Steps to Reproduce

1. Session transcript accumulates empty-content assistant messages (e.g. from a prior failed compaction; see the MEMORY.md note about poisoned sessions).
2. OpenClaw auto-repair fires on load (session file repaired: rewrote N assistant message(s)).
3. Venice still returns 400 status code (no body) on the next turn (the transcript still contains something Venice rejects, or Venice has a transient error).
4. Pi cannot distinguish 400 = context overflow from 400 = malformed request / transient error; both surface as 400 status code (no body).
5. Pi triggers compaction recovery.
6. Compaction safeguard finds no real conversation messages to summarize and writes an empty compaction boundary.
7. Next turn: the empty boundary is included in context → Venice returns 400 again → repeat.

Log Evidence

From gateway.err.log (session b87b3b97):

09:04:49 [agent/embedded] session file repaired: rewrote 2 assistant message(s) (b87b3b97-b592-40fa-84ac-6ee8d258216a.jsonl)
09:04:50 [agent/embedded] embedded run agent end: runId=cc246c2e isError=true model=claude-sonnet-4-6 provider=venice error=400 status code (no body) rawError=400 status code (no body)
09:04:50 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.

From gateway.log (6 identical entries over 44 minutes, all new sessions):

09:04:50 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.
09:28:21 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.
09:30:24 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.
09:44:29 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.
09:47:02 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.
09:48:39 [compaction-safeguard] Compaction safeguard: no real conversation messages to summarize; writing compaction boundary to suppress re-trigger loop.
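
To check another log for the same pattern, the repeated safeguard entries can be counted mechanically. The sketch below assumes the "HH:MM:SS [tag] message" layout of the lines quoted above and that all entries fall on the same day; summarizeSafeguardLoop and LoopSummary are hypothetical names.

```typescript
interface LoopSummary {
  count: number;       // matching safeguard log lines
  spanSeconds: number; // time between first and last match (same day assumed)
}

// Matches the "HH:MM:SS [tag] message" layout of the quoted log lines.
const LINE_PATTERN = /^(\d{2}):(\d{2}):(\d{2}) \[([^\]]+)\] (.*)$/;

function summarizeSafeguardLoop(lines: readonly string[]): LoopSummary {
  const seconds: number[] = [];
  for (const line of lines) {
    const m = LINE_PATTERN.exec(line);
    // Skip lines that do not parse or come from another component.
    if (m === null || m[4] !== "compaction-safeguard") continue;
    // Keep only the "nothing to summarize" safeguard message.
    if (!(m[5] ?? "").includes("no real conversation messages")) continue;
    seconds.push(Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]));
  }
  if (seconds.length === 0) return { count: 0, spanSeconds: 0 };
  return {
    count: seconds.length,
    spanSeconds: Math.max(...seconds) - Math.min(...seconds),
  };
}
```

Applied to the six gateway.log lines above, this gives a count of 6 and a span of 2629 seconds, which is the roughly 44 minutes stated in the report.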

Impact

Sessions with auto-repaired transcripts enter a compaction loop that makes them unusable
Each loop iteration fires a gpt-4o-mini compaction attempt (token cost)
The loop persists across gateway restarts until the session file is manually quarantined
Affects any session where Venice returns a bodyless 400 — including transient network hiccups

Related

#73963: different root cause (openai-codex metadata injection), same surface error string
#73829: poisoned session files from failed compaction (the upstream cause that feeds this loop)

extent analysis

TL;DR

Modify Pi's overflow recovery path to require a recognized overflow signature string in the error response, treating bodyless 400 errors as hard turn failures.

Guidance

Update the overflow detection logic to distinguish between context overflow errors and other types of 400 errors.
Add a check to prevent compaction recovery from writing an empty boundary when no real conversation messages are found.
Consider adding logging to track instances of bodyless 400 errors to better understand their causes.
Review the list of recognized overflow signature strings to ensure it covers all possible context overflow error messages from Anthropic/Venice.

Example

The issue does not include specific code references. The proposed fix involves modifying the error handling logic in Pi's embedded runtime.
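
The second guidance item (abort instead of writing an empty boundary) might be sketched as follows. Everything here is an assumption for illustration: the TranscriptMessage shape, the planCompaction name, and the definition of a "real" message as a non-system message with non-empty text are not taken from Pi's source.

```typescript
// Hypothetical message shape; only the fields this sketch needs.
interface TranscriptMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

type CompactionDecision =
  | { action: "summarize"; messages: TranscriptMessage[] }
  | { action: "abort"; reason: string };

// Assumption: a "real" message is a user or assistant message with non-empty text.
function isRealMessage(msg: TranscriptMessage): boolean {
  return msg.role !== "system" && msg.content.trim().length > 0;
}

function planCompaction(transcript: readonly TranscriptMessage[]): CompactionDecision {
  const real = transcript.filter(isRealMessage);
  if (real.length === 0) {
    // Nothing to summarize: abort rather than write an empty boundary.
    return { action: "abort", reason: "no real conversation messages to summarize" };
  }
  return { action: "summarize", messages: real };
}
```

Returning an explicit abort decision lets the caller surface the original provider error to the user instead of persisting a boundary that the next turn will trip over.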

Notes

The solution assumes that the issue is solely due to the misclassification of bodyless 400 errors as context overflow events. If other factors contribute to the problem, additional debugging may be necessary.

Recommendation

Apply the proposed fix to modify Pi's overflow recovery path, as it directly addresses the root cause of the issue and prevents the compaction loop.
