openclaw - Feature: Model-aware contextTokens (auto-scale session context limit to the active model's actual window)

StepCodex · 2026-05-30T18:06:58Z

## Summary

OpenClaw's built-in default for `contextTokens` is 1,048,576 (2²⁰ = 1M). This is designed for large-context models (Gemini 1M, future Claude versions), but it silently allows sessions to overflow the actual API context limit for smaller-context models. Compaction never fires, sessions grow past the model's hard limit, and failures occur without any warning.

## Problem

`contextTokens` is a static value. When an agent uses `claude-sonnet-4-6` (200k context), the default of 1049k means compaction never triggers: the session just grows until the Anthropic API rejects the request with a context overflow or, worse, a subtle error such as invalid thinking block signatures (see related issue #88403).

Setting `contextTokens` to a low fixed value globally fixes one model but breaks others. An agent that switches between Claude (200k) and Gemini (1M) would either overflow Claude or artificially cap Gemini.
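To make the failure mode concrete, here is a minimal simulation. It assumes compaction fires once the session reaches the configured `contextTokens` and that the provider rejects a request once the session exceeds the model window; the function name and the trigger rule are illustrative assumptions, not OpenClaw's actual code.

```python
DEFAULT_CONTEXT_TOKENS = 1_048_576  # built-in default cited in this issue (2**20)
CLAUDE_SONNET_WINDOW = 200_000      # window cited in this issue for claude-sonnet-4-6


def first_event(threshold: int, model_window: int, step: int = 10_000) -> str:
    """Grow a simulated session by `step` tokens per turn and report
    whether compaction or a provider-side overflow happens first.

    Assumptions (hypothetical, for illustration only):
    - compaction fires once session tokens reach `threshold`
    - the provider rejects the request once tokens exceed `model_window`
    """
    tokens = 0
    while True:
        tokens += step
        if tokens >= threshold:
            return f"compaction at {tokens}"
        if tokens > model_window:
            return f"overflow at {tokens}"


# Static default: the model window is hit long before the threshold.
print(first_event(DEFAULT_CONTEXT_TOKENS, CLAUDE_SONNET_WINDOW))
# A threshold below the window lets compaction run first.
print(first_event(170_000, CLAUDE_SONNET_WINDOW))
```

With the static default the simulated session overflows the 200k window without compaction ever running, which matches the behavior reported here.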
## Proposed solution

Add an `"auto"` mode for `contextTokens` (or make it the default behavior) that derives the effective limit from the active model's actual context window:

```json5
{
  agents: {
    defaults: {
      contextTokens: "auto",           // new: derives from model catalog
      contextTokensSafetyFactor: 0.85  // optional: headroom margin (default 0.85)
    }
  }
}
```

Behavior:

- `"auto"` resolves to `model.contextTokens * safetyFactor` at session start
- When the model changes mid-session, the effective limit recalculates
- Falls back to the current default (1049k) when the model's context window is unknown

This means:

- Claude Sonnet (200k) → effective limit ~170k → compaction fires safely before overflow
- Gemini 1M → effective limit ~850k → most of the context window stays usable
- Future 10M-context models → automatically scales up without config changes

## Alternatives considered

1. **Lower the global default**: fixes current models but artificially caps future larger-context models and requires manual updates as models evolve. Fragile.
2. **Per-agent `contextTokens`**: requires manually setting per-agent values and updating them when switching models. High maintenance overhead.
3. **`maxActiveTranscriptBytes`**: a byte-based proxy, not token-aware, and it requires `truncateAfterCompaction`. Useful as a secondary safeguard but not a replacement.

## Why this matters

- Agents running tool-heavy sessions (research, data analysis, competitive intelligence) accumulate context fast
- Silent overflow causes hard-to-diagnose failures: context limit errors, thinking signature errors, or truncated responses
- Operators should not need to track per-model context windows and update config every time they switch models or a model is updated
- `"auto"` is the least-surprise default: the platform should not allow sessions to grow beyond what the model can actually handle

## Suggested implementation

The effective context limit is already computed per-request in the provider transport layer. Threading the model's `contextTokens` (from the catalog) through to the session compaction threshold calculation should be straightforward.

A reasonable PR scope:

1. Add `"auto"` as a valid value for `contextTokens` at `agents.defaults` and `agents.list[]`
2. At session start (and on model change), resolve `contextTokens` = `catalog[model].contextTokens * safetyFactor`
3. Update `openclaw doctor` to warn when `contextTokens` exceeds the active model's limit
4. Consider making `"auto"` the new default in a future major version

## Environment

- OpenClaw 2026.5.26 (10ad3aa)
- Discovered via: long Teams session on `claude-sonnet-4-6` that grew to 207k tokens before failing
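The resolution rule and the `openclaw doctor` check from the PR scope could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `CATALOG` dictionary, the `gemini-1m` placeholder id, both function names, and the warning text are hypothetical and do not reflect OpenClaw's real catalog or internals. Only the window sizes, the 0.85 factor, and the 1,048,576 fallback come from this issue.

```python
from typing import Optional, Union

FALLBACK_CONTEXT_TOKENS = 1_048_576  # current built-in default per this issue

# Hypothetical catalog; window sizes are the ones quoted in this issue.
CATALOG = {
    "claude-sonnet-4-6": 200_000,
    "gemini-1m": 1_000_000,  # placeholder id standing in for "Gemini 1M"
}


def resolve_context_tokens(
    setting: Union[int, str],
    model: str,
    safety_factor: float = 0.85,
) -> int:
    """Return the effective compaction limit for `model`."""
    if setting != "auto":
        return int(setting)  # explicit numeric value: used unchanged
    window: Optional[int] = CATALOG.get(model)
    if window is None:
        return FALLBACK_CONTEXT_TOKENS  # unknown model: keep today's default
    return int(window * safety_factor)  # leave headroom below the hard limit


def doctor_warning(setting: Union[int, str], model: str) -> Optional[str]:
    """Illustrative version of the proposed doctor check."""
    window = CATALOG.get(model)
    limit = resolve_context_tokens(setting, model)
    if window is not None and limit > window:
        return f"contextTokens {limit} exceeds {model} window {window}"
    return None


print(resolve_context_tokens("auto", "claude-sonnet-4-6"))  # 170000
print(doctor_warning(1_048_576, "claude-sonnet-4-6"))
```

Handling a mid-session model change then amounts to calling `resolve_context_tokens` again with the new model id, so the limit always tracks the active model rather than the value computed at session start.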


