openclaw - 💡(How to fix) Fix [Bug]: max_tokens not subtracting used input tokens, causing API "too large" errors [1 pull requests]


When OpenClaw calls a model with a moderate context window (e.g. StepFun step-router-v1 with context=262144), it sets max_tokens to the full context window size (262144) without subtracting already-used input tokens. This causes the API to return max_tokens is too large errors, and the agent fails repeatedly.

Error Message

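'max_tokens' or 'max_completion_tokens' is too large: 262144. This model's maximum context length is 262144 tokens and your request has 43364 input tokens (262144 > 262144 - 43364)

This error fires on every LLM call once input tokens accumulate beyond a certain threshold, leaving the agent unable to produce any output.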

Root Cause

OpenClaw sets max_tokens to the model's full context window (262144 for StepFun step-router-v1) instead of the output budget that remains once the input tokens are counted. The error text shows that the API accepts max_tokens only up to the context length minus the input tokens, so the request is rejected with a "max_tokens is too large" error and the agent fails repeatedly.

Fix Action

Fixed

Code Example

{
  "id": "step-router-v1",
  "name": "step-router-v1",
  "reasoning": true,
  "input": [
    "text"
  ],
  "cost": {
    "input": 0,
    "output": 0,
    "cacheRead": 0,
    "cacheWrite": 0
  },
  "contextWindow": 262144,
  "maxTokens": 384000
}

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

When OpenClaw calls a model with a moderate context window (e.g. StepFun step-router-v1 with context=262144), it sets max_tokens to the full context window size (262144) without subtracting already-used input tokens. This causes the API to return max_tokens is too large errors, and the agent fails repeatedly.

Steps to reproduce

1. Use a model with a 262144-token context window (e.g. stepfun-plan/step-router-v1) as the agent's primary model.
2. Accumulate a conversation long enough for input tokens to exceed ~40000.
3. Observe the agent failing on every LLM call with "'max_tokens' or 'max_completion_tokens' is too large".

Expected behavior

When OpenClaw makes an LLM API call, it should calculate max_tokens (or max_completion_tokens) as the remaining available output budget, i.e.: max_tokens = model.contextWindow - estimatedInputTokens
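
The expected calculation could look roughly like the sketch below. This is not OpenClaw's actual code: the ModelLimits interface, the remainingOutputBudget function, and the safetyMargin parameter are hypothetical names introduced for illustration. Only contextWindow and maxTokens mirror fields from the model config in this report. The sketch also caps the result at the configured maxTokens and floors it at zero, which goes slightly beyond the formula above.

```typescript
// Hypothetical shape: only contextWindow and maxTokens come from the
// model config shown in this report.
interface ModelLimits {
  contextWindow: number;
  maxTokens: number;
}

// Returns the largest max_tokens value that still fits in the context window.
// safetyMargin is an assumed buffer for errors in the input token estimate.
function remainingOutputBudget(
  model: ModelLimits,
  estimatedInputTokens: number,
  safetyMargin: number = 0,
): number {
  const remaining = model.contextWindow - estimatedInputTokens - safetyMargin;
  // Never exceed the configured output cap, and never go below zero.
  return Math.max(0, Math.min(model.maxTokens, remaining));
}

// Values taken from the report: contextWindow=262144, 43364 input tokens.
const stepRouter: ModelLimits = { contextWindow: 262144, maxTokens: 384000 };
console.log(remainingOutputBudget(stepRouter, 43364)); // 218780
```

With the numbers from this report, the budget comes out as 218780 rather than the 262144 that was actually sent.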

Actual behavior

When OpenClaw makes an LLM API call, it sets max_tokens to the model's full contextWindow value regardless of how many input tokens are already in use. For step-router-v1 (context=262144) with ~43K input tokens already consumed, it sends max_tokens=262144 in the API request, causing the API to reject the call with:

'max_tokens' or 'max_completion_tokens' is too large: 262144. This model's maximum context length is 262144 tokens and your request has 43364 input tokens (262144 > 262144 - 43364)

This error fires on every single LLM call once input tokens accumulate beyond a certain threshold. The agent is completely unable to produce any output.
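
The parenthesised inequality in the error message suggests the condition the API checks. The sketch below restates that condition so the reported numbers can be verified. It is inferred from the error text only, not from StepFun documentation, and exceedsContext is a hypothetical helper name.

```typescript
// Mirrors the inequality printed in the error message:
// the request is rejected when maxTokens > contextLength - inputTokens.
function exceedsContext(
  maxTokens: number,
  contextLength: number,
  inputTokens: number,
): boolean {
  return maxTokens > contextLength - inputTokens;
}

// The failing request from the report.
console.log(exceedsContext(262144, 262144, 43364)); // true
// The same request with max_tokens reduced to the remaining budget.
console.log(exceedsContext(262144 - 43364, 262144, 43364)); // false
```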

OpenClaw version

2026.5.12

Operating system

CentOS 8

Install method

No response

Model

StepFun/step-router-v1

Provider / routing chain

openclaw → StepFun step-router-v1

Additional provider/model setup details

No response

Logs, screenshots, and evidence

{
  "id": "step-router-v1",
  "name": "step-router-v1",
  "reasoning": true,
  "input": [
    "text"
  ],
  "cost": {
    "input": 0,
    "output": 0,
    "cacheRead": 0,
    "cacheWrite": 0
  },
  "contextWindow": 262144,
  "maxTokens": 384000
}

Impact and severity

No response

Additional information

This bug was observed alongside a separate issue with agents.defaults.model.fallbacks config merging. The two are independent, but both live in the model config handling logic.
