openclaw - 💡(How to fix) Fix [Bug]: max_tokens not subtracting used input tokens, causing API "too large" errors [1 pull requests]

openclaw · 2026-05-17 11:44:01

When OpenClaw calls a model such as StepFun step-router-v1 (contextWindow=262144), it sets max_tokens to the full context window size (262144) without subtracting the input tokens already used. The API rejects the request with a "max_tokens is too large" error, and the agent fails repeatedly.

Error Message

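'max_tokens' or 'max_completion_tokens' is too large: 262144. This model's maximum context length is 262144 tokens and your request has 43364 input tokens (262144 > 262144 - 43364)

This error fires on every LLM call once the accumulated input tokens pass a certain threshold, leaving the agent unable to produce any output.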

Root Cause

OpenClaw sends max_tokens equal to the model's full context window instead of the output budget that remains after the input tokens are counted.

Fix Action

Fixed

Fixed by PR: fix(agents): reserve 25% context window for input tokens in maxTokens calculation (https://github.com/openclaw/openclaw/pull/83376)
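
Going only by the PR title, the fix appears to hold back a quarter of the context window for input when computing maxTokens. The sketch below illustrates that idea; it is not taken from the PR diff, and the names `INPUT_RESERVE_FRACTION` and `cappedMaxTokens` are hypothetical.

```typescript
// Assumption: share of the context window held back for input, taken from the PR title.
const INPUT_RESERVE_FRACTION = 0.25;

// Hypothetical helper: cap the output request so part of the window stays free for input.
function cappedMaxTokens(contextWindow: number, configuredMaxTokens: number): number {
  const outputCeiling = Math.floor(contextWindow * (1 - INPUT_RESERVE_FRACTION));
  return Math.min(configuredMaxTokens, outputCeiling);
}

// Limits from the step-router-v1 config in this report.
console.log(cappedMaxTokens(262144, 384000)); // 196608
```

With the reported 43364 input tokens, a request for 196608 output tokens fits inside the 262144-token window. A fixed reservation like this would presumably still overflow once the input grows past the reserved 65536 tokens, which is where the subtraction described under "Expected behavior" would differ.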

Code Example

{
  "id": "step-router-v1",
  "name": "step-router-v1",
  "reasoning": true,
  "input": [
    "text"
  ],
  "cost": {
    "input": 0,
    "output": 0,
    "cacheRead": 0,
    "cacheWrite": 0
  },
  "contextWindow": 262144,
  "maxTokens": 384000
}

Bug type

Regression (worked before, now fails)

Beta release blocker

Summary

Steps to reproduce

1. Use a model with a 262144-token context window (e.g. stepfun-plan/step-router-v1) as the agent's primary model.
2. Accumulate a conversation long enough for input tokens to exceed ~40000.
3. Observe the agent failing on every LLM call with "'max_tokens' or 'max_completion_tokens' is too large".

Expected behavior

When OpenClaw makes an LLM API call, it should calculate max_tokens (or max_completion_tokens) as the remaining available output budget, i.e.: max_tokens = model.contextWindow - estimatedInputTokens
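
The calculation described above can be sketched as follows. This is a minimal illustration, not OpenClaw's code: the `ModelLimits` shape and the `remainingOutputBudget` helper are hypothetical, and clamping against the configured `maxTokens` and against zero is an added assumption. Only the field names and the subtraction come from the report.

```typescript
// Hypothetical shape mirroring the two limit fields of the model config in this report.
interface ModelLimits {
  contextWindow: number; // total tokens (input plus output) the model accepts
  maxTokens: number;     // configured output cap
}

// Hypothetical helper: output budget left once the input tokens are accounted for.
function remainingOutputBudget(model: ModelLimits, estimatedInputTokens: number): number {
  const remaining = model.contextWindow - estimatedInputTokens;
  // Assumption: never exceed the configured cap, and never go negative.
  return Math.max(0, Math.min(model.maxTokens, remaining));
}

// Values from the report: step-router-v1 with 43364 input tokens.
const stepRouter: ModelLimits = { contextWindow: 262144, maxTokens: 384000 };
console.log(remainingOutputBudget(stepRouter, 43364)); // 218780
```

Because the input size is only an estimate before the request is sent, a real implementation would probably subtract a small safety margin as well; that margin is left out here.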

Actual behavior

When OpenClaw makes an LLM API call, it sets max_tokens to the model's full contextWindow value regardless of how many input tokens are already in use. For step-router-v1 (context=262144) with ~43K input tokens already consumed, it sends max_tokens=262144 in the API request, causing the API to reject the call with:

'max_tokens' or 'max_completion_tokens' is too large: 262144. This model's maximum context length is 262144 tokens and your request has 43364 input tokens (262144 > 262144 - 43364)

This error fires on every single LLM call once input tokens accumulate beyond a certain threshold. The agent is completely unable to produce any output.
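
The comparison in parentheses at the end of the error message suggests the provider rejects any request whose max_tokens exceeds the context length minus the input tokens. The sketch below reproduces that check with the reported numbers. It is inferred from the error text alone, not from StepFun documentation, and `isMaxTokensTooLarge` is a hypothetical name.

```typescript
// Hypothetical reproduction of the provider-side check implied by the error text:
// reject when the requested output exceeds what is left of the context window.
function isMaxTokensTooLarge(
  contextLength: number,
  inputTokens: number,
  maxTokens: number,
): boolean {
  return maxTokens > contextLength - inputTokens;
}

// Values from the reported error: 262144 - 43364 leaves 218780 tokens for output.
console.log(isMaxTokensTooLarge(262144, 43364, 262144)); // true
console.log(isMaxTokensTooLarge(262144, 43364, 218780)); // false
```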

OpenClaw version

2026.5.12

Operating system

CentOS 8

Install method

No response

Model

StepFun/step-router-v1

Provider / routing chain

openclaw → StepFun step-router-v1

Additional provider/model setup details

No response

Logs, screenshots, and evidence

{
  "id": "step-router-v1",
  "name": "step-router-v1",
  "reasoning": true,
  "input": [
    "text"
  ],
  "cost": {
    "input": 0,
    "output": 0,
    "cacheRead": 0,
    "cacheWrite": 0
  },
  "contextWindow": 262144,
  "maxTokens": 384000
}

Impact and severity

No response

Additional information

This bug was observed alongside a separate issue with agents.defaults.model.fallbacks config merging. The two are independent, but both live in the model config handling logic.
