openclaw: [Bug] CLI inference wrapper sends legacy thinkingBudget instead of thinkingLevel for gemini-flash-latest

openclaw, 2026-05-20 20:04:04

When running openclaw infer model run with --thinking adaptive, OpenClaw incorrectly sends the legacy thinkingBudget: -1 parameter to the Google Gemini API for gemini-flash-latest instead of the required thinkingLevel parameter.

Root Cause

Based on source code observation in dist/provider-stream-D4qSxrOO.js, the generic wrapper's supportsAdaptiveThinking(modelId) function returns true only for Claude 3.7/3.6 models, so every other model, including gemini-flash-latest, falls back to a token budget of -1. In addition, resolveGoogleGemini3ThinkingLevel in the Google provider adapter (dist/provider-stream-shared-CPv67a5n.js) rewrites this legacy budget into a thinkingLevel only for Pro models, so for Flash models the thinkingBudget passes through unchanged.

Fix / Workaround

Temporary workaround: explicitly define the model in openclaw.json with a fixed thinkingLevel: "high".
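The rewrite step that appears to be missing for Flash models can be illustrated with a minimal sketch. Everything below is hypothetical: the names usesThinkingLevel and rewriteLegacyBudget, the model-ID patterns, and the default level "HIGH" (borrowed from the working Pro payload in this report) are assumptions for illustration, not the OpenClaw implementation.

```typescript
// Hypothetical sketch; none of these names come from the OpenClaw source.
type ThinkingConfig = {
  includeThoughts?: boolean;
  thinkingBudget?: number;
  thinkingLevel?: string;
};

// Assumption: models that take thinkingLevel can be recognised by ID.
// The patterns cover only the model IDs named in this report.
const LEVEL_BASED_MODEL_PATTERNS: RegExp[] = [
  /^gemini-3(\.\d+)?-pro/, // e.g. gemini-3.1-pro-preview
  /^gemini-3(\.\d+)?-flash/, // e.g. gemini-3.5-flash
  /^gemini-flash-latest$/, // alias described in this report as a Gemini 3.5 model
];

function usesThinkingLevel(modelId: string): boolean {
  // Strip the provider prefix used on the command line (google/...).
  const bare = modelId.replace(/^google\//, "");
  return LEVEL_BASED_MODEL_PATTERNS.some((pattern) => pattern.test(bare));
}

// Drop a legacy thinkingBudget and set thinkingLevel for level-based models.
// Assumption: "HIGH" is the level to use, mirroring the Pro payload.
function rewriteLegacyBudget(
  modelId: string,
  config: ThinkingConfig,
  level: string = "HIGH",
): ThinkingConfig {
  if (!usesThinkingLevel(modelId) || config.thinkingBudget === undefined) {
    return config; // nothing to rewrite
  }
  const next: ThinkingConfig = {
    ...config,
    thinkingLevel: config.thinkingLevel ?? level,
  };
  delete next.thinkingBudget;
  return next;
}

console.log(
  rewriteLegacyBudget("google/gemini-flash-latest", {
    includeThoughts: true,
    thinkingBudget: -1,
  }),
);
```

Under these assumptions, the Flash alias would receive the same thinkingConfig shape as the Pro model, while model IDs outside the pattern list keep their budget untouched.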

Code Example

Output from zcat on the captured gemini-flash-latest payload:


{
  "contents": [{"parts":[{"text":"test"}],"role":"user"}],
  "generationConfig": {
    "maxOutputTokens": 65536,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingBudget": -1
    }
  }
}


Output from zcat on the captured gemini-3.1-pro-preview payload (working correctly):

{
  "contents": [{"parts":[{"text":"test"}],"role":"user"}],
  "generationConfig": {
    "maxOutputTokens": 65536,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingLevel": "HIGH"
    }
  }
}
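To compare the two captures programmatically rather than by eye, a small classifier can report which thinking parameter a payload carries. This is a sketch only: the function name classifyThinkingConfig and the label strings are hypothetical, and the payload type models just the fields visible in the captures above.

```typescript
// Only the fields visible in the captured payloads are modelled here.
type Payload = {
  generationConfig?: { thinkingConfig?: Record<string, unknown> };
};

type ThinkingStyle = "level" | "legacy-budget" | "both" | "none";

// Hypothetical helper: report which thinking parameter a payload uses.
function classifyThinkingConfig(payload: Payload): ThinkingStyle {
  const thinkingConfig = payload.generationConfig?.thinkingConfig ?? {};
  const hasLevel = "thinkingLevel" in thinkingConfig;
  const hasBudget = "thinkingBudget" in thinkingConfig;
  if (hasLevel && hasBudget) return "both";
  if (hasLevel) return "level";
  if (hasBudget) return "legacy-budget";
  return "none";
}

// The thinkingConfig objects from the two captures in this report.
const flashPayload: Payload = {
  generationConfig: {
    thinkingConfig: { includeThoughts: true, thinkingBudget: -1 },
  },
};
const proPayload: Payload = {
  generationConfig: {
    thinkingConfig: { includeThoughts: true, thinkingLevel: "HIGH" },
  },
};

console.log(classifyThinkingConfig(flashPayload)); // "legacy-budget"
console.log(classifyThinkingConfig(proPayload)); // "level"
```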

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

Steps to reproduce

1. Run openclaw proxy run openclaw infer model run --prompt "test" --model google/gemini-flash-latest --thinking adaptive
2. View the captured proxy payload using zcat ~/.openclaw/debug-proxy/blobs/<blob-id>.bin.gz
3. Observe that generationConfig.thinkingConfig contains thinkingBudget: -1.
4. Run the same command using --model google/gemini-3.1-pro-preview and observe that it correctly sends thinkingLevel.
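The inspection step can also be scripted instead of using zcat. The sketch below uses Node's built-in zlib module and assumes the captured blob is plain gzip-compressed JSON, as the zcat output in this report suggests. The helper names decodeBlob and readCapturedPayload are hypothetical, and the demo uses an in-memory blob so that it runs without a capture on disk.

```typescript
import { readFileSync } from "node:fs";
import { gunzipSync, gzipSync } from "node:zlib";

// Decode a gzip-compressed JSON blob.
// Assumption: captured blobs are gzip-wrapped JSON with no extra framing.
function decodeBlob(compressed: Uint8Array): unknown {
  return JSON.parse(gunzipSync(compressed).toString("utf8"));
}

// Hypothetical helper: read and decode a captured blob from disk.
// Pass the path of the .bin.gz blob produced by the debug proxy.
export function readCapturedPayload(path: string): unknown {
  return decodeBlob(readFileSync(path));
}

// Round-trip demo: compress a payload shaped like the Flash capture,
// then decode it the same way a blob on disk would be decoded.
const sample = gzipSync(
  JSON.stringify({
    generationConfig: {
      thinkingConfig: { includeThoughts: true, thinkingBudget: -1 },
    },
  }),
);
console.log(JSON.stringify(decodeBlob(sample), null, 2));
```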

Expected behavior

OpenClaw should recognize gemini-flash-latest (a Gemini 3.5 model) as supporting dynamic/adaptive thinking and natively omit thinkingBudget, or pass thinkingLevel as appropriate, matching the behavior of gemini-3.1-pro-preview and Gemini API documentation.

Actual behavior

The debug proxy payload sent to Google contains thinkingBudget: -1 instead of thinkingLevel.

OpenClaw version

2026.5.18 (50a2481)

Operating system

Linux 6.12.75+rpt-rpi-2712 (arm64) / Node.js v24.14.0

Install method

npm global

Model

google/gemini-flash-latest

Provider / routing chain

openclaw -> google

Additional provider/model setup details

Tested using the openclaw infer model run command, which routes through the generic inference (ZAI) wrapper rather than the primary agent gateway routing.

Impact and severity

Affected: Users relying on the CLI inference wrappers or ZAI transports with gemini-flash-latest or gemini-3.5-flash while using adaptive thinking.
Severity: Medium (sends legacy/deprecated parameters to the Google API, which Google documentation warns may result in unexpected performance).
Frequency: 100% reproducible for this specific model alias and thinking flag via the infer command.
Consequence: The Google API evaluates the request using legacy token budget constraints rather than modern Gemini 3 thinkingLevel semantics.
