openclaw - 💡(How to fix) Fix [Bug]: CLI inference wrapper sends legacy thinkingBudget instead of thinkingLevel for gemini-flash-latest


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

When running openclaw infer model run with --thinking adaptive, OpenClaw incorrectly sends the legacy thinkingBudget: -1 parameter to the Google Gemini API for gemini-flash-latest instead of the required thinkingLevel parameter.

Steps to reproduce

  1. Run openclaw proxy run openclaw infer model run --prompt "test" --model google/gemini-flash-latest --thinking adaptive
  2. View the captured proxy payload using zcat ~/.openclaw/debug-proxy/blobs/<blob-id>.bin.gz
  3. Observe that generationConfig.thinkingConfig contains thinkingBudget: -1.
  4. Run the same command using --model google/gemini-3.1-pro-preview and observe it correctly sends thinkingLevel.
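Steps 3 and 4 can also be checked mechanically. The following is a minimal TypeScript sketch, assuming the blob has already been decompressed to a JSON string. The helper classifyThinking and its return labels are hypothetical names used for illustration; they are not OpenClaw APIs.

```typescript
// Shape of the thinkingConfig object seen in the captured payloads.
interface ThinkingConfig {
  includeThoughts?: boolean;
  thinkingBudget?: number;
  thinkingLevel?: string;
}

interface CapturedPayload {
  generationConfig?: { thinkingConfig?: ThinkingConfig };
}

type ThinkingStyle = "legacy-budget" | "level" | "both" | "none";

// Classify which thinking parameter a decompressed payload carries.
// Hypothetical helper for illustration, not part of OpenClaw.
function classifyThinking(rawJson: string): ThinkingStyle {
  const payload = JSON.parse(rawJson) as CapturedPayload;
  const cfg = payload.generationConfig?.thinkingConfig ?? {};
  const hasBudget = cfg.thinkingBudget !== undefined;
  const hasLevel = cfg.thinkingLevel !== undefined;
  if (hasBudget && hasLevel) return "both";
  if (hasBudget) return "legacy-budget";
  if (hasLevel) return "level";
  return "none";
}

// The two payload shapes reported in this issue, trimmed to the relevant field.
const flashPayload =
  '{"generationConfig":{"thinkingConfig":{"includeThoughts":true,"thinkingBudget":-1}}}';
const proPayload =
  '{"generationConfig":{"thinkingConfig":{"includeThoughts":true,"thinkingLevel":"HIGH"}}}';

const flashStyle = classifyThinking(flashPayload); // "legacy-budget"
const proStyle = classifyThinking(proPayload); // "level"
```

A result of "legacy-budget" for the Flash alias alongside "level" for the Pro model reproduces the mismatch described in this report.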

Expected behavior

OpenClaw should recognize gemini-flash-latest (a Gemini 3.5 model) as supporting dynamic/adaptive thinking and either omit thinkingBudget or pass thinkingLevel as appropriate, matching the behavior of gemini-3.1-pro-preview and the Gemini API documentation.

Actual behavior

The debug proxy payload sent to Google contains thinkingBudget: -1 instead of thinkingLevel.

OpenClaw version

2026.5.18 (50a2481)

Operating system

Linux 6.12.75+rpt-rpi-2712 (arm64) / Node.js v24.14.0

Install method

npm global

Model

google/gemini-flash-latest

Provider / routing chain

openclaw -> google

Additional provider/model setup details

Tested using the openclaw infer model run command, which routes through the generic inference (ZAI) wrapper rather than the primary agent gateway routing.

Logs, screenshots, and evidence

Output from zcat on the captured gemini-flash-latest payload:


{
  "contents": [{"parts":[{"text":"test"}],"role":"user"}],
  "generationConfig": {
    "maxOutputTokens": 65536,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingBudget": -1
    }
  }
}


Output from zcat on the captured gemini-3.1-pro-preview payload (working correctly):

{
  "contents": [{"parts":[{"text":"test"}],"role":"user"}],
  "generationConfig": {
    "maxOutputTokens": 65536,
    "thinkingConfig": {
      "includeThoughts": true,
      "thinkingLevel": "HIGH"
    }
  }
}
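
The difference between the two captures comes down to a single key inside thinkingConfig. As a small illustration (diffKeys is a hypothetical helper, not something OpenClaw provides), the key sets can be compared directly:

```typescript
// Report keys that appear in one object but not the other.
// Hypothetical helper for illustration only.
function diffKeys(
  a: Record<string, unknown>,
  b: Record<string, unknown>
): { onlyInA: string[]; onlyInB: string[] } {
  const onlyInA = Object.keys(a).filter((k) => !(k in b)).sort();
  const onlyInB = Object.keys(b).filter((k) => !(k in a)).sort();
  return { onlyInA, onlyInB };
}

// thinkingConfig values copied from the two captured payloads above.
const flashThinking = { includeThoughts: true, thinkingBudget: -1 };
const proThinking = { includeThoughts: true, thinkingLevel: "HIGH" };

const delta = diffKeys(flashThinking, proThinking);
// delta.onlyInA -> ["thinkingBudget"], delta.onlyInB -> ["thinkingLevel"]
```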

Impact and severity

Affected: Users relying on the CLI inference wrappers or ZAI transports with gemini-flash-latest or gemini-3.5-flash while using adaptive thinking.
Severity: Medium (Sends legacy/deprecated parameters to the Google API, which Google documentation warns may result in unexpected performance).
Frequency: 100% reproducible for this specific model alias and thinking flag via the infer command.
Consequence: The Google API evaluates the request using legacy token budget constraints rather than modern Gemini 3 thinkingLevel semantics.

Additional information

Based on a reading of the bundled source in dist/provider-stream-D4qSxrOO.js, the generic wrapper's supportsAdaptiveThinking(modelId) function only returns true for Claude 3.7/3.6 models, so every other model falls back to a token budget (thinkingBudget: -1). In addition, resolveGoogleGemini3ThinkingLevel in the Google provider adapter (dist/provider-stream-shared-CPv67a5n.js) does not rewrite this legacy budget parameter to a thinkingLevel for Flash models; it only handles Pro models.

Temporary workaround: explicitly define the model in openclaw.json with a fixed thinkingLevel: "high".
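
For discussion, the missing rewrite for Flash models might look roughly like the TypeScript sketch below. This is not the actual body of resolveGoogleGemini3ThinkingLevel, which is not quoted in this report. The names usesThinkingLevel and rewriteLegacyBudget, the model list, and the choice to map a budget of -1 to "HIGH" (mirroring the captured Pro payload) are all assumptions.

```typescript
interface GeminiThinkingConfig {
  includeThoughts?: boolean;
  thinkingBudget?: number;
  thinkingLevel?: string;
}

// ASSUMPTION: only the model ids named in this issue are treated as
// level-based here; the real adapter's detection logic is not shown.
const LEVEL_BASED_MODELS = [
  "gemini-flash-latest",
  "gemini-3.5-flash",
  "gemini-3.1-pro-preview",
];

function usesThinkingLevel(modelId: string): boolean {
  // Accept ids with or without the "google/" provider prefix.
  const prefix = "google/";
  const bare = modelId.indexOf(prefix) === 0 ? modelId.slice(prefix.length) : modelId;
  return LEVEL_BASED_MODELS.indexOf(bare) !== -1;
}

// Rewrite a dynamic legacy budget (-1) into a thinkingLevel.
// ASSUMPTION: -1 maps to "HIGH", matching the Pro payload in this issue.
function rewriteLegacyBudget(
  modelId: string,
  cfg: GeminiThinkingConfig
): GeminiThinkingConfig {
  if (!usesThinkingLevel(modelId) || cfg.thinkingBudget !== -1) {
    return cfg; // leave other models and explicit budgets untouched
  }
  // Drop thinkingBudget and keep every other field as-is.
  const { thinkingBudget: _dropped, ...rest } = cfg;
  return { ...rest, thinkingLevel: cfg.thinkingLevel ?? "HIGH" };
}

const fixedFlash = rewriteLegacyBudget("google/gemini-flash-latest", {
  includeThoughts: true,
  thinkingBudget: -1,
});
// fixedFlash -> { includeThoughts: true, thinkingLevel: "HIGH" }
```

Explicit, non-negative budgets are passed through unchanged in this sketch, so it only touches the dynamic -1 case reported here.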
