openclaw - ✅ (Solved) Fix [Bug]: openai-completions provider never sends stream_options.include_usage: true, causing context token tracking to always show 0% [3 pull requests, 1 participant]

GitHub stats

openclaw/openclaw#68707 (fetched 2026-04-19 15:08:25)

  • Comments: 0
  • Participants: 1
  • Timeline: 10
  • Reactions: 0
  • Timeline (top): referenced ×5, cross-referenced ×3, closed ×1, labeled ×1

Root Cause

resolveIncludeUsageForStreaming() in openai-http-CBh7eNgq.js requires stream_options.include_usage === true in the inbound payload, but OpenClaw never injects this when proxying to openai-completions backends like llama-cpp or LM Studio.

  • Provider: llama-cpp via openai-completions API type
  • Version: v2026.4.15
  • Expected: /status shows real context usage
  • Actual: Always 0/33k (0%)
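A minimal sketch of the strict check described above, for illustration only. The function name includeUsageRequested and the request interface are hypothetical stand-ins; the real resolveIncludeUsageForStreaming() is not shown in the issue and may differ.

```typescript
// Hypothetical request shape; only the fields relevant to the issue are modeled.
interface ChatCompletionRequest {
  model: string;
  stream?: boolean;
  stream_options?: { include_usage?: boolean };
}

// Stand-in for the gateway-side check: anything other than an explicit
// `true` (missing stream_options, missing flag, or false) disables usage.
function includeUsageRequested(req: ChatCompletionRequest): boolean {
  return req.stream_options?.include_usage === true;
}

// The payload as the issue describes it: streaming, but no stream_options.
const withoutFlag: ChatCompletionRequest = { model: "local-model", stream: true };

// The same payload with the flag injected.
const withFlag: ChatCompletionRequest = {
  ...withoutFlag,
  stream_options: { include_usage: true },
};

console.log(includeUsageRequested(withoutFlag), includeUsageRequested(withFlag));
```

Because the comparison is strict, a request that simply omits stream_options is treated the same as one that sets the flag to false.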

Fix Action

Fixed

PR fix notes

PR #68742: fix(agents): restore streaming usage tracking for non-native openai-completions providers

Description (problem / solution / changelog)

Summary

Fixes #68707

When using custom openai-completions endpoints (llama-cpp, LM Studio, DashScope, etc.), stream_options.include_usage: true was never injected into the request payload, causing /status to always show 0% context usage.
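To illustrate why a missing flag can surface as 0%, here is a sketch of reading usage from a streamed response. In the OpenAI streaming format, the usage object arrives in a final chunk (with an empty choices array) only when include_usage was requested. The contextPercent helper and the way it maps tokens to a percentage are assumptions, not OpenClaw's actual /status logic; the 33,000-token window mirrors the "0/33k" figure in the report.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Simplified chunk shape: content chunks carry no usage, the final one does.
interface StreamChunk {
  choices: unknown[];
  usage?: Usage | null;
}

// Hypothetical helper: take the last reported usage, if any, and express
// it as a percentage of the context window.
function contextPercent(chunks: StreamChunk[], contextWindow: number): number {
  let totalTokens = 0;
  for (const chunk of chunks) {
    if (chunk.usage) {
      totalTokens = chunk.usage.total_tokens;
    }
  }
  return Math.round((totalTokens / contextWindow) * 100);
}

// Stream without include_usage: no chunk ever carries usage.
const streamWithoutUsage: StreamChunk[] = [
  { choices: [{}], usage: null },
  { choices: [{}] },
];

// Stream with include_usage: a trailing chunk reports token counts.
const streamWithUsage: StreamChunk[] = [
  { choices: [{}], usage: null },
  { choices: [], usage: { prompt_tokens: 8000, completion_tokens: 250, total_tokens: 8250 } },
];

const percentWithout = contextPercent(streamWithoutUsage, 33000);
const percentWith = contextPercent(streamWithUsage, 33000);
console.log(percentWithout, percentWith);
```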

Root Cause

In resolveOpenAICompletionsCompatDefaults(), supportsUsageInStreaming was set to false when usesConfiguredNonOpenAIEndpoint was true (i.e. the user pointed to a custom endpoint) and supportsNativeStreamingUsageCompat was not explicitly set. This caused the transport layer to skip injecting stream_options: { include_usage: true }.

Fix

  • openai-completions-compat.ts: Changed the default from opt-in to opt-out. supportsNativeStreamingUsageCompat now defaults to true via ?? true, since most OpenAI-compatible APIs handle stream_options.include_usage correctly (as established by #46142).

  • provider-model-compat.ts: Updated the fallback in the compat override layer from ?? false to ?? true to match the new default.
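The opt-in to opt-out change can be sketched with the nullish coalescing operator. The interface and function names below are hypothetical; only the supportsNativeStreamingUsageCompat field name and the ?? false to ?? true change come from the PR description.

```typescript
// Hypothetical override shape; the flag is optional, so "unset" is possible.
interface CompatOverrides {
  supportsNativeStreamingUsageCompat?: boolean;
}

// Old behavior (opt-in): an unset flag resolves to false.
function resolveOptIn(overrides: CompatOverrides): boolean {
  return overrides.supportsNativeStreamingUsageCompat ?? false;
}

// New behavior (opt-out): an unset flag resolves to true.
function resolveOptOut(overrides: CompatOverrides): boolean {
  return overrides.supportsNativeStreamingUsageCompat ?? true;
}

const unset: CompatOverrides = {};
const explicitlyOff: CompatOverrides = { supportsNativeStreamingUsageCompat: false };

console.log(resolveOptIn(unset), resolveOptOut(unset), resolveOptOut(explicitlyOff));
```

Since ?? only falls back on null or undefined, an explicit false still wins, which is consistent with the PR's statement that providers already forced to false are unaffected.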

Impact

  • Restores context token tracking for llama-cpp, LM Studio, and other custom openai-completions providers
  • No behavioral change for native OpenAI endpoints (already had true)
  • No behavioral change for known non-standard providers (already forced false)
  • Aligns with prior fix (#46142) which established that most compat APIs support streaming usage

Testing

  • Updated unit tests to reflect the new default
  • Existing ollama tests remain passing (explicit isOllamaCompatProvider branch unchanged)

Changed files

  • src/agents/model-compat.test.ts (modified, +7/-3)
  • src/agents/openai-completions-compat.test.ts (modified, +2/-2)
  • src/agents/openai-completions-compat.ts (modified, +1/-1)
  • src/plugins/provider-model-compat.ts (modified, +2/-2)

PR #68746: fix: always send stream_options.include_usage when streaming (openai-completions)

Description (problem / solution / changelog)

Fixes #68707

Problem

buildOpenAICompletionsParams() only included stream_options: { include_usage: true } when compat.supportsUsageInStreaming was true. For non-standard/custom endpoints (llama-cpp, LM Studio, etc.), this flag resolved to false, so the gateway's resolveIncludeUsageForStreaming() never saw the field and context token tracking was always 0%.

Fix

Always include stream_options: { include_usage: true } in streaming request payloads, matching the OpenAI SDK's default behavior. Backends that don't support the field simply ignore it.
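The difference between the gated and unconditional behavior can be sketched as follows. Both builder functions and the parameter types are hypothetical simplifications; the real buildOpenAICompletionsParams() takes more inputs than shown here.

```typescript
interface Compat {
  supportsUsageInStreaming: boolean;
}

// Simplified streaming params; real payloads carry messages and more.
interface StreamingParams {
  model: string;
  stream: true;
  stream_options?: { include_usage: true };
}

// Before: stream_options is attached only when the compat flag allows it.
function buildParamsGated(model: string, compat: Compat): StreamingParams {
  const params: StreamingParams = { model, stream: true };
  if (compat.supportsUsageInStreaming) {
    params.stream_options = { include_usage: true };
  }
  return params;
}

// After: stream_options is always attached to streaming requests.
function buildParamsAlways(model: string): StreamingParams {
  return { model, stream: true, stream_options: { include_usage: true } };
}

const gated = buildParamsGated("local-model", { supportsUsageInStreaming: false });
const always = buildParamsAlways("local-model");
console.log(JSON.stringify(gated), JSON.stringify(always));
```

The unconditional variant relies on the PR's stated assumption that backends without support ignore the field, so it trades per-provider control for simplicity.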

Changes

  • src/agents/openai-transport-stream.ts — unconditionally include stream_options instead of gating on compat.supportsUsageInStreaming
  • src/agents/openai-transport-stream.test.ts — added test for non-standard backends, updated existing assertion

Changed files

  • CHANGELOG.md (modified, +2/-0)
  • src/agents/openai-transport-stream.test.ts (modified, +28/-1)
  • src/agents/openai-transport-stream.ts (modified, +1/-3)

PR #68749: fix: enable stream_options.include_usage for local openai-completions endpoints

Description (problem / solution / changelog)

Problem

Closes #68707

When using local OpenAI-compatible backends (llama.cpp, LM Studio, Ollama), stream_options: { include_usage: true } is never sent in streaming requests. This causes context token usage to always show 0% in /status and the web UI.

Root Cause

In detectOpenAICompletionsCompat(), supportsUsageInStreaming evaluates to false for local endpoints because:

  • usesConfiguredNonOpenAIEndpoint = true (endpointClass "local" ≠ "default")
  • supportsNativeStreamingUsageCompat = false (only moonshot/modelstudio qualify)
  • Result: true && (!true || false) = false

However, llama.cpp and LM Studio both implement the OpenAI-compatible streaming API and support stream_options.include_usage.

Fix

Add endpointClass === "local" to the supportsUsageInStreaming condition. This allows local OpenAI-compatible endpoints to receive usage data in streaming responses.
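A sketch of the condition before and after this change, under stated assumptions: the Detection interface and both function names are hypothetical, and the leading "true &&" term from the evaluation above is omitted because the PR description does not say what it represents.

```typescript
// Hypothetical view of the values the detection logic works with.
interface Detection {
  endpointClass: string;
  usesConfiguredNonOpenAIEndpoint: boolean;
  supportsNativeStreamingUsageCompat: boolean;
}

// Before: a configured non-OpenAI endpoint needs the native compat flag.
function supportsUsageBefore(d: Detection): boolean {
  return !d.usesConfiguredNonOpenAIEndpoint || d.supportsNativeStreamingUsageCompat;
}

// After: local endpoints are also allowed to receive streaming usage.
function supportsUsageAfter(d: Detection): boolean {
  return (
    !d.usesConfiguredNonOpenAIEndpoint ||
    d.supportsNativeStreamingUsageCompat ||
    d.endpointClass === "local"
  );
}

// The local-endpoint case from the root cause analysis.
const localEndpoint: Detection = {
  endpointClass: "local",
  usesConfiguredNonOpenAIEndpoint: true,
  supportsNativeStreamingUsageCompat: false,
};

console.log(supportsUsageBefore(localEndpoint), supportsUsageAfter(localEndpoint));
```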

Note: LM Studio extension already works around this in extensions/lmstudio/src/stream.ts:259-264 by overriding supportsUsageInStreaming. This core fix covers llama.cpp, Ollama, and other local backends that do not have their own workaround.

Changed files

  • src/agents/openai-completions-compat.ts (modified, +1/-1)

Original issue report

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Root cause: resolveIncludeUsageForStreaming() in openai-http-CBh7eNgq.js requires stream_options.include_usage === true in the inbound payload, but OpenClaw never injects this when proxying to openai-completions backends like llama-cpp or LM Studio.

Steps to reproduce

No separate steps were listed. The reported setup is llama-cpp via the openai-completions API type on v2026.4.15.

Expected behavior

/status shows real context usage.

Actual behavior

/status always shows 0/33k (0%).

OpenClaw version

v2026.4.15

Operating system

Xubuntu 24.04

Install method

No response

Model

gemma-4-26B-A4B-it-UD-Q3_K_XL

Provider / routing chain

llama

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

The issue can likely be fixed by modifying OpenClaw to set stream_options: { include_usage: true } in the streaming request payload it sends when proxying to openai-completions backends.

Guidance

  • Review the OpenClaw code to identify where the streaming request payload is constructed and modify it to include stream_options: { include_usage: true }.
  • Verify that the resolveIncludeUsageForStreaming() function in openai-http-CBh7eNgq.js is correctly handling the updated payload.
  • Test the /status endpoint to ensure it displays the correct context usage.
  • Consider adding logging or debugging statements to verify that the stream_options.include_usage flag is being set correctly.

Example

No code example is provided as the issue does not contain sufficient information about the OpenClaw codebase.

Notes

The fix assumes that the issue is solely due to the missing stream_options.include_usage flag in the inbound payload. Additional debugging may be required to ensure that other parts of the system are functioning correctly.

Recommendation

Apply workaround: Modify OpenClaw to set stream_options: { include_usage: true } in the streaming request payload, as this is the most direct way to address the issue based on the provided information.
