openclaw - 💡(How to fix) Fix [Bug] xAI/grok-4.3 usage tokens lost on tool-use turns since 2026.5.3 (5× cost under-reporting) [3 comments, 2 participants]

GitHub stats
openclaw/openclaw#78089 · Fetched 2026-05-06 06:17:02
Comments: 3 · Participants: 2 · Timeline events: 4 · Reactions: 2
Timeline (top): commented ×3, closed ×1


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Starting in 2026.5.3, xAI (xai/grok-4.3 and friends) tool-use turns no longer record token usage in session JSONL — usage.totalTokens is 0 and usage.cost.total is 0 for every assistant turn that emits a toolCall. Only single-shot turns (no tool calls) capture usage correctly. Confirmed against the xai/grok-4.3 model with provider: xai, api: openai-completions.

This breaks /status context tracking, the usage cost summary in the TUI, the costLog consumed by CodexBar/model-usage skills, and any downstream analytics. Concretely: the gateway under-reports xAI spend by roughly 5× or more in my data ($1.93 estimated from local logs vs. $11.43 actually billed by x.ai for May 4 2026, a ratio of about 5.9×).

Steps to reproduce

  1. Configure xAI provider in openclaw.json:

    "xai": {
      "baseUrl": "https://api.x.ai/v1",
      "apiKey": "...",
      "api": "openai-completions",
      "models": [
        { "id": "grok-4.3", "api": "openai-completions", "reasoning": true,
          "cost": {"input": 1.25, "output": 2.5, "cacheRead": 0.2, "cacheWrite": 1.25} }
      ]
    }
  2. Run a Grok-4.3 turn that calls a local OpenClaw tool (e.g. read, exec, dir_list) and continues afterward — i.e., a typical multi-step agent flow.

  3. Inspect the assistant message in the session JSONL.

  4. Observe usage is recorded as zeros despite the turn obviously consuming substantial input + reasoning + output tokens:

    "usage": {
      "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0, "totalTokens": 0,
      "cost": {"input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0, "total": 0}
    }
  5. Compare with a single-shot turn (no tool call). For those, usage IS captured correctly.

  6. Verify the xAI endpoint actually returns full usage when streaming with stream_options.include_usage: true:

    curl -s -X POST "https://api.x.ai/v1/chat/completions" \
      -H "Authorization: Bearer $XAI_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "grok-4.3",
        "stream": true,
        "stream_options": {"include_usage": true},
        "messages": [{"role":"user","content":"Say hi"}]
      }' | tail -3

    Returns a final SSE chunk with full usage:

    {"choices":[],"usage":{
      "prompt_tokens":136,"completion_tokens":3,"total_tokens":736,
      "prompt_tokens_details":{"text_tokens":136,"cached_tokens":128},
      "completion_tokens_details":{"reasoning_tokens":597,...},
      "cost_in_usd_ticks":15356000
    }}

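    Note xAI returns: reasoning_tokens (currently the dominant cost: about 200× completion_tokens in this sample), cached_tokens (cache hits), and cost_in_usd_ticks (xAI's own billed cost in ticks; at the configured prices the sample works out to $0.0015356, which implies 10^10 ticks per USD rather than micro-cents). OpenClaw is ignoring all three on tool-use turns.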
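As a cross-check, the sample chunk can be reconciled against the configured prices by hand. The sketch below rests on two assumptions that this report does not state: that the cost values in the provider config are USD per million tokens, and that reasoning tokens are billed at the output rate. Under those assumptions the numbers line up exactly, which suggests one tick is 10^-10 USD.

```python
from decimal import Decimal

# Prices copied from the provider config in this report.
# Assumption: they are USD per 1,000,000 tokens.
PRICE_PER_MTOK = {
    "input": Decimal("1.25"),
    "output": Decimal("2.5"),
    "cacheRead": Decimal("0.2"),
}

# Values copied from the sample SSE usage chunk in this report.
usage = {
    "prompt_tokens": 136,
    "completion_tokens": 3,
    "total_tokens": 736,
    "cached_tokens": 128,
    "reasoning_tokens": 597,
    "cost_in_usd_ticks": 15356000,
}

MILLION = Decimal(1_000_000)


def estimate_cost_usd(u):
    """Price one usage chunk, treating reasoning tokens as output tokens (assumption)."""
    uncached_input = u["prompt_tokens"] - u["cached_tokens"]
    billed_output = u["completion_tokens"] + u["reasoning_tokens"]
    return (
        uncached_input * PRICE_PER_MTOK["input"]
        + u["cached_tokens"] * PRICE_PER_MTOK["cacheRead"]
        + billed_output * PRICE_PER_MTOK["output"]
    ) / MILLION


cost_usd = estimate_cost_usd(usage)
ticks_per_usd = usage["cost_in_usd_ticks"] / cost_usd

print("estimated cost (USD):", cost_usd)
print("implied ticks per USD:", ticks_per_usd)
```

The same sample also shows that total_tokens (736) is the sum of prompt, completion, and reasoning tokens, so completion_tokens alone excludes the reasoning share.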

Expected behavior

Every Grok-4.3 turn — including multi-step tool-use turns — should record real prompt_tokens, completion_tokens, cached_tokens, and reasoning_tokens from xAI's usage chunk, plus the calculated USD cost.

Actual behavior

Single-shot turns: usage captured ✅
Tool-use continuation turns: totalTokens: 0, cost.total: 0 ❌ on 95% of my Grok-4.3 turns (177 zeros vs. 9 working, out of 186 turns on May 4).
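A tally like this can be reproduced with a short script. Treat it as a sketch: the JSONL layout it assumes (one JSON object per line, the assistant message under a `message` key, tool calls as content items of type `toolCall`, usage under `message.usage`) is a guess based on the field names quoted in this report, not a documented schema. The function name `tally_usage` and the sample lines are made up for illustration.

```python
import json
from collections import Counter


def tally_usage(lines):
    """Count assistant turns keyed by (has_tool_call, usage_recorded)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        msg = entry.get("message", entry)  # assumed wrapper key
        if msg.get("role") != "assistant":
            continue
        content = msg.get("content") or []
        has_tool_call = isinstance(content, list) and any(
            isinstance(item, dict) and item.get("type") == "toolCall"
            for item in content
        )
        total = (msg.get("usage") or {}).get("totalTokens") or 0
        counts[(has_tool_call, total > 0)] += 1
    return counts


# Hypothetical session lines: one zeroed tool-use turn, one healthy
# single-shot turn, and a user message that should be ignored.
sample = [
    json.dumps({"message": {"role": "assistant",
                            "content": [{"type": "toolCall", "name": "read"}],
                            "usage": {"totalTokens": 0}}}),
    json.dumps({"message": {"role": "assistant",
                            "content": [{"type": "text", "text": "hi"}],
                            "usage": {"totalTokens": 736}}}),
    json.dumps({"message": {"role": "user", "content": "Say hi"}}),
]

counts = tally_usage(sample)
for (has_tool_call, recorded), n in sorted(counts.items()):
    print(f"tool_call={has_tool_call} usage_recorded={recorded}: {n}")
```

On a real session, pass an open file handle instead of `sample`; a large count under (True, False) would match the symptom described here.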

Likely root cause

The OpenAI-completions stream parser (dist/openai-transport-stream-Bvcf6n6H.js, parseTransportChunkUsage()) DOES read chunk.usage when present. And OpenClaw IS sending stream_options: { include_usage: true } (verified in the same file). So the request shape is correct.

The pattern of WHICH turns capture usage and which don't strongly suggests a tool-use continuation race: when a streaming turn ends with finish_reason: "tool_calls", OpenClaw's adapter may finalize the assistant message before the trailing usage chunk arrives. xAI emits the usage chunk in a final SSE data frame with choices: [], sent after the finish_reason frame and immediately before the data: [DONE] terminator. If the adapter's stream loop short-circuits on finish_reason without waiting for that trailing frame, usage is lost.

The 4.8% of turns that do capture usage (9 of 186) are exactly the simple turns where the model didn't request any tool: there's no early finish_reason cutoff, so the parser sees the usage chunk before the stream closes.

Regression timeline

  • ✅ Working: through 2026-05-03 (verified on 02ef334b session, enso-briefing cron, 6/6 turns captured)
  • ❌ Broken: starting ~2026-05-04 11:00 UTC after upgrade to 2026.5.3 (b53fc4f2 onward, 0 capture across all subsequent xAI tool-use sessions)

The 2026.5.3 release shipped substantial Anthropic/OpenAI provider refactors. An adjacent change to openai-completions stream finalization likely changed the order of operations.

Why this matters

  1. Every /status and /usage view is wrong for xAI sessions.
  2. Cost analytics under-reports by ~5×, making it hard to track spend.
  3. Context-window tracking is wrong — /status shows 0/256k instead of the real fraction.
  4. The compaction.maxHistoryShare safeguard can't trigger because it can't measure context usage.
  5. Subagents using Grok-4.3 (the OpenClaw default) are silently expensive.

Suggested fix

Defer finalizing the assistant message until the trailing usage chunk has been processed (or the SSE stream actually closes). xAI sends the usage chunk in a choices: [] frame AFTER all finish_reason frames; the loop should keep reading until [DONE] rather than breaking on the first finish_reason.
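To illustrate the difference between the two loop shapes, here is a minimal Python sketch. It is not OpenClaw's adapter code (which is JavaScript and isn't shown in this report). The frame sequence is hypothetical, modelled on the description above: a finish_reason: "tool_calls" frame, then a choices: [] usage frame, then [DONE]. `read_stream` and its `stop_on_finish_reason` flag are names invented for this example.

```python
import json

# Hypothetical SSE data lines, in the order this report describes.
FRAMES = [
    'data: {"choices":[{"delta":{"tool_calls":[{"index":0,"function":{"name":"read"}}]},"finish_reason":null}]}',
    'data: {"choices":[{"delta":{},"finish_reason":"tool_calls"}]}',
    'data: {"choices":[],"usage":{"prompt_tokens":136,"completion_tokens":3,"total_tokens":736}}',
    'data: [DONE]',
]


def read_stream(frames, stop_on_finish_reason):
    """Consume SSE data lines and return the finish reason and usage seen."""
    finish_reason = None
    usage = None
    for raw in frames:
        if not raw.startswith("data:"):
            continue
        payload = raw[len("data:"):].strip()
        if payload == "[DONE]":
            break  # the only terminator the corrected loop honours
        chunk = json.loads(payload)
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices", []):
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
        if stop_on_finish_reason and finish_reason:
            break  # suspected bug: exits before the trailing usage frame
    return {"finish_reason": finish_reason, "usage": usage}


early = read_stream(FRAMES, stop_on_finish_reason=True)
full = read_stream(FRAMES, stop_on_finish_reason=False)

print("break on finish_reason:", early)
print("read until [DONE]:     ", full)
```

Both variants see the same finish_reason, but only the one that keeps reading until [DONE] ends up with a usage object, which is the behaviour the suggested fix asks for.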

Also consider parsing xAI's cost_in_usd_ticks and completion_tokens_details.reasoning_tokens for accurate billing. The reasoning-tokens billing is currently the dominant cost component on Grok-4.3 (often 100-200× the visible completion tokens) and isn't surfaced anywhere in the gateway.
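One possible shape for that parsing is sketched below in Python. The output keys mirror the usage record shown in this report (input, output, cacheRead, cacheWrite, totalTokens); everything else is an assumption: the `reasoning` key is a hypothetical extra field, folding reasoning tokens into `output` is one design option among several, and the tick scale of 10^10 per USD is inferred from the sample chunk and configured prices rather than taken from xAI documentation.

```python
from decimal import Decimal

# Assumption: inferred from the sample chunk in this report, not from xAI docs.
TICKS_PER_USD = Decimal(10) ** 10


def normalize_xai_usage(raw):
    """Map an xAI usage chunk onto an OpenClaw-style usage record (sketch)."""
    prompt_details = raw.get("prompt_tokens_details") or {}
    completion_details = raw.get("completion_tokens_details") or {}
    cached = prompt_details.get("cached_tokens", 0)
    reasoning = completion_details.get("reasoning_tokens", 0)

    record = {
        "input": raw.get("prompt_tokens", 0) - cached,   # uncached prompt tokens
        "output": raw.get("completion_tokens", 0) + reasoning,
        "cacheRead": cached,
        "cacheWrite": 0,          # the sample chunk carries no cache-write count
        "reasoning": reasoning,   # hypothetical extra field
        "totalTokens": raw.get("total_tokens", 0),
    }

    ticks = raw.get("cost_in_usd_ticks")
    if ticks is not None:
        # Prefer the provider-reported cost when it is present.
        record["cost"] = {"total": float(Decimal(ticks) / TICKS_PER_USD)}
    return record


# Sample chunk from this report (the elided completion details are left out).
sample_chunk = {
    "prompt_tokens": 136,
    "completion_tokens": 3,
    "total_tokens": 736,
    "prompt_tokens_details": {"text_tokens": 136, "cached_tokens": 128},
    "completion_tokens_details": {"reasoning_tokens": 597},
    "cost_in_usd_ticks": 15356000,
}

record = normalize_xai_usage(sample_chunk)
print(record)
```

Whether reasoning tokens should be merged into output or reported separately is a decision for the maintainers; the sketch keeps both so neither view is lost.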

OpenClaw version

2026.5.3 (06d46f7)

Operating system

macOS 26.5 (arm64), Node 24.15.0

Provider details

  • Provider: xai
  • API: openai-completions
  • Models tested: grok-4.3, grok-4-1-fast-non-reasoning
  • Endpoint: https://api.x.ai/v1

Extent analysis

TL;DR

Modify the OpenAI-completions stream parser to wait for the trailing usage chunk before finalizing the assistant message.

Guidance

  1. Verify the stream finalization logic: in dist/openai-transport-stream-Bvcf6n6H.js, parseTransportChunkUsage() already reads chunk.usage when present, so check whether the surrounding stream loop is still running when the trailing usage chunk arrives.
  2. Update the stream loop: Modify the stream loop to continue reading until the [DONE] frame is received, rather than breaking on the first finish_reason frame.
  3. Parse additional usage fields: Consider parsing xAI's cost_in_usd_ticks and completion_tokens_details.reasoning_tokens for accurate billing.
  4. Test with tool-use turns: Verify that the updated parser correctly captures usage for tool-use turns, including multi-step agent flows.

Example

No code example is provided, as the necessary changes are specific to the OpenClaw codebase and require a thorough understanding of the existing implementation.

Notes

The suggested fix assumes that the issue is indeed caused by the stream finalization logic and that the trailing usage chunk is being sent by xAI after the finish_reason frame. Additional debugging and testing may be necessary to confirm this.

Recommendation

Apply the suggested fix to the OpenAI-completions stream handling to restore accurate usage tracking and billing for xAI sessions. The report locates the affected code in dist/openai-transport-stream-Bvcf6n6H.js in 2026.5.3; that file is a build output, so the lasting change presumably belongs in the source it is built from.
