openclaw - ✅(Solved) Fix [Bug]: toNormalizedUsage() discards accumulated input/cache tokens — cost tracking underreports ~80% of actual billed usage [1 pull requests, 1 comments, 2 participants]

openclaw/openclaw#53734 · Fetched 2026-04-08 01:24:10

toNormalizedUsage() in src/agents/pi-embedded-runner/run.ts (line ~192) intentionally returns only the last API call's input, cacheRead, and cacheWrite tokens instead of the accumulated values from the UsageAccumulator. While output tokens are correctly accumulated, input/cache tokens are taken from lastInput/lastCacheRead/lastCacheWrite. This causes session transcripts and cost tracking to miss ~80% of actual billed tokens in tool-heavy workflows.

This was introduced to fix the context-size display issue (#13698), but it inadvertently broke cost/accounting accuracy.

Root Cause

In run.ts:

const toNormalizedUsage = (usage: UsageAccumulator) => {
  const lastPromptTokens = usage.lastInput + usage.lastCacheRead + usage.lastCacheWrite;
  return {
    input:      usage.lastInput || undefined,       // ❌ last call only
    output:     usage.output || undefined,           // ✅ accumulated
    cacheRead:  usage.lastCacheRead || undefined,    // ❌ last call only
    cacheWrite: usage.lastCacheWrite || undefined,   // ❌ last call only
    total: lastPromptTokens + usage.output || undefined,
  };
};

The UsageAccumulator correctly accumulates all values via mergeUsageIntoAccumulator() (including input, cacheRead, cacheWrite), but toNormalizedUsage() discards the accumulated prompt-side fields. For a turn with N tool-call round-trips of similar prompt size, this loses roughly (N-1)/N of input and cache tokens.
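
To make the (N-1)/N estimate concrete, here is a minimal sketch rather than the project's actual code. The accumulator field names follow the snippet above; the CallUsage type, the body of merge(), and the per-call numbers are assumptions made for illustration.

```typescript
// Hypothetical per-call usage as reported by a provider.
interface CallUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Field names mirror those read by toNormalizedUsage(); the real
// UsageAccumulator may carry more state than this.
interface UsageAccumulator extends CallUsage {
  lastInput: number;
  lastCacheRead: number;
  lastCacheWrite: number;
}

const emptyAccumulator = (): UsageAccumulator => ({
  input: 0, output: 0, cacheRead: 0, cacheWrite: 0,
  lastInput: 0, lastCacheRead: 0, lastCacheWrite: 0,
});

// Assumed behaviour of mergeUsageIntoAccumulator(): add to the running
// totals and remember the most recent call separately.
function merge(acc: UsageAccumulator, call: CallUsage): void {
  acc.input += call.input;
  acc.output += call.output;
  acc.cacheRead += call.cacheRead;
  acc.cacheWrite += call.cacheWrite;
  acc.lastInput = call.input;
  acc.lastCacheRead = call.cacheRead;
  acc.lastCacheWrite = call.cacheWrite;
}

// Share of accumulated prompt-side tokens that is dropped when only the
// last call's values are reported.
function lostPromptFraction(acc: UsageAccumulator): number {
  const accumulated = acc.input + acc.cacheRead + acc.cacheWrite;
  const reported = acc.lastInput + acc.lastCacheRead + acc.lastCacheWrite;
  return accumulated === 0 ? 0 : 1 - reported / accumulated;
}

// Five identical round-trips (made-up numbers): 4/5 of the prompt-side
// tokens go missing.
const acc = emptyAccumulator();
for (let i = 0; i < 5; i++) {
  merge(acc, { input: 100, output: 20, cacheRead: 10000, cacheWrite: 0 });
}
const lost = lostPromptFraction(acc);
```

With equal-sized calls the lost share is exactly (N-1)/N. When the last call carries a larger prompt than the earlier ones, the lost share is somewhat smaller.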

Downstream consumers all read this lossy object:

  • Session store (session-store.ts:47) → inputTokens, cacheRead, cacheWrite, estimatedCostUsd
  • Transcript JSONL entries → usage block per assistant message
  • Gateway transcript fallback (session-utils.fs.ts)
  • Diagnostics emitter (agent-runner.ts:585-619)
  • Cost calculation (session-cost-usage.ts)

No alternative persistence path captures the full accumulated totals.

PR fix notes

PR #53748: fix(usage): return accumulated input/cache tokens from toNormalizedUsage

Description (problem / solution / changelog)

Summary

Fixes #53734.

toNormalizedUsage() was returning only the last API call's input/cacheRead/cacheWrite instead of accumulated totals from the UsageAccumulator. This caused cost tracking to severely underreport actual billed tokens in multi-tool-call turns.

  • Extract UsageAccumulator logic into a dedicated, testable usage-accumulator.ts module
  • Fix toNormalizedUsage() to return accumulated values for cost/billing tracking
  • Context-size display is unaffected — it already uses the separate lastCallUsage field

Local reproduction

Simulated 3 tool-call round-trips with prompt caching against the original code on main:

Token field   Accumulated (actual billed)   Reported (buggy)   Lost
input         370                           150                59.5%
cacheRead     246,000                       84,000             65.9%
cacheWrite    5,000                         undefined          100%

cacheWrite is 100% lost because the last call had 0 cache writes, so usage.lastCacheWrite is 0, which becomes undefined via || undefined.

After the fix, all fields report the correct accumulated values.
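
The table can be reproduced with a small sketch. The three per-call values below are hypothetical, chosen only so that they sum to the accumulated totals reported above. The buggy() and fixed() helpers are simplified stand-ins for the before and after behaviour of toNormalizedUsage(), not the PR's code.

```typescript
// Only the prompt-side fields read by toNormalizedUsage() are modelled.
interface PromptUsage {
  input: number;
  cacheRead: number;
  cacheWrite: number;
  lastInput: number;
  lastCacheRead: number;
  lastCacheWrite: number;
}

// Before the fix: prompt-side fields come from the last call only.
const buggy = (u: PromptUsage) => ({
  input: u.lastInput || undefined,
  cacheRead: u.lastCacheRead || undefined,
  cacheWrite: u.lastCacheWrite || undefined, // 0 || undefined -> undefined
});

// After the fix: prompt-side fields come from the accumulated totals.
const fixed = (u: PromptUsage) => ({
  input: u.input || undefined,
  cacheRead: u.cacheRead || undefined,
  cacheWrite: u.cacheWrite || undefined,
});

// Hypothetical per-call values; the last call performs no cache write.
const calls = [
  { input: 100, cacheRead: 80000, cacheWrite: 3000 },
  { input: 120, cacheRead: 82000, cacheWrite: 2000 },
  { input: 150, cacheRead: 84000, cacheWrite: 0 },
];

const usage: PromptUsage = {
  input: 0, cacheRead: 0, cacheWrite: 0,
  lastInput: 0, lastCacheRead: 0, lastCacheWrite: 0,
};
for (const c of calls) {
  usage.input += c.input;
  usage.cacheRead += c.cacheRead;
  usage.cacheWrite += c.cacheWrite;
  usage.lastInput = c.input;
  usage.lastCacheRead = c.cacheRead;
  usage.lastCacheWrite = c.cacheWrite;
}

const before = buggy(usage); // input 150, cacheRead 84000, cacheWrite undefined
const after = fixed(usage);  // input 370, cacheRead 246000, cacheWrite 5000
```

The fixed version still maps 0 to undefined, but that now happens only when the whole turn recorded no tokens for that field.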

Why this is safe

The downstream persistence layer (session-usage.ts) already handles cost vs context-size as separate concerns:

  • Cost estimation (line 112–114): Uses params.usage → now gets correct accumulated totals
  • Session cacheRead/cacheWrite (line 131): Prefers params.lastCallUsage when available → unaffected
  • totalTokens / context display (line 104–110): Derives from params.lastCallUsage → unaffected
  • usage.total override (run.ts): Already overwritten by lastTurnTotal → unaffected

Test plan

  • New usage-accumulator.test.ts (5 tests) — covers accumulation, the #53734 regression, and edge cases
  • Existing usage-reporting.test.ts — 5/5 pass
  • Existing session.test.ts — 51/51 pass (downstream persistence)
  • Existing agent-runner.misc.runreplyagent.test.ts — 32/32 pass (integration)
  • pnpm tsgo — clean
  • pnpm check — clean

🤖 Generated with Claude Code

Changed files

  • src/agents/pi-embedded-runner/run.ts (modified, +14/-84)
  • src/agents/pi-embedded-runner/usage-accumulator.test.ts (added, +167/-0)
  • src/agents/pi-embedded-runner/usage-accumulator.ts (added, +102/-0)

Impact

On a real deployment (572 sessions, ~7,900 assistant messages, ~7,300 tool calls over March 2026):

Metric          Anthropic Console (actual)   Transcript (logged)   Captured
Input tokens    3,106,387,899                413,926,649           13%
Output tokens   12,523,367                   1,858,613             15%
Cost            $2,148.66                    $430.25               20%

Verified independently by Claude Code and OpenAI Codex analyzing the source code — both confirmed no alternative persistence path exists for the full accumulated usage.

Steps to Reproduce

  1. Configure OpenClaw with Anthropic provider (prompt caching enabled by default)
  2. Start a session with tool-using agent (e.g., exec tool)
  3. Send a message that triggers multiple tool calls in one turn:
    Run these 3 commands: echo "hello", echo "world", echo "done"
  4. Check the transcript JSONL for the assistant message's usage.cacheRead
  5. Compare with Anthropic Console usage for the same time window

Expected: usage.cacheRead reflects sum of all API calls in the turn (≈ context_size × number_of_calls)
Actual: usage.cacheRead reflects only the last API call
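
For steps 4 and 5, a small helper can total a usage field across a transcript so the sum can be compared with the provider console. This is a sketch under a loud assumption: each JSONL line is taken to be a JSON object with an optional top-level usage block. The real transcript schema may nest usage differently, and sumUsageField is a hypothetical name.

```typescript
type UsageField = "input" | "output" | "cacheRead" | "cacheWrite";

// Assumed entry shape; adjust to the actual transcript schema.
interface TranscriptEntry {
  usage?: Partial<Record<UsageField, number>>;
}

// Sum one usage field over every line of a JSONL transcript.
function sumUsageField(jsonl: string, field: UsageField): number {
  let total = 0;
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // tolerate blank lines
    const entry = JSON.parse(trimmed) as TranscriptEntry;
    total += entry.usage?.[field] ?? 0; // entries without usage count as 0
  }
  return total;
}

// Made-up sample: one user entry and two assistant entries.
const sample = [
  JSON.stringify({ role: "user" }),
  JSON.stringify({ role: "assistant", usage: { input: 150, cacheRead: 84000 } }),
  JSON.stringify({ role: "assistant", usage: { input: 90, cacheRead: 85000 } }),
].join("\n");

const cacheReadTotal = sumUsageField(sample, "cacheRead");
```

In practice the string would come from reading the session's transcript file. A total far below the console figure for the same time window points to the underreporting described here.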

Suggested Fix

The types already have the separation needed (EmbeddedPiAgentMeta.lastCallUsage exists for context-size display). The fix is:

  1. Make toNormalizedUsage() return accumulated values for all fields (restore usage.input, usage.cacheRead, usage.cacheWrite from the accumulator)
  2. Continue using lastCallUsage / promptTokens for context-size display (already wired correctly via deriveSessionTotalTokens)
  3. Cost/accounting consumers already use agentMeta.usage — they will automatically get correct values

This separates the two conflated concepts:

  • Context-size display: needs last call's prompt tokens → lastCallUsage / promptTokens
  • Cost tracking: needs accumulated billed tokens → agentMeta.usage
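
The two concepts can be sketched as separate consumers. Everything here is illustrative: estimateCostUsd, contextTokens, the Rates type, and the price values are hypothetical and are neither OpenClaw's API nor real provider pricing.

```typescript
interface NormalizedUsage {
  input?: number;
  output?: number;
  cacheRead?: number;
  cacheWrite?: number;
}

// Hypothetical price table in USD per million tokens.
interface Rates {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

// Cost tracking: wants the accumulated totals for the whole turn.
function estimateCostUsd(usage: NormalizedUsage, rates: Rates): number {
  const cost = (tokens: number | undefined, rate: number) =>
    ((tokens ?? 0) * rate) / 1_000_000;
  return (
    cost(usage.input, rates.input) +
    cost(usage.output, rates.output) +
    cost(usage.cacheRead, rates.cacheRead) +
    cost(usage.cacheWrite, rates.cacheWrite)
  );
}

// Context-size display: wants the prompt size of the most recent call only.
function contextTokens(lastCall: NormalizedUsage): number {
  return (lastCall.input ?? 0) + (lastCall.cacheRead ?? 0) + (lastCall.cacheWrite ?? 0);
}

// Made-up values: accumulated usage for a turn, plus its last call.
const accumulated: NormalizedUsage = { input: 370, output: 60, cacheRead: 246000, cacheWrite: 5000 };
const lastCall: NormalizedUsage = { input: 150, cacheRead: 84000 };
const rates: Rates = { input: 1, output: 2, cacheRead: 0.5, cacheWrite: 4 };

const turnCost = estimateCostUsd(accumulated, rates);
const contextSize = contextTokens(lastCall);
```

Feeding the last-call object into the cost function, or the accumulated object into the context display, reproduces the two bugs this issue and #13698 describe.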

Related Issues

  • #13698 — Original context-size display bug that introduced lastCacheRead (the fix that caused this regression)
  • #17016 — Same context-size issue, closed as resolved by #13698 fix
  • #40870 — Complementary undercounting issue (reset archive transcripts ignored)

Environment

  • OpenClaw 2026.3.23-2
  • Anthropic provider with prompt caching
  • Heavy tool-use workflows (avg ~1 tool call per assistant message)

extent analysis

Fix Plan

To fix the issue, we need to modify the toNormalizedUsage() function to return accumulated values for all fields. Here are the steps:

  • Modify the toNormalizedUsage() function to use the accumulated values from the UsageAccumulator:
const toNormalizedUsage = (usage: UsageAccumulator) => {
  return {
    // All fields now come from totals accumulated across the turn's API calls.
    input:      usage.input || undefined,
    output:     usage.output || undefined,
    cacheRead:  usage.cacheRead || undefined,
    cacheWrite: usage.cacheWrite || undefined,
    // 0 maps to undefined, so an empty turn reports no total.
    total: (usage.input + usage.cacheRead + usage.cacheWrite + usage.output) || undefined,
  };
};
  • Continue using lastCallUsage / promptTokens for context-size display, as it is already correctly wired via deriveSessionTotalTokens.
  • No changes are needed for cost/accounting consumers, as they already use agentMeta.usage and will automatically get the correct values.

Verification

To verify the fix, you can:

  • Run a session with a tool-using agent and trigger multiple tool calls in one turn.
  • Check the transcript JSONL for the assistant message's usage.cacheRead and compare it with the Anthropic Console usage for the same time window.
  • The usage.cacheRead should now reflect the sum of all API calls in the turn.

Extra Tips

  • Make sure to test the fix thoroughly to ensure that it does not introduce any new issues.
  • Consider adding additional logging or monitoring to track the usage and cost metrics, to quickly identify any potential issues in the future.
