openclaw - ✅(Solved) Fix [Bug]: Regression: custom OpenAI-compatible embedded sessions omit stream_options.include_usage [3 pull requests, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#78661 (fetched 2026-05-07 03:34:11)
Comments: 1
Participants: 2
Timeline: 8
Reactions: 2
Timeline (top): cross-referenced ×3, labeled ×2, commented ×1, mentioned ×1

I am seeing a regression related to #75357. I thought the fix was working, but it is possible I still had the compat override for stream_options.include_usage set on the model. Note that this is not a duplicate of #73990: it is not asking for fallback estimation when telemetry is missing. Proxy evidence shows OpenClaw omits stream_options.include_usage on a route where #75357 intended it to be sent.

Custom OpenAI-compatible models can still stream without stream_options.include_usage, causing transcript usage to be recorded as zero and dashboard usage to report no tokens.

The previous fix appears to still exist in the OpenClaw OpenAI completions transport, but this path seems to be bypassed by embedded session stream resolution.

Environment

  • OpenClaw: 2026.5.6
  • Related fixed issue: #75357
  • Provider type: custom OpenAI-compatible provider
  • Backend: llama.cpp-compatible OpenAI /v1/chat/completions endpoint

Example Config

{
  "models": {
    "providers": {
      "llama": {
        "baseUrl": "http://192.168.x.x:8080/v1",
        "api": "openai-completions",
        "models": [
          {
            "id": "qwen36-35b-a3b",
            "api": "openai-completions"
          }
        ]
      }
    }
  }
}

Root Cause

Before the fix, resolveEmbeddedAgentStreamFn returns PI’s native openai-completions stream unchanged (streamFn is the same as nativeStreamFn), because that stream is treated as a custom session stream instead of a default fallback.

Fix Action

Fixed

PR fix notes

PR #3: fix(agents): route PI openai-completions session streams through boundary transport

Description (problem / solution / changelog)

Summary

  • Problem: Embedded runs could keep PI’s native openai-completions stream on the session agent when it was not the global streamSimple reference, so OpenClaw skipped the boundary-aware completions transport and streaming bodies omitted stream_options.include_usage, producing zero usage in transcripts and dashboards for custom OpenAI-compatible backends (e.g. llama.cpp).
  • Why it matters: Accurate token/usage telemetry and consistent OpenClaw HTTP shaping for openai-completions are expected without per-model compat.supportsUsageInStreaming overrides (#75357 intent).
  • What changed: embeddedSessionUsesDefaultPiStreamFn treats the session stream as the default PI path when it matches getApiProvider(model.api)’s stream / streamSimple, so resolveEmbeddedAgentStreamFn and describeEmbeddedAgentStreamStrategy use createBoundaryAwareStreamFnForModel the same way as for undefined / global streamSimple.
  • What did NOT change (scope boundary): Explicit non-PI custom currentStreamFn values, provider-owned streams, WebSocket transport, Anthropic Vertex routing, and other APIs are unchanged except where they already shared this guard.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #78661
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: resolveEmbeddedAgentStreamFn only swapped in the boundary-aware transport when currentStreamFn was undefined or strictly equal to the global streamSimple. Sessions can carry PI’s provider-registered stream / streamSimple for openai-completions, which is a different function identity, so the swap was skipped and PI’s native HTTP path ran without OpenClaw’s stream_options shaping.
  • Missing detection / guardrail: No check that the session’s stream was still PI’s default implementation for the model’s api.
  • Contributing context (if known): Embedded session reuse / stream assignment from PI keeps the provider-native stream installed as agent.streamFn.
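
The identity problem described in this root cause can be reproduced in isolation. The following is a minimal sketch, not OpenClaw's or PI's code: `StreamFn`, `streamSimple`, `providerStreamSimple`, `boundaryAware`, and `resolveBeforeFix` are hypothetical stand-ins that only mirror the shape of the guard described above.

```typescript
// Hypothetical stand-ins: none of these are OpenClaw's or PI's real implementations.
type StreamFn = (model: string, body: Record<string, unknown>) => Record<string, unknown>;

// Stand-in for the global `streamSimple` export.
const streamSimple: StreamFn = (_model, body) => body;

// Stand-in for the provider-registered stream: same behavior, different function object.
const providerStreamSimple: StreamFn = (model, body) => streamSimple(model, body);

// Stand-in for the boundary-aware transport, which adds usage reporting to the body.
const boundaryAware: StreamFn = (_model, body) => ({
  ...body,
  stream_options: { include_usage: true },
});

// Pre-fix shape of the guard: only `undefined` or the exact global reference is swapped.
function resolveBeforeFix(current: StreamFn | undefined): StreamFn {
  if (current === undefined || current === streamSimple) {
    return boundaryAware;
  }
  return current; // anything else is treated as a custom session stream
}

const resolvedGlobal = resolveBeforeFix(streamSimple);
const resolvedProvider = resolveBeforeFix(providerStreamSimple);

console.log(resolvedGlobal === boundaryAware); // true
console.log(resolvedProvider === providerStreamSimple); // true: the swap was skipped
console.log("stream_options" in resolvedProvider("m", { stream: true })); // false
```

The provider-registered function behaves like the global one, but strict equality compares references, so the guard never recognizes it as the default path.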

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/stream-resolution.test.ts
  • Scenario the test should lock in: getApiProvider("openai-completions") native stream passed as currentStreamFn must resolve to a wrapped boundary-aware stream (spy) and forward resolvedApiKey; strategy string reports boundary-aware:openai-completions.
  • Why this is the smallest reliable guardrail: Asserts the exact regression (native PI stream identity) without a live llama.cpp proxy.
  • Existing test that already covers this (if any): prior tests only covered currentStreamFn: undefined for boundary-aware paths.
  • If no new test is added, why not: N/A — tests added.

User-visible / Behavior Changes

Custom openai-completions models without compat.supportsUsageInStreaming should again receive stream_options.include_usage on embedded streaming runs when the session used PI’s default completions stream, restoring usage blocks where the backend supports them.

Diagram (if applicable)

N/A

Security Impact (required)

  • New permissions/capabilities? (No)
  • Secrets/tokens handling changed? (No)
  • New/changed network calls? (No — same transport as the existing boundary-aware path; only routing when the session stream is PI-default)
  • Command/tool execution surface changed? (No)
  • Data access scope changed? (No)
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Linux (agent workspace; full pnpm test not executed — Node/pnpm unavailable in runner image)
  • Runtime/container: N/A
  • Model/provider: Custom OpenAI-compatible / openai-completions per #78661
  • Integration/channel (if any): N/A
  • Relevant config (redacted): Per issue example models.providers.*.api: openai-completions

Steps

  1. Configure a custom OpenAI-compatible provider with api: openai-completions without compat.supportsUsageInStreaming.
  2. Run an embedded session whose agent.streamFn is PI’s native openai-completions stream (not global streamSimple).
  3. Observe outbound streaming request body includes stream_options.include_usage: true and usage is recorded.

Expected

Streaming requests include stream_options.include_usage and usage telemetry is non-zero when the backend emits usage.
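
When checking the expected behavior by hand, it can help to pull the usage block out of a captured stream. The sketch below assumes the OpenAI-style streaming format, where `include_usage` causes a final chunk with an empty `choices` array and a `usage` object to be sent before `[DONE]`. `extractUsage` is a hypothetical helper and the sample token counts are illustrative, not taken from the issue.

```typescript
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Scans SSE `data:` lines and returns the last usage object seen, if any.
function extractUsage(sse: string): Usage | undefined {
  let usage: Usage | undefined;
  for (const line of sse.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue;
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;
    const chunk = JSON.parse(payload) as { usage?: Usage | null };
    if (chunk.usage) usage = chunk.usage;
  }
  return usage;
}

// Illustrative capture: one content chunk, one usage chunk, then the terminator.
const sample = [
  'data: {"choices":[{"delta":{"content":"Hi"}}]}',
  'data: {"choices":[],"usage":{"prompt_tokens":12,"completion_tokens":3,"total_tokens":15}}',
  "data: [DONE]",
].join("\n");

console.log(extractUsage(sample)?.total_tokens); // 15
```

If the request omitted `stream_options.include_usage`, no chunk carries `usage` and the helper returns `undefined`, which matches the zero-usage transcripts reported in #78661.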

Actual (before fix)

Proxy showed stream_options.include_usage: false and transcript usage stayed zero (#78661).

Evidence

  • Failing test/log before + passing after — new unit tests encode the routing expectation; CI should run pnpm test src/agents/pi-embedded-runner/stream-resolution.test.ts in the changed lane.
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

  • Verified scenarios: Code path review against provider-transport-stream contract and issue root-cause narrative; lints clean on touched files.
  • Edge cases checked: Explicit vi.fn() custom streams for openai-responses remain session-custom / unchanged return (existing test).
  • What you did not verify: Live proxy against llama.cpp; full pnpm check:changed / pnpm test (toolchain absent in this environment).

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes)
  • Config/env changes? (No)
  • Migration needed? (No)
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: A hypothetical custom wrapper that exactly equals PI’s stream / streamSimple reference would be replaced — same as treating undefined; acceptable and aligned with “default PI transport” contract.
    • Mitigation: Identity checks only match PI-registered functions from getApiProvider(model.api).

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/pi-embedded-runner/stream-resolution.test.ts (modified, +44/-1)
  • src/agents/pi-embedded-runner/stream-resolution.ts (modified, +29/-3)

PR #78679: fix: recognize PI native API provider streams in embedded session stream resolution

Description (problem / solution / changelog)

Summary

Fixes the regression where embedded sessions using PI-native openai-completions streams bypassed the boundary-aware transport, causing stream_options.include_usage to not be injected and token usage to be reported as zero.

Related Issue

Closes #78661

Root Cause

In resolveEmbeddedAgentStreamFn(), the check params.currentStreamFn === streamSimple only matched the module-level streamSimple export from @mariozechner/pi-ai. When a session was initialized with a PI-native API provider stream (obtained via getApiProvider("openai-completions")?.streamSimple), the wrapped function had a different reference identity, causing the check to fall through to return currentStreamFn — bypassing the boundary-aware transport entirely.

This meant stream_options.include_usage: true was never injected, and token usage was reported as zero.

Changes

  • Added isPiNativeDefaultStream() helper in stream-resolution.ts that checks if a stream function is:
    1. undefined (no session stream set)
    2. The module-level streamSimple export (existing behavior)
    3. A registered PI-native API provider's streamSimple for the given model API (new)
  • Updated both resolveEmbeddedAgentStreamFn() and describeEmbeddedAgentStreamStrategy() to use this helper instead of the direct reference comparison
  • Added getApiProvider and Api type imports from @mariozechner/pi-ai
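
The three conditions listed above could be sketched roughly as follows. This is an illustration only: the real helper's signature is not shown in the PR description, and `streamSimple`, `registry`, and `getApiProvider` here are local stand-ins for the `@mariozechner/pi-ai` exports, not the library itself.

```typescript
type StreamFn = (...args: unknown[]) => unknown;

// Hypothetical stand-ins for the pi-ai exports named in the PR.
const streamSimple: StreamFn = () => "global";
const registry = new Map<string, { streamSimple: StreamFn }>([
  ["openai-completions", { streamSimple: () => "openai-completions native" }],
]);
function getApiProvider(api: string): { streamSimple: StreamFn } | undefined {
  return registry.get(api);
}

// Assumed signature; the checks follow the order of the list above.
function isPiNativeDefaultStream(streamFn: StreamFn | undefined, api: string): boolean {
  if (streamFn === undefined) return true; // 1. no session stream set
  if (streamFn === streamSimple) return true; // 2. module-level streamSimple export
  return streamFn === getApiProvider(api)?.streamSimple; // 3. provider-registered stream
}

const native = getApiProvider("openai-completions")?.streamSimple;
const custom: StreamFn = () => "custom";

console.log(isPiNativeDefaultStream(native, "openai-completions")); // true
console.log(isPiNativeDefaultStream(custom, "openai-completions")); // false
```

A genuinely custom stream still fails all three checks, so it keeps being returned unchanged.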

Testing

  • Added regression test for resolveEmbeddedAgentStreamFn: verifies PI native openai-completions stream is routed through boundary-aware transport (not returned as-is)
  • Added regression test for describeEmbeddedAgentStreamStrategy: verifies PI native openai-completions stream is described as boundary-aware:openai-completions (not session-custom)
  • All existing tests pass (full agents-pi-embedded project: all green)

Real behavior proof

The fix is a pure reference-equality correction in stream resolution logic. The regression tests directly exercise the exact code path using real PI API provider registrations (not mocks):

✓ routes PI native openai-completions streams through boundary-aware shaping (#78661)
✓ routes PI native openai-completions streams through boundary-aware transport (#78661)

Both tests use getApiProvider("openai-completions")?.streamSimple — the actual registered PI provider stream function — to verify the fix handles the real wrapped reference correctly. Before this fix, these tests would fail because isPiNativeDefaultStream did not exist and the native provider stream would be classified as session-custom.

I do not have a local openai-completions endpoint to demonstrate end-to-end token usage recovery. A maintainer can apply proof: override if the test coverage is sufficient, or I can add additional evidence if a specific setup is suggested.

Changed files

  • src/agents/pi-embedded-runner/stream-resolution.test.ts (modified, +47/-1)
  • src/agents/pi-embedded-runner/stream-resolution.ts (modified, +18/-3)

PR #78705: fix(embedded): route auth-wrapped session streams through boundary-aware transport

Description (problem / solution / changelog)

Summary

  • Problem: PI installs an auth wrapper streamFn that is not identity-equal to streamSimple. Embedded sessions with custom openai-completions providers bypassed the boundary-aware transport and omitted stream_options.include_usage, causing zero token usage in transcripts.
  • What changed: resolveEmbeddedAgentStreamFn now routes supported transport APIs through boundary-aware transport even when currentStreamFn is any non-streamSimple function.
  • Why it matters: Custom OpenAI-compatible providers (llama.cpp, local endpoints) receive stream_options.include_usage: true without requiring per-model compat.supportsUsageInStreaming.

Change Type

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security
  • Chore

Scope

  • Gateway/orchestration
  • Agents/embedded runner
  • Auth/tokens
  • Memory/storage
  • Integrations
  • API/contracts
  • UI/DX
  • CI/CD

Linked Issue

  • Fixes #78661
  • Related to #75357 (prior fix for stream_options.include_usage)
  • Distinct from #73990 (fallback estimation when telemetry missing)

Root Cause

Condition                           | Before            | After
currentStreamFn === undefined       | ✅ boundary-aware | ✅ boundary-aware
currentStreamFn === streamSimple    | ✅ boundary-aware | ✅ boundary-aware
currentStreamFn === PI auth wrapper | bypass            | boundary-aware

PI's createAgentSession installs an async wrapper that fetches auth then calls streamSimple. This wrapper is not identity-equal to the imported streamSimple, so the resolver treated it as a "custom" session stream.
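
A rough sketch of the auth-wrapper case and the API-based routing this PR describes is shown below. Every name is a hypothetical stand-in (`createAuthWrapper`, `SUPPORTED_TRANSPORT_APIS`, `resolveAfterFix`), and the supported set contains only the one API the description confirms; the actual resolver may differ.

```typescript
type StreamFn = (body: Record<string, unknown>) => Promise<Record<string, unknown>>;

// Stand-in for the imported streamSimple.
const streamSimple: StreamFn = async (body) => body;

// Hypothetical wrapper in the spirit of the description: fetch auth, then delegate.
function createAuthWrapper(getKey: () => Promise<string>): StreamFn {
  return async (body) => streamSimple({ ...body, apiKey: await getKey() });
}

// Stand-in for the boundary-aware transport.
const boundaryAware: StreamFn = async (body) => ({
  ...body,
  stream_options: { include_usage: true },
});

// Assumption: only openai-completions is listed because the source confirms it.
const SUPPORTED_TRANSPORT_APIS = new Set<string>(["openai-completions"]);

// Post-fix shape: route by API support instead of by function identity alone.
function resolveAfterFix(current: StreamFn | undefined, api: string): StreamFn {
  if (current === undefined || current === streamSimple) return boundaryAware;
  if (SUPPORTED_TRANSPORT_APIS.has(api)) return boundaryAware;
  return current; // unsupported APIs keep their session stream
}

const wrapper = createAuthWrapper(async () => "local-token");

console.log(wrapper === streamSimple); // false
console.log(resolveAfterFix(wrapper, "openai-completions") === boundaryAware); // true
console.log(resolveAfterFix(wrapper, "some-unsupported-api") === wrapper); // true
```

Compared with an identity allow-list, this approach also replaces explicit custom streams for supported APIs, which is a broader behavior change than matching only PI-registered functions.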

Regression Test Plan

  • Unit tests added
  • New test: "routes PI auth-wrapped session streams through boundary-aware transport for openai-completions"
  • New test: "describes auth-wrapped session streams as boundary-aware for supported APIs"
  • New test: "keeps auth-wrapped session streams labeled as custom for unsupported APIs"

Security Impact

  • No new permissions/capabilities
  • No security implications - purely stream routing logic

User-visible Changes

  • Custom OpenAI-compatible providers now report token usage in transcripts
  • No config changes required

Verification

  • Code review of resolveEmbeddedAgentStreamFn routing logic
  • Tests verify PI auth wrapper scenario
  • Existing tests for streamSimple fallback still pass

Compatibility

  • Backward compatible
  • No migration needed

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/pi-embedded-runner/stream-resolution.test.ts (modified, +65/-0)
  • src/agents/pi-embedded-runner/stream-resolution.ts (modified, +28/-1)

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Steps to reproduce

A proxy probe in front of the llama.cpp endpoint showed:

  • request.stream_options_include_usage: false
  • usage.seen: false
  • timings.seen: true
  • timings.prompt_n / timings.predicted_n: nonzero
  • transcript usage: zero

This is reproducible by using a custom openai-completions provider without compat.supportsUsageInStreaming: true and inspecting the streamed request body.
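
The request-side check a proxy probe performs can be approximated with a small helper. This is a sketch under assumptions: `requestsUsage` is a hypothetical name, and the two bodies are illustrative examples rather than captured traffic.

```typescript
// Returns true only when the body is a streaming request that asks for usage.
function requestsUsage(rawBody: string): boolean {
  const body = JSON.parse(rawBody) as {
    stream?: boolean;
    stream_options?: { include_usage?: boolean };
  };
  return body.stream === true && body.stream_options?.include_usage === true;
}

// Illustrative bodies: the first mirrors the regression, the second the intended shape.
const withoutUsage = JSON.stringify({ model: "qwen36-35b-a3b", stream: true, messages: [] });
const withUsage = JSON.stringify({
  model: "qwen36-35b-a3b",
  stream: true,
  stream_options: { include_usage: true },
  messages: [],
});

console.log(requestsUsage(withoutUsage)); // false
console.log(requestsUsage(withUsage)); // true
```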

Expected behavior

Custom api: "openai-completions" streaming sessions should get stream_options.include_usage: true by default, without requiring every local/custom model to set compat.supportsUsageInStreaming.

That was my understanding of the intended behavior from #75357.

Suspected Cause

The OpenClaw transport fix appears to still add stream_options: { include_usage: true }.

However, embedded stream resolution can keep PI's native openai-completions stream function when it is already installed as the current session stream. Since that function is not literally streamSimple, OpenClaw appears to treat it as a custom session stream and does not swap in the boundary-aware OpenClaw transport.

So the fixed transport exists, but this route bypasses it.

Actual behavior

With no explicit compat override, proxy logs show streaming requests do not include:

{ "stream_options": { "include_usage": true } }

As a result:

  • the proxy sees no OpenAI-style usage block
  • llama.cpp still emits nonzero timings.prompt_n / timings.predicted_n
  • OpenClaw transcript assistant messages store zero usage
  • session_cost reports no tokens

Adding this model compat override makes token usage come back:

{ "compat": { "supportsUsageInStreaming": true } }

OpenClaw version

v2026.5.6

Operating system

Ubuntu 24.04

Install method

npm global

Model

qwen36-35b-a3b

Provider / routing chain

openclaw -> llama.cpp

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

usage data does not report tokens

Additional information

Possible Regression Test

Best target file: src/agents/pi-embedded-runner/stream-resolution.test.ts.

import type { StreamFn } from "@mariozechner/pi-agent-core";
import { getApiProvider } from "@mariozechner/pi-ai";
import { describe, expect, it, vi } from "vitest";
import * as providerTransportStream from "../provider-transport-stream.js";
import { resolveEmbeddedAgentStreamFn } from "./stream-resolution.js";

// Keep the real module, but wrap the factory in a spy so the test can swap its return value.
vi.mock("../provider-transport-stream.js", async (importOriginal) => {
  const actual = await importOriginal<typeof providerTransportStream>();
  return {
    ...actual,
    createBoundaryAwareStreamFnForModel: vi.fn(actual.createBoundaryAwareStreamFnForModel),
  };
});

describe("resolveEmbeddedAgentStreamFn OpenAI-compatible fallback", () => {
  it("routes PI native openai-completions streams through OpenClaw boundary-aware transport", async () => {
    // The provider-registered stream, which is not identity-equal to the global streamSimple.
    const nativeStreamFn = getApiProvider("openai-completions")?.streamSimple;
    expect(nativeStreamFn).toBeDefined();

    // Echo the options back so the forwarded apiKey can be asserted.
    const innerStreamFn = vi.fn(async (_model, _context, options) => options);
    vi.mocked(providerTransportStream.createBoundaryAwareStreamFnForModel).mockReturnValueOnce(
      innerStreamFn as never,
    );

    const streamFn = resolveEmbeddedAgentStreamFn({
      currentStreamFn: nativeStreamFn as StreamFn,
      shouldUseWebSocketTransport: false,
      sessionId: "session-1",
      model: {
        api: "openai-completions",
        provider: "llama",
        id: "qwen36-35b-a3b",
      } as never,
      resolvedApiKey: "local-token",
    });

    // The native stream must have been replaced, not returned as-is.
    expect(streamFn).not.toBe(nativeStreamFn);

    await expect(
      streamFn({ provider: "llama", id: "qwen36-35b-a3b" } as never, {} as never, {}),
    ).resolves.toMatchObject({ apiKey: "local-token" });

    expect(innerStreamFn).toHaveBeenCalledTimes(1);
  });
});

Expected failure before the fix: streamFn is the same as nativeStreamFn, because PI’s native openai-completions stream is treated as a custom session stream instead of a default fallback.
