openclaw - ✅(Solved) Fix [Bug]: Webchat UI shows nothing when Anthropic provider returns billing error — gateway decides surface_error but client never renders it [2 pull requests, 1 comments, 2 participants]

GitHub stats
openclaw/openclaw#70124 · Fetched 2026-04-23 07:29:00
Comments: 1 · Participants: 2 · Timeline events: 3 (labeled ×2, commented ×1) · Reactions: 0

When the Anthropic provider rejects a request with invalid_request_error ("Your credit balance is too low to access the Anthropic API"), the gateway correctly detects this and makes the decision surface_error with reason=billing. However, the webchat control-ui client shows no error, no message, no indication of failure — the UI just sits as if the request is still in flight.

This is the opposite failure mode of #13935 (where a billing error is shown inappropriately) and a UI-specific variant of #24622 (gateway hang on billing error).

Error Message

2026-04-22T09:41:29.171+00:00 [agent/embedded] embedded run agent end: runId=010f1588-6c76-4750-a215-bee3dbfb8ad9 isError=true model=claude-haiku-4-5-20251001 provider=anthropic error=LLM request rejected: Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits. rawError=400 {"type":"error","error":{"type":"invalid_request_error", "message":"Your credit balance is too low to access the Anthropic API. ..."}}

PR fix notes

PR #70848: fix(runner): throw FailoverError on assistant surface_error so webchat renders provider failures

Description (problem / solution / changelog)

Summary

  • Problem: handleAssistantFailover resolved surface_error, logged the decision, then fell through to continue_normal. buildEmbeddedRunPayloads saw the partial assistant output and dropped the provider error silently, so billing/auth/rate_limit failures never reached the webchat.
  • Why it matters: The reporter on #70124 hit this with Anthropic invalid_request_error (billing). Gateway logs show decision=surface_error reason=billing, but the UI renders nothing. The same path applies to every surface_error that isn't an external abort.
  • What changed: On non-externalAbort surface_error, throw a FailoverError with the resolved reason/provider/model/profileId/status, matching the existing fallback_model behavior. The catch in run.ts already wraps thrown FailoverError into the payload the webchat renders, so the one-site change reuses the whole client surface.
  • What did NOT change (scope boundary): external-abort short-circuit (user-pressed-stop still carries partial output via continue_normal), same-model-idle-timeout retry path, and the rotate_profile / fallback_model branches are untouched.
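
The behavior change summarized above can be sketched roughly as follows. This is an illustration under stated assumptions, not the OpenClaw source: the `SurfaceErrorParams` shape, the `FailoverError` constructor signature, and the helper name `resolveSurfaceErrorOutcome` are hypothetical. Only `FailoverError`, `surface_error`, `continue_normal`, `externalAbort`, and the `action: "throw"` outcome are taken from the PR text.

```typescript
// Hypothetical, simplified shapes; the real OpenClaw types may differ.
type FailoverReason = "billing" | "auth" | "rate_limit" | "timeout" | "unknown";

class FailoverError extends Error {
  reason: FailoverReason;
  provider: string;
  model: string;
  status: number | undefined;

  constructor(
    message: string,
    details: { reason: FailoverReason; provider: string; model: string; status?: number },
  ) {
    super(message);
    // Keeps instanceof checks working when compiled for older targets.
    Object.setPrototypeOf(this, FailoverError.prototype);
    this.name = "FailoverError";
    this.reason = details.reason;
    this.provider = details.provider;
    this.model = details.model;
    this.status = details.status;
  }
}

interface SurfaceErrorParams {
  reason: FailoverReason;
  provider: string;
  model: string;
  status?: number;
  errorMessage: string;
  externalAbort: boolean;
}

type FailoverOutcome =
  | { action: "continue_normal" }
  | { action: "throw"; error: FailoverError };

// Sketch of the surface_error branch after PR #70848 (hypothetical helper name).
function resolveSurfaceErrorOutcome(params: SurfaceErrorParams): FailoverOutcome {
  // A user-pressed stop keeps its partial output and is not turned into an error.
  if (params.externalAbort) {
    return { action: "continue_normal" };
  }
  // Before the patch this path also ended in continue_normal, which dropped the error.
  return {
    action: "throw",
    error: new FailoverError(params.errorMessage, {
      reason: params.reason,
      provider: params.provider,
      model: params.model,
      status: params.status,
    }),
  };
}
```

Returning the error as part of the outcome, rather than throwing inside the helper, mirrors the test description in this PR, which asserts on `action: "throw"` together with the `FailoverError` fields.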

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #70124
  • Related #
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: The surface_error branch in handleAssistantFailover called logAssistantFailoverDecision("surface_error") and then fell through to return { action: "continue_normal", ... }. Control returned to buildEmbeddedRunPayloads, which treated the partial assistant message as a completed turn and dropped the provider error.
  • Missing detection / guardrail: No unit test asserted handleAssistantFailover throws a FailoverError on surface_error. failover-policy.test.ts covered decision resolution but not post-decision outcome side effects.
  • Contributing context (if known): The fallback_model branch already throws FailoverError for the gateway catch to lift into the webchat payload. The two branches were asymmetric; surface_error was the only failure decision that reached continue_normal.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/run/assistant-failover.test.ts (new).
  • Scenario the test should lock in: each failover reason (billing/auth/rate_limit) throws a FailoverError with the right status; null decision reasons coerce onto the most specific observed failure (timedOut -> timeout/408); externalAbort still falls through to continue_normal; idle-timeout same-model retry preserved; fallback_model regression guard after the message-builder refactor.
  • Why this is the smallest reliable guardrail: handleAssistantFailover is pure given its params. Unit-level tests directly assert the outcome action and FailoverError fields; no need for a heavier runner harness.
  • Existing test that already covers this (if any): failover-policy.test.ts covers decision resolution but not outcome side effects.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Billing, auth, rate_limit, and timeout failures from the assistant turn now surface to the webchat as concrete provider error messages (matching what fallback_model already produced) instead of an empty completion. External aborts (user pressed stop) continue to fall through without synthesizing a provider error.

Diagram (if applicable)

Before:
[surface_error billing] -> [log decision] -> [continue_normal]
                        -> [buildEmbeddedRunPayloads sees partial msg] -> [empty UI]

After:
[surface_error billing] -> [log decision] -> [throw FailoverError(reason=billing, status=402)]
                        -> [run.ts catch lifts into payload] -> [webchat renders provider error]
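
The "After" path relies on a catch that turns a thrown FailoverError into a payload the client can render. A minimal sketch of that idea is below. The `RunPayload` shape, the `runAttempt` helper, and the `FailoverError` fields are assumptions made for illustration; the PR only states that the catch in run.ts wraps a thrown FailoverError into the payload the webchat renders.

```typescript
// Hypothetical error class and payload shape, for illustration only.
class FailoverError extends Error {
  reason: string;
  status: number | undefined;

  constructor(message: string, reason: string, status?: number) {
    super(message);
    Object.setPrototypeOf(this, FailoverError.prototype);
    this.name = "FailoverError";
    this.reason = reason;
    this.status = status;
  }
}

type RunPayload =
  | { kind: "text"; text: string }
  | { kind: "error"; text: string; reason: string; status: number | undefined };

// Runs one attempt and lifts a FailoverError into an error payload.
function runAttempt(attempt: () => string): RunPayload {
  try {
    return { kind: "text", text: attempt() };
  } catch (err) {
    if (err instanceof FailoverError) {
      return { kind: "error", text: err.message, reason: err.reason, status: err.status };
    }
    // Anything that is not a FailoverError is rethrown rather than swallowed.
    throw err;
  }
}
```

The point of the design is that the client needs only one error-rendering path: any branch that throws a FailoverError reaches the UI the same way.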

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: Linux (container)
  • Runtime/container: Node 22 / pnpm 9
  • Model/provider: Anthropic claude-haiku-4-5-20251001 (billing error path)
  • Integration/channel (if any): pi-embedded-runner / webchat
  • Relevant config (redacted): failover path exercised via an assistant turn classified as billing with failoverFailure=true, fallbackConfigured=false.

Steps

  1. Start a session against a provider that returns a billing-style invalid_request_error on the assistant turn (or stub the assistant message to set billingFailure=true with failoverReason=billing).
  2. Watch the gateway log: it records decision=surface_error reason=billing.
  3. Observe the webchat.

Expected

  • Webchat renders the provider's billing error via the shared FailoverError payload path.

Actual

  • Before this PR: webchat renders nothing; gateway log shows the decision but the error never propagates.
  • After this PR: webchat renders the formatted billing error message; the catch in run.ts lifts the FailoverError into the payload the client already knows how to render.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

New assistant-failover.test.ts fails against the pre-patch file (7/7 tests expect action: "throw" with a concrete FailoverError; pre-patch returns action: "continue_normal"). Against the patched file, all 7 pass.

Human Verification (required)

  • Verified scenarios: focused pnpm test on the new assistant-failover.test.ts (7/7 pass); pnpm check:changed (239 tests green); pnpm lint:core (0 warnings / 0 errors across 7127 files); pnpm tsgo:core and pnpm tsgo:test:src (both clean); unit-fast run on the adjacent payloads.errors, payloads, failover-error, failover-policy tests (106 tests / 4 files, all passing).
  • Edge cases checked: null decision reason with timedOut=true coerces to timeout/408; null reason with no signal coerces to unknown; external abort still returns continue_normal; idleTimedOut + allowSameModelIdleTimeoutRetry returns retry with retryKind: "same_model_idle_timeout"; fallback_model branch still throws the same FailoverError with the shared message builder.
  • What you did not verify: live-provider integration probe against a real Anthropic billing error.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: A legitimate surface_error path (outside externalAbort) previously relied on the fall-through to continue_normal and would now throw instead.
    • Mitigation: The only documented caller of continue_normal after a surface_error decision was the external-abort case, which is still preserved explicitly. Every other surface_error decision represented a concrete provider failure that continue_normal was silently dropping.
  • Risk: Null decision reasons now coerce to a concrete FailoverReason, which changes the reported reason on the outgoing FailoverError.
    • Mitigation: The coercion order (timeout -> billing -> auth -> rate_limit -> unknown) matches the same precedence used elsewhere in the failover path; if nothing matches, "unknown" is emitted instead of crashing on null.
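
The coercion order named in the mitigation can be sketched as a simple precedence chain. The `FailureSignals` field names and both function names are hypothetical. The two status codes (408 for timeout, 402 for billing) are the only ones this PR mentions, so the sketch leaves the others undefined rather than guessing.

```typescript
type FailoverReason = "timeout" | "billing" | "auth" | "rate_limit" | "unknown";

// Hypothetical signal flags observed on the failed attempt.
interface FailureSignals {
  timedOut: boolean;
  billingFailure: boolean;
  authFailure: boolean;
  rateLimitFailure: boolean;
}

// A non-null decision reason wins; otherwise fall back in the documented order:
// timeout -> billing -> auth -> rate_limit -> unknown.
function coerceFailoverReason(
  decisionReason: FailoverReason | null,
  signals: FailureSignals,
): FailoverReason {
  if (decisionReason !== null) return decisionReason;
  if (signals.timedOut) return "timeout";
  if (signals.billingFailure) return "billing";
  if (signals.authFailure) return "auth";
  if (signals.rateLimitFailure) return "rate_limit";
  return "unknown";
}

// Only the statuses stated in the PR text are mapped here.
function statusForReason(reason: FailoverReason): number | undefined {
  switch (reason) {
    case "timeout":
      return 408;
    case "billing":
      return 402;
    default:
      return undefined;
  }
}
```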

AI-assisted: yes

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/agents/pi-embedded-runner/run/assistant-failover.test.ts (added, +232/-0)
  • src/agents/pi-embedded-runner/run/assistant-failover.ts (modified, +100/-22)

PR #70900: fix(runner): gate surface_error throw on failoverFailure

Description (problem / solution / changelog)

Summary

  • Problem: The surface_error throw path added in #70848 guarded only on !externalAbort && !timedOut. classifyFailoverReason in run.ts runs as a pure string classifier without a stopReason === "error" gate, so a successful turn whose lastAssistant carries a stale classified errorMessage (e.g. from an earlier internal retry) can drive shouldRotateAssistant through its failoverReason !== null branch, exhaust profile rotation, and land in the throw path with no concrete provider failure on this attempt. The throw converts a successful turn into a hard error for the client.
  • Why it matters: Converting a successful turn into a hard error is the mirror image of the bug reported in #70124, where a real provider error was silently turned into an empty completion. Flagged by the codex connector on #70848 as P1.
  • What changed: Add && params.failoverFailure to the throw guard. failoverFailure is gated on stopReason === "error" by the isXAssistantError helpers in errors.ts, so it's the signal that a genuine provider failure occurred on this attempt. Update the neighboring comment to name the third fall-through case.
  • What did NOT change (scope boundary): #70124 still lands; Anthropic billing errors set stopReason === "error", which makes both billingFailure and failoverFailure true. The external-abort, timeout, idle-retry, rotate_profile, and fallback_model branches are untouched.
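
The tightened guard amounts to a three-way conjunction. The sketch below uses a hypothetical input interface and function name; the three flags and the two outcome actions are the ones named in this PR.

```typescript
// Hypothetical input shape; only the three flag names come from the PR text.
interface SurfaceErrorGuardInput {
  externalAbort: boolean;
  timedOut: boolean;
  failoverFailure: boolean;
}

type GuardedAction = "throw" | "continue_normal";

// After PR #70900: throw only when a genuine provider failure occurred on this attempt.
function resolveSurfaceErrorAction(input: SurfaceErrorGuardInput): GuardedAction {
  const shouldThrow = !input.externalAbort && !input.timedOut && input.failoverFailure;
  return shouldThrow ? "throw" : "continue_normal";
}
```

With this guard, a surface_error decision that rests only on a classified reason string, with failoverFailure false, falls through to continue_normal.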

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #
  • Related #70848
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: shouldRotateAssistant in failover-policy.ts fires on (!aborted && (failoverFailure || failoverReason !== null)). The second half of that disjunction reads non-null output from classifyFailoverReason(assistantForFailover?.errorMessage ?? "", ...) at run.ts:1476-1481, which is a pure string classifier (errors.ts:1253) with no stopReason filter. So a successful reply whose errorMessage happens to match billing/auth/rate_limit/etc can drive rotation. If rotation fails and no fallback is configured, resolveRunFailoverDecision returns { action: "surface_error", reason: <classified> } and control lands in the throw path added in #70848.
  • Missing detection / guardrail: No test exercised the failoverFailure=false, failoverReason=<classified> shape against the post-#70848 throw path, so the gap landed without tripping the suite.
  • Contributing context (if known): The isBillingAssistantError, isAuthAssistantError, isRateLimitAssistantError, and isFailoverAssistantError helpers in errors.ts (lines 1105, 1129, 1222, 1269) all open with if (!msg || msg.stopReason !== "error") return false;. failoverFailure in run.ts is derived from these, so it's pre-filtered. failoverReason from classifyFailoverReason is not.
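
The asymmetry between the two signals can be shown with a small sketch. The function names mirror those in the PR, but the bodies are assumptions: the match patterns in `classifyFailoverReason` are invented for illustration, and the signatures are simplified. The `stopReason` gate and the `shouldRotateAssistant` condition are quoted from the root-cause text above.

```typescript
// Hypothetical minimal message shape.
interface AssistantMessage {
  stopReason: string;
  errorMessage?: string;
}

// Pure string classifier: it never looks at stopReason (patterns are illustrative).
function classifyFailoverReason(errorMessage: string): "billing" | "rate_limit" | null {
  const text = errorMessage.toLowerCase();
  if (text.includes("credit balance")) return "billing";
  if (text.includes("rate limit")) return "rate_limit";
  return null;
}

// Gated helper: only counts messages whose turn actually ended in an error.
function isFailoverAssistantError(msg: AssistantMessage | undefined): boolean {
  if (!msg || msg.stopReason !== "error") return false;
  return classifyFailoverReason(msg.errorMessage ?? "") !== null;
}

// Rotation condition as quoted in the PR.
function shouldRotateAssistant(
  aborted: boolean,
  failoverFailure: boolean,
  failoverReason: string | null,
): boolean {
  return !aborted && (failoverFailure || failoverReason !== null);
}
```

For a message such as `{ stopReason: "toolUse", errorMessage: "rate limit exceeded" }`, the gated helper reports no failure while the classifier still returns a reason, so rotation fires through the second half of the disjunction.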

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/agents/pi-embedded-runner/run/assistant-failover.test.ts.
  • Scenario the test should lock in: failoverFailure=false with a non-null failoverReason and matching initialDecision: { action: "surface_error", reason: "billing" } must return continue_normal, not throw.
  • Why this is the smallest reliable guardrail: The handler is pure given its params. The test sets the exact shape that would have fired the bug and asserts the outcome action.
  • Existing test that already covers this (if any): None; the post-#70848 suite covered failoverFailure=true and failoverFailure=false with reason=null, not failoverFailure=false with reason=<classified>.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Successful assistant turns whose lastAssistant still carries a stale classified errorMessage are no longer converted into hard errors for the client. The #70124 repro (Anthropic billing with stopReason="error") still surfaces the provider error to the webchat unchanged.

Diagram (if applicable)

Before:
[successful turn, stale errorMessage matches "billing"] -> [classifyFailoverReason="billing"]
  -> [shouldRotateAssistant=true via reason!=null] -> [rotate exhausted, no fallback]
  -> [surface_error throw] -> [webchat shows hard error on a successful turn]

After:
[same shape] -> [surface_error decision] -> [!failoverFailure guard] -> [continue_normal]
             -> [partial assistant output carries the turn]

Security Impact (required)

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

Repro + Verification

Environment

  • OS: Linux (container)
  • Runtime/container: Node 22 / pnpm
  • Model/provider: any provider with a retry-classified errorMessage on a successful turn
  • Integration/channel (if any): pi-embedded-runner / webchat
  • Relevant config (redacted): failover path with failoverFailure=false but classified failoverReason, rotation exhausted, fallback unconfigured

Steps

  1. Construct a turn where lastAssistant.errorMessage matches one of the classifier patterns (e.g. "rate limit") but stopReason is not "error" (run.incomplete-turn.test.ts:53-59 demonstrates this shape with stopReason: "toolUse").
  2. Ensure shouldRotateAssistant fires through the failoverReason !== null branch and advanceAuthProfile() returns false.
  3. Resolve decision to surface_error.

Expected

  • Outcome action is continue_normal; the successful assistant output carries the turn.

Actual

  • Before: throw FailoverError with reason derived from the stale errorMessage; client renders a hard error on a successful turn.
  • After: returns continue_normal; the stale classifier output is ignored because failoverFailure is false.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

The new test "leaves successful turns with a stale classified errorMessage on the continue_normal path" fails against main (which throws a billing FailoverError) and passes against this branch (which returns continue_normal).

Human Verification (required)

  • Verified scenarios: pnpm vitest run src/agents/pi-embedded-runner/run/assistant-failover.test.ts (10/10 passing on this branch); regression check on run.timeout-triggered-compaction.test.ts + run.overflow-compaction.loop.test.ts (31/31 passing); oxlint clean on both changed files.
  • Edge cases checked: failoverFailure=true with billing still throws (the #70124 path); externalAbort=true still returns continue_normal; timedOut=true still returns continue_normal for the runner's timeout-payload synthesis; failoverFailure=false with reason=null still returns continue_normal; failoverFailure=false with reason="billing" (this PR's new case) now returns continue_normal instead of throwing.
  • What you did not verify: live-provider integration probe reproducing the stale-classification shape.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Risks and Mitigations

  • Risk: A surface_error decision that only sets failoverReason (not failoverFailure) and was previously surfacing to the client as an error message is now suppressed.
    • Mitigation: That shape by definition has stopReason !== "error" upstream (since failoverFailure derives from isXAssistantError helpers which all gate on that). The payloads.ts error-payload synthesizer is also gated on stopReason === "error", so any surface_error without failoverFailure already had no error payload synthesized downstream. Net effect: the runtime behavior matches the synthesizer's existing discrimination.

AI-assisted: yes

Changed files

  • src/agents/pi-embedded-runner/run/assistant-failover.test.ts (modified, +24/-0)
  • src/agents/pi-embedded-runner/run/assistant-failover.ts (modified, +26/-10)

Original issue

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Environment

  • OpenClaw: ghcr.io/openclaw/openclaw:latest (Docker)
  • Deployment: docker compose, gateway container openclaw-openclaw-gateway-1
  • Webchat client: openclaw-control-ui webchat v2026.4.21
  • Host: Ubuntu 20.04
  • Provider: anthropic (API key, credit-depleted account for repro)
  • Model tested: anthropic/claude-haiku-4-5-20251001 (also reproducible with anthropic/claude-opus-4-7)

Steps to reproduce

  1. Configure an Anthropic provider with an API key whose credit balance is zero.
  2. Open the webchat UI, send any message.
  3. Observe: no response, no error, no indication of failure in the UI.

Expected behavior

The webchat UI should render the provider error (or at minimum a generic "provider rejected request — check gateway logs" message) so the operator knows to check billing. The gateway's own log line decision=surface_error reason=billing suggests the backend intended to surface this to the client.

Actual behavior

UI shows nothing. Gateway log contains the full error but it never reaches the webchat client.

OpenClaw version

2026.4.21

Operating system

Ubuntu 20.04 LTS (host); gateway runs inside Docker container

Install method

Docker Compose; image ghcr.io/openclaw/openclaw:latest pulled from GHCR. Config at /data/backup/docker-work/openclaw/ on host, mounted to /home/node/.openclaw/openclaw.json in container.

Model

claude47 / anthropic/claude-haiku-4-5-20251001 (also reproduced with anthropic/claude-opus-4-7)

Provider / routing chain

webchat (openclaw-control-ui webchat v2026.4.21) → gateway container (openclaw-openclaw-gateway-1) on host port 18789 → [agent/embedded] runtime → anthropic provider (direct, no intermediate proxy/router) → api.anthropic.com

Additional provider/model setup details

  • Anthropic provider configured directly with API key (no OpenRouter, no LiteLLM, no other router in path).
  • Auth profile: sha256:154a23a3efe6 (from gateway log).
  • Reasoning: off (agents.defaults.thinkingDefault=off).
  • No fallback models configured for the agent — the run hits exactly one provider and surfaces the error rather than failing over.
  • Repro account: Anthropic API key with zero remaining credit balance, which causes the API to return HTTP 400 with type=invalid_request_error, message="Your credit balance is too low..."
  • Model switching was done via a helper script (oc-ai-switch.sh preset "haiku45") which edits openclaw.json and restarts the gateway container; the bug reproduces identically regardless of which model string is active, since the failure is at the provider-auth layer before model dispatch.

Logs, screenshots, and evidence

## Gateway log — relevant excerpt

Host: `docker logs openclaw-openclaw-gateway-1`, filtered to the failing run.
The API key intentionally has a zero credit balance (account under test) to reproduce the issue.

2026-04-22T09:18:20.212+00:00 [gateway] agent model: anthropic/claude-haiku-4-5-20251001
2026-04-22T09:18:20.213+00:00 [gateway] ready (5 plugins: acpx, browser, device-pair, phone-control, talk-voice; 4.5s)

2026-04-22T09:41:29.171+00:00 [agent/embedded] embedded run agent end: runId=010f1588-6c76-4750-a215-bee3dbfb8ad9 isError=true model=claude-haiku-4-5-20251001 provider=anthropic error=LLM request rejected: Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits. rawError=400 {"type":"error","error":{"type":"invalid_request_error", "message":"Your credit balance is too low to access the Anthropic API. ..."}}

2026-04-22T09:41:34.326+00:00 [agent/embedded] auth profile failure state updated: runId=010f1588-6c76-4750-a215-bee3dbfb8ad9 profile=sha256:154a23a3efe6 provider=anthropic reason=billing window=disabled reused=false

2026-04-22T09:41:34.334+00:00 [agent/embedded] embedded run failover decision: runId=010f1588-6c76-4750-a215-bee3dbfb8ad9 stage=assistant decision=surface_error reason=billing from=anthropic/claude-haiku-4-5-20251001 profile=sha256:154a23a3efe6

2026-04-22T09:41:34.885+00:00 [ws] webchat disconnected code=1001 reason=n/a conn=233db546-38c7-4ba4-9c74-c1c618571942

2026-04-22T09:41:35.266+00:00 [ws] webchat connected conn=c2430169-1c5a-4061-9f0e-9f5d9ef8a52a remote=172.18.0.1 client=openclaw-control-ui webchat v2026.4.21


Reproduced twice in the same session (runIds `010f1588-…` and `895220f9-…`)
with identical `decision=surface_error reason=billing` outcome. In both
cases the webchat UI showed nothing — no error banner, no message,
no indication that the request had failed.

## Image / build evidence

$ docker inspect ghcr.io/openclaw/openclaw:latest --format '{{index .Config.Labels "org.opencontainers.image.version"}}'
2026.4.21

$ docker inspect ghcr.io/openclaw/openclaw:latest --format '{{index .Config.Labels "org.opencontainers.image.revision"}}'
f788c88b4c508c335336fb292afed8c900656d6d

$ docker inspect ghcr.io/openclaw/openclaw:latest --format '{{index .RepoDigests 0}}'
ghcr.io/openclaw/openclaw@sha256:01325e7d5c0e273d9f3495fd39b8b811c28b120be4dbe5acee0e12a433e3d2a2


Image created: `2026-04-22T02:30:45Z`
Base: `docker.io/library/node:24-bookworm`
  (digest `sha256:3a09aa6354567619221ef6c45a5051b671f953f0a1924d1f819ffb236e520e6b`)

## Container health at time of repro

$ docker ps --filter name=openclaw-openclaw-gateway
CONTAINER ID   IMAGE                              STATUS                    PORTS
f5886a21f89f   ghcr.io/openclaw/openclaw:latest   Up 29 minutes (healthy)   0.0.0.0:18789-18790->18789-18790/tcp


Container reports healthy throughout — this is not a crash or startup
problem, it is a client-surfacing gap.

## Config at time of repro

$ docker exec openclaw-openclaw-gateway-1 cat /home/node/.openclaw/openclaw.json | grep -E '"model"|thinkingDefault'
"model": {
"thinkingDefault": "off"


## Screenshots

No screenshot attached — there is nothing visible to capture. The webchat
message input accepts the prompt, the UI shows no spinner, no error, no
response. From the operator's perspective the UI is indistinguishable
from an idle state after sending a message. Happy to record a short screen
capture on request.

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

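The webchat UI displays nothing when the Anthropic provider rejects a request with invalid_request_error (for example, a depleted credit balance), even though the gateway logs decision=surface_error reason=billing. According to PR #70848, the error was dropped in the gateway's embedded runner, where handleAssistantFailover fell through to continue_normal, and not in the webchat client.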

Guidance

  • Verify that the gateway log contains the provider error and a failover decision line with decision=surface_error reason=billing, as in the log excerpt above.
  • Check whether the running build includes PR #70848. Without it, handleAssistantFailover logs the surface_error decision and then returns continue_normal, so no error payload is produced for the client.
  • If the error is still missing on a patched build, inspect the WebSocket path between gateway and webchat. The log excerpt shows the client disconnecting (code=1001) and reconnecting within a second of the failover decision.
  • Consider adding logging around the payload handed to the webchat to confirm where the error message is lost.
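
As a quick aid for the first check, the failover decision line can be parsed as whitespace-separated key=value pairs. This is a standalone sketch with hypothetical helper names. It assumes the log format shown in the excerpt on this page and is not part of OpenClaw.

```typescript
// Extracts key=value pairs from a gateway log line (format assumed from the excerpt).
function parseLogFields(line: string): Record<string, string> {
  const fields: Record<string, string> = {};
  const pattern = /(\w+)=(\S+)/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(line)) !== null) {
    fields[match[1]] = match[2];
  }
  return fields;
}

// True when the line records a surface_error decision for a billing failure.
function isSurfacedBillingDecision(line: string): boolean {
  const fields = parseLogFields(line);
  return fields["decision"] === "surface_error" && fields["reason"] === "billing";
}
```

Feeding it the `embedded run failover decision` line from the excerpt yields fields such as `runId`, `stage`, `decision`, `reason`, `from`, and `profile`.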

Example

The code changes are confined to the runner: both PRs list src/agents/pi-embedded-runner/run/assistant-failover.ts and its test file as the changed source files, and neither touches webchat UI code.

Notes

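Although the symptom appears in the webchat UI, PR #70848 locates the root cause in the gateway's embedded runner: the surface_error branch in handleAssistantFailover fell through to continue_normal, and buildEmbeddedRunPayloads then dropped the provider error. PR #70900 narrows the new throw to turns where failoverFailure is true. Neither PR was verified against a live Anthropic billing error.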

Recommendation

Prefer a build that includes PR #70848, together with PR #70900, which gates the new throw on failoverFailure, over patching the webchat UI. On builds without the fix, billing, auth, and rate_limit failures are visible only in the gateway log.
