openclaw - 💡(How to fix) Fix [Bug]: slug-generator HTTP 400 misclassified as profile-wide billing failure (5h cooldown), kills all agents on profile [1 participants]

GitHub stats
openclaw/openclaw#71709 · Fetched 2026-04-26 05:09:35
Comments: 0 · Participants: 1 · Timeline: 3 · Reactions: 0
Timeline (top): cross-referenced ×1, mentioned ×1, subscribed ×1

A transient HTTP 400 returned to the internal generateSlugViaLLM helper is misclassified as a profile-wide billing failure: OpenClaw marks the entire anthropic:claude-cli auth profile as disabled:billing with a 5-hour cooldown. Every other agent that uses the same profile then silently fails with Provider claude-cli has billing issue (skipping all models) despite the underlying subscription being healthy.

Error Message

  • Lane log: lane task error: lane=session:temp:slug-generator ... error="FailoverError: LLM request rejected: You're out of extra usage..."


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

A transient HTTP 400 returned to the internal generateSlugViaLLM helper is misclassified as a profile-wide billing failure: OpenClaw marks the entire anthropic:claude-cli auth profile as disabled:billing with a 5-hour cooldown. Every other agent that uses the same profile then silently fails with Provider claude-cli has billing issue (skipping all models) despite the underlying subscription being healthy.

Steps to reproduce

  1. Configure anthropic:claude-cli as the primary auth profile (Max subscription, healthy).
  2. Allow OpenClaw default hooks.internal.entries.session-memory.llmSlug: true to remain on.
  3. Trigger any session that ends up in the slug-generator path; let the underlying claude invocation return any 400 (we observed it as LLM request rejected: You're out of extra usage).
  4. Observe ~/.openclaw/agents/main/agent/auth-state.json: usageStats["anthropic:claude-cli"] gains disabledReason: "billing", with disabledUntil set 5 hours ahead.
  5. Send any prompt to a Cognitor agent on the same profile (Telegram, CLI, anything). It immediately fails with the cooldown message — no upstream Anthropic call is made.

Expected behavior

A failure inside the slug-generation lane (a low-priority, internal helper) should be lane-local:

  • It should not flip the auth profile to disabled:billing for any non-billing-coded HTTP error.
  • The classifier should require a real billing signal (HTTP 402, explicit billing/quota body) before marking the profile down — same direction as #56053 (the 402 fallback fix). HTTP 400 with arbitrary text is not a billing signal.
  • At minimum, an internal-helper failure should be scoped to the helper, not propagate to the profile-wide failover state used by user-facing agents.
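
The classifier rule described above can be sketched as follows. This is a minimal illustration, not OpenClaw's actual classifier, which the issue does not show. The function name is_billing_signal and the marker values in BILLING_ERROR_TYPES are assumptions made for the example.

```python
from typing import Optional

# Assumed marker values for an "explicit billing/quota body"; the real
# markers used by OpenClaw are not given in the issue.
BILLING_ERROR_TYPES = {"billing_error", "insufficient_quota"}


def is_billing_signal(status: int, error_type: Optional[str] = None) -> bool:
    """Return True only when the response carries a real billing signal.

    HTTP 402 always counts. Any other 4xx counts only if the parsed error
    body names an explicit billing/quota error type. Free-form message text
    is deliberately ignored.
    """
    if status == 402:
        return True
    if 400 <= status < 500 and error_type in BILLING_ERROR_TYPES:
        return True
    return False


# The incident: HTTP 400 with free-form text and no billing-coded body.
print(is_billing_signal(400))                   # False
print(is_billing_signal(402))                   # True
print(is_billing_signal(400, "billing_error"))  # True
```

Under this rule the observed "LLM request rejected: You're out of extra usage..." response would not disable the profile, because neither the status code nor a structured body field marks it as billing-related.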

Actual behavior

  • Real incident 2026-04-14 22:19–22:37 UTC.
  • Lane log: lane task error: lane=session:temp:slug-generator ... error="FailoverError: LLM request rejected: You're out of extra usage..."
  • Auth-state log: event=auth_profile_failure_state_updated provider=claude-cli reason=billing disabledReason=billing disabledUntil=2026-04-15T03:19:..Z
  • Subsequent traffic: reason=billing errorHash=sha256:52c819aaad70 "Provider claude-cli has billing issue (skipping all models)" — same errorHash on every request for the next 5 hours.
  • Direct claude -p --model claude-opus-4-6 "ping" returned a normal response throughout the cooldown — confirming Anthropic itself is fine.
  • User-visible symptom: every Cognitor agent (Telegram bot, CLI sessions) returned ⚠️ Something went wrong… with no diagnostic for the operator.
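
Because the operator gets no diagnostic, a read-only check of auth-state.json is one way to confirm that a profile is in cooldown. The sketch below assumes only the fields named in this report (usageStats, disabledReason, disabledUntil); the helper name disabled_profiles is hypothetical, and the value format of disabledUntil is not specified in the issue, so it is returned as-is.

```python
import json
from pathlib import Path


def disabled_profiles(auth_state: dict) -> dict:
    """Map profile id -> (disabledReason, disabledUntil) for disabled profiles."""
    found = {}
    for profile, stats in auth_state.get("usageStats", {}).items():
        if not isinstance(stats, dict):
            continue
        if "disabledReason" in stats or "disabledUntil" in stats:
            found[profile] = (stats.get("disabledReason"), stats.get("disabledUntil"))
    return found


if __name__ == "__main__":
    # Path taken from the report; adjust the agent directory for your install.
    path = Path.home() / ".openclaw/agents/main/agent/auth-state.json"
    if path.exists():
        print(disabled_profiles(json.loads(path.read_text())))
```

A non-empty result for anthropic:claude-cli with reason "billing", while a direct claude call still succeeds, matches the incident described above.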

OpenClaw version

2026.4.23 (incident reproduced on the version current on 2026-04-14; behavior path still present in 2026.4.23 — the workaround flag is the only thing keeping it dormant)

Operating system

Ubuntu 24.04.4 LTS (kernel 6.8.0-107-generic)

Install method

npm global (/home/ubuntu/.npm-global/lib/node_modules/openclaw)

Model

claude-cli/claude-opus-4-7 (primary)

Provider / routing chain

openclaw -> auth-profile anthropic:claude-cli -> claude (CLI) -> anthropic.com

Additional provider/model setup details

Hot-fix workaround applied locally and recommended for any user with this profile shape:

"hooks": {
  "internal": {
    "entries": {
      "session-memory": {
        "enabled": true,
        "llmSlug": false
      }
    }
  }
}

This disables the slug-generator entirely (sessions then named by UUID/timestamp), removing the offending lane. Hot-reload picks it up: [reload] config hot reload applied (hooks.internal.entries.session-memory.llmSlug). Functional, but leaves the default behaviour exposed — every new install with llmSlug: true (the default) is one transient internal 400 away from a 5-hour silent outage.
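
For operators who script their configuration, the same change can be applied to an already-parsed config object. The issue does not state the config file's path or exact format, so this sketch works on an in-memory dict; the function name disable_llm_slug is hypothetical.

```python
def disable_llm_slug(config: dict) -> dict:
    """Set hooks.internal.entries.session-memory.llmSlug to False in place.

    Missing intermediate objects are created; other keys are left untouched.
    """
    entry = (
        config.setdefault("hooks", {})
        .setdefault("internal", {})
        .setdefault("entries", {})
        .setdefault("session-memory", {})
    )
    entry.setdefault("enabled", True)  # matches the workaround snippet above
    entry["llmSlug"] = False
    return config


cfg = disable_llm_slug({"hooks": {"internal": {"entries": {}}}})
print(cfg["hooks"]["internal"]["entries"]["session-memory"])
```

Loading and saving the file is left out on purpose, since the on-disk location and syntax are not given in the report.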

Manual recovery once the cooldown is set, until OpenClaw classifies this correctly:

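python3 <<'PY'
import json

# Path from the reporter's install; adjust for your user and agent.
p = '/home/ubuntu/.openclaw/agents/main/agent/auth-state.json'
with open(p) as f:
    d = json.load(f)

# Clear the cooldown markers on the affected profile and reset its error count.
cli = d['usageStats']['anthropic:claude-cli']
for k in ('disabledUntil', 'disabledReason', 'failureCounts', 'lastFailureAt'):
    cli.pop(k, None)
cli['errorCount'] = 0

with open(p, 'w') as f:
    json.dump(d, f, indent=2)
PY
# Restart the gateway after editing the state file.
systemctl --user restart openclaw-gateway.service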

Suggested direction

Mirror what #56053 / PR #56069 did for HTTP 402: tighten the failover classifier so that only billing-coded HTTP responses (402, or 4xx with an explicit billing/quota body marker) flip a profile to disabled:billing. For internal helper lanes (session:temp:slug-generator and similar), additionally scope the classifier so a helper-lane failure cannot escalate to a profile-wide cooldown — at most disable the helper.
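
The lane-scoping half of this suggestion can be sketched as below. It is an illustration under stated assumptions, not OpenClaw code: FailoverState, record_failure, and the rule that helper lanes are identified by the "session:temp:" prefix are all hypothetical, inferred only from the lane name session:temp:slug-generator in the logs.

```python
from dataclasses import dataclass, field

# Assumed convention: internal helper lanes share this prefix.
HELPER_LANE_PREFIXES = ("session:temp:",)


@dataclass
class FailoverState:
    disabled_profiles: dict = field(default_factory=dict)  # profile -> reason
    disabled_helpers: set = field(default_factory=set)     # helper lane names


def record_failure(state: FailoverState, lane: str, profile: str,
                   billing_signal: bool) -> str:
    """Record a failure without letting helper lanes touch profile state."""
    if lane.startswith(HELPER_LANE_PREFIXES):
        # At most the helper itself is disabled, whatever the error was.
        state.disabled_helpers.add(lane)
        return "helper_disabled"
    if billing_signal:
        state.disabled_profiles[profile] = "billing"
        return "profile_disabled"
    return "no_profile_change"


state = FailoverState()
print(record_failure(state, "session:temp:slug-generator",
                     "anthropic:claude-cli", billing_signal=True))
print(state.disabled_profiles)  # still empty: the profile stays usable
```

Keeping helper state separate from profile state means a slug-generator failure can never trigger the profile-wide cooldown, even if the classifier itself mislabels the error.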

Related (same family of misclassification): #56053, #14272, #25191, #33962.


Reported by @nikolaykazakovvs-ux via Cognitor (claude-opus-4-7 substrate).

extent analysis

TL;DR

The most likely fix is to update the failover classifier to only mark a profile as disabled:billing for billing-coded HTTP responses, such as 402 or 4xx with an explicit billing/quota body marker.

Guidance

  • Review the current failover classifier logic to identify why a 400 error is being misclassified as a billing failure.
  • Update the classifier to require a real billing signal, such as a 402 error or an explicit billing/quota body marker, before marking a profile as disabled:billing.
  • Consider scoping the classifier to prevent internal helper lane failures from escalating to a profile-wide cooldown.
  • Apply the hot-fix workaround by setting hooks.internal.entries.session-memory.llmSlug to false to disable the slug-generator and prevent the issue until a permanent fix is implemented.

Example

The issue body does not show the classifier logic, so no code from OpenClaw itself can be quoted here.

Notes

The incident occurred on the OpenClaw version current on 2026-04-14, and the reporter states that the same behavior path is still present in 2026.4.23. It was observed with the anthropic:claude-cli auth profile. The hot-fix workaround is functional but leaves the default behavior exposed, so a permanent classifier fix is still needed.

Recommendation

Apply the hot-fix workaround by setting hooks.internal.entries.session-memory.llmSlug to false until a permanent fix is implemented. This disables the slug-generator and removes the lane that triggered the misclassification; it does not change how the classifier treats 400 errors from other lanes.
