openclaw - ✅(Solved) Fix [Bug]: subagents.agentRuntime not inherited from agents.defaults — subagents bypass claude-cli runtime, hit Extra Usage bucket instead of Claude Max/Pro subscription, triggers 24h auth lockout [3 pull requests, 2 comments, 3 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#81395Fetched 2026-05-14 03:32:44
View on GitHub
Comments
2
Participants
3
Timeline
8
Reactions
1
Author
Timeline (top)
cross-referenced ×4commented ×2labeled ×1referenced ×1

When agentRuntime: claude-cli is set at agents.defaults, subagents silently ignore it, fall back to direct HTTP, and drain your Extra Usage quota instead of your Claude Max/Pro subscription — one failed call locks out the entire Anthropic provider for 24 hours.

Error Message

Subagents use the embedded HTTP runner regardless of agents.defaults.agentRuntime. The embedded runner calls the Anthropic API directly without CLI headers, routing to the Extra Usage bucket. One billing error sets disabledUntil in auth-state.json, which survives gateway restarts. All provider models are then skipped with "Provider anthropic has billing issue (skipping all models)" until the cooldown expires or the file is manually cleared.

  • Section 6 — first error at 00:07:04 UTC from subsystem: agent/embedded:

Root Cause

Affected usersAny Claude Max/Pro user running claude-cli OAuth with subagents — standard self-hosted setup
SeverityBlocks entire agent workflow — not just subagents, all interactive chat stops
FrequencyReproducible every time a subagent is spawned with Extra Usage disabled or exhausted; also triggered on gateway restart when a missed cron fires catch-up
ConsequenceUp to 24h outage requiring manual intervention; no warning before or during failure; subscription usage console shows nothing wrong because the subscription is never actually reached

Fix Action

Workaround

Manually clear usageStats fields in auth-state.json and restart:

python3 -c "
import json
path = '/home/<user>/.openclaw/agents/main/agent/auth-state.json'
with open(path) as f: d = json.load(f)
for p in d.get('usageStats', {}):
    for k in ['disabledUntil','failureCounts','errorCount','disabledReason','lastFailureAt']:
        d['usageStats'][p].pop(k, None)
with open(path, 'w') as f: json.dump(d, f, indent=2)
print('Cleared.')
"
openclaw gateway restart

Permanent fix: explicitly set agents.defaults.subagents.agentRuntime.id = "claude-cli" in openclaw.json.

PR fix notes

PR #81401: fix: inherit agents.defaults.agentRuntime in subagent runtime resolution [AI-assisted]

Description (problem / solution / changelog)

🤖 AI-assisted (built with Codex via Hermes orchestration). Test level: fully tested. Prompt summary available on request.

Summary

  • Problem: resolveModelRuntimePolicy never falls back to agents.defaults.agentRuntime, so subagents silently bypass the configured runtime (e.g. claude-cli) and use the embedded HTTP runner instead — routing API calls to the wrong billing bucket (Extra Usage instead of Max/Pro subscription). One billing failure sets a 24h lockout in auth-state.json that blocks all providers.
  • Why it matters: Any Claude Max/Pro user with subagents loses all agent functionality for up to 24 hours with no warning and no auto-recovery.
  • What changed: Added agents.defaults.agentRuntime as the final fallback in resolveModelRuntimePolicy(), after per-model and per-provider checks. Added unit tests covering the inheritance chain.
  • What did NOT change (scope boundary): No changes to the embedded HTTP runner, provider auth, billing lockout logic, or startup warnings. The fix is limited to the runtime resolution fallback chain.

Change Type (select all)

  • Bug fix

Scope (select all touched areas)

  • Agents

Linked Issue/PR

  • Closes #81395
  • This PR fixes a bug or regression

Root Cause

  • Root cause: resolveModelRuntimePolicy resolution chain checked agent model entries, provider config, and model config for agentRuntime, but never consulted agents.defaults.agentRuntime as a final fallback. The defaults-level runtime was only used for per-model entries under agents.defaults.models.*, not as a standalone cascade target.
  • Missing detection / guardrail: No unit test validated the full inheritance chain for agentRuntime resolution. No startup warning when agents.defaults.agentRuntime is set but subagents would not inherit it.
  • Contributing context (if known): The model-runtime-policy module was likely written before agents.defaults.agentRuntime became the primary configuration surface, and the fallback was never added.

Regression Test Plan

  • Coverage level that should have caught this:
    • Unit test
  • Target test or file: src/agents/model-runtime-policy.test.ts
  • Scenario the test should lock in: When agents.defaults.agentRuntime.id is set and no per-model or per-provider runtime is configured, resolveModelRuntimePolicy returns the defaults-level policy as fallback.
  • Why this is the smallest reliable guardrail: The resolution function is pure (config in, policy out) — a unit test directly validates the cascade without needing gateway or subprocess infrastructure.
  • Existing test that already covers this (if any): N/A — no prior test file existed for this module.
  • If no new test is added, why not: N/A — new test added.

User-visible / Behavior Changes

Subagents now correctly inherit agents.defaults.agentRuntime when no per-model or per-provider override is set. Users who set agents.defaults.agentRuntime.id = "claude-cli" no longer need to redundantly configure subagent runtime separately.

Security Impact (required)

  • New permissions/capabilities? No
  • This fix corrects runtime routing so that subagents use the intended authentication path (CLI OAuth) rather than falling back to direct API access. No new permissions or capabilities are introduced; the fix narrows behavior to match the user's configured intent.

Changed files

  • src/agents/model-runtime-policy.test.ts (added, +81/-0)
  • src/agents/model-runtime-policy.ts (modified, +4/-0)

PR #81406: fix(agents): inherit agentRuntime from agent entry and agents.defaults

Description (problem / solution / changelog)

Summary

Subagents were ignoring the agentRuntime configuration set at agents.defaults or per-agent level, falling back to direct HTTP instead of the configured runtime (e.g., claude-cli).

This caused subagents to bypass the CLI runtime, hit the Extra Usage quota instead of the Max/Pro subscription, and trigger 24h auth lockout on billing errors.

Changes

  • Added checks for agents.list[].agentRuntime and agents.defaults.agentRuntime in resolveModelRuntimePolicy() before the provider-level fallback
  • Updated type ModelRuntimePolicySource to include "agent" | "defaults" sources
  • Updated tests to verify the new behavior instead of verifying legacy ignore behavior

Resolution Chain

model-entry agentRuntime → agent-level agentRuntime → defaults agentRuntime → provider model config → provider config

Testing

All affected tests pass:

  • src/agents/harness/selection.test.ts — 38/38 ✅
  • src/agents/harness/ — 380/380 ✅
  • src/agents/subagent-spawn.test.ts — 44/44 ✅

Fix

Fixes #81395

Changed files

  • src/agents/harness/policy.ts (modified, +1/-1)
  • src/agents/harness/selection.test.ts (modified, +5/-4)
  • src/agents/model-runtime-policy.ts (modified, +19/-1)

PR #81421: fix(agents): inherit default agent runtime policy

Description (problem / solution / changelog)

Summary

  • Teach the agent runtime policy layer to inherit agents.defaults.agentRuntime after explicit provider/model settings.
  • Propagate the new defaults runtime source through harness policy/runtime metadata.
  • Add regression coverage for default runtime inheritance and update agent-list expectations so default runtime is honored while legacy per-agent runtime overrides stay ignored.

Context

agents.defaults.agentRuntime is part of the config surface, but the runtime policy path fell back to the implicit auto runtime when no explicit model/provider runtime was set. That meant subagent/harness execution could ignore a workspace-wide default runtime even though the config schema accepted it.

Test Plan

  • CI follow-up: adjusted doctor-session-state-providers route-state expectation so explicit agents.defaults.agentRuntime wins over legacy OPENCLAW_AGENT_RUNTIME env fallback; local focused suite: 4 files / 18 tests passed.
  • RED: kept the new regression test and temporarily removed the implementation changes; src/agents/harness/policy.test.ts failed with runtime: "auto" / runtimeSource: "implicit" instead of runtime: "claude-cli" / runtimeSource: "defaults".
  • GREEN: direct Vitest run for src/agents/harness/policy.test.ts and src/agents/tools/agents-list-tool.test.ts passed: 3 files, 11 tests.
  • Type check: node scripts/run-tsgo.mjs -p tsconfig.core.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core.tsbuildinfo
  • Type check: node scripts/run-tsgo.mjs -p test/tsconfig/tsconfig.core.test.json --incremental --tsBuildInfoFile .artifacts/tsgo-cache/core-test.tsbuildinfo
  • Change check: node scripts/check-changed.mjs --dry-run

Real behavior proof

  • Behavior or issue addressed: A subagent/harness runtime policy now inherits the workspace-level agents.defaults.agentRuntime when no provider/model runtime override is set, instead of falling back to implicit auto.

  • Real environment tested: Local OpenClaw checkout on macOS, branch fix/subagent-default-agent-runtime, commit b44208b61a2cc6106a6ccf8907cc24651b20f6bf, Node v23.11.0. The command invoked the real resolveAgentHarnessPolicy implementation from the patched source.

  • Exact steps or command run after this patch:

    PATH=/opt/homebrew/bin:/opt/homebrew/sbin:$PATH node_modules/.bin/tsx -e "import { resolveAgentHarnessPolicy } from './src/agents/harness/policy.ts'; const config = { agents: { defaults: { model: 'anthropic/claude-sonnet-4-6', agentRuntime: { id: 'claude-cli' }, subagents: { allowAgents: ['worker'] } }, list: [{ id: 'main', default: true }, { id: 'worker' }] } }; const policy = resolveAgentHarnessPolicy({ provider: 'anthropic', modelId: 'claude-sonnet-4-6', config, agentId: 'worker', sessionKey: 'agent:main:worker' }); console.log(JSON.stringify({ scenario: 'subagent-default-runtime', policy }, null, 2)); if (policy.runtime !== 'claude-cli' || policy.runtimeSource !== 'defaults') process.exit(1);"
  • Evidence after fix: Terminal output copied from the local OpenClaw checkout after the patch:

    {
      "scenario": "subagent-default-runtime",
      "policy": {
        "runtime": "claude-cli",
        "runtimeSource": "defaults"
      }
    }
  • Observed result after fix: The real policy resolver returned runtime: "claude-cli" and runtimeSource: "defaults", confirming that agents.defaults.agentRuntime is now consumed by the runtime policy path.

  • What was not tested: No full external agent runtime process was launched; this PR is limited to the config/policy selection path and its metadata/test coverage.

Fixes #81395

Changed files

  • src/agents/agent-runtime-metadata.ts (modified, +1/-1)
  • src/agents/harness/policy.test.ts (added, +76/-0)
  • src/agents/harness/policy.ts (modified, +1/-1)
  • src/agents/model-runtime-policy.ts (modified, +4/-1)
  • src/agents/tools/agents-list-tool.test.ts (modified, +2/-2)
  • src/commands/doctor-session-state-providers.test.ts (modified, +2/-2)

Code Example

and persists across restarts.

## Logs and Evidence

Diagnostic report (tokens redacted by diagnostic script). Key evidence:

- **Section 2**`disabledUntil` set, `disabledReason: billing`
- **Section 6** — first error at `00:07:04 UTC` from `subsystem: agent/embedded`:

---

python3 -c "
import json
path = '/home/<user>/.openclaw/agents/main/agent/auth-state.json'
with open(path) as f: d = json.load(f)
for p in d.get('usageStats', {}):
    for k in ['disabledUntil','failureCounts','errorCount','disabledReason','lastFailureAt']:
        d['usageStats'][p].pop(k, None)
with open(path, 'w') as f: json.dump(d, f, indent=2)
print('Cleared.')
"
openclaw gateway restart
RAW_BUFFERClick to expand / collapse

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

When agentRuntime: claude-cli is set at agents.defaults, subagents silently ignore it, fall back to direct HTTP, and drain your Extra Usage quota instead of your Claude Max/Pro subscription — one failed call locks out the entire Anthropic provider for 24 hours.

Steps to reproduce

  1. Set agents.defaults.agentRuntime.id = "claude-cli" in openclaw.json
  2. Leave agents.defaults.subagents.agentRuntime unset
  3. Trigger any session that spawns a subagent (cron job with kind: agentTurn, or any task calling sessions_spawn)
  4. Observe the subagent fires via subsystem: agent/embedded instead of /usr/bin/claude
  5. If Extra Usage is disabled or exhausted, the embedded call returns 400 and OpenClaw writes disabledUntil ~24h forward to auth-state.json
  6. All subsequent requests — including normal interactive chat — return "Provider anthropic has billing issue (skipping all models)"

Expected behavior

Subagents inherit agentRuntime: claude-cli from agents.defaults and route through the /usr/bin/claude binary, which carries the correct Claude Code headers that bill against the Max/Pro subscription.

Actual behavior

Subagents use the embedded HTTP runner regardless of agents.defaults.agentRuntime. The embedded runner calls the Anthropic API directly without CLI headers, routing to the Extra Usage bucket. One billing error sets disabledUntil in auth-state.json, which survives gateway restarts. All provider models are then skipped with "Provider anthropic has billing issue (skipping all models)" until the cooldown expires or the file is manually cleared.

OpenClaw version

2026.2.21-2

Operating system

Ubuntu 24.04 LTS (Linux 6.8.0-100-generic x86_64)

Install method

npm install -g openclaw

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

  • Intended: anthropic:claude-cli OAuth → /usr/bin/claude binary - Actual (subagents): embedded HTTP runner → api.anthropic.com directly (no CLI headers)

Additional provider/model setup details

Single auth profile anthropic:claude-cli (OAuth). No anthropic:manual profile.
agents.defaults.agentRuntime.id = "claude-cli" is set.
agents.defaults.subagents.agentRuntime is absent.

The diagnostic tool flags this post-hoc under automated diagnosis but emits no warning at gateway startup. The lockout state is stored under usageStats["anthropic:claude-cli"] in:

Logs, screenshots, and evidence

and persists across restarts.

## Logs and Evidence

Diagnostic report (tokens redacted by diagnostic script). Key evidence:

- **Section 2** — `disabledUntil` set, `disabledReason: billing`
- **Section 6** — first error at `00:07:04 UTC` from `subsystem: agent/embedded`:

Impact and severity

Impact and Severity

Affected usersAny Claude Max/Pro user running claude-cli OAuth with subagents — standard self-hosted setup
SeverityBlocks entire agent workflow — not just subagents, all interactive chat stops
FrequencyReproducible every time a subagent is spawned with Extra Usage disabled or exhausted; also triggered on gateway restart when a missed cron fires catch-up
ConsequenceUp to 24h outage requiring manual intervention; no warning before or during failure; subscription usage console shows nothing wrong because the subscription is never actually reached

Workaround

Manually clear usageStats fields in auth-state.json and restart:

python3 -c "
import json
path = '/home/<user>/.openclaw/agents/main/agent/auth-state.json'
with open(path) as f: d = json.load(f)
for p in d.get('usageStats', {}):
    for k in ['disabledUntil','failureCounts','errorCount','disabledReason','lastFailureAt']:
        d['usageStats'][p].pop(k, None)
with open(path, 'w') as f: json.dump(d, f, indent=2)
print('Cleared.')
"
openclaw gateway restart

Permanent fix: explicitly set agents.defaults.subagents.agentRuntime.id = "claude-cli" in openclaw.json.

Additional Information

The diagnostic tool detects the misconfiguration but only when run manually. A gateway startup warning — analogous to the existing allowInsecureAuth=true warning — would catch this before it causes an outage.

Additional information

Suggested Fix

Two changes, one behavioral and one UX:

1. Default inheritance (the real fix)

When agents.defaults.agentRuntime.id is set, subagents.agentRuntime should automatically inherit it unless explicitly overridden. Same logic as any sane config cascade — child inherits parent unless told otherwise.

2. Startup warning (the safety net)

When agentRuntime.id = "claude-cli" is detected at agents.defaults but subagents.agentRuntime is unset, emit a startup warning — same pattern already used for allowInsecureAuth=true. Catches misconfiguration before it causes a 24h outage.

Note: We are not asking to remove the embedded HTTP runner — it is a legitimate option for isolation use cases. The problem is the silent fallback, not the existence of the path.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

Subagents inherit agentRuntime: claude-cli from agents.defaults and route through the /usr/bin/claude binary, which carries the correct Claude Code headers that bill against the Max/Pro subscription.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix [Bug]: subagents.agentRuntime not inherited from agents.defaults — subagents bypass claude-cli runtime, hit Extra Usage bucket instead of Claude Max/Pro subscription, triggers 24h auth lockout [3 pull requests, 2 comments, 3 participants]