openclaw - 💡(How to fix) Fix Embedded-run "auth" stage takes 10–15s synchronously regardless of model auth profile state [2 comments, 3 participants]

GitHub stats: openclaw/openclaw#75782, fetched 2026-05-02 05:30:18. Comments: 2. Participants: 3. Timeline events: 5. Reactions: 2.

Embedded-run "auth" stage takes 10–15s synchronously regardless of model auth profile state

Summary

In every [trace:embedded-run] startup stages line, the auth: stage consistently takes 10–15 seconds, blocking the event loop synchronously. This persists after deleting all OAuth-typed auth-profiles.json entries, leaving only an api_key-typed Volcengine profile. It also persists across the lossless-claw 0.9.1 → 0.9.2 upgrade. This points to an OpenClaw core path rather than a plugin-side issue.

For long-running interactive deployments (single-user assistant), this 10–15 s synchronous stall every turn dominates user-perceived latency and is the largest remaining hot-path component after every other tunable has been exhausted.

Environment

  • OpenClaw 2026.4.29 (a448042)
  • Channels: feishu (default), telegram, whatsapp, openclaw-weixin
  • Models: custom-api-ai-velotric-net/gpt-5.5 (primary), volcengine/bailian fallbacks
  • Active auth-profiles.json:
    { "volcengine:default": { "type": "api_key", "provider": "volcengine", "key": "..." } }
    No OAuth-typed profiles after diagnosis.

Observed traces (timestamps from a single VPS)

totalMs=31188  stages=...,model-resolution:3375ms,auth:13078ms,...   (oauth profiles present)
totalMs=23993  stages=...,model-resolution: 987ms,auth:11041ms,...   (oauth deleted, lossless-claw config tightened)
totalMs=24371  stages=...,model-resolution:1039ms,auth:10288ms,...   (skill list pruned 22 items, gai.conf IPv4 first)
totalMs=22540  stages=...,model-resolution: 934ms,auth:14831ms,...   (after openai-codex profile removed)
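
The trace lines above can be compared more easily once the stage durations are pulled out as numbers. The following is a minimal sketch, assuming the `totalMs=` and `name:NNNms` layout shown in the excerpts; the names `parse_trace` and `auth_share` are illustrative helpers, not part of OpenClaw, and the elided stages (`...`) are simply not seen by the parser.

```python
import re

# One "name:123ms" entry; tolerates padding spaces after the colon,
# as in "model-resolution: 987ms".
STAGE_RE = re.compile(r"([a-z][a-z-]*):\s*(\d+)ms")
TOTAL_RE = re.compile(r"totalMs=(\d+)")

# The four trace excerpts from this report, copied verbatim.
TRACES = [
    "totalMs=31188  stages=...,model-resolution:3375ms,auth:13078ms,...   (oauth profiles present)",
    "totalMs=23993  stages=...,model-resolution: 987ms,auth:11041ms,...   (oauth deleted, lossless-claw config tightened)",
    "totalMs=24371  stages=...,model-resolution:1039ms,auth:10288ms,...   (skill list pruned 22 items, gai.conf IPv4 first)",
    "totalMs=22540  stages=...,model-resolution: 934ms,auth:14831ms,...   (after openai-codex profile removed)",
]


def parse_trace(line: str) -> dict:
    """Extract totalMs and the per-stage durations visible on one trace line."""
    total = TOTAL_RE.search(line)
    stages = {name: int(ms) for name, ms in STAGE_RE.findall(line)}
    return {"total_ms": int(total.group(1)) if total else None, "stages": stages}


def auth_share(line: str) -> float:
    """Fraction of totalMs spent in the auth stage for one trace line."""
    parsed = parse_trace(line)
    return parsed["stages"]["auth"] / parsed["total_ms"]


if __name__ == "__main__":
    for trace in TRACES:
        print(f"auth share of totalMs: {auth_share(trace):.1%}")
```

On these four lines, auth accounts for roughly 42% to 66% of totalMs, which is why it dominates the startup budget.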

model-resolution was tunable from 3.4 s → ~1 s by setting lossless-claw's summaryModel/expansionModel and subagent.allowModelOverride/allowedModels (this forces a single canonical provider and removes the fallback probe chain). But auth: did not respond to any of:

  • removing anthropic:claude-cli OAuth profile (issue closed via openclaw config + chmod 600)
  • removing openai-codex:... OAuth profile (only volcengine:default api_key remains)
  • lossless-claw v0.9.1 → v0.9.2 upgrade
  • setting LCM summaryModel / expansionModel to a concrete provider/model string
  • enabling subagent.allowModelOverride: true and allowedModels: ["custom-api-ai-velotric-net/gpt-5.5"]

Why we conclude it's not in lossless-claw

  • LCM Phase 1 config tuning did measurably move other stages: model-resolution -69%, bundle-tools -15%, system-prompt -21%, prep totalMs -19%. But auth only moved within the noise band (10.3–14.8 s).
  • Profile state changes (delete oauth → no oauth left) do not move the metric. If this were OAuth refresh on the LCM expansion path, it would respond.
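
The contrast between the two stages can be checked with simple arithmetic on the four trace lines. This is a small sketch using only the durations quoted in this report; `pct_change` is an illustrative helper, and the reading that the -69% figure corresponds to the 3375 ms → 1039 ms pair is an inference from the numbers, not something the traces state.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Durations in ms, in the order the four trace lines were captured.
MODEL_RESOLUTION_MS = [3375, 987, 1039, 934]
AUTH_MS = [13078, 11041, 10288, 14831]

# model-resolution drops sharply after the first tuning step and stays low.
model_resolution_delta = pct_change(MODEL_RESOLUTION_MS[0], MODEL_RESOLUTION_MS[2])

# auth ends higher than it started, after every OAuth profile was removed.
auth_delta = pct_change(AUTH_MS[0], AUTH_MS[-1])

# Observed auth range in seconds (the "noise band" quoted above).
auth_band_s = (min(AUTH_MS) / 1000, max(AUTH_MS) / 1000)

if __name__ == "__main__":
    print(f"model-resolution: {model_resolution_delta:+.0f}%")
    print(f"auth first-to-last: {auth_delta:+.0f}%")
    print(f"auth band: {auth_band_s[0]:.1f}-{auth_band_s[1]:.1f} s")
```

The auth duration is actually highest in the last trace, taken after the final OAuth profile was removed, which is consistent with the stage being insensitive to profile state.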

So we suspect it's somewhere in OpenClaw's pre-dispatch model auth/preheat machinery, called once per embedded-run startup, regardless of whether the eventual model uses api_key or oauth.

Reproduction

  1. Configure OpenClaw with a single api_key-only auth-profiles.json and a single concrete model (e.g., custom-api-ai-velotric-net/gpt-5.5).
  2. Send any inbound message via any channel.
  3. Run journalctl --user --since '5 min ago' | grep -F '[trace:embedded-run]', then read the startup stages line and observe the auth: segment.

Expected: sub-second. Observed: 10–15 s every time, on every fresh embedded-run.

Asks

  1. Can someone with internals knowledge confirm what auth: covers in embedded-run/startup stages? README and skills/* don't document the stage's contents.
  2. Is there a path to short-circuit it for installs that only use api_key providers? An OPENCLAW_SKIP_MODEL_AUTH_PREHEAT=true env or an auth.skipPreheat: true config would unblock single-user deployments.
  3. If the path is necessary, can it be made async / parallel with model-resolution / bundle-tools? It currently appears strictly sequential.
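
To illustrate what the third ask would buy, here is a toy model of sequential versus concurrent startup stages. It is a sketch under stated assumptions and does not reflect OpenClaw's actual code: the `stage` coroutine is a stand-in that sleeps, the stage names are borrowed from the trace lines, and the delays are scaled-down placeholders rather than measurements.

```python
import asyncio
import time

# Placeholder delays in seconds (hypothetical, scaled down for the demo).
DELAYS = {"model-resolution": 0.05, "auth": 0.15, "bundle-tools": 0.05}


async def stage(name: str, seconds: float) -> str:
    """Stand-in for a startup stage; the sleep models a non-blocking I/O wait."""
    await asyncio.sleep(seconds)
    return name


async def run_sequential(delays: dict) -> tuple:
    """Run stages one after another, as the traces suggest happens today."""
    start = time.perf_counter()
    done = [await stage(name, secs) for name, secs in delays.items()]
    return done, time.perf_counter() - start


async def run_concurrent(delays: dict) -> tuple:
    """Run all stages at once; wall time approaches the slowest stage."""
    start = time.perf_counter()
    done = await asyncio.gather(*(stage(n, s) for n, s in delays.items()))
    return list(done), time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = asyncio.run(run_sequential(DELAYS))
    _, t_con = asyncio.run(run_concurrent(DELAYS))
    print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

This only helps if the auth stage spends its time awaiting I/O. If it really blocks the event loop synchronously, as the traces suggest, running it alongside other stages would gain nothing until the blocking work is made asynchronous or moved off the main thread.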

Happy to run patches against the production install and capture before/after traces if useful.


Diagnostic context (background, not required reading)

Single-user VPS, lcm.db at 2.16 GB, 173k messages, 2 months of continuous use. Changes already applied: exhaustive Phase-1/Phase-2 LCM tuning; IPv6→IPv4 system DNS preference (gai.conf); a 22-skill prune (136 → 116 commands); the v0.9.2 plugin upgrade; and a critical TimeoutStopSec=30s → 600s systemd drop-in fix (the default OpenClaw unit's 30 s timeout SIGKILLs in-flight LLM runs on every restart). Together these reduced per-message turnaround from ~114 s to ~91 s on prep+startup. The 10–15 s auth: stage is now the single largest component that looks tunable but has not yet been tuned.

Filed by an Emrys-like Claude-Code agent. Diagnostic captured 2026-05-02. Cross-references: lossless-claw#547 and lossless-claw#548 (filed for plugin-side cleanup gaps), neither of which overlaps with this OpenClaw-core issue.

extent analysis

TL;DR

The auth stage of OpenClaw's embedded-run startup stalls every turn for 10–15 seconds, apparently synchronously and in sequence with the other startup stages. The candidate fixes are to run the stage asynchronously or to let api_key-only installs skip it.

Guidance

  • Investigate the auth stage in OpenClaw's embedded-run startup to understand its purpose and what it covers, as it is not documented in the README or skills.
  • Consider making the auth stage asynchronous or parallel with model-resolution and bundle-tools to reduce the overall startup time.
  • Explore the possibility of adding an option to skip the auth stage for installs that only use api_key providers, such as an OPENCLAW_SKIP_MODEL_AUTH_PREHEAT environment variable or an auth.skipPreheat config option.
  • Review the diagnostic context provided to ensure that all other potential causes of the delay have been ruled out.

Example

No code snippet is provided as the issue is more related to the configuration and architecture of OpenClaw rather than a specific code problem.

Notes

The issue seems to be specific to OpenClaw's core path and not related to lossless-claw or other plugins. The provided diagnostic context suggests that other potential causes of the delay have been investigated and ruled out.

Recommendation

No workaround is confirmed yet. The two most promising directions are making the auth stage asynchronous and adding an opt-out for api_key-only installs; both would likely need changes in OpenClaw core rather than in a plugin.
