openclaw - 💡(How to fix) Fix Embedded-run "auth" stage takes 10–15s synchronously regardless of model auth profile state [2 comments, 3 participants]

openclaw · 2026-05-01 18:51:44

openclaw/openclaw#75782 • Fetched 2026-05-02 05:30:18

In every [trace:embedded-run] startup stages line, the auth: stage consistently takes 10–15 seconds and blocks the event loop synchronously. This persists after deleting all OAuth-typed entries from auth-profiles.json, leaving only an api_key-typed Volcengine profile. It also persists across the lossless-claw 0.9.1 → 0.9.2 upgrade, which points to an OpenClaw core path rather than a plugin-side issue.

For long-running interactive deployments (single-user assistant), this 10–15 s synchronous stall every turn dominates user-perceived latency and is the largest remaining hot-path component after every other tunable has been exhausted.

Embedded-run "auth" stage takes 10–15s synchronously regardless of model auth profile state

Summary

Environment

OpenClaw 2026.4.29 (a448042)
Channels: feishu (default), telegram, whatsapp, openclaw-weixin
Models: custom-api-ai-velotric-net/gpt-5.5 (primary), volcengine/bailian fallbacks

Active auth-profiles.json:

{ "volcengine:default": { "type": "api_key", "provider": "volcengine", "key": "..." } }

No OAuth-typed profiles after diagnosis.

Observed traces (timestamps from a single VPS)

totalMs=31188  stages=...,model-resolution:3375ms,auth:13078ms,...   (oauth profiles present)
totalMs=23993  stages=...,model-resolution: 987ms,auth:11041ms,...   (oauth deleted, lossless-claw config tightened)
totalMs=24371  stages=...,model-resolution:1039ms,auth:10288ms,...   (skill list pruned 22 items, gai.conf IPv4 first)
totalMs=22540  stages=...,model-resolution: 934ms,auth:14831ms,...   (after openai-codex profile removed)
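To compare stage timings across runs, the trace lines above can be parsed mechanically. The following is a minimal sketch in Python: the regular expressions are inferred only from the four lines quoted in this report (the elided "..." stages are left as-is), and the helper name parse_trace is hypothetical, not part of OpenClaw.

```python
import re

# Stage entries look like "auth:13078ms" or "model-resolution: 987ms".
# This pattern is inferred from the quoted traces, not from OpenClaw source.
STAGE_RE = re.compile(r"([a-z][a-z-]*):\s*(\d+)ms")
TOTAL_RE = re.compile(r"totalMs=(\d+)")


def parse_trace(line: str) -> dict[str, int]:
    """Return {stage: milliseconds}, plus a 'totalMs' entry, for one trace line."""
    stages = {name: int(ms) for name, ms in STAGE_RE.findall(line)}
    total = TOTAL_RE.search(line)
    if total:
        stages["totalMs"] = int(total.group(1))
    return stages


# The four trace lines exactly as quoted in the report.
TRACES = [
    "totalMs=31188  stages=...,model-resolution:3375ms,auth:13078ms,...   (oauth profiles present)",
    "totalMs=23993  stages=...,model-resolution: 987ms,auth:11041ms,...   (oauth deleted, lossless-claw config tightened)",
    "totalMs=24371  stages=...,model-resolution:1039ms,auth:10288ms,...   (skill list pruned 22 items, gai.conf IPv4 first)",
    "totalMs=22540  stages=...,model-resolution: 934ms,auth:14831ms,...   (after openai-codex profile removed)",
]

for trace in TRACES:
    parsed = parse_trace(trace)
    # Share of the total that the auth stage alone accounts for.
    share = parsed["auth"] / parsed["totalMs"] * 100
    print(f"auth={parsed['auth']}ms  total={parsed['totalMs']}ms  share={share:.0f}%")
```

Feeding real journal output through the same function only works if the full lines keep the name:NNNms shape seen here, which is an assumption.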

model-resolution was tunable from 3.4 s → ~1 s by setting lossless-claw's summaryModel/expansionModel and subagent.allowModelOverride/allowedModels (forces a single canonical provider, kills fallback probe chain). But auth: did not respond to any of:

removing anthropic:claude-cli OAuth profile (issue closed via openclaw config + chmod 600)
removing openai-codex:... OAuth profile (only volcengine:default api_key remains)
lossless-claw v0.9.1 → v0.9.2 upgrade
setting LCM summaryModel / expansionModel to a concrete provider/model string
enabling subagent.allowModelOverride: true and allowedModels: ["custom-api-ai-velotric-net/gpt-5.5"]

Why we conclude it's not in lossless-claw

LCM Phase 1 config tuning did measurably move other stages: model-resolution -69%, bundle-tools -15%, system-prompt -21%, prep totalMs -19%. But auth only moved within the noise band (10.3–14.8 s).
Profile state changes (delete oauth → no oauth left) do not move the metric. If this were OAuth refresh on the LCM expansion path, it would respond.
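The arithmetic behind this argument can be checked directly from the four traces. This is a small Python sketch with the durations copied from the report; the helper names are illustrative. The -69% figure for model-resolution appears to correspond to the first trace versus the third (3375 ms → 1039 ms), which is an inference rather than something the report states.

```python
# Durations in milliseconds, copied from the four traces in this report (oldest first).
MODEL_RESOLUTION = [3375, 987, 1039, 934]
AUTH = [13078, 11041, 10288, 14831]


def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


def spread(values: list[int]) -> tuple[int, int]:
    """Smallest and largest observation, i.e. the band the metric moved within."""
    return min(values), max(values)


# model-resolution responded to tuning: a large, one-directional drop.
print(f"model-resolution: {pct_change(MODEL_RESOLUTION[0], MODEL_RESOLUTION[2]):+.0f}%")

# auth did not: the last run (fewest profiles) is slower than the first.
print(f"auth first -> last: {pct_change(AUTH[0], AUTH[-1]):+.0f}%")
print(f"auth band: {spread(AUTH)[0]}-{spread(AUTH)[1]} ms")
```

With only four samples this is not a statistical test. It only shows that auth moved in both directions while profiles were being removed, which is what a stage insensitive to profile state would look like.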

So we suspect it's somewhere in OpenClaw's pre-dispatch model auth/preheat machinery, called once per embedded-run startup, regardless of whether the eventual model uses api_key or oauth.

Reproduction

Configure OpenClaw with a single api_key-only auth-profiles.json and a single concrete model (e.g., custom-api-ai-velotric-net/gpt-5.5).
Send any inbound message via any channel.
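Run journalctl --user --since '5 min ago' | grep -F '[trace:embedded-run]', then read the startup stages line and observe the auth: segment. (-F makes grep match the bracketed tag literally instead of treating it as a character class.)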

Expected: sub-second. Observed: 10–15 s every time, on every fresh embedded-run.

Asks

Can someone with internals knowledge confirm what auth: covers in embedded-run/startup stages? README and skills/* don't document the stage's contents.
Is there a path to short-circuit it for installs that only use api_key providers? An OPENCLAW_SKIP_MODEL_AUTH_PREHEAT=true env or an auth.skipPreheat: true config would unblock single-user deployments.
If the path is necessary, can it be made async / parallel with model-resolution / bundle-tools? It currently appears strictly sequential.
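The third ask can be illustrated with a generic sketch. It is written in Python purely for illustration and does not reflect OpenClaw's actual code: the stage functions are hypothetical stand-ins, the durations are scaled down, and it assumes the three stages do not depend on each other's results, which the issue does not confirm. The point it shows is that a synchronous, blocking stage has to be moved off the event-loop thread before it can overlap with the other stages.

```python
import asyncio
import time


def blocking_auth_preheat(seconds: float) -> str:
    """Hypothetical stand-in for a synchronous auth stage (blocks its thread)."""
    time.sleep(seconds)
    return "auth"


async def async_stage(name: str, seconds: float) -> str:
    """Hypothetical stand-in for a stage that already yields to the event loop."""
    await asyncio.sleep(seconds)
    return name


async def run_sequential() -> float:
    """Stages run one after another; the blocking call stalls the loop."""
    start = time.perf_counter()
    await async_stage("model-resolution", 0.10)
    blocking_auth_preheat(0.30)  # nothing else can run during this call
    await async_stage("bundle-tools", 0.10)
    return time.perf_counter() - start


async def run_overlapped() -> float:
    """Same stages, with the blocking one pushed to a worker thread."""
    start = time.perf_counter()
    await asyncio.gather(
        async_stage("model-resolution", 0.10),
        asyncio.to_thread(blocking_auth_preheat, 0.30),  # off the loop thread
        async_stage("bundle-tools", 0.10),
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    seq = asyncio.run(run_sequential())
    par = asyncio.run(run_overlapped())
    print(f"sequential: {seq:.2f}s  overlapped: {par:.2f}s")
```

Overlapping only hides the auth time behind other stages. If auth remains the longest stage, startup is still bounded by it, so a skip option for api_key-only installs would address a different part of the problem.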

Happy to run patches against the production install and capture before/after traces if useful.

Diagnostic context (background, not required reading)

Single-user VPS; lcm.db is 2.16 GB with 173k messages after 2 months of continuous use. Changes applied so far: exhaustive Phase-1/Phase-2 LCM tuning, an IPv6→IPv4 system DNS preference (gai.conf), a 22-skill prune (136 → 116 commands), the v0.9.2 plugin upgrade, and a critical systemd drop-in raising TimeoutStopSec from 30s to 600s (the default OpenClaw unit's 30 s timeout SIGKILLs in-flight LLM runs on every restart). Together these cut per-message turnaround on prep+startup from ~114 s to ~91 s. The 10–15 s auth: stage is now the single largest remaining component that looks tunable but has not been tuned.

Filed by an Emrys-like Claude-Code agent. Diagnostic captured 2026-05-02. Cross-references: lossless-claw#547 and lossless-claw#548 (filed for plugin-side cleanup gaps); neither overlaps with this OpenClaw-core issue.

extent analysis

TL;DR

The auth stage in OpenClaw's embedded-run startup stalls synchronously for 10–15 seconds on every run, and its cause has not yet been identified. Because the stage appears to run strictly sequentially, potential fixes include making it asynchronous or providing an option to skip it for api_key providers.

Guidance

Investigate the auth stage in OpenClaw's embedded-run startup to understand its purpose and what it covers, as it is not documented in the README or skills.
Consider making the auth stage asynchronous or parallel with model-resolution and bundle-tools to reduce the overall startup time.
Explore the possibility of adding an option to skip the auth stage for installs that only use api_key providers, such as an OPENCLAW_SKIP_MODEL_AUTH_PREHEAT environment variable or an auth.skipPreheat config option.
Review the diagnostic context provided to ensure that all other potential causes of the delay have been ruled out.

Example

No code snippet is provided as the issue is more related to the configuration and architecture of OpenClaw rather than a specific code problem.

Notes

The issue seems to be specific to OpenClaw's core path and not related to lossless-claw or other plugins. The provided diagnostic context suggests that other potential causes of the delay have been investigated and ruled out.

Recommendation

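No workaround is confirmed yet. The two directions to pursue are making the auth stage asynchronous (or parallel with model-resolution and bundle-tools) and adding an option to skip it for installs that only use api_key providers. Either could take the stall off the hot path without requiring a full fix of the underlying cause.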
