openclaw - Feature request: proactive OAuth refresh canary / pre-expiry keepalive

Target repo: openclaw/openclaw
Filer: islandpreneur007
Draft status: ready for review + file (feature request, not bug)
Date drafted: 2026-05-21
Source evidence: xAI OAuth investigation 2026-05-21 with substrate code-path documentation in docs/concepts/oauth.md

Summary

OpenClaw OAuth refresh is documented as lazy/on-demand: an expired token is refreshed only when the next request hits it. Per docs/concepts/oauth.md:

  • if expires is in the future → use the stored access token
  • if expired → refresh (under a file lock) and overwrite the stored credentials
  • The refresh flow is automatic; you generally don't need to manage tokens manually.

This behavior is correct as designed. However, it leaves a silent expiry window when the following sequence occurs:

  • A provider's token expires
  • No request hits that provider for hours or days
  • An operator inspecting auth-profiles.json sees expires in the past and assumes the refresh path is broken
  • When traffic eventually arrives, the refresh fires, but the stale value has already triggered an operator-side investigation

This request proposes a low-cost proactive refresh canary that periodically pre-warms OAuth tokens, so the lazy refresh path is exercised before silent expiry windows accumulate.
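
The lazy rule quoted above from docs/concepts/oauth.md can be illustrated with a minimal sketch. This is not OpenClaw's implementation: the Credentials type, the get_access_token helper, and the assumption that expires is stored as Unix seconds are all hypothetical, and the file lock is only noted in a comment.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Credentials:
    access_token: str
    refresh_token: str
    expires: float  # assumed unit: Unix seconds (not confirmed by the docs)


def get_access_token(
    creds: Credentials,
    refresh: Callable[[Credentials], Credentials],
    now: Optional[float] = None,
) -> Tuple[str, Credentials]:
    """Return a usable token, refreshing only if the stored one has expired."""
    now = time.time() if now is None else now
    if creds.expires > now:
        # expires is in the future: use the stored access token as-is.
        return creds.access_token, creds
    # Expired: refresh. The documented flow does this under a file lock and
    # overwrites the stored credentials; this sketch omits both steps.
    refreshed = refresh(creds)
    return refreshed.access_token, refreshed
```

Nothing in this flow runs unless a request calls it, which is why a stored expires value can sit in the past indefinitely on an idle provider without anything being broken.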

Concrete repro from 2026-05-21

  • The xAI OAuth profile in agents/main/agent/auth-profiles.json showed expires roughly 14h in the past when the investigation started at 14:32Z.
  • The refresh code path was correct: it worked when we triggered it via a non-mutating probe at 14:44Z (refreshed to expires=2026-05-21T20:39:03Z).
  • The 14h silent window nonetheless caused an operator-side OAuth investigation that consumed ~3h, the bulk of which a periodic keepalive would have avoided.

Proposed feature shape

A configurable per-provider canary that:

  • Runs on a cron schedule (e.g. */30 * * * *, every 30 minutes)
  • For each enabled OAuth provider, checks whether any stored profile has expires within a configurable lookahead window (e.g. lookahead_seconds: 600)
  • If so, runs a no-op refresh-and-discard round-trip against the OAuth endpoint
  • Logs success/failure to the gateway journal with a structured event class (e.g. oauth_canary_refresh_ok / oauth_canary_refresh_failed)
  • Optionally raises a structured alert if refresh fails persistently (e.g. via the existing inspector/alert mechanism)
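
One canary tick following the steps above might look like the sketch below. The Profile type, profiles_due, and run_canary_pass are hypothetical names, expires is assumed to be Unix seconds, and treating already-expired profiles as due is an assumption of this sketch rather than something the proposal states. Only the two event names come from the proposal.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Profile:
    profile_id: str
    provider: str
    expires: float  # assumed unit: Unix seconds


def profiles_due(
    profiles: Iterable[Profile], now: float, lookahead_seconds: float
) -> List[Profile]:
    """Profiles that are already expired or will expire within the lookahead."""
    return [p for p in profiles if p.expires <= now + lookahead_seconds]


def run_canary_pass(
    profiles: Iterable[Profile],
    refresh: Callable[[Profile], None],
    now: float,
    lookahead_seconds: float = 600,
) -> List[Tuple[str, str]]:
    """Refresh every due profile and return (event_class, profile_id) pairs."""
    events: List[Tuple[str, str]] = []
    for profile in profiles_due(profiles, now, lookahead_seconds):
        try:
            refresh(profile)  # stand-in for the existing refresh code path
            events.append(("oauth_canary_refresh_ok", profile.profile_id))
        except Exception:
            # Broad catch is deliberate in this sketch: any failure is an event.
            events.append(("oauth_canary_refresh_failed", profile.profile_id))
    return events
```

With a 30-minute cadence and a 600-second lookahead, a token that expires more than 600 seconds after one tick is not picked up until the next tick, so it can sit expired for up to 20 minutes. Counting already-expired profiles as due, as this sketch does, is what closes that gap.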

Configuration sketch

{
  "plugins": {
    "entries": {
      "<provider>": {
        "config": {
          "oauth": {
            "canary": {
              "enabled": true,
              "cadence": "*/30 * * * *",
              "lookahead_seconds": 600,
              "alert_after_consecutive_failures": 3
            }
          }
        }
      }
    }
  }
}

Alternatively, a top-level gateway.oauth.canary config could apply to all OAuth providers uniformly.
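
If both shapes were supported, the settings could be resolved roughly as follows. This is a sketch under assumptions: the proposal does not define precedence, so letting the per-provider block win over gateway.oauth.canary is a guess, and resolve_canary_config and FailureStreak are hypothetical helpers. Only the config keys come from the sketch above.

```python
from typing import Any, Dict, Optional


def resolve_canary_config(
    config: Dict[str, Any], provider: str
) -> Optional[Dict[str, Any]]:
    """Return the per-provider canary block, else the top-level one, else None."""
    per_provider = (
        config.get("plugins", {})
        .get("entries", {})
        .get(provider, {})
        .get("config", {})
        .get("oauth", {})
        .get("canary")
    )
    if per_provider is not None:
        return per_provider  # assumed precedence: provider-specific wins
    return config.get("gateway", {}).get("oauth", {}).get("canary")


class FailureStreak:
    """Tracks consecutive failures for alert_after_consecutive_failures."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.count = 0

    def record(self, ok: bool) -> bool:
        """Record one canary result; return True while the streak is at or past the threshold."""
        self.count = 0 if ok else self.count + 1
        return self.count >= self.threshold
```

A single success resets the streak, so an intermittent network failure would not raise an alert, while a revoked token that fails on every tick would alert on the third consecutive failure with the example value of 3.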

Why this isn't already covered by existing mechanisms

  • openclaw doctor is one-shot, not periodic. Running it on a schedule would require an operator-side cron job.
  • L2-WATCHDOG / gateway health probes don't currently exercise OAuth refresh paths.
  • The lazy refresh code path itself is correct. This request isn't about fixing the refresh; it's about exercising it before silent expiry windows surface.

Operational benefit

  • Surfaces OAuth refresh path failures (revoked tokens, changed provider OAuth endpoints, network issues) before they affect downstream agent traffic.
  • Catches token revocation / consent revocation early.
  • Reduces false-positive operator-side investigations triggered by stale expires fields.

Alternatives considered

  1. Operator-side cron running openclaw infer model auth status periodically: workable, but it adds maintenance burden and only reports status. It doesn't exercise the refresh round-trip.
  2. On-demand-only model (current default): works, but the silent window persists.
  3. Aggressive auto-refresh on every request: wastes network round-trips and isn't needed.
  4. Per-provider periodic canary (this proposal): minimal cost, exercises the refresh path, surfaces failures early.

Environment context

  • OpenClaw 2026.5.19 (a185ca2)
  • Providers we run with OAuth: xai, openai-codex
  • Cascading agents using read-through OAuth inheritance from main: 25+ in our fleet
  • Per docs/concepts/oauth.md "Storage" section, secondary agents inherit main's OAuth via read-through. A main-store-level canary would exercise the refresh path for the whole cascade.
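
Why one main-store canary would cover the whole cascade can be shown with a toy model of read-through lookup. Everything here is an assumption made for illustration: the stores are plain dictionaries, lookup_profile and canary_refresh_main are hypothetical, and whether a secondary agent can hold a local override is not stated in the source.

```python
from typing import Dict, Optional

Store = Dict[str, dict]


def lookup_profile(
    agent_store: Store, main_store: Store, profile_id: str
) -> Optional[dict]:
    """Read-through: use the agent's own entry if present, else main's."""
    if profile_id in agent_store:
        return agent_store[profile_id]
    return main_store.get(profile_id)


def canary_refresh_main(store: Store, profile_id: str, new_expires: float) -> None:
    """Stand-in for a canary refresh that overwrites main's stored credentials."""
    store[profile_id] = {**store[profile_id], "expires": new_expires}


# Toy fleet: 25 secondary agents with no local OAuth entries of their own.
main_store: Store = {"xai": {"expires": 1000.0}}
fleet: Dict[str, Store] = {f"agent-{i}": {} for i in range(25)}

# One refresh against main is visible to every agent that reads through.
canary_refresh_main(main_store, "xai", 22000.0)
seen = {
    name: lookup_profile(store, main_store, "xai")["expires"]
    for name, store in fleet.items()
}
```

In this model the canary only needs to run against main; the secondaries never hold their own copy, so there is nothing else to refresh.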

Suggested fit

This could ship as a small bundled extension under plugins/oauth-canary or as a gateway-level cron addition. The implementation surface is small (~100 LoC plus config) and reuses the existing refresh code path verbatim.

Related issues

  • #76247 (dispatch_acks landing telemetry): a different surface, but the same family of "exercise the path proactively before silent failure".
  • This is filed as a separate feature request because the dispatcher work scope is distinct from auth refresh.
