openclaw: Discord can go silent after 2026.5.5 upgrade/doctor model-runtime migration despite connected bot and working send path [1 comment, 2 participants]

openclaw/openclaw#78609, fetched 2026-05-07 03:34:41

After upgrading a live macOS OpenClaw install to 2026.5.5 (Discord plugin 2026.5.6), Discord appeared healthy and inbound events had previously been observed, but Jarvis/main stopped producing normal Discord replies. Manual Discord send through the gateway still worked, so the Discord token/send permissions were not the root cause.

The failure appears to combine a model/runtime migration problem with stale session/task state that was never recovered:

  • upgrade/doctor rewrote working openai-codex/* model refs toward openai/* and stamped some agents/sessions with Codex runtime fields
  • this install did not have a usable openai provider route for that rewritten config
  • gateway logs showed model-provider resolution errors and Codex app-server fallback errors
  • existing Discord channel sessions could remain stalled/queued, and old background tasks remained stale_running
  • externally, the user-visible symptom was simply: Discord bot is connected, but no agent reply appears

Environment

  • OpenClaw: 2026.5.5
  • Discord plugin: 2026.5.6
  • OS: macOS Darwin arm64
  • Gateway: LaunchAgent, local loopback 127.0.0.1:18789
  • Channel: Discord guild channels
  • Main bot: Jarvis/main
  • Install/config paths sanitized, but this was an npm/Homebrew global install with OpenClaw config under the user OpenClaw home

Observed symptoms

Before cleanup:

  • openclaw gateway status reported gateway running and connectivity probe OK
  • Discord accounts reported running/connected
  • lastInboundAt had advanced during the failure window
  • lastOutboundAt stayed null
  • user saw no replies in Discord
  • the direct Discord send path had not yet been tested in isolation
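
The "connected but silent" state in the list above can be expressed as a small check. This is a minimal sketch: `lastInboundAt` and `lastOutboundAt` are the field names quoted in this report, while the `connected` key and the flat dictionary shape are assumptions made for illustration, not the documented status schema.

```python
def looks_silent(account):
    """Return True when an account is connected and receiving but has never replied.

    `account` is a hypothetical dict built from channel status output.
    """
    return (
        account.get("connected") is True
        and account.get("lastInboundAt") is not None
        and account.get("lastOutboundAt") is None
    )


# Example values are placeholders, not data from the affected install.
before_cleanup = {
    "connected": True,
    "lastInboundAt": "<timestamp>",
    "lastOutboundAt": None,
}
print(looks_silent(before_cleanup))
```

A check like this only flags the symptom. It does not say whether the cause is the model route, a stalled session, or a stale task.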

Task audit showed three stale running tasks from earlier repair/subagent/heartbeat work:

openclaw tasks list --status running --json
count: 3

openclaw tasks audit --json classified them as:

stale_running: 3
errors: 3
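
Stale findings like these could be turned into cancel commands for review before anything is run. The sketch below is hedged: the shape of the audit JSON is not shown in this report, so the `code` and `taskId` keys are hypothetical. Only the `openclaw tasks cancel <taskId>` command form comes from the report itself.

```python
import shlex


def cancel_commands(findings):
    """Build one `openclaw tasks cancel <taskId>` line per stale_running finding.

    `findings` is an assumed list of dicts with hypothetical `code` and
    `taskId` keys; adapt the key names to the real audit output.
    """
    commands = []
    for finding in findings:
        if finding.get("code") == "stale_running":
            argv = ["openclaw", "tasks", "cancel", finding["taskId"]]
            commands.append(shlex.join(argv))
    return commands


# Placeholder findings; the task IDs are invented for illustration.
sample = [
    {"code": "stale_running", "taskId": "task-1"},
    {"code": "other", "taskId": "task-2"},
]
for line in cancel_commands(sample):
    print(line)
```

Printing the commands instead of executing them keeps the step reviewable, which matters on a live install.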

Gateway logs around the failure window included errors of this shape:

CodexAppServerRpcError: failed to load configuration: Model provider `anthropic` not found
CodexAppServerRpcError: failed to load configuration: Model provider `crofai` not found
CodexAppServerRpcError: failed to load configuration: Model provider `minimax-direct` not found
Error: Codex app-server auth profile "zai:default" must belong to provider "openai-codex" or a supported alias.
FailoverError: LLM request timed out.
stalled session: sessionKey=agent:main:discord:channel:<channel-id> state=processing queueDepth=1
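
Because these errors are scattered through the gateway log, a small scanner can help group them. The following is a sketch only: the patterns are derived from the log shapes quoted above, and the function name and result layout are illustrative, not part of OpenClaw.

```python
import re
from collections import defaultdict

# Patterns follow the log shapes quoted in this report. Real log lines may
# carry timestamps or prefixes, so each pattern is searched for, not anchored.
PATTERNS = {
    "missing_provider": re.compile(r"Model provider `?([\w.-]+)`? not found"),
    "auth_profile_mismatch": re.compile(r'auth profile "([^"]+)" must belong to provider'),
    "llm_timeout": re.compile(r"FailoverError: (LLM request timed out)"),
    "stalled_session": re.compile(r"stalled session: sessionKey=(\S+)"),
}


def summarize_gateway_log(text):
    """Group matched values by error kind, keeping first-seen order without duplicates."""
    found = defaultdict(list)
    for line in text.splitlines():
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match and match.group(1) not in found[kind]:
                found[kind].append(match.group(1))
    return dict(found)
```

Run against the excerpt above, the `missing_provider` entry would list `anthropic`, `crofai`, and `minimax-direct`, which points at the provider route and not at Discord.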

The important user-visible gap: none of these errors surfaced as a clear Discord error or reply. The bot simply appeared silent.

What fixed this install locally

  1. Restored the config to a working openai-codex/gpt-* model route for the affected agents instead of the upgrade/doctor-rewritten openai/gpt-* + Codex runtime path.
  2. Removed stale per-session Codex runtime/harness overrides from session stores.
  3. Cancelled the three stale running tasks with openclaw tasks cancel <taskId>.
  4. Restarted the gateway.
  5. Verified gateway and Discord accounts came back healthy.
  6. Verified direct Discord outbound send path:
openclaw message send --channel discord --account main --target channel:<channel-id> --message "OpenClaw upgrade diagnostic: Jarvis outbound Discord send path is live. Testing only." --json

Result:

{
  "ok": true,
  "result": {
    "messageId": "<discord-message-id>",
    "channelId": "<channel-id>"
  }
}
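
When scripting this check, the result can be validated instead of read by eye. This is a minimal sketch that assumes the `--json` output has exactly the shape shown above (`ok`, `result.messageId`). The helper name is illustrative.

```python
import json


def sent_message_id(raw_output):
    """Return the Discord message ID if the send result reports ok, else None."""
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or payload.get("ok") is not True:
        return None
    result = payload.get("result")
    return result.get("messageId") if isinstance(result, dict) else None
```

A non-`None` return confirms only the outbound send path. It says nothing about whether the agent reply path is healthy.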

After cleanup:

openclaw tasks list --status running --json
count: 0

openclaw health --json showed:

ok: true
Discord main connected: true
Discord specialists connected: true
lastError: null
restartPending: false

Expected behavior

Upgrade/doctor migration should not leave a Discord install in a state where:

  • Discord is connected
  • inbound messages are accepted or were recently accepted
  • manual Discord send path works
  • but normal agent replies silently fail or stall due to model/runtime/session state

If model/runtime migration cannot be made safely, OpenClaw should do at least one of the following:

  • preserve the known-good model/runtime route
  • emit a hard validation warning before restart
  • surface a clear channel-visible or status-visible error
  • provide a targeted repair that does not also rewrite working model refs
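
The "preserve the known-good route" option can be illustrated with an all-or-nothing migration plan. This is a sketch under stated assumptions, not OpenClaw's migration code: the agent-to-model mapping, the set of usable providers, and the function name are all hypothetical.

```python
def plan_model_migration(model_refs, usable_providers,
                         source="openai-codex", target="openai"):
    """Return (new_refs, warnings) without mutating the input mapping.

    Refs are rewritten from `source/*` to `target/*` only when the target
    provider is usable; otherwise every ref is kept and a warning is returned.
    """
    if target not in usable_providers:
        warning = f"provider {target!r} is not usable; keeping {source}/* refs unchanged"
        return dict(model_refs), [warning]
    rewritten = {}
    for agent, ref in model_refs.items():
        provider, _, model = ref.partition("/")
        rewritten[agent] = f"{target}/{model}" if provider == source else ref
    return rewritten, []


# "gpt-example" is a placeholder model name, not one taken from the report.
refs = {"main": "openai-codex/gpt-example"}
print(plan_model_migration(refs, usable_providers={"openai-codex"}))
```

Validating before writing means a failed precondition leaves the working config untouched and produces a warning that can be shown before restart.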

Actual behavior

The install became Discord-silent. Gateway/channel status looked broadly healthy, but the agent reply path failed/stalled. The actionable errors were only discoverable by combining gateway logs, model status, task audit, and session-state inspection.

Why this is hard to debug

Several signals point in different directions:

  • openclaw gateway status: OK
  • Discord accounts: connected
  • Discord token/send path: OK after manual send test
  • channel security: no warnings
  • doctor --fix: recommended, but on this install it was exactly the risky path, because it would rewrite openai-codex/* back toward the route that broke the install
  • stale task/session state: only obvious through tasks audit and session inspection

Suggested fixes

  • Make doctor/upgrade model-runtime migration transactional and validate the post-migration provider route before writing config/session runtime overrides.
  • If openai/* provider is not usable, do not rewrite a working openai-codex/* route into it.
  • Add a targeted command to clear stale session runtime overrides without running all of doctor --fix.
  • Include tasks audit stale-running findings in doctor/status when channel replies are stalled.
  • When Discord inbound-to-agent fails before producing a reply, send a visible fallback error or at least update channel health with a clear lastReplyError.
  • Consider a diagnostic that differentiates:
    • Discord gateway connected
    • Discord outbound API send works
    • inbound message routed to agent session
    • model turn completed
    • Discord reply emitted
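
The five-stage diagnostic proposed above could report the first stage that is not confirmed. The sketch below is illustrative only: the stage keys are hypothetical names, and how each boolean is obtained (status output, logs, a manual send test) is left open.

```python
# Stage descriptions are taken from the list above; the keys are invented.
STAGES = [
    ("gateway_connected", "Discord gateway connected"),
    ("outbound_send_ok", "Discord outbound API send works"),
    ("inbound_routed", "inbound message routed to agent session"),
    ("model_turn_completed", "model turn completed"),
    ("reply_emitted", "Discord reply emitted"),
]


def first_failing_stage(checks):
    """Return the description of the first stage not confirmed True, or None."""
    for key, description in STAGES:
        if checks.get(key) is not True:
            return description
    return None


# Hypothetical check results for a bot that connects but never answers.
checks = {
    "gateway_connected": True,
    "outbound_send_ok": True,
    "inbound_routed": True,
    "model_turn_completed": False,
    "reply_emitted": False,
}
print(first_failing_stage(checks))
```

Ordering the stages along the reply path means the first unconfirmed stage is the one to investigate, and a missing check is treated the same as a failed one.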

Impact

High for live Discord installs: the bot appears present and healthy, but users get no replies. The repair requires knowing to inspect model-runtime migration state, stale tasks, session runtime overrides, and Discord send path separately.
