openclaw - 💡(How to fix) Fix Discord inbound worker repeatedly times out after 1800s while gateway is still running [1 comments, 2 participants]

2026-04-26T10:04:19+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1497687704866000906, messageId=1497772780744347771)
2026-04-26T12:06:09+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1489624037758861415, messageId=1497803443287625800)

Error Message

[ws] handshake timeout ... peer=127.0.0.1:...->127.0.0.1:18789
Subagent announce failed: Error: gateway timeout after 10000ms
[session-write-lock] releasing lock held for 66843ms / 71211ms / 97029ms
[memory] qmd embed failed ... timed out after 600000ms
[agent/embedded] [context-overflow-diag] ... Context overflow: estimated context size exceeds safe threshold during tool loop.

Root Cause

I'm seeing Discord replies fail with:

Discord inbound worker timed out.

In the service logs this appears as:

2026-04-26T10:04:19+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1497687704866000906, messageId=1497772780744347771)
2026-04-26T12:06:09+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1489624037758861415, messageId=1497803443287625800)

The gateway process itself remains active, but Discord inbound work appears to get stuck long enough to hit the 30 minute default timeout. Around the same periods I also see gateway/local worker health symptoms like websocket handshake timeouts, subagent announce timeouts, session locks, qmd timeouts, and context overflow recovery.
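To see which stretch of the journal each timeout covers, the two log lines above can be parsed and the approximate start of each stuck run computed. This is a minimal sketch: the regular expression is written against the exact line format quoted in this report, and other OpenClaw versions may format the line differently.

```python
import re
from datetime import datetime, timedelta

# Matches the timeout line format quoted in this report (an assumption
# about the format in general, not a documented OpenClaw contract).
TIMEOUT_RE = re.compile(
    r"^(?P<ts>\S+) \[discord\] inbound worker timed out after "
    r"(?P<seconds>\d+) seconds "
    r"\(channelId=(?P<channel>\d+), messageId=(?P<message>\d+)\)"
)

def parse_timeouts(lines):
    """Return one dict per inbound-worker timeout line found in `lines`."""
    events = []
    for line in lines:
        m = TIMEOUT_RE.match(line.strip())
        if m is None:
            continue
        events.append({
            "time": datetime.fromisoformat(m["ts"]),
            "seconds": int(m["seconds"]),
            "channel_id": m["channel"],
            "message_id": m["message"],
        })
    return events

SAMPLE = [
    "2026-04-26T10:04:19+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1497687704866000906, messageId=1497772780744347771)",
    "2026-04-26T12:06:09+08:00 [discord] inbound worker timed out after 1800 seconds (channelId=1489624037758861415, messageId=1497803443287625800)",
]

timeouts = parse_timeouts(SAMPLE)
for t in timeouts:
    # A run that timed out after N seconds started roughly N seconds earlier.
    started = t["time"] - timedelta(seconds=t["seconds"])
    print(t["channel_id"], "run started around", started.isoformat())
```

The computed start times (about 09:34 and 11:36 local time) mark where to begin reading the journal for each incident.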

What I expected

A Discord inbound message should either complete, fail with the underlying agent/model error, or surface enough diagnostic context to tell which internal worker/session is stuck.

If this timeout is expected behavior, it would help if the user-facing Discord reply included the agent/session/run id, or a clearer reason than just "Discord inbound worker timed out."

What actually happens

The Discord channel gets a generic timeout reply after 1800 seconds.

Nearby logs show related pressure/errors:

[ws] handshake timeout ... peer=127.0.0.1:...->127.0.0.1:18789
Subagent announce failed: Error: gateway timeout after 10000ms
[session-write-lock] releasing lock held for 66843ms / 71211ms / 97029ms
[memory] qmd embed failed ... timed out after 600000ms
[agent/embedded] [context-overflow-diag] ... Context overflow: estimated context size exceeds safe threshold during tool loop.

There were also Discord gateway reconnect/session churn events in the same general window:

[discord] gateway error: Error: socket hang up
[discord] gateway: Gateway websocket closed: 1006
[discord] gateway: Gateway reconnect scheduled ... (invalid-session, resume=false)
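To get a rough picture of which of these symptoms dominates, journal lines can be bucketed by pattern and counted. The sketch below uses substrings taken from the excerpts above; the category names are labels invented for this example and are not OpenClaw terms.

```python
import re
from collections import Counter

# Patterns come from the log excerpts in this report. The keys are
# hypothetical labels chosen for this sketch.
SYMPTOMS = {
    "ws_handshake_timeout": re.compile(r"\[ws\] handshake timeout"),
    "subagent_announce_timeout": re.compile(r"Subagent announce failed: .*gateway timeout"),
    "session_lock_held": re.compile(r"\[session-write-lock\] releasing lock held for"),
    "qmd_embed_timeout": re.compile(r"\[memory\] qmd embed failed"),
    "context_overflow": re.compile(r"\[context-overflow-diag\]"),
    "discord_gateway_churn": re.compile(r"\[discord\] gateway"),
}

def classify(line):
    """Return the first matching symptom label, or None if nothing matches."""
    for name, pattern in SYMPTOMS.items():
        if pattern.search(line):
            return name
    return None

def tally(lines):
    """Count how many lines fall into each symptom category."""
    return Counter(c for c in map(classify, lines) if c is not None)

SAMPLE_LINES = [
    "[ws] handshake timeout ... peer=127.0.0.1:...->127.0.0.1:18789",
    "Subagent announce failed: Error: gateway timeout after 10000ms",
    "[session-write-lock] releasing lock held for 66843ms / 71211ms / 97029ms",
    "[memory] qmd embed failed ... timed out after 600000ms",
    "[agent/embedded] [context-overflow-diag] ... Context overflow: estimated context size exceeds safe threshold during tool loop.",
    "[discord] gateway error: Error: socket hang up",
    "[discord] gateway: Gateway websocket closed: 1006",
    "[discord] gateway: Gateway reconnect scheduled ... (invalid-session, resume=false)",
]

counts = tally(SAMPLE_LINES)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Run against a full journal export rather than these excerpts, the counts give a first hint about which subsystem to inspect.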

System/config context

OpenClaw: 2026.4.24 (46d2415)
OS: LMDE 6 / Debian kernel 6.1.0-44-amd64
Node: v24.15.0
npm: 11.12.1
Codex CLI: 0.120.0
Running as user systemd service: openclaw-gateway.service
Gateway mode: local loopback, port 18789
Discord enabled with multiple accounts/bots
Discord healthMonitor.enabled=false
Discord threadBindings.enabled=true
Discord threadBindings.spawnSubagentSessions=true
Discord threadBindings.spawnAcpSessions=true
Agent defaults:
- contextTokens=120000
- primary model openai-codex/gpt-5.5
- timeoutSeconds=3600
- subagents.maxConcurrent=5
- subagents.maxChildrenPerAgent=5
- subagents.announceTimeoutMs=300000
- compaction reserve floor 24000
Discord inbound worker timeout appears to be using the default 1800000ms / 1800s; I did not find an explicit per-account channels.discord.accounts.<id>.inboundWorker.runTimeoutMs override in my config.
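The lookup described above can be sketched as follows, assuming the configuration is a nested mapping that mirrors the dotted key path. The account id "bot-a" is a placeholder, and the comparison with the agent timeoutSeconds=3600 is plain arithmetic on the values listed here, not a statement about how OpenClaw combines the two timeouts.

```python
DEFAULT_RUN_TIMEOUT_MS = 1_800_000  # default quoted in this report (1800 s)

def effective_run_timeout(config, account_id):
    """Return (timeout_ms, source) for one Discord account.

    Assumes `config` is a nested dict mirroring the key path
    channels.discord.accounts.<id>.inboundWorker.runTimeoutMs.
    """
    node = config
    for key in ("channels", "discord", "accounts", account_id, "inboundWorker"):
        node = node.get(key) if isinstance(node, dict) else None
        if node is None:
            return DEFAULT_RUN_TIMEOUT_MS, "default"
    value = node.get("runTimeoutMs") if isinstance(node, dict) else None
    if value is None:
        return DEFAULT_RUN_TIMEOUT_MS, "default"
    return int(value), "account override"

# Hypothetical config with an account that sets no override.
config = {"channels": {"discord": {"accounts": {"bot-a": {}}}}}
timeout_ms, source = effective_run_timeout(config, "bot-a")
print(timeout_ms // 1000, "seconds from", source)

AGENT_TIMEOUT_SECONDS = 3600  # agent default listed in this report
if timeout_ms / 1000 < AGENT_TIMEOUT_SECONDS:
    print("the inbound worker limit is shorter than the agent timeout")
```

With the values in this report, the 1800 s inbound worker limit is half the 3600 s agent timeout, so a run that is still within its agent budget could already be past the inbound worker limit.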

At the time of inspection the service was still running but using substantial resources:

Tasks: 80
Memory: 6.5G
CPU: 3d+ accumulated

Why I think this might be an OpenClaw issue

The timeout itself is documented/configured, but in practice it seems to be acting as the only visible failure mode for several possible internal stalls:

- queued Discord inbound run stuck behind session locks
- qmd embed/search/update timeouts
- subagent announce timeouts
- local gateway websocket handshake timeouts
- context overflow recovery taking a long time or looping

It would be useful if the inbound worker timeout carried the underlying run/session state into the Discord error reply and logs, or if the queue could cancel/unblock the stuck worker more cleanly before the full 1800s elapses.
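One way to picture earlier cancellation is a per-stage budget: each step of a run gets its own deadline, and the first step to exceed it is cancelled and named in the error. The sketch below is a generic asyncio illustration, not OpenClaw code; the stage names and the tiny budgets are made up for the demo.

```python
import asyncio

class StageTimeout(Exception):
    """Raised when one stage of a run exceeds its own time budget."""
    def __init__(self, stage, budget_s):
        super().__init__(f"stage {stage!r} exceeded {budget_s}s")
        self.stage = stage
        self.budget_s = budget_s

async def run_stage(stage, coro, budget_s):
    """Await `coro`, cancelling it if it runs longer than `budget_s` seconds."""
    try:
        return await asyncio.wait_for(coro, timeout=budget_s)
    except asyncio.TimeoutError:
        raise StageTimeout(stage, budget_s) from None

async def handle_inbound(stages):
    """Run (name, coroutine_factory, budget_s) stages in order."""
    for name, factory, budget_s in stages:
        await run_stage(name, factory(), budget_s)
    return "done"

async def demo():
    async def quick():
        await asyncio.sleep(0)

    async def stuck():
        await asyncio.sleep(10)  # stands in for a stalled dependency

    # Hypothetical stages; budgets are tiny so the demo finishes quickly.
    stages = [
        ("session lock", quick, 0.05),
        ("qmd embed", stuck, 0.05),
        ("model", quick, 0.05),
    ]
    try:
        return await handle_inbound(stages)
    except StageTimeout as exc:
        return f"timed out in {exc.stage}"

result = asyncio.run(demo())
print(result)
```

The point of the design is that the failure names the stalled stage and surfaces after that stage's budget, instead of after the full run timeout.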

Possible improvement

When the inbound worker times out, include something like:

- account id / agent id
- session key
- run id
- queue depth
- whether the agent was waiting on model, tool, qmd, session lock, or gateway connect
- whether the timeout came from default inboundWorker.runTimeoutMs or an explicit account override

That would make this much easier to diagnose from Discord without digging through journal logs.
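As an illustration of what such a reply could look like, the fields listed above can be gathered into one record and rendered as a single line. Everything here is hypothetical: the field names, the message layout, and the sample values are placeholders and do not come from OpenClaw.

```python
from dataclasses import dataclass

@dataclass
class TimeoutDiagnostics:
    account_id: str
    agent_id: str
    session_key: str
    run_id: str
    queue_depth: int
    waiting_on: str       # "model", "tool", "qmd", "session lock", or "gateway connect"
    run_timeout_ms: int
    timeout_source: str   # "default" or "account override"

def format_timeout_reply(d):
    """Render the diagnostics as one line suitable for a Discord reply or a log."""
    seconds = d.run_timeout_ms // 1000
    return (
        f"Discord inbound worker timed out after {seconds}s "
        f"({d.timeout_source} inboundWorker.runTimeoutMs). "
        f"account={d.account_id} agent={d.agent_id} "
        f"session={d.session_key} run={d.run_id} "
        f"queueDepth={d.queue_depth} waitingOn={d.waiting_on}"
    )

# Placeholder values for illustration only.
reply = format_timeout_reply(TimeoutDiagnostics(
    account_id="bot-a",
    agent_id="main",
    session_key="example-session",
    run_id="example-run",
    queue_depth=3,
    waiting_on="session lock",
    run_timeout_ms=1_800_000,
    timeout_source="default",
))
print(reply)
```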

extent analysis

TL;DR

The Discord inbound worker timeout may be caused by internal stalls, and including more diagnostic context in the error reply and logs could help identify the root cause.

Guidance

Review the system logs for patterns or correlations between the Discord inbound worker timeouts and other errors, such as session locks, qmd timeouts, and gateway reconnects.
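Consider raising the per-account channels.discord.accounts.<id>.inboundWorker.runTimeoutMs value above its 1800000 ms default, or adding queue management that detects and cancels stuck workers. Raising the timeout only delays the error; it does not remove the underlying stall.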
Investigate the resource usage of the OpenClaw service, as high CPU and memory usage may be contributing to the timeouts.
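Examine the Discord account configuration, particularly threadBindings.spawnSubagentSessions and threadBindings.spawnAcpSessions (both enabled here), and check whether the extra sessions they spawn add to the load on the local gateway.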
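The correlation step in the first guidance item can be sketched as a window filter: keep only the events logged while the timed-out run was in flight. The timeout timestamp and the 1800 s duration come from the report; the other event timestamps and labels below are invented for illustration, because the quoted excerpts omit them.

```python
from datetime import datetime, timedelta

def events_during_run(timeout_at, run_seconds, events):
    """Return the (timestamp, label) events logged while the run was in flight."""
    start = timeout_at - timedelta(seconds=run_seconds)
    return [(ts, label) for ts, label in events if start <= ts <= timeout_at]

def at(clock):
    """Build a timestamp on the report's date and UTC offset from HH:MM:SS."""
    return datetime.fromisoformat(f"2026-04-26T{clock}+08:00")

timeout_at = at("10:04:19")  # first timeout in the report

# Invented timestamps and labels, for illustration only.
events = [
    (at("09:20:00"), "session lock held"),
    (at("09:40:00"), "qmd embed timeout"),
    (at("10:01:00"), "ws handshake timeout"),
    (at("10:30:00"), "discord gateway reconnect"),
]

overlapping = events_during_run(timeout_at, 1800, events)
for ts, label in overlapping:
    print(ts.isoformat(), label)
```

Events that repeatedly fall inside the window across several timeouts are better candidates for the underlying stall than events that show up only once.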

Example

No specific code example is provided, as the issue appears to be related to configuration and system resource management rather than a specific code snippet.

Notes

The issue may be related to the complex interactions between the OpenClaw service, Discord gateway, and various timeouts. Further investigation is needed to determine the root cause.

Recommendation

As a workaround, increase inboundWorker.runTimeoutMs or add queue management that handles stuck workers. The current timeout may be too aggressive for long runs, and the generic timeout message may be masking the underlying issues.
