openclaw: Gateway silently drops agent spawns when log file hits maxFileBytes cap

openclaw/openclaw#61440 · Fetched 2026-04-08 02:58:34


Summary

When the gateway log file reaches the maxFileBytes cap (default 512MB, at /tmp/openclaw/openclaw-<date>.log), the gateway emits [openclaw] log file size cap reached; suppressing writes and continues accepting WebSocket RPC calls (chat.send). However, agent sessions dispatched via chat.send in this state never spawn: the gateway returns a successful response (no error), but the agent process never starts. Because no error is returned, the caller cannot tell from the response status that the dispatch failed.

Reproduction

  1. Run a gateway with the default log configuration long enough for the log to reach 512MB (heavy agent traffic, ~6-12 hours)
  2. Observe [openclaw] log file size cap reached; suppressing writes in stderr
  3. Send a chat.send RPC via WebSocket to dispatch an agent turn
  4. Gateway responds with {ok: true} and an empty runId
  5. Agent session is never started — no process spawned, no output produced
  6. The dispatching system (Atlas HQ in our case) marks the job as "running" but the agent never checks in
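Step 4 suggests one signal a caller could act on: the response reports success but the runId is empty. The sketch below is a minimal illustration of that check, not the gateway's API. The helper names are hypothetical, and the assumption that the field is serialized as "runId":"<value>" on a single line is ours; a real client should use a proper JSON parser.

```shell
# Hypothetical helper: pull the runId value out of a one-line JSON response.
# Assumes the field looks like "runId":"<value>"; real responses may differ.
extract_run_id() {
  printf '%s\n' "$1" |
    sed -n 's/.*"runId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeeds only when the response carries a non-empty runId.
# A missing or empty runId is treated as "agent probably did not start".
dispatch_looks_started() {
  run_id=$(extract_run_id "$1")
  [ -n "$run_id" ]
}
```

A dispatcher could call dispatch_looks_started on each chat.send response and retry or raise an alert when it fails, instead of marking the job as "running".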

Expected behavior

Either:

  • Option A: Gateway should return an error from chat.send when it cannot guarantee agent spawn due to log suppression, so callers can retry or fail gracefully
  • Option B: Log suppression should not affect agent spawning — only log writes should be suppressed, not operational behavior
  • Option C: Gateway should auto-rotate logs (e.g. rename the capped file to .1 and start a fresh one) instead of entering a degraded state
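The rotation described in Option C can be sketched as an external script: rename the capped file to .1 and start a fresh one. This is an illustration under stated assumptions, not the gateway's behaviour. The function name is hypothetical, and a process that already holds the old file open keeps writing to the renamed file, so this alone is not known to bring a running gateway out of its suppressed state.

```shell
# Hypothetical helper: rotate a log file once it reaches a byte cap.
# Usage: rotate_if_capped <log-file> <max-bytes>
rotate_if_capped() {
  log_file=$1
  max_bytes=$2
  [ -f "$log_file" ] || return 0

  # wc -c pads its output on some systems, so strip whitespace.
  size=$(wc -c < "$log_file" | tr -d '[:space:]')

  if [ "$size" -ge "$max_bytes" ]; then
    mv -f "$log_file" "$log_file.1"   # keep one previous generation
    : > "$log_file"                   # start a fresh, empty log
    echo rotated
  else
    echo kept
  fi
}
```

For example, rotate_if_capped /tmp/openclaw/openclaw-2026-04-05.log 524288000 would use the path and cap listed under Environment.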

Workaround

Truncate the log file and restart the gateway:

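LOG_FILE="/tmp/openclaw/openclaw-2026-04-05.log"  # substitute the current log's date
: > "$LOG_FILE"            # truncate the capped log in place
kill -TERM "$GATEWAY_PID"  # GATEWAY_PID must hold the running gateway's PID
openclaw gateway start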

Environment

  • OpenClaw gateway (latest, as of 2026-04-05)
  • macOS Darwin 24.6.0
  • Log path: /tmp/openclaw/openclaw-2026-04-05.log
  • maxFileBytes: 524288000 bytes (500 MiB, referred to as 512MB in this report)

Impact

This caused multiple QA agent dispatches to silently fail over several hours in our Atlas HQ deployment. The watchdog eventually marked them as stalled after 10-minute timeouts, but the root cause was not surfaced to operators.

Extent analysis

TL;DR

Truncating the log file and restarting the gateway is a viable workaround to resolve the issue of agent sessions failing to spawn when the log file size cap is reached.

Guidance

  • To verify the issue, check the gateway's stderr for the message [openclaw] log file size cap reached; suppressing writes (the log file itself is no longer being written at that point), and confirm that agent sessions are not being spawned despite successful chat.send responses.
  • Consider implementing log rotation to prevent the log file from reaching the size cap, such as renaming the capped file to .1 and starting a fresh one.
  • Review the gateway configuration to determine whether maxFileBytes can be raised. A larger cap only makes the condition less frequent; it does not remove it.
  • To mitigate the issue, add caller-side handling that detects and retries failed agent dispatches, for example by treating a successful response with an empty runId (Reproduction, step 4) as a failure.
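The first and third points above can be checked from a small monitoring script. The sketch below assumes the gateway's stderr is being captured to a file, which the report does not state. The function names are hypothetical; the message text is the one quoted in the report.

```shell
# Message quoted in the report, matched as a fixed string.
CAP_MESSAGE='[openclaw] log file size cap reached; suppressing writes'

# Succeeds if the captured stderr file contains the cap message.
# Usage: gateway_hit_log_cap <captured-stderr-file>
gateway_hit_log_cap() {
  grep -Fq "$CAP_MESSAGE" "$1"
}

# Prints how full the log is, as an integer percentage of the cap.
# Usage: log_usage_percent <log-file> <max-bytes>
log_usage_percent() {
  size=$(wc -c < "$1" | tr -d '[:space:]')
  echo $(( $size * 100 / $2 ))
}
```

An operator alert at, say, 90 percent would surface the condition before dispatches start failing; the threshold is a local policy choice.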

Notes

The trigger is the log file reaching its size cap, after which the gateway suppresses log writes and agent sessions stop spawning. The report does not establish why suppressing log writes also prevents spawns. The workaround restores service but is not a long-term solution: the log will fill again under the same traffic. Log rotation or a larger maxFileBytes setting may be needed to keep the issue from recurring.

Recommendation

Apply the workaround of truncating the log file and restarting the gateway, as it provides an immediate fix. In addition, investigate log rotation or a larger maxFileBytes setting to prevent the issue from recurring.
