openclaw - ✅(Solved) Fix Delivery recovery retries permanent errors (400: message too long) indefinitely on restart [2 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#61680 · Fetched 2026-04-08 02:56:02
Comments: 0 · Participants: 1 · Timeline: 4 · Reactions: 0
Timeline (top): cross-referenced ×2, referenced ×2

Error Message

On every gateway restart, delivery recovery picks up entries from ~/.openclaw/delivery-queue/ and retries them — even when the error is permanent (e.g., Telegram's 400: Bad Request: message is too long). In the delivery recovery loop, check the HTTP status code. If it's a 4xx (client error), treat it as terminal and move to failed/ immediately. Only retry on 5xx or network errors.

Fix Action

Workaround

Manually move the stuck .json files from ~/.openclaw/delivery-queue/ to ~/.openclaw/delivery-queue/failed/.

PR fix notes

PR #61689: fix(delivery): treat HTTP 4xx errors as permanent in delivery recovery

Description (problem / solution / changelog)

Summary

Fixes #61680 - Delivery recovery retries permanent errors (400: message too long) indefinitely on restart.

Problem

HTTP 4xx errors (400, 413) were not recognized as permanent errors by isPermanentDeliveryError(), so entries with these errors were retried indefinitely on every gateway restart (up to MAX_RETRIES=5 per startup cycle).

Example error: "Call to sendMessage failed! (400: Bad Request: message is too long)"

Root Cause

isPermanentDeliveryError() only checked string-pattern allowlists (PERMANENT_ERROR_PATTERNS) but did not detect HTTP status codes in error messages. HTTP 4xx client errors are permanent — the request is malformed and will never succeed on retry.

Fix

Extend isPermanentDeliveryError() in delivery-queue-recovery.ts to detect HTTP 4xx status codes from error messages and treat them as permanent (move to failed/ immediately):

// HTTP 4xx client errors are permanent — the request is malformed and will never
// succeed on retry. Examples: 400 (bad request), 413 (payload too large), 429 (rate limit).
// 5xx errors are transient and should be retried.
if (/\(4\d{2}[^)]*\)/.test(error)) {
  return true;
}
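
One caveat about the check quoted above: as written, the pattern also matches 408 and 429, and the comment lists 429 as an example of a permanent error, yet a rate-limit response usually clears after a delay and is commonly retried. A minimal sketch of a narrower check follows, assuming status codes appear in parentheses as in the example error. `isPermanentHttpError` and `RETRYABLE_4XX` are hypothetical names and are not part of the PR.

```typescript
// Hypothetical refinement of the check quoted above; not the code from the PR.
// Assumption: the status code appears in parentheses, as in "(400: Bad Request: ...)".
const RETRYABLE_4XX = new Set<number>([408, 429]); // request timeout, rate limit

function isPermanentHttpError(error: string): boolean {
  const match = /\((4\d{2})[^)]*\)/.exec(error);
  if (!match) {
    // No parenthesised 4xx code found; leave the decision to other checks.
    return false;
  }
  // Every 4xx code is treated as permanent except the retryable ones listed above.
  return !RETRYABLE_4XX.has(Number(match[1]));
}
```

Whether 408 and 429 belong in the retryable set depends on how the channel reports them, so the set is a starting point rather than a fixed list.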

Testing

Added 6 new test cases:

  • HTTP 400 Bad Request → moves to failed/ immediately (permanent)
  • HTTP 413 Payload Too Large → moves to failed/ immediately (permanent)
  • HTTP 500 Internal Server Error → stays in queue for retry (transient)
  • HTTP 502/503 → stays in queue (transient)
  • Unit tests for isPermanentDeliveryError() with various 4xx/5xx strings

Files Changed

  • src/infra/outbound/delivery-queue-recovery.ts
  • src/infra/outbound/delivery-queue.recovery.test.ts

Changed files

  • src/infra/outbound/delivery-queue-recovery.ts (modified, +10/-1)
  • src/infra/outbound/delivery-queue.recovery.test.ts (modified, +75/-0)

PR #61699: fix(delivery): treat HTTP 4xx client errors as permanent in recovery

Description (problem / solution / changelog)

Problem

On every gateway restart, delivery recovery picks up entries from the delivery queue and retries them — even when the error is permanent (e.g., Telegram's 400: Bad Request: message is too long). This causes the same failed message to be re-sent to the user on every restart, indefinitely.

Fix

Added four new patterns to PERMANENT_ERROR_PATTERNS in delivery-queue-recovery.ts:

  • Grammy-style 4xx errors: /failed!\s*\(4\d{2}:/i — catches Call to sendMessage failed! (400: ...)
  • Generic 400 Bad Request: /\b4\d{2}\b.*\bbad request\b/i
  • Message too long: /message.*(is\s+)?too\s+long/i — channel-agnostic
  • Entity too large: /request entity too large|payload too large/i — 413-style errors

5xx and network errors continue to be retried with backoff as before.
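
To make the effect of these patterns concrete, here is a small sketch. The four regular expressions are copied from the list above; the array layout and the body of `isPermanentDeliveryError()` are assumptions about how they are used, since the actual source is not shown here.

```typescript
// Sketch only: the patterns are quoted from the PR description; the surrounding
// array and the function body are assumptions, not the real implementation.
const PERMANENT_ERROR_PATTERNS: readonly RegExp[] = [
  /failed!\s*\(4\d{2}:/i, // grammY-style 4xx errors
  /\b4\d{2}\b.*\bbad request\b/i, // generic 400 Bad Request
  /message.*(is\s+)?too\s+long/i, // message too long, channel-agnostic
  /request entity too large|payload too large/i, // 413-style errors
];

function isPermanentDeliveryError(error: string): boolean {
  // None of the patterns use the g flag, so test() keeps no state between calls.
  return PERMANENT_ERROR_PATTERNS.some((pattern) => pattern.test(error));
}
```

With these patterns, the example error from the issue matches the first three entries, while a message such as `Call to sendMessage failed! (502: Bad Gateway)` matches none and would stay in the queue.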

Tests

  • Added recovery integration tests for Telegram 400 (message too long), 413 (entity too large), and verified 5xx errors still retry
  • Added unit tests for isPermanentDeliveryError() covering new patterns and confirming 5xx is not matched

All 45 tests pass.

Closes #61680

Changed files

  • src/infra/outbound/delivery-queue-recovery.ts (modified, +5/-0)
  • src/infra/outbound/delivery-queue.policy.test.ts (modified, +10/-0)
  • src/infra/outbound/delivery-queue.recovery.test.ts (modified, +57/-0)
RAW_BUFFER

Bug

On every gateway restart, delivery recovery picks up entries from ~/.openclaw/delivery-queue/ and retries them — even when the error is permanent (e.g., Telegram's 400: Bad Request: message is too long).

Expected Behavior

Permanent errors (4xx client errors like 400, 413) should be moved to failed/ after the first attempt or at most a small retry count. Only transient errors (5xx, network timeouts) should be retried across restarts.

Actual Behavior

Entries with retryCount: 5 and lastError: 'Call to sendMessage failed! (400: Bad Request: message is too long)' remain in the active delivery queue and are retried on every restart, indefinitely.

This causes the same (partial) message to be re-sent to the user every time OpenClaw restarts, which is confusing.

Reproduction

  1. Have an ACP/subagent session produce a result that exceeds Telegram's 4096-char limit
  2. The delivery fails with 400
  3. Restart the gateway — the delivery is retried and fails again
  4. Repeat on every restart

Workaround

Manually move the stuck .json files from ~/.openclaw/delivery-queue/ to ~/.openclaw/delivery-queue/failed/.
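
For queues with many stuck entries, the manual step could be scripted. The following is a rough sketch for Node.js, assuming each queue file is a JSON object with a `lastError` string as shown in this issue; `moveStuckEntries` is a hypothetical helper, not an OpenClaw command.

```typescript
import { mkdirSync, readdirSync, readFileSync, renameSync } from "node:fs";
import { join } from "node:path";

// Moves queue entries whose lastError carries a parenthesised 4xx code into failed/.
// Returns the names of the files that were moved.
function moveStuckEntries(queueDir: string): string[] {
  const failedDir = join(queueDir, "failed");
  mkdirSync(failedDir, { recursive: true });
  const moved: string[] = [];
  for (const name of readdirSync(queueDir)) {
    if (!name.endsWith(".json")) continue; // also skips the failed/ directory
    const path = join(queueDir, name);
    let lastError: unknown;
    try {
      lastError = JSON.parse(readFileSync(path, "utf8")).lastError;
    } catch {
      continue; // unreadable or malformed file: leave it for manual inspection
    }
    if (typeof lastError === "string" && /\(4\d{2}[^)]*\)/.test(lastError)) {
      renameSync(path, join(failedDir, name));
      moved.push(name);
    }
  }
  return moved;
}
```

It would be called with the queue path from the workaround, for example `moveStuckEntries(join(homedir(), ".openclaw", "delivery-queue"))` with `homedir` imported from `node:os`. Running it while the gateway is stopped is probably the safer choice, so that recovery does not read the files mid-move.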

Suggested Fix

In the delivery recovery loop, check the HTTP status code. If it's a 4xx (client error), treat it as terminal and move to failed/ immediately. Only retry on 5xx or network errors.

Environment

  • OpenClaw 2026.4.5
  • Channel: Telegram
  • macOS 26.3 (arm64)

extent analysis

TL;DR

Modify the delivery recovery loop to move entries with 4xx error codes to the failed/ directory immediately, rather than retrying them.

Guidance

  • Check the HTTP status code of the error in the delivery recovery loop to determine whether it's a permanent (4xx) or transient (5xx) error.
  • Implement a condition to move entries with 4xx error codes to the failed/ directory after the first attempt, preventing indefinite retries.
  • Consider adding a retry count limit for transient errors to prevent excessive retries.
  • Verify the fix by reproducing the issue and checking that entries with 4xx error codes are moved to the failed/ directory after the first attempt.
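
The guidance above can be sketched as a single decision function. This is an illustration under stated assumptions: `QueueEntry` contains only the two fields visible in the issue, `decideRecoveryAction` and `RecoveryAction` are hypothetical names, and `MAX_RETRIES` is treated here as a lifetime cap even though the PR text describes it as a per-startup limit.

```typescript
// Hypothetical entry shape: retryCount and lastError are the fields shown in the
// issue; everything else about the real queue entry is unknown here.
interface QueueEntry {
  retryCount: number;
  lastError?: string;
}

type RecoveryAction = "retry" | "move-to-failed";

const MAX_RETRIES = 5; // value quoted in PR #61689; assumed here to be a lifetime cap

function decideRecoveryAction(
  entry: QueueEntry,
  isPermanent: (error: string) => boolean,
): RecoveryAction {
  if (entry.lastError !== undefined && isPermanent(entry.lastError)) {
    return "move-to-failed"; // permanent error: retrying cannot succeed
  }
  if (entry.retryCount >= MAX_RETRIES) {
    return "move-to-failed"; // transient error, but the retry budget is spent
  }
  return "retry";
}
```

Passing the permanence check in as a parameter keeps the sketch independent of whichever detection strategy is used (status-code regex or pattern list).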

Example

// Example entry with 4xx error code
{
  "retryCount": 5,
  "lastError": "Call to sendMessage failed! (400: Bad Request: message is too long)"
}

In this example, the entry should be moved to the failed/ directory immediately, rather than being retried.

Notes

The suggested fix assumes that the delivery recovery loop has access to the HTTP status code of the error. If this is not the case, additional modifications may be necessary to retrieve or store the status code.
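
If the status code is not available at recovery time, one option is to store it alongside the error text when a delivery fails. The sketch below illustrates that idea only; the `lastStatus` field, the `DeliveryFailure` shape, and both function names are hypothetical and do not come from the OpenClaw source.

```typescript
// Hypothetical shapes: only retryCount and lastError appear in the issue.
interface QueueEntry {
  retryCount: number;
  lastError?: string;
  lastStatus?: number; // hypothetical: HTTP status of the last failure, if known
}

interface DeliveryFailure {
  message: string;
  status?: number; // assumed to be supplied by the channel client when available
}

function recordFailure(entry: QueueEntry, failure: DeliveryFailure): QueueEntry {
  const updated: QueueEntry = {
    ...entry,
    retryCount: entry.retryCount + 1,
    lastError: failure.message,
  };
  if (failure.status !== undefined) {
    updated.lastStatus = failure.status;
  } else {
    delete updated.lastStatus; // do not carry over a stale status from an earlier attempt
  }
  return updated;
}

function hasPermanentStatus(entry: QueueEntry): boolean {
  const status = entry.lastStatus;
  return status !== undefined && status >= 400 && status < 500;
}
```

Storing a numeric status avoids parsing it back out of a free-form message, at the cost of a change to the on-disk entry format.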

Recommendation

Apply the workaround of manually moving stuck .json files to the failed/ directory until the suggested fix can be implemented, as this will prevent indefinite retries and reduce user confusion.
