openclaw - ✅ (Solved) Fix Bug: gateway auth rate limiter entries map has no hard cap under unique-IP flood [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#77986 (fetched 2026-05-06 06:18:13)
Comments: 1 · Participants: 2 · Timeline: 2 · Reactions: 2
Timeline (top): commented ×1, cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #77987: fix(gateway): cap auth-rate-limit entries map under unique-IP flood

Description (problem / solution / changelog)

Closes #77986

Summary

src/gateway/auth-rate-limit.ts keeps an in-memory entries: Map<scope:ip, RateLimitEntry> of failed auth attempts. There is a periodic setInterval(prune) (default 60s) that walks the Map and drops entries with empty attempts, but no hard size cap. Between prune ticks, an unauthenticated attacker rotating source IPs can push one fresh entry per IP, scaling memory use with attacker request rate × pruneIntervalMs.

This change adds:

  • maxEntries config (default 10,000) on RateLimitConfig.
  • An enforceMaxEntries pass on new inserts in recordFailure that drops oldest entries by Map insertion order.
  • Locked-out entries are skipped during eviction so an attacker cannot escape a lockout by flooding fresh failures to evict their own locked entry — the lockout state is the whole point of the rate limiter.

The existing periodic prune, sliding window, lockout, and reset semantics are unchanged.
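
The cap-plus-eviction behavior listed above can be sketched roughly as follows. This is a minimal illustration, not the code in src/gateway/auth-rate-limit.ts: the RateLimitEntry shape, the lockedUntil field, the isLocked helper, and the standalone function signatures are all assumptions. Only the names maxEntries, enforceMaxEntries, and recordFailure come from the PR description.

```typescript
// Assumed entry shape; the real RateLimitEntry may differ.
interface RateLimitEntry {
  attempts: number[];   // timestamps (ms) of recent failures
  lockedUntil?: number; // assumed field: epoch ms at which the lockout expires
}

// Hypothetical helper: an entry counts as locked while lockedUntil is in the future.
const isLocked = (e: RateLimitEntry, now: number): boolean =>
  e.lockedUntil !== undefined && e.lockedUntil > now;

// Drop the oldest non-locked entries until there is room for one new key.
function enforceMaxEntries(
  entries: Map<string, RateLimitEntry>,
  maxEntries: number,
  now: number,
): void {
  if (entries.size < maxEntries) return;
  let excess = entries.size - maxEntries + 1;
  // Map iteration follows insertion order, so the first keys are the oldest.
  for (const [key, entry] of entries) {
    if (excess <= 0) break;
    if (isLocked(entry, now)) continue; // never evict an active lockout
    entries.delete(key);
    excess--;
  }
}

// Simplified recordFailure: only the insert path is shown.
function recordFailure(
  entries: Map<string, RateLimitEntry>,
  key: string,
  maxEntries: number,
  now: number = Date.now(),
): void {
  let entry = entries.get(key);
  if (!entry) {
    enforceMaxEntries(entries, maxEntries, now); // runs only on new inserts
    entry = { attempts: [] };
    entries.set(key, entry);
  }
  entry.attempts.push(now);
}
```

In this sketch, if every tracked entry is locked the map can temporarily exceed the cap, because nothing is eligible for eviction. How the real module treats that case is not stated in the PR description.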

Verification

  • pnpm test src/gateway/auth-rate-limit.test.ts → 24/24 pass including the two new cases (caps entries via FIFO drop when a flood of unique IPs exceeds maxEntries, preserves locked-out entries during flood eviction).
  • pnpm exec oxfmt --check --threads=1 src/gateway/auth-rate-limit.ts src/gateway/auth-rate-limit.test.ts CHANGELOG.md → clean.
  • Live tsx runtime proof included below in Real behavior proof.

Real behavior proof

  • Behavior addressed: the entries Map had no hard cap; under sustained unique-IP attack memory use grew with attacker bandwidth × pruneIntervalMs (default 60s) before the next prune could compact it.
  • Real environment tested: local Node v22 runtime against the patched auth-rate-limit.ts module via pnpm exec tsx. No mocks, no test framework. Drove the real recordFailure / check API with synthetic IPs.
  • Exact steps or command run after this patch: pnpm exec tsx /tmp/auth-rate-limit-cap-proof.mts. Two scenarios: (1) 50,000 unique-IP failures with maxEntries=1000, and (2) two pre-locked entries plus a 200-IP flood with maxEntries=4 to verify locked entries survive.
  • Evidence after fix:
=== unique-IP flood (cap=1000, no auto-prune) ===
  recorded      0 failures  size=1
  recorded   5000 failures  size=1000
  recorded  10000 failures  size=1000
  recorded  15000 failures  size=1000
  recorded  20000 failures  size=1000
  recorded  25000 failures  size=1000
  recorded  30000 failures  size=1000
  recorded  35000 failures  size=1000
  recorded  40000 failures  size=1000
  recorded  45000 failures  size=1000
final size after 50k unique-IP failures: 1000

=== lockout-preservation under flood (cap=4) ===
pre-flood size: 2, 10.0.50.1 locked: true
post-flood size: 4, cap=4
10.0.50.1 still locked: true
10.0.50.2 still locked: true
  • Observed result after fix: cap holds at 1000 across 50k unique-IP failures (oldest non-locked entries evicted FIFO). With cap=4 and 2 pre-locked entries, a 200-IP flood pins size at 4 and both locked entries survive — verifying eviction preserves lockouts.
  • What was not tested: behavior against a real public-internet attack stream. The fix is a Map cap on a deterministic insert path; the runtime demo exercises the same recordFailure path real attackers would hit, just at higher rates than a unit test can.
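
The lockout-preservation scenario can be approximated without the real module. The sketch below is not the /tmp/auth-rate-limit-cap-proof.mts script from the PR. It uses a bare Map, a hypothetical insertCapped helper, an assumed lockedUntil field, and made-up flood addresses, and only shows that FIFO eviction that skips locked entries pins the size at the cap while the pre-locked keys survive.

```typescript
// Assumed entry shape, for illustration only.
type Entry = { attempts: number[]; lockedUntil?: number };

// Hypothetical helper: insert a failure, evicting the oldest non-locked entry when full.
function insertCapped(map: Map<string, Entry>, key: string, cap: number, now: number): void {
  if (!map.has(key) && map.size >= cap) {
    for (const [k, e] of map) {
      const locked = e.lockedUntil !== undefined && e.lockedUntil > now;
      if (!locked) {
        map.delete(k); // insertion order makes this the oldest evictable key
        break;
      }
    }
  }
  const entry = map.get(key) ?? { attempts: [] };
  entry.attempts.push(now);
  map.set(key, entry);
}

function runLockoutScenario(cap: number, floodSize: number) {
  const now = 0;
  const lockedIps = ["10.0.50.1", "10.0.50.2"];
  const map = new Map<string, Entry>();
  // Two entries already locked out for five minutes.
  for (const ip of lockedIps) {
    map.set(ip, { attempts: [now], lockedUntil: now + 5 * 60_000 });
  }
  // Flood with unique, made-up source addresses.
  for (let i = 0; i < floodSize; i++) {
    insertCapped(map, `198.51.100.${i}`, cap, now);
  }
  return { size: map.size, lockedSurvived: lockedIps.every((ip) => map.has(ip)) };
}

console.log(runLockoutScenario(4, 200));
```

With cap=4 and a 200-address flood, this sketch ends at size 4 with both locked keys still present, which mirrors the shape of the evidence above without reproducing the actual proof run.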

Notes for reviewer

  • The maxEntries cap, periodic prune, and sliding window compose: prune drops entries with empty windows, sliding window drops old timestamps, cap drops oldest non-locked entries on overflow. Each handles a different failure mode.
  • 10,000 default chosen to comfortably fit a real gateway's tracked-IP working set (any single openclaw deployment) while still bounding memory use to ~1MB worst-case for the attempt arrays.
  • Part of the memory-leak audit sweep following #77952 (Discord cache) and #77973 (agent-job cache); same shape on a different surface.
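
The ~1MB figure for the attempt arrays can be checked with simple arithmetic. The sketch below assumes 8 bytes per stored timestamp (a 64-bit double) and uses maxAttempts=10, as stated in the issue text. The attemptArrayBytes function is hypothetical, and the estimate ignores Map, key-string, and object overhead, so real usage will be somewhat higher.

```typescript
// Rough payload estimate for the attempts arrays only.
// bytesPerNumber = 8 is an assumption (one 64-bit double per timestamp).
function attemptArrayBytes(
  maxEntries: number,
  maxAttempts: number,
  bytesPerNumber: number = 8,
): number {
  return maxEntries * maxAttempts * bytesPerNumber;
}

const bytes = attemptArrayBytes(10_000, 10);
console.log(`${bytes} bytes ≈ ${(bytes / 1_000_000).toFixed(1)} MB`); // 800000 bytes ≈ 0.8 MB
```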

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/gateway/auth-rate-limit.test.ts (modified, +42/-0)
  • src/gateway/auth-rate-limit.ts (modified, +29/-0)

Problem

src/gateway/auth-rate-limit.ts keeps an in-memory entries: Map<scope:ip, RateLimitEntry> of failed auth attempts. There is a periodic setInterval(prune) (default 60s) that walks the Map and drops entries with empty attempts, but no hard size cap.

Between prune ticks, an unauthenticated attacker rotating source IPs (one failure per IP) populates one entry per IP. At sustained attack rates, memory use scales roughly with attacker request rate × pruneIntervalMs. At 10k req/s that is ~600k entries each holding a small attempts: number[] array (capped at maxAttempts=10) — survivable but a real defense-in-depth gap.
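
The growth estimate above is a straightforward product, sketched here for concreteness. The function names are hypothetical, and the model assumes one failed attempt per unique IP and no eviction between prune ticks. The capped variant ignores locked entries, which the fix never evicts.

```typescript
// Entries accumulated between two prune ticks when every request uses a fresh IP.
function entriesBetweenPrunes(requestsPerSecond: number, pruneIntervalMs: number): number {
  return requestsPerSecond * (pruneIntervalMs / 1000);
}

// The same estimate once a hard cap (maxEntries) is applied.
function cappedEntries(
  requestsPerSecond: number,
  pruneIntervalMs: number,
  maxEntries: number,
): number {
  return Math.min(entriesBetweenPrunes(requestsPerSecond, pruneIntervalMs), maxEntries);
}

console.log(entriesBetweenPrunes(10_000, 60_000));  // 600000
console.log(cappedEntries(10_000, 60_000, 10_000)); // 10000
```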

Locked-out entries (5min default lockout) are deliberately preserved across prune ticks. Eviction therefore needs to skip locked entries so an attacker cannot escape a lockout by flooding fresh failures to evict their own locked entry.

Tracking PR

Fix in #PR.

extent analysis

TL;DR

Implement a hard size cap for the entries Map to prevent uncontrolled memory growth during sustained attacks.

Guidance

  • Introduce a maximum size limit for the entries Map to prevent it from growing indefinitely.
  • When the limit is reached, consider evicting the oldest non-locked entry to make room for new ones.
  • Make the eviction pass skip locked-out entries, which the periodic prune already preserves across ticks, so a flood of fresh failures cannot evict an active lockout.
  • Consider adding a mechanism to dynamically adjust the prune interval based on the current attack rate to mitigate memory usage.

Example

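// Sketch of a size cap; the RateLimitEntry shape below is assumed, not taken from the module
interface RateLimitEntry {
  attempts: number[];
  lockedUntil?: number; // assumed field: epoch ms until which the key stays locked out
}

const MAX_ENTRIES = 10_000; // matches the default maxEntries in PR #77987
const entries = new Map<string, RateLimitEntry>();

// ...

if (entries.size >= MAX_ENTRIES) {
  const now = Date.now();
  // Map iterates in insertion order, so the first non-locked entry is the oldest one
  for (const [key, entry] of entries) {
    if (entry.lockedUntil === undefined || entry.lockedUntil <= now) {
      entries.delete(key); // evict without copying the whole Map into an array
      break;
    }
  }
}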

Notes

The ideal size cap value will depend on the specific system resources and performance requirements. It's essential to monitor memory usage and adjust the cap accordingly.

Recommendation

Introduce a hard size cap for the entries Map, as PR #77987 does. This bounds memory growth under a unique-IP flood immediately, without changing the existing prune, sliding window, or lockout behavior.
