hermes - Feature Request: Configurable per-minute rate limiting (RPM) for models to prevent 429 errors

Feature Description

Please add a configuration option to set a maximum Requests Per Minute (RPM) limit for specific providers. When the agent reaches this limit, it should automatically pause/sleep until the next minute window opens, rather than hitting the API and triggering a 429 Rate Limit Exceeded error.

Motivation

Users often route requests through custom API proxies or intermediary API pools that enforce their own aggressive RPM throttling (e.g., a custom 5 RPM limit on certain tiers). During intensive tasks involving multiple tool calls or long workflows, the agent easily hits HTTP 429 errors from these proxy services. After a 429, the agent usually retries with exponential backoff, but this can cause severe latency, or even complete failure or silent hanging, in long autonomous loops. A built-in throttle (e.g., model.rate_limit_rpm) would proactively keep the agent under the proxy's hard limits.

Proposed Solution

Add a configuration field in config.yaml, for example:

providers:
  custom_proxy:
    rate_limit_rpm: 5

The agent's HTTP client or request dispatcher would track the number of requests made within a rolling 60-second window. If a new request would exceed the limit, the client pauses until the window clears, then proceeds.
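
To make the rolling-window idea concrete, here is a minimal Python sketch of such a throttle. It is not hermes code: the class name `RollingWindowLimiter`, the `acquire` method, and the injectable `clock` and `sleep` parameters are all hypothetical names chosen for illustration, and the sketch assumes a single-threaded dispatcher (a real client would likely need a lock or an async variant).

```python
import time
from collections import deque
from typing import Callable, Deque


class RollingWindowLimiter:
    """Hypothetical throttle: at most `rpm` requests start in any rolling window."""

    def __init__(
        self,
        rpm: int,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if rpm < 1:
            raise ValueError("rpm must be >= 1")
        self.rpm = rpm
        self.window_s = window_s
        self._clock = clock
        self._sleep = sleep
        self._starts: Deque[float] = deque()  # start times of recent requests

    def acquire(self) -> float:
        """Block until a request slot is free; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget requests that have aged out of the rolling window.
            while self._starts and now - self._starts[0] >= self.window_s:
                self._starts.popleft()
            if len(self._starts) < self.rpm:
                self._starts.append(now)
                return waited
            # Window is full: sleep until the oldest request leaves it.
            delay = self.window_s - (now - self._starts[0])
            self._sleep(delay)
            waited += delay
```

Under these assumptions, the dispatcher would keep one limiter per provider (built from that provider's `rate_limit_rpm` value) and call `acquire()` immediately before each outgoing request. Reactive 429 backoff would still be worth keeping as a fallback, since the proxy's own window may not line up exactly with the client's.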

Alternatives Considered

  • Relying purely on exponential backoff after HTTP 429 responses (the current behavior). This is reactive and often leads to task disruption or proxy blocks.
