hermes - 💡(How to fix) Fix [Feature]: Support for Native Google / Vertex AI Provider (Bypass OpenRouter 402 Errors & Rate Limits) [3 comments, 3 participants]

hermes • 2026-04-19 17:48:36


GitHub stats

NousResearch/hermes-agent#12639•Fetched 2026-04-20 12:17:46

Timeline (top)

commented ×3, labeled ×1, subscribed ×1

Error Message

Error: HTTP 402: This request requires more credits, or fewer max_tokens. You requested up to 65536 tokens, but can only afford 3312.

Root Cause

Hermes sends requests for google/gemini-3.1-pro-preview through the openrouter provider with a large output budget (max_tokens: 65536). OpenRouter checks the account's credit balance against that requested maximum and rejects the request with HTTP 402 when the credits cannot cover it, regardless of how inexpensive the underlying Google model is or how many Google Cloud credits the user has available.


Problem or Use Case

I use the Hermes agent extensively with the `google/gemini-3.1-pro-preview` model, currently routed through the `openrouter` provider.
However, I frequently hit `HTTP 402` errors and API rate limits from OpenRouter, which also charges a markup. Hermes requests a large output budget (such as `max_tokens: 65536`), and OpenRouter's credit balance check blocks the request before it reaches Google, even though the underlying Google model is inexpensive and I have sufficient Google Cloud credits available.


API call failed (attempt 1/3): APIStatusError [HTTP 402]
Provider: openrouter Model: google/gemini-3.1-pro-preview
Error: HTTP 402: This request requires more credits, or fewer max_tokens. You requested up to 65536 tokens, but can only afford 3312.
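
The numbers in this log show the gap: 65536 tokens requested against 3312 affordable. As a rough illustration only, the sketch below parses that message and derives a `max_tokens` value that would pass the credit check. The helper name `affordable_max_tokens` and the `floor` parameter are hypothetical, and the regular expression is based solely on the wording quoted above, which may differ in other OpenRouter responses.

```python
import re
from typing import Optional

# Pattern taken from the 402 message quoted in this issue; the wording is an
# assumption and may not match every OpenRouter error response.
_AFFORD_RE = re.compile(r"requested up to (\d+) tokens, but can only afford (\d+)")


def affordable_max_tokens(error_message: str, floor: int = 256) -> Optional[int]:
    """Return a max_tokens value the account can afford, or None.

    Hypothetical helper: returns None when the message does not match the
    pattern or when the affordable budget is smaller than `floor`.
    """
    match = _AFFORD_RE.search(error_message)
    if match is None:
        return None
    requested, affordable = int(match.group(1)), int(match.group(2))
    budget = min(requested, affordable)
    return budget if budget >= floor else None


msg = (
    "HTTP 402: This request requires more credits, or fewer max_tokens. "
    "You requested up to 65536 tokens, but can only afford 3312."
)
print(affordable_max_tokens(msg))  # 3312
```

Retrying with the smaller budget only sidesteps the credit check; it does not address the markup or rate limits described in this issue.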


I maintain dedicated Google Cloud billing and have my own authentication (`google-creds.json` and API Keys), so routing my requests through OpenRouter introduces an unnecessary middleman, markup fees, and artificial rate limit bottlenecks.

Proposed Solution

Allow configuration of a native, direct `google` or `vertex-ai` provider in `config.yaml`, completely bypassing OpenRouter. 

1. **Google AI Studio / API Key Integration:** Use standard Google API keys as you do for `anthropic` or `openai`.
2. **Vertex AI Credentials Integration:** Allow defining a path to the Google Credentials JSON (`google-creds.json`) for corporate/project-based Vertex AI authentication. 

I'd like to configure my `config.yaml` to point my model `google/gemini-3.1-pro-preview` directly to Google's endpoints to fully utilize my Google Cloud credit allowance without OpenRouter intercepting the usage/limits.
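
To make the two options concrete, here is a minimal sketch of how a native provider might build request targets for each. It is not Hermes code: the function names are hypothetical, the URL shapes follow Google's public REST conventions for the Gemini API and for regional Vertex AI endpoints as best I know them, and whether `gemini-3.1-pro-preview` is served at either endpoint is an assumption to verify. The OpenRouter-style `google/` prefix is stripped because Google's own endpoints expect the bare model name.

```python
from typing import Dict, Tuple


def native_model_id(model: str) -> str:
    """Drop the OpenRouter-style vendor prefix: 'google/x' -> 'x'."""
    return model.split("/", 1)[1] if model.startswith("google/") else model


def ai_studio_request(model: str, api_key: str) -> Tuple[str, Dict[str, str]]:
    """Build the URL and auth header for the API-key (AI Studio) route."""
    url = (
        "https://generativelanguage.googleapis.com/v1beta/models/"
        f"{native_model_id(model)}:generateContent"
    )
    return url, {"x-goog-api-key": api_key}


def vertex_ai_url(model: str, project: str, location: str) -> str:
    """Build the regional Vertex AI URL.

    Authentication is not shown: Vertex AI expects an OAuth bearer token,
    typically obtained from the service-account JSON file.
    """
    return (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{location}/publishers/google/models/"
        f"{native_model_id(model)}:generateContent"
    )


# Placeholder values for illustration only.
url, headers = ai_studio_request("google/gemini-3.1-pro-preview", "MY_API_KEY")
print(url)
print(vertex_ai_url("google/gemini-3.1-pro-preview", "my-project", "us-central1"))
```

Neither function performs a network call, so the sketch can be adapted or tested without credentials.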

Alternatives Considered

No response

Feature Type

New tool

Scope

None

Contribution

I'd like to implement this myself and submit a PR

Debug Report (optional)

extent analysis

TL;DR

Configure a native Google or Vertex AI provider in config.yaml to bypass OpenRouter and utilize Google Cloud credits directly.

Guidance

Review the proposed solution to integrate Google API keys or Vertex AI credentials in config.yaml for direct access to Google's endpoints.
Consider the two integration options: using standard Google API keys or defining a path to the Google Credentials JSON (google-creds.json) for corporate/project-based Vertex AI authentication.
Evaluate the benefits of bypassing OpenRouter, including avoiding markup fees and artificial rate limit bottlenecks.
Assess the feasibility of implementing this change, given the existing infrastructure and authentication setup.

Example

The issue itself includes no code snippet; it focuses on configuration changes rather than code modifications.

Notes

The implementation of this solution may require updates to the Hermes agent's configuration parsing and provider handling logic. Additionally, ensuring compatibility with the existing authentication setup and Google Cloud credits system is crucial.
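
One way to picture the configuration-parsing change is sketched below. It validates a provider block after it has been loaded from `config.yaml` into a dictionary. Every key name here (`provider`, `api_key`, `credentials_path`) and the `GoogleProviderConfig` class are hypothetical; the real Hermes config schema is not shown in this issue.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class GoogleProviderConfig:
    """Hypothetical parsed form of a native Google provider block."""
    provider: str
    api_key: Optional[str] = None
    credentials_path: Optional[str] = None


def parse_provider(raw: Mapping[str, Any]) -> GoogleProviderConfig:
    """Validate a provider block; the key names are illustrative assumptions."""
    provider = raw.get("provider")
    if provider == "google":
        # API-key route (Google AI Studio).
        if not raw.get("api_key"):
            raise ValueError("provider 'google' requires 'api_key'")
        return GoogleProviderConfig("google", api_key=raw["api_key"])
    if provider == "vertex-ai":
        # Service-account route, e.g. a path to google-creds.json.
        if not raw.get("credentials_path"):
            raise ValueError("provider 'vertex-ai' requires 'credentials_path'")
        return GoogleProviderConfig(
            "vertex-ai", credentials_path=raw["credentials_path"]
        )
    raise ValueError(f"unsupported provider: {provider!r}")


cfg = parse_provider({"provider": "vertex-ai", "credentials_path": "google-creds.json"})
print(cfg.provider, cfg.credentials_path)  # vertex-ai google-creds.json
```

Failing fast on a missing credential keeps misconfiguration errors at startup rather than at the first API call.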

Recommendation

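A native Google or Vertex AI provider in config.yaml is the feature this issue requests, not something the issue shows as already available. Once implemented, it would allow direct use of Google Cloud credits and bypass OpenRouter's limitations. Until then, the 402 message itself names the two interim options: add OpenRouter credits or request fewer max_tokens.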


#api #memory management #API rate limit #retriever error #indexing error
