hermes - 💡(How to fix) Fix Feature Request: Support service_tier (e.g. flex) for Gemini provider [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#12700Fetched 2026-04-20 12:17:20
View on GitHub
Comments
0
Participants
1
Timeline
0
Reactions
1
Author
Participants

Root Cause

The Problem

Currently, users cannot utilize the Flex tier in Hermes because:

  1. The newly merged agent/gemini_native_adapter.py (from #12674) does not map service_tier from api_kwargs / request_options into the native Google generateContent JSON payload.
  2. The open PR #5157 proposes restricting service_tier exclusively to OpenAI routes. If merged as-is, it will actively strip the parameter from Gemini calls.

Code Example

json
    {
      "contents": [...],
      "service_tier": "flex"
    }
RAW_BUFFERClick to expand / collapse

Feature Request

Gemini recently launched Flex Inference (service_tier: "flex"), which offers a 50% cost reduction for batch/background workloads where real-time latency isn't critical. This is a perfect match for Hermes's cron jobs and background subagents.

See documentation: https://ai.google.dev/gemini-api/docs/flex-inference

The Problem

Currently, users cannot utilize the Flex tier in Hermes because:

  1. The newly merged agent/gemini_native_adapter.py (from #12674) does not map service_tier from api_kwargs / request_options into the native Google generateContent JSON payload.
  2. The open PR #5157 proposes restricting service_tier exclusively to OpenAI routes. If merged as-is, it will actively strip the parameter from Gemini calls.

Proposed Solution

  1. Update gemini_native_adapter.py: Intercept service_tier from api_kwargs and inject it into the native request body:
json
    {
      "contents": [...],
      "service_tier": "flex"
    }
  1. Exempt Gemini from PR #5157: Ensure that hermes_cli/runtime_provider.py (or wherever route validation happens) recognizes gemini as a valid provider for service_tier.

Use Case

Hermes heavily utilizes cron jobs and background delegation tasks (via delegate_task). Halving the LLM cost for these asynchronous, non-interactive workflows using Gemini would be a massive efficiency gain for users.

Thank you!

extent analysis

TL;DR

Update gemini_native_adapter.py to include service_tier in the native Google request body and exempt Gemini from the service_tier restriction in PR #5157.

Guidance

  • Review the agent/gemini_native_adapter.py file and modify it to map service_tier from api_kwargs to the native Google generateContent JSON payload.
  • Examine PR #5157 and adjust the code to allow service_tier for Gemini routes, ensuring it is not stripped from Gemini calls.
  • Verify that the updated gemini_native_adapter.py correctly injects service_tier into the request body by checking the JSON payload sent to the Google API.
  • Test the changes with a sample cron job or background task to confirm the cost reduction is applied correctly.

Example

# Example update to gemini_native_adapter.py
def generate_request_body(api_kwargs):
    # ...
    if 'service_tier' in api_kwargs:
        request_body['service_tier'] = api_kwargs['service_tier']
    # ...
    return request_body

Notes

The proposed solution assumes that the service_tier parameter is correctly handled by the Google API and that the updated gemini_native_adapter.py will not introduce any compatibility issues with other parts of the Hermes system.

Recommendation

Apply the workaround by updating gemini_native_adapter.py and exempting Gemini from the service_tier restriction in PR #5157, as this will allow users to utilize the Flex tier and reduce costs for batch/background workloads.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

hermes - 💡(How to fix) Fix Feature Request: Support service_tier (e.g. flex) for Gemini provider [1 participants]