hermes - 💡(How to fix) Fix [Feature]: Enable Anthropic prompt caching for Claude models on Nous Portal [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#13409Fetched 2026-04-22 08:06:40
View on GitHub
Comments
0
Participants
1
Timeline
1
Reactions
0
Participants
Timeline (top)
labeled ×1
RAW_BUFFERClick to expand / collapse

Problem or Use Case

Prompt caching for Claude models is currently disabled when using Nous Portal as the provider, even though the caching infrastructure is fully implemented in Hermes. This causes significant unnecessary token spend for Nous subscribers running Claude on multi-turn conversations. PR #12846 already made the client-side code ready — only the portal/provider side needs to change.

Proposed Solution

Expose https://inference-api.nousresearch.com/v1/anthropic using the native Anthropic wire protocol. Hermes auto-detects /anthropic URLs with zero config changes needed from users.

Alternatives Considered

Possible alternative: confirm cache_control markers are forwarded to Anthropic on the existing endpoint and add a one-liner to the caching policy for the nous provider.

Feature Type

Configuration option

Scope

None

Contribution

  • I'd like to implement this myself and submit a PR

Debug Report (optional)

extent analysis

TL;DR

Enable prompt caching for Claude models on Nous Portal by exposing the Anthropic wire protocol or forwarding cache control markers.

Guidance

  • Expose the https://inference-api.nousresearch.com/v1/anthropic endpoint using the native Anthropic wire protocol to leverage Hermes' auto-detection feature.
  • Alternatively, confirm that cache control markers are forwarded to Anthropic on the existing endpoint and add a caching policy for the Nous provider.
  • Verify that the caching infrastructure is fully implemented in Hermes and that PR #12846 has prepared the client-side code for caching.
  • Test the caching functionality with multi-turn conversations to ensure significant unnecessary token spend is reduced for Nous subscribers.

Example

No code snippet is provided as the issue does not require a specific code change, but rather a configuration or exposure of an endpoint.

Notes

The proposed solution assumes that exposing the Anthropic wire protocol or forwarding cache control markers will enable prompt caching without requiring additional changes to the client-side code.

Recommendation

Apply the workaround by exposing the Anthropic wire protocol, as it seems to be the most straightforward solution that leverages existing infrastructure and auto-detection features.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING