hermes - ✅(Solved) Fix ContextCompressor.update_from_response: accept both OpenAI and Anthropic usage schemas [1 pull requests, 1 participants]

mtzirkel · 2026-04-23T18:52:38Z

# PR #14903: fix(context-compressor): accept Anthropic-style usage keys as fallback

- Repository: NousResearch/hermes-agent
- Author: mrunmayee17
- State: open | merged: False
- Link: https://github.com/NousResearch/hermes-agent/pull/14903

## Description (problem / solution / changelog)

update_from_response() previously only recognised OpenAI-style field names (prompt_tokens/completion_tokens). OpenAI-compatible local servers such as mlx_vlm and NVIDIA NIM can return Anthropic-shaped usage dicts (input_tokens/output_tokens), causing last_prompt_tokens and last_completion_tokens to silently stay at zero. This broke the context progress bar and disabled auto-compression for those providers. OpenAI-style keys still take priority when both are present.

Closes #14687
Related: #14686

## What does this PR do?

ContextCompressor.update_from_response() now falls back to Anthropic-style usage keys (input_tokens/output_tokens) when OpenAI-style keys (prompt_tokens/completion_tokens) are absent. This is the correct fix because OpenAI-compatible servers like NVIDIA NIM and mlx_vlm can return Anthropic-shaped usage objects, causing token counts to silently read as zero, which breaks the context progress bar and prevents auto-compression from ever triggering.

## Related Issue

Fixes #14687
Related: #14686, companion to #14698

## Type of Change

- [x] 🐛 Bug fix (non-breaking change that fixes an issue)
- [ ] ✨ New feature (non-breaking change that adds functionality)
- [ ] 🔒 Security fix
- [ ] 📝 Documentation update
- [ ] ✅ Tests (adding or improving test coverage)
- [ ] ♻️ Refactor (no behavior change)
- [ ] 🎯 New skill (bundled or hub)

## Changes Made

- agent/context_compressor.py, update_from_response(): fall back to input_tokens/output_tokens when OpenAI-style keys are absent; OpenAI-style takes priority when both present
- tests/agent/test_context_compressor.py: two regression tests added to TestUpdateFromResponse

## How to Test

1. Configure Hermes with an OpenAI-compatible local server that returns Anthropic-style usage fields (e.g. mlx_vlm or NVIDIA NIM via integrate.api.nvidia.com)
2. Run a multi-turn session and observe the context progress bar; it should now reflect actual token usage instead of staying at 0%
3. Run `uv run --extra dev pytest tests/agent/test_context_compressor.py -v`; all 52 tests pass including the two new regression tests

## Checklist

### Code

- [x] I've read the [Contributing Guide](https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md)
- [x] My commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) (`fix(scope):`, `feat(scope):`, etc.)
- [x] I searched for [existing PRs](https://github.com/NousResearch/hermes-agent/pulls) to make sure this isn't a duplicate
- [x] My PR contains **only** changes related to this fix/feature (no unrelated commits)
- [x] I've run `pytest tests/ -q` and all tests pass
- [x] I've added tests for my changes (required for bug fixes, strongly encouraged for features)
- [x] I've tested on my platform: macOS 15.2 Apple Silicon

### Documentation & Housekeeping

- I've updated relevant documentation (README, docs/, docstrings): updated docstring on update_from_response()
- I've updated cli-config.yaml.example if I added/changed config keys: N/A
- I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows: N/A
- I've considered cross-platform impact (Windows, macOS) per the cross-platform compatibility section of the Contributing Guide (https://github.com/NousResearch/hermes-agent/blob/main/CONTRIBUTING.md#cross-platform-compatibility): N/A, pure Python dict key lookup
- I've updated tool descriptions/schemas if I changed tool behavior: N/A

## Screenshots / Logs

- tests/agent/test_context_compressor.py::TestUpdateFromResponse::test_updates_fields PASSED
- tests/agent/test_context_compressor.py::TestUpdateFromResponse::test_missing_fields_default_zero PASSED
- tests/agent/test_context_compressor.py::TestUpdateFromResponse::test_anthropic_style_keys PASSED
- tests/agent/test_context_compressor.py::TestUpdateFromResponse::test_openai_keys_take_priority PASSED

## Changed files

- `agent/context_compressor.py` (modified, +10/-3)
- `tests/agent/test_context_compressor.py` (modified, +20/-0)

## Fix / Workaround

May be closed as "fixed by #<normalize_usage PR>" if maintainers prefer a single entry point. Filing separately in case the defense-in-depth stance is preferred.

# Defensive: `ContextCompressor.update_from_response` should accept both OpenAI and Anthropic usage schemas

Companion to the `normalize_usage` fix for OpenAI-compat local servers (filed separately; see [Issue for `usage_pricing.py normalize_usage`]). This one is defense-in-depth.

## Environment

- Hermes: `v2026.4.16-1165-gce089169` (cli `__version__ =


GitHub stats

NousResearch/hermes-agent#14687 • Fetched 2026-04-24 06:15:20

Author: mtzirkel
Participants: mtzirkel
Timeline (top): labeled ×3, cross-referenced ×1


Defensive: `ContextCompressor.update_from_response` should accept both OpenAI and Anthropic usage schemas

Companion to the normalize_usage fix for OpenAI-compat local servers (filed separately — see [Issue for usage_pricing.py normalize_usage]). This one is defense-in-depth.

Environment

Hermes: v2026.4.16-1165-gce089169 (cli __version__ = 0.10.0)
Any path that feeds raw usage dicts into ContextCompressor without going through normalize_usage first.

Observation

agent/context_compressor.py:395-396:

def update_from_response(self, usage: Dict[str, Any]):
    """Update tracked token usage from API response."""
    self.last_prompt_tokens = usage.get("prompt_tokens", 0)
    self.last_completion_tokens = usage.get("completion_tokens", 0)

All current in-tree callers pass a pre-normalized dict with OpenAI-schema keys (e.g. run_agent.py:9843 explicitly constructs {"prompt_tokens", "completion_tokens", "total_tokens"}). So today this function works as intended.

But the docstring says "Update tracked token usage from API response" — a reader could reasonably pass an Anthropic-shape usage object directly ({"input_tokens": N, "output_tokens": M}), and it would silently record zeros without any indication something is off. It's also a natural extension point for third-party plugins (context-engine-plugin.md documents this as part of the plugin contract).
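The silent-zero behaviour described above can be reproduced without the Hermes codebase. The snippet below is a minimal sketch: `CompressorStandIn` is a hypothetical stand-in that models only the two counters and copies the two lookups from the current method, and the token numbers are made up for illustration.

```python
from typing import Any, Dict


class CompressorStandIn:
    """Hypothetical stand-in for ContextCompressor; only the two counters are modeled."""

    def __init__(self) -> None:
        self.last_prompt_tokens = 0
        self.last_completion_tokens = 0

    def update_from_response(self, usage: Dict[str, Any]) -> None:
        # Same two lookups as the current method quoted above.
        self.last_prompt_tokens = usage.get("prompt_tokens", 0)
        self.last_completion_tokens = usage.get("completion_tokens", 0)


# OpenAI-shaped usage: counters are populated as expected.
openai_shape = CompressorStandIn()
openai_shape.update_from_response({"prompt_tokens": 1200, "completion_tokens": 85})

# Anthropic-shaped usage: no exception, no warning, counters stay at zero.
anthropic_shape = CompressorStandIn()
anthropic_shape.update_from_response({"input_tokens": 1200, "output_tokens": 85})

print(openai_shape.last_prompt_tokens, openai_shape.last_completion_tokens)        # 1200 85
print(anthropic_shape.last_prompt_tokens, anthropic_shape.last_completion_tokens)  # 0 0
```

Because nothing is raised in the second case, a caller has no signal that the usage block was ignored.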

Proposed fix

Make it schema-agnostic so the method honors its docstring:

def update_from_response(self, usage: Dict[str, Any]):
    """Update tracked token usage from API response.

    Accepts both OpenAI-schema (prompt_tokens/completion_tokens) and
    Anthropic/mlx_vlm-schema (input_tokens/output_tokens) usage blocks.
    """
    self.last_prompt_tokens = usage.get("prompt_tokens") or usage.get("input_tokens", 0)
    self.last_completion_tokens = usage.get("completion_tokens") or usage.get("output_tokens", 0)
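One detail of the `or` chain may be worth considering: it treats `0` and `None` the same way, so an explicit `prompt_tokens: 0` falls through to `input_tokens`, and `{"input_tokens": None}` yields `None` rather than `0`. The sketch below shows one possible stricter variant. It is not the code from the PR; `first_present` and `CompressorSketch` are hypothetical names introduced here, and it assumes the values are integers or `None`.

```python
from typing import Any, Dict


def first_present(usage: Dict[str, Any], *keys: str) -> int:
    """Return the first value among `keys` that is present and not None, else 0."""
    for key in keys:
        value = usage.get(key)
        if value is not None:
            return int(value)
    return 0


class CompressorSketch:
    """Hypothetical stand-in for ContextCompressor; only the two counters are modeled."""

    def __init__(self) -> None:
        self.last_prompt_tokens = 0
        self.last_completion_tokens = 0

    def update_from_response(self, usage: Dict[str, Any]) -> None:
        # OpenAI-style keys are listed first, so they win when both are present,
        # even when the OpenAI-style value is an explicit 0.
        self.last_prompt_tokens = first_present(usage, "prompt_tokens", "input_tokens")
        self.last_completion_tokens = first_present(usage, "completion_tokens", "output_tokens")


sketch = CompressorSketch()
sketch.update_from_response({"input_tokens": None, "output_tokens": 12})
print(sketch.last_prompt_tokens, sketch.last_completion_tokens)  # 0 12
```

Whether an explicit zero should win over a non-zero fallback is a design choice; the variant above assumes it should, which matches the "OpenAI-style takes priority when both present" wording in the PR description.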

Impact

Small. Mostly a robustness tweak for the public plugin surface (anyone writing a custom ContextEngine per the developer guide). Prevents a second failure mode like the upstream normalize_usage issue if anyone adds a caller that bypasses normalization.
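A plugin author who does not want to depend on this fix could also normalize at the call site. The helper below is a hedged sketch of that idea: `to_openai_usage` is a hypothetical name, not the repository's `normalize_usage` (whose signature is not shown in this issue), and summing the two counts when `total_tokens` is missing is an assumption.

```python
from typing import Any, Dict, Mapping


def _pick(raw: Mapping[str, Any], primary: str, fallback: str) -> int:
    """Prefer `primary`; use `fallback` only when `primary` is missing or None."""
    value = raw.get(primary)
    if value is None:
        value = raw.get(fallback)
    return int(value) if value is not None else 0


def to_openai_usage(raw: Mapping[str, Any]) -> Dict[str, int]:
    """Hypothetical adapter: map either usage shape onto OpenAI-style keys."""
    prompt = _pick(raw, "prompt_tokens", "input_tokens")
    completion = _pick(raw, "completion_tokens", "output_tokens")
    total = raw.get("total_tokens")
    if total is None:
        # Assumption: when no total is reported, derive it from the two counts.
        total = prompt + completion
    return {
        "prompt_tokens": prompt,
        "completion_tokens": completion,
        "total_tokens": int(total),
    }


normalized = to_openai_usage({"input_tokens": 1200, "output_tokens": 85})
print(normalized)  # {'prompt_tokens': 1200, 'completion_tokens': 85, 'total_tokens': 1285}
```

The resulting dict has the same three keys that the in-tree caller is described as constructing, so it could be passed to `update_from_response` with or without the proposed change.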

May be closed as "fixed by #<normalize_usage PR>" if maintainers prefer a single entry point. Filing separately in case the defense-in-depth stance is preferred.

extent analysis

TL;DR

Update the update_from_response method in ContextCompressor to accept both OpenAI and Anthropic usage schemas by using the proposed fix.

Guidance

Review the update_from_response method and verify it only accepts OpenAI-schema usage blocks.
Test the method with an Anthropic-schema usage block to confirm it silently records zeros.
Apply the proposed fix to make the method schema-agnostic.
Verify the fix by testing the method with both OpenAI and Anthropic usage schemas.
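The verification step could look roughly like the checks below. The test names are taken from the PR log, but the bodies are guesses, since the actual tests in tests/agent/test_context_compressor.py are not shown here. `CompressorStandIn` is a hypothetical stand-in that contains only the two lines from the proposed fix.

```python
from typing import Any, Dict


class CompressorStandIn:
    """Hypothetical stand-in for ContextCompressor with the proposed fix applied."""

    def __init__(self) -> None:
        self.last_prompt_tokens = 0
        self.last_completion_tokens = 0

    def update_from_response(self, usage: Dict[str, Any]) -> None:
        # The two lines from the proposed fix, copied unchanged.
        self.last_prompt_tokens = usage.get("prompt_tokens") or usage.get("input_tokens", 0)
        self.last_completion_tokens = usage.get("completion_tokens") or usage.get("output_tokens", 0)


def test_anthropic_style_keys() -> None:
    c = CompressorStandIn()
    c.update_from_response({"input_tokens": 1200, "output_tokens": 85})
    assert c.last_prompt_tokens == 1200
    assert c.last_completion_tokens == 85


def test_openai_keys_take_priority() -> None:
    c = CompressorStandIn()
    c.update_from_response(
        {"prompt_tokens": 700, "completion_tokens": 40, "input_tokens": 1, "output_tokens": 2}
    )
    assert c.last_prompt_tokens == 700
    assert c.last_completion_tokens == 40


def test_missing_fields_default_zero() -> None:
    c = CompressorStandIn()
    c.update_from_response({})
    assert c.last_prompt_tokens == 0
    assert c.last_completion_tokens == 0
```

The functions use plain `assert`, so they can be collected by pytest or called directly.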

Notes

This fix is a defense-in-depth measure to prevent a potential failure mode if a caller bypasses normalization. It may be closed as "fixed by #<normalize_usage PR>" if maintainers prefer a single entry point.

Recommendation

Apply the workaround by updating the update_from_response method to accept both OpenAI and Anthropic usage schemas, as it provides a more robust solution and prevents potential issues with third-party plugins.
