hermes - [Feature]: Persistent usage/cost footer on gateway (Discord, Telegram, Slack) messages

GitHub stats
NousResearch/hermes-agent#11701 · Fetched 2026-04-18 05:59:16
Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 1


Problem or Use Case

When using Hermes via Discord or Telegram (messaging gateway), there is no way to see token usage or cost information without manually running /usage each time. The /usage command returns a one-shot stats block as a separate message, but there is no persistent display that shows cost/usage information alongside every response.

In the CLI, display.show_cost in config.yaml was intended to show cost in the status bar, but nothing currently provides a persistent display:

  1. display.show_cost is a dead config key — defined as a default in hermes_cli/config.py (line 525) but never read anywhere in the codebase. It has zero effect.
  2. /usage is one-shot only — it returns a single message with a detailed stats block, but does not toggle persistent display.
  3. No gateway mechanism exists to append usage info to outbound messages.

This matters for users running multiple agents (different profiles) across Discord/Telegram who need to monitor spending in real-time without remembering to type /usage after every interaction.

Proposed Solution

Add a display.usage_footer config option (default: false) that, when enabled, appends a compact usage line to every gateway message response.

Config example:

display:
  usage_footer: true    # Append usage/cost footer to every gateway message

Behavior when enabled: After each agent response on messaging platforms (Discord, Telegram, Slack, WhatsApp), append a footer line like:

_📊 12.3k tokens · ~$0.0234 · ctx 45%_

Or, for a more detailed variant:

_📊 Input: 10,200 · Output: 2,100 · Total: 12,300 · Cost: ~$0.0234 · Context: 45%_

Implementation approach: The footer would be appended in gateway/platforms/base.py _process_message_background(), after text_content is prepared (around line 1745) but before self._send_with_retry(). The token counts and cost data are already available in the agent_result dict returned by _handle_message_with_agent() (which includes input_tokens, output_tokens, last_prompt_tokens, model).

A new _format_usage_footer() method on GatewayRunner (or a shared utility) could format the compact line, using the existing agent.usage_pricing.estimate_usage_cost() for cost estimation and the agent context compressor for context percentage.
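
To make the two footer variants concrete, here is a minimal formatting sketch. The function name format_usage_footer and its parameters are hypothetical, not part of the Hermes codebase; the cost and context percentage are passed in as plain numbers because the exact call shapes of estimate_usage_cost() and the context compressor are not given in this issue.

```python
def format_usage_footer(input_tokens: int, output_tokens: int, cost_usd: float,
                        context_pct: float, detailed: bool = False) -> str:
    """Format a usage footer in the compact or detailed style proposed above."""
    total = input_tokens + output_tokens
    if detailed:
        # Detailed variant: thousands separators on every count
        return (f"_📊 Input: {input_tokens:,} · Output: {output_tokens:,} · "
                f"Total: {total:,} · Cost: ~${cost_usd:.4f} · Context: {context_pct:.0f}%_")
    # Compact variant: abbreviate counts of 1,000 or more as e.g. "12.3k"
    compact = f"{total / 1000:.1f}k" if total >= 1000 else str(total)
    return f"_📊 {compact} tokens · ~${cost_usd:.4f} · ctx {context_pct:.0f}%_"


print(format_usage_footer(10200, 2100, 0.0234, 45))
print(format_usage_footer(10200, 2100, 0.0234, 45, detailed=True))
```

With the numbers from the examples above, the two calls reproduce the compact and detailed footer lines shown in this proposal.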

The config resolution should follow the same pattern as tool_progress — check display.usage_footer first, then display.platforms.<platform>.usage_footer for per-platform overrides.
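
The resolution order could look roughly like the sketch below. It assumes the loaded config is a plain nested dict and that a per-platform value, when present, wins over the global one; resolve_usage_footer is a hypothetical helper name, and the real tool_progress lookup may be structured differently.

```python
from typing import Any, Mapping


def resolve_usage_footer(config: Mapping[str, Any], platform: str) -> bool:
    """Return whether the usage footer is enabled for the given platform."""
    display = config.get("display") or {}
    # Global setting; the proposal defaults it to false
    enabled = bool(display.get("usage_footer", False))
    # Per-platform override: display.platforms.<platform>.usage_footer
    platform_cfg = (display.get("platforms") or {}).get(platform) or {}
    if "usage_footer" in platform_cfg:
        enabled = bool(platform_cfg["usage_footer"])
    return enabled


config = {
    "display": {
        "usage_footer": True,
        "platforms": {"telegram": {"usage_footer": False}},
    }
}
print(resolve_usage_footer(config, "discord"))   # falls back to the global value
print(resolve_usage_footer(config, "telegram"))  # uses the per-platform override
```

Keeping the lookup in one helper would let every platform adapter share the same precedence rule.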

Alternatives Considered

  1. Hook system — The gateway/hooks.py event system fires agent:end with context data, but hooks are fire-and-forget and cannot modify response content. A hook cannot inject a footer into the response text.
  2. Agent instructions — Adding "always include your token usage" to SOUL.md. This is unreliable — models forget, format inconsistently, and counts may be inaccurate (the model does not know its actual token counts).
  3. Separate follow-up message — Auto-sending /usage after every response. This would double the message volume and clutter the chat.
  4. Discord embed footer — Using Discord embed footer fields instead of text. This would be platform-specific and does not work for Telegram/Slack.

Feature Type

Gateway / messaging improvement

Scope

Medium (few files, < 300 lines)

Additional Context

I investigated the full message delivery pipeline:

  • GatewayRunner._handle_message_with_agent() returns dict with final_response, input_tokens, output_tokens, last_prompt_tokens, model
  • _process_message_background() in base.py receives response string, sends via _send_with_retry()
  • Token/cost data is already available in the agent result dict but is currently only surfaced through the /usage command handler
  • display.show_cost is defined but never consumed — it should either be wired up or replaced with usage_footer

The display.platforms override mechanism already exists for tool_progress, so this would follow the same pattern.

extent analysis

TL;DR

To display token usage and cost information alongside every response, add a display.usage_footer config option and implement the logic to append a compact usage line to every gateway message response.

Guidance

  • Introduce a new display.usage_footer config option with a default value of false to control the display of usage information.
  • Modify the _process_message_background() method in gateway/platforms/base.py to append a usage footer to the response text when display.usage_footer is enabled.
  • Create a new _format_usage_footer() method to format the compact usage line using the existing agent.usage_pricing.estimate_usage_cost() and agent context compressor.
  • Follow the same config resolution pattern as tool_progress to allow per-platform overrides.

Example

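# In gateway/platforms/base.py
# Sketch only: the estimate_usage_cost() arguments and self.context_limit are
# assumptions; the issue does not give their signatures.
from agent.usage_pricing import estimate_usage_cost

def _process_message_background(self, ...):
    # ...
    if self.config.display.usage_footer:
        usage_footer = self._format_usage_footer(agent_result)
        if usage_footer:  # skip the footer when usage data is unavailable
            text_content += "\n\n" + usage_footer
    # ... then self._send_with_retry(...)

def _format_usage_footer(self, agent_result):
    # Token counts come from the dict returned by _handle_message_with_agent()
    input_tokens = agent_result.get("input_tokens")
    output_tokens = agent_result.get("output_tokens")
    if input_tokens is None or output_tokens is None:
        return ""
    total_tokens = input_tokens + output_tokens
    # Assumed call shape: model name plus input/output token counts, returning USD
    cost = estimate_usage_cost(agent_result["model"], input_tokens, output_tokens)
    # agent_result has no context_percentage key; derive it from last_prompt_tokens
    # and the model's context limit (self.context_limit is hypothetical)
    context_percentage = round(100 * agent_result["last_prompt_tokens"] / self.context_limit)
    return f"_📊 {total_tokens:,} tokens · ~${cost:.4f} · ctx {context_percentage}%_"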

Notes

The implementation should be careful to handle cases where the usage information is not available or the config option is not set. Additionally, the formatting of the usage line should be consistent and easy to read.
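
One way to handle the missing-data cases is to leave the response untouched whenever the footer cannot be built. The sketch below is illustrative only: append_usage_footer is a hypothetical helper, and it prints a token-only footer to stay independent of the cost and context lookups.

```python
from typing import Any, Mapping, Optional


def append_usage_footer(text: str, agent_result: Optional[Mapping[str, Any]],
                        enabled: bool) -> str:
    """Append a token footer to text, or return text unchanged if unavailable."""
    if not enabled or not agent_result:
        return text
    input_tokens = agent_result.get("input_tokens")
    output_tokens = agent_result.get("output_tokens")
    if input_tokens is None or output_tokens is None:
        # Usage data missing: send the response without a footer
        return text
    total = input_tokens + output_tokens
    return f"{text}\n\n_📊 {total:,} tokens_"


print(append_usage_footer("Done.", {"input_tokens": 10200, "output_tokens": 2100}, True))
print(append_usage_footer("Done.", None, True))
```

Returning the original text on any gap means a failed usage lookup never blocks or alters message delivery.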

Recommendation

Implement the feature by adding the display.usage_footer config option and the logic that appends the usage line to each gateway response. Users can then monitor token usage and cost without manually running the /usage command after every interaction.
