hermes - ✅(Solved) Fix fix(telegram): _GATEWAY_PROVIDER_ERROR_RE false-positives on legitimate HTTP prose [2 pull requests, 1 participants]

GitHub stats
NousResearch/hermes-agent#28670 · Fetched 2026-05-20 04:02:39
Comments: 0 · Participants: 1 · Timeline: 12 · Reactions: 0
Timeline (top): labeled ×4, referenced ×4, cross-referenced ×3, closed ×1

Error Message

_sanitize_gateway_final_response in gateway/run.py runs only on Telegram final responses. If the response text matches _GATEWAY_PROVIDER_ERROR_RE anywhere, the ENTIRE answer is replaced with a canned error message. The alternative \bhttp\s*\d{3}\b in that pattern triggers on any HTTP status reference in prose, so a correct answer such as "HTTP 404 means 'not found'" is discarded. The full pattern, the reproduction, and the four proposed fixes appear in the issue text further down this page.

Root Cause

All these match — agent's answer to legit user question gets replaced:

_GATEWAY_PROVIDER_ERROR_RE.search("HTTP 404 means 'not found'")               # MATCH
_GATEWAY_PROVIDER_ERROR_RE.search("When you get HTTP 500 errors, check logs") # MATCH
_GATEWAY_PROVIDER_ERROR_RE.search("The API call failed because token expired") # MATCH (also fine, but also matches non-error "API call failed" prose)

Fix Action

Fixed

PR fix notes

PR #28681: fix(telegram): address 5 post-merge audit follow-ups

Description (problem / solution / changelog)

Fixes 5 small issues filed during the post-merge salvage audit. Single PR because each fix is small and they touch related files.

Resolves

  • closes #28670 — _GATEWAY_PROVIDER_ERROR_RE false-positives on legitimate prose
  • closes #28672 — pointless 1s sleep + same-thread retry on thread-not-found
  • closes #28674 — Bot API rejection of direct_messages_topic_id had no retry path
  • closes #28676 — dead image-document branch superseded by earlier merge
  • closes #28678 — chat-scoped allowlist doesn't cover channel posts

Changes

#28670 — anchor + length-cap the provider-error sanitizer

_looks_like_gateway_provider_error now uses an anchored regex (^\s*(\W*\s*)?...) and refuses to rewrite messages over 400 chars or 4+ lines. A user asking 'what does HTTP 404 mean?' on Telegram no longer has their entire reply replaced with the provider-error template; the rewrite still fires on actual short provider error envelopes that lead with 'HTTP 503' / 'API call failed' / etc.
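
A minimal sketch of how an anchored, length-capped check might look. The pattern, the helper name looks_like_provider_error, and the way the caps are applied are assumptions for illustration and are not the code in gateway/run.py; only the 400-character and 4-line limits come from the PR description.

```python
import re

# Hypothetical anchored pattern: an error marker must lead the message,
# optionally preceded by whitespace and punctuation or emoji (\W*).
_ANCHORED_ERROR_RE = re.compile(
    r"^\s*\W*\s*(?:api\s+(?:call\s+)?failed"
    r"|provider\s+authentication\s+failed"
    r"|error\s+code\s*:"
    r"|http\s*\d{3}\b)",
    re.IGNORECASE,
)

MAX_CHARS = 400  # limit quoted in the PR description
MAX_LINES = 4    # messages with 4 or more lines are left alone


def looks_like_provider_error(text: str) -> bool:
    """Return True only for short messages that lead with an error marker."""
    if len(text) > MAX_CHARS:
        return False
    if len(text.splitlines()) >= MAX_LINES:
        return False
    return _ANCHORED_ERROR_RE.search(text) is not None
```

Under these assumptions, "HTTP 503 Service Unavailable" is still treated as an error envelope, while "What does HTTP 404 mean? It means not found." is not, because the status reference does not lead the message.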

#28672 — drop the 1s sleep, keep the retry

The original code did asyncio.sleep(1) then retried with the same message_thread_id. The sleep added latency on every thread-not-found error; the retry IS sometimes useful (Telegram has a one-off thread-not-found flake mode exercised by test_send_retries_transient_thread_not_found_before_fallback), so I kept the retry but removed the sleep.
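
The retry-without-sleep behaviour can be sketched as follows. ThreadNotFound, send_with_thread_fallback, and the send callable are hypothetical stand-ins, and falling back to a send without a thread id is an assumption inferred from the test name, not a confirmed detail of the adapter.

```python
import asyncio


class ThreadNotFound(Exception):
    """Stand-in for the Bot API 'thread not found' error (hypothetical)."""


async def send_with_thread_fallback(send, text, thread_id):
    """Try twice with the same thread id, then fall back to no thread."""
    # First attempt plus one immediate retry; no asyncio.sleep in between.
    for _ in range(2):
        try:
            return await send(text, thread_id)
        except ThreadNotFound:
            continue
    # Both attempts failed: assume the fallback is a send without a thread id.
    return await send(text, None)
```

A transient failure costs one extra request but no added latency, while a persistent failure still reaches the fallback.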

#28674 — extend DM-topic retry predicate

_should_retry_without_dm_topic_reply_anchor previously required reply_to_message_id is not None, so synthetic / resumed sends that route via direct_messages_topic_id (no anchor) had no retry path if Bot API rejected the topic id. Predicate now also fires when direct_messages_topic_id is set and the BadRequest mentions a topic/thread routing failure. The retry path already correctly strips both fields together — only the trigger needed widening.
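
A rough sketch of the widened trigger, under stated assumptions: the function names, the substring test for a "topic/thread routing failure", and applying that test to the anchored case as well are all illustrative guesses, not the predicate in gateway/platforms/telegram.py.

```python
from typing import Optional


def should_retry_without_dm_topic(
    error_text: str,
    reply_to_message_id: Optional[int],
    direct_messages_topic_id: Optional[int],
) -> bool:
    """Hypothetical widened predicate; the marker strings are assumptions."""
    lowered = error_text.lower()
    routing_failure = "topic" in lowered or "thread" in lowered
    if not routing_failure:
        return False
    # Old trigger: an explicit reply anchor. New trigger: a topic id alone.
    return reply_to_message_id is not None or direct_messages_topic_id is not None


def strip_dm_topic_fields(kwargs: dict) -> dict:
    """Drop both routing fields together before the retry."""
    dropped = ("reply_to_message_id", "direct_messages_topic_id")
    return {k: v for k, v in kwargs.items() if k not in dropped}
```

With this shape, a resumed send that carries only direct_messages_topic_id reaches the same retry path as an anchored reply.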

#28676 — remove dead branch

Lines 4947-4960 of gateway/platforms/telegram.py checked ext in SUPPORTED_IMAGE_DOCUMENT_TYPES for .png/.jpg/.jpeg/.webp/.gif. The earlier branch at line 4896 (bd0c54d17 fix: route Telegram image documents through photo handling) already handles the exact same extension set and returns before reaching here. Replaced the dead block with a comment.

#28678 — chat-scoped auth covers channels

source.chat_type in {'group', 'forum'} extended to {'group', 'forum', 'channel'} for the chat-scoped allowlist in _is_user_authorized. Operators can now put a channel id in TELEGRAM_GROUP_ALLOWED_CHATS and channel posts get authorized correctly. Previously the only paths that worked for channels were either listing the channel id in TELEGRAM_ALLOWED_USERS (because _build_message_event synthesizes user_id = chat.id for channel posts) or GATEWAY_ALLOW_ALL_USERS=true.
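
The membership change can be illustrated with a small sketch. CHAT_SCOPED_TYPES and is_chat_authorized are hypothetical names, and passing the allowlist as an already-parsed set is an assumption; how TELEGRAM_GROUP_ALLOWED_CHATS is actually parsed is not shown in the PR description.

```python
# Chat types covered by the chat-scoped allowlist; "channel" is the addition.
CHAT_SCOPED_TYPES = {"group", "forum", "channel"}


def is_chat_authorized(chat_type: str, chat_id: str, allowed_chats: set) -> bool:
    """Return True when the chat type is chat-scoped and its id is allowlisted."""
    return chat_type in CHAT_SCOPED_TYPES and chat_id in allowed_chats
```

For example, is_chat_authorized("channel", "-1001", {"-1001"}) passes in this sketch, whereas before the change the channel type would have been excluded from the set.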

Validation

scripts/run_tests.sh tests/gateway/test_telegram_thread_fallback.py -q    → 41/41
scripts/run_tests.sh tests/cron/test_scheduler.py -q                      → 127/127
scripts/run_tests.sh tests/gateway/test_telegram_thread_fallback.py tests/gateway/test_telegram_documents.py tests/gateway/test_telegram_channel_posts.py tests/gateway/test_unauthorized_dm_behavior.py tests/gateway/test_telegram_noise_filter.py tests/gateway/test_telegram_group_gating.py -q  → 144/147

The 3 failures in the broader set are pre-existing test-pollution failures that reproduce on plain main without these changes.

Changed files

  • gateway/platforms/telegram.py (modified, +51/-25)
  • gateway/run.py (modified, +42/-9)

PR #37: chore: sync with upstream main (2026-05-19)

Description (problem / solution / changelog)

Daily sync with upstream. Auto-created by cron job.

New upstream commits (2080):

  • ff0a70381 fix(web): consume bundled design system assets (#26391)
  • 070eeaae6 chore(deps): bump @babel/plugin-transform-modules-systemjs in /website
  • 43f8edbaa chore(deps): bump fast-uri from 3.1.0 to 3.1.2 in /website
  • a9c38c7c3 chore(deps): bump python-dotenv from 1.2.1 to 1.2.2
  • dffcb6ffd chore(deps): bump python-multipart from 0.0.22 to 0.0.27
  • 7f1d1248a chore(deps): bump lodash-es and langium in /website
  • c4bcc778c chore(deps): bump lodash from 4.17.23 to 4.18.1 in /website
  • 0b75d24fd chore(deps): bump follow-redirects from 1.15.11 to 1.16.0 in /website
  • fc90f1b6a chore(deps): bump dompurify from 3.3.3 to 3.4.2 in /website
  • f1254b1bc fix(cli): exit prompt_toolkit cleanly on SIGTERM/SIGHUP instead of raising KeyboardInterrupt (#28688)
  • 709e37e19 fix(dashboard): add scheduled kanban i18n strings (#28534)
  • c4981167e chore(actions)(deps): bump actions/checkout from 4.3.1 to 6.0.2
  • 7bcdced6c fix(kanban): respawn guard defers blocker_auth instead of auto-blocking (#28683)
  • b10b78320 chore(actions)(deps): bump actions/setup-python from 5.3.0 to 6.2.0
  • bbee1dd7c chore(actions)(deps): bump docker/build-push-action from 6.19.2 to 7.1.0
  • 269245740 chore(actions)(deps): bump docker/login-action from 3.7.0 to 4.1.0
  • 424f2cc6e chore(actions)(deps): bump the actions-minor-patch group across 1 directory with 2 updates
  • a3c753128 fix(telegram): address post-merge audit follow-ups (#28670, #28672, #28674, #28676, #28678)
  • 88ee58f7d fix(kanban): stale reclaim must not tick failure counter (#28680)
  • 7f253f555 fix(acp): use tempfile.gettempdir() in workspace auto-approve
  • 58591d9e3 feat: show names of user-modified skills in bundled skill sync summary
  • aedb8ac83 feat(update): syntax-validate critical files post-pull, auto-rollback on failure (#28669)
  • a0bd11d02 fix(tests): catch up 25 stale tests after recent merges (#28626)
  • 12c39830f fix(doctor): attach codex CLI hint to OpenAI Codex auth warning for #27975
  • 4039e2abb chore(release): alias xxxigm noreply for upcoming #27986 salvage (#28594)
  • 62573f44c fix: guard yaml.safe_load, flock unlock, TOCTOU races, and atomic writes
  • d759a67c0 fix: add recovery hints to loop guard warnings
  • 87c6edc1d fix(skills): add timeout to Google OAuth urlopen calls
  • b8a9cbd18 fix: tolerate unreadable gateway JSONL transcripts
  • 663ee1486 fix(cron): allow emoji ZWJ sequences in prompts
  • ...

Changed files

  • .env.example (modified, +1/-0)
  • .github/workflows/contributor-check.yml (modified, +1/-1)
  • .github/workflows/deploy-site.yml (modified, +2/-2)
  • .github/workflows/docker-publish.yml (modified, +13/-13)
  • .github/workflows/docs-site-checks.yml (modified, +2/-2)
  • .github/workflows/history-check.yml (modified, +1/-1)
  • .github/workflows/lint.yml (modified, +4/-4)
  • .github/workflows/nix-lockfile-fix.yml (modified, +2/-2)
  • .github/workflows/nix.yml (modified, +1/-1)
  • .github/workflows/osv-scanner.yml (modified, +1/-1)
  • .github/workflows/skills-index.yml (modified, +4/-4)
  • .github/workflows/supply-chain-audit.yml (modified, +2/-2)
  • .github/workflows/tests.yml (modified, +5/-2)
  • .github/workflows/upload_to_pypi.yml (modified, +3/-3)
  • .github/workflows/uv-lockfile-check.yml (modified, +1/-1)
  • AGENTS.md (modified, +8/-6)
  • README.md (modified, +1/-1)
  • acp_adapter/auth.py (modified, +13/-2)
  • acp_adapter/edit_approval.py (modified, +9/-1)
  • acp_adapter/permissions.py (modified, +22/-2)
  • acp_adapter/server.py (modified, +55/-1)
  • acp_adapter/tools.py (modified, +178/-13)
  • agent/agent_init.py (modified, +42/-7)
  • agent/agent_runtime_helpers.py (modified, +24/-3)
  • agent/anthropic_adapter.py (modified, +148/-14)
  • agent/auxiliary_client.py (modified, +189/-11)
  • agent/azure_identity_adapter.py (added, +555/-0)
  • agent/background_review.py (modified, +12/-0)
  • agent/chat_completion_helpers.py (modified, +15/-3)
  • agent/context_compressor.py (modified, +52/-3)
  • agent/conversation_compression.py (modified, +40/-4)
  • agent/conversation_loop.py (modified, +20/-5)
  • agent/copilot_acp_client.py (modified, +4/-1)
  • agent/credential_pool.py (modified, +118/-1)
  • agent/error_classifier.py (modified, +29/-0)
  • agent/memory_manager.py (modified, +59/-5)
  • agent/prompt_builder.py (modified, +7/-2)
  • agent/redact.py (modified, +1/-0)
  • agent/shell_hooks.py (modified, +4/-1)
  • agent/skill_bundles.py (added, +410/-0)
  • agent/skill_preprocessing.py (modified, +8/-0)
  • agent/system_prompt.py (modified, +6/-2)
  • agent/tool_guardrails.py (modified, +21/-4)
  • batch_runner.py (modified, +21/-2)
  • cli-config.yaml.example (modified, +9/-0)
  • cli.py (modified, +233/-13)
  • cron/jobs.py (modified, +44/-0)
  • cron/scheduler.py (modified, +147/-17)
  • gateway/config.py (modified, +51/-7)
  • gateway/platforms/base.py (modified, +45/-11)
  • gateway/platforms/dingtalk.py (modified, +10/-0)
  • gateway/platforms/matrix.py (modified, +37/-0)
  • gateway/platforms/mattermost.py (modified, +26/-5)
  • gateway/platforms/signal.py (modified, +25/-0)
  • gateway/platforms/telegram.py (modified, +740/-112)
  • gateway/platforms/telegram_network.py (modified, +10/-0)
  • gateway/platforms/wecom.py (modified, +1/-1)
  • gateway/run.py (modified, +885/-78)
  • gateway/session.py (modified, +17/-11)
  • gateway/session_context.py (modified, +8/-0)
  • gateway/sticker_cache.py (modified, +17/-4)
  • gateway/stream_consumer.py (modified, +36/-9)
  • hermes_cli/auth.py (modified, +470/-57)
  • hermes_cli/auth_commands.py (modified, +49/-0)
  • hermes_cli/azure_detect.py (modified, +126/-20)
  • hermes_cli/bundles.py (added, +229/-0)
  • hermes_cli/commands.py (modified, +32/-5)
  • hermes_cli/config.py (modified, +17/-0)
  • hermes_cli/cron.py (modified, +9/-0)
  • hermes_cli/doctor.py (modified, +92/-12)
  • hermes_cli/gateway.py (modified, +20/-11)
  • hermes_cli/gateway_windows.py (modified, +54/-3)
  • hermes_cli/kanban.py (modified, +327/-30)
  • hermes_cli/kanban_db.py (modified, +1138/-112)
  • hermes_cli/kanban_decompose.py (modified, +45/-8)
  • hermes_cli/kanban_diagnostics.py (modified, +9/-0)
  • hermes_cli/kanban_specify.py (modified, +7/-2)
  • hermes_cli/kanban_swarm.py (added, +279/-0)
  • hermes_cli/main.py (modified, +382/-31)
  • hermes_cli/model_switch.py (modified, +1/-1)
  • hermes_cli/oneshot.py (modified, +9/-0)
  • hermes_cli/providers.py (modified, +1/-0)
  • hermes_cli/proxy/adapters/__init__.py (modified, +2/-0)
  • hermes_cli/proxy/adapters/xai.py (added, +136/-0)
  • hermes_cli/proxy/cli.py (modified, +3/-2)
  • hermes_cli/runtime_provider.py (modified, +94/-7)
  • hermes_cli/skin_engine.py (modified, +6/-1)
  • hermes_cli/uninstall.py (modified, +1/-1)
  • hermes_cli/web_server.py (modified, +146/-27)
  • hermes_constants.py (modified, +74/-1)
  • hermes_logging.py (modified, +1/-1)
  • hermes_state.py (modified, +45/-0)
  • locales/af.yaml (modified, +1/-0)
  • locales/de.yaml (modified, +1/-0)
  • locales/en.yaml (modified, +1/-0)
  • locales/es.yaml (modified, +1/-0)
  • locales/fr.yaml (modified, +1/-0)
  • locales/ga.yaml (modified, +1/-0)
  • locales/hu.yaml (modified, +1/-0)
  • locales/it.yaml (modified, +1/-0)

Code Example

import re
_GATEWAY_PROVIDER_ERROR_RE = re.compile(
    r'(api\s+(?:call\s+)?failed|provider\s+authentication\s+failed|non-retryable\s+error'
    r'|rate\s+limited\s+after\s+\d+\s+retries|error\s+code\s*:|\bhttp\s*\d{3}\b'
    r'|incorrect\s+api\s+key|invalid\s+api\s+key)',
    re.IGNORECASE,
)

# All these match — agent's answer to legit user question gets replaced:
_GATEWAY_PROVIDER_ERROR_RE.search("HTTP 404 means 'not found'")              # MATCH
_GATEWAY_PROVIDER_ERROR_RE.search("When you get HTTP 500 errors, check logs") # MATCH
_GATEWAY_PROVIDER_ERROR_RE.search("The API call failed because token expired") # MATCH (also fine, but also matches non-error "API call failed" prose)
Raw issue text

From post-merge audit of PR #28510 (#24014 salvage, quiet noisy Telegram gateway errors).

Bug

_sanitize_gateway_final_response in gateway/run.py runs only on Telegram final responses. If the response text matches _GATEWAY_PROVIDER_ERROR_RE, the ENTIRE answer is replaced with a canned error message. The pattern \bhttp\s*\d{3}\b triggers on any HTTP status reference in prose.

Reproduction

import re
_GATEWAY_PROVIDER_ERROR_RE = re.compile(
    r'(api\s+(?:call\s+)?failed|provider\s+authentication\s+failed|non-retryable\s+error'
    r'|rate\s+limited\s+after\s+\d+\s+retries|error\s+code\s*:|\bhttp\s*\d{3}\b'
    r'|incorrect\s+api\s+key|invalid\s+api\s+key)',
    re.IGNORECASE,
)

# All these match — agent's answer to legit user question gets replaced:
_GATEWAY_PROVIDER_ERROR_RE.search("HTTP 404 means 'not found'")              # MATCH
_GATEWAY_PROVIDER_ERROR_RE.search("When you get HTTP 500 errors, check logs") # MATCH
_GATEWAY_PROVIDER_ERROR_RE.search("The API call failed because token expired") # MATCH (also fine, but also matches non-error "API call failed" prose)

Impact

User asks "what does HTTP 404 mean?" on Telegram. Agent answers correctly. Response is silently replaced with "⚠️ The model provider failed after retries. I kept raw provider details out of chat; check gateway logs for diagnostics." Telegram-only — CLI/Discord/Slack unaffected.

Confirmed: "Curl returned 'HTTP/1.1 200 OK'" does NOT match. The slash directly after HTTP is neither whitespace nor a digit, so \s*\d{3} cannot match there. The bug therefore specifically hits the common "HTTP NNN" form.
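
The match and non-match cases can be checked in isolation with just the HTTP alternative from the pattern. The names HTTP_STATUS_RE and hits_http_marker are introduced here for illustration only.

```python
import re

# Only the HTTP alternative from the union, compiled on its own.
HTTP_STATUS_RE = re.compile(r"\bhttp\s*\d{3}\b", re.IGNORECASE)


def hits_http_marker(text: str) -> bool:
    """Return True when the text contains an 'HTTP NNN' style reference."""
    return HTTP_STATUS_RE.search(text) is not None


# Prose form: "HTTP", optional whitespace, three digits.
prose_hit = hits_http_marker("HTTP 404 means 'not found'")

# Status-line form: the "/" after HTTP blocks the \s*\d{3} part.
status_line_hit = hits_http_marker("Curl returned 'HTTP/1.1 200 OK'")
```

Here prose_hit is True and status_line_hit is False, which is why raw curl status lines pass through while ordinary explanations of status codes are rewritten.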

Proposed fixes (one of)

  1. Length cap: only sanitize messages under N lines or N characters. Real provider errors are short; assistant answers are long.
  2. Require preamble position: only sanitize when the regex matches in the FIRST line, not anywhere in the body.
  3. Require AT LEAST TWO matches from the union to fire. http NNN alone in prose is a false positive; http NNN + error code: or + api call failed together signals a real error body.
  4. Drop \bhttp\s*\d{3}\b from the union entirely, rely on the other markers (api call failed, error code:, provider authentication failed, etc.).

Option (1) or (2) preserves the original PR's intent (suppress noisy provider failures) while avoiding the false-positive. Option (4) is simplest but loses coverage for raw bodies that JUST contain HTTP 500 Internal Server Error and nothing else.
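
Option (3) could be sketched as below, purely to illustrate the trade-off; it is not the approach the merged fix took. The marker list is split out of the union shown in the reproduction, and the names _MARKERS, count_distinct_markers, and looks_like_error_body are hypothetical.

```python
import re

# One compiled pattern per alternative of the original union.
_MARKERS = [
    re.compile(r"api\s+(?:call\s+)?failed", re.IGNORECASE),
    re.compile(r"provider\s+authentication\s+failed", re.IGNORECASE),
    re.compile(r"non-retryable\s+error", re.IGNORECASE),
    re.compile(r"rate\s+limited\s+after\s+\d+\s+retries", re.IGNORECASE),
    re.compile(r"error\s+code\s*:", re.IGNORECASE),
    re.compile(r"\bhttp\s*\d{3}\b", re.IGNORECASE),
    re.compile(r"(?:incorrect|invalid)\s+api\s+key", re.IGNORECASE),
]


def count_distinct_markers(text: str) -> int:
    """Count how many different markers appear at least once."""
    return sum(1 for pattern in _MARKERS if pattern.search(text))


def looks_like_error_body(text: str, minimum: int = 2) -> bool:
    """Fire only when at least `minimum` distinct markers are present."""
    return count_distinct_markers(text) >= minimum
```

In this sketch a lone status reference in prose no longer fires, but neither does a bare "HTTP 500 Internal Server Error" body, so it shares the coverage gap noted for option (4).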

Scope

Telegram only. _sanitize_gateway_final_response short-circuits for any other platform. This limits the blast radius, but the bug is real for Telegram users.
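
The platform short-circuit described above might look roughly like this. The function signature, the platform string "telegram", and the injected looks_like_error callable are assumptions; only the canned message text is taken from the issue.

```python
from typing import Callable

CANNED_MESSAGE = (
    "⚠️ The model provider failed after retries. I kept raw provider "
    "details out of chat; check gateway logs for diagnostics."
)


def sanitize_final_response(
    platform: str,
    text: str,
    looks_like_error: Callable[[str], bool],
) -> str:
    """Rewrite provider-error text on Telegram only; pass others through."""
    if platform != "telegram":
        # Short-circuit: other platforms are never rewritten.
        return text
    return CANNED_MESSAGE if looks_like_error(text) else text
```

Because the detector is only consulted on the Telegram branch, a false positive in it cannot affect CLI, Discord, or Slack output.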
