hermes - 💡(How to fix) Fix Discord thread action latency: 338s–405s responses with provider 300s timeout + retry (gpt-5.3-codex)



Summary

In a Discord thread session, simple user requests took ~5.6 to 6.7 minutes to complete. The gateway responded, but end-to-end latency was dominated by a provider stall timeout and retry behavior.

User impact

  • User asked for a single-file edit ("Remove that shit from today’s log too")
  • Response took 338.4s
  • Prior turn in same thread took 404.9s
  • User explicitly reported frustration about the 6-minute delay

Environment / context

  • Platform: Discord thread
  • Model: gpt-5.3-codex
  • Provider: openai-codex
  • Host: Linux 5.15.0-177-generic
  • Session id (gateway trace key): 20260526_120015_527597c5
  • Langfuse trace id repeatedly shown: 880e9a572038cf13f6c5b3f3c9cd21d8

Evidence (gateway.log)

From /home/hermes/.hermes/logs/gateway.log:

2026-05-26 12:00:15.612 inbound message ("Let’s plan my workout for today")
2026-05-26 12:00:15.873 Langfuse tracing started
2026-05-26 12:07:00.468 response ready ... time=404.9s api_calls=11

2026-05-26 12:51:56.567 inbound message ("Remove that shit from today’s log too")
2026-05-26 12:51:56.821 Langfuse tracing started
2026-05-26 12:57:34.988 response ready ... time=338.4s api_calls=6

Memory/thread health around the same period looked stable:

12:53:04 memory_monitor rss=280MB threads=21
12:58:04 memory_monitor rss=280MB threads=20
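
As a cross-check, the reported time= values can be compared with the wall-clock gap between each inbound message and its "response ready" line. The sketch below only re-uses the timestamps quoted above; the helper name elapsed_seconds is made up for illustration and is not part of the gateway.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# (inbound timestamp, response-ready timestamp, reported time= value in seconds)
# All three values per row are copied from the gateway.log excerpt.
TURNS = [
    ("2026-05-26 12:00:15.612", "2026-05-26 12:07:00.468", 404.9),
    ("2026-05-26 12:51:56.567", "2026-05-26 12:57:34.988", 338.4),
]


def elapsed_seconds(start: str, end: str) -> float:
    """Wall-clock seconds between two gateway.log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


for start, end, reported in TURNS:
    wall = elapsed_seconds(start, end)
    print(f"{start}: wall={wall:.3f}s reported={reported}s diff={wall - reported:+.3f}s")
```

Both turns agree with the reported values to within a tenth of a second, so the time= field appears to measure the full inbound-to-response interval rather than only provider time.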

Evidence (Discord-visible runtime diagnostics)

From thread messages emitted by the bot runtime:

⏳ Still working... (3 min elapsed — iteration 7/60, waiting for non-streaming response (120s elapsed))
⚠️ No response from provider for 300s (non-streaming, model: gpt-5.3-codex). Aborting call.
⏳ Retrying in 3.0s (attempt 1/3)...
⏳ Still working... (6 min elapsed — iteration 8/60, receiving stream response)

A second similar timeout signal appeared later in the same thread:

⚠️ No response from provider for 300s (non-streaming, model: gpt-5.3-codex). Aborting call.
⏳ Retrying in 2.9s (attempt 1/3)...

What this suggests

The primary suspect is a provider-side or provider-transport stall on the non-streaming path, which lets the full 300s timeout elapse before any retry starts. The gateway itself remained alive and responsive, and memory and thread counts looked normal.

Secondary factors that increase perceived latency:

  • Multiple tool calls in one turn (expected but not dominant vs 300s timeout)
  • Occasional extra patch/read cycles due to stale-file warning and failed match (not root cause of 5+ minute delay, but adds overhead)
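
To put a rough number on "dominated by", the arithmetic below estimates how much of each turn was spent in the stalled call plus the retry backoff. It assumes each slow turn contained exactly one 300s timeout and one backoff, and it pairs the 3.0s and 2.9s backoffs with the two turns in order; the diagnostics above do not confirm either assumption, so treat the result as an estimate.

```python
PROVIDER_TIMEOUT_S = 300.0  # from the Discord-visible abort message

# (reported turn time in seconds, retry backoff in seconds)
# Pairing each backoff with a specific turn is an assumption for illustration.
TURNS = [(404.9, 3.0), (338.4, 2.9)]


def stall_share(total_s: float, backoff_s: float,
                timeout_s: float = PROVIDER_TIMEOUT_S) -> tuple:
    """Return (seconds lost to timeout + backoff, fraction of the turn)."""
    stalled = timeout_s + backoff_s
    return stalled, stalled / total_s


for total, backoff in TURNS:
    stalled, share = stall_share(total, backoff)
    print(f"turn={total}s stalled={stalled:.1f}s "
          f"share={share:.1%} remaining={total - stalled:.1f}s")
```

Under these assumptions the stall accounts for roughly 75% and 90% of the two turns, which leaves about 102s and 36s for all tool calls and successful provider calls combined.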

Requested investigation

  1. Inspect openai-codex provider call path for non-streaming stalls in this session around:
    • 2026-05-26 12:00–12:07 UTC
    • 2026-05-26 12:51–12:58 UTC
  2. Verify whether retry policy can fail over faster (e.g., earlier timeout/hedged retry) instead of waiting full 300s.
  3. Confirm whether partial progress checkpoints/user-visible incremental responses can reduce perceived freeze in long tool-heavy turns.
  4. Correlate Langfuse trace 880e9a572038cf13f6c5b3f3c9cd21d8 with provider request lifecycle and transport timings.
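
One possible shape for the hedged retry mentioned in item 2 is sketched below with asyncio. This is not the gateway's retry code: hedged_call, make_call, hedge_after and overall_timeout are hypothetical names, and the fake provider only simulates a first attempt that stalls.

```python
import asyncio
from typing import Awaitable, Callable


async def hedged_call(
    make_call: Callable[[int], Awaitable[str]],
    hedge_after: float,
    overall_timeout: float,
) -> str:
    """Start one attempt; if it is still pending after hedge_after seconds,
    start a second attempt and return whichever finishes first."""
    tasks = [asyncio.ensure_future(make_call(1))]
    try:
        done, _ = await asyncio.wait(tasks, timeout=hedge_after)
        if not done:
            # First attempt looks stalled: launch the hedge alongside it.
            tasks.append(asyncio.ensure_future(make_call(2)))
            done, _ = await asyncio.wait(
                tasks,
                timeout=overall_timeout - hedge_after,
                return_when=asyncio.FIRST_COMPLETED,
            )
        if not done:
            raise asyncio.TimeoutError("no attempt finished in time")
        # If the winning attempt failed, result() re-raises its exception.
        return next(iter(done)).result()
    finally:
        # Cancel whichever attempt is still running and wait for cleanup.
        for task in tasks:
            if not task.done():
                task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)


async def fake_provider(attempt: int) -> str:
    """Stand-in for a provider call: attempt 1 stalls, attempt 2 answers."""
    await asyncio.sleep(10.0 if attempt == 1 else 0.05)
    return f"ok from attempt {attempt}"


if __name__ == "__main__":
    print(asyncio.run(hedged_call(fake_provider, hedge_after=0.1, overall_timeout=1.0)))
```

Hedging sends a duplicate request, so it only makes sense if the provider call has no side effects and the extra token cost is acceptable. A simpler alternative is to lower the non-streaming timeout well below 300s and keep the existing sequential retry.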

Nice-to-have instrumentation improvements

  • Include provider request-id and per-attempt latency in gateway logs for each assistant turn
  • Emit explicit provider_call_start/provider_call_end/provider_timeout lines in agent.log
  • Surface retry cause + attempt durations in one structured event
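
A minimal sketch of what these log lines could look like, written as JSON lines. The event names provider_call_start, provider_call_end and provider_timeout come from the list above; everything else (the call_with_events wrapper, the provider_call_summary event, the field names, and the request_id parameter) is an assumption for illustration and not an existing hermes API.

```python
import json
import time
from typing import Callable, List, Optional


def call_with_events(
    call: Callable[[], str],
    model: str,
    request_id: Optional[str] = None,  # hypothetical: provider request id if known
    max_attempts: int = 3,
    emit: Callable[[str], None] = print,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Run call() with retries on TimeoutError and emit one JSON line per event."""
    attempts: List[dict] = []
    result: Optional[str] = None
    for n in range(1, max_attempts + 1):
        base = {"model": model, "request_id": request_id, "attempt": n}
        emit(json.dumps({"event": "provider_call_start", **base}))
        started = clock()
        try:
            result = call()
            outcome = "ok"
        except TimeoutError:
            outcome = "timeout"
        duration = round(clock() - started, 3)
        attempts.append({"attempt": n, "outcome": outcome, "duration_s": duration})
        event = "provider_call_end" if outcome == "ok" else "provider_timeout"
        emit(json.dumps({"event": event, "duration_s": duration, **base}))
        if outcome == "ok":
            break
    # One structured summary per turn: retry cause and per-attempt durations.
    emit(json.dumps({"event": "provider_call_summary", "model": model,
                     "request_id": request_id, "attempts": attempts}))
    return result
```

With lines like these, a 300s stall would show up as a provider_timeout event with duration_s near 300, followed by a second provider_call_start, which makes the per-attempt breakdown visible without reading the Discord thread.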

Repro notes

This occurred during active multi-thread Discord usage, while another thread was also being answered. It could be useful to test for concurrency interactions under the same provider/model settings.
