hermes: run_conversation blocks 30-50s on _sync_external_memory_for_turn, delaying response.completed

Problem

run_conversation() calls _sync_external_memory_for_turn() synchronously before returning (line ~15194 in run_agent.py). With the Hindsight memory provider, this triggers a retain operation that involves:

  1. Connecting to the hindsight daemon
  2. LLM-based entity resolution

Together these steps take 30-50 seconds per turn. The assistant's text has already been streamed via response.output_text.delta events, but response.completed is held back until the memory sync finishes.
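The call order described above can be reproduced in isolation. The sketch below is not the hermes code: run_conversation_blocking and slow_sync are hypothetical stand-ins, and the 0.2 s sleep replaces the 30-50 s retain operation. It only illustrates why the completion event cannot be emitted before the sync returns.

```python
import time

def slow_sync(delay=0.2):
    """Stand-in for the slow retain operation (30-50 s in the report)."""
    time.sleep(delay)

def run_conversation_blocking(chunks, sync):
    """Hypothetical stand-in for run_conversation(): stream, sync, then complete."""
    events = []
    for chunk in chunks:
        # Text deltas go out immediately, so the user already sees the answer.
        events.append(("response.output_text.delta", chunk, time.monotonic()))
    sync()  # synchronous call: nothing below runs until the sync returns
    events.append(("response.completed", None, time.monotonic()))
    return events

events = run_conversation_blocking(["Hel", "lo"], slow_sync)
gap = events[-1][2] - events[-2][2]
print(f"response.completed arrived {gap:.2f}s after the last delta")
```

The gap grows one-for-one with the duration of the sync, which matches the behaviour reported here.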

Impact

The web UI (and any SSE client) shows a loading spinner for 30+ seconds after the response text has already appeared. The agent log shows:

  • Turn ended at T+4s (agent done)
  • Connected to daemon at T+34s (memory sync done)
  • POST /v1/responses 200 at T+34s (SSE stream finally closes)
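To make the stall concrete, the timeline above can be turned into a number. This is a minimal sketch: stall_seconds is a hypothetical helper, and the event list is illustrative, assuming the last text delta coincides with the T+4s "turn ended" mark.

```python
def stall_seconds(events):
    """Return seconds between the last text delta and response.completed."""
    last_delta = max(t for name, t in events if name == "response.output_text.delta")
    completed = next(t for name, t in events if name == "response.completed")
    return completed - last_delta

# Illustrative timeline in seconds since the request started (T+0),
# built from the T+4s and T+34s marks in the log above.
timeline = [
    ("response.output_text.delta", 3.5),  # assumed earlier delta
    ("response.output_text.delta", 4.0),  # assumed to match "turn ended"
    ("response.completed", 34.0),
]
stall = stall_seconds(timeline)
print(f"client waited {stall:.0f}s after the last visible text")
```

The same helper could be fed timestamps captured by a real SSE client to check whether the fix removes the stall.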

Fix

In run_agent.py (~line 15194), run _sync_external_memory_for_turn in a background thread:

# Before (blocking):
self._sync_external_memory_for_turn(
    original_user_message=original_user_message,
    final_response=final_response,
    interrupted=interrupted,
)

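# After (non-blocking):
import threading  # normally placed with the module-level imports

# Run the best-effort sync off the request path so run_conversation() can return.
threading.Thread(
    target=self._sync_external_memory_for_turn,
    kwargs=dict(
        original_user_message=original_user_message,
        final_response=final_response,
        interrupted=interrupted,
    ),
    daemon=True,  # does not keep the process alive at shutdown
    name="hindsight-sync-turn",
).start()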

This is safe because:

  • _sync_external_memory_for_turn is already wrapped in try/except (best-effort)
  • The memory sync doesn't affect the current turn's response
  • The next turn's prefetch (queue_prefetch_all) is already async (spawns its own thread)
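The pattern and its safety argument can be exercised outside the agent. The sketch below is an assumption-laden illustration, not the hermes implementation: AgentSketch, its sync_delay and fail parameters, and flush_memory_syncs are all hypothetical. It shows that the caller returns before the sync finishes and that a failing sync stays contained in its thread. One caveat worth knowing: Python stops daemon threads abruptly at interpreter exit, so a sync still in flight at shutdown is lost unless something waits for it. The optional flush method shows one way to do that.

```python
import logging
import threading
import time

logger = logging.getLogger("memory-sync-demo")

class AgentSketch:
    """Hypothetical stand-in for the agent; only the threading pattern matters."""

    def __init__(self, sync_delay=0.5, fail=False):
        self.sync_delay = sync_delay
        self.fail = fail
        self.synced = threading.Event()
        self._sync_threads = []

    def _sync_external_memory_for_turn(self, original_user_message,
                                       final_response, interrupted):
        # Best-effort: errors are logged and swallowed, mirroring the
        # try/except the issue says already wraps the real method.
        try:
            time.sleep(self.sync_delay)  # stands in for the slow retain call
            if self.fail:
                raise RuntimeError("simulated provider failure")
            self.synced.set()
        except Exception:
            logger.exception("memory sync failed")

    def run_conversation(self, user_message):
        final_response = f"echo: {user_message}"
        thread = threading.Thread(
            target=self._sync_external_memory_for_turn,
            kwargs=dict(
                original_user_message=user_message,
                final_response=final_response,
                interrupted=False,
            ),
            daemon=True,
            name="hindsight-sync-turn",
        )
        thread.start()
        self._sync_threads.append(thread)
        return final_response  # returns without waiting for the sync

    def flush_memory_syncs(self, timeout=5.0):
        # Optional shutdown hook: wait briefly for in-flight syncs, since
        # daemon threads are abandoned when the interpreter exits.
        for thread in self._sync_threads:
            thread.join(timeout)
        self._sync_threads = [t for t in self._sync_threads if t.is_alive()]


agent = AgentSketch()
start = time.monotonic()
reply = agent.run_conversation("hi")
returned_after = time.monotonic() - start
synced_at_return = agent.synced.is_set()  # sync is still running here
agent.flush_memory_syncs()
print(reply, f"(returned after {returned_after:.3f}s)")
```

Whether a flush on shutdown is worth adding depends on how much a lost final-turn sync matters. The issue treats the sync as best-effort, so skipping it may be acceptable.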
