hermes - 💡(How to fix) Fix docs(hindsight-plugin): missing local-LLM concurrency warning in plugin README [1 pull request]

hermes · 2026-05-24 21:47:15

The Hermes Hindsight memory provider README at plugins/memory/hindsight/README.md does not warn users about Hindsight's default HINDSIGHT_API_LLM_MAX_CONCURRENT=32. When Hermes and Hindsight share the same local LLM endpoint (a common setup with a single llama-server instance serving both), this default saturates the endpoint's slot pool and starves Hermes of inference slots — appearing as a "frozen" conversation.

Fix Action

Fixed

Fixed by PR: docs(hindsight-plugin): warn about local-LLM concurrency saturation (https://github.com/NousResearch/hermes-agent/pull/31681)

Code Example

echo 'HINDSIGHT_API_LLM_MAX_CONCURRENT=1' >> ~/.hermes/.env
# Restart Hermes so the daemon respawns with the new env.

Summary

Reproduction

Hermes and Hindsight both point at a single llama-server instance (3 slots) on port 8081. After several turns, Hermes appears to freeze. Diagnostics show all 3 slots constantly reporting is_processing: true and about 32 ESTABLISHED connections from hindsight-api.
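The slot check described above could be sketched roughly as follows. This is a minimal illustration, not the reporter's actual diagnostic: count_busy_slots is a hypothetical helper name, and it assumes the llama-server /slots endpoint is enabled on your build and returns JSON in which each slot carries an is_processing field. The ss invocation in the comments is Linux-only.

```shell
# Hypothetical helper: count slots reporting "is_processing": true
# in a /slots JSON payload read from stdin.
count_busy_slots() {
  grep -o '"is_processing"[[:space:]]*:[[:space:]]*true' | wc -l | tr -d ' '
}

# Possible usage against the setup in this report (port 8081 comes from the report;
# the /slots endpoint may need to be enabled when starting llama-server):
#   curl -s http://localhost:8081/slots | count_busy_slots
#
# Linux only: list established TCP connections to the same port, to see
# how many clients are holding connections to the endpoint.
#   ss -tn state established '( dport = :8081 )'
```

If the busy count stays equal to the total number of slots while Hermes is waiting, that matches the saturation pattern described in this report.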

Fix applied locally

echo 'HINDSIGHT_API_LLM_MAX_CONCURRENT=1' >> ~/.hermes/.env
# Restart Hermes so the daemon respawns with the new env.

With HINDSIGHT_API_LLM_MAX_CONCURRENT=1, Hindsight holds at most one slot at a time, leaving the remaining slots free for Hermes.
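One caveat with appending via echo is that running it twice leaves two entries for the same variable. A hedged alternative is sketched below: set_env_var is a hypothetical helper (not part of Hermes or Hindsight) that replaces any existing KEY= line before writing the new value. It assumes the .env file uses plain KEY=value lines.

```shell
# Hypothetical helper: set KEY=value in an env file, replacing any existing entry.
# Usage: set_env_var FILE KEY VALUE
set_env_var() {
  file=$1
  key=$2
  value=$3
  mkdir -p "$(dirname "$file")"
  touch "$file"
  tmp=$(mktemp)
  # Keep every line except a previous assignment of the same key.
  grep -v "^${key}=" "$file" > "$tmp" || true
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  # Write back in place so the original file's permissions are preserved.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

# Possible usage for the setting in this report:
#   set_env_var "$HOME/.hermes/.env" HINDSIGHT_API_LLM_MAX_CONCURRENT 1
# Restart Hermes afterwards so the daemon respawns with the new env.
```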

Requested action

Add a short "Local LLM Concurrency" subsection to plugins/memory/hindsight/README.md (as a child of the "Local Embedded LLM" section), pointing users at the env var with diagnostic commands. Link to upstream docs for full detail.

Happy to submit a PR if this approach is acceptable.
