hermes - 💡(How to fix) Fix bug(honcho): self-hosted localhost setup silently fails — apiKey trap, no recursive fallback, 30s timeout, silent error masking

Root Cause

Five independent issues in plugins/memory/honcho/ combine to make multi-profile self-hosted Honcho setups silently broken. Each is small on its own; together they took ~4 hours to diagnose in production because every failure collapses to "No relevant context found" in tool results, with the real cause visible only in journalctl.

Summary

Environment

Hermes Agent v0.14.0
Honcho self-hosted (Docker stack, AUTH_USE_AUTH=true, workspace-scoped JWTs)
Native systemd profile gateway services (hermes-gateway-<profile>.service)
baseUrl: http://localhost:8000

Issue 1: `apiKey` lookup has no recursive fallback into the `hermes` default block

plugins/memory/honcho/client.py:386:

api_key = (
    host_block.get("apiKey")             # 1. hosts.hermes.<profile>.apiKey
    or raw.get("apiKey")                 # 2. top-level apiKey
    or os.environ.get("HONCHO_API_KEY")  # 3. env var
)

There is no fallback to hosts.hermes.apiKey (the default-profile block). When a user runs hermes honcho setup and provides an apiKey, it's stored in hosts.hermes. Subsequent profile additions create hosts.hermes.<X> without apiKey, expecting inheritance from the default block. Inheritance never happens. Profile X authentication breaks silently.

Expected: apiKey should fall back through hosts.hermes.<X> → hosts.hermes → top-level → env. The first three are all explicit user intent expressed in the same config file.
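The expected chain can be sketched as below. This is only an illustration, not the plugin's code: `resolve_api_key` is a hypothetical helper, and it assumes that `hosts` is a flat mapping keyed by dotted host names such as "hermes" and "hermes.<profile>" (inferred from the `.get(config.host, {})` lookup quoted under Issue 2).

```python
import os
from typing import Any, Mapping, Optional


def resolve_api_key(
    raw: Mapping[str, Any],
    host: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first non-empty apiKey along the proposed fallback chain."""
    env = os.environ if env is None else env
    hosts = raw.get("hosts") or {}

    # Walk from the most specific host key up to its parents,
    # e.g. "hermes.coder" -> "hermes" (assumed flat, dotted keys).
    key = host
    while key:
        block = hosts.get(key) or {}
        if block.get("apiKey"):
            return block["apiKey"]
        key = key.rpartition(".")[0]  # becomes "" once no parent is left

    # Then the top-level key, then the environment variable.
    return raw.get("apiKey") or env.get("HONCHO_API_KEY") or None
```

With this shape, a profile block created without its own apiKey would pick up the key stored in the default hermes block instead of resolving to nothing.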

Issue 2: `_is_local` branch ignores top-level `apiKey`

client.py:758-771:

_is_local = resolved_base_url and (
    "localhost" in resolved_base_url
    or "127.0.0.1" in resolved_base_url
    or "::1" in resolved_base_url
)
if _is_local:
    _host_block = (_raw.get("hosts") or {}).get(config.host, {})
    _host_has_key = bool(_host_block.get("apiKey"))
    effective_api_key = config.api_key if _host_has_key else "local"
else:
    effective_api_key = config.api_key

For a localhost baseUrl, config.api_key is honored only when the host block itself contains an apiKey. A top-level apiKey (which config.api_key correctly resolved via raw.get("apiKey") at line 387) is discarded and replaced with the placeholder string "local".

This is the opposite of the documented inheritance order. For self-hosted Honcho with AUTH_USE_AUTH=true, the string "local" results in a 401 (Invalid JWT).

Workaround used: copy apiKey into every hosts.hermes.<X> block. Brittle — every new profile must remember to do this.

Suggested fix: trust config.api_key when it's non-empty regardless of _is_local. The "skip cloud key on local" intent is fine, but it should be triggered by an explicit "localOnly": true flag, not by URL string matching.
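A minimal sketch of that suggestion follows. The function name `effective_api_key` and the handling of the `localOnly` flag are assumptions made for illustration; neither exists in the plugin today.

```python
from typing import Any, Mapping, Optional

LOCAL_PLACEHOLDER = "local"  # placeholder string from the excerpt above


def effective_api_key(resolved_key: Optional[str], raw: Mapping[str, Any]) -> str:
    """Pick the key to send without inspecting the base URL at all."""
    # Hypothetical opt-in flag: drop any configured key only on request.
    if raw.get("localOnly") is True:
        return LOCAL_PLACEHOLDER
    # A non-empty resolved key is explicit user intent, so trust it.
    if resolved_key:
        return resolved_key
    # Nothing configured: keep the placeholder for unauthenticated servers.
    return LOCAL_PLACEHOLDER
```

The design choice here is that URL string matching no longer influences authentication, so a top-level key works the same for http://localhost:8000 and http://api:8000.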

Issue 3: Default 30s HTTP timeout cuts off `dialectic_query` for `reasoning_level≥medium`

A direct dialectic chat against a peer with a rich representation regularly takes 30–60s at reasoning_level=medium (gpt-5.5 backend over a ~5–10 KB representation). Hermes' default HTTP timeout is 30s:

WARNING plugins.memory.honcho.session: Honcho dialectic query failed: Request timed out after 30.0s

This creates a confusing failure pattern: honcho_search (representation lookup, no LLM) works because it's sub-second. honcho_reasoning fails. User reports "search works but reasoning doesn't" — leading the operator down the wrong debug path (peer-pair representation, observation isolation, etc.) when the real issue is just timeout.

Suggested fix:

Bump default timeout to 60s OR
Auto-stretch default timeout based on reasoning_level (e.g. min(60, 30*max(1, level_to_int))) OR
Surface timeout knob in hermes honcho setup wizard
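The auto-stretch option could look roughly like this. The `LEVEL_TO_INT` mapping is hypothetical (the actual reasoning_level names and their order may differ), and `dialectic_timeout` is an illustrative name, not an existing Hermes function.

```python
# Hypothetical mapping; the real reasoning_level names and order may differ.
LEVEL_TO_INT = {"minimal": 0, "low": 1, "medium": 2, "high": 3, "max": 4}


def dialectic_timeout(reasoning_level: str, base: float = 30.0, cap: float = 60.0) -> float:
    """Scale the HTTP timeout with the reasoning level, capped at `cap` seconds."""
    # Unknown levels are treated like "low" so they keep the base timeout.
    level = LEVEL_TO_INT.get(reasoning_level, 1)
    return min(cap, base * max(1, level))
```

Under this mapping, "low" keeps the 30s default while "medium" and above get the full 60s, which covers the 30–60s range reported above.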

Issue 4: Silent error masking in `dialectic_query`

session.py:dialectic_query (around line 590):

except Exception as e:
    logger.warning("Honcho dialectic query failed: %s", e)
    return ""

return "" causes tool output to render as "No relevant context found" — indistinguishable from a successful query returning no relevant data. Auth errors, timeouts, and "actually nothing matches" all look the same to both the LLM and the user.

Suggested fix: surface a non-empty error marker back to caller, e.g.:

return f"[honcho_error: {type(e).__name__}]"

so the LLM sees the error and can adapt, OR the operator sees something other than the "no-data" message and knows to check journalctl.
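As a sketch of how the marker might be produced and recognised, assuming a hypothetical wrapper `safe_dialectic_query` around whatever callable performs the request (the real dialectic_query signature is not shown in this report):

```python
import logging
from typing import Callable

logger = logging.getLogger("honcho.sketch")  # illustrative logger name


def safe_dialectic_query(run_query: Callable[[], str]) -> str:
    """Run the query; on failure return an error marker instead of ''."""
    try:
        return run_query()
    except Exception as e:
        logger.warning("Honcho dialectic query failed: %s", e)
        return f"[honcho_error: {type(e).__name__}]"


def is_error_marker(result: str) -> bool:
    """True when the result is an error marker rather than real context."""
    return result.startswith("[honcho_error: ") and result.endswith("]")
```

An empty string then keeps its original meaning (the query succeeded and nothing matched), while timeouts and auth failures become distinguishable from it.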

Issue 5: `resolve_config_path()` skips `profiles/<name>/honcho.json`

client.py:resolve_config_path:

local_path = get_hermes_home() / "honcho.json"
if local_path.exists():
    return local_path
# ... falls back to default ~/.hermes/honcho.json

The systemd unit files that Hermes generates for --profile X do NOT set HERMES_HOME, so get_hermes_home() returns the default ~/.hermes. The file ~/.hermes/profiles/X/honcho.json is never read.

This leaves dead configuration in user setups: operators reasonably assume a per-profile honcho.json overrides the global one, but only the global file is read. Per-profile files accumulate as zombie configs (we had 12 dead files in our first instance).

Suggested fix: Either (a) actually read per-profile honcho.json and merge with global, (b) emit a warning if profiles/<name>/honcho.json exists but is never loaded, or (c) document that profile-level configs require explicit HERMES_HOME override and emit warnings when they exist alongside an unset env var.
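Option (b) might be sketched as follows. `find_unread_profile_configs` is a hypothetical helper; the only layout it assumes is the profiles/<name>/honcho.json path described above.

```python
import logging
from pathlib import Path
from typing import List

logger = logging.getLogger("honcho.sketch")  # illustrative logger name


def find_unread_profile_configs(hermes_home: Path, loaded: Path) -> List[Path]:
    """List profiles/<name>/honcho.json files that are not the loaded config."""
    profiles_dir = hermes_home / "profiles"
    if not profiles_dir.is_dir():
        return []
    candidates = sorted(profiles_dir.glob("*/honcho.json"))
    unread = [p for p in candidates if p.resolve() != loaded.resolve()]
    for path in unread:
        # Warn once per file so the operator can spot zombie configs.
        logger.warning("%s exists but is not loaded; using %s", path, loaded)
    return unread
```

Called once at startup with the path that the config resolver actually returned, this would have flagged each of the unused per-profile files in the journal.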

Full RCA from production incident

In our 4-Hermes-instance / 12-profile production setup these issues compounded:

| Instance | baseUrl | apiKey pattern | Status |
|---|---|---|---|
| 1 (native) | `http://localhost:8000` | `hosts.hermes` only | broken (Issue 1 + Issue 2) |
| 2 (native) | `http://localhost:8000` | each `hosts.hermes.<X>` | OK (lucky workaround) |
| 3 (Docker) | `http://api:8000` | top-level | OK (Issue 2 doesn't trigger) |
| 4 (Docker) | `http://api:8000` | top-level | OK |

For 9 days (2026-05-22 → 2026-05-31), 1 profile silently failed all Honcho writes: journalctl logged 86 "Failed to sync messages" errors, but tools and UI showed nothing. We had to backfill ~1000 messages from Hermes' local state.db afterwards (which fortunately preserves everything in SQLite).

Priority order from operator perspective

Issue 3 (timeout) — single config-knob fix, immediate win
Issue 4 (error masking) — small change, huge debuggability improvement
Issue 1 (recursive apiKey fallback) — matches user expectation
Issue 2 (_is_local trap) — surprising behavior, hard to debug
Issue 5 (dead profile files) — discoverability/UX issue

Happy to provide more journalctl excerpts / config samples / repro repo if useful, and could submit PRs for #3 and #4 if there's interest.
