hermes - 💡(How to fix) Fix [Feature]: First-class Local Brain layer for hybrid local/cloud agent cognition

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Hermes currently works very well as a cloud-model-driven agent with local tools, memory, skills, cron, gateway, and execution. A useful next architectural step would be to introduce a first-class Local Brain layer: a local model-backed cognitive component that belongs to the agent itself, not just another custom provider or one-off tool.

In short:

Hermes Agent = local brain + cloud brain + tools + memory + policy/router

The cloud model would remain the strongest reasoning / final decision layer, while the local brain would handle persistent low/medium complexity cognition near the user's machine.

Error Message

  • repeated error-pattern detection

Root Cause

A local brain would make Hermes feel less like a stateless cloud model attached to tools, and more like a persistent personal agent that lives on the user's machine while using cloud intelligence when needed. It would reduce cloud pressure, improve resilience, and make long-running personal/company/project automation more practical.

Code Example

local_brain:
  enabled: true
  provider: ollama
  model: gpt-oss:20b
  endpoint: http://127.0.0.1:11434/v1
  roles:
    preprocess: true
    compress: true
    classify: true
    draft: true
    background_watch: false
  max_risk: medium
  secrets_allowed: false
  require_cloud_review_for:
    - destructive_actions
    - secrets
    - production_changes
    - public_exposure
    - financial_or_billing
    - final_user_memory_writes
RAW_BUFFERClick to expand / collapse

Summary

Hermes currently works very well as a cloud-model-driven agent with local tools, memory, skills, cron, gateway, and execution. A useful next architectural step would be to introduce a first-class Local Brain layer: a local model-backed cognitive component that belongs to the agent itself, not just another custom provider or one-off tool.

In short:

Hermes Agent = local brain + cloud brain + tools + memory + policy/router

The cloud model would remain the strongest reasoning / final decision layer, while the local brain would handle persistent low/medium complexity cognition near the user's machine.

Motivation

Many Hermes workloads are not necessarily worth sending directly to the strongest cloud model every time:

  • log pre-analysis and clustering
  • local file/codebase summarization
  • repeated error-pattern detection
  • long session/context pre-compression
  • memory/skill candidate extraction
  • batch text cleanup/classification
  • local project state digestion
  • low-risk draft generation before cloud review
  • offline/degraded-mode assistance when cloud/network is unstable

Today these can be approximated with custom providers, tools, scripts, or manually routed model calls, but they do not feel like a built-in part of the agent's cognition. The user experience would be stronger if Hermes had an explicit local brain abstraction that the agent runtime can use automatically and transparently.

Proposed concept

Add a first-class Local Brain subsystem with capabilities such as:

  1. Configurable local model endpoint

    • Ollama / llama.cpp / vLLM / OpenAI-compatible local endpoints.
    • Discovery can build on existing local-provider/Ollama work.
  2. Agent-level cognitive roles

    • preprocess: summarize large logs/files before cloud model sees them.
    • compress: local first-pass context/session compression.
    • classify: route low/medium/high-risk tasks.
    • draft: produce local draft answers/plans for cloud review.
    • watch: background local analysis for logs, cron, project state.
  3. Cloud brain escalation policy

    • Strong cloud model remains the final authority for high-risk decisions, secrets, production operations, complex architecture, and final review.
    • Local brain outputs should be treated as candidate evidence, not automatically trusted final truth.
  4. Visible routing and audit

    • Logs/UI should show when local brain vs cloud brain was used.
    • Include token/cost/time savings and local model health.
  5. Safety boundaries

    • Local brain should not receive secrets unless explicitly allowed by policy.
    • Local brain should not execute actions directly; it proposes/filters/summarizes, while Hermes policy/tool layer remains in control.
    • Capability/risk routing should be explicit and overrideable.

Relationship to existing issues

This is related to but different from:

  • #19091 — Local Ollama discovery / custom provider support.
  • #16525 — model_switch / task-complexity routing.
  • #18715 — remote Hermes agent with local tool execution.

Those are useful building blocks. This proposal is more about making a local model an official agent cognition layer, not just another provider or external tool.

Example UX/config sketch

local_brain:
  enabled: true
  provider: ollama
  model: gpt-oss:20b
  endpoint: http://127.0.0.1:11434/v1
  roles:
    preprocess: true
    compress: true
    classify: true
    draft: true
    background_watch: false
  max_risk: medium
  secrets_allowed: false
  require_cloud_review_for:
    - destructive_actions
    - secrets
    - production_changes
    - public_exposure
    - financial_or_billing
    - final_user_memory_writes

Possible runtime behavior:

  • Large logs -> local brain clusters and summarizes -> cloud brain sees a compact evidence pack.
  • Low-risk repetitive text cleanup -> local brain drafts -> Hermes verifies output shape.
  • Potential memory/skill candidates -> local brain proposes -> cloud brain or policy approves before writing.
  • Network/cloud outage -> local brain can still provide limited local diagnostics and draft next steps.

Why this matters

A local brain would make Hermes feel less like a stateless cloud model attached to tools, and more like a persistent personal agent that lives on the user's machine while using cloud intelligence when needed. It would reduce cloud pressure, improve resilience, and make long-running personal/company/project automation more practical.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING