hermes - [Feature]: First-class Local Brain layer for hybrid local/cloud agent cognition

hermes, 2026-05-08 17:27:58

Hermes currently works very well as a cloud-model-driven agent with local tools, memory, skills, cron, gateway, and execution. A useful next architectural step would be to introduce a first-class Local Brain layer: a local model-backed cognitive component that belongs to the agent itself, not just another custom provider or one-off tool.

In short:

Hermes Agent = local brain + cloud brain + tools + memory + policy/router

The cloud model would remain the strongest reasoning / final decision layer, while the local brain would handle persistent low/medium complexity cognition near the user's machine.

Why this matters

A local brain would make Hermes feel less like a stateless cloud model attached to tools, and more like a persistent personal agent that lives on the user's machine while using cloud intelligence when needed. It would reduce cloud pressure, improve resilience, and make long-running personal/company/project automation more practical.

Motivation

Many Hermes workloads are not necessarily worth sending directly to the strongest cloud model every time:

- log pre-analysis and clustering
- local file/codebase summarization
- repeated error-pattern detection
- long session/context pre-compression
- memory/skill candidate extraction
- batch text cleanup/classification
- local project state digestion
- low-risk draft generation before cloud review
- offline/degraded-mode assistance when cloud/network is unstable

Today these can be approximated with custom providers, tools, scripts, or manually routed model calls, but they do not feel like a built-in part of the agent's cognition. The user experience would be stronger if Hermes had an explicit local brain abstraction that the agent runtime can use automatically and transparently.

Proposed concept

Add a first-class Local Brain subsystem with capabilities such as:

Configurable local model endpoint
- Ollama / llama.cpp / vLLM / OpenAI-compatible local endpoints.
- Discovery can build on existing local-provider/Ollama work.
Agent-level cognitive roles
- preprocess: summarize large logs/files before cloud model sees them.
- compress: local first-pass context/session compression.
- classify: route low/medium/high-risk tasks.
- draft: produce local draft answers/plans for cloud review.
- watch: background local analysis for logs, cron, project state (the background_watch key in the config sketch).
Cloud brain escalation policy
- Strong cloud model remains the final authority for high-risk decisions, secrets, production operations, complex architecture, and final review.
- Local brain outputs should be treated as candidate evidence, not automatically trusted final truth.
Visible routing and audit
- Logs/UI should show when local brain vs cloud brain was used.
- Include token/cost/time savings and local model health.
Safety boundaries
- Local brain should not receive secrets unless explicitly allowed by policy.
- Local brain should not execute actions directly; it proposes/filters/summarizes, while Hermes policy/tool layer remains in control.
- Capability/risk routing should be explicit and overrideable.
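
To make the escalation and audit ideas above more concrete, here is a minimal routing sketch. Everything in it is hypothetical and not part of Hermes today: the names `Risk`, `Task`, `RouteDecision`, and `BrainRouter` are invented for illustration, and the rule order (health, then secrets, then risk, then role) is just one plausible policy. The router only decides which brain does the thinking; it never executes actions.

```python
from dataclasses import dataclass, field
from enum import IntEnum
import time


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Task:
    role: str                      # e.g. "preprocess", "compress", "classify", "draft"
    risk: Risk
    contains_secrets: bool = False


@dataclass
class RouteDecision:
    brain: str                     # "local" or "cloud"
    reason: str
    timestamp: float = field(default_factory=time.time)


@dataclass
class BrainRouter:
    """Hypothetical policy/router: picks a brain and records why."""
    enabled_roles: set
    max_local_risk: Risk = Risk.MEDIUM
    secrets_allowed: bool = False
    audit_log: list = field(default_factory=list)

    def route(self, task: Task, local_healthy: bool = True) -> RouteDecision:
        # Escalate to the cloud brain whenever any local-policy check fails.
        if not local_healthy:
            decision = RouteDecision("cloud", "local model unavailable")
        elif task.contains_secrets and not self.secrets_allowed:
            decision = RouteDecision("cloud", "secrets not allowed on local brain")
        elif task.risk > self.max_local_risk:
            decision = RouteDecision("cloud", "risk above local limit")
        elif task.role not in self.enabled_roles:
            decision = RouteDecision("cloud", f"role '{task.role}' not enabled locally")
        else:
            decision = RouteDecision("local", "within local policy")
        # Visible routing: keep an audit entry for logs/UI.
        self.audit_log.append((task.role, decision.brain, decision.reason))
        return decision


router = BrainRouter(enabled_roles={"preprocess", "compress", "classify", "draft"})
print(router.route(Task("preprocess", Risk.LOW)).brain)   # local
print(router.route(Task("draft", Risk.HIGH)).brain)       # cloud
```

The audit log here is a plain list; a real implementation would presumably also record token, cost, and time savings as suggested above.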

Relationship to existing issues

This is related to but different from:

- #19091: Local Ollama discovery / custom provider support.
- #16525: model_switch / task-complexity routing.
- #18715: remote Hermes agent with local tool execution.

Those are useful building blocks. This proposal is more about making a local model an official agent cognition layer, not just another provider or external tool.

Example UX/config sketch

local_brain:
  enabled: true
  provider: ollama
  model: gpt-oss:20b
  endpoint: http://127.0.0.1:11434/v1
  roles:
    preprocess: true
    compress: true
    classify: true
    draft: true
    background_watch: false
  max_risk: medium
  secrets_allowed: false
  require_cloud_review_for:
    - destructive_actions
    - secrets
    - production_changes
    - public_exposure
    - financial_or_billing
    - final_user_memory_writes
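
As a rough illustration of how a runtime might consume this config, the sketch below validates the `local_brain` section and answers "does this task need cloud review?". It assumes the YAML has already been parsed into a plain dict (for example with PyYAML's `yaml.safe_load`). The class `LocalBrainConfig`, the `RISK_ORDER` table, and the idea of tagging tasks with the `require_cloud_review_for` labels are all assumptions made for this example, not existing Hermes APIs.

```python
from dataclasses import dataclass

# Assumed ordering of the risk labels used by max_risk.
RISK_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass(frozen=True)
class LocalBrainConfig:
    enabled: bool
    provider: str
    model: str
    endpoint: str
    roles: dict
    max_risk: str
    secrets_allowed: bool
    require_cloud_review_for: tuple

    @classmethod
    def from_dict(cls, raw: dict) -> "LocalBrainConfig":
        section = raw["local_brain"]
        if section["max_risk"] not in RISK_ORDER:
            raise ValueError(f"unknown max_risk: {section['max_risk']!r}")
        return cls(
            enabled=bool(section["enabled"]),
            provider=section["provider"],
            model=section["model"],
            endpoint=section["endpoint"],
            roles=dict(section.get("roles", {})),
            max_risk=section["max_risk"],
            secrets_allowed=bool(section.get("secrets_allowed", False)),
            require_cloud_review_for=tuple(section.get("require_cloud_review_for", ())),
        )

    def role_enabled(self, role: str) -> bool:
        # A role is usable only if the whole subsystem is enabled too.
        return self.enabled and bool(self.roles.get(role, False))

    def needs_cloud_review(self, risk: str, tags: set) -> bool:
        # Escalate when risk exceeds max_risk or any tag is on the review list.
        if RISK_ORDER[risk] > RISK_ORDER[self.max_risk]:
            return True
        return any(tag in self.require_cloud_review_for for tag in tags)


# Dict mirroring part of the sketch above, as a YAML parser would return it.
raw = {"local_brain": {
    "enabled": True, "provider": "ollama", "model": "gpt-oss:20b",
    "endpoint": "http://127.0.0.1:11434/v1",
    "roles": {"preprocess": True, "background_watch": False},
    "max_risk": "medium", "secrets_allowed": False,
    "require_cloud_review_for": ["destructive_actions", "secrets"],
}}
cfg = LocalBrainConfig.from_dict(raw)
print(cfg.role_enabled("preprocess"), cfg.needs_cloud_review("low", {"secrets"}))
```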

Possible runtime behavior:

- Large logs -> local brain clusters and summarizes -> cloud brain sees a compact evidence pack.
- Low-risk repetitive text cleanup -> local brain drafts -> Hermes verifies output shape.
- Potential memory/skill candidates -> local brain proposes -> cloud brain or policy approves before writing.
- Network/cloud outage -> local brain can still provide limited local diagnostics and draft next steps.
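
The "compact evidence pack" in the first flow could look roughly like the sketch below. To keep it runnable without a model, it uses plain regex normalization as a stand-in for the local brain's clustering step, which is an assumption of this example. The function names `normalize` and `build_evidence_pack` and the pack's fields are likewise hypothetical.

```python
import re
from collections import Counter

_HEX = re.compile(r"0x[0-9a-fA-F]+")
_NUMBER = re.compile(r"\d+")


def normalize(line: str) -> str:
    """Collapse variable parts so similar log lines share one pattern."""
    line = _HEX.sub("<hex>", line)      # replace hex literals first
    line = _NUMBER.sub("<n>", line)     # then any remaining digit runs
    return line.strip()


def build_evidence_pack(lines, top_k=5):
    """Group lines by pattern and keep the top_k most frequent groups."""
    counts = Counter()
    examples = {}
    for line in lines:
        if not line.strip():
            continue                     # skip blank lines
        key = normalize(line)
        counts[key] += 1
        examples.setdefault(key, line.strip())  # keep first raw example
    return [
        {"pattern": pattern, "count": count, "example": examples[pattern]}
        for pattern, count in counts.most_common(top_k)
    ]


logs = [
    "timeout after 30s on worker 4",
    "timeout after 45s on worker 7",
    "disk full at 0x1f",
    "",
    "timeout after 5s on worker 1",
]
for entry in build_evidence_pack(logs):
    print(entry["count"], entry["pattern"])
```

In the proposed design the cloud brain would receive only this short list rather than the raw log, and would treat it as candidate evidence rather than final truth.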
