hermes - 💡(How to fix) Fix Proposal: Subprocess executor bridge for delegating gateway turns to Claude Code CLI [2 pull requests]

Root Cause

I'd like to propose contributing a subprocess executor bridge that lets a Hermes gateway profile delegate a turn to a locally-authenticated claude CLI (Claude Code) instead of (or in addition to) a Hermes model provider. I've been running this in production for ~2 months on Windows and would like to upstream it if there's interest.

This is not a new model provider — Hermes already supports any provider via the existing abstraction. It's an alternative executor: spawn claude as a subprocess, stream NDJSON events back, fold the result into the existing gateway/session/platform plumbing.

Summary

Why subprocess instead of API

The Claude Code CLI ships with capabilities that aren't reachable through the bare Anthropic API:

Local agentic tools out of the box — file edit, bash, search, etc., all sandboxed by the CLI
Anthropic's prompt-caching surface — already optimized in the CLI, just needs a byte-stable prefix
OAuth / local auth — users who pay for Claude Code's subscription can leverage it without managing API keys in Hermes
Tool/permission policy — --allowedTools / --disallowedTools flags

For Hermes users who already use claude locally, this lets a gateway profile (e.g., "advisor", "code-reviewer") delegate complex turns to Claude Code without the user copy-pasting between Telegram/Feishu and a terminal.

Use case it solves for me

I run a multi-profile Hermes deployment on Windows where:

One profile uses an open model via the standard provider abstraction (fast / cheap)
Another profile is configured as "deep code reviewer" and delegates to local claude via this bridge

The bridge gives me Hermes's gateway features (Feishu/Telegram integration, session memory, platform routing) while letting claude handle the heavy lifting on tool-rich turns. It's been stable in production for ~60 days.

Design (current local implementation)

gateway/run.py
        │
        ▼
gateway/claude_code_bridge.py  ← proposed new module
        │
        │ asyncio.create_subprocess_exec
        ▼
   claude --output-format stream-json --input-format text ...
        │
        ▼
   stream NDJSON events → fold into Hermes gateway response
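
The flow in the diagram can be sketched roughly as follows. This is a minimal illustration, not the bridge's actual code: `FAKE_CLI`, `run_turn`, `scrubbed_env`, and the `{"type": "text"}` event shape are hypothetical stand-ins, and a tiny Python child process plays the role of `claude` so the example runs without the real CLI. It assumes the prompt travels over stdin, the stream limit is raised above asyncio's default, and credential variables are removed from the child environment, as the proposal describes.

```python
import asyncio
import hashlib
import json
import os
import sys

# Credential variables the proposal says are stripped from the child env.
SCRUBBED_VARS = ("ANTHROPIC_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_TOKEN")
STREAM_LIMIT = 16 * 1024 * 1024  # 16 MiB, well above asyncio's 64 KiB default

# Stand-in for the real `claude` binary (assumption, for a runnable demo):
# echoes stdin back as two NDJSON events with a made-up schema.
FAKE_CLI = [sys.executable, "-c", (
    "import sys, json\n"
    "prompt = sys.stdin.read()\n"
    "print(json.dumps({'type': 'text', 'text': 'echo: '}))\n"
    "print(json.dumps({'type': 'text', 'text': prompt}))\n"
)]


def scrubbed_env(base=None):
    """Copy the environment without the credential variables."""
    env = dict(os.environ if base is None else base)
    for name in SCRUBBED_VARS:
        env.pop(name, None)
    return env


async def run_turn(argv, stable_prefix, user_message):
    """Spawn the CLI, send the prompt on stdin, fold NDJSON events into text."""
    prefix_digest = hashlib.sha256(stable_prefix.encode("utf-8")).hexdigest()
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        env=scrubbed_env(),
        limit=STREAM_LIMIT,
    )
    # Stable prefix first, per-turn message last; argv carries no prompt text.
    proc.stdin.write((stable_prefix + user_message).encode("utf-8"))
    await proc.stdin.drain()
    proc.stdin.close()

    parts = []
    async for raw in proc.stdout:          # one NDJSON event per line
        line = raw.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") == "text":    # hypothetical event shape
            parts.append(event["text"])
    await proc.wait()
    return "".join(parts), prefix_digest
```

Calling `asyncio.run(run_turn(FAKE_CLI, "SYSTEM\n", "hello"))` returns the folded text plus the SHA-256 hex digest of the prefix. A real bridge would likely write stdin and read stdout concurrently to avoid pipe back-pressure on very large prompts.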

Key implementation details (battle-tested over ~60 days):

asyncio.create_subprocess_exec with a stdin pipe; the CLI argv stays short and static to avoid Windows .cmd shim metacharacter parsing (long prompts go via stdin)
StreamReader limit raised to 16 MiB (limit=16 * 1024 * 1024): Claude can emit single NDJSON lines that exceed asyncio's default 64 KiB when quoting long files / SQL output
Prompt-cache-aware prefix ordering — stable role/system prompt first, per-turn user message last; SHA-256 hex of the cache-stable prefix logged next to cache_read / cache_creation counters so cache drift is visible
Triple fail-soft for hung Anthropic streams:
1. first_event_timeout (default 60s) — fail-fast if subprocess emits nothing
2. inter_event_silence_timeout — fail-fast on long inter-event gaps
3. budget_seconds / budget_tool_count — soft cap that returns partial response instead of error
auto_resume_budget_guard — when a turn exhausts its tool budget naturally (not by error), auto-spawn a continuation segment using claude --resume <session> up to N times
Env hygiene — strips ANTHROPIC_API_KEY / CLAUDE_CODE_OAUTH_TOKEN / ANTHROPIC_TOKEN from subprocess env so the bridge can't accidentally exfiltrate user credentials
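
To make the three timeout/budget layers and the auto-resume guard concrete, here is one possible shape for them. It is a sketch under assumptions: `collect_events`, `TurnResult`, `run_with_auto_resume`, the `"tool_use"` event type, and the 120-second default for `inter_event_silence_timeout` are all hypothetical (the proposal gives no default for that setting). It takes any async iterator of event dicts, so it is independent of how the subprocess is read.

```python
import asyncio
import time
from dataclasses import dataclass, field


@dataclass
class TurnResult:
    events: list = field(default_factory=list)
    status: str = "complete"   # "complete", "partial", or "timeout"
    reason: str = ""


async def collect_events(events, *, first_event_timeout=60.0,
                         inter_event_silence_timeout=120.0,
                         budget_seconds=480.0, budget_tool_count=20):
    """Drain an async iterator of event dicts under three fail-soft layers."""
    result = TurnResult()
    started = time.monotonic()
    tool_calls = 0
    iterator = events.__aiter__()
    while True:
        # Layers 1 and 2: bound the silence before the next event.
        limit = inter_event_silence_timeout if result.events else first_event_timeout
        try:
            event = await asyncio.wait_for(iterator.__anext__(), timeout=limit)
        except StopAsyncIteration:
            return result
        except asyncio.TimeoutError:
            result.status = "timeout"
            result.reason = "inter_event_silence" if result.events else "first_event"
            return result
        result.events.append(event)
        if event.get("type") == "tool_use":   # hypothetical event shape
            tool_calls += 1
        # Layer 3: soft budgets return what was collected so far.
        if tool_calls >= budget_tool_count:
            result.status, result.reason = "partial", "budget_tool_count"
            return result
        if time.monotonic() - started >= budget_seconds:
            result.status, result.reason = "partial", "budget_seconds"
            return result


async def run_with_auto_resume(run_segment, max_resumes=5):
    """Re-run a segment while the previous one ended on the tool budget."""
    segments = [await run_segment(resume=False)]
    while segments[-1].reason == "budget_tool_count" and len(segments) <= max_resumes:
        # In the real bridge this is where `claude --resume <session>` would run.
        segments.append(await run_segment(resume=True))
    return segments
```

Here a timeout yields a `"timeout"` status while a budget hit yields `"partial"` with the events gathered so far, which keeps the "partial response instead of error" behaviour separate from the hung-stream cases.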

Config sketch (cli-config.yaml)

claude_code_bridge:
  enabled: true
  model: claude-opus-4-7        # optional, overrides CLI default
  cwd: /path/to/working/dir
  system_prompt_path: /path/to/extra-prompt.md  # injected after CLI's own system prompt
  timeout: 600                  # outer hard cap (seconds)
  first_event_timeout: 60       # NDJSON silence before first event
  budget_seconds: 480           # soft budget; partial-return instead of error
  budget_tool_count: 20         # soft tool-count budget
  auto_resume_budget_guard: true
  auto_resume_max_resumes: 5
  resume_claude_sessions: false
  allowed_tools: ["Edit","Read","Bash"]
  disallowed_tools: []
  tools: default                # or "all" / explicit list
  exclude_dynamic_system_prompt_sections: true  # prompt-cache reuse

Or via env vars: HERMES_CLAUDE_CODE_BRIDGE=1, HERMES_CLAUDE_CODE_SYSTEM_PROMPT=..., HERMES_CLAUDE_CODE_CWD=..., HERMES_CLAUDE_CODE_TOOLS=..., HERMES_CLAUDE_CODE_ALLOWED_TOOLS=..., HERMES_CLAUDE_CODE_DISALLOWED_TOOLS=....
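
One way the YAML section and the environment variables could be merged is sketched below. Several points are assumptions not stated in the proposal: that a set env var overrides the YAML value, that tool lists in env vars are comma-separated, and that `HERMES_CLAUDE_CODE_SYSTEM_PROMPT` carries a path like `system_prompt_path`. `BridgeConfig` and `load_bridge_config` are hypothetical names, and only a subset of the config keys is modelled.

```python
import os
from dataclasses import dataclass, field, fields
from typing import List, Optional

TRUTHY = {"1", "true", "yes", "on"}


@dataclass
class BridgeConfig:
    enabled: bool = False
    cwd: Optional[str] = None
    system_prompt_path: Optional[str] = None
    tools: str = "default"
    allowed_tools: List[str] = field(default_factory=list)
    disallowed_tools: List[str] = field(default_factory=list)


def _csv(value):
    """Split a comma-separated env value into a clean list (assumed format)."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_bridge_config(yaml_section, environ=None):
    """Build a BridgeConfig from the YAML mapping, then apply env overrides."""
    env = os.environ if environ is None else environ
    known = {f.name for f in fields(BridgeConfig)}
    # Keys this sketch does not model (timeouts, budgets, ...) are ignored.
    cfg = BridgeConfig(**{k: v for k, v in yaml_section.items() if k in known})

    # Assumption: an env var, when set, wins over the YAML value.
    if "HERMES_CLAUDE_CODE_BRIDGE" in env:
        cfg.enabled = env["HERMES_CLAUDE_CODE_BRIDGE"].strip().lower() in TRUTHY
    if "HERMES_CLAUDE_CODE_CWD" in env:
        cfg.cwd = env["HERMES_CLAUDE_CODE_CWD"]
    if "HERMES_CLAUDE_CODE_SYSTEM_PROMPT" in env:
        # Assumption: the variable carries a path, mirroring system_prompt_path.
        cfg.system_prompt_path = env["HERMES_CLAUDE_CODE_SYSTEM_PROMPT"]
    if "HERMES_CLAUDE_CODE_TOOLS" in env:
        cfg.tools = env["HERMES_CLAUDE_CODE_TOOLS"]
    if "HERMES_CLAUDE_CODE_ALLOWED_TOOLS" in env:
        cfg.allowed_tools = _csv(env["HERMES_CLAUDE_CODE_ALLOWED_TOOLS"])
    if "HERMES_CLAUDE_CODE_DISALLOWED_TOOLS" in env:
        cfg.disallowed_tools = _csv(env["HERMES_CLAUDE_CODE_DISALLOWED_TOOLS"])
    return cfg
```

Passing `environ` explicitly keeps the function easy to test without touching the process environment.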

Scope of proposed PR

New file: gateway/claude_code_bridge.py (~1500 lines after cleanup, ~1700 lines in current local form)
Wire-up: gateway/run.py checks bridge_enabled(config) before falling through to the existing provider path
Tests: tests/gateway/test_claude_code_bridge.py (subprocess mocked via fake claude shim)
Docs: docs/claude-code-bridge.md + a paragraph in main README
No changes to existing provider/auxiliary-LLM abstractions
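
The wire-up item above might look something like this. Only the name `bridge_enabled(config)` comes from the proposal; `handle_turn`, `run_bridge`, and `run_provider` are hypothetical placeholders for whatever gateway/run.py actually calls, and the config is assumed to be a plain dict holding the `claude_code_bridge` section.

```python
import asyncio


def bridge_enabled(config):
    """True when the claude_code_bridge section exists and is switched on."""
    section = config.get("claude_code_bridge") or {}
    return bool(section.get("enabled"))


async def handle_turn(config, message, *, run_bridge, run_provider):
    """Route one gateway turn: bridge first if enabled, else the provider path."""
    if bridge_enabled(config):
        return await run_bridge(message)
    # Existing provider path stays untouched when the bridge is off.
    return await run_provider(message)
```

Injecting the two runners as parameters keeps the existing provider abstraction unchanged and lets tests swap in fakes.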

Questions before I open a PR

Is this in scope for upstream? Or would you prefer it live as an out-of-tree extension / skill?
Module location — gateway/claude_code_bridge.py or somewhere under a more generic gateway/executors/ namespace? (I'm happy to refactor as an Executor ABC if you have plans for additional CLI-based executors like Codex CLI / gemini-cli / aider, etc.)
Naming — keep claude_code_bridge (descriptive but vendor-named) or rename to something generic like subprocess_executor with the Claude Code piece being one concrete impl?
Cross-platform — I've only run this on Windows so far. Happy to do macOS / Linux smoke tests, but if you already have CI matrix patterns I should follow, let me know.
License / CLA — I see MIT in the repo; assume there's no separate CLA to sign?

Happy to open a draft PR right away if there's interest, or split it into smaller pieces (subprocess wrapper → stream-json parser → cache-aware prefix → fail-soft layer → auto-resume) if you'd prefer incremental review.

Thanks!
