hermes - 💡(How to fix) Fix Proposal: Subprocess executor bridge for delegating gateway turns to Claude Code CLI [2 pull requests]

Summary

I'd like to propose contributing a subprocess executor bridge that lets a Hermes gateway profile delegate a turn to a locally-authenticated claude CLI (Claude Code) instead of (or in addition to) a Hermes model provider. I've been running this in production for ~2 months on Windows and would like to upstream it if there's interest.

This is not a new model provider — Hermes already supports any provider via the existing abstraction. It's an alternative executor: spawn claude as a subprocess, stream NDJSON events back, fold the result into the existing gateway/session/platform plumbing.

Why subprocess instead of API

The Claude Code CLI ships with capabilities that aren't reachable through the bare Anthropic API:

  1. Local agentic tools out of the box — file edit, bash, search, etc., all sandboxed by the CLI
  2. Anthropic's prompt-caching surface — already optimized in the CLI, just needs a byte-stable prefix
  3. OAuth / local auth — users who pay for Claude Code's subscription can leverage it without managing API keys in Hermes
  4. Tool/permission policy via the --allowedTools / --disallowedTools flags

For Hermes users who already use claude locally, this lets a gateway profile (e.g., "advisor", "code-reviewer") delegate complex turns to Claude Code without the user copy-pasting between Telegram/Feishu and a terminal.

Use case it solves for me

I run a multi-profile Hermes deployment on Windows where:

  • One profile uses an open model via the standard provider abstraction (fast / cheap)
  • Another profile is configured as "deep code reviewer" and delegates to local claude via this bridge

The bridge gives me Hermes's gateway features (Feishu/Telegram integration, session memory, platform routing) while letting claude handle the heavy lifting on tool-rich turns. It's been stable in production for ~60 days.

Design (current local implementation)

gateway/run.py
        │
        ▼
gateway/claude_code_bridge.py  ← proposed new module
        │ asyncio.create_subprocess_exec
        ▼
claude --output-format stream-json --input-format text ...
        │
        ▼
stream NDJSON events → fold into Hermes gateway response
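
The flow above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the proposed module: `stream_ndjson`, `sanitized_env`, `NDJSON_LINE_LIMIT`, and `CREDENTIAL_VARS` are hypothetical names, and the caller supplies the full argv, so no claude flags beyond those quoted above are assumed. It also applies the 16 MiB line limit and the credential stripping described in the implementation notes.

```python
import asyncio
import contextlib
import json
import os
from typing import AsyncIterator, Mapping, Optional

NDJSON_LINE_LIMIT = 16 * 1024 * 1024  # 16 MiB; asyncio's default is 64 KiB
CREDENTIAL_VARS = ("ANTHROPIC_API_KEY", "CLAUDE_CODE_OAUTH_TOKEN", "ANTHROPIC_TOKEN")


def sanitized_env(base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy the environment without the credential variables listed above."""
    source = os.environ if base is None else base
    return {k: v for k, v in source.items() if k not in CREDENTIAL_VARS}


async def stream_ndjson(argv: list, prompt: str) -> AsyncIterator[dict]:
    """Spawn argv, send the prompt on stdin, yield one dict per NDJSON line."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        env=sanitized_env(),
        limit=NDJSON_LINE_LIMIT,  # raises the StreamReader line limit
    )
    # Long prompts travel over stdin so the argv stays short and static.
    proc.stdin.write(prompt.encode("utf-8"))
    await proc.stdin.drain()
    proc.stdin.close()
    try:
        async for raw in proc.stdout:
            line = raw.strip()
            if line:
                yield json.loads(line)
    except BaseException:
        # Consumer stopped early or a line failed to parse: stop the child.
        with contextlib.suppress(ProcessLookupError):
            proc.kill()
        raise
    finally:
        await proc.wait()
```

A caller would iterate with `async for event in stream_ndjson(argv, prompt)` and fold each event into the gateway response as it arrives. The sketch writes the whole prompt before reading, which assumes the child consumes stdin before producing large output.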

Key implementation details (battle-tested over ~60 days):

  • asyncio.create_subprocess_exec with a stdin pipe; CLI argv stays short/static to avoid Windows .cmd shim metacharacter parsing (long prompts go via stdin)
  • StreamReader(limit=16 MiB) — Claude can emit single NDJSON lines that exceed asyncio's default 64 KiB when quoting long files / SQL output
  • Prompt-cache-aware prefix ordering — stable role/system prompt first, per-turn user message last; SHA-256 hex of the cache-stable prefix logged next to cache_read / cache_creation counters so cache drift is visible
  • Triple fail-soft for hung Anthropic streams:
    1. first_event_timeout (default 60s) — fail-fast if subprocess emits nothing
    2. inter_event_silence_timeout — fail-fast on long inter-event gaps
    3. budget_seconds / budget_tool_count — soft cap that returns partial response instead of error
  • auto_resume_budget_guard — when a turn exhausts its tool budget naturally (not by error), auto-spawn a continuation segment using claude --resume <session> up to N times
  • Env hygiene — strips ANTHROPIC_API_KEY / CLAUDE_CODE_OAUTH_TOKEN / ANTHROPIC_TOKEN from subprocess env so the bridge can't accidentally exfiltrate user credentials
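
The three fail-soft layers could look roughly like this. It is a sketch under assumptions, not the bridge's actual code: `collect_with_guards` and `BridgeStalled` are hypothetical names, the `"tool_use"` event type is an assumed event shape, and the 120 s inter-event default is a placeholder (the 60 s, 480 s, and 20-tool values come from the text and config sketch). The two silence timeouts raise, while the budgets return whatever has arrived so far.

```python
import asyncio
import time
from typing import AsyncIterator, List, Tuple


class BridgeStalled(Exception):
    """Raised when the stream goes silent for longer than a hard timeout."""


async def collect_with_guards(
    events: AsyncIterator[dict],
    *,
    first_event_timeout: float = 60.0,
    inter_event_silence_timeout: float = 120.0,  # assumed default
    budget_seconds: float = 480.0,
    budget_tool_count: int = 20,
) -> Tuple[List[dict], str]:
    """Return (events, reason); reason is 'completed' or the budget that ran out."""
    collected: List[dict] = []
    tool_calls = 0
    started = time.monotonic()
    iterator = events.__aiter__()
    while True:
        # Soft caps, checked between events: return the partial result.
        if time.monotonic() - started >= budget_seconds:
            return collected, "budget_seconds"
        if tool_calls >= budget_tool_count:
            return collected, "budget_tool_count"
        first = not collected
        timeout = first_event_timeout if first else inter_event_silence_timeout
        try:
            event = await asyncio.wait_for(iterator.__anext__(), timeout)
        except StopAsyncIteration:
            return collected, "completed"
        except asyncio.TimeoutError:
            # Hard failure: nothing arrived within the silence window.
            name = "first_event_timeout" if first else "inter_event_silence_timeout"
            raise BridgeStalled(name) from None
        collected.append(event)
        if event.get("type") == "tool_use":  # assumed event shape
            tool_calls += 1
```

Because the budgets are only checked between events, a single slow event can overshoot `budget_seconds` by up to the silence timeout. A caller that sees `"budget_tool_count"` as the reason is the natural place to decide whether to start a continuation segment.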

Config sketch (cli-config.yaml)

claude_code_bridge:
  enabled: true
  model: claude-opus-4-7        # optional, overrides CLI default
  cwd: /path/to/working/dir
  system_prompt_path: /path/to/extra-prompt.md  # injected after CLI's own system prompt
  timeout: 600                  # outer hard cap (seconds)
  first_event_timeout: 60       # NDJSON silence before first event
  budget_seconds: 480           # soft budget; partial-return instead of error
  budget_tool_count: 20         # soft tool-count budget
  auto_resume_budget_guard: true
  auto_resume_max_resumes: 5
  resume_claude_sessions: false
  allowed_tools: ["Edit","Read","Bash"]
  disallowed_tools: []
  tools: default                # or "all" / explicit list
  exclude_dynamic_system_prompt_sections: true  # prompt-cache reuse

Or via env vars: HERMES_CLAUDE_CODE_BRIDGE=1, HERMES_CLAUDE_CODE_SYSTEM_PROMPT=..., HERMES_CLAUDE_CODE_CWD=..., HERMES_CLAUDE_CODE_TOOLS=..., HERMES_CLAUDE_CODE_ALLOWED_TOOLS=..., HERMES_CLAUDE_CODE_DISALLOWED_TOOLS=....
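
One way the YAML block and the env vars might be merged is sketched below. The precedence (env over file), the comma-separated list format, and the helper name `resolve_bridge_config` are assumptions for illustration; only the variable names and `bridge_enabled(config)` come from this proposal. `HERMES_CLAUDE_CODE_SYSTEM_PROMPT` is left out because the text does not say whether it holds a path or the prompt itself.

```python
import os
from typing import Any, Mapping, Optional

# Assumed mapping from env var to config key for list-valued settings.
LIST_VARS = {
    "HERMES_CLAUDE_CODE_ALLOWED_TOOLS": "allowed_tools",
    "HERMES_CLAUDE_CODE_DISALLOWED_TOOLS": "disallowed_tools",
}


def resolve_bridge_config(
    config: Mapping[str, Any], env: Optional[Mapping[str, str]] = None
) -> dict:
    """Merge the claude_code_bridge section with env overrides (env wins)."""
    env = os.environ if env is None else env
    merged = dict(config.get("claude_code_bridge") or {})
    if env.get("HERMES_CLAUDE_CODE_BRIDGE") == "1":
        merged["enabled"] = True
    if "HERMES_CLAUDE_CODE_CWD" in env:
        merged["cwd"] = env["HERMES_CLAUDE_CODE_CWD"]
    if "HERMES_CLAUDE_CODE_TOOLS" in env:
        merged["tools"] = env["HERMES_CLAUDE_CODE_TOOLS"]  # kept as a raw string
    for var, key in LIST_VARS.items():
        if var in env:
            # Assumed format: comma-separated tool names.
            merged[key] = [t.strip() for t in env[var].split(",") if t.strip()]
    return merged


def bridge_enabled(config: Mapping[str, Any], env: Optional[Mapping[str, str]] = None) -> bool:
    """True when the bridge is switched on by the file or by the env var."""
    return bool(resolve_bridge_config(config, env).get("enabled", False))
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.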

Scope of proposed PR

  • New file: gateway/claude_code_bridge.py (~1500 lines after cleanup, ~1700 lines in current local form)
  • Wire-up: gateway/run.py checks bridge_enabled(config) before falling through to the existing provider path
  • Tests: tests/gateway/test_claude_code_bridge.py (subprocess mocked via fake claude shim)
  • Docs: docs/claude-code-bridge.md + a paragraph in main README
  • No changes to existing provider/auxiliary-LLM abstractions
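
For the test item above, a fake claude shim could be as small as a script that swallows stdin and replays canned NDJSON. The sketch below is an assumption about how such a shim might be built, not the contents of the proposed test file; `write_fake_claude`, `replay`, and the event fields are hypothetical.

```python
import json
import subprocess
import sys
from pathlib import Path

SHIM_TEMPLATE = """\
import json, sys
sys.stdin.read()  # swallow the prompt, as a real CLI reading stdin would
for event in {events!r}:
    print(json.dumps(event), flush=True)
"""


def write_fake_claude(directory: Path, events: list) -> list:
    """Write a script that replays canned events; return an argv that runs it."""
    script = directory / "fake_claude.py"
    script.write_text(SHIM_TEMPLATE.format(events=events), encoding="utf-8")
    return [sys.executable, str(script)]


def replay(argv: list, prompt: str) -> list:
    """Run the shim the way a test would and parse its NDJSON output."""
    done = subprocess.run(
        argv, input=prompt, capture_output=True, text=True, check=True
    )
    return [json.loads(line) for line in done.stdout.splitlines() if line.strip()]
```

In a pytest suite the `directory` argument would typically be the `tmp_path` fixture, and the returned argv would be handed to the bridge in place of the real `claude` executable. Running the shim through `sys.executable` avoids relying on shebang lines or .cmd wrappers, which keeps the test portable across Windows, macOS, and Linux.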

Questions before I open a PR

  1. Is this in scope for upstream? Or would you prefer it live as an out-of-tree extension / skill?
  2. Module location: gateway/claude_code_bridge.py, or somewhere under a more generic gateway/executors/ namespace? (I'm happy to refactor as an Executor ABC if you have plans for additional CLI-based executors like Codex CLI / gemini-cli / aider, etc.)
  3. Naming — keep claude_code_bridge (descriptive but vendor-named) or rename to something generic like subprocess_executor with the Claude Code piece being one concrete impl?
  4. Cross-platform — I've only run this on Windows so far. Happy to do macOS / Linux smoke tests, but if you already have CI matrix patterns I should follow, let me know.
  5. License / CLA — I see MIT in the repo; assume there's no separate CLA to sign?
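
To make questions 2 and 3 concrete, an Executor ABC might look something like the sketch below. Nothing here exists in Hermes today: `Executor`, `run_turn`, `EchoExecutor`, and `collect` are hypothetical names, and the event dicts are placeholders.

```python
import abc
from typing import AsyncIterator, List


class Executor(abc.ABC):
    """Hypothetical interface for anything that can run one gateway turn."""

    name: str = "abstract"

    @abc.abstractmethod
    def run_turn(self, prompt: str) -> AsyncIterator[dict]:
        """Yield event dicts for a single turn."""


class EchoExecutor(Executor):
    """Trivial stand-in; a Claude Code executor would spawn the CLI instead."""

    name = "echo"

    async def run_turn(self, prompt: str) -> AsyncIterator[dict]:
        yield {"type": "text", "text": prompt}
        yield {"type": "result"}


async def collect(executor: Executor, prompt: str) -> List[dict]:
    """Drain one turn into a list, independent of the concrete executor."""
    return [event async for event in executor.run_turn(prompt)]
```

With a shape like this, the Claude Code bridge would be one concrete subclass, and other CLI-based executors could be added without touching the gateway wiring.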

Happy to open a draft PR right away if there's interest, or split it into smaller pieces (subprocess wrapper → stream-json parser → cache-aware prefix → fail-soft layer → auto-resume) if you'd prefer incremental review.

Thanks!
