hermes - ✅(Solved) Fix api_server: no delivery mechanism for generated images (image_generate output unreachable from HTTP clients / Open WebUI) [1 pull requests, 1 participants]

sunxyless · 2026-04-24T07:11:03Z



NousResearch/hermes-agent#14959•Fetched 2026-04-24 10:44:00


Root Cause

Open WebUI is the most common api_server client. It only proxies /v1/chat/completions via its backend; image URLs in streamed markdown are fetched directly from the user's browser. That means whatever form the tool result takes in the stream is what ends up in the <img> tag:

- absolute filesystem path → browser 404
- localhost URL on the gateway port → only works when browser and gateway are on the same host
- publicly resolvable URL → works universally (when the deployment exposes one)
- data URI → always works, at the cost of large payloads that ride along on every turn (Open WebUI resends the full message history client-side)

Today none of the working outcomes is reachable without user-side hacks.
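
The four cases above can be told apart mechanically. The following is a minimal sketch, not code from the PR: `classify_image_ref` and its return labels are hypothetical names, and a non-loopback URL is only assumed to be reachable, since that cannot be verified from the string alone.

```python
from urllib.parse import urlparse

# Hosts that only resolve to the machine the browser itself runs on.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def classify_image_ref(ref: str) -> str:
    """Classify the target of a markdown image link (hypothetical helper)."""
    if ref.startswith("data:"):
        return "data_uri"
    parsed = urlparse(ref)
    if parsed.scheme in ("http", "https"):
        host = parsed.hostname or ""
        # Reachability of a non-loopback host is an assumption, not a check.
        return "localhost_url" if host in LOOPBACK_HOSTS else "non_loopback_url"
    # Anything else (for example an absolute path) is not fetchable by a browser.
    return "filesystem_path"
```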

Fix Action

Fix / Workaround

I've applied a working patch locally and tested it end-to-end against openai-codex / gpt-image-2-medium:

Before the patch: assistant content is ![apple](/Users/…/openai_codex_….png) (broken).
After the patch: assistant content is ![apple](http://127.0.0.1:8642/images/openai_codex_….png); browser GET returns HTTP 200, image/png, 1024×1024, byte-exact with source file.

Diff is 71 additions across tools/image_generation_tool.py (+54) and gateway/platforms/api_server.py (+17). I'm keeping it external as a deploy-time patch until there's a shape that matches maintainer preference, and I'm happy to open a PR if the proposed shape is acceptable.

PR fix notes

PR #14964: feat(api-server): serve generated images over HTTP for Open WebUI

Repository: NousResearch/hermes-agent
Author: sunxyless
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/14964

Description (problem / solution / changelog)

What does this PR do?

Adds HTTP delivery for images produced by the image_generate tool on the api_server platform, so Open WebUI (and any other HTTP client) can actually render generated images inline.

Today, the tool returns the absolute filesystem path of the saved PNG and the model faithfully renders ![desc](/Users/…/cache/images/foo.png) per its own schema. Platforms like Telegram and Discord turn that path into an upload; api_server does neither — the path hits the browser and 404s. Full problem statement + SSE evidence: #14959.

Two coupled, opt-in changes — the rewrite only activates when the user sets IMAGE_SERVE_BASE_URL (or image_gen.serve_base_url in config), so CLI / Telegram / Discord / feishu paths are byte-identical when unconfigured.

Related Issue

Fixes #14959

Type of Change

- [ ] 🐛 Bug fix (non-breaking change that fixes an issue)
- [x] ✨ New feature (non-breaking change that adds functionality)
- [ ] 🔒 Security fix
- [ ] 📝 Documentation update
- [x] ✅ Tests (adding or improving test coverage)
- [ ] ♻️ Refactor (no behavior change)
- [ ] 🎯 New skill (bundled or hub)

Changes Made

tools/image_generation_tool.py (+58)
- New _resolve_image_serve_base_url() — reads IMAGE_SERVE_BASE_URL env var, then falls back to image_gen.serve_base_url in config.yaml. Returns None when unset so callers are no-ops.
- New _maybe_rewrite_image_to_url(result_json) — post-processes the provider's JSON result. If it's a success with a local filesystem path in image (i.e. not an existing URL or data: URI), rewrites image to {base}/images/{basename}. Leaves every other field untouched.
- _handle_image_generate now funnels both dispatch paths (plugin provider and in-tree FAL) through the post-processor.
gateway/platforms/api_server.py (+17)
- During _setup_app, register an aiohttp static route /images/* serving IMAGE_CACHE_DIR (already the canonical cache path in gateway/platforms/base.py).
- Auth-free by design: <img> tags don't send Authorization headers. Security relies on the random hex suffix in save_b64_image filenames. Failure to register logs a warning and keeps the server up — not fatal.
tests/tools/test_image_generation_url_rewrite.py (+154, new)
- 18 tests covering both helpers: env-vs-config precedence, trailing-slash/whitespace handling, pass-through for failures, already-URL / data-URI / empty / missing image fields, malformed JSON, basename() leak-proofing.
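
As an illustration of the post-processing step described above, here is a minimal sketch of a path-to-URL rewrite. It is not the PR's code: the function name, the explicit `base_url` parameter, and the `success` field name are assumptions made for the example. Only the `image` field and the `{base}/images/{basename}` shape come from the description.

```python
import json
import os
from typing import Optional


def rewrite_image_to_url(result_json: str, base_url: Optional[str]) -> str:
    """Rewrite a local path in `image` to {base}/images/{basename} (sketch)."""
    if not base_url:
        return result_json  # feature unconfigured: no-op
    try:
        result = json.loads(result_json)
    except (TypeError, ValueError):
        return result_json  # malformed JSON passes through untouched
    # "success" is an assumed field name for the provider's status flag.
    if not isinstance(result, dict) or not result.get("success"):
        return result_json
    image = result.get("image")
    if not isinstance(image, str) or not image:
        return result_json
    if image.startswith(("http://", "https://", "data:")):
        return result_json  # already deliverable as-is
    name = os.path.basename(image)  # never expose directory components
    if not name:
        return result_json
    result["image"] = f"{base_url.rstrip('/')}/images/{name}"
    return json.dumps(result)
```

Using only the basename means the serving host never reveals where the cache directory lives, which is the property the "leak-proofing" tests are described as covering.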

How to Test

# 1. Enable a plugin image-gen provider (openai-codex works well — no API key needed)
hermes -p <profile> config set image_gen.provider openai-codex

# 2. Configure the public base for generated images
echo 'IMAGE_SERVE_BASE_URL=http://127.0.0.1:8642' >> ~/.hermes/profiles/<profile>/.env

# 3. Enable api_server + restart
# (.env should already have API_SERVER_ENABLED=true and API_SERVER_KEY=<key>)
hermes -p <profile> gateway restart

# 4. Trigger an image over /v1/chat/completions and watch the SSE stream
curl -sN -X POST http://127.0.0.1:8642/v1/chat/completions \
  -H "Authorization: Bearer <key>" -H 'Content-Type: application/json' \
  -d '{"model":"<profile>","stream":true,"messages":[{"role":"user","content":"Use image_generate to draw a red apple. Render inline."}]}'

Before this change the delta.content would include ![apple](/abs/path/cache/images/foo.png). With this change it includes ![apple](http://127.0.0.1:8642/images/foo.png), and a browser GET on that URL returns HTTP 200, image/png with the exact cache file bytes.
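
The before/after check can also be scripted instead of read by eye. The sketch below assumes the stream uses OpenAI-style SSE chunks (`data: {...}` lines carrying `choices[].delta.content`, terminated by `data: [DONE]`); both helper names are hypothetical.

```python
import json
import re
from typing import Iterable, List

# Captures the target of a markdown image: ![alt](target)
MD_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)\)")


def collect_content(sse_lines: Iterable[str]) -> str:
    """Concatenate delta.content across SSE lines (assumed OpenAI-style chunks)."""
    parts = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":
            continue
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


def unfetchable_image_targets(content: str) -> List[str]:
    """Return image targets a browser cannot fetch (not http(s) or data:)."""
    return [
        target
        for target in MD_IMAGE.findall(content)
        if not target.startswith(("http://", "https://", "data:"))
    ]
```

Feeding it the lines captured from the `curl -sN` call should yield an empty list once the rewrite is active, and the absolute path otherwise.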

Verified end-to-end on two deployments:

Mac + Open WebUI in Docker with IMAGE_SERVE_BASE_URL=http://127.0.0.1:8642
VPS + Caddy reverse proxy with IMAGE_SERVE_BASE_URL=https://chat.example.com, Caddy routing /images/* through to the gateway

Tests: pytest tests/tools/test_image_generation_url_rewrite.py -q → 18 passed. Broader sweep pytest tests/tools/test_image_generation*.py tests/plugins/image_gen/ -q → 121 passed (no regressions).

Checklist

Code

I've read the Contributing Guide
My commit messages follow Conventional Commits (feat(api-server): …)
I searched for existing PRs to make sure this isn't a duplicate
My PR contains only changes related to this feature (no unrelated commits)
I've run pytest tests/tools/test_image_generation*.py tests/plugins/image_gen/ -q and all 121 tests pass
I've added tests for my changes (18 new tests)
I've tested on my platform: macOS 15 + Ubuntu 24.04 (VPS)

Documentation & Housekeeping

I've updated relevant documentation (README, docs/, docstrings) — can update website/docs/user-guide/messaging/open-webui.md with a one-line note about the env var if maintainers prefer
I've updated cli-config.yaml.example if I added/changed config keys — N/A (feature is gated entirely on env var or config.yaml addition; the config key is optional)
I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — N/A
I've considered cross-platform impact (Windows, macOS) per the compatibility guide — os.path.basename and aiohttp.add_static are both platform-agnostic; IMAGE_CACHE_DIR already uses get_hermes_dir
I've updated tool descriptions/schemas if I changed tool behavior: the existing schema ("Returns either a URL or an absolute file path in the image field") already covers the URL case; no change needed

Open questions for maintainers (also listed in #14959)

Auth on /images/*. I went with public + random-hex filename. Happy to switch to signed URLs if you prefer — that's a self._sign_image_url(path) helper and a query-string check in the static handler.
Route naming. /images/ vs OpenAI-style /v1/images/{filename}? I picked the shorter path because it's not an OpenAI-spec endpoint and doesn't belong under /v1/.
Data-URI fallback. Some deployments can't easily expose a URL (private VPS without reverse proxy). Interested if you'd accept a complementary IMAGE_SERVE_MODE=data_uri option as a follow-up.

Happy to iterate on any of these.
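
For the signed-URL alternative raised in the first open question, a rough sketch of what an HMAC token in the query string could look like follows. Everything here is illustrative: the function names, the `exp` and `sig` parameter names, and the expiry scheme are assumptions, not part of the PR.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlparse


def _signature(secret: bytes, filename: str, expires: int) -> str:
    # Sign both the filename and the expiry so neither can be swapped out.
    message = f"{filename}:{expires}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_image_url(base_url: str, filename: str, secret: bytes,
                   ttl_seconds: int = 3600, now: Optional[float] = None) -> str:
    """Build {base}/images/{filename}?exp=...&sig=... (hypothetical scheme)."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    sig = _signature(secret, filename, expires)
    return f"{base_url.rstrip('/')}/images/{filename}?exp={expires}&sig={sig}"


def verify_image_url(url: str, secret: bytes, now: Optional[float] = None) -> bool:
    """Check the query-string signature and expiry of a signed image URL."""
    parsed = urlparse(url)
    filename = parsed.path.rsplit("/", 1)[-1]
    query = parse_qs(parsed.query)
    try:
        expires = int(query["exp"][0])
        sig = query["sig"][0]
    except (KeyError, IndexError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    expected = _signature(secret, filename, expires)
    # Constant-time comparison on bytes avoids timing leaks.
    return hmac.compare_digest(expected.encode(), sig.encode())
```

One design consideration: an expiring signature would break images in older chat messages once they are re-rendered, so a real implementation would need to pick the lifetime with that in mind.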

Changed files

gateway/platforms/api_server.py (modified, +17/-0)
tests/tools/test_image_generation_url_rewrite.py (added, +154/-0)
tools/image_generation_tool.py (modified, +54/-4)

Original issue

Problem

The api_server gateway platform has no mechanism to deliver images produced by the image_generate tool to HTTP clients. This breaks image generation end-to-end when Open WebUI (or any other HTTP-level client) is the frontend.

The image_generate tool's schema instructs the model:

Returns either a URL or an absolute file path in the image field; display it with markdown ![description](url-or-path) and the gateway will deliver it.

The "the gateway will deliver it" part is honored by telegram.py, discord.py, feishu.py, etc. via their send_photo / upload flows. The api_server platform has no equivalent — neither a static route for IMAGE_CACHE_DIR nor a post-process step that rewrites local paths into URLs or data URIs.

Result: the streamed assistant content arrives at the client as raw markdown pointing at an absolute filesystem path, which the browser cannot fetch.

Repro (v0.11.0, fresh profile)

# 1. Enable the Codex image-gen backend (or any plugin provider)
hermes -p <profile> config set image_gen.provider openai-codex

# 2. Enable api_server and restart gateway
# (profile .env: API_SERVER_ENABLED=true, API_SERVER_KEY=<some-key>)
hermes -p <profile> gateway restart

# 3. Ask for an image over /v1/chat/completions
curl -sN -X POST http://127.0.0.1:8642/v1/chat/completions \
  -H "Authorization: Bearer <key>" -H 'Content-Type: application/json' \
  -d '{"model":"<profile>","stream":true,"messages":[{"role":"user","content":"Use image_generate to draw a red apple. Retry if it fails. Render inline."}]}'

Observed streamed delta.content:

Here you go:

![Simple red apple on white background](/Users/seanlee/.hermes/profiles/work-rm/cache/images/openai_codex_gpt-image-2-medium_20260424_143801_f2c2ad0c.png)

The client renders <img src="/Users/.../openai_codex_...png"> → 404.

Grep evidence that no delivery path exists

# No /images/ or cache/images route registered
$ rg 'add_static|add_get.*image|cache/images' gateway/platforms/api_server.py
(nothing)

# No markdown/path post-processing on the outbound stream
$ rg '_transform_images|path.*data:|path_to_data_url|b64encode.*png' gateway/platforms/api_server.py
(nothing)

# Static image route and delivery logic exist in other platforms
$ rg 'send_photo|send_animation' gateway/platforms/{telegram,discord,feishu}.py
…several hits in each

gateway/platforms/base.py already exposes IMAGE_CACHE_DIR and ensure_image_cache_dir(); api_server just never consumes it.

Why this matters

- absolute filesystem path → browser 404
- localhost URL on the gateway port → only works when browser and gateway are on the same host
- publicly resolvable URL → works universally (when the deployment exposes one)
- data URI → always works, at the cost of large payloads that ride along on every turn (Open WebUI resends the full message history client-side)

Today none of the working outcomes is reachable without user-side hacks.

Proposed shape (for discussion — not a PR yet)

Two coupled pieces, both opt-in and zero-risk when unconfigured:

1. Static route for generated images

api_server._setup_app adds an aiohttp static route:

from gateway.platforms.base import IMAGE_CACHE_DIR
IMAGE_CACHE_DIR.mkdir(parents=True, exist_ok=True)
self._app.router.add_static("/images/", str(IMAGE_CACHE_DIR), show_index=False)

2. Optional URL rewrite in the `image_generate` tool

New env var IMAGE_SERVE_BASE_URL (or image_gen.serve_base_url in config.yaml). When set, post-process the tool result to rewrite a local filesystem image field to {base}/images/{basename}. When unset, behavior is identical to today (CLI / Telegram / Discord paths unchanged).
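
The env-then-config lookup could be sketched as below. This is an assumption-laden illustration: `config` stands in for an already-parsed `config.yaml` mapping, the function name is hypothetical, and treating a whitespace-only env var as unset is a choice made for the example.

```python
import os
from typing import Mapping, Optional


def resolve_image_serve_base_url(
    config: Mapping, environ: Mapping[str, str] = os.environ
) -> Optional[str]:
    """Return the base URL for served images, or None when unconfigured."""
    raw = environ.get("IMAGE_SERVE_BASE_URL", "")
    if not raw.strip():
        # Fall back to image_gen.serve_base_url from the parsed config.
        section = config.get("image_gen") or {}
        raw = section.get("serve_base_url") or ""
    # Normalise so callers can always append "/images/<name>".
    base = str(raw).strip().rstrip("/")
    return base or None
```

Returning `None` when nothing is set is what keeps the rewrite a no-op for CLI, Telegram, and Discord paths.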

Open questions for maintainers

Auth on /images/*. <img> tags don't send Authorization headers. Acceptable options:
- Public, rely on the existing random-hex filename suffix from save_b64_image (my current choice)
- Signed URLs (HMAC token in query string)
- First-party cookie from an authenticated session (more machinery)
Static route path. /images/ vs /v1/files/<id> vs OpenAI-style /v1/images/{filename} — mainly a surface-naming preference.
Data-URI fallback. Would maintainers prefer a built-in data-URI mode for deployments that can't easily expose a URL? Trade-off is ~200KB–2MB rider per turn in Open WebUI history.
Scope. Is this api_server-specific, or should the rewrite live at the tool level and be consumed by any platform that adds a matching static route?
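
To put numbers on the data-URI trade-off: base64 encodes every 3 bytes as 4 characters, so a data URI is roughly one third larger than the PNG it embeds. The sketch below shows the conversion and the size arithmetic; both helper names are hypothetical and no such mode exists in the patch.

```python
import base64


def to_data_uri(image_bytes: bytes, mime: str = "image/png") -> str:
    """Embed raw image bytes in a data: URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def data_uri_length(raw_size: int, mime: str = "image/png") -> int:
    """Length in characters of the data URI for a file of raw_size bytes."""
    prefix = len(f"data:{mime};base64,")
    # Padded base64 emits 4 characters per started group of 3 bytes.
    return prefix + 4 * ((raw_size + 2) // 3)
```

Because the client resends the full history, this length is paid again on every later turn for each embedded image.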

Reference implementation

I've applied a working patch locally and tested it end-to-end against openai-codex / gpt-image-2-medium:

Before the patch: assistant content is ![apple](/Users/…/openai_codex_….png) (broken).
After the patch: assistant content is ![apple](http://127.0.0.1:8642/images/openai_codex_….png); browser GET returns HTTP 200, image/png, 1024×1024, byte-exact with source file.

Environment

hermes-agent v0.11.0 (8f5fee3e / release 2026.4.23)
Profile: Codex OAuth (ChatGPT subscription), openai-codex image-gen provider, gpt-image-2-medium
Frontend: Open WebUI 0.9.1 via Chat Completions connection

extent analysis

TL;DR

To fix the issue, add a static route for generated images in the api_server platform and optionally rewrite local filesystem paths to URLs in the image_generate tool.

Guidance

Add a static route for generated images in api_server using aiohttp to serve files from IMAGE_CACHE_DIR.
Introduce an optional IMAGE_SERVE_BASE_URL environment variable to rewrite local filesystem paths to URLs in the image_generate tool.
Consider authentication options for the static route, such as public access, signed URLs, or first-party cookies.
Decide on a static route path, such as /images/ or /v1/files/<id>.
Weigh the trade-offs of a data-URI fallback for deployments that can't expose a URL.

Example

from gateway.platforms.base import IMAGE_CACHE_DIR
IMAGE_CACHE_DIR.mkdir(parents=True, exist_ok=True)
self._app.router.add_static("/images/", str(IMAGE_CACHE_DIR), show_index=False)

Notes

The proposed solution requires discussion and agreement on the implementation details, such as authentication and static route path. The reference implementation provides a working patch, but it's essential to ensure it aligns with the maintainers' preferences.

Recommendation

Apply the proposed workaround by adding a static route for generated images and optionally rewriting local filesystem paths to URLs. This approach allows for a flexible and opt-in solution that can be refined based on further discussion and testing.
