hermes - 💡(How to fix) Fix [Bug]: openai-codex main chat works, but auxiliary vision path returns Cloudflare challenge HTML

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

This looks less like a generic image-processing bug and more like a failure affecting the entire auxiliary vision request family built around:

  • openai-codex
  • task="vision"
  • CodexAuxiliaryClient
  • responses.stream()

It is clearly related to #13834, but this issue is a narrower repro where main chat works while only the auxiliary vision path fails.

Error Message

Error analyzing image: <html> ... Enable JavaScript and cookies to continue The important point is that this is not a normal image-parsing failure or a simple “model does not support vision” error. It appears to be the Codex auxiliary vision path returning a challenge page.

Root Cause

This suggests the problem is not only caused by multimodal image payloads. It looks like a problem with the Codex auxiliary path used by task="vision" itself.

Fix Action

Fix / Workaround

3) Minimal auxiliary Codex client-construction patch

To test that, I applied a minimal local patch to agent/auxiliary_client.py::_build_codex_client() so the auxiliary Codex client used a keepalive/proxy-aware httpx.Client more similar to the main chat path.

Code Example

Error analyzing image: <html> ... Enable JavaScript and cookies to continue

---

auxiliary:
    vision:
      provider: openai-codex
      model: gpt-5.4
      base_url: https://chatgpt.com/backend-api/codex

---

vision_analyze(
    image_url="~/.hermes/image_cache/img_d8a8a68b1757.jpg",
    question="Read the English passage in this image and transcribe the visible sentences as accurately as possible."
)

---

['openai-codex']
RAW_BUFFERClick to expand / collapse

Bug Description

When Hermes is configured with openai-codex as the main provider, main text chat works, but the auxiliary vision path used by vision_analyze fails by returning a Cloudflare challenge HTML page instead of a model response.

From the user's perspective, it typically looks like this:

Error analyzing image: <html> ... Enable JavaScript and cookies to continue

The important point is that this is not a normal image-parsing failure or a simple “model does not support vision” error. It appears to be the Codex auxiliary vision path returning a challenge page.

Confirmed Facts

  • Main provider: openai-codex
  • Auxiliary vision config:
    auxiliary:
      vision:
        provider: openai-codex
        model: gpt-5.4
        base_url: https://chatgpt.com/backend-api/codex
  • Codex token presence was verified: token_present True
  • Main text chat works
  • vision_analyze fails
  • ~/.hermes/logs/errors.log contains the literal challenge HTML
    • Enable JavaScript and cookies to continue

Relationship to Existing Issues

  • #13834 is a related umbrella issue.
    • That issue describes cases where official Codex CLI works on the same machine/network, but Hermes openai-codex hits Cloudflare / connection problems.
    • This issue is narrower: main chat works, but only the auxiliary vision path fails.
  • #14085 is vision-related but describes a different bug.
    • That issue is about _client_cache not including model in the cache key, causing the wrong model to be reused.
    • This issue does not look like a wrong-model / 404 / 400 case. It returns challenge HTML instead.

Reproduction

  1. Configure Hermes to use openai-codex
  2. Call vision_analyze
  3. Receive challenge HTML instead of a model response

Example local image call used during testing:

vision_analyze(
    image_url="~/.hermes/image_cache/img_d8a8a68b1757.jpg",
    question="Read the English passage in this image and transcribe the visible sentences as accurately as possible."
)

Narrowing Experiments Performed

1) Compare text-only vs image+text on the same task="vision" path

I called the same auxiliary vision path directly in two forms:

  • case A: task="vision" + text-only
  • case B: task="vision" + image + text

Both failed in the same way:

  • PermissionDeniedError
  • response body was Cloudflare challenge HTML
  • contained Enable JavaScript and cookies to continue

This suggests the problem is not only caused by multimodal image payloads. It looks like a problem with the Codex auxiliary path used by task="vision" itself.

2) Non-Codex vision backend branching

On this machine, get_available_vision_backends() returned only:

['openai-codex']

So there was no fully available alternative vision backend for a clean A/B comparison.

I also forced provider='copilot' for one text-only test. That path failed differently:

  • normal JSON 400
  • model_not_supported

That is notable because it did not return Cloudflare challenge HTML. So the current Codex failure does not look like a generic “all providers reject this model” situation.

3) Minimal auxiliary Codex client-construction patch

I suspected a transport/client-construction difference between the main chat path and the auxiliary Codex path.

To test that, I applied a minimal local patch to agent/auxiliary_client.py::_build_codex_client() so the auxiliary Codex client used a keepalive/proxy-aware httpx.Client more similar to the main chat path.

Result:

  • text-only: same challenge HTML
  • image+text: same challenge HTML

So this does not appear to be explained solely by the auxiliary path using a bare OpenAI(...) client.

Relevant Code Path

The current vision_analyze flow is roughly:

  • tools/vision_tools.py
  • async_call_llm(task="vision")
  • resolve_vision_provider_client(...)
  • resolve_provider_client("openai-codex", is_vision=True)
  • _build_codex_client(...)
  • CodexAuxiliaryClient
  • internally responses.stream(...)

A key detail is that the main chat path and the auxiliary vision path do not necessarily use the exact same client/transport construction path.

Expected Behavior

At least one of these should happen:

  1. openai-codex auxiliary vision should work similarly to main chat
  2. if Cloudflare challenge HTML is returned, Hermes should surface it as a clearer provider-level failure instead of a generic image-analysis failure
  3. if Codex auxiliary vision is currently unreliable by design, Hermes should detect that explicitly and guide the user toward fallback/provider switch behavior

Actual Behavior

  • main chat: works
  • vision_analyze: returns Cloudflare challenge HTML
  • even text-only calls through the same task="vision" path fail the same way

Environment

  • Hermes: v0.13.0 (2026.5.7)
  • OS: Linux
  • provider: openai-codex
  • vision model: gpt-5.4
  • endpoint: https://chatgpt.com/backend-api/codex

Summary

This looks less like a generic image-processing bug and more like a failure affecting the entire auxiliary vision request family built around:

  • openai-codex
  • task="vision"
  • CodexAuxiliaryClient
  • responses.stream()

It is clearly related to #13834, but this issue is a narrower repro where main chat works while only the auxiliary vision path fails.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING