[Bug]: Vision fallback (auxiliary.vision) ignores SOUL.md system prompt; session context lost during image analysis (hermes, 1 linked pull request)


Bug Description

When using a non-vision LLM (e.g., DeepSeek V4 Flash) as the main model, user-attached images are auto-processed via gateway/run.py:_enrich_message_with_vision() → tools/vision_tools.py:vision_analyze_tool() → agent/auxiliary_client.py (Codex Responses API or other providers). The outgoing vision API call never includes the current session's system prompt (SOUL.md / persona). For the openai-codex path specifically, instructions is hardcoded to "You are a helpful assistant.".

This means any persona customization — "you are a graphic designer", "you are a security researcher", etc. — is completely lost during the image description step. The vision model describes the image in generic terms, and the main model receives a description that doesn't reflect the intended analytical lens.

Steps to Reproduce

  1. Set up Hermes with a non-vision main model (e.g., deepseek-v4-flash)
  2. Configure auxiliary.vision with any provider (e.g., openai-codex, model gpt-5.5)
  3. Set a persona in SOUL.md, e.g., "You are an expert UI/UX designer. Focus on colors, spacing, typography."
  4. Send an image in chat (Discord, Telegram, or CLI with --image)
  5. Observe the auto-generated description — it's generic; no persona-specific lens applied
  6. The main model responds based on this generic description

Expected Behavior

The auxiliary vision call should carry the current session's system prompt / SOUL.md persona, so the image description respects the user's configured context. For the Codex path, instructions should reflect the session's system prompt rather than the hardcoded default.

Actual Behavior

The messages array built by vision_analyze_tool() contains only role: "user" — no role: "system" message is ever included. For the Codex path, the adapter defaults to instructions = "You are a helpful assistant.". For other providers (OpenRouter, Anthropic, etc.), no system message is passed either.
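One way to confirm this from a captured request payload is a small role check. The sketch below assumes the payload follows the OpenAI-style chat messages shape shown in this report; has_system_message is a hypothetical helper, not part of Hermes, and the prompt text and image URL are placeholders.

```python
from typing import Any


def has_system_message(messages: list[dict[str, Any]]) -> bool:
    """Return True if any message in the payload carries role "system"."""
    return any(msg.get("role") == "system" for msg in messages)


# Payload shaped like the one described above (placeholder values).
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image."},
            {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
        ],
    }
]

# Collect the roles that would be sent to the vision provider.
roles = [msg["role"] for msg in messages]
print(roles, has_system_message(messages))
```

Running a check like this against the logged vision request should show only the user role if the behavior described above is present.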

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp), Tools (terminal, file ops, web, code execution, etc.), Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

Discord, Telegram, Slack, WhatsApp

Debug Report

Report       https://paste.rs/h4J8l
agent.log    https://paste.rs/Rvl8T
gateway.log  https://paste.rs/B42SF

Operating System

Linux (WSL2 on Windows), 6.6.87.2-microsoft-standard-WSL2 x86_64

Python Version

3.11.15

Hermes Version

Hermes Agent v0.13.0 (2026.5.7), up to date

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

Two sites:

  1. tools/vision_tools.py lines 532–548: vision_analyze_tool() builds messages with only role: "user". No role: "system" message is added regardless of current session context.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": comprehensive_prompt},
            {"type": "image_url", "image_url": {"url": image_data_url}}
        ]
    }
]
  2. agent/auxiliary_client.py line 280: _CodexCompletionsAdapter.create() hardcodes:
instructions = "You are a helpful assistant."

It does check for a role: "system" message and would use it if present — but vision_tools.py never sends one.
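The override behavior described here can be illustrated with a minimal sketch. This is not the actual _CodexCompletionsAdapter code: resolve_instructions and DEFAULT_INSTRUCTIONS are hypothetical names, and the assumption is that the adapter uses a system message's text as instructions and forwards the remaining messages unchanged.

```python
from typing import Any

# Default taken from the report; the constant name is an assumption.
DEFAULT_INSTRUCTIONS = "You are a helpful assistant."


def resolve_instructions(
    messages: list[dict[str, Any]],
    default: str = DEFAULT_INSTRUCTIONS,
) -> tuple[str, list[dict[str, Any]]]:
    """Split a chat payload into (instructions, non-system messages).

    A non-empty string system message overrides the default; if several
    are present, the last one wins. Without one, the default is used.
    """
    instructions = default
    remaining: list[dict[str, Any]] = []
    for msg in messages:
        if msg.get("role") == "system":
            content = msg.get("content")
            if isinstance(content, str) and content.strip():
                instructions = content
        else:
            remaining.append(msg)
    return instructions, remaining


# With only a user message (the current vision payload), the default is used.
instructions, rest = resolve_instructions([{"role": "user", "content": "hi"}])
print(instructions)
```

Under this assumption the fallback itself needs no change; the caller only has to supply a system message.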

Call chain:

User sends image
  → gateway/run.py: _enrich_message_with_vision() (line 7409)
    → tools/vision_tools.py: vision_analyze_tool() (line 575)
      → agent/auxiliary_client.py: async_call_llm() → resolve_vision_provider_client()
        → For codex: _CodexCompletionsAdapter.create() → instructions hardcoded
        → For other providers: messages passed as-is (no system message)

Note: This affects ALL vision paths (gateway auto-description, CLI preprocessing, and explicit vision_analyze tool calls), not just the openai-codex fallback.

Proposed Fix (optional)

vision_analyze_tool() should accept the current session's system prompt and include it as a role: "system" message in the vision API call. This would propagate to:

  • Codex adapter → correctly populates instructions
  • Other providers → correctly populates system message

The Codex adapter's default fallback at auxiliary_client.py:280 is already designed to be overridden; it just never receives a system message.
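A rough sketch of the message-building side of this proposal follows. build_vision_messages and its system_prompt parameter are hypothetical; the real vision_analyze_tool() signature and how the session prompt would reach it are not shown in this report, so treat this as an illustration of the idea only.

```python
from typing import Any, Optional


def build_vision_messages(
    prompt: str,
    image_data_url: str,
    system_prompt: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Build a vision payload, prepending a system message when one is given."""
    messages: list[dict[str, Any]] = []
    # Skip empty or whitespace-only prompts so providers never see a blank system turn.
    if system_prompt and system_prompt.strip():
        messages.append({"role": "system", "content": system_prompt})
    messages.append(
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_data_url}},
            ],
        }
    )
    return messages


# Example: persona text as it might be read from SOUL.md (placeholder values).
payload = build_vision_messages(
    "Describe this image.",
    "data:image/png;base64,AAAA",
    system_prompt="You are an expert UI/UX designer. Focus on colors, spacing, typography.",
)
print([m["role"] for m in payload])
```

Keeping system_prompt optional means existing callers that pass no persona would produce the same payload as before.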

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
