hermes - 💡(How to fix) Fix [Bug]: Vision fallback (auxiliary.vision) ignores SOUL.md system prompt — session context lost during image analysis [1 pull requests]

Bug Description

When using a non-vision LLM (e.g., DeepSeek V4 Flash) as the main model, user-attached images are auto-processed via gateway/run.py:_enrich_message_with_vision() → tools/vision_tools.py:vision_analyze_tool() → agent/auxiliary_client.py (Codex Responses API or other providers). The outgoing vision API call never includes the current session's system prompt (SOUL.md / persona). For the openai-codex path specifically, instructions is hardcoded to "You are a helpful assistant.".

This means any persona customization — "you are a graphic designer", "you are a security researcher", etc. — is completely lost during the image description step. The vision model describes the image in generic terms, and the main model receives a description that doesn't reflect the intended analytical lens.

Steps to Reproduce

1. Set up Hermes with a non-vision main model (e.g., deepseek-v4-flash).
2. Configure auxiliary.vision with any provider (e.g., openai-codex, model gpt-5.5).
3. Set a persona in SOUL.md, e.g., "You are an expert UI/UX designer. Focus on colors, spacing, typography."
4. Send an image in chat (Discord, Telegram, or CLI with --image).
5. Observe the auto-generated description: it is generic, with no persona-specific lens applied.
6. The main model responds based on this generic description.

Expected Behavior

The auxiliary vision call should carry the current session's system prompt / SOUL.md persona, so the image description respects the user's configured context. For the Codex path, instructions should reflect the session's system prompt rather than the hardcoded default.

Actual Behavior

The messages array built by vision_analyze_tool() contains only role: "user" — no role: "system" message is ever included. For the Codex path, the adapter defaults to instructions = "You are a helpful assistant.". For other providers (OpenRouter, Anthropic, etc.), no system message is passed either.

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp), Tools (terminal, file ops, web, code execution, etc.), Agent Core (conversation loop, context compression, memory)

Messaging Platform (if gateway-related)

Discord, Telegram, Slack, WhatsApp

Debug Report

Report       https://paste.rs/h4J8l
agent.log    https://paste.rs/Rvl8T
gateway.log  https://paste.rs/B42SF

Operating System

Linux (WSL2 on Windows), 6.6.87.2-microsoft-standard-WSL2 x86_64

Python Version

3.11.15

Hermes Version

Hermes Agent v0.13.0 (2026.5.7), up to date

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

Two sites:

tools/vision_tools.py lines 532–548 — vision_analyze_tool() builds messages with only role: "user". No role: "system" message is added regardless of current session context.

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": comprehensive_prompt},
            {"type": "image_url", "image_url": {"url": image_data_url}}
        ]
    }
]

agent/auxiliary_client.py line 280 — _CodexCompletionsAdapter.create() hardcodes:

instructions = "You are a helpful assistant."

It does check for a role: "system" message and would use it if present — but vision_tools.py never sends one.
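To make the adapter behavior described above concrete, here is a minimal sketch of a "use the system message if present, otherwise fall back" rule. It is not the actual body of _CodexCompletionsAdapter.create(); the helper name resolve_instructions and the constant DEFAULT_INSTRUCTIONS are hypothetical, and only the default string is taken from the report.

```python
from typing import Any

# Default string quoted in the report; the constant name is hypothetical.
DEFAULT_INSTRUCTIONS = "You are a helpful assistant."


def resolve_instructions(
    messages: list[dict[str, Any]],
) -> tuple[str, list[dict[str, Any]]]:
    """Split a chat-style message list into (instructions, other messages).

    Hypothetical helper: a non-empty string system message overrides the
    default; system messages are never forwarded as regular input.
    """
    instructions = DEFAULT_INSTRUCTIONS
    remaining: list[dict[str, Any]] = []
    for message in messages:
        if message.get("role") == "system":
            content = message.get("content")
            if isinstance(content, str) and content.strip():
                instructions = content
            continue
        remaining.append(message)
    return instructions, remaining
```

With a message list that contains only a role: "user" entry, as vision_tools.py is reported to send, this rule can only ever return the default.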

Call chain:

User sends image
  → gateway/run.py: _enrich_message_with_vision() (line 7409)
    → tools/vision_tools.py: vision_analyze_tool() (line 575)
      → agent/auxiliary_client.py: async_call_llm() → resolve_vision_provider_client()
        → For codex: _CodexCompletionsAdapter.create() → instructions hardcoded
        → For other providers: messages passed as-is (no system message)

Note: This affects ALL vision paths (gateway auto-description, CLI preprocessing, and explicit vision_analyze tool calls), not just the openai-codex fallback.
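Since the report says every vision path is affected, one way to confirm the bug (or a later fix) is to inspect the outgoing message list just before it is handed to the provider client. The sketch below assumes the OpenAI-style chat message shape shown above; has_system_prompt is a hypothetical helper, not part of Hermes.

```python
from typing import Any


def has_system_prompt(messages: list[dict[str, Any]]) -> bool:
    """Return True if any system message carries non-empty text.

    Hypothetical diagnostic helper; accepts either a plain string or a
    list of {"type": "text", "text": ...} parts as the message content.
    """
    for message in messages:
        if message.get("role") != "system":
            continue
        content = message.get("content")
        if isinstance(content, str) and content.strip():
            return True
        if isinstance(content, list) and any(
            isinstance(part, dict) and str(part.get("text", "")).strip()
            for part in content
        ):
            return True
    return False
```

Applied to the message list quoted in the root cause analysis, this returns False, which matches the reported behavior.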

Proposed Fix (optional)

vision_analyze_tool() should accept the current session's system prompt and include it as a role: "system" message in the vision API call. This would propagate to:

Codex adapter → correctly populates instructions
Other providers → correctly populates system message

The Codex adapter's default fallback at auxiliary_client.py:280 is already designed to be overridden; it just never receives a system message.
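A rough sketch of what the proposed change could look like at the message-building step follows. It is an illustration under assumptions, not the merged fix: build_vision_messages and its system_prompt parameter are hypothetical names, and how the session's SOUL.md text reaches this function is left open.

```python
from typing import Any, Optional


def build_vision_messages(
    comprehensive_prompt: str,
    image_data_url: str,
    system_prompt: Optional[str] = None,  # hypothetical new parameter
) -> list[dict[str, Any]]:
    """Build the vision request messages, prepending the session persona.

    When system_prompt is missing or blank, the result is the same
    user-only list the report describes.
    """
    messages: list[dict[str, Any]] = []
    if system_prompt and system_prompt.strip():
        messages.append({"role": "system", "content": system_prompt})
    messages.append(
        {
            "role": "user",
            "content": [
                {"type": "text", "text": comprehensive_prompt},
                {"type": "image_url", "image_url": {"url": image_data_url}},
            ],
        }
    )
    return messages
```

Keeping system_prompt optional means existing callers that pass no persona would see no change in the request they send.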

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR
