hermes - 💡(How to fix) Fix [Bug]: Hermes-Agent keeps probing non-existent google/gemini-3-flash-preview model when using LM Studio OpenAI-compatible API

Error Message

Environment:

Hermes-Agent: latest version
Backend: LM Studio OpenAI-compatible API
OS: macOS

Local model:

google/gemma-4-26b-a4b

Observed LM Studio logs:

2026-05-11 22:23:21 [DEBUG] Received request: GET to /api/v1/models

2026-05-11 22:23:21 [INFO] Returning 3 models from v1 API

2026-05-11 22:23:21 [DEBUG] Received request: GET to /v1/models/google/gemini-3-flash-preview

2026-05-11 22:23:21 [ERROR] Error: Model with identifier 'google/gemini-3-flash-preview' not found

2026-05-11 22:23:21 [DEBUG] Received request: GET to /v1/models

2026-05-11 22:23:21 [INFO] Returning {
  "data": [
    {
      "id": "google/gemma-4-26b-a4b",
      "object": "model",
      "owned_by": "organization_owner"
    }
  ],
  "object": "list"
}

Even after LM Studio has loaded the model and is ready, and even when Hermes does end up using my configured gemma4-26b-a4b, a message sent from the Hermes CLI takes three minutes before LM Studio starts working on it. The LM Studio logs at that point are as follows:

2026-05-12 09:39:33 [DEBUG]

warmup: warmup with image size = 768 x 768

2026-05-12 09:39:33 [DEBUG]

alloc_compute_meta: MTL0 compute buffer size = 150.63 MiB

alloc_compute_meta: CPU compute buffer size = 6.77 MiB

alloc_compute_meta: graph splits = 1, nodes = 1569

2026-05-12 09:39:33 [DEBUG]

warmup: flash attention is enabled

srv load_model: loaded multimodal

No Python traceback observed.

The issue appears during model probing / provider initialization.
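
To make the failed probes easier to count across a longer log, the LM Studio output can be scanned mechanically. The following is a minimal sketch, assuming the log lines keep the wording shown in this report ("Received request: GET to ..." and "Model with identifier '...' not found"); the function name summarize_probes and the regular expressions are illustrative and are not part of Hermes-Agent or LM Studio.

```python
import re

# Patterns are assumptions based on the log wording quoted in this report.
PROBE_RE = re.compile(r"Received request: GET to /v1/models/(\S+)")
NOT_FOUND_RE = re.compile(r"Model with identifier '([^']+)' not found")


def summarize_probes(log_text: str) -> dict[str, list[str]]:
    """Collect per-model probe requests and 'not found' errors from a log."""
    return {
        "probed": PROBE_RE.findall(log_text),
        "not_found": NOT_FOUND_RE.findall(log_text),
    }


# Excerpt copied from the LM Studio logs in this report.
SAMPLE_LOG = """\
2026-05-11 22:23:21 [DEBUG] Received request: GET to /api/v1/models
2026-05-11 22:23:21 [INFO] Returning 3 models from v1 API
2026-05-11 22:23:21 [DEBUG] Received request: GET to /v1/models/google/gemini-3-flash-preview
2026-05-11 22:23:21 [ERROR] Error: Model with identifier 'google/gemini-3-flash-preview' not found
2026-05-11 22:23:21 [DEBUG] Received request: GET to /v1/models
"""

summary = summarize_probes(SAMPLE_LOG)
```

Only requests for a specific model ID (a path below /v1/models/) are counted as probes; the plain list requests to /v1/models and /api/v1/models are ignored.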

Bug Description

When Hermes-Agent is used with the LM Studio OpenAI-compatible local API, it repeatedly attempts to query a non-existent model:

google/gemini-3-flash-preview

even though LM Studio only exposes locally loaded models such as:

google/gemma-4-26b-a4b

This causes repeated model probe failures and prevents stable initialization / usage.

The issue appears to come from Hermes-Agent internally defaulting to a Gemini provider model name instead of respecting the actual /v1/models response returned by LM Studio.

Steps to Reproduce

1. Install and launch LM Studio.
2. Load a local model: google/gemma-4-26b-a4b
3. Enable the LM Studio OpenAI-compatible API server.
4. Configure Hermes-Agent to use the LM Studio endpoint.
5. Start Hermes-Agent.

LM Studio logs then show repeated requests like:

GET /api/v1/models
GET /v1/models/google/gemini-3-flash-preview

followed by:

Error: Model with identifier 'google/gemini-3-flash-preview' not found

Expected Behavior

Hermes-Agent should:

1. Respect the model IDs returned by GET /v1/models.
2. Use the actually available local model: google/gemma-4-26b-a4b
3. Avoid probing hardcoded Gemini cloud model names unless explicitly configured.
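
The expected behavior can be sketched as a small selection function. This is an illustration only, not Hermes-Agent's actual code: the name resolve_model, its parameters, and the choice to raise an error for an unavailable configured model are assumptions made for the example.

```python
from typing import Optional


def resolve_model(configured: Optional[str], available: list[str]) -> str:
    """Pick a model ID using only what the server advertises.

    `configured` is the model the user set explicitly (or None);
    `available` is the list of IDs returned by GET /v1/models.
    """
    if not available:
        raise LookupError("the server advertises no models")
    if configured is None:
        # Nothing configured: use an advertised model, never a cloud default.
        return available[0]
    if configured in available:
        return configured
    # Configured but not advertised: fail loudly instead of probing repeatedly.
    raise LookupError(
        f"configured model {configured!r} is not in the advertised list {available}"
    )


advertised = ["google/gemma-4-26b-a4b"]
chosen = resolve_model(None, advertised)
```

With the model list from this report, the function returns google/gemma-4-26b-a4b when nothing is configured, and it rejects google/gemini-3-flash-preview up front rather than sending a probe that the server will answer with "not found".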

Actual Behavior

Hermes-Agent repeatedly probes:

google/gemini-3-flash-preview

even though:

GET /v1/models

returns:

{
  "data": [
    {
      "id": "google/gemma-4-26b-a4b",
      "object": "model",
      "owned_by": "organization_owner"
    }
  ],
  "object": "list"
}

This leads to repeated initialization errors and model lookup failures.
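
The mismatch can be confirmed independently of Hermes-Agent by asking the server which IDs it advertises. The sketch below uses only the Python standard library. The helper names are illustrative, and the base URL in the usage note is an assumption about a default LM Studio setup, so adjust it to your own configuration.

```python
import json
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Return the model IDs from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in payload.get("data", []) if "id" in entry]


def fetch_model_ids(base_url: str, timeout: float = 5.0) -> list[str]:
    """GET {base_url}/models and return the advertised model IDs."""
    url = base_url.rstrip("/") + "/models"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return extract_model_ids(json.load(response))


# Response body copied from the LM Studio log in this report.
SAMPLE_PAYLOAD = {
    "data": [
        {
            "id": "google/gemma-4-26b-a4b",
            "object": "model",
            "owned_by": "organization_owner",
        }
    ],
    "object": "list",
}

advertised_ids = extract_model_ids(SAMPLE_PAYLOAD)
```

Against a running server, fetch_model_ids("http://localhost:1234/v1") would return the live list, assuming the server listens on that address. Any model ID that Hermes-Agent probes but that is missing from this list will produce the "not found" error shown above.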

Affected Component

CLI (interactive chat)

Messaging Platform (if gateway-related)

N/A (CLI only)

Operating System

Ubuntu 24.04

Python Version

3.11.15

Hermes Version

0.13.0

Root Cause Analysis (optional)

Possible causes:

1. Hermes-Agent may contain a hardcoded default Gemini model: google/gemini-3-flash-preview
2. Hermes-Agent may be mixing Google Gemini provider logic with OpenAI-compatible provider logic.
3. Hermes-Agent may ignore actual /v1/models responses and probe a fallback/default model instead.
4. Model auto-discovery logic may not correctly support LM Studio local model IDs.
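
One way the first two hypotheses could produce this behavior is a default-model lookup whose fallback value is a cloud model. The sketch below is purely hypothetical: the table, the provider names, and both functions are invented for illustration and are not taken from the Hermes-Agent source.

```python
from typing import Optional

# Hypothetical per-provider defaults; not taken from Hermes-Agent.
PROVIDER_DEFAULTS = {"gemini": "google/gemini-3-flash-preview"}
GLOBAL_DEFAULT = "google/gemini-3-flash-preview"


def default_model_leaky(provider: str) -> str:
    """Suspected pattern: unknown providers inherit a cloud default."""
    return PROVIDER_DEFAULTS.get(provider, GLOBAL_DEFAULT)


def default_model_scoped(provider: str) -> Optional[str]:
    """Safer pattern: only providers with an explicit entry get a default."""
    return PROVIDER_DEFAULTS.get(provider)


# "custom" stands in for an OpenAI-compatible local endpoint such as LM Studio.
leaky = default_model_leaky("custom")
scoped = default_model_scoped("custom")
```

In the leaky variant, a local endpoint ends up with google/gemini-3-flash-preview as its default. In the scoped variant it gets no default at all, which would force the caller to fall back to the IDs returned by /v1/models.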

Proposed Fix (optional)

No response

Are you willing to submit a PR for this?

I'd like to fix this myself and submit a PR
