hermes - ✅(Solved) Fix GLM models accessed via API gateways are incorrectly treated as Ollama backends [1 pull requests, 1 comments, 2 participants]

hermes2026-04-25 18:19:01

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#15751•Fetched 2026-04-26 05:25:17

View on GitHub

Comments

Participants

Timeline

Reactions

Author

happy5318

Participants

alt-glitch

happy5318

Timeline (top)

labeled ×3commented ×1cross-referenced ×1

Root Cause

The _is_ollama_glm_backend() method in run_agent.py uses is_local_endpoint() to detect Ollama backends. However, is_local_endpoint() returns True for any RFC-1918 private IP address, including:

192.168.x.x
10.x.x.x
172.16-31.x.x

These private IPs may host API gateways (New API, One API, etc.) that properly report finish_reason, not actual Ollama instances. The workaround for Ollama/GLM stop misreports should NOT apply to these cases.

Fix Action

Fix / Workaround

When using GLM models (e.g., glm-5, glm-4) through API gateways like New API deployed on private IP addresses (e.g., 192.168.x.x), Hermes incorrectly treats them as Ollama-hosted GLM models and triggers the _should_treat_stop_as_truncated() workaround.

GLM models accessed through API gateways should NOT trigger the Ollama/GLM stop misreport workaround, regardless of whether the gateway is on a private IP.

PR fix notes

PR #15752: fix: GLM models via API gateways incorrectly treated as Ollama backends

Repository: NousResearch/hermes-agent
Author: happy5318
State: open | merged: False
Link: https://github.com/NousResearch/hermes-agent/pull/15752

Description (problem / solution / changelog)

Fix: GLM models via API gateways incorrectly treated as Ollama backends

Summary

This PR fixes a false positive in _is_ollama_glm_backend() that caused GLM models accessed through API gateways on private IPs to be incorrectly treated as Ollama-hosted models.

Problem

The original implementation used is_local_endpoint() as a fallback detection mechanism. This function returns True for all RFC-1918 private IP addresses (192.168.x.x, 10.x.x.x, 172.16-31.x.x), which includes:

Local Ollama instances ✅ (intended)
API gateways like New API on private IPs ❌ (false positive)

When a GLM model is accessed via an API gateway on a private IP, the workaround for Ollama/GLM stop misreports was incorrectly triggered, causing:

Normal finish_reason="stop" responses being treated as truncated
Unnecessary continuation messages injected into conversations
Internal system messages visible to users

Solution

Modified _is_ollama_glm_backend() to only return True for:

Ollama endpoints: URLs containing "ollama" or using port :11434
z.ai provider: Always treat as affected (existing behavior)

Removed the is_local_endpoint() fallback because:

API gateways (New API, One API, etc.) properly report finish_reason
The Ollama/GLM stop misreport workaround is specifically for local Ollama instances
Private IPs don't necessarily mean "Ollama instance"

Changes

Modified _is_ollama_glm_backend() in run_agent.py
Added detailed docstring explaining the rationale
Added inline comments for clarity

Testing

Tested with:

GLM-5 model via New API gateway on private IP → No longer triggers false positive
Local Ollama with GLM model → Still correctly detected (port 11434)
z.ai provider → Still correctly detected

Fixes #15751

Changed files

run_agent.py (modified, +20/-3)

Code Example

model_aliases:
     my-glm:
       model: glm-5
       base_url: http://192.168.1.100:3000/v1
       provider: custom:newapi

RAW_BUFFERClick to expand / collapse

Issue: GLM models accessed via API gateways are incorrectly treated as Ollama backends

Problem Description

This causes:

Normal finish_reason="stop" responses being misidentified as truncated
Unnecessary continuation messages being injected into the conversation
Users seeing internal system messages like [System: Your previous response was truncated...] in the UI

Root Cause

192.168.x.x
10.x.x.x
172.16-31.x.x

Reproduction Steps

Deploy New API gateway on a private IP (e.g., http://192.168.1.100:3000)

Configure a GLM model through New API:

model_aliases:
  my-glm:
    model: glm-5
    base_url: http://192.168.1.100:3000/v1
    provider: custom:newapi

Start a conversation and observe responses
Check session JSON file - you'll see finish_reason="stop" from API but Hermes injects truncation continuation message

Expected Behavior

GLM models accessed through API gateways should NOT trigger the Ollama/GLM stop misreport workaround, regardless of whether the gateway is on a private IP.

Proposed Fix

Modify _is_ollama_glm_backend() to only return True for:

URLs containing "ollama"
URLs using Ollama's default port :11434
z.ai provider (existing behavior)

Remove the fallback to is_local_endpoint() for GLM models, as it's too broad and causes false positives.

Environment

Hermes Agent version: Latest (post v0.8.x)
Model: GLM-5 via New API gateway
Base URL: Private IP (e.g., 192.168.x.x)

extent analysis

TL;DR

Modify the _is_ollama_glm_backend() method to accurately identify Ollama backends without relying on is_local_endpoint() for GLM models.

Guidance

Review the _is_ollama_glm_backend() method in run_agent.py to understand the current logic for detecting Ollama backends.
Update the method to check for specific conditions that uniquely identify Ollama backends, such as URLs containing "ollama", using Ollama's default port :11434, or the z.ai provider.
Remove the fallback to is_local_endpoint() for GLM models to prevent false positives.
Test the updated method with various scenarios, including GLM models accessed through API gateways on private IPs, to ensure the workaround is not triggered unnecessarily.

Example

def _is_ollama_glm_backend(url):
    # Check for Ollama-specific conditions
    if "ollama" in url or url.endswith(":11434") or url.contains("z.ai"):
        return True
    # Remove fallback to is_local_endpoint() for GLM models
    return False

Notes

The proposed fix assumes that the _is_ollama_glm_backend() method is the primary cause of the issue. However, additional testing and verification may be necessary to ensure the fix resolves the problem entirely.

Recommendation

Apply the proposed workaround by modifying the _is_ollama_glm_backend() method to accurately identify Ollama backends, as this approach directly addresses the root cause of the issue.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #indexing error #inference speed #output truncation #response parsing

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

hermes - ✅(Solved) Fix GLM models accessed via API gateways are incorrectly treated as Ollama backends [1 pull requests, 1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

PR fix notes

PR #15752: fix: GLM models via API gateways incorrectly treated as Ollama backends

Description (problem / solution / changelog)

Fix: GLM models via API gateways incorrectly treated as Ollama backends

Summary

Problem

Solution

Changes

Testing

Related

Changed files

Code Example

Issue: GLM models accessed via API gateways are incorrectly treated as Ollama backends

Problem Description

Root Cause

Reproduction Steps

Expected Behavior

Proposed Fix

Environment

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING