hermes - ✅ (Solved) Fix: GLM models accessed via API gateways are incorrectly treated as Ollama backends [1 pull request, 1 comment, 2 participants]

GitHub stats
NousResearch/hermes-agent#15751 (fetched 2026-04-26 05:25:17)
Comments: 1
Participants: 2
Timeline: 5
Reactions: 0
Timeline (top): labeled ×3, commented ×1, cross-referenced ×1

Root Cause

The _is_ollama_glm_backend() method in run_agent.py uses is_local_endpoint() to detect Ollama backends. However, is_local_endpoint() returns True for any RFC-1918 private IP address, including:

  • 192.168.x.x
  • 10.x.x.x
  • 172.16-31.x.x

These private IPs may host API gateways (New API, One API, etc.) that properly report finish_reason, not actual Ollama instances. The workaround for Ollama/GLM stop misreports should NOT apply to these cases.
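
To illustrate why a private-IP test is too broad a signal, here is a minimal sketch of an RFC-1918 check. The actual body of is_local_endpoint() is not shown in this issue, so the name `looks_private` and its logic are assumptions made purely for illustration; it only tests whether the URL's host is a literal IP inside one of the three ranges listed above.

```python
import ipaddress
from urllib.parse import urlparse

# The three RFC-1918 private ranges named in the issue.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def looks_private(base_url: str) -> bool:
    """Hypothetical stand-in for a private-IP check (not Hermes code)."""
    host = urlparse(base_url).hostname
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name rather than a literal IP address
    return any(addr in net for net in RFC1918_NETWORKS)


# A gateway and an Ollama instance on the same LAN both pass this check,
# so the check alone cannot tell them apart.
print(looks_private("http://192.168.1.100:3000/v1"))  # New API gateway
print(looks_private("http://192.168.1.50:11434/v1"))  # Ollama default port
```

Both calls return the same answer, which is the false positive described above: the address range says nothing about what software is listening on it.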

Fix Action

Fix / Workaround

When using GLM models (e.g., glm-5, glm-4) through API gateways like New API deployed on private IP addresses (e.g., 192.168.x.x), Hermes incorrectly treats them as Ollama-hosted GLM models and triggers the _should_treat_stop_as_truncated() workaround.

GLM models accessed through API gateways should NOT trigger the Ollama/GLM stop misreport workaround, regardless of whether the gateway is on a private IP.

PR fix notes

PR #15752: fix: GLM models via API gateways incorrectly treated as Ollama backends

Description (problem / solution / changelog)

Fix: GLM models via API gateways incorrectly treated as Ollama backends

Summary

This PR fixes a false positive in _is_ollama_glm_backend() that caused GLM models accessed through API gateways on private IPs to be incorrectly treated as Ollama-hosted models.

Problem

The original implementation used is_local_endpoint() as a fallback detection mechanism. This function returns True for all RFC-1918 private IP addresses (192.168.x.x, 10.x.x.x, 172.16-31.x.x), which includes:

  • Local Ollama instances ✅ (intended)
  • API gateways like New API on private IPs ❌ (false positive)

When a GLM model is accessed via an API gateway on a private IP, the workaround for Ollama/GLM stop misreports was incorrectly triggered, causing:

  • Normal finish_reason="stop" responses being treated as truncated
  • Unnecessary continuation messages injected into conversations
  • Internal system messages visible to users

Solution

Modified _is_ollama_glm_backend() to only return True for:

  1. Ollama endpoints: URLs containing "ollama" or using port :11434
  2. z.ai provider: Always treat as affected (existing behavior)

Removed the is_local_endpoint() fallback because:

  • API gateways (New API, One API, etc.) properly report finish_reason
  • The Ollama/GLM stop misreport workaround is specifically for local Ollama instances
  • Private IPs don't necessarily mean "Ollama instance"
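
The narrowed rule can be sketched as follows. This is an illustration under stated assumptions, not the code from PR #15752: the function name `is_ollama_glm_backend_sketch`, the `provider` parameter, and the provider strings used to recognise z.ai are all hypothetical.

```python
from urllib.parse import urlparse

OLLAMA_DEFAULT_PORT = 11434


def is_ollama_glm_backend_sketch(base_url: str, provider: str = "") -> bool:
    """Illustrative only: flag a backend solely on Ollama-specific signals."""
    # Assumption: the z.ai provider is identified by one of these strings.
    if provider.lower() in {"zai", "z.ai"}:
        return True
    url = (base_url or "").lower()
    # Signal 1: the URL mentions "ollama" anywhere.
    if "ollama" in url:
        return True
    # Signal 2: the URL uses Ollama's default port.
    try:
        port = urlparse(url).port
    except ValueError:
        port = None  # malformed port in the URL
    # Deliberately no private-IP fallback: a LAN address is not evidence of Ollama.
    return port == OLLAMA_DEFAULT_PORT


# Gateway on a private IP: not flagged.
print(is_ollama_glm_backend_sketch("http://192.168.1.100:3000/v1", "custom:newapi"))
# Local Ollama on its default port: flagged.
print(is_ollama_glm_backend_sketch("http://localhost:11434/v1"))
```

Parsing the port rather than matching a substring avoids accidental hits on paths or hostnames that merely contain the digits, at the cost of missing an Ollama instance that runs on a non-default port under a neutral hostname.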

Changes

  • Modified _is_ollama_glm_backend() in run_agent.py
  • Added detailed docstring explaining the rationale
  • Added inline comments for clarity

Testing

Tested with:

  • GLM-5 model via New API gateway on private IP → No longer triggers false positive
  • Local Ollama with GLM model → Still correctly detected (port 11434)
  • z.ai provider → Still correctly detected

Related

Fixes #15751

Changed files

  • run_agent.py (modified, +20/-3)

Code Example

model_aliases:
  my-glm:
    model: glm-5
    base_url: http://192.168.1.100:3000/v1
    provider: custom:newapi

Issue: GLM models accessed via API gateways are incorrectly treated as Ollama backends

Problem Description

When using GLM models (e.g., glm-5, glm-4) through API gateways like New API deployed on private IP addresses (e.g., 192.168.x.x), Hermes incorrectly treats them as Ollama-hosted GLM models and triggers the _should_treat_stop_as_truncated() workaround.

This causes:

  1. Normal finish_reason="stop" responses being misidentified as truncated
  2. Unnecessary continuation messages being injected into the conversation
  3. Users seeing internal system messages like [System: Your previous response was truncated...] in the UI

Root Cause

The _is_ollama_glm_backend() method in run_agent.py uses is_local_endpoint() to detect Ollama backends. However, is_local_endpoint() returns True for any RFC-1918 private IP address, including:

  • 192.168.x.x
  • 10.x.x.x
  • 172.16-31.x.x

These private IPs may host API gateways (New API, One API, etc.) that properly report finish_reason, not actual Ollama instances. The workaround for Ollama/GLM stop misreports should NOT apply to these cases.

Reproduction Steps

  1. Deploy New API gateway on a private IP (e.g., http://192.168.1.100:3000)
  2. Configure a GLM model through New API:
    model_aliases:
      my-glm:
        model: glm-5
        base_url: http://192.168.1.100:3000/v1
        provider: custom:newapi
  3. Start a conversation and observe responses
  4. Check the session JSON file: the API returns finish_reason="stop", but Hermes still injects a truncation continuation message

Expected Behavior

GLM models accessed through API gateways should NOT trigger the Ollama/GLM stop misreport workaround, regardless of whether the gateway is on a private IP.

Proposed Fix

Modify _is_ollama_glm_backend() to only return True for:

  1. URLs containing "ollama"
  2. URLs using Ollama's default port :11434
  3. z.ai provider (existing behavior)

Remove the fallback to is_local_endpoint() for GLM models, as it's too broad and causes false positives.

Environment

  • Hermes Agent version: Latest (post v0.8.x)
  • Model: GLM-5 via New API gateway
  • Base URL: Private IP (e.g., 192.168.x.x)

extent analysis

TL;DR

Modify the _is_ollama_glm_backend() method to accurately identify Ollama backends without relying on is_local_endpoint() for GLM models.

Guidance

  • Review the _is_ollama_glm_backend() method in run_agent.py to understand the current logic for detecting Ollama backends.
  • Update the method to check for specific conditions that uniquely identify Ollama backends, such as URLs containing "ollama", using Ollama's default port :11434, or the z.ai provider.
  • Remove the fallback to is_local_endpoint() for GLM models to prevent false positives.
  • Test the updated method with various scenarios, including GLM models accessed through API gateways on private IPs, to ensure the workaround is not triggered unnecessarily.

Example

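def _is_ollama_glm_backend(url):
    # Only Ollama-specific signals count; a private IP alone is not one.
    url = (url or "").lower()
    # ":11434" is matched anywhere because base URLs usually carry a path such as /v1.
    if "ollama" in url or ":11434" in url or "z.ai" in url:
        return True
    # No fallback to is_local_endpoint() for GLM models
    return False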

Notes

The proposed fix assumes that the _is_ollama_glm_backend() method is the primary cause of the issue. However, additional testing and verification may be necessary to ensure the fix resolves the problem entirely.

Recommendation

Apply the proposed fix: narrow _is_ollama_glm_backend() so that it matches only actual Ollama backends (and the z.ai provider). This addresses the root cause directly instead of masking the symptoms.
