hermes - 💡(How to fix) Fix [Bug]: Azure Foundry ignores explicit model.api_mode and routes chat-completions deployments to Responses API

Bug Description

Summary

Azure Foundry runtime can route explicit chat-completions deployments through the Responses API even when config.yaml sets:

model:
  provider: azure-foundry
  api_mode: chat_completions
  default: gpt-5.5

This breaks Azure Foundry deployments that support /chat/completions but not /responses.

Environment

  • Hermes Agent: v0.13.0 (2026.5.7)
  • Commit: 524cbabd8
  • OpenAI SDK: 2.31.0
  • Provider: azure-foundry

Repro

Config:

model:
  provider: azure-foundry
  api_mode: chat_completions
  default: gpt-5.5

model_aliases:
  grok-4-20-reasoning:
    model: grok-4-20-reasoning
    provider: azure-foundry
  kimi-k2.6:
    model: Kimi-K2.6
    provider: azure-foundry

Run:

hermes chat -Q --provider azure-foundry -m grok-4-20-reasoning -q 'Reply exactly: OK'
hermes chat -Q --provider azure-foundry -m Kimi-K2.6 -q 'Reply exactly: OK'

Observed errors:

This model is not supported by Responses API.

Direct OpenAI SDK probe against the same Azure Foundry endpoint shows:

  • client.chat.completions.create(model="grok-4-20-reasoning", ...) succeeds
  • client.responses.create(model="grok-4-20-reasoning", ...) fails
  • client.chat.completions.create(model="Kimi-K2.6", ...) succeeds
  • client.responses.create(model="Kimi-K2.6", ...) fails
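
The probe described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual script: the helper names `build_calls`, `probe`, and `main` are hypothetical, and it assumes the same `AZURE_FOUNDRY_BASE_URL` and `AZURE_FOUNDRY_API_KEY` environment variables used elsewhere in this report.

```python
import os
from typing import Callable, Dict


def build_calls(client, model: str) -> Dict[str, Callable[[], object]]:
    """Return one zero-argument call per transport for the given deployment."""
    prompt = "Reply exactly: OK"
    return {
        # Chat Completions transport (/chat/completions)
        "chat.completions": lambda: client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            max_completion_tokens=16,
        ),
        # Responses transport (/responses)
        "responses": lambda: client.responses.create(model=model, input=prompt),
    }


def probe(calls: Dict[str, Callable[[], object]]) -> Dict[str, str]:
    """Run each call and record 'ok' or the error text instead of raising."""
    results = {}
    for name, call in calls.items():
        try:
            call()
            results[name] = "ok"
        except Exception as exc:  # the SDK raises its own error types on HTTP 4xx
            results[name] = f"failed: {exc}"
    return results


def main() -> None:
    """Live run; needs the openai package and the two environment variables."""
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["AZURE_FOUNDRY_API_KEY"],
        base_url=os.environ["AZURE_FOUNDRY_BASE_URL"].rstrip("/"),
    )
    for model in ["grok-4-20-reasoning", "Kimi-K2.6"]:
        print(model, probe(build_calls(client, model)))
```

Calling `main()` against a live endpoint prints one result per transport for each deployment. A deployment that shows `ok` for `chat.completions` and `failed` for `responses` matches the pattern reported here.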

Expected

If model.api_mode: chat_completions is explicitly configured, Hermes should use chat completions for Azure Foundry unless the user explicitly chooses another API mode.

Suspected Cause

_resolve_azure_foundry_runtime() infers API mode from model family and can override the configured api_mode.

Suggested Fix

Only run Azure Foundry model-family API mode inference when model.api_mode is absent/unset. Explicit config should win.

Pseudo patch:

# Infer the API mode from the model family only when model.api_mode is unset.
if not model_cfg.get("api_mode"):
    effective_model = str(target_model or model_cfg.get("default") or "").strip()
    # Leave an anthropic_messages mode untouched.
    if effective_model and cfg_api_mode != "anthropic_messages":
        inferred = azure_foundry_model_api_mode(effective_model)
        if inferred:
            cfg_api_mode = inferred
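
The precedence rule behind the pseudo patch can be shown as a standalone sketch. It does not reproduce the Hermes source: `resolve_api_mode` and `infer_api_mode_stub` are hypothetical names, the stub only stands in for `azure_foundry_model_api_mode()`, and the string `"responses"` is a placeholder for whatever value Hermes uses internally for the Responses transport.

```python
from typing import Callable, Mapping, Optional


def infer_api_mode_stub(model: str) -> Optional[str]:
    """Hypothetical stand-in for azure_foundry_model_api_mode().

    Assumption for illustration: it always suggests the Responses transport,
    which mimics the misrouting described in this report.
    """
    return "responses"


def resolve_api_mode(
    model_cfg: Mapping[str, object],
    target_model: Optional[str] = None,
    infer: Callable[[str], Optional[str]] = infer_api_mode_stub,
    fallback: str = "chat_completions",
) -> str:
    """Pick the API mode, letting an explicit model.api_mode always win."""
    explicit = str(model_cfg.get("api_mode") or "").strip()
    if explicit:
        # Explicit user config: never consult model-family inference.
        return explicit

    effective_model = str(target_model or model_cfg.get("default") or "").strip()
    if effective_model:
        inferred = infer(effective_model)
        if inferred:
            return inferred
    return fallback


cfg = {"provider": "azure-foundry", "api_mode": "chat_completions", "default": "gpt-5.5"}
print(resolve_api_mode(cfg, "grok-4-20-reasoning"))  # explicit mode is kept
print(resolve_api_mode({"provider": "azure-foundry", "default": "gpt-5.5"}))  # inferred
```

Inference still runs when `api_mode` is absent, so existing setups that rely on it behave as before. Only configurations that state a mode explicitly change behavior.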

Steps to Reproduce

  1. Configure Hermes with Azure Foundry and an explicit chat-completions API mode:

model:
  provider: azure-foundry
  api_mode: chat_completions
  default: gpt-5.5

model_aliases:
  grok-4-20-reasoning:
    model: grok-4-20-reasoning
    provider: azure-foundry
  kimi-k2.6:
    model: Kimi-K2.6
    provider: azure-foundry

  2. Ensure AZURE_FOUNDRY_BASE_URL and AZURE_FOUNDRY_API_KEY point to an Azure Foundry OpenAI-compatible endpoint.

  3. Verify the deployments support Chat Completions directly:

from openai import OpenAI
import os

client = OpenAI(
    api_key=os.environ["AZURE_FOUNDRY_API_KEY"],
    base_url=os.environ["AZURE_FOUNDRY_BASE_URL"].rstrip("/"),
)

for model in ["grok-4-20-reasoning", "Kimi-K2.6"]:
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Reply exactly: OK"}],
        max_completion_tokens=16,
    )
    print(model, r.choices[0].message.content)

  4. Run the same deployments through Hermes:

hermes chat -Q --provider azure-foundry -m grok-4-20-reasoning -q 'Reply exactly: OK'
hermes chat -Q --provider azure-foundry -m Kimi-K2.6 -q 'Reply exactly: OK'

  5. Observe Hermes failing with:

This model is not supported by Responses API.

Expected Behavior

When model.api_mode: chat_completions is explicitly configured for provider: azure-foundry, Hermes should use the Chat Completions transport for Azure Foundry calls.

The model-family inference that upgrades Azure Foundry models to the Responses API should not override an explicit user-provided model.api_mode.

Expected results:

hermes chat -Q --provider azure-foundry -m grok-4-20-reasoning -q 'Reply exactly: OK'
# OK

hermes chat -Q --provider azure-foundry -m Kimi-K2.6 -q 'Reply exactly: OK'
# OK

This should match the direct OpenAI SDK behavior against the same Azure Foundry endpoint, where client.chat.completions.create(...) succeeds for both deployments.

Actual Behavior

Hermes routes these Azure Foundry deployments through the Responses API despite model.api_mode: chat_completions.

Observed errors:

Error code: 400 - {'error': {'message': 'This model is not supported by Responses API.', 'type': 'invalid_request_error'}}

Direct SDK calls to /chat/completions succeed for the same deployments, while direct SDK calls to /responses fail. That suggests Hermes is selecting the wrong transport, not that the deployments are broken.
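
When triaging logs, it can help to separate this failure from other 400 responses. The sketch below parses the error line exactly as quoted above. The helper names are hypothetical, and the parsing assumes the `Error code: <status> - <dict repr>` layout seen in this report; other SDK versions or providers may format errors differently.

```python
import ast
from typing import Optional, Tuple

SAMPLE = (
    "Error code: 400 - {'error': {'message': "
    "'This model is not supported by Responses API.', "
    "'type': 'invalid_request_error'}}"
)


def parse_sdk_error(text: str) -> Tuple[int, str, Optional[str]]:
    """Split 'Error code: N - {...}' into (status, message, type)."""
    prefix, _, payload = text.partition(" - ")
    status = int(prefix.rsplit(":", 1)[1])  # "Error code: 400" -> 400
    body = ast.literal_eval(payload)  # the payload is a Python dict repr
    error = body["error"]
    return status, error["message"], error.get("type")


def looks_like_wrong_transport(text: str) -> bool:
    """True when the error says the deployment lacks Responses API support."""
    status, message, _ = parse_sdk_error(text)
    return status == 400 and "not supported by Responses API" in message


print(parse_sdk_error(SAMPLE))
print(looks_like_wrong_transport(SAMPLE))
```

A match suggests the request went to the wrong transport for a deployment that is otherwise healthy, which is the situation described in this issue.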

Affected Component

Setup / Installation

Debug Report

Report       https://paste.rs/Bsy8k
  agent.log    https://dpaste.com/FXZAMKVSG
  gateway.log  https://paste.rs/WjKVM

Operating System

macOS Sonoma

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR
