hermes - ✅(Solved) Fix DeepSeek direct API 400 "reasoning_content must be passed back" on multi-turn tool calls [1 pull requests, 3 comments, 4 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
NousResearch/hermes-agent#17212Fetched 2026-04-29 06:36:41
View on GitHub
Comments
3
Participants
4
Timeline
8
Reactions
0
Timeline (top)
labeled ×4commented ×3cross-referenced ×1

When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the direct API endpoint (api.deepseek.com), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is:

{"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}}

Error Message

Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"

When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the direct API endpoint (api.deepseek.com), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is: {"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}} 5. On the next API call, the compressed messages no longer contain reasoning_content, but the DeepSeek API requires it to be present when the previous assistant message originally had it → 400 error.

Root Cause

Root Cause Chain

Fix Action

Fixed

PR fix notes

PR #17225: fix(agent): send thinking control to DeepSeek direct API to prevent 400 on tool-call replay

Description (problem / solution / changelog)

What does this PR do?

DeepSeek's direct API (api.deepseek.com) defaults to thinking=enabled when no explicit thinking parameter is sent. When the user sets reasoning_effort: none in config, the setting was silently ignored because _supports_reasoning_extra_body() returns False for direct DeepSeek endpoints — so extra_body.reasoning (OpenRouter format) was never sent. The model defaulted to thinking=enabled, generated reasoning_content in its response, and the next API call failed with HTTP 400: "reasoning_content must be passed back".

This PR adds DeepSeek direct API thinking control to ChatCompletionsTransport.build_kwargs(), following the existing Kimi pattern: emit extra_body.thinking with type: "enabled" or type: "disabled" based on the user's reasoning_config. The is_deepseek_direct flag is threaded from run_agent.py, detected via provider name (deepseek) or base URL matching (api.deepseek.com).

The OpenRouter DeepSeek path is unchanged — it uses extra_body.reasoning which is a separate parameter handled by the existing supports_reasoning gate.

Related Issue

Fixes #17212

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)

Changes Made

  • agent/transports/chat_completions.py: Added DeepSeek direct API extra_body.thinking block (after existing Kimi block), with enabled/disabled control based on reasoning_config
  • run_agent.py: Added _is_deepseek_direct detection (provider == "deepseek" or base_url matches api.deepseek.com) and threaded it to transport params
  • tests/agent/transports/test_chat_completions.py: Added 5 test cases covering default enabled, disabled via config, effort=none, OpenRouter unchanged, Kimi unchanged

How to Test

  1. Configure Hermes with DeepSeek direct API (api.deepseek.com) and reasoning_effort: none
  2. Send a message that triggers tool calls (e.g. "run date")
  3. Verify the second API call succeeds (no 400 error)
  4. Run the targeted tests: pytest tests/agent/transports/test_chat_completions.py -v
  5. Run full suite: pytest tests/ -q --ignore=tests/integration --ignore=tests/e2e

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes
  • I've tested on my platform: macOS (Darwin 25.4.0, Apple Silicon), Python 3.11

Documentation & Housekeeping

  • Updated relevant documentation — or N/A
  • Updated cli-config.yaml.example — or N/A
  • Updated contributing / agents docs — or N/A
  • Considered cross-platform impact — or N/A
  • Updated tool descriptions/schemas — or N/A

Changed files

  • agent/transports/chat_completions.py (modified, +14/-0)
  • run_agent.py (modified, +5/-0)
  • tests/agent/transports/test_chat_completions.py (modified, +43/-0)

Code Example

# DeepSeek V-series (v4-pro, v4-flash, etc.): thinking.type control
# Direct DeepSeek API uses "thinking" parameter, not OpenRouter's
# "reasoning".  Without this, reasoning_effort: none from config is
# silently ignored and the model defaults to thinking=enabled.
# That generates reasoning_content which compression strips →
# 400 "reasoning_content must be passed back" on next turn.
if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }
RAW_BUFFERClick to expand / collapse

Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"

Summary

When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the direct API endpoint (api.deepseek.com), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is:

{"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}}

Root Cause Chain

  1. _supports_reasoning_extra_body() gate is too narrow (run_agent.py:7315): For non-OpenRouter base URLs, it returns False. Direct DeepSeek API hits this path.

  2. User config sets reasoning_effort: none (thinking disabled), but because supports_reasoning is False, the reasoning_config dict is never translated into the extra_body sent to the API.

  3. DeepSeek's direct API uses its own parameter namespace: when thinking is not explicitly disabled, the model defaults to thinking=enabled. The control parameter is {"thinking": {"type": "disabled"}}not the OpenRouter-style {"reasoning": {"enabled": false}}.

  4. With thinking enabled, the model generates reasoning_content in its response. However, the agent's compression pipeline strips this field before storing the message.

  5. On the next API call, the compressed messages no longer contain reasoning_content, but the DeepSeek API requires it to be present when the previous assistant message originally had it → 400 error.

Affected Code

run_agent.py:7294-7329_supports_reasoning_extra_body() The function correctly gates reasoning for OpenRouter routes (where deepseek/ is a known prefix at line 7322), but has no awareness of direct DeepSeek API endpoints (api.deepseek.com).

agent/transports/chat_completions.py — missing DeepSeek thinking handling Before the fix below, there was no code path that sent {"thinking": {"type": "disabled"}} to the direct DeepSeek API.

Working Fix

Added DeepSeek V-series thinking.type control to chat_completions.py before the general reasoning block. This mirrors the existing Kimi/Moonshot pattern (lines 232-240).

# DeepSeek V-series (v4-pro, v4-flash, etc.): thinking.type control
# Direct DeepSeek API uses "thinking" parameter, not OpenRouter's
# "reasoning".  Without this, reasoning_effort: none from config is
# silently ignored and the model defaults to thinking=enabled.
# That generates reasoning_content which compression strips →
# 400 "reasoning_content must be passed back" on next turn.
if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }

File: agent/transports/chat_completions.py
Lines: insert after line 240 (after Kimi block, before the # Reasoning block at line 257)

Verification

The fix has been confirmed working on a production install (WSL Ubuntu 24.04, hermes-agent v0.11.0, DeepSeek v4-pro direct API). Multi-turn tool-calling conversations that previously hit the 400 on the second turn now complete successfully.

Alternative / More General Fix

A more systematic approach would be to update _supports_reasoning_extra_body() to recognize direct DeepSeek API endpoints, so the general reasoning block in chat_completions.py could handle it. However, that block emits reasoning-keyed dicts, and the direct DeepSeek API expects thinking-keyed dicts — so the model-specific fix above is cleaner and follows the existing Kimi pattern.

Steps to Reproduce

  1. Configure Hermes to use DeepSeek direct API (api.deepseek.com)
  2. Use model deepseek-v4-pro or deepseek-v4-flash
  3. Set reasoning_effort: none in config (recommended for cost-saving with direct API)
  4. Send any message that triggers a tool call (e.g. "run date")
  5. Agent will fail with 400 on the second API call after the tool result is returned

Environment

  • hermes-agent v0.11.0
  • WSL Ubuntu 24.04
  • Python 3.11
  • DeepSeek API direct (not OpenRouter)
  • Model: deepseek-v4-pro

extent analysis

TL;DR

To fix the 400 error with DeepSeek's direct API, add a model-specific control for thinking.type in chat_completions.py to handle the reasoning_content requirement.

Guidance

  • Identify if you are using the DeepSeek direct API and a model that requires the thinking parameter (e.g., deepseek-v4-pro, deepseek-v4-flash).
  • Verify that your reasoning_effort is set to none in the config, which is recommended for cost savings but requires special handling for DeepSeek's direct API.
  • Apply the provided fix to chat_completions.py by adding the DeepSeek V-series thinking.type control before the general reasoning block.
  • Test multi-turn conversations that previously failed with the 400 error to ensure they now complete successfully.

Example

The fix involves adding the following code to chat_completions.py:

if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }

This code should be inserted after line 240, following the Kimi block and before the # Reasoning block at line 257.

Notes

This fix is specific to the DeepSeek direct API and models that use the thinking parameter. It may not apply to other APIs or models. Additionally, a more systematic approach could involve updating _supports_reasoning_extra_body() to recognize direct DeepSeek API endpoints, but this would require handling the difference in parameter names (reasoning vs. thinking).

Recommendation

Apply the workaround by adding the model-specific `thinking

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING