hermes - ✅(Solved) Fix DeepSeek direct API 400 "reasoning_content must be passed back" on multi-turn tool calls [1 pull requests, 3 comments, 4 participants]

Zhangxiaocheng28 · 2026-04-29T01:58:44Z

[hermes] When using DeepSeek models v4-pro, v4-flash, v3, v3.1 via the direct API endpoint api.deepseek.com , the agent fails with a 400 error on multi-turn co… When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the **direct** API endpoint (`api.deepseek.com`), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is: {"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}} # PR #17225: fix(agent): send thinking control to DeepSeek direct API to prevent 400 on tool-call replay - Repository: NousResearch/hermes-agent - Author: luyao618 - State: open | merged: False - Link: https://github.com/NousResearch/hermes-agent/pull/17225 ## Description (problem / solution / changelog) ## What does this PR do? DeepSeek's direct API (`api.deepseek.com`) defaults to `thinking=enabled` when no explicit `thinking` parameter is sent. When the user sets `reasoning_effort: none` in config, the setting was silently ignored because `_supports_reasoning_extra_body()` returns `False` for direct DeepSeek endpoints — so `extra_body.reasoning` (OpenRouter format) was never sent. The model defaulted to thinking=enabled, generated `reasoning_content` in its response, and the next API call failed with HTTP 400: `"reasoning_content must be passed back"`. This PR adds DeepSeek direct API thinking control to `ChatCompletionsTransport.build_kwargs()`, following the existing Kimi pattern: emit `extra_body.thinking` with `type: "enabled"` or `type: "disabled"` based on the user's `reasoning_config`. The `is_deepseek_direct` flag is threaded from `run_agent.py`, detected via provider name (`deepseek`) or base URL matching (`api.deepseek.com`). The OpenRouter DeepSeek path is unchanged — it uses `extra_body.reasoning` which is a separate parameter handled by the existing `supports_reasoning` gate. ## Related Issue Fixes #17212 ## Type of Change - [x] 🐛 Bug fix (non-breaking change that fixes an issue) ## Changes Made - `agent/transports/chat_completions.py`: Added DeepSeek direct API `extra_body.thinking` block (after existing Kimi block), with `enabled`/`disabled` control based on `reasoning_config` - `run_agent.py`: Added `_is_deepseek_direct` detection (provider == "deepseek" or base_url matches api.deepseek.com) and threaded it to transport params - `tests/agent/transports/test_chat_completions.py`: Added 5 test cases covering default enabled, disabled via config, effort=none, OpenRouter unchanged, Kimi unchanged ## How to Test 1. Configure Hermes with DeepSeek direct API (`api.deepseek.com`) and `reasoning_effort: none` 2. Send a message that triggers tool calls (e.g. "run date") 3. Verify the second API call succeeds (no 400 error) 4. Run the targeted tests: `pytest tests/agent/transports/test_chat_completions.py -v` 5. Run full suite: `pytest tests/ -q --ignore=tests/integration --ignore=tests/e2e` ## Checklist ### Code - [x] I've read the Contributing Guide - [x] My commit messages follow Conventional Commits - [x] I searched for existing PRs to make sure this isn't a duplicate - [x] My PR contains only changes related to this fix - [x] I've run `pytest tests/ -q` and all tests pass - [x] I've added tests for my changes - [x] I've tested on my platform: macOS (Darwin 25.4.0, Apple Silicon), Python 3.11 ### Documentation & Housekeeping - [x] Updated relevant documentation — or N/A - [x] Updated cli-config.yaml.example — or N/A - [x] Updated contributing / agents docs — or N/A - [x] Considered cross-platform impact — or N/A - [x] Updated tool descriptions/schemas — or N/A ## Changed files - `agent/transports/chat_completions.py` (modified, +14/-0) - `run_agent.py` (modified, +5/-0) - `tests/agent/transports/test_chat_completions.py` (modified, +43/-0) ## Fixed - Fixed by PR: fix(agent): send thinking control to DeepSeek direct API to prevent 400 on tool-call replay (https://github.com/NousResearch/hermes-agent/pull/17225) ## Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back" ### Summary When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the **direct** API endpoint (`api.deepseek.com`), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is: {"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}} ### Root Cause Chain 1. **`_supports_reasoning_extra_body()` gate is too narrow** (`run_agent.py:7315`): For non-OpenRouter base URLs, it returns `False`. Direct DeepSeek API hits this path. 2. **User config sets `reasoning_effort: none` (thinking disabled)**, but because `supports_reasoning` is `False`, the `reasoning_config` dict is **never translated** into the `extra_body` sent to the API. 3. **DeepSeek's direct API uses its own parameter namespace**: when thinking is not explicitly disabled, the model **defaults to t

hermes2026-04-29 01:58:44

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

NousResearch/hermes-agent#17212•Fetched 2026-04-29 06:36:41

View on GitHub

Comments

Participants

Timeline

Reactions

Author

Participants

Timeline (top)

labeled ×4commented ×3cross-referenced ×1

{"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}}

Error Message

Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"

When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the direct API endpoint (api.deepseek.com), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is: {"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}} 5. On the next API call, the compressed messages no longer contain reasoning_content, but the DeepSeek API requires it to be present when the previous assistant message originally had it → 400 error.

Root Cause

Root Cause Chain

Code Example

# DeepSeek V-series (v4-pro, v4-flash, etc.): thinking.type control
# Direct DeepSeek API uses "thinking" parameter, not OpenRouter's
# "reasoning".  Without this, reasoning_effort: none from config is
# silently ignored and the model defaults to thinking=enabled.
# That generates reasoning_content which compression strips →
# 400 "reasoning_content must be passed back" on next turn.
if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }

RAW_BUFFERClick to expand / collapse

Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"

Summary

{"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}}

Root Cause Chain

_supports_reasoning_extra_body() gate is too narrow (run_agent.py:7315): For non-OpenRouter base URLs, it returns False. Direct DeepSeek API hits this path.
User config sets reasoning_effort: none (thinking disabled), but because supports_reasoning is False, the reasoning_config dict is never translated into the extra_body sent to the API.
DeepSeek's direct API uses its own parameter namespace: when thinking is not explicitly disabled, the model defaults to thinking=enabled. The control parameter is {"thinking": {"type": "disabled"}} — not the OpenRouter-style {"reasoning": {"enabled": false}}.
With thinking enabled, the model generates reasoning_content in its response. However, the agent's compression pipeline strips this field before storing the message.
On the next API call, the compressed messages no longer contain reasoning_content, but the DeepSeek API requires it to be present when the previous assistant message originally had it → 400 error.

Affected Code

run_agent.py:7294-7329 — _supports_reasoning_extra_body() The function correctly gates reasoning for OpenRouter routes (where deepseek/ is a known prefix at line 7322), but has no awareness of direct DeepSeek API endpoints (api.deepseek.com).

agent/transports/chat_completions.py — missing DeepSeek thinking handling Before the fix below, there was no code path that sent {"thinking": {"type": "disabled"}} to the direct DeepSeek API.

Working Fix

Added DeepSeek V-series thinking.type control to chat_completions.py before the general reasoning block. This mirrors the existing Kimi/Moonshot pattern (lines 232-240).

# DeepSeek V-series (v4-pro, v4-flash, etc.): thinking.type control
# Direct DeepSeek API uses "thinking" parameter, not OpenRouter's
# "reasoning".  Without this, reasoning_effort: none from config is
# silently ignored and the model defaults to thinking=enabled.
# That generates reasoning_content which compression strips →
# 400 "reasoning_content must be passed back" on next turn.
if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }

File: agent/transports/chat_completions.py
Lines: insert after line 240 (after Kimi block, before the # Reasoning block at line 257)

Verification

The fix has been confirmed working on a production install (WSL Ubuntu 24.04, hermes-agent v0.11.0, DeepSeek v4-pro direct API). Multi-turn tool-calling conversations that previously hit the 400 on the second turn now complete successfully.

Alternative / More General Fix

A more systematic approach would be to update _supports_reasoning_extra_body() to recognize direct DeepSeek API endpoints, so the general reasoning block in chat_completions.py could handle it. However, that block emits reasoning-keyed dicts, and the direct DeepSeek API expects thinking-keyed dicts — so the model-specific fix above is cleaner and follows the existing Kimi pattern.

Steps to Reproduce

Configure Hermes to use DeepSeek direct API (api.deepseek.com)
Use model deepseek-v4-pro or deepseek-v4-flash
Set reasoning_effort: none in config (recommended for cost-saving with direct API)
Send any message that triggers a tool call (e.g. "run date")
Agent will fail with 400 on the second API call after the tool result is returned

Environment

hermes-agent v0.11.0
WSL Ubuntu 24.04
Python 3.11
DeepSeek API direct (not OpenRouter)
Model: deepseek-v4-pro

extent analysis

TL;DR

To fix the 400 error with DeepSeek's direct API, add a model-specific control for thinking.type in chat_completions.py to handle the reasoning_content requirement.

Guidance

Identify if you are using the DeepSeek direct API and a model that requires the thinking parameter (e.g., deepseek-v4-pro, deepseek-v4-flash).
Verify that your reasoning_effort is set to none in the config, which is recommended for cost savings but requires special handling for DeepSeek's direct API.
Apply the provided fix to chat_completions.py by adding the DeepSeek V-series thinking.type control before the general reasoning block.
Test multi-turn conversations that previously failed with the 400 error to ensure they now complete successfully.

Example

The fix involves adding the following code to chat_completions.py:

if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }

This code should be inserted after line 240, following the Kimi block and before the # Reasoning block at line 257.

Notes

This fix is specific to the DeepSeek direct API and models that use the thinking parameter. It may not apply to other APIs or models. Additionally, a more systematic approach could involve updating _supports_reasoning_extra_body() to recognize direct DeepSeek API endpoints, but this would require handling the difference in parameter names (reasoning vs. thinking).

Recommendation

Apply the workaround by adding the model-specific `thinking

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #ssr #agent setup #task chaining #parallel task

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

hermes - ✅(Solved) Fix DeepSeek direct API 400 "reasoning_content must be passed back" on multi-turn tool calls [1 pull requests, 3 comments, 4 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"

Root Cause

Root Cause Chain

Fix Action

Fixed

PR fix notes

PR #17225: fix(agent): send thinking control to DeepSeek direct API to prevent 400 on tool-call replay

Description (problem / solution / changelog)

What does this PR do?

Related Issue

Type of Change

Changes Made

How to Test

Checklist

Code

Documentation & Housekeeping

Changed files

Code Example

Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"

Summary

Root Cause Chain

Affected Code

Working Fix

Verification

Alternative / More General Fix

Steps to Reproduce

Environment

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING