litellm - 💡(How to fix) Fix [Bug]: Codex CLI disconnects on /v1/responses streaming with DeepSeek models

litellm2026-05-29 02:46:02

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

Error Message

└ Stream disconnected before completion: error sending request for url (http://127.0.0.1:4000/v1/responses)

RAW_BUFFERClick to expand / collapse

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

A bug happened!

Steps to Reproduce

Description

LiteLLM /v1/responses streaming works with simple Python clients, but fails with Codex CLI.

The stream appears to contain non-standard SSE event types that Codex CLI cannot parse correctly, causing the connection to terminate before completion.

Environment LiteLLM version: litellm-1.86.2 OS: Windows 10 Backend model: DeepSeek V4 Flash / Pro Codex CLI: v0.135.0 Python: 3.x

LiteLLM startup command litellm --model deepseek/deepseek-v4-pro --model deepseek/deepseek-v4-flash --port 4000 --debug

Codex config model = "deepseek-v4-flash" model_provider = "litellm" model_reasoning_effort = "high"

model_catalog_json = "C:\Users\shenhaitao\.codex\models_catalog.json"

[model_providers.litellm] name = "LiteLLM Proxy" base_url = "http://127.0.0.1:4000/v1" wire_api = "responses" timeout = 120

[profiles.deepseek-pro] model = "deepseek-v4-pro" model_provider = "litellm"

[profiles.deepseek-flash] model = "deepseek-v4-flash" model_provider = "litellm"

Python test client works correctly:

import requests

url = "http://127.0.0.1:4000/v1/responses"

data = { "model": "deepseek-v4-flash", "input": "1+1=?", "stream": True }

response = requests.post(url, json=data, stream=True)

for line in response.iter_lines(): if line: print(line.decode())

The stream completes successfully.

However, Codex CLI fails with:

• Reconnecting... 1/5 (39s • esc to interrupt) └ Stream disconnected before completion: error sending request for url (http://127.0.0.1:4000/v1/responses)

Relevant log output

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

litellm-1.86.2

Twitter / LinkedIn details

No response

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering