litellm - 💡(How to fix) Fix [Bug]: 为什么我部署的qwen14b，输出都是json格式内容，而用云端api输出的就是正常的？

litellm2026-05-21 01:06:51

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

RAW_BUFFERClick to expand / collapse

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

用litellm部署了一个云端模型和一个本地qwen14b模型，我发现调用14b则json格式输出，我改了no thinking ，改了流式输出，调整config，但是都不起作用，哪位大神指导一下，这个问题如何解决的？

Steps to Reproduce

修改 reasoning: true — 无关，其他模型也是 false 没问题加 system_prompt 禁止 JSON — 无效猜测加 stream: true — 无效猜测修改 SOUL.md — 问题不在这里创建 qwen3-nothink 删除思考标签 — 方向对但执行多次出错，制造了重复模型浪费空间多次重新生成 Modelfile — 每次都重新拉 9GB 模型修改 feishu-channel-rules — 问题不在这里加 extra_body: think: false — 声称有效但实际可能走了云端，没有真正验证反复让你删除重建模型 — 造成磁盘浪费多次猜测而不查文档 — 浪费大量时间和 token

根本问题至今未解决：本地 qwen3 通过 LiteLLM 调用时的 JSON 输出问题，还没有经过严格验证。目前本地用的是qwen3:8b-q8_0

Relevant log output

What part of LiteLLM is this about?

Other

What LiteLLM version are you on ?

v1.85.0

Twitter / LinkedIn details

No response

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering