transformers - ✅ (Solved) Qwen3.5 produces garbled output when output_hidden_states=True is set [1 pull request, 4 comments, 3 participants]

transformers · 2026-03-19 08:27:35


GitHub stats

huggingface/transformers#44849 • Fetched 2026-04-08 01:01:49


Timeline: commented ×4, subscribed ×2, cross-referenced ×1, labeled ×1

Fix Action

Fixed

Fixed by PR: fix: pop output_* flags from kwargs in capture_outputs to prevent submodule leakage (https://github.com/huggingface/transformers/pull/44922)

PR fix notes

PR #44922: fix: pop output_* flags from kwargs in capture_outputs to prevent submodule leakage

Repository: huggingface/transformers
Author: s-zx
State: closed | merged: False
Link: https://github.com/huggingface/transformers/pull/44922

Description (problem / solution / changelog)

What does this PR do?

Fixes #44849.

When output_hidden_states=True (or output_attentions=True) is passed to model.generate(), the @capture_outputs decorator reads the flag value but leaves it in **kwargs. These flags then propagate through **kwargs chains deep into sub-models — specifically, into vision encoder blocks and attention functions that don't expect them.

For Qwen3.5 (and the Qwen VL family), this causes garbled generation when output_hidden_states=True is set. The flag reaches Qwen3_5VisionBlock.attn via Qwen3_5Model.get_image_features(**kwargs) → self.visual(**kwargs) → blk(**kwargs) → self.attn(**kwargs), corrupting intermediate attention tensors and causing the model to generate repetitive image-pad tokens instead of meaningful text.
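The propagation path described above can be illustrated with a toy call chain. This is only a sketch: the four functions are hypothetical stand-ins named after the steps in the chain, not transformers code, and the innermost one simply reports which keyword arguments reached it.

```python
# Toy stand-ins for the call chain named in the PR description.
# None of these functions come from transformers; they only forward **kwargs.

def attn(hidden, **kwargs):
    # Innermost call: report which unexpected keyword arguments arrived here.
    return sorted(kwargs)

def block(hidden, **kwargs):
    return attn(hidden, **kwargs)

def visual(pixels, **kwargs):
    return block(pixels, **kwargs)

def get_image_features(pixels, **kwargs):
    return visual(pixels, **kwargs)

# A flag passed at the top travels unchanged to the innermost function.
leaked = get_image_features([0.0], output_hidden_states=True)
print(leaked)
```

Every layer that forwards **kwargs without filtering passes the flag one level deeper, which is how a generation-level option can end up inside an attention call.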

Root cause

In capture_outputs (in output_capturing.py), the decorator uses kwargs.get(...) to read the output flags — but it does not remove them from kwargs. The underlying func(self, *args, **kwargs) call therefore still sees output_hidden_states=True, which then leaks into every submodule called with **kwargs.
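The difference between reading and removing a key is plain dict behavior, shown here in a minimal sketch. The helper names are hypothetical and exist only for this illustration.

```python
def inner(**kwargs):
    # Stand-in for the wrapped forward: returns the keyword names it received.
    return sorted(kwargs)

def call_after_get(**kwargs):
    # Reading with .get() leaves the key in kwargs, so it is forwarded.
    flag = kwargs.get("output_hidden_states", False)
    return flag, inner(**kwargs)

flag, seen = call_after_get(output_hidden_states=True, use_cache=False)
print(flag, seen)
```

The wrapper learns the flag value, but the wrapped function still receives output_hidden_states alongside the other keyword arguments.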

Fix

After reading the values for all capturable flags, immediately pop them from kwargs:

for k in capturable_flags:
    kwargs.pop(f"output_{k}", None)
if "cross_attentions" in capturable_flags or "mask_decoder_attentions" in capturable_flags:
    kwargs.pop("output_attentions", None)

Since @capture_outputs already captures the requested outputs through forward hooks, the underlying forward function (and all modules it calls) does not need to receive these flags. This pop has no effect on output correctness but prevents any downstream damage.

The fix applies to all models using @capture_outputs, not just Qwen3.5.
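To make the pop-based fix concrete, here is a toy decorator that follows the same idea. It is a sketch under stated assumptions, not the code from output_capturing.py: capture_outputs_sketch and CAPTURABLE_FLAGS are hypothetical names, and the real decorator captures outputs through forward hooks, which this example omits.

```python
import functools

# Assumed flag names, for illustration only.
CAPTURABLE_FLAGS = ("hidden_states", "attentions")

def capture_outputs_sketch(func):
    """Toy decorator: read the output_* flags, then remove them from kwargs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Read the requested flags first.
        requested = {k: kwargs.get(f"output_{k}", False) for k in CAPTURABLE_FLAGS}
        # Then pop them so the wrapped function and its callees never see them.
        for k in CAPTURABLE_FLAGS:
            kwargs.pop(f"output_{k}", None)
        return func(*args, **kwargs), requested
    return wrapper

@capture_outputs_sketch
def forward(x, **kwargs):
    # Stand-in for a model forward that passes **kwargs on to submodules.
    return {"x": x, "seen_kwargs": sorted(kwargs)}

result, requested = forward(1, output_hidden_states=True, use_cache=False)
print(result["seen_kwargs"], requested)
```

Unrelated keyword arguments such as use_cache still pass through; only the output_* flags are consumed by the decorator.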

Changed files

src/transformers/utils/output_capturing.py (modified, +10/-0)



System Info

Version: 5.2.0

In Qwen3.5, the following call:

outputs = model_wrapper.generate(**inputs, output_hidden_states=True)

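This outputs something like the following (the Chinese prompt reads "Please describe the content of this image in detail."; the reply degenerates into repeated "!" after its first word):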

><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|image_pad|><|vision_end|>请详细描述这张图片的内容。<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n这张!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!']

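Without the output_hidden_states parameter, the output is normal (the reply, in Chinese, describes the image as a screenshot of a table comparing several models across tasks):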

on_end|>请详细描述这张图片的内容。<|im_end|>\n<|im_start|>assistant\n<think>\n\n</think>\n\n这张图片是一张包含表格的截图，内容是一份关于不同模型在多个任务上表现的数据对比表。\n\n**整体布局：**\n图片的主体是一个表格，表格的标题为“Model”，列出了多个模型名称，以及它们在不同任务上的平均长度（Avg. Length）、任务1、任务2、任务3、任务4和任务5的得分百分比。表格下方还有一行说明文字。\n\n**表格内容详情：**\n表格共有8行数据，对应8个不同的模型。\n\n- **第一行：**\n  - **Model:** `PI0*`\n  - **Avg. Length:** 2.954\n  - **Task 1:** 84.8%\n  - **Task 2:** 70.4%\n  - **Task 3:** 55.9%\n  - **Task 4:** 46.6%\n  - **Task 5:** 37.7%\n\n- **第二行：**\n  - **Model:** `PI0.5*`\n  - **Avg. Length:** 3.885\n  - **Task 1:** 92.5%\n  - **Task 2:** 84.0%\n  - **Task 3:** 76.6%\n  - **Task 4:** 71.0%\n  - **Task 5:** 64.4%\n\n- **第三行：**\n  - **Model:** `qwenpi (qwen2.5-vl-3B-instruct-action)`\n  - **Avg. Length:** 3.5']


Extended analysis

Fix Plan

The repeated <|image_pad|> and "!" tokens are a symptom, not the cause. Per the PR notes above, output_hidden_states=True leaks through **kwargs into the vision encoder and corrupts its attention tensors, so stripping tokens from the decoded text or capping max_length does not repair the generation. Two options follow from the report:

Workaround: call generate() without output_hidden_states=True; the reporter observed normal output in that case.
Fix: apply the change from PR #44922, which pops the output_* flags from kwargs inside capture_outputs before the wrapped forward runs.

Here's the workaround as a code snippet. It assumes generate() returns token IDs, and the name processor is an assumption (it stands for whichever processor or tokenizer produced inputs and is not named in the report):

outputs = model_wrapper.generate(**inputs)
output_text = processor.batch_decode(outputs, skip_special_tokens=True)[0]

Note that truncation is a tokenizer argument rather than a generation parameter, so it should not be passed to generate().

Verification

To verify, run generate() on the same inputs with and without output_hidden_states=True and compare the decoded text. With the fix applied, both runs should yield a meaningful description rather than a run of repeated "!" or <|image_pad|> tokens. Printing the text and its length gives a quick first check:

print(output_text)
print(len(output_text))
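Beyond eyeballing the printed text, the degenerate pattern from the report (one character repeated hundreds of times) can be checked automatically. The following is a minimal sketch: longest_char_run and looks_degenerate are hypothetical helpers, and the threshold of 20 is an arbitrary assumption to be tuned for real outputs.

```python
def longest_char_run(text):
    """Return the length of the longest run of one repeated character."""
    best = 0
    run = 0
    prev = None
    for ch in text:
        run = run + 1 if ch == prev else 1
        prev = ch
        best = max(best, run)
    return best

def looks_degenerate(text, max_run=20):
    # Flag text containing a suspiciously long run of a single character.
    return longest_char_run(text) >= max_run

garbled = "这张" + "!" * 300
normal = "This image is a screenshot containing a table."
print(looks_degenerate(garbled), looks_degenerate(normal))
```

A check like this only detects the symptom; it does not replace comparing the outputs with and without the flag.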

Extra Tips

Limits such as max_length only shorten the output; they do not address the flag leak. Per the PR notes, output_attentions=True is affected in the same way as output_hidden_states=True, and the fix applies to all models using @capture_outputs, not just Qwen3.5.
