transformers - granitemoehybrid: HybridMambaAttentionDynamicCache missing from modeling_granitemoehybrid, breaks ibm-granite/granite-4.0-3b-vision remote code [1 pull request, 3 comments, 3 participants]

transformers · 2026-04-15 06:13:12

huggingface/transformers#45447 • Fetched 2026-04-17 08:22:48

The ibm-granite/granite-4.0-3b-vision model's remote modeling.py imports HybridMambaAttentionDynamicCache from transformers.models.granitemoehybrid.modeling_granitemoehybrid. This class does not exist in transformers 5.5.4 (latest) or on the current main branch, causing an ImportError whenever any application loads this model with trust_remote_code=True.

Error Message

ImportError: cannot import name 'HybridMambaAttentionDynamicCache' from
'transformers.models.granitemoehybrid.modeling_granitemoehybrid'

Root Cause

PR fix notes

PR #44445: Adding support for GraniteDoclingHybrid

Repository: huggingface/transformers
Author: gabe-l-hart
State: open | merged: False
Link: https://github.com/huggingface/transformers/pull/44445

Description (problem / solution / changelog)

What does this PR do?

This PR adds support for the forthcoming Granite Docling model based on the Granite 4 LLM architecture (GraniteMoeHybrid).

Draft Status

This PR is in draft pending the possibility of some additional changes:

Finalizing the vision projector
Finalizing the name as GraniteDoclingHybrid (versus eg GraniteMoeHybridDocling or similar)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Models:

vision models: @yonigozlan @molbap
multimodal models: @zucchini-nlp

Changed files

src/transformers/models/auto/configuration_auto.py (modified, +2/-0)
src/transformers/models/auto/modeling_auto.py (modified, +2/-0)
src/transformers/models/auto/processing_auto.py (modified, +1/-0)
src/transformers/models/granite_docling_hybrid/__init__.py (added, +28/-0)
src/transformers/models/granite_docling_hybrid/configuration_granite_docling_hybrid.py (added, +296/-0)
src/transformers/models/granite_docling_hybrid/modeling_granite_docling_hybrid.py (added, +968/-0)
src/transformers/models/granite_docling_hybrid/modular_granite_docling_hybrid.py (added, +565/-0)
src/transformers/models/granite_docling_hybrid/processing_granite_docling_hybrid.py (added, +420/-0)

Summary

Error

ImportError: cannot import name 'HybridMambaAttentionDynamicCache' from
'transformers.models.granitemoehybrid.modeling_granitemoehybrid'

Reproduction

from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "ibm-granite/granite-4.0-3b-vision",
    trust_remote_code=True,
)

Stack trace excerpt:

File "~/.cache/huggingface/modules/transformers_modules/<hash>/modeling.py", line 19
    from transformers.models.granitemoehybrid.modeling_granitemoehybrid import (
        HybridMambaAttentionDynamicCache,
    )
ImportError: cannot import name 'HybridMambaAttentionDynamicCache'

Investigation

HybridMambaAttentionDynamicCache existed in transformers 4.40.x (introduced with the Jamba model, commit 3f20877).
The class was removed / not carried forward during the 4.x → 5.x cache refactoring.
A search across the entire transformers 5.5.4 install finds zero occurrences of the symbol.
The same search on the current main branch also finds zero occurrences.
The granitemoehybrid module in 5.5.4 has no cache class of its own; olmo_hybrid introduced OlmoHybridDynamicCache as the new pattern for hybrid Mamba/attention cache.
Draft PR #44445 ("Adding support for GraniteDoclingHybrid") is open but still a draft and does not yet add this class.
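
The two "zero occurrences" findings above describe a text search over the installed package. A minimal sketch of such a search follows, assuming a plain substring match over `.py` files is good enough; the function name `find_symbol_in_package` is hypothetical and is not part of transformers.

```python
import importlib.util
from pathlib import Path


def find_symbol_in_package(package_name: str, symbol: str) -> list[str]:
    """Return paths of .py files under an installed package that mention `symbol`."""
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return []  # package not installed, or it is a single-file module
    hits = []
    for root in spec.submodule_search_locations:
        for path in Path(root).rglob("*.py"):
            try:
                text = path.read_text(encoding="utf-8", errors="ignore")
            except OSError:
                continue  # unreadable file: skip it rather than abort the scan
            if symbol in text:
                hits.append(str(path))
    return sorted(hits)
```

On an affected install, `find_symbol_in_package("transformers", "HybridMambaAttentionDynamicCache")` would be expected to return an empty list, which is what the report describes. A substring match also counts comments and docstrings, so a non-empty result still needs a manual look.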

Environment


transformers: 5.5.4
Python: 3.14.4
PyTorch: 2.11.0
Platform: macOS 26.4.1 arm64

Expected behaviour

Either:

HybridMambaAttentionDynamicCache (or a compatible replacement) is exported from transformers.models.granitemoehybrid.modeling_granitemoehybrid so the model's remote code loads without error; or
The granitemoehybrid module's public API documents the correct replacement so downstream model authors (ibm-granite) can update their modeling.py.

A parallel issue has been filed as docling-project/docling#3303, since ChartExtractionModelGraniteVisionV4 in docling 2.88.0 triggers this crash path.
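
For downstream authors weighing the second option, one way to keep a remote modeling.py importable across transformers releases is to resolve the class lazily instead of importing it unconditionally. The sketch below only illustrates that pattern: `optional_attr` and `resolve_hybrid_cache_class` are hypothetical names, and the issue does not say which class should be used when the lookup returns None.

```python
import importlib


def optional_attr(module_path: str, name: str):
    """Return attribute `name` from `module_path`, or None if either is unavailable."""
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        return None
    return getattr(module, name, None)


def resolve_hybrid_cache_class():
    """Look up the legacy cache class without failing at import time.

    Returns None on transformers releases that no longer export it. Choosing
    a replacement in that case is left open here, as it is in the issue.
    """
    return optional_attr(
        "transformers.models.granitemoehybrid.modeling_granitemoehybrid",
        "HybridMambaAttentionDynamicCache",
    )
```

This moves the failure from module import to the point where the cache is actually needed, where the calling code can raise a clearer error or take a different path.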

extent analysis

TL;DR

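The most likely fix is for the ibm-granite/granite-4.0-3b-vision remote code to switch to a compatible replacement for HybridMambaAttentionDynamicCache, or for transformers to export such a class again. Setting trust_remote_code=False only avoids executing the remote modeling.py; the model then loads only if the installed transformers supports its architecture natively, which the issue does not establish.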

Guidance

Confirm that HybridMambaAttentionDynamicCache is absent from the installed transformers (5.5.4 in the report); per the investigation, it was dropped during the 4.x → 5.x cache refactoring.
Watch draft PR #44445 (GraniteDoclingHybrid support). As of the report it does not add the missing class, so it is not yet a fix.
Treat trust_remote_code=False as a diagnostic rather than a workaround: it skips the remote modeling.py and therefore the ImportError, but the model will load only if transformers itself implements the architecture.
If you maintain the model code, consider moving ibm-granite/granite-4.0-3b-vision to the newer hybrid Mamba/attention cache pattern that OlmoHybridDynamicCache illustrates. Whether that class is a drop-in replacement is not established in the issue.
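
The first guidance point can be checked at runtime before loading the model. The following is a rough sketch, assuming the distribution name passed in matches what pip reports; `installed_version` and `symbol_report` are hypothetical helpers that collect the details a bug report would usually include.

```python
import importlib
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def symbol_report(dist_name: str, module_path: str, name: str) -> dict:
    """Record whether `name` can be found in `module_path`, plus the installed version."""
    report = {
        "distribution": dist_name,
        "version": installed_version(dist_name),
        "module": module_path,
        "symbol": name,
        "importable": False,
    }
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        report["error"] = str(exc)  # the module itself could not be imported
        return report
    report["importable"] = hasattr(module, name)
    return report
```

For this issue the call would be `symbol_report("transformers", "transformers.models.granitemoehybrid.modeling_granitemoehybrid", "HybridMambaAttentionDynamicCache")`. If `importable` comes back False, that matches the report.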

Example

The reproduction in the issue body is the only code involved; the failure comes from a class missing in the installed library, not from an error in the calling code.

Notes

The fix may require updates to the ibm-granite/granite-4.0-3b-vision model or the transformers library, and may have implications for other models or applications that rely on the same code.

Recommendation

Until the class is restored or the remote code is updated, there is no confirmed workaround. trust_remote_code=False avoids the ImportError only by not running the model's remote code, so it helps only if the installed transformers can load the model without it.
