transformers - ✅(Solved) Fix granitemoehybrid: HybridMambaAttentionDynamicCache missing from modeling_granitemoehybrid — breaks ibm-granite/granite-4.0-3b-vision remote code [1 pull requests, 3 comments, 3 participants]

Issue: huggingface/transformers#45447 (fetched 2026-04-17 08:22:48)

The ibm-granite/granite-4.0-3b-vision model's remote modeling.py imports HybridMambaAttentionDynamicCache from transformers.models.granitemoehybrid.modeling_granitemoehybrid. This class does not exist in transformers 5.5.4 (latest) or on the current main branch, causing an ImportError whenever any application loads this model with trust_remote_code=True.

Error Message

ImportError: cannot import name 'HybridMambaAttentionDynamicCache' from
'transformers.models.granitemoehybrid.modeling_granitemoehybrid'

Root Cause

The model's remote modeling.py was written against an older transformers API. According to the investigation in the issue body, HybridMambaAttentionDynamicCache was removed or not carried forward during the 4.x → 5.x cache refactoring, so the import on line 19 of the remote file fails before the model class can be defined.

PR fix notes

PR #44445: Adding support for GraniteDoclingHybrid

Description (problem / solution / changelog)

What does this PR do?

This PR adds support for the forthcoming Granite Docling model based on the Granite 4 LLM architecture (GraniteMoeHybrid).

Draft Status

This PR is in draft pending the possibility of some additional changes:

  • Finalizing the vision projector
  • Finalizing the name as GraniteDoclingHybrid (versus, e.g., GraniteMoeHybridDocling or similar)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Models:

  • vision models: @yonigozlan @molbap
  • multimodal models: @zucchini-nlp

Changed files

  • src/transformers/models/auto/configuration_auto.py (modified, +2/-0)
  • src/transformers/models/auto/modeling_auto.py (modified, +2/-0)
  • src/transformers/models/auto/processing_auto.py (modified, +1/-0)
  • src/transformers/models/granite_docling_hybrid/__init__.py (added, +28/-0)
  • src/transformers/models/granite_docling_hybrid/configuration_granite_docling_hybrid.py (added, +296/-0)
  • src/transformers/models/granite_docling_hybrid/modeling_granite_docling_hybrid.py (added, +968/-0)
  • src/transformers/models/granite_docling_hybrid/modular_granite_docling_hybrid.py (added, +565/-0)
  • src/transformers/models/granite_docling_hybrid/processing_granite_docling_hybrid.py (added, +420/-0)

Reproduction

from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "ibm-granite/granite-4.0-3b-vision",
    trust_remote_code=True,
)

Stack trace excerpt:

File "~/.cache/huggingface/modules/transformers_modules/<hash>/modeling.py", line 19
    from transformers.models.granitemoehybrid.modeling_granitemoehybrid import (
        HybridMambaAttentionDynamicCache,
    )
ImportError: cannot import name 'HybridMambaAttentionDynamicCache'
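
Before loading the model, it can help to probe whether the symbol is importable at all, so the failure is reported clearly instead of surfacing from inside the cached remote code. The following is a minimal sketch; the helper name has_symbol is hypothetical and is not part of transformers.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself (or one of its dependencies) is not importable.
        return False
    # The module imported; check whether it defines the requested name.
    return hasattr(module, symbol)
```

On an affected install, has_symbol("transformers.models.granitemoehybrid.modeling_granitemoehybrid", "HybridMambaAttentionDynamicCache") would be expected to return False, matching the ImportError above.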

Investigation

  • HybridMambaAttentionDynamicCache existed in transformers 4.40.x (introduced with the Jamba model, commit 3f20877).
  • The class was removed / not carried forward during the 4.x → 5.x cache refactoring.
  • A search across the entire transformers 5.5.4 install finds zero occurrences of the symbol.
  • The same search on the current main branch also finds zero occurrences.
  • The granitemoehybrid module in 5.5.4 has no cache class of its own; olmo_hybrid introduced OlmoHybridDynamicCache as the new pattern for hybrid Mamba/attention cache.
  • Draft PR #44445 ("Adding support for GraniteDoclingHybrid") is open but still a draft and does not yet add this class.
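
The "zero occurrences" search described above can be reproduced with a short script. This is a sketch of one way to do it, not necessarily the command the reporter used; find_symbol is a hypothetical helper that scans the .py files under a directory for a plain substring.

```python
from pathlib import Path


def find_symbol(root: str, symbol: str) -> list[tuple[str, int]]:
    """Return (file path, line number) pairs for lines under `root` containing `symbol`."""
    hits = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # Unreadable file; skip it rather than abort the scan.
        for lineno, line in enumerate(text.splitlines(), start=1):
            if symbol in line:
                hits.append((str(path), lineno))
    return hits
```

To scan an installed package, pass the directory that contains its __init__.py, for example str(Path(transformers.__file__).parent) after importing transformers. An empty result corresponds to the finding reported in the investigation.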

Environment

  • transformers: 5.5.4
  • Python: 3.14.4
  • PyTorch: 2.11.0
  • Platform: macOS 26.4.1 arm64

Expected behaviour

Either:

  1. HybridMambaAttentionDynamicCache (or a compatible replacement) is exported from transformers.models.granitemoehybrid.modeling_granitemoehybrid so the model's remote code loads without error; or
  2. The granitemoehybrid module's public API documents the correct replacement so downstream model authors (ibm-granite) can update their modeling.py.
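
For downstream authors, option 2 usually ends up as a fallback import: try the old location first, then a documented replacement. The sketch below shows only the pattern. The helper first_available is hypothetical, and no replacement class is confirmed here; in particular, whether OlmoHybridDynamicCache is interface-compatible with the removed class is not established by this issue.

```python
import importlib
from typing import Any, Iterable, Optional, Tuple


def first_available(candidates: Iterable[Tuple[str, str]]) -> Optional[Any]:
    """Return the first importable attribute from (module, name) pairs, else None."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # This module is not importable here; try the next candidate.
        if hasattr(module, attr):
            return getattr(module, attr)
    return None
```

A remote modeling.py could call this with the old (module, name) pair first and a replacement pair second, then raise a descriptive error if the result is None. Any replacement pair is an assumption until the transformers maintainers document one.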

A parallel issue has been filed as docling-project/docling#3303, since ChartExtractionModelGraniteVisionV4 in docling 2.88.0 triggers this crash path.

extent analysis

TL;DR

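The most likely fix is for the ibm-granite/granite-4.0-3b-vision remote code to switch to a compatible replacement for HybridMambaAttentionDynamicCache, or for transformers to export such a class again. Setting trust_remote_code=False avoids executing the failing import, but it is unlikely to be a usable workaround because the model is loaded through that remote code.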

Guidance

  • Verify that the HybridMambaAttentionDynamicCache class is not available in the current version of transformers (5.5.4) and that it was removed during the 4.x → 5.x cache refactoring.
  • Watch draft PR #44445 (GraniteDoclingHybrid support) for updates; per the investigation it does not yet add the missing class, so it cannot be relied on as a fix today.
  • Setting trust_remote_code=False skips the failing import, but the model will probably not load at all without its remote code, so treat this as a diagnostic step rather than a fix.
  • If possible, update the ibm-granite/granite-4.0-3b-vision model to use the new pattern for hybrid Mamba/attention cache, such as OlmoHybridDynamicCache.

Example

No code snippet is provided as the issue is related to a missing class and not a specific code error.

Notes

The fix may require updates to the ibm-granite/granite-4.0-3b-vision model or the transformers library, and may have implications for other models or applications that rely on the same code.

Recommendation

No clean workaround is confirmed until transformers exports a compatible class or ibm-granite updates modeling.py. Setting trust_remote_code=False avoids the ImportError only by not running the remote code, which likely means the model cannot be loaded.
