transformers - ✅(Solved) Fix Add sequence classification capabilities to the Granite models [1 pull request, 1 participant]

GitHub stats

huggingface/transformers#44214 (fetched 2026-04-08 00:29:47)

  • Comments: 0
  • Participants: 1
  • Timeline: 2 (cross-referenced ×1, labeled ×1)
  • Reactions: 0

Fix Action

Fixed

PR fix notes

PR #44215: Add sequence classification capability to Granite models

Description (problem / solution / changelog)

What does this PR do?

Add sequence classification capabilities to the family of Granite models (Granite, GraniteMoe, GraniteMoeHybrid, and GraniteMoeShared).

Fixes #44214, #35720

Why

The Granite models currently only have the base model and causal model heads, so this addition brings them more in line with other models in the library.

Proposed solution and description of changes

The following ForSequenceClassification classes were added:

  • GraniteForSequenceClassification
  • GraniteMoeForSequenceClassification
  • GraniteMoeHybridForSequenceClassification
  • GraniteMoeSharedForSequenceClassification

These classes use the existing GenericForSequenceClassification, following the established pattern seen in many other models in the library. Code changes were minimal and keep the logic consistent across similar models. The changes were implemented in the modular_*.py files, and the modeling_*.py files were then regenerated automatically with utils/modular_model_converter.py.
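To make the idea of a sequence classification head on a decoder-only model more concrete, here is a minimal pure-Python sketch. It assumes the common approach of pooling the hidden state of the last non-padding token and projecting it to one logit per label, with no bias term. The function names and the list-based tensors are illustrative only and are not taken from the PR or the library.

```python
from typing import List, Sequence


def last_non_pad_index(input_ids: Sequence[int], pad_token_id: int) -> int:
    """Return the index of the right-most token that is not padding.

    Falls back to the final position if every token is padding.
    """
    for i in range(len(input_ids) - 1, -1, -1):
        if input_ids[i] != pad_token_id:
            return i
    return len(input_ids) - 1


def score_sequence(
    hidden_states: List[List[float]],  # shape: [seq_len][hidden_size]
    input_ids: Sequence[int],          # shape: [seq_len]
    weight: List[List[float]],         # shape: [num_labels][hidden_size]
    pad_token_id: int,
) -> List[float]:
    """Pool the last non-padding hidden state and project it to label logits."""
    pooled = hidden_states[last_non_pad_index(input_ids, pad_token_id)]
    # One dot product per label row (illustrative projection without bias).
    return [sum(w * h for w, h in zip(row, pooled)) for row in weight]
```

For example, with token IDs `[5, 7, 0]` and `pad_token_id=0`, the hidden state at index 1 is the one that gets scored.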

Updated __all__ exports with new classes in each model module.

The Auto Model Registry (MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES) has been updated so that the new classes can be loaded via AutoModelForSequenceClassification.
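The registry idea can be illustrated with a small stand-alone sketch: a mapping from a model-type string to a class name, plus a lookup that fails loudly for unregistered types. This is a simplified, hypothetical miniature, not the library's actual code; the mapping name `SEQ_CLS_MAPPING_NAMES`, the helper `resolve_class_name`, and the lowercase model-type keys are assumptions made for illustration.

```python
from collections import OrderedDict

# Hypothetical miniature of a model-type -> class-name registry.
# The keys are assumed model-type strings; the values are the class names from the PR.
SEQ_CLS_MAPPING_NAMES = OrderedDict(
    [
        ("granite", "GraniteForSequenceClassification"),
        ("granitemoe", "GraniteMoeForSequenceClassification"),
        ("granitemoehybrid", "GraniteMoeHybridForSequenceClassification"),
        ("granitemoeshared", "GraniteMoeSharedForSequenceClassification"),
    ]
)


def resolve_class_name(model_type: str) -> str:
    """Look up the sequence classification class name for a model type."""
    try:
        return SEQ_CLS_MAPPING_NAMES[model_type]
    except KeyError:
        raise ValueError(
            f"No sequence classification class registered for model type {model_type!r}"
        ) from None
```

A model type without an entry raises an error, which is the position the Granite variants were in before their classes were registered.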

New features usage

After this PR, users should be able to load any Granite model variant for sequence classification as follows:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("granite-model-id")  # placeholder model ID
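A slightly fuller sketch of the same usage is shown below, covering tokenization, a forward pass, and reading the predicted label. The helper name `classify_text` and its parameters are hypothetical, and the model ID is left for the caller to supply. It assumes `torch` and a `transformers` version that includes this PR are installed.

```python
def classify_text(model_id: str, text: str, num_labels: int = 2) -> int:
    """Return the index of the highest-scoring label for `text`.

    `model_id` is whichever Granite checkpoint the caller chooses.
    """
    # Local imports keep this module importable without torch/transformers.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_id, num_labels=num_labels
    )
    model.eval()

    # A single unpadded sequence, so no pad token is needed here.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: [1, num_labels]
    return int(logits.argmax(dim=-1).item())
```

When the checkpoint was trained only for causal language modeling, the classification head is newly initialized, so the predictions are not meaningful until the model has been fine-tuned on labeled data.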

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed.


@ArthurZucker @Cyrilvallez

Changed files

  • docs/source/en/model_doc/granite.md (modified, +5/-0)
  • docs/source/en/model_doc/granitemoe.md (modified, +5/-0)
  • docs/source/en/model_doc/granitemoehybrid.md (modified, +5/-0)
  • docs/source/en/model_doc/granitemoeshared.md (modified, +5/-0)
  • src/transformers/models/auto/modeling_auto.py (modified, +4/-0)
  • src/transformers/models/granite/modeling_granite.py (modified, +6/-2)
  • src/transformers/models/granite/modular_granite.py (modified, +6/-1)
  • src/transformers/models/granitemoe/modeling_granitemoe.py (modified, +11/-2)
  • src/transformers/models/granitemoe/modular_granitemoe.py (modified, +11/-1)
  • src/transformers/models/granitemoehybrid/modeling_granitemoehybrid.py (modified, +11/-2)
  • src/transformers/models/granitemoehybrid/modular_granitemoehybrid.py (modified, +11/-1)
  • src/transformers/models/granitemoeshared/modeling_granitemoeshared.py (modified, +11/-2)
  • src/transformers/models/granitemoeshared/modular_granitemoeshared.py (modified, +11/-1)
  • tests/models/granite/test_modeling_granite.py (modified, +17/-0)
  • tests/models/granitemoe/test_modeling_granitemoe.py (modified, +17/-0)
  • tests/models/granitemoehybrid/test_modeling_granitemoehybrid.py (modified, +34/-1)
  • tests/models/granitemoeshared/test_modeling_granitemoeshared.py (modified, +17/-0)

Feature request

This issue proposes adding ForSequenceClassification classes to the Granite model family, including:

  • Granite
  • GraniteMoe
  • GraniteMoeHybrid
  • GraniteMoeShared

Motivation

Currently, Granite models only support causal language modeling. Adding sequence classification capability would:

  • Enable text classification tasks
  • Provide parity with other model architectures in the Transformers library
  • Expand the utility and applicability of Granite models

Your contribution

I can implement a solution and submit a PR in the next few days.

extent analysis

Solution Plan

Adding Sequence Classification Capability to Granite Models

We will implement the ForSequenceClassification classes for the Granite model family by modifying the existing code.

Step-by-Step Solution

1. Work in the existing modular files

No new file is needed. Add each class to the model's own modular_*.py file (for example modular_granite.py) and regenerate the matching modeling_*.py file with utils/modular_model_converter.py.

2. Define the ForSequenceClassification classes

The Auto classes are loaders and cannot be subclassed to build a model, so each new class instead combines GenericForSequenceClassification with the model's own PreTrainedModel base. The sketch below shows the shape the generated modeling_*.py classes are expected to take; the import path and the *PreTrainedModel base names are assumptions modeled on other models in the library:

# Assumed import path, as used by other models in the library.
from transformers.modeling_layers import GenericForSequenceClassification

# modeling_granite.py
class GraniteForSequenceClassification(GenericForSequenceClassification, GranitePreTrainedModel):
    pass

# modeling_granitemoe.py
class GraniteMoeForSequenceClassification(GenericForSequenceClassification, GraniteMoePreTrainedModel):
    pass

# modeling_granitemoehybrid.py
class GraniteMoeHybridForSequenceClassification(GenericForSequenceClassification, GraniteMoeHybridPreTrainedModel):
    pass

# modeling_granitemoeshared.py
class GraniteMoeSharedForSequenceClassification(GenericForSequenceClassification, GraniteMoeSharedPreTrainedModel):
    pass

3. Export the new classes

Add each new class name to the __all__ list of its model module:

# modular_granite.py
__all__ = [
    # ... existing exports ...
    "GraniteForSequenceClassification",
]

4. Register the new models

Register the classes in MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES so that AutoModelForSequenceClassification can resolve them. The "granite" key below is assumed to match the model type in the Granite config, and the container type may differ between library versions; the other three variants get analogous entries:

# src/transformers/models/auto/modeling_auto.py
MODEL_FOR_SEQUENCE_CLASSIFICATION_MAPPING_NAMES = OrderedDict(
    [
        # ... existing entries ...
        ("granite", "GraniteForSequenceClassification"),
        # ... entries for the other Granite variants ...
    ]
)
