transformers - 💡(How to fix) Fix The torch.split() return values in GlmMoeDsaIndexer [1 comments, 1 participants]

GitHub stats
huggingface/transformers#44263 (fetched 2026-04-08 00:29:29)
Comments: 1
Participants: 1
Timeline events: 5
Reactions: 0
Timeline (top): closed ×1, commented ×1, cross-referenced ×1, labeled ×1

System Info

transformers:

https://github.com/huggingface/transformers/blob/e2bc54f29a58b2d2ee7e7d6eac949c959e063e0f/src/transformers/models/glm_moe_dsa/modular_glm_moe_dsa.py#L515

vllm:

https://github.com/vllm-project/vllm/blob/a0c70816956298f7dd1d0cf47cfa1a169a413692/vllm/model_executor/models/deepseek_v2.py#L746

deepseek_v3.2:

https://github.com/deepseek-ai/DeepSeek-V3.2-Exp/blob/main/inference/model.py#L462

Who can help?

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Expected behavior

extent analysis

Problem Summary

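The issue concerns how the return values of torch.split() are handled in GlmMoeDsaIndexer (transformers, modular_glm_moe_dsa.py, line 515). The corresponding lines in vLLM's deepseek_v2.py and in DeepSeek-V3.2-Exp's inference/model.py are linked as references.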

Root Cause Analysis

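The issue body contains only the three links, so the cause has to be inferred from the title. The likely problem is the way the tensors returned by torch.split() are unpacked in GlmMoeDsaIndexer compared with the linked vLLM and DeepSeek-V3.2-Exp code. torch.split() returns its chunks in the order of the requested sizes, so unpacking them into names in a different order assigns each name the wrong slice without raising an error.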
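
To make the ordering behavior concrete, here is a minimal sketch. The sizes and variable names are hypothetical and are not taken from any of the linked files; the two chunk sizes are deliberately equal to show that a swapped unpacking can pass every shape check.

```python
import torch

# Hypothetical sizes chosen for illustration; both halves are equal on purpose.
HEAD_DIM = 8
ROPE_DIM = 4

q = torch.arange(HEAD_DIM, dtype=torch.float32).reshape(1, HEAD_DIM)

# Chunks come back in the order of the size list: the first ROPE_DIM columns,
# then the remaining HEAD_DIM - ROPE_DIM columns.
first, second = torch.split(q, [ROPE_DIM, HEAD_DIM - ROPE_DIM], dim=-1)

# Unpacking with the names reversed raises no error when the sizes match,
# but each name now refers to the other slice.
swapped_second, swapped_first = torch.split(q, [ROPE_DIM, HEAD_DIM - ROPE_DIM], dim=-1)

same_shapes = first.shape == swapped_first.shape   # shapes agree
same_values = torch.equal(first, swapped_first)    # contents do not
```

When the two sizes differ, a swap usually surfaces later as a shape mismatch; when they are equal, only a value comparison reveals it.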

Fix Plan

Compare the Implementations

  1. Open the linked line in modular_glm_moe_dsa.py and note the size list passed to torch.split() and the order of the names the result is unpacked into.
  2. Do the same for the linked lines in vLLM's deepseek_v2.py and DeepSeek-V3.2-Exp's inference/model.py.
  3. Verify that each unpacked tensor has the size its name implies along the split dimension.

Update Model Code

  1. If the order differs from the references, update the unpacking in GlmMoeDsaIndexer so that the names follow the order of the sizes passed to torch.split().
  2. Keep every downstream use of the unpacked tensors consistent with the corrected names.
  3. Because the linked file is a modular_*.py source, update the corresponding generated modeling file as well so the two stay in sync.

Example Code

import torch

# Illustrative sketch only: the names and sizes below are assumptions,
# not the actual GlmMoeDsaIndexer code. Which part comes first must be
# taken from the reference implementation.
head_dim = 128
rope_head_dim = 64

# Dummy query tensor of shape (batch, seq_len, head_dim).
q = torch.randn(2, 4, head_dim)

# torch.split() returns the chunks in the order of the size list, so the
# names on the left must be listed in that same order.
q_pe, q_nope = torch.split(q, [rope_head_dim, head_dim - rope_head_dim], dim=-1)

# Concatenating the chunks in the same order restores the original tensor.
assert torch.equal(torch.cat([q_pe, q_nope], dim=-1), q)

Verification

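  1. Run the updated indexer on a fixed input and check that each tensor unpacked from torch.split() has the expected size along the split dimension.
  2. Compare the values against the linked reference implementations, since a shape check alone cannot detect a swap when the chunk sizes are equal.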
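
One way to check the unpacking order by value is to compare the tensors against explicit slicing of the input. The helpers `split_by_slicing` and `check_unpack_order` below are hypothetical names introduced for this sketch, and the sizes are made up; nothing here comes from the transformers, vLLM, or DeepSeek code.

```python
import torch


def split_by_slicing(x, sizes):
    """Reference: cut the last dimension into consecutive slices of the given sizes."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(x[..., start:start + size])
        start += size
    return chunks


def check_unpack_order(x, sizes, unpacked):
    """Return True if `unpacked` matches consecutive slices of `x`, in order."""
    expected = split_by_slicing(x, sizes)
    return len(expected) == len(unpacked) and all(
        torch.equal(e, u) for e, u in zip(expected, unpacked)
    )


torch.manual_seed(0)
x = torch.randn(2, 3, 8)
sizes = [4, 4]  # hypothetical sizes; equal halves hide a swap from shape checks

a, b = torch.split(x, sizes, dim=-1)
ok_in_order = check_unpack_order(x, sizes, (a, b))
ok_swapped = check_unpack_order(x, sizes, (b, a))
```

A check of this kind could be adapted into a unit test for the indexer by passing it the tensors as they are named in the model code.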
