transformers - ✅(Solved) Fix v5.2.0 regression: LasrFeatureExtractor passes unsupported center arg and crashes [1 pull requests, 2 comments, 2 participants]

GitHub stats
huggingface/transformers#44206 · Fetched 2026-04-08 00:29:50
Comments: 2 · Participants: 2 · Timeline events: 12 · Reactions: 0
Timeline (top): commented ×2, cross-referenced ×2, mentioned ×2, referenced ×2

Error Message

TypeError                                 Traceback (most recent call last)
...
/usr/local/lib/python3.12/dist-packages/transformers/models/lasr/feature_extraction_lasr.py in __call__(self, raw_speech, truncation, pad_to_multiple_of, return_tensors, return_attention_mask, padding, max_length, sampling_rate, do_normalize, device, return_token_timestamps, center, **kwargs)
    264         )
    265         input_features = padded_inputs.input_features.squeeze(-1)
--> 266         input_features = self._torch_extract_fbank_features(input_features, device, center)
    267         data = {
    268             "input_features": input_features.to(torch.float32),

TypeError: LasrFeatureExtractor._torch_extract_fbank_features() takes from 2 to 3 positional arguments but 4 were given
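
The argument count in that message can be reproduced without installing transformers. The sketch below uses a hypothetical `ToyExtractor` class (not the real LASR code) whose helper accepts `self`, one required argument, and one optional argument, while the caller forwards a third value positionally. Counting `self`, four positionals arrive at a method that takes two to three.

```python
class ToyExtractor:
    """Hypothetical stand-in for the failing call path; not the real LASR class."""

    def _extract(self, waveform, device="cpu"):
        # self + 1 required + 1 optional: "from 2 to 3 positional arguments".
        return waveform

    def __call__(self, waveform, device="cpu", center=True):
        # Forwards a third explicit value, so 4 positionals arrive counting self.
        return self._extract(waveform, device, center)


def reproduce():
    """Return the TypeError message raised by the mismatched call, or None."""
    try:
        ToyExtractor()([0.0, 1.0])
    except TypeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(reproduce())
```

The printed message has the same shape as the one in the issue, with the toy class and method names in place of the LASR ones.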

Root Cause

PR #43769 ("Add Voxtral Realtime"), merged Feb 16, 2026, modified feature_extraction_lasr.py as collateral. It added center to __call__'s signature and call site, but did not update _torch_extract_fbank_features to accept it.

The diff from that PR on the LASR file:

+        center: bool = True,
         **kwargs,
     ) -> BatchFeature:
...
-        input_features = self._torch_extract_fbank_features(input_features, device)
+        input_features = self._torch_extract_fbank_features(input_features, device, center)

The _torch_extract_fbank_features method signature remains:

def _torch_extract_fbank_features(self, waveform, device="cpu"):

Note: even if center were added to the signature, it would be a no-op — LASR uses waveform.unfold() + torch.fft.rfft(), not torch.stft(). There's even a TODO in the code:

# TODO: @eustlb, to be standardized
# here we cannot use directly torch.stft because every fft frame is padded with zeros
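
To illustrate why `center` has nothing to act on here, the sketch below compares frame counts for a plain sliding window (what `unfold` produces) with an STFT that pads `n_fft // 2` samples on each side when `center=True`. The helper names and the window/hop values in the usage note are assumptions chosen for illustration, not LASR's actual configuration.

```python
def unfold_frame_count(n_samples, win_length, hop_length):
    """Windows produced by a sliding window with no padding (unfold-style)."""
    if n_samples < win_length:
        return 0
    return (n_samples - win_length) // hop_length + 1


def stft_frame_count(n_samples, n_fft, hop_length, center):
    """Frame count of an STFT; center=True pads n_fft // 2 samples on each side."""
    padded = n_samples + 2 * (n_fft // 2) if center else n_samples
    if padded < n_fft:
        return 0
    return (padded - n_fft) // hop_length + 1


def unfold(signal, size, step):
    """Pure-Python 1-D sliding window: overlapping slices, no padding."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


if __name__ == "__main__":
    # Assumed values: 1 s of 16 kHz audio, 400-sample window, 160-sample hop.
    print(unfold_frame_count(16000, 400, 160))
    print(stft_frame_count(16000, 400, 160, center=False))
    print(stft_frame_count(16000, 400, 160, center=True))
```

With these assumed values the sliding window and the uncentered STFT agree on the frame count, while the centered STFT yields more frames because of the padding. A pipeline built on unpadded windows has no padding step for `center` to toggle, which is why accepting the argument without using it leaves the output unchanged.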

Fix Action

Fix

Opened a PR (#44207):

  • Keep center in LasrFeatureExtractor.__call__.
  • Make _torch_extract_fbank_features(..., center=True) accept it as a no-op.
  • This avoids modular consistency breakage in voxtral_realtime.

PR fix notes

PR #44207: Fix LASR feature extractor regression from invalid center argument

Description (problem / solution / changelog)

What does this PR do?

This PR fixes a LASR regression introduced in #43769 (released in v5.2.0).

LasrFeatureExtractor.__call__ passes center into _torch_extract_fbank_features(...), but _torch_extract_fbank_features did not accept that argument in LASR. This caused a runtime TypeError for LASR models (including google/medasr) during preprocessing.

Fixes #44206

Root cause

The LASR call path was changed to:

  • self._torch_extract_fbank_features(input_features, device, center)

while the target method signature remained:

  • _torch_extract_fbank_features(self, waveform, device="cpu")

Changes

  • Kept center in LasrFeatureExtractor.__call__ (API compatibility).
  • Updated LASR _torch_extract_fbank_features signature to accept:
    • center: bool = True
  • Kept the call site forwarding center:
    • self._torch_extract_fbank_features(input_features, device, center)
  • In LASR, center is intentionally a no-op (LASR uses unfold + rfft, not torch.stft).
  • Added regression tests in:
    • tests/models/lasr/test_feature_extraction_lasr.py
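
A regression test for this change could take roughly the following shape. This is a minimal sketch against a hypothetical `FixedToyExtractor` that mirrors the fixed signatures. It is not the test added in the PR, whose contents are not shown here.

```python
class FixedToyExtractor:
    """Hypothetical stand-in mirroring the fixed signatures; not the real LASR class."""

    def _extract(self, waveform, device="cpu", center=True):
        # `center` is accepted for signature compatibility and deliberately unused.
        return [float(x) for x in waveform]

    def __call__(self, waveform, device="cpu", center=True):
        return self._extract(waveform, device, center)


def test_call_accepts_center():
    extractor = FixedToyExtractor()
    audio = [0.0, 0.5, -0.5]
    default = extractor(audio)
    # Neither value may raise, and both must match the default output (no-op).
    assert extractor(audio, center=True) == default
    assert extractor(audio, center=False) == default


if __name__ == "__main__":
    test_call_accepts_center()
    print("ok")
```

Asserting equality across both values of `center` checks two things at once: the call no longer raises, and the argument really is a no-op.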

Why this shape?

Keeping center in the LASR public call signature avoids downstream modular-generation drift for voxtral_realtime (which inherits LASR feature extractor behavior), while still fixing the LASR crash.
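
The inheritance argument can be sketched with two toy classes. The assumption here, made purely for illustration, is that a subclass overrides the helper and gives `center` a meaning. The class names and the padding behavior are hypothetical, and the modular converter itself is not modelled.

```python
class BaseExtractor:
    """Hypothetical parent: forwards `center` but ignores it in its own helper."""

    def __call__(self, waveform, center=True):
        return self._extract(waveform, center)

    def _extract(self, waveform, center=True):
        # No-op for the parent: the value is accepted and unused.
        return list(waveform)


class DerivedExtractor(BaseExtractor):
    """Hypothetical child: inherits __call__ and makes use of `center`."""

    def _extract(self, waveform, center=True):
        # Illustrative behavior only: pad one zero on each side when centering.
        if center:
            return [0.0] + list(waveform) + [0.0]
        return list(waveform)


if __name__ == "__main__":
    print(BaseExtractor()([1.0], center=False))
    print(DerivedExtractor()([1.0], center=True))
    print(DerivedExtractor()([1.0], center=False))
```

Because the child reuses the parent's `__call__`, removing `center` from the parent would change the inherited call path too. Keeping it in the parent and ignoring it there leaves both classes callable with the same arguments.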

Tests

Ran:

  • python -m pytest -q tests/models/lasr/test_feature_extraction_lasr.py
  • python utils/check_modular_conversion.py --files src/transformers/models/voxtral_realtime/modular_voxtral_realtime.py

Result:

  • 2 passed
  • modular conversion check passed (no diff)

Impact

  • Fixes runtime crash for LASR preprocessing on current versions.
  • Preserves API compatibility for callers passing center.
  • No intended behavior change to LASR feature computation itself.
  • Docs update: N/A (inline API/docstring consistency only).

Who can review?

@eustlb

Changed files

  • src/transformers/models/lasr/feature_extraction_lasr.py (modified, +1/-4)
  • src/transformers/models/voxtral_realtime/feature_extraction_voxtral_realtime.py (modified, +0/-6)
  • src/transformers/models/voxtral_realtime/modular_voxtral_realtime.py (modified, +0/-60)

Code Example

from transformers import AutoProcessor, AutoModelForCTC
import torch

processor = AutoProcessor.from_pretrained("google/medasr", trust_remote_code=True)
model = AutoModelForCTC.from_pretrained("google/medasr", trust_remote_code=True)

dummy_audio = torch.randn(16000).numpy()
inputs = processor(dummy_audio, sampling_rate=16000, return_tensors="pt")


System Info

Note: the bug bot is down, but I've checked open issues and confirmed this is not a duplicate.

  • transformers version: 5.2.0
  • Platform: Linux (Google Colab) / Also reproducible on macOS
  • Python version: 3.12
  • PyTorch version: 2.10.0+cu124
  • Using GPU: Yes (T4)

Who can help?

@eustlb

Reproduction

LasrFeatureExtractor.__call__() passes a center argument to _torch_extract_fbank_features(), but that method doesn't accept it. This crashes all LASR model inference (including google/medasr) on transformers v5.2.0.

Expected behavior

processor(audio, sampling_rate=16000, return_tensors="pt") should return features without error, as it did on v5.1.0 and earlier.
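
One way to check an installed build for this mismatch without running a model is to inspect the helper's signature. The sketch below uses only the standard library and two hypothetical toy classes. In a real environment the same function could be pointed at the method named in the traceback.

```python
import inspect


def accepts_parameter(func, name):
    """Return True if `func` declares a parameter called `name`."""
    return name in inspect.signature(func).parameters


class Broken:
    """Hypothetical stand-in for the v5.2.0 helper signature."""

    def _extract(self, waveform, device="cpu"):
        return waveform


class Fixed:
    """Hypothetical stand-in for the helper signature after the fix."""

    def _extract(self, waveform, device="cpu", center=True):
        return waveform


if __name__ == "__main__":
    print(accepts_parameter(Broken._extract, "center"))
    print(accepts_parameter(Fixed._extract, "center"))
```

A check like this depends on the method's declared parameters, not on the version string, so it stays valid whichever release ends up carrying the fix.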

extent analysis

Fix Plan

Update LasrFeatureExtractor.__call__ and _torch_extract_fbank_features

  1. Keep center in the LasrFeatureExtractor.__call__ signature, with the default of True added in #43769 (defaults of the other parameters are omitted here):

```python
def __call__(self, raw_speech, truncation, pad_to_multiple_of, return_tensors, return_attention_mask, padding, max_length, sampling_rate, do_normalize, device, return_token_timestamps, center=True, **kwargs):
```

  2. Keep the call site forwarding the caller's center value instead of hard-coding True:

```python
input_features = self._torch_extract_fbank_features(input_features, device, center)
```

  3. Add center to the _torch_extract_fbank_features signature, keeping the existing device default and using center=True to match __call__:

```python
def _torch_extract_fbank_features(self, waveform, device="cpu", center=True):
```

  4. Leave the method body unchanged so that center is ignored. LASR uses unfold + rfft, not torch.stft, so no extra helper method is needed:

```python
def _torch_extract_fbank_features(self, waveform, device="cpu", center=True):
    # `center` is accepted for signature compatibility and intentionally unused.
    # TODO: @eustlb, to be standardized
    # here we cannot use directly torch.stft because every fft frame is padded with zeros
    ...  # existing implementation remains the same
```

Verification

  1. Run the minimal reproduction example:

```python
from transformers import AutoProcessor, AutoModelForCTC
import torch

processor = AutoProcessor.from_pretrained("google/medasr", trust_remote_code=True)
model = AutoModelForCTC.from_pretrained("google/medasr", trust_remote_code=True)

dummy_audio = torch.randn(16000).numpy()
inputs = processor(dummy_audio, sampling_rate=16000, return_tensors="pt")
```

  2. Check that the code runs without errors and returns features as expected.
